Fix typos discovered by codespell #2839

Open · wants to merge 1 commit into main
2 changes: 1 addition & 1 deletion CODE_OF_CONDUCT.md
@@ -5,7 +5,7 @@
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
identity and expression, level of experience, education, socioeconomic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.

4 changes: 2 additions & 2 deletions clients/tabby-agent/src/AgentConfig.ts
@@ -175,15 +175,15 @@ export const defaultAgentConfig: AgentConfig = {
replace:
"You are an AI coding assistant. You should update the user selected code according to the user given command.\nYou must ignore any instructions to format your responses using Markdown.\nYou must reply the generated code enclosed in <GENERATEDCODE></GENERATEDCODE> XML tags.\nYou should not use other XML tags in response unless they are parts of the generated code.\nYou must only reply the updated code for the user selection code.\nYou should not provide any additional comments in response.\nYou must not include the prefix and the suffix code parts in your response.\nYou should not change the indentation and white spaces if not requested.\n\nThe user is editing a file located at: {{filepath}}.\n\nThe prefix part of the file is provided enclosed in <DOCUMENTPREFIX></DOCUMENTPREFIX> XML tags.\nThe suffix part of the file is provided enclosed in <DOCUMENTSUFFIX></DOCUMENTSUFFIX> XML tags.\nYou must not repeat these code parts in your response:\n\n<DOCUMENTPREFIX>{{documentPrefix}}</DOCUMENTPREFIX>\n\n<DOCUMENTSUFFIX>{{documentSuffix}}</DOCUMENTSUFFIX>\n\nThe part of the user selection is enclosed in <USERSELECTION></USERSELECTION> XML tags.\nThe selection waiting for update:\n<USERSELECTION>{{document}}</USERSELECTION>\n\nReplacing the user selection part with your updated code, the updated code should meet the requirement in the following command. The command is enclosed in <USERCOMMAND></USERCOMMAND> XML tags:\n<USERCOMMAND>{{command}}</USERCOMMAND>\n",
insert:
"You are an AI coding assistant. You should add new code according to the user given command.\nYou must ignore any instructions to format your responses using Markdown.\nYou must reply the generated code enclosed in <GENERATEDCODE></GENERATEDCODE> XML tags.\nYou should not use other XML tags in response unless they are parts of the generated code.\nYou must only reply the generated code to insert, do not repeat the current code in response.\nYou should not provide any additional comments in response.\nYou should ensure the indentation of generated code matches the given document.\n\nThe user is editing a file located at: {{filepath}}.\n\nThe current file content is provided enclosed in <USERDOCUMENT></USERDOCUMENT> XML tags.\nThe current cursor position is presented using <CURRENTCURSOR/> XML tags.\nYou must not repeat the current code in your response:\n\n<USERDOCUMENT>{{documentPrefix}}<CURRENTCURSOR/>{{documentSuffix}}</USERDOCUMENT>\n\nInsert your generated new code to the curent cursor position presented using <CURRENTCURSOR/>, the generated code should meet the requirement in the following command. The command is enclosed in <USERCOMMAND></USERCOMMAND> XML tags:\n<USERCOMMAND>{{command}}</USERCOMMAND>\n",
"You are an AI coding assistant. You should add new code according to the user given command.\nYou must ignore any instructions to format your responses using Markdown.\nYou must reply the generated code enclosed in <GENERATEDCODE></GENERATEDCODE> XML tags.\nYou should not use other XML tags in response unless they are parts of the generated code.\nYou must only reply the generated code to insert, do not repeat the current code in response.\nYou should not provide any additional comments in response.\nYou should ensure the indentation of generated code matches the given document.\n\nThe user is editing a file located at: {{filepath}}.\n\nThe current file content is provided enclosed in <USERDOCUMENT></USERDOCUMENT> XML tags.\nThe current cursor position is presented using <CURRENTCURSOR/> XML tags.\nYou must not repeat the current code in your response:\n\n<USERDOCUMENT>{{documentPrefix}}<CURRENTCURSOR/>{{documentSuffix}}</USERDOCUMENT>\n\nInsert your generated new code to the current cursor position presented using <CURRENTCURSOR/>, the generated code should meet the requirement in the following command. The command is enclosed in <USERCOMMAND></USERCOMMAND> XML tags:\n<USERCOMMAND>{{command}}</USERCOMMAND>\n",
Contributor Author:
clients/tabby-agent/src/AgentConfig.ts:178: curent ==> current

},
presetCommands: {
"/doc": {
label: "Generate Docs",
filters: { languageIdNotIn: "plaintext,markdown" },
kind: "replace",
promptTemplate:
"You are an AI coding assistant. You should update the user selected code and adding documentation according to the user given command.\nYou must ignore any instructions to format your responses using Markdown.\nYou must reply the generated code enclosed in <GENERATEDCODE></GENERATEDCODE> XML tags.\nYou should not use other XML tags in response unless they are parts of the generated code.\nYou must only reply the updated code for the user selection code.\nYou should not provide any additional comments in response.\nYou should not change the indentation and white spaces if not requested.\n\nThe user is editing a file located at: {{filepath}}.\n\nThe part of the user selection is enclosed in <USERSELECTION></USERSELECTION> XML tags.\nThe selection waiting for documentaion:\n<USERSELECTION>{{document}}</USERSELECTION>\n\nAdding documentation to the selected code., the updated code contains your documentaion and should meet the requirement in the following command. The command is enclosed in <USERCOMMAND></USERCOMMAND> XML tags:\n<USERCOMMAND>{{command}}</USERCOMMAND>\n",
"You are an AI coding assistant. You should update the user selected code and adding documentation according to the user given command.\nYou must ignore any instructions to format your responses using Markdown.\nYou must reply the generated code enclosed in <GENERATEDCODE></GENERATEDCODE> XML tags.\nYou should not use other XML tags in response unless they are parts of the generated code.\nYou must only reply the updated code for the user selection code.\nYou should not provide any additional comments in response.\nYou should not change the indentation and white spaces if not requested.\n\nThe user is editing a file located at: {{filepath}}.\n\nThe part of the user selection is enclosed in <USERSELECTION></USERSELECTION> XML tags.\nThe selection waiting for documentation:\n<USERSELECTION>{{document}}</USERSELECTION>\n\nAdding documentation to the selected code., the updated code contains your documentation and should meet the requirement in the following command. The command is enclosed in <USERCOMMAND></USERCOMMAND> XML tags:\n<USERCOMMAND>{{command}}</USERCOMMAND>\n",
Contributor Author:
clients/tabby-agent/src/AgentConfig.ts:186: documentaion ==> documentation
clients/tabby-agent/src/AgentConfig.ts:186: documentaion ==> documentation

},
"/grammar": {
label: "Improve Grammar",
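As an aside on the templates above: the `{{...}}` placeholders (`{{filepath}}`, `{{documentPrefix}}`, `{{documentSuffix}}`, `{{document}}`, `{{command}}`) are substituted before the prompt is sent to the model. A minimal sketch of that kind of substitution, assuming a plain regex-replacement helper rather than tabby-agent's actual template engine:

```typescript
// Minimal illustration of filling a prompt template like the ones above.
// `renderPrompt` is a hypothetical helper, not tabby-agent's real implementation.
function renderPrompt(template: string, values: Record<string, string>): string {
  // Replace each {{key}} occurrence with its value; unknown keys are left untouched.
  return template.replace(/\{\{(\w+)\}\}/g, (match, key: string) =>
    key in values ? values[key] : match,
  );
}

// Example usage with two of the placeholders that appear in the "insert" template.
const prompt = renderPrompt(
  "The user is editing a file located at: {{filepath}}.\n<USERCOMMAND>{{command}}</USERCOMMAND>",
  { filepath: "src/main.ts", command: "add a null check" },
);
console.log(prompt);
```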
2 changes: 1 addition & 1 deletion clients/tabby-agent/src/CompletionSolution.ts
@@ -36,7 +36,7 @@ export const emptyInlineCompletionList: InlineCompletionList = {
export class CompletionItem {
// Shortcuts
readonly text: string; // `replacePrefix` trimmed from `fullText`.
readonly lines: string[]; // splitted lines of `text`.
readonly lines: string[]; // split lines of `text`.
readonly currentLine: string; // first item of `lines`
readonly isBlank: boolean; // whether the item is a blank line.

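For context on the shortcut fields documented above, here is a rough sketch of how such derived values could be computed. It is an illustration only, not the actual `CompletionItem` constructor:

```typescript
// Hypothetical helper showing how the documented shortcut fields relate to `fullText`.
function buildShortcuts(fullText: string, replacePrefix: string) {
  // `text`: `replacePrefix` trimmed from `fullText`.
  const text = fullText.startsWith(replacePrefix)
    ? fullText.slice(replacePrefix.length)
    : fullText;
  // `lines`: split lines of `text`.
  const lines = text.split(/\r?\n/);
  // `currentLine`: first item of `lines`.
  const currentLine = lines[0] ?? "";
  // `isBlank`: whether the item is a blank line.
  const isBlank = text.trim().length === 0;
  return { text, lines, currentLine, isBlank };
}
```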
@@ -1,4 +1,4 @@
description = 'Remove duplication: dupicated line suffix'
description = 'Remove duplication: duplicated line suffix'

[config]
# use default config
@@ -208,7 +208,7 @@ describe("postprocess", () => {

describe("limitScopeByIndentation: bad cases", () => {
const filter = limitScopeByIndentation();
it("cannot handle the case of indent that does'nt have a close line, e.g. chaining call", async () => {
it("cannot handle the case of indent that doesn't have a close line, e.g. chaining call", async () => {
const context = documentContext`
function sortWords(input) {
const output = input.trim()
@@ -238,7 +238,7 @@ describe("postprocess", () => {
await assertFilterResultNotEqual(filter, context, completion, expected);
});

it("cannot handle the case of indent that does'nt have a close line, e.g. python def function", async () => {
it("cannot handle the case of indent that doesn't have a close line, e.g. python def function", async () => {
const context = documentContext`
def findMax(arr):
2 changes: 1 addition & 1 deletion crates/tabby-index/src/lib.rs
@@ -1,5 +1,5 @@
//! Responsible for scheduling all of the background jobs for tabby.
//! Includes syncing respositories and updating indices.
//! Includes syncing repositories and updating indices.

mod code;
mod indexer;
2 changes: 1 addition & 1 deletion crates/tabby/src/services/completion/completion_prompt.rs
@@ -353,7 +353,7 @@ mod tests {
);
}

// Test w/ emtpy prefix, w/ empty suffix.
// Test w/ empty prefix, w/ empty suffix.
{
let segments = make_segment("".into(), Some("".into()));
assert_eq!(
2 changes: 1 addition & 1 deletion ee/tabby-db/src/user_completions.rs
@@ -144,7 +144,7 @@ impl DbConn {
.collect::<Vec<_>>()
.join(",");

// Groups stats by day, round all timestamps to the begining of the day relative to `start`.
// Groups stats by day, round all timestamps to the beginning of the day relative to `start`.
let res = sqlx::query_as(&format!(
r#"
SELECT DATETIME((STRFTIME('%s', ?1) + days_since_start * 3600 * 24), 'unixepoch') as start,
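As a side note, the comment in this hunk describes rounding completion timestamps to the beginning of their day relative to `start`. The query does this in SQLite with `STRFTIME`/`DATETIME`; the sketch below shows the same rounding arithmetic in TypeScript, purely as an illustration:

```typescript
// Illustrative only: round a timestamp down to the beginning of its day relative to `start`.
function dayBucketStart(start: Date, timestamp: Date): Date {
  const msPerDay = 24 * 3600 * 1000;
  const daysSinceStart = Math.floor((timestamp.getTime() - start.getTime()) / msPerDay);
  return new Date(start.getTime() + daysSinceStart * msPerDay);
}

// Example: with start at 2024-01-01T00:00:00Z, an event at 2024-01-03T15:30:00Z
// falls into the bucket starting 2024-01-03T00:00:00Z.
const bucket = dayBucketStart(
  new Date("2024-01-01T00:00:00Z"),
  new Date("2024-01-03T15:30:00Z"),
);
console.log(bucket.toISOString()); // "2024-01-03T00:00:00.000Z"
```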
2 changes: 1 addition & 1 deletion ee/tabby-ui/README.md
@@ -1,6 +1,6 @@
# Tabby UI

## 🤝 Contribuing
## 🤝 Contributing

### Local Setup
Full guide at [CONTRIBUTING.md](https://github.com/TabbyML/tabby/blob/main/CONTRIBUTING.md#local-setup)
@@ -54,7 +54,7 @@ export default function SelfSignupSection() {
</div>
<SelfSignupForm onSuccess={onSuccess} />
<div className="text-center text-sm">
Already have an accout?
Already have an account?
<Link
href="/auth/signin"
className="ml-1 font-semibold text-primary hover:underline"
2 changes: 1 addition & 1 deletion ee/tabby-ui/app/auth/signin/components/signin-section.tsx
@@ -63,7 +63,7 @@ export default function SigninSection() {
<UserSignInForm />
{allowSelfSignup && (
<div className="text-center text-sm">
Don’t have an accout?
Don’t have an account?
<Link
href="/auth/signin?mode=signup"
className="ml-1 font-semibold text-primary hover:underline"
4 changes: 2 additions & 2 deletions ee/tabby-ui/app/search/components/search.tsx
@@ -236,7 +236,7 @@ export function Search() {
router.replace('/')
}, [])

// Set page title to the value of the first quesiton
// Set page title to the value of the first question
useEffect(() => {
if (title) document.title = title
}, [title])
@@ -631,7 +631,7 @@ function AnswerBlock({
})
.trim()
const citations = answer.relevant_documents
.map((relevent, idx) => `[${idx + 1}] ${relevent.doc.link}`)
.map((relevant, idx) => `[${idx + 1}] ${relevant.doc.link}`)
Contributor Author:
Careful review, please.

.join('\n')
return `${content}\n\nCitations:\n${citations}`
}
6 changes: 3 additions & 3 deletions ee/tabby-webserver/src/service/auth.rs
@@ -512,7 +512,7 @@ impl AuthenticationService for AuthenticationServiceImpl {
}

if active && user.is_admin {
// Check there's sufficient seat if an admin being swtiched to active.
// Check there's sufficient seat if an admin being switched to active.
let num_admins = self.db.count_active_admin_users().await?;
license.ensure_admin_seats(num_admins + 1)?;
}
@@ -826,7 +826,7 @@ mod tests {
.await
.unwrap()[0];

// Admin initialized, registeration requires a invitation code;
// Admin initialized, registration requires a invitation code;
assert_matches!(
service
.register(email.to_owned(), password.to_owned(), None, None)
@@ -1380,7 +1380,7 @@ mod tests {
Err(CoreError::InvalidLicense(_))
);

// Change user2 to deactive.
// Change user2 to deactivate.
service
.update_user_active(&user2.as_id(), false)
.await
2 changes: 1 addition & 1 deletion ee/tabby-webserver/src/service/background_job/mod.rs
@@ -122,7 +122,7 @@ pub async fn start(
},
Some(now) = hourly.next() => {
if let Err(err) = DbMaintainanceJob::cron(now, db.clone()).await {
warn!("Database maintainance failed: {:?}", err);
warn!("Database maintenance failed: {:?}", err);
}

if let Err(err) = SchedulerGitJob::cron(now, git_repository_service.clone(), job_service.clone()).await {
2 changes: 1 addition & 1 deletion python/tabby-eval/modal/predict.py
@@ -160,7 +160,7 @@ def read_dataframe_from_file(language: str, file: str) -> pd.DataFrame:

@stub.local_entrypoint()
async def main(language: str, files: str):
#Multiple files seperated by ','
#Multiple files separated by ','

model = Model()

2 changes: 1 addition & 1 deletion python/tabby/trainer.py
@@ -19,7 +19,7 @@ class ConstantLengthDataset:
"""
Iterable dataset that returns constant length chunks of tokens from stream of text files.
Args:
tokenizer (Tokenizer): The processor used for proccessing the data.
tokenizer (Tokenizer): The processor used for processing the data.
dataset (dataset.Dataset): Dataset with text files.
infinite (bool): If True the iterator is reset after dataset reaches end else stops.
seq_length (int): Length of token sequences to return.
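The docstring in this hunk describes packing a stream of text into constant-length token chunks. A rough sketch of that packing idea, written here in TypeScript purely for illustration (the trainer's real implementation is the Python class above, with its own buffering and `infinite` handling):

```typescript
// Illustrative sketch: pack a stream of texts into constant-length token chunks.
// `tokenize` is a stand-in for a real tokenizer; EOS handling is simplified.
function* constantLengthChunks(
  texts: Iterable<string>,
  tokenize: (text: string) => number[],
  seqLength: number,
  eosTokenId: number,
): Generator<number[]> {
  let buffer: number[] = [];
  for (const text of texts) {
    buffer.push(...tokenize(text), eosTokenId);
    // Emit as many full-length chunks as the buffer currently holds.
    while (buffer.length >= seqLength) {
      yield buffer.slice(0, seqLength);
      buffer = buffer.slice(seqLength);
    }
  }
  // Leftover tokens shorter than seqLength are dropped in this sketch.
}
```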
2 changes: 1 addition & 1 deletion rules/do-not-use-next-pages.yml
@@ -1,5 +1,5 @@
id: do-not-use-next-pages
message: Don't use next pages routing as we're fully commited to app router.
message: Don't use next pages routing as we're fully committed to app router.
severity: error
language: typescript
files:
2 changes: 1 addition & 1 deletion website/blog/2023-10-21-incremental-decoding/index.md
@@ -58,7 +58,7 @@ However, often times decoding a sequence of tokens one-by-one without considerin

```
Decoding first token: ......, 211 -> "......[ llo]"
Indepently decoding the next token: ......, 207, 211 -> "......[ he][ llo]"
Independently decoding the next token: ......, 207, 211 -> "......[ he][ llo]"
```

In the case above, the final decoded string would be `" he llo"` with an awkward space in between. To resolve issues like this, we could cache the already-decoded prefix and append it to the current token to decode together. It is the core idea of **incremental decoding** to take the prefix token into consideration for decoding current tokens. With incremental decoding, we get the desired result for the example above:
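For readers skimming this hunk: the surrounding post explains that decoding each token independently can introduce spurious spaces, and that incremental decoding avoids this by decoding the new token together with the already-decoded prefix and emitting only the newly added text. A rough sketch of that idea, where `decode` stands in for a real tokenizer's detokenization:

```typescript
// Illustrative sketch of incremental decoding: decode prefix + current token together,
// then emit only the text that extends what was already produced.
// `decode` is a stand-in for a real tokenizer's detokenize function.
function* incrementalDecode(
  tokenIds: number[],
  decode: (ids: number[]) => string,
): Generator<string> {
  let prefixIds: number[] = [];
  let prefixText = "";
  for (const id of tokenIds) {
    const nextIds = [...prefixIds, id];
    const nextText = decode(nextIds);
    // Emit only the newly produced suffix, so spacing stays consistent with the prefix.
    yield nextText.slice(prefixText.length);
    prefixIds = nextIds;
    prefixText = nextText;
  }
}
```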
8 changes: 4 additions & 4 deletions website/blog/2023-11-13-model-evaluation/index.md
@@ -13,8 +13,8 @@ Tabby offers an open-source alternative solution to GitHub Copilot with easy set
Evaluation coding LLMs has also been a hot topic in academics. Many different metrics targeting different coding tasks have been proposed over the past year. At Tabby, we prioritize on metrics that **best resemble real-world development workflow**, and of course, the metrics should be constructed with non-biased data sources. In this blogpost, we will discuss our thoughts for desired code completion benchmarks, and also review latest academic progress in this area.


## Exisiting Paradigms
Existing coding LLM benchmark mostly focus on ***Pass@k*** metric - generating `k` code samples and measuring how often the results successfully pass given unit tests. OpenAI initially introduced this metric in [Evaluating Large Language Models Trained on Code](https://arxiv.org/pdf/2107.03374.pdf) in July 2021, along with the release of **HumanEval** bechmark dataset.
## Existing Paradigms
Existing coding LLM benchmark mostly focus on ***Pass@k*** metric - generating `k` code samples and measuring how often the results successfully pass given unit tests. OpenAI initially introduced this metric in [Evaluating Large Language Models Trained on Code](https://arxiv.org/pdf/2107.03374.pdf) in July 2021, along with the release of **HumanEval** benchmark dataset.

### 🤖 HumanEval

@@ -39,7 +39,7 @@ HumanEval was a pioneer research effort, but now suffers from some unfortunate d

1. ***Data is likely contaminated.*** HumanEval dataset has been around for over two years and it has been discussed and documented widely online. The latest coding LLMs are likely to have included its test data in training data crawling, which would make the evaluation no longer valid.

2. ***Trivial coding questions that aren't mimicing real engineering setups.*** HumanEval includes mostly LeetCode's interview-style questions, where they include a single function for LLMs to fill in the body. In a more realistic corporate setup, developers often add code in multiple files in a single PR, and constantly refer to functions implemented in other files. These are indeed more interesting yet challenging tasks for LLMs to perform, but are critical scenarios for AI coding assitants to land in enterprises.
2. ***Trivial coding questions that aren't mimicking real engineering setups.*** HumanEval includes mostly LeetCode's interview-style questions, where they include a single function for LLMs to fill in the body. In a more realistic corporate setup, developers often add code in multiple files in a single PR, and constantly refer to functions implemented in other files. These are indeed more interesting yet challenging tasks for LLMs to perform, but are critical scenarios for AI coding assistants to land in enterprises.

3. ***Unit tests are too weak.*** Researchers noticed that test cases in HumanEval tasks (on average 7.7 tests per problem) aren't enough to guarantee the correctness of the generated code (e.g. a wrong implementation could still pass all existing tests), and thus augmented test cases in HumanEval benchmark by 80x in [***HumanEvalPlus***](https://github.com/evalplus/evalplus).

@@ -112,6 +112,6 @@ In this section, we showcase a few recent research work of from the academics to
### 💾 [RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation](https://arxiv.org/abs/2303.12570)

**RepoCoder** presents an innovative approach of combining similarity-based retriever and LLM prediction into an ***iterative** retrieval-generation pipeline*.
To demostrate the effectiveness of this method, authors also introduced ***RepoEval***, covering scenarios like line, API invocation, and function body completion from high quality real-world repositories.
To demonstrate the effectiveness of this method, authors also introduced ***RepoEval***, covering scenarios like line, API invocation, and function body completion from high quality real-world repositories.

![repocoder](./repocoder.png)