From 49e576d02fb341d9b4b877ad7d0cb669f6c2a711 Mon Sep 17 00:00:00 2001
From: shenleban tongying
Date: Sat, 23 Mar 2024 00:05:13 -0400
Subject: [PATCH] feat: document Index file in architecture.md

---
 website/docs/architecture.md | 30 ++++++++++++++++++++++++++++++
 website/docs/developer.md    | 10 +---------
 website/mkdocs.yml           | 28 ++++++++++++++--------------
 3 files changed, 45 insertions(+), 23 deletions(-)
 create mode 100644 website/docs/architecture.md

diff --git a/website/docs/architecture.md b/website/docs/architecture.md
new file mode 100644
index 000000000..1e93337b0
--- /dev/null
+++ b/website/docs/architecture.md
@@ -0,0 +1,30 @@
+## Index file
+
+Each index file has 4 sections:
+
+1. `IdxHeader`
+2. `ExtraInfo` (used but unnamed in the source code)
+3. `Chunks`
+4. `BtreeIndex`
+
+The `IdxHeader` consists of 32-bit fields holding various metadata about the index. The most important are `chunksOffset` and `indexRootOffset`, which point to the starting offsets of `Chunks` and `BtreeIndex`, respectively.
+
+Some dicts only have one `ExtraInfo`: the `dictionaryName`, stored as a uint32 string size followed by the string itself.
+
+Each chunk contains a uint32 size of the uncompressed data, a uint32 size of the zlib-compressed data, and the zlib-compressed data itself.
+
+`Chunks` may be referenced by both `IdxHeader` and `BtreeIndex`.
+
+By adding a new chunk to `Chunks` and storing its offset in `IdxHeader`, `ExtraInfo` can store arbitrarily long information.
+
+`BtreeIndex` is a zlib-compressed, typical B-tree implementation in which each node holds `word` info and an `offset` pointing to the position of the corresponding `chunk`.
+
+Note that a `chunk` only includes the data necessary to find an article; it does not contain the `word`.
+
+The exact data in a `chunk` is decided and interpreted by each dictionary implementation. For example, it may hold the starting and ending positions of an article in a dictionary file.
+
+## What's under the hood after a word is queried?
+
+After typing a word into the search box and pressing Enter, the embedded browser loads `gdlookup://localhost?word=`. This URL is handled by Qt WebEngine's URL scheme handler. The returned HTML page is composed by the ArticleMaker, which initiates DataRequests on the dictionary formats. Resource files are requested via `bres://` or `qrc://`, which go through a similar process.
+
+TODO: other subsystems.
diff --git a/website/docs/developer.md b/website/docs/developer.md
index 256bc202a..9b3d4fb34 100644
--- a/website/docs/developer.md
+++ b/website/docs/developer.md
@@ -16,12 +16,4 @@ Commit messages should follow [Conventional Commits](https://www.conventionalcom
 
 Reformat changes with `clang-format` [how to use clang-format](https://github.com/xiaoyifang/goldendict/blob/staged/howto/how%20to%20use%20.clang-format%20to%20format%20the%20code.md)
 
-Remember to enable `clang-tidy` support on your editor so that `.clang-tidy` will be respected.
-
-## Architecture
-
-What's under the hood after a word is queried?
-
-After typing a word into the search box and press enter, the embedded browser will load `gdlookup://localhost?word=`. This url will be handled by Qt webengine's Url Scheme handler. The returned html page will be composed in the ArticleMaker which will initiate some DataRequest on dictionary formats. Resource files will be requested via `bres://` or `qrc://` which will went through a similar process.
-
-TODO: other subsystems.
\ No newline at end of file
+Remember to enable `clang-tidy` support on your editor so that `.clang-tidy` will be respected.
\ No newline at end of file
diff --git a/website/mkdocs.yml b/website/mkdocs.yml
index 171fea724..58a347921 100644
--- a/website/mkdocs.yml
+++ b/website/mkdocs.yml
@@ -1,4 +1,4 @@
-site_name: GoldenDict-NG
+site_name: GoldenDict-ng
 site_description: GoldenDict-ng is a open source, cross platform, multi formats, feature rich dictionary 是一个开源跨平台支持各种格式的字典程序
 site_url: https://xiaoyifang.github.io/goldendict-ng/
 
@@ -36,8 +36,9 @@ nav:
     - ToolBar & DictBar: ui_toolbar.md
     - Favorites: ui_favorites.md
     - Shortcuts: ui_shortcuts.md
-  - Special Usages:
+  - Advanced Usages:
     - Anki Integration: topic_anki.md
+    - Program dictionary: howto/how to add a program as dictionary.md
     - Command Lines: topic_commandline.md
     - Custom Stylesheet & JavaScript: topic_userstyle.md
     - Portable Mode: topic_portablemode.md
@@ -45,17 +46,16 @@ nav:
     - Customize Dictionary: custom_dictionary.md
     - OCR Integration: howto/ocr.md
     - Wayland: topic_wayland.md
+    - Debug dictionary JS: howto/how to debug dictionary js.md
   - Report Bugs & Feedbacks: feedbacks.md
-  - Contributor Guides:
-    - Developer: developer.md
-    - How to:
-      - Build from source: howto/build_from_source.md
-      - Customize the opencc: howto/how to customize the opencc.md
-      - Qt version and github action: howto/how to find out the latest qt version and module in github qt action.md
-      - Use .clang-format: howto/how to use .clang-format to format the code.md
-      - Breadpad crash analysis: howto/how to use breadpad crash analysis.md
-      - Build ffmpeg on Windows: howto/how to build ffmpeg for visual studio.md
-      - How to update the crowdin.ts file: howto/how to update crowdin.ts file.md
-      - How to debug dictionary js: howto/how to debug dictionary js.md
-      - How to add a program dictionary: howto/how to add a program as dictionary.md
+  - Development Info:
+    - Start develop: developer.md
+    - Architecture: architecture.md
+    - Build from source: howto/build_from_source.md
+    - Customize the opencc: howto/how to customize the opencc.md
+    - Qt version and github action: howto/how to find out the latest qt version and module in github qt action.md
+    - Use .clang-format: howto/how to use .clang-format to format the code.md
+    - Breadpad crash analysis: howto/how to use breadpad crash analysis.md
+    - Build ffmpeg on Windows: howto/how to build ffmpeg for visual studio.md
+    - Update the crowdin.ts file: howto/how to update crowdin.ts file.md
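
Reviewer note: the chunk layout documented in architecture.md (a uint32 uncompressed size, a uint32 compressed size, then the zlib-compressed data) can be illustrated with a minimal sketch. This is not the project's code: the function names `pack_chunk` and `read_chunk` are hypothetical, and little-endian byte order is an assumption, since the document does not state how the uint32 fields are encoded.

```python
import struct
import zlib

# Hypothetical helpers illustrating the documented chunk layout.
# Assumption: both uint32 size fields are little-endian ("<II").
HEADER = struct.Struct("<II")


def pack_chunk(data: bytes) -> bytes:
    """Serialize one chunk: uncompressed size, compressed size, zlib payload."""
    compressed = zlib.compress(data)
    return HEADER.pack(len(data), len(compressed)) + compressed


def read_chunk(buf: bytes, offset: int) -> bytes:
    """Read and decompress the chunk that starts at `offset` inside `buf`."""
    uncompressed_size, compressed_size = HEADER.unpack_from(buf, offset)
    start = offset + HEADER.size
    payload = buf[start:start + compressed_size]
    data = zlib.decompress(payload)
    # The stored uncompressed size doubles as a cheap integrity check.
    if len(data) != uncompressed_size:
        raise ValueError("chunk size mismatch")
    return data
```

In this sketch, the `offset` argument plays the role of the offset that a `BtreeIndex` node or an `IdxHeader` field would store; what the decompressed bytes mean is left to the dictionary implementation, as the document says.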