Content Indexer #337
initial build script
- Factored out some common db utils
- Added appEnvironment to ml-utils
- DB service now queues transactions, to avoid stale data
- pages are indexed by url and indicate their locales
- yarn build-index runs a full build with a db dump in public/full-db.json
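To illustrate the "queues transactions, to avoid stale data" note, here is a minimal sketch of one way such a queue could work. It is not taken from the PR: `TransactionQueue`, `addLocale`, and the in-memory `store` are hypothetical names, and the real DB service may be implemented quite differently. The idea is that each read-modify-write step only starts after the previous one has settled, so it never reads a value that a concurrent step is about to overwrite.

```typescript
// Hypothetical sketch: run async "transactions" strictly one after another.
type Transaction<T> = () => Promise<T>;

class TransactionQueue {
  // Tail of the chain; each new transaction is appended after it.
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(tx: Transaction<T>): Promise<T> {
    // Start tx only once the previous transaction has settled.
    const result = this.tail.then(() => tx());
    // Swallow errors on the chain itself so one failure does not block
    // later transactions; the caller still sees the rejection via `result`.
    this.tail = result.catch(() => undefined);
    return result;
  }
}

// Stand-in for the database: page url -> locales (assumed shape).
const store = new Map<string, string[]>();
const queue = new TransactionQueue();

async function addLocale(url: string, locale: string): Promise<void> {
  const current = store.get(url) ?? []; // read
  await new Promise<void>((resolve) => setTimeout(resolve, 5)); // simulated I/O
  store.set(url, [...current, locale]); // write
}

// Both updates target the same page; the queue serializes them.
const done = Promise.all([
  queue.enqueue(() => addLocale("about", "en")),
  queue.enqueue(() => addLocale("about", "he")),
]);
```

Without the queue, both calls would read the empty list before either wrote, and the second write would overwrite the first.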
First impression is wow.
Lots of stuff; will probably need a pilot/navigator session to understand the entire flow.
However, I did not finish reading the entire thing, since I couldn't get it to run properly.
Please add some docs on how to run it, and I'll take it for another spin from there.
@@ -4,6 +4,16 @@
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
Need some docs on how to run
Nothing to document. If you have a reason to run the DB application in debug mode, use the DB App launch config, otherwise don't...
@@ -32,13 +32,18 @@
"clean": "rm -rf .next",
"dev": "next dev",
"build": "next build",
"prebuild-index": "bash scripts/run-indexer.sh -P 20123",
Need some docs on how to run
AFAIK, pre- and post- tasks are run automatically. yarn build-index will run prebuild-index before and postbuild-index after.
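As a rough model of the pre/post behavior described in this reply, the sketch below computes the order in which scripts would run. It is an illustration only, not the package manager's actual code: `resolveRunOrder` is a hypothetical helper, and the `build-index` and `postbuild-index` commands are placeholders because the diff shown here only includes `prebuild-index`. Note that npm and Yarn 1 run such hooks automatically, while newer Yarn releases reportedly do not for user-defined scripts, so it is worth checking which version the project uses.

```typescript
// Illustrative model of pre/post script hooks (assumed semantics).
type Scripts = Record<string, string>;

function resolveRunOrder(scripts: Scripts, name: string): string[] {
  const candidates = [`pre${name}`, name, `post${name}`];
  // Only scripts that are actually defined take part in the run.
  return candidates.filter((key) => key in scripts);
}

const scripts: Scripts = {
  // Taken from the diff in this PR.
  "prebuild-index": "bash scripts/run-indexer.sh -P 20123",
  // Placeholders: the real commands are not visible in this conversation.
  "build-index": "<build command not shown>",
  "postbuild-index": "<post-build command not shown>",
};

const runOrder = resolveRunOrder(scripts, "build-index");
console.log(runOrder.join(" -> "));
```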
@@ -0,0 +1 @@
{"schemas":[{"articles":{"schema":{"version":0,"title":"Schema for ML Articles","keyCompression":false,"primaryKey":"url","type":"object","properties":{"url":{"type":"string","maxLength":100},"labels":{"type":"array","items":{"type":"string"}},"startDate":{"type":"number"},"endDate":{"type":"number"},"locales":{"type":"array","items":{"type":"string"}}},"required":["url","labels","locales","startDate"]}}}],"data":{"name":"ml","instanceToken":"morvycskkp","collections":[{"name":"articles","schemaHash":"dyk9bi","docs":[{"url":"about","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718891.07},"_deleted":false},{"url":"contribute","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718931.02},"_deleted":false},{"url":"docs","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718594.01},"_deleted":false},{"url":"docs/the-story-of-mel","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718600.01},"_deleted":false},{"url":"docs/the-story-of-mel/codex","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718642.07},"_deleted":false},{"url":"docs/the-story-of-mel/pages/blackjack-writeup","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134716878.01},"_deleted":false},{"url":"docs/the-story-of-mel/pages/mels-hack-the-missing-bits","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134717242.01},"_deleted":false},{"url":"docs/the-story-of-mel/pages/preface","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134717372.02},"_deleted":false},{"url":"docs/the-story-of-mel/pages/resources","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134717399.06},"_deleted":false},{"url":"glossary","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718608.05},"_deleted":false},{"url":"glossary/addressing-scheme","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718640.01},"_deleted":false},{"url":"glossary/assembly-language","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718645.07},"_deleted":false},{"url":"glossary/bit","labels":[],"loca
les":["en","he"],"_meta":{"lwt":1678134718649.07},"_deleted":false},{"url":"glossary/compiler","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718654.07},"_deleted":false},{"url":"glossary/drum-memory","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718666.01},"_deleted":false},{"url":"glossary/fortran","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718670.06},"_deleted":false},{"url":"glossary/friden-flexowriter","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718682.07},"_deleted":false},{"url":"glossary/goto","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718686.07},"_deleted":false},{"url":"glossary/hexadecimal","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718691.07},"_deleted":false},{"url":"glossary/infinite-loop","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718695.01},"_deleted":false},{"url":"glossary/jump-instruction","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718701.07},"_deleted":false},{"url":"glossary/lgp-30","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718705.07},"_deleted":false},{"url":"glossary/loop","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718709.07},"_deleted":false},{"url":"glossary/machine-code","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718713.07},"_deleted":false},{"url":"glossary/magnetic-core-memory","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718722.07},"_deleted":false},{"url":"glossary/operand","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718728.02},"_deleted":false},{"url":"glossary/operation-code","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718734.07},"_deleted":false},{"url":"glossary/optimal-code","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718739.01},"_deleted":false},{"url":"glossary/optimum","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718743.01},"_deleted":false},{"url":"glossary/pascal","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718748.03},"_dele
ted":false},{"url":"glossary/pessimum","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718762.02},"_deleted":false},{"url":"glossary/port","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718772.04},"_deleted":false},{"url":"glossary/ratfor","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718776.07},"_deleted":false},{"url":"glossary/real-programmer","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718785.01},"_deleted":false},{"url":"glossary/register","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718788.07},"_deleted":false},{"url":"glossary/rpc-4000","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718804.02},"_deleted":false},{"url":"glossary/terminating-condition","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718809.01},"_deleted":false},{"url":"glossary/test-terminating-condition","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718817.07},"_deleted":false},{"url":"glossary/time-delay-loop","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718821.01},"_deleted":false},{"url":"glossary/top-down-design","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718823.07},"_deleted":false},{"url":"glossary/vacuum-tube","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718836.07},"_deleted":false},{"url":"posts/07-04-21-hebrew-edition","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718954.06},"_deleted":false},{"url":"posts/14-01-21-here-we-go","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718958.02},"_deleted":false},{"url":"posts/21-05-22-project-launch","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718961.07},"_deleted":false}]}]}} |
NICE!
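For anyone wanting to consume the dump, here is a small sketch of how the `articles` collection could be queried by locale. The types are inferred from the JSON above and are assumptions, not the project's own definitions; `urlsForLocale` is a hypothetical helper. `startDate` is marked optional because the dumped docs omit it even though the schema lists it as required.

```typescript
// Types inferred from the dump above (assumed, not the project's own).
interface ArticleDoc {
  url: string;
  labels: string[];
  locales: string[];
  startDate?: number;
  endDate?: number;
  _deleted: boolean;
}

interface DbDump {
  data: {
    name: string;
    collections: { name: string; docs: ArticleDoc[] }[];
  };
}

// Return the urls of all non-deleted articles available in a locale.
function urlsForLocale(dump: DbDump, locale: string): string[] {
  const articles = dump.data.collections.find((c) => c.name === "articles");
  if (!articles) return [];
  return articles.docs
    .filter((doc) => !doc._deleted && doc.locales.includes(locale))
    .map((doc) => doc.url);
}

// Small inline sample with the same shape as the dump.
const sample: DbDump = {
  data: {
    name: "ml",
    collections: [
      {
        name: "articles",
        docs: [
          { url: "about", labels: [], locales: ["en", "he"], _deleted: false },
          { url: "glossary/bit", labels: [], locales: ["en", "he"], _deleted: false },
          // Hypothetical entry, added only to show the locale filter at work.
          { url: "hypothetical/en-only", labels: [], locales: ["en"], _deleted: false },
        ],
      },
    ],
  },
};

const hebrewUrls = urlsForLocale(sample, "he");
```

In a real build the dump would be parsed from public/full-db.json instead of the inline sample, for example with `JSON.parse(fs.readFileSync("public/full-db.json", "utf8"))`.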
@@ -0,0 +1,62 @@
#!/bin/bash
Need docs on how to run
(see below)
@@ -0,0 +1,37 @@
#!/bin/bash
Same, need docs on how to run
@tomerlichtash (also relevant to other script(s))
- You don't need to run the scripts; they're not standalone. You only run yarn build-index (or yarn dev-index).
- The scripts are documented and print a usage message when run with no params: bash <script>.
@@ -0,0 +1,11 @@
{
I'm not sure we can have two package.jsons like that. This calls for either a monorepo, or an external package.
"Can have" in what sense? Obviously we'll refactor on some level when the system works, and hopefully a monorepo will do a better job than scattered package.jsons, but do you see a show-stopper hurdle?
@tomerlichtash, you can now run a build with indexing on this branch.
6bee681 to 13c4256
Server-side content indexer, designed to run as a build step.