Content Indexer #337

Draft
wants to merge 12 commits into
base: master

imdfl
Collaborator

@imdfl imdfl commented Dec 29, 2022

Server side content indexer, designed to run as a build step.

@vercel

vercel bot commented Dec 29, 2022

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
mels-loop ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jun 23, 2023 0:47am

@imdfl imdfl self-assigned this Dec 29, 2022
initial build script
- Factored out some common db utils
- Added appEnvironment to ml-utils
- DB service now queues transactions, to avoid stale data
- pages are indexed by url and indicate their locales
- yarn build-index runs a full build with a db dump in public/full-db.json
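
The note about queueing transactions to avoid stale data can be illustrated with a small sketch. One common way to get this effect is to chain each transaction onto the promise of the previous one, so a read-modify-write never interleaves with another. This is an assumption about the general technique, not the code in this PR; `TransactionQueue`, `addLocale`, and the in-memory `store` are hypothetical names.

```typescript
// Hypothetical sketch: TransactionQueue is an illustrative name, not a class from this PR.
class TransactionQueue {
  // Tail of the chain. It never rejects, so later tasks always get to run.
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    // Start the task only after everything queued before it has settled.
    const result = this.tail.then(task);
    // Swallow errors on the chain itself; the caller still sees them via `result`.
    this.tail = result.catch(() => undefined);
    return result;
  }
}

// Usage: two read-modify-write updates to the same in-memory record.
const store = new Map<string, string[]>();
const queue = new TransactionQueue();

function addLocale(url: string, locale: string): Promise<void> {
  return queue.enqueue(async () => {
    const current = store.get(url) ?? [];
    // Simulate an asynchronous DB round trip between the read and the write.
    await new Promise((resolve) => setTimeout(resolve, 5));
    store.set(url, [...current, locale]);
  });
}
```

Calling `addLocale("about", "en")` and `addLocale("about", "he")` back to back keeps both locales. Without the queue, both calls would read the empty record before either wrote, and one update would be lost.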
Owner

@tomerlichtash tomerlichtash left a comment

First impression is wow.

Lots of stuff; will probably need some pilot/navigator session to understand the entire flow.
However, I did not finish reading the entire thing since I couldn't get it to run properly.

Please add some docs on how to run it, and I'll take it for another spin from there.

@@ -4,6 +4,16 @@
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
Owner

Need some docs on how to run

Collaborator Author

Nothing to document. If you have a reason to run the DB application in debug mode, use the DB App launch config; otherwise, don't...

@@ -32,13 +32,18 @@
"clean": "rm -rf .next",
"dev": "next dev",
"build": "next build",
"prebuild-index": "bash scripts/run-indexer.sh -P 20123",
Owner

Need some docs on how to run

Collaborator Author

@imdfl imdfl Mar 11, 2023

AFAIK, pre- and post- tasks are run automatically. yarn build-index will run prebuild-index before and postbuild-index after.
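
The pre/post convention described here can be modelled in a few lines. The sketch below is illustrative only: `resolveRunOrder` is a hypothetical helper, not part of any package manager, and the `build-index` and `postbuild-index` commands are placeholders because the diff only shows `prebuild-index`. npm and Yarn 1 apply this convention to user-defined scripts; newer Yarn releases reportedly do not, so it is worth checking against the Yarn version in use.

```typescript
type Scripts = Record<string, string>;

// Returns the script names a runner with pre/post hooks would execute, in order.
// Hypothetical helper for illustration; it is not an npm or Yarn API.
function resolveRunOrder(scripts: Scripts, name: string): string[] {
  const has = (key: string): boolean =>
    Object.prototype.hasOwnProperty.call(scripts, key);
  if (!has(name)) {
    throw new Error(`Unknown script: ${name}`);
  }
  // Hooks are optional: only the ones actually defined are included.
  return [`pre${name}`, name, `post${name}`].filter(has);
}

const scripts: Scripts = {
  "prebuild-index": "bash scripts/run-indexer.sh -P 20123", // from the diff
  "build-index": "<placeholder: actual command not shown in the diff>",
  "postbuild-index": "<placeholder: actual command not shown in the diff>",
  dev: "next dev",
};

const buildOrder = resolveRunOrder(scripts, "build-index");
const devOrder = resolveRunOrder(scripts, "dev");
```

Here `buildOrder` lists `prebuild-index`, `build-index`, `postbuild-index` in that order, while `devOrder` contains only `dev` because no hooks are defined for it.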

@@ -0,0 +1 @@
{"schemas":[{"articles":{"schema":{"version":0,"title":"Schema for ML Articles","keyCompression":false,"primaryKey":"url","type":"object","properties":{"url":{"type":"string","maxLength":100},"labels":{"type":"array","items":{"type":"string"}},"startDate":{"type":"number"},"endDate":{"type":"number"},"locales":{"type":"array","items":{"type":"string"}}},"required":["url","labels","locales","startDate"]}}}],"data":{"name":"ml","instanceToken":"morvycskkp","collections":[{"name":"articles","schemaHash":"dyk9bi","docs":[{"url":"about","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718891.07},"_deleted":false},{"url":"contribute","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718931.02},"_deleted":false},{"url":"docs","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718594.01},"_deleted":false},{"url":"docs/the-story-of-mel","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718600.01},"_deleted":false},{"url":"docs/the-story-of-mel/codex","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718642.07},"_deleted":false},{"url":"docs/the-story-of-mel/pages/blackjack-writeup","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134716878.01},"_deleted":false},{"url":"docs/the-story-of-mel/pages/mels-hack-the-missing-bits","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134717242.01},"_deleted":false},{"url":"docs/the-story-of-mel/pages/preface","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134717372.02},"_deleted":false},{"url":"docs/the-story-of-mel/pages/resources","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134717399.06},"_deleted":false},{"url":"glossary","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718608.05},"_deleted":false},{"url":"glossary/addressing-scheme","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718640.01},"_deleted":false},{"url":"glossary/assembly-language","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718645.07},"_deleted":false},{"url":"glossary/bit","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718649.07},"_deleted":false},{"url":"glossary/compiler","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718654.07},"_deleted":false},{"url":"glossary/drum-memory","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718666.01},"_deleted":false},{"url":"glossary/fortran","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718670.06},"_deleted":false},{"url":"glossary/friden-flexowriter","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718682.07},"_deleted":false},{"url":"glossary/goto","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718686.07},"_deleted":false},{"url":"glossary/hexadecimal","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718691.07},"_deleted":false},{"url":"glossary/infinite-loop","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718695.01},"_deleted":false},{"url":"glossary/jump-instruction","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718701.07},"_deleted":false},{"url":"glossary/lgp-30","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718705.07},"_deleted":false},{"url":"glossary/loop","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718709.07},"_deleted":false},{"url":"glossary/machine-code","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718713.07},"_deleted":false},{"url":"glossary/magnetic-core-memory","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718722.07},"_deleted":false},{"url":"glossary/operand","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718728.02},"_deleted":false},{"url":"glossary/operation-code","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718734.07},"_deleted":false},{"url":"glossary/optimal-code","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718739.01},"_deleted":false},{"url":"glossary/optimum","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718743.01},"_deleted":false},{"url":"glossary/pascal","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718748.03},"_deleted":false},{"url":"glossary/pessimum","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718762.02},"_deleted":false},{"url":"glossary/port","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718772.04},"_deleted":false},{"url":"glossary/ratfor","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718776.07},"_deleted":false},{"url":"glossary/real-programmer","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718785.01},"_deleted":false},{"url":"glossary/register","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718788.07},"_deleted":false},{"url":"glossary/rpc-4000","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718804.02},"_deleted":false},{"url":"glossary/terminating-condition","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718809.01},"_deleted":false},{"url":"glossary/test-terminating-condition","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718817.07},"_deleted":false},{"url":"glossary/time-delay-loop","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718821.01},"_deleted":false},{"url":"glossary/top-down-design","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718823.07},"_deleted":false},{"url":"glossary/vacuum-tube","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718836.07},"_deleted":false},{"url":"posts/07-04-21-hebrew-edition","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718954.06},"_deleted":false},{"url":"posts/14-01-21-here-we-go","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718958.02},"_deleted":false},{"url":"posts/21-05-22-project-launch","labels":[],"locales":["en","he"],"_meta":{"lwt":1678134718961.07},"_deleted":false}]}]}}
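
For readers consuming this dump, the document shape can be written down as a type and queried by locale. This is a rough sketch based only on the schema and docs visible above: the `ArticleDoc` interface and the `urlsForLocale` helper are hypothetical names, and `startDate` is marked optional because the dumped docs do not carry it even though the schema lists it as required.

```typescript
// Shape inferred from the "articles" schema in the dump; the interface name is an assumption.
interface ArticleDoc {
  url: string; // primary key
  labels: string[];
  locales: string[];
  startDate?: number; // required by the schema, absent from the dumped docs
  endDate?: number;
  _deleted?: boolean;
}

// Hypothetical helper: URLs of non-deleted articles available in the given locale.
function urlsForLocale(docs: ArticleDoc[], locale: string): string[] {
  return docs
    .filter((doc) => !doc._deleted && doc.locales.includes(locale))
    .map((doc) => doc.url);
}

// The first two entries are copied from the dump; the third is made up for illustration.
const sampleDocs: ArticleDoc[] = [
  { url: "about", labels: [], locales: ["en", "he"], _deleted: false },
  { url: "glossary/bit", labels: [], locales: ["en", "he"], _deleted: false },
  { url: "example/en-only", labels: [], locales: ["en"], _deleted: false },
];

const hebrewUrls = urlsForLocale(sampleDocs, "he");
const englishUrls = urlsForLocale(sampleDocs, "en");
```

With these samples, `hebrewUrls` holds `about` and `glossary/bit`, while `englishUrls` also includes the made-up English-only entry.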
Owner

NICE!

@@ -0,0 +1,62 @@
#!/bin/bash
Owner

Need docs on how to run

Collaborator Author

(see below)

@@ -0,0 +1,37 @@
#!/bin/bash
Owner

Same, need docs on how to run

Collaborator Author

@imdfl imdfl Mar 11, 2023

@tomerlichtash
(also relevant to other script(s))

  1. You don't need to run the scripts; they're not standalone. You only run yarn build-index (or yarn dev-index).
  2. The scripts are documented and print a usage message when run with no params: bash <script>.

@@ -0,0 +1,11 @@
{
Owner

I'm not sure we can have two package.jsons like that. This calls for either a monorepo, or an external package.

Collaborator Author

@imdfl imdfl Mar 11, 2023

"Can have" in what sense? Obviously we'll refactor on some level once the system works, and hopefully a monorepo will do a better job than scattered package.jsons. Do you see a show-stopper hurdle?

@imdfl
Collaborator Author

imdfl commented Mar 10, 2023

@tomerlichtash, you can now run a build with indexing on this branch.
After running yarn, you should be able to build the index with yarn build-index.


2 participants