Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: initial implementation of the indexer #7

Merged
merged 31 commits into from
Sep 19, 2024
Merged

feat: initial implementation of the indexer #7

merged 31 commits into from
Sep 19, 2024

Conversation

bajtos
Copy link
Member

@bajtos bajtos commented Sep 4, 2024

While working on the initial implementation, I realized that the originally proposed design was a bit difficult to implement. I updated the design document to match the actual implementation.

Also, I decided to use Redis for persistence, in line with what @juliangruber did in spark-rewards.

Links:

Out of scope:

  • Sentry integration
  • REST API component

@bajtos bajtos requested a review from juliangruber September 4, 2024 10:22
@bajtos bajtos removed the request for review from juliangruber September 4, 2024 12:50
@bajtos bajtos changed the title feat: update our state from IPNI state feat: initial implementation of the indexer Sep 4, 2024
indexer/lib/typings.d.ts Outdated Show resolved Hide resolved
Signed-off-by: Miroslav Bajtoš <[email protected]>
@bajtos bajtos marked this pull request as ready for review September 16, 2024 08:22
@bajtos bajtos requested a review from juliangruber September 16, 2024 08:22
@bajtos
Copy link
Member Author

bajtos commented Sep 16, 2024

The CI is failing, I need to setup Redis in GitHub Actions.

Signed-off-by: Miroslav Bajtoš <[email protected]>
docs/design.md Show resolved Hide resolved
indexer/test/advertisement-walker.test.js Show resolved Hide resolved
indexer/bin/piece-indexer.js Outdated Show resolved Hide resolved
providerId,
getProviderInfo,
minStepIntervalInMs: 100
}).finally(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you worried about the maximum concurrency this can run in?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question! My current thinking is that there are several thousand storage providers now, plus maybe a few hundred non-Filecoin index providers. We should not be running more than 10k concurrent walkers. Walkers spend most of their time waiting for I/O (Redis, HTTP requests to index providers) or sleeping between steps.

I think that Node.js can easily handle such load. WDYT?

But! Maybe this is a sign that we need more visibility into this aspect. What I can do - but preferably in a follow-up pull request, is add an InfluxDB client and periodically write a data point with the number of concurrent walkers.

WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to let it fail, debug and then see that maximum concurrency is the issue. It would be something else if we knew yes 100% it is, but like this I think it's fine.

What we need monitoring for is whether the indexer is delayed, I believe as long as it's not, no other metric is important.

indexer/bin/piece-indexer.js Outdated Show resolved Hide resolved
indexer/lib/advertisement-walker.js Outdated Show resolved Hide resolved
indexer/lib/typings.d.ts Show resolved Hide resolved
indexer/lib/vendored/multiaddr.js Show resolved Hide resolved
bajtos and others added 5 commits September 18, 2024 14:42
Signed-off-by: Miroslav Bajtoš <[email protected]>
Signed-off-by: Miroslav Bajtoš <[email protected]>
Signed-off-by: Miroslav Bajtoš <[email protected]>
@bajtos bajtos requested a review from juliangruber September 18, 2024 13:56
Copy link
Member

@juliangruber juliangruber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For later: We should add monitoring for whether the indexer is behind or in sync with the chain. Eg track the age of the latest advertisement minus the age of the latest ingested advertisement

indexer/package.json Outdated Show resolved Hide resolved
Co-authored-by: Julian Gruber <[email protected]>
@bajtos
Copy link
Member Author

bajtos commented Sep 19, 2024

For later: We should add monitoring for whether the indexer is behind or in sync with the chain. Eg track the age of the latest advertisement minus the age of the latest ingested advertisement

Great idea!

I added a work item to space-meridian/roadmap#143

@bajtos bajtos merged commit ba4d4f1 into main Sep 19, 2024
7 checks passed
@bajtos bajtos deleted the indexer-impl branch September 19, 2024 06:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants