Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Data Sources #289

Merged
merged 2 commits into from
Apr 4, 2024
Merged

Implement Data Sources #289

merged 2 commits into from
Apr 4, 2024

Conversation

nagakingg
Copy link
Collaborator

@nagakingg nagakingg commented Jan 16, 2024

Description

Addresses #287 -- Implements DataSource, SimAsset, TimeSequence

New classes:

  • Add DataSource, SimAsset, and TimeSequence template classes
  • Add SimAssets: OnChainAsset, OnChainAssetPair
  • Add TimeSequence: DateTimeSequence
  • Add DataSources: CoinGeckoPriceVolumeSource, CsvDataSource

Changes to incorporate new classes:

  • Added get_asset_data() and get_pool_data() convenience functions to pipelines.common
  • Added pool_data.get_pool_assets()
  • Modified interfaces to pipeline functions, get_price_data(), get_pool_volume() for consistency with new class interfaces
  • Moved price/volume data retrieval outside of PriceVolume iterator

Removed because no longer needed:

  • Removed SimAssets type and SimPool.assets property
  • Removed coin_names property from PricingMetrics
  • Removed Coingecko pool_prices and coin_ids_from addresses

See changelog for more details.

Fixes to CoinGecko Data (!):

  • Changed Coingecko price data resampling to hourly samples with 10 minute tolerance. Previously, we sampled at the 30-minute mark between hours to capture late timestamps (e.g. resample to 1:30, 2:30, 3:30 to capture data timestamped at 1:02, 2:17, & 3:06). Coingecko's data timing has become more reliable, so a 10-minute tolerance locked to the hour is more accurate. Note that this breaks the end-to-end tests for this PR, since the resampling of the data is altered
  • Changed end-to-end tests to use more recent data with more reliable timestamps
  • Fixed a bug where volume units were converted incorrectly for Coingecko data. Volumes were assumed to be in units of the queried asset, but I verified that they were actually in dollars. This bug was introduced by me in commit df79810. This has not affected any Curve optimizations since we have always used Kaiko or other custom data since then, but it may have affected some users.

Hygiene checklist

  • Changelog entry
  • Everything public has a Numpy-style docstring
    (modules, public functions, classes, and public methods)
  • Commit history is cleaned-up with minor changes squashed together
    and descriptive commit messages following Tim Pope's style

Cute Animal Picture

Put a link to a cute animal picture inside the parenthesis-->

Copy link
Member

@chanhosuh chanhosuh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall things are looking very nice. I think there could be more things done on the csv data source, but those can be saved for future PRs.

curvesim/templates/sim_asset.py Show resolved Hide resolved
curvesim/price_data/data_sources/coingecko.py Outdated Show resolved Hide resolved
curvesim/templates/data_source.py Show resolved Hide resolved
- Add DataSource, SimAsset, and TimeSequence template classes
- Add SimAssets: OnChainAsset, OnChainAssetPair
- Add TimeSequence: DateTimeSequence
- Add DataSources: CoinGeckoPriceVolumeSource, CsvDataSource
- Add get_asset_data and get_pool_data pipeline convenience functions
- Add pool_data.get_pool_assets()
- Incorporate new template classes into relevant function signatures
- Make CoinGecko data processing more robust
- Fix error in unit conversion for CoinGecko volume data
@nagakingg
Copy link
Collaborator Author

nagakingg commented Feb 26, 2024

Here is a results comparison using the simple pipeline. Current main (_old.html) vs this PR (_new.html).

Everything looks nearly the same except for triCRV annualized returns (in crvUSD). Examining the pool value time-series, this is just due to the old version including an initial timestamp with significantly lower prices. In other words, this looks like just an artifact of the precise sample timing being slightly different.

The newer timestamp generation (using TimeSequence) is a bit more refined than the old one, so I see no need to try to replicate the old test results explicitly.

results_comparison.zip

@allt0ld allt0ld dismissed chanhosuh’s stale review April 4, 2024 18:46

Naga has addressed the feedback

@allt0ld allt0ld merged commit 8363194 into main Apr 4, 2024
4 of 15 checks passed
@allt0ld allt0ld deleted the data-sources branch April 4, 2024 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants