Track in stats which fields from Zyte API automatic extraction are not overridden #202

Merged (11 commits) on Jul 25, 2024


Gallaecio
Contributor

Requested by @VMRuiz.

This adds stats that indicate which fields of items obtained through dependency injection come from Zyte API automatic extraction.

When directly requesting an item type, e.g. Product, the entire list of fields is tracked in stats:

 'scrapy-zyte-api/auto_fields/zyte_common_items.items.product.Product': 'additionalProperties '
                                                                        'aggregateRating '
                                                                        'availability '
                                                                        'brand '
                                                                        'breadcrumbs '
                                                                        'canonicalUrl '
                                                                        'color '
                                                                        'currency '
                                                                        'currencyRaw '
                                                                        'description '
                                                                        'descriptionHtml '
                                                                        'features '
                                                                        'gtin '
                                                                        'images '
                                                                        'mainImage '
                                                                        'metadata '
                                                                        'mpn '
                                                                        'name '
                                                                        'price '
                                                                        'productId '
                                                                        'regularPrice '
                                                                        'size '
                                                                        'sku '
                                                                        'style '
                                                                        'url '
                                                                        'variants',
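As the output above suggests, each stat value is a single space-separated string of field names, keyed by the `scrapy-zyte-api/auto_fields/` prefix plus an import path. A minimal sketch of reading such stats back into sets, assuming you already have the stats as a plain dict (for example from Scrapy's `crawler.stats.get_stats()`); the helper name `parse_auto_fields` and the shortened sample data are hypothetical, not part of scrapy-zyte-api:

```python
# Hypothetical helper for illustration; not part of scrapy-zyte-api.
AUTO_FIELDS_PREFIX = "scrapy-zyte-api/auto_fields/"


def parse_auto_fields(stats):
    """Map each tracked import path to the set of its auto-extracted fields."""
    result = {}
    for key, value in stats.items():
        if isinstance(key, str) and key.startswith(AUTO_FIELDS_PREFIX):
            import_path = key[len(AUTO_FIELDS_PREFIX):]
            # The stat value is a space-separated string of field names.
            result[import_path] = set(str(value).split())
    return result


# Shortened sample in the shape shown above (assumed, not real crawl output).
sample_stats = {
    "scrapy-zyte-api/auto_fields/zyte_common_items.items.product.Product": (
        "additionalProperties aggregateRating availability"
    ),
    "downloader/request_count": 3,
}

auto_fields = parse_auto_fields(sample_stats)
print(auto_fields)
```

Unrelated stats such as `downloader/request_count` are ignored because they lack the prefix.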

When using a page object that overrides some fields, the stat key names that page object instead of the item type, and the overridden fields are left out of the value, e.g. aggregateRating here:

 'scrapy-zyte-api/auto_fields/zyte_spider_templates_project.pages.books_toscrape_com.BooksToScrapeComProductPage': 'additionalProperties '
                                                                                                                   'availability '
                                                                                                                   'brand '
                                                                                                                   'breadcrumbs '
                                                                                                                   'canonicalUrl '
                                                                                                                   'color '
                                                                                                                   'currency '
                                                                                                                   'currencyRaw '
                                                                                                                   'description '
                                                                                                                   'descriptionHtml '
                                                                                                                   'features '
                                                                                                                   'gtin '
                                                                                                                   'images '
                                                                                                                   'mainImage '
                                                                                                                   'metadata '
                                                                                                                   'mpn '
                                                                                                                   'name '
                                                                                                                   'price '
                                                                                                                   'productId '
                                                                                                                   'regularPrice '
                                                                                                                   'size '
                                                                                                                   'sku '
                                                                                                                   'style '
                                                                                                                   'url '
                                                                                                                   'variants',
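Since the stat lists only the fields that still come from automatic extraction, the overridden fields can be recovered by comparing it against the full field list of the item type. A minimal sketch using the two values shown in this description; the function name `overridden_fields` is hypothetical and this is not how the plugin computes the stat internally:

```python
# Field list reported for Product when it is requested directly (see above).
ALL_PRODUCT_FIELDS = (
    "additionalProperties aggregateRating availability brand breadcrumbs "
    "canonicalUrl color currency currencyRaw description descriptionHtml "
    "features gtin images mainImage metadata mpn name price productId "
    "regularPrice size sku style url variants"
)

# Field list reported for BooksToScrapeComProductPage (see above).
PAGE_AUTO_FIELDS = (
    "additionalProperties availability brand breadcrumbs "
    "canonicalUrl color currency currencyRaw description descriptionHtml "
    "features gtin images mainImage metadata mpn name price productId "
    "regularPrice size sku style url variants"
)


def overridden_fields(all_fields, auto_fields):
    """Return the fields in all_fields missing from auto_fields, in order."""
    auto = set(auto_fields.split())
    return [name for name in all_fields.split() if name not in auto]


overridden = overridden_fields(ALL_PRODUCT_FIELDS, PAGE_AUTO_FIELDS)
print(overridden)  # ['aggregateRating']
```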

@Gallaecio Gallaecio requested review from kmike and wRAR June 11, 2024 19:23

codecov bot commented Jun 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.67%. Comparing base (cd0aead) to head (c64d781).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #202      +/-   ##
==========================================
+ Coverage   97.62%   97.67%   +0.04%     
==========================================
  Files          14       14              
  Lines        1517     1546      +29     
  Branches      320      327       +7     
==========================================
+ Hits         1481     1510      +29     
  Misses         15       15              
  Partials       21       21              
Files Coverage Δ
scrapy_zyte_api/providers.py 94.15% <100.00%> (+1.19%) ⬆️

@Gallaecio
Contributor Author

@asadurski All good on your end? Shall we merge and release?

@asadurski

> @asadurski All good on your end? Shall we merge and release?

Yes, this has all the things I need and looks great!

@Gallaecio Gallaecio merged commit 055a5a6 into scrapy-plugins:main Jul 25, 2024
19 checks passed
4 participants