Releases: NYPL/drb-etl-pipeline

v0.14.5

14 Jan 20:30
bd3c774

bd3c774 - SFR-2478: Make MUSE urls constant (#519) (Kyle Villegas, 2025-01-14)
3f7b7cf - NO-REF: set secret env param only if it's not null (#521) (Kyle Villegas, 2025-01-14)
6f4bbf7 - SFR-2478: Make Hathi urls constant (#520) (Kyle Villegas, 2025-01-14)
1be0f7e - SFR-2478: Make doab base url constant (#517) (Kyle Villegas, 2025-01-14)
88c95d4 - SFR-2478: Make webpub pdf profile a constant (#516) (Kyle Villegas, 2025-01-14)
620adc1 - removing tugboat qa config (#518) (Kyle Villegas, 2025-01-14)
d76670f - SFR-2478: Setting github api url as a constant (#515) (Kyle Villegas, 2025-01-14)
b85d0bf - SFR-2478: Setting WEBPUB_CONVERSION_URL as a constant (#514) (Kyle Villegas, 2025-01-14)
9247841 - SFR-2478: Remove frontend-ci config (#513) (Kyle Villegas, 2025-01-14)
b89b19b - SFR-2472: Load secrets when loading env (#508) (Kyle Villegas, 2025-01-13)
f9f31ea - NO-REF: Reverting actions/cache to v2 (#512) (Kyle Villegas, 2025-01-13)
89ad41f - NO-REF: Fix CI workflow call (#511) (Kyle Villegas, 2025-01-13)
88b66eb - SFR-2741: Run CI via Github Actions (#505) (Kyle Villegas, 2025-01-13)
30ddc98 - SFR-2453: CLACSO Ingest Process (#507) (Dmitri Slory, 2025-01-10)
93c3d4d - SFR-2440/search_sort_desc_date (#500) (Shejanul Ayan Islam, 2025-01-08)
2df19a2 - SFR-2425: Fixing OCLC catalog response parsing (#503) (Kyle Villegas, 2025-01-08)
e06e981 - SFR-2426: Increasing OCLC query catalog timeout to 5 seconds (#502) (Kyle Villegas, 2025-01-08)
628ae35 - SFR-2430: Set flags based on limited access permissions (#494) (Lyndsey M., 2025-01-08)
281f11e - SFR-2452: CLACSO Mapping (#501) (Dmitri Slory, 2025-01-08)
a240de4 - SFR-2438: Fix Fulfill URL Manifest Process (#495) (Kyle Villegas, 2025-01-07)
1da63e6 - NO-REF: Refactoring main.py (#497) (Kyle Villegas, 2025-01-07)
5838e74 - SFR-2445: Aggregate Download Analytics (#499) (Dmitri Slory, 2025-01-06)
8be6442 - NO-REF: Remove the development config (#498) (Kyle Villegas, 2025-01-06)
4488bfc - SFR-2444: Updating analytics script to use new S3 log bucket paths (#496) (Kyle Villegas, 2025-01-06)
2973981 - SFR-2419: Deprecate singular noun endpoints (#493) (Jackie Quach, 2025-01-02)
b9c2b63 - [SFR-2434] Simplifying delete record implementation (#491) (Kyle Villegas, 2024-12-24)
3814252 - SFR-2390: Download files from Drive and upload them to S3 as part of publishers project process (#492) (Lyndsey M., 2024-12-24)
624c229 - Refactor Drive service into a class (#489) (Lyndsey M., 2024-12-24)
a77628e - [SFR-2433] Adding record id to items table (#490) (Kyle Villegas, 2024-12-23)

v0.14.4

23 Dec 16:32
5a2b372

What's Changed

Full Changelog: v0.14.3...v0.14.4

v0.14.3

17 Dec 16:06
300ab1b

What's Changed

Full Changelog: v0.14.2...v0.14.3

v0.14.2

15 Nov 15:02
d7b19f1

d7b19f1 - NO-REF: Fix and simplify classify record updates (#447) (Kyle Villegas, 2024-11-14)
061d046 - NO-REF: Removing RabbitMQManager as a CoreProcess ancestor (#444) (Kyle Villegas, 2024-11-13)
a2f3991 - NO-REF: Remove RedisManager as an ancestor of CoreProcess (#443) (Kyle Villegas, 2024-11-13)
2561911 - NO-REF: Modify fulfill test to snake case (#442) (Dmitri Slory, 2024-11-12)
2a88966 - NO-REF: Rename base mapping classes (#440) (Kyle Villegas, 2024-11-12)
e5b5214 - NO-REF: Fix local dev setup process (#439) (Kyle Villegas, 2024-11-12)
9f255f8 - NO-REF: Rename Oclc catalog manager dep (#438) (Kyle Villegas, 2024-11-12)
4088e06 - NO-REF: Removing redundant comments (#441) (Kyle Villegas, 2024-11-12)
6493b4a - NO-REF: add error handling error process init (#437) (Kyle Villegas, 2024-11-08)
af95c3d - NO-REF: Remove Elastic Search Manager Ancestor (#435) (Kyle Villegas, 2024-11-07)
c92de8f - SFR-2099: Edge Case API Tests for get/collection (#436) (Shejanul Ayan Islam, 2024-11-07)
06d63b2 - NO-REF: Remove kubernetes manifests (#429) (Kyle Villegas, 2024-11-06)
454b342 - NO-REF: Refactor NYPL Process (#433) (Kyle Villegas, 2024-11-06)
b5381ce - NO-REF: Fix CI (#434) (Kyle Villegas, 2024-11-06)
44ce645 - NO-REF: Updating release notes (#425) (Kyle Villegas, 2024-11-06)
4a5e3f3 - NO-REF: Refactoring GH Workflows (#430) (Kyle Villegas, 2024-11-01)
3dfa803 - NO-REF: Only run unit tests if requirements or source code changes (#428) (Kyle Villegas, 2024-10-31)
17ae067 - NO-REF: Pin werkzeug to 2.2.2 (#427) (Kyle Villegas, 2024-10-31)
68587d5 - NO-REF: Refactoring api integration tests (kyle, 2024-10-31)

v0.14.1

31 Oct 17:16
ee85835

ee85835 - SFR-2280: Limit Hathi Records Ingested (#414) (Dmitri Slory, 2024-10-31)
e425444 - SFR-2294: Only cluster records that have a title (#424) (Kyle Villegas, 2024-10-30)
1a28eae - SFR-2284: Refactoring classify process (#421) (Kyle Villegas, 2024-10-30)
8f62634 - SFR-2291: Throw error if we fail to generate webpub (#422) (Kyle Villegas, 2024-10-30)
66faee2 - SFR-2292: Skip clustering records with no title (#423) (Kyle Villegas, 2024-10-30)
d1b9e2e - SFR-2265: Refactored Chicago ISAC mapping and processes (#411) (Dmitri Slory, 2024-10-29)
97a2c83 - SFR-2285: Fix has part item build (#420) (Kyle Villegas, 2024-10-28)
138411f - SFR-2283: Refactoring catalog process (#418) (Kyle Villegas, 2024-10-28)
3da2045 - SFR-2289: Reorganizing process file/folder structure (#417) (Kyle Villegas, 2024-10-28)
d2778ae - SFR-2285: Fix duplicate works (#416) (Kyle Villegas, 2024-10-25)
1e23df6 - SFR-2284: Improve Hathi Trust logging (#413) (Kyle Villegas, 2024-10-24)
3ca601b - SFR-2278: Improving cluster process logging (#412) (Kyle Villegas, 2024-10-23)
2ec6c38 - Classify logging improvements (#410) (Lyndsey M., 2024-10-23)
3d98887 - SFR-2262: Refactoring cluster process (#405) (Kyle Villegas, 2024-10-22)
5396107 - SFR-2277: Renaming process files and refiling ingest processes (#408) (Kyle Villegas, 2024-10-22)
2f72220 - SFR-2276: Deprecating ingest report (#407) (Kyle Villegas, 2024-10-22)
794ab52 - SFR-2267: Fixing duplicate work bug (#406) (Kyle Villegas, 2024-10-21)
2386334 - SFR-2256: Refactoring API Process (#401) (Kyle Villegas, 2024-10-21)
7bd2981 - SFR-2260: Refactoring DB Maintenance Process (#403) (Kyle Villegas, 2024-10-18)
e086f24 - SFR-2261: Refactoring db migration process (#404) (Kyle Villegas, 2024-10-18)
2e93c41 - NO-REF: Separating dev setup process and seed data process (#402) (Kyle Villegas, 2024-10-17)
2f35903 - SFR-2188: Removed Metrics_Type Column + Updated File Names (#400) (Fatima Rahman, 2024-10-16)
49c98e4 - SFR-2249: Refactor and clean up s3 file process (#399) (Kyle Villegas, 2024-10-16)
fc1ada9 - NO-REF: Improving report scalability (#366) (Kyle Villegas, 2024-10-15)
c081a15 - SFR-2220: Adding MET process ingestion count log (#398) (Kyle Villegas, 2024-10-15)
7dd7f35 - Changed deep copy to shallow copy of WorkIdentifiers array (#397) (Dmitri Slory, 2024-10-15)
eb3f816 - SFR-2240: Cleaning up NYPL ingestion process (#395) (Kyle Villegas, 2024-10-10)
8f54078 - SFR-2245: Refactor dev setup process and fix infinite cluster loop (#396) (Kyle Villegas, 2024-10-10)
2beac30 - SFR-2141: Delete Duplicate Work Identifiers (#356) (Dmitri Slory, 2024-10-09)
58dfe75 - SFR-2234 SFR-2235: Fix bugs in DOAB ingestion process (#394) (Kyle Villegas, 2024-10-07)
daf7bba - SFR-2216: Fixing NYPL ingest process locally (#393) (Kyle Villegas, 2024-10-07)
4d7a0aa - SFR-2216: Fixing LOC process ingestion (#392) (Kyle Villegas, 2024-10-07)
596efd4 - SFR-2216: Adding Gutenberg logging (#391) (Kyle Villegas, 2024-10-07)
a74a948 - SFR-2216: Improving DOAB logging and error handling (#390) (Kyle Villegas, 2024-10-07)
9d6508f - SFR-2216: Adding ingest limit and logging to HathiTrust ingest (#389) (Kyle Villegas, 2024-10-07)
89eef18 - SFR-2192: Adding ingest limit for MUSE process (#388) (Kyle Villegas, 2024-10-07)
7e8f279 - SFR-2214: get work api test (#387) (Shejanul Ayan Islam, 2024-10-03)
ae9a953 - SFR-2217 Fixing MUSE Mapping (#385) (Kyle Villegas, 2024-10-03)
edfb351 - SFR-2219: Improving cluster error handling and logging (#386) (Kyle Villegas, 2024-10-03)
52b34ba - NO-REF: Using exception for logging API errors (#381) (Kyle Villegas, 2024-10-02)
80b8b29 - SFR-2180: Get link for a single id (#383) (Shejanul Ayan Islam, 2024-10-01)
3466a2c - SFR-2181: get edition id (#384) (Shejanul Ayan Islam, 2024-10-01)
1192f48 - SFR-2052: Replace OCLC Worldcat API v1 calls with v2 calls (#382) (Lyndsey M., 2024-10-01)
b843107 - NO-REF: Upgrading Docker ElasticSearch Container (#380) (Kyle Villegas, 2024-10-01)
3de28bf - NO-REF: Changing print statements to New Relic logging (#376) (Kyle Villegas, 2024-09-27)
9014cab - NO-REF: Remove changelog (#378) (Kyle Villegas, 2024-09-27)
c94b083 - SFR-2187: Naming convention and column ordering updates to Counter 5 reporting (#377) (Fatima Rahman, 2024-09-27)
67b7924 - SFR-2105: Deprecate OCLC Classify Manager (#375) (Lyndsey M., 2024-09-27)

v0.14.0

23 Sep 18:41
cda3d27

Added

  • Added auxiliary functions to build queries for OCLC search endpoints
  • Removed aggregation result print statement
  • Created local.yaml file to setup environment variables when running processes locally
  • Implemented OCLC other editions call
  • Added functionality for locally generating Counter 5 downloads reports to analytics folder
  • Readded enter and exit functions to API DB client
  • Updated DevelopmentSetUpProcess with database migration method
  • Refactored info API
  • Refactored links API and added error handling
  • Refactored works and editions APIs and added error handling
  • Generalized data aggregation within analytics folder
  • Updated README release steps
  • Added local S3 docker container via localstack
  • Added error handling to citation API
  • Implemented Counter 5 reporting for view counts
  • Added error handling to GET collection endpoints
  • Added error handling for utils API
  • Added error handling to search API
  • Refactored analytics report code
  • Implemented country-level analytics report
  • Implemented total-usage analytics report
  • Added error handling to fulfill API
  • Added uuid API validation
  • Implemented OCLC Classify Process v2
  • Added integration tests for UofM ingestion process
  • Moving up db migration in dev setup
  • Address Counter 5 report feedback from business analysts
  • Updated fulfill process to check rights status before updating manifest
  • Deleted fulfill script due to no longer being necessary
  • Updating report book ID
  • Adding OCLC query bibs call
  • Finalizing OCLC implementation
  • Implemented script to aggregate access logs
  • Switching over to classify record by metadata v2
  • Adding more specific logging and exception handling around OCLC manager errors
  • Upgrading flask-cors
  • Implemented REST API testing using pytest
  • Added search-a-collection test to rest_api tests
  • Improving error handling and logging to OCLC classify process

Fixed

  • Changed HATHI_DATAFILES outdated link in development, example, and local yaml files
  • Resolved errors when running the FulfillProcess for both daily and complete ingests
  • Fixed edition API ID param
  • Fixed usage type bug
  • Fixed OCLC bib author mapping
  • Fixed OCLC catalog query attempts bug

v0.13.1

06 Aug 19:22
6453fbf

2024-08-06 -- v0.13.1

Added

  • New analytics folder for University Press project code. Contains methodology for generating Counter 5 reports
  • New script to update current UofM manifests with fulfill endpoints to replace pdf/epub urls
  • Updated README with appendix and additions to available processes
  • New process to add fulfill urls to Limited Access manifests and update fulfill_limited_access flags to True
  • Updated README and added more information to installation steps
  • Added Rights status to UofM mapping and Rights conditionals to UofM process
  • Replaced usages of the deprecated datetime.utcnow() method
  • Added new field (publisher_project_source) to the records and items data models
  • Ran database migration to add publisher_project_source field to records and items tables
  • Filled out publisher_project_source field for UofM books
  • Added editionID validation to editions API
  • Updated README with steps on retrieving local-compose.yaml file and credentials
  • Added more logging to proxy API
  • Implement call to Worldcat to retrieve OCLC number
  • Refactor OCLC Catalog Manager
  • Added make integration command
  • Upgrade RabbitMQ Docker image to 3.13
  • Updated README with steps on running the processes locally
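
For context on the datetime.utcnow() item in the list above: Python 3.12 deprecated datetime.utcnow() in favor of timezone-aware datetimes. The sketch below shows the standard-library replacement. It is only an illustration, assuming the change followed that recommendation; the helper names utc_now and utc_now_naive are hypothetical and are not taken from the repository.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Return the current time as a timezone-aware UTC datetime."""
    # datetime.now(timezone.utc) is the documented replacement for
    # the deprecated datetime.utcnow().
    return datetime.now(timezone.utc)


def utc_now_naive() -> datetime:
    """Return the same instant with tzinfo stripped.

    Hypothetical helper for code paths that still store naive UTC
    timestamps (for example, a column without time zone support).
    """
    return utc_now().replace(tzinfo=None)
```

The naive variant may be useful where existing rows were written with utcnow(), so that old and new values remain directly comparable.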

Fixed

  • Resolved the format of fulfill endpoints in UofM manifests
  • Added additional logging to the editions endpoint to debug
  • Renamed Docker API container name to drb_local_api
  • Renamed hosts for services in sample-compose file from docker bucket names to localhost

v0.13.0

21 Mar 14:26
4526b29

Enables features allowing for "limited access" to DRB titles (e.g. titles are accessible behind NYPL login).
Small fixes made to some ingest processes.

v0.12.3

05 Sep 16:29
d14eab0
Compare
Choose a tag to compare

Fixes an issue where a high volume of wildcard searches created unrecoverable queues/timeouts.

v0.12.2

31 Aug 15:26

This release upgrades sqlalchemy to fix a production issue caused by a potential race condition in our current version of sqlalchemy.

It also provides some helpful scripts for a couple of our data ingests.