March 2, 2023
alejandropaz edited this page Mar 2, 2023 · 2 revisions
- check logs to see which domains are giving us the rejections
- slow down restarted crawl
- Postprocessor
- look into email function
- add Irfan's key-pair and Alejandro's
- Alejandro: change password
- add more logging of errors to postprocessor
- Alejandro come up with domain crawl scope
- Alejandro: write to security person about adding Shawn to Graham
- did changing password do anything?
- can't see dashboard but can ssh into Graham
- Irfan and Alejandro added to Arbutus
- issues: tabletmag (403) is causing problems; occasionally electronicintifada too (though it doesn't give a 403)
- aljazeera.com: timing out (page not loading)
- slowed down: 1 url per minute on average, seems to be better
- email issue: authentication tokens needed to send email
- seems to know the reason: tokens get invalidated somehow
- more complicated than realized
- tried different way of combining crawl output; new error
- issue seems to be bigger file
- more logging of errors: revisit once we get it working
- will try and contact Shengsong
- add Irfan and Alejandro key pairs to Graham
- Delete older key pairs from earlier devs
- Alejandro: Excel sheet that Shawn sent
- remove tabletmag from small domain crawl, and update index to reflect and make separate pathway in storage
- contact Shengsong about preparing Twitter crawl for postprocessing
- set up Israeli newspaper crawl
- email issue for domain crawl break.
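
For the log check noted above (which domains are giving us the rejections), a minimal sketch of the tally. It assumes the crawl log can already be reduced to (url, status) pairs; the function name `count_rejections`, the record shape, and the sample URLs are all hypothetical and not taken from the crawler itself.

```python
from collections import Counter
from urllib.parse import urlparse


def count_rejections(records, rejected_statuses=(403,)):
    """Tally rejected responses per domain.

    `records` is assumed to be an iterable of (url, status) pairs;
    the real log format may differ and would need its own parser.
    """
    counts = Counter()
    for url, status in records:
        if status in rejected_statuses:
            # netloc is the host part of the URL, e.g. "site-a.example"
            counts[urlparse(url).netloc] += 1
    return counts


# Made-up records, for illustration only.
sample = [
    ("https://site-a.example/page1", 403),
    ("https://site-a.example/page2", 403),
    ("https://site-b.example/page1", 200),
    ("https://site-c.example/page1", 403),
]

for domain, n in count_rejections(sample).most_common():
    print(domain, n)
```

Passing `rejected_statuses=(403, 429)` would also count rate-limit responses, if those turn out to matter.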