Skip to content

A webpage which searches through the most popular manga readers in Brazil to find the most updated one for a specific manga.

License

Notifications You must be signed in to change notification settings

Potato-29/manga-br-search-engine

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

89 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Manga BR Search Engine


An webapp which searches through the most popular manga, anime, novel and webtoon readers in PT-BR/EN-US to find the most updated one for a specific series.

Made by Eduardo Henrique (BACKEND) and Rosialdo Vidinho (FRONTEND). Documentation and project management support by Guilherme Bernardo;


Table of Contents


Goals and Limitations

✅ means done.

🚧 means doing.

❌ means won't do.

 

[Project Related]

  • 🚧 Migrating from JS to TS.
  • 🚧 Adopt micro commits strategy.
    • Commit for atomic changes so that errors and bugs can be resolved faster.
  • 🚧 Integrate backend and frontend.
  • 🚧 Separate app in 3 Docker containers: one for the DB, one for the API and one for the crawlers.
  • 🚧 Creating an RESTful API using Express so the user can read from DB without having access to it.

 

[Proxy Related]

  • ✅ Use proxies.
  • ✅ Creating a proxy pool with auto renew.

 

[Crawler Related]

  • 🚧 Create crawlers for the following sites:
  • ✅ Send multiple requests at once.
  • ✅ Using Puppeteer for JS Rendering.
  • ✅ Created list of relevant sites to use.
  • 🚧 Change crawler structure to adopt crawlee.
  • 🚧 Creating crawlers for a lot of sites.

 

[DB Related]

  • ✅ Created DB (MongoDB).
  • 🚧 Create backup DB.
  • 🚧 Update DB on demand.

 

[Frontend Related]

  • 🚧 Create search bar to take user input.
  • 🚧 Creating frontend for the site using React.
  • 🚧 Create responsive dropdown menu for series type selection.

 

[Limitations]

  • ❌ Crawl websites with Cloudflare anti-bot features.
    • Sites such as mangalivre won't be crawled for the time being.
  • ❌ Read series through site.
    • It will only redirect the user to a site with said series.
  • ❌ Update DB daily.
    • We are deciding if this is viable right now, but as it is, it will hinder our progress, so we will postpone this feature.
    • Maybe this will be possible since we discovered a way to request multiple pages at once.

Problems Encountered

Problems

  1. Updating DB daily;
  2. Updating DB without reconstructing it from scratch;

Possible Solutions:

  1. Make the update process faster and less resource-heavy;
  2. Only update series that changed values;

Solved Problems

  1. Only update DB after going through every series in a site;
  2. Create proxy pool with auto renew;

Solutions:

  1. Update for every visited page in the site;
  2. Use Webshare builtin proxy renewing tool;

Demo

WIP 😎


Translations

portuguese translation

About

A webpage which searches through the most popular manga readers in Brazil to find the most updated one for a specific manga.

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases

No releases published

Sponsor this project

Packages

No packages published