[WIP] Extractor database #35

Open
wants to merge 17 commits into
base: master

Conversation

PROxZIMA
Owner

@PROxZIMA PROxZIMA commented Feb 26, 2023

Description

Adds a database manager for the Crawler and Extractor modules. Neo4j is the most suitable candidate: it is fast, scalable, and can store large amounts of data as a graph, which makes the data easier to understand and manipulate as needed.
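
To illustrate the graph storage idea, here is a minimal sketch of how crawled links could be written to Neo4j. It is not the code in this PR: the `WebPage` label, the `LINKS_TO` relationship, and the function names are assumptions made for the example. The only driver behaviour it relies on is that a session exposes `run(query, parameters)`.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Hypothetical schema (an assumption, not taken from this PR):
#   (:WebPage {url})-[:LINKS_TO]->(:WebPage {url})
# MERGE creates a node or relationship only if it does not already exist,
# so re-crawling the same page does not duplicate data.
MERGE_LINK_QUERY = (
    "MERGE (src:WebPage {url: $source}) "
    "MERGE (dst:WebPage {url: $target}) "
    "MERGE (src)-[:LINKS_TO]->(dst)"
)


def build_link_statements(
    source: str, targets: Iterable[str]
) -> List[Tuple[str, Dict[str, Any]]]:
    """Return one (query, parameters) pair per distinct outgoing link."""
    seen = set()
    statements = []
    for target in targets:
        if target in seen:
            continue  # skip links that appear more than once on the page
        seen.add(target)
        statements.append(
            (MERGE_LINK_QUERY, {"source": source, "target": target})
        )
    return statements


def store_links(session: Any, source: str, targets: Iterable[str]) -> int:
    """Run the statements on a session-like object and return their count.

    `session` only needs a run(query, parameters) method, which is the
    shape of a session in the official Neo4j Python driver.
    """
    statements = build_link_statements(source, targets)
    for query, params in statements:
        session.run(query, params)
    return len(statements)
```

With the official `neo4j` package, a session would typically come from `GraphDatabase.driver(uri, auth=(user, password))` followed by `with driver.session() as session:`; the URI and credentials depend on the deployment. Passing values as query parameters rather than formatting them into the Cypher string keeps crawled URLs from being interpreted as query text.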

Motivation and Context

The goal is to use a unified platform to store all the information for both the crawler and the extractor. Fixes #32

How Has This Been Tested?

TODO

Screenshots (if appropriate):

Neo4j database for depth 1 crawl

Screenshot_2023-02-26_23-40-56

Some logs

Screenshot_2023-02-26_23-42-19

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@PROxZIMA PROxZIMA self-assigned this Feb 26, 2023
@PROxZIMA PROxZIMA added the documentation and enhancement labels Feb 26, 2023
@PROxZIMA PROxZIMA marked this pull request as draft February 27, 2023 15:43
@PROxZIMA PROxZIMA marked this pull request as ready for review August 9, 2024 10:34
Labels
documentation (Improvements or additions to documentation), enhancement (New feature or request)
Projects
Status: 🏗 In progress
Development

Successfully merging this pull request may close these issues.

Extractor should use proper mechanism to extract and store URLs
1 participant