Create a huge collection of all Tamil books metadata #228

I am scraping data from the Panuval book store.
I have posted my code on GitHub:
https://github.com/rajkannan1978/web-scraping.git
Please let me know if you find any bugs.
I welcome any suggestions to improve the code.

rajkannan1978 · 2024-09-05T16:23:50Z

Hello,
Here are some of my Python projects.
Web Scraping https://github.com/rajkannan1978/web-scraping.git
Grocery https://github.com/rajkannan1978/grocery.git
Number Guess Game https://github.com/rajkannan1978/number_guess_game.git

Thanks.

rajkannan1978 · 2024-09-11T11:37:58Z

Hi,
Got 18,708 books from panuval.com.
Web Scraping https://github.com/rajkannan1978/web-scraping.git

amotbeli · 2024-09-12T01:45:00Z

Got 15845 books from the Anna Centenary Library catalogue.

See here: https://github.com/amotbeli/acl_data/blob/main/acl_data.json

rajkannan1978 · 2024-09-12T11:28:48Z

Super. I visited the website and also learned from your code. It is neat and clean. Thanks.

amotbeli · 2024-09-12T16:41:17Z

Thank you, rajkannan1978!

tshrinivasan · 2024-10-05T16:05:41Z

We need a central place to store and display all the book metadata.

I explored Omeka (https://omeka.org/) and Islandora (https://www.islandora.ca),
and installed both at https://omeka.kaniyam.cloudns.nz/ and https://islandora.kaniyam.cloudns.nz/

Omeka lacks features for customizing themes and appearance. Custom fields are hard to set up, files cannot be stored on a remote server, translation is not easy, and there are fewer plugins.

Islandora has more features: we can customise the display, and it has bilingual capability. As it is a Drupal-based project, all of Drupal's power comes along, with tons of Drupal plugins.

Here is the documentation for a lightweight Islandora installation:
https://github.com/digitalutsc/islandora_lite_docs/wiki/7.-Installation

Thanks to @Natkeeran for the guidance on the setup.

Hence, going with Islandora. I am watching this video to learn the basics: https://www.youtube.com/watch?v=dfc7WUGAmow

Will explore how to add data.
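
Before anything is loaded, the scraped datasets mentioned in this thread would probably need a shared shape. Below is a rough sketch of that step, assuming each source can be mapped to a few common fields. The source labels, the per-source key names, the common field names, and the title-plus-author duplicate rule are all hypothetical and not something decided in this issue.

```python
import unicodedata

# Hypothetical mappings from each source's keys to common field names.
# The real scraped files may use different keys.
FIELD_MAPS = {
    "panuval": {"name": "title", "writer": "author", "publisher": "publisher"},
    "acl": {"Title": "title", "Author": "author", "Publisher": "publisher"},
}

COMMON_FIELDS = ("title", "author", "publisher")


def normalize(record, source):
    """Rename a source record's keys to the common fields and tidy the values."""
    mapping = FIELD_MAPS[source]
    out = {field: "" for field in COMMON_FIELDS}
    for key, value in record.items():
        if key in mapping and value:
            # NFC keeps Tamil text in one canonical Unicode form for comparison.
            text = unicodedata.normalize("NFC", str(value))
            # Collapse runs of whitespace and trim the ends.
            out[mapping[key]] = " ".join(text.split())
    out["source"] = source
    return out


def merge(records):
    """Keep the first record seen for each (title, author) pair."""
    seen = {}
    for rec in records:
        key = (rec["title"].casefold(), rec["author"].casefold())
        seen.setdefault(key, rec)
    return list(seen.values())
```

Matching on title and author alone is only a starting point; spelling variants and transliterated names would need a looser comparison.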

kamalaak · 2024-11-08T11:19:07Z

I’ve scraped details of 50,000+ books from www.noolulagam.com.
Here’s the repository: https://github.com/kamalaak/noolulagam_books_scraping.git

tshrinivasan · 2024-11-08T12:12:09Z

Wonderful, thanks. Please scrape the cover images also.

kamalaak · 2024-11-11T05:17:52Z

I scraped 50,000+ books with their images. The images are stored in a separate folder, and the CSV file includes the paths to those images.

Here’s the link to the images folder and the repository with the CSV:

CSV: https://github.com/kamalaak/books_scraping/blob/main/with_books.csv

Images: https://drive.google.com/file/d/19z8WCYSoNIxo1nOVzHJtEttFpKDHkeET/view?usp=drivesdk
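
For anyone reusing this data, here is a minimal sketch of how the image paths in such a CSV could be checked against the downloaded images folder. The column name `image_path` and the idea that paths are relative to one root folder are assumptions for illustration, not taken from the repository; adjust them to the actual CSV header.

```python
import csv
from pathlib import Path


def check_image_paths(csv_file, images_root, column="image_path"):
    """Classify CSV rows by whether their image path resolves to a file.

    `column` is a hypothetical header name; adjust it to the real CSV.
    Returns three lists of 1-based data-row numbers: found, empty, missing.
    """
    found, empty, missing = [], [], []
    root = Path(images_root)
    with open(csv_file, newline="", encoding="utf-8") as handle:
        for row_number, row in enumerate(csv.DictReader(handle), start=1):
            value = (row.get(column) or "").strip()
            if not value:
                # The row has no image recorded at all.
                empty.append(row_number)
            elif (root / value).is_file():
                found.append(row_number)
            else:
                # A path is recorded but no such file exists under the root.
                missing.append(row_number)
    return found, empty, missing
```

Calling `check_image_paths("with_books.csv", "images")` would return the row numbers in each group, which makes it easy to see which books still need a cover image.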

kamalaak · 2024-11-12T09:59:28Z

I scraped details of 20,000+ books, including images, from https://dialforbooks.in. However, some books are missing images. Here are the CSV and image links.

CSV: https://github.com/kamalaak/books_scraping/blob/main/cleaned_file.csv

Images: https://drive.google.com/drive/folders/17wB8tsKXU6tYPmEebn5sXwXnsbco_PaO?usp=drive_link
