chapter sorting #17
Comments
It would be pretty helpful if you could provide us with an example. A simple URL will be enough... But I am not sure if this is possible. Are you asking to extract page numbers out of the PDFs?
OK, sorry for that. Here is an example: 978-3-540-23957-4 is the ISBN of the Springer Handbook of Robotics. It has 66 chapters, and when I download it with the script they are not in the correct order. I was browsing the contents page of the book at http://www.springerlink.com/content/978-3-540-23957-4/contents/ and next to each chapter is its page number, so I thought it shouldn't be too difficult to base the ordering on these numbers.
I have implemented something that might handle this... Please give it a try and report back if it is what you intended.
The sorting seems to work, though I have only tried it with one example so far. But if I run it without sorting, I now get the following error:
As already commented inline, your modification cannot handle front matter because of its roman page numbers. Additionally, there are back matters with page numbers starting at 1, e.g. www.springerlink.com/content/978-3-540-25202-3/
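To illustrate the two edge cases above, here is a minimal sketch of a sort key that copes with roman-numbered front matter and with back matter that restarts at page 1. It is not the script's actual code: the function names `roman_to_int` and `sort_key`, the sample entries, and the assumption that back matter can be recognised by the words "Back Matter" in its title are all hypothetical.

```python
import re

ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}

def roman_to_int(text):
    """Convert a roman numeral such as 'xiv' to an int (14)."""
    total = 0
    largest = 0
    # Walk from right to left: a symbol smaller than one already seen
    # to its right is subtracted (the 'i' in 'iv'), otherwise added.
    for symbol in reversed(text.lower()):
        value = ROMAN_VALUES[symbol]
        if value < largest:
            total -= value
        else:
            total += value
            largest = value
    return total

def sort_key(title, first_page):
    """Return (section, page) so front matter < chapters < back matter."""
    page = first_page.strip().lower()
    is_roman = re.fullmatch(r"[ivxlcdm]+", page) is not None
    number = roman_to_int(page) if is_roman else int(page)
    if "back matter" in title.lower():   # assumption: title marks back matter
        section = 2
    elif is_roman:                       # roman pages are treated as front matter
        section = 0
    else:
        section = 1
    return (section, number)

# Made-up entries in the (title, first page) shape assumed above.
entries = [
    ("Back Matter", "1"),
    ("Chapter B", "15"),
    ("Front Matter", "i"),
    ("Chapter A", "1"),
]
ordered = sorted(entries, key=lambda e: sort_key(e[0], e[1]))
```

Sorting on a `(section, page)` tuple keeps the back matter behind the chapters even though its page number is lower. Whether the title check is reliable depends on how the contents page labels those entries, which I have not verified.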
Sometimes the chapters of a book are sorted alphabetically on the SpringerLink contents page. Since the script uses only this list order, the chapters get mixed up, which isn't very nice.
Maybe the chapters could be sorted by their page numbers. I think it should be possible, but I'm not very good with regex, so I can't present a solution myself.
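For what it's worth, the regex part could look roughly like the sketch below. It assumes the page information has already been scraped into a string such as "9-33" next to each chapter title. That format, the sample data, and the names `PAGE_RANGE` and `start_page` are hypothetical and not taken from the script.

```python
import re

# Made-up (title, page range) pairs, deliberately out of order.
chapters = [
    ("Chapter B", "9-33"),
    ("Chapter A", "1-4"),
    ("Chapter C", "35-65"),
]

# Two numbers separated by a hyphen or an en dash, e.g. "9-33".
PAGE_RANGE = re.compile(r"(\d+)\s*[-\u2013]\s*(\d+)")

def start_page(page_text):
    """Return the first page of a range like '9-33' as an int."""
    match = PAGE_RANGE.search(page_text)
    if match is None:
        raise ValueError("no page range found in %r" % page_text)
    return int(match.group(1))

# Sort numerically by first page; sorting the raw strings would put "35" before "9".
ordered = sorted(chapters, key=lambda chapter: start_page(chapter[1]))
```

Converting to `int` before sorting is the important step, since a plain string sort would reproduce the kind of mixed-up order described above.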