Add SILE resilient master files #3

Open
wants to merge 9 commits into
base: main
from



@Omikhleia Omikhleia commented Nov 30, 2023

Closes #2

Work in progress, to check how good the USX support in the re·sil·ient collection is, and what needs to be fixed there:

  • Master documents
  • Style files (these might be later replaced)

EDIT: As part of this PR, I am also adding the TGCNT, see comment.

EDIT: The issues noted in the comments below are addressed in resilient 2.2.1.

@Omikhleia Omikhleia marked this pull request as draft November 30, 2023 20:11
@Omikhleia
Author

With the provided settings, these compiled without crashing (but of course the results have to be checked):

  • KJB (~ 1860 pages, PDF 17.8 MB)
  • BSB (~ 1930 pages, PDF 15.5 MB)
  • WEB (~ 2050 pages, PDF 18.3 MB)

Failing:

ULT: it was still running, so I stopped it after some 3900 pages (41 MB). It appears to contain sequences that are not encoded as USX but left as raw text. That's a problem in the source USX file.

image

<para style="zaln-s" status="unknown">|x-strong="b:H7225" x-lemma="רֵאשִׁית" x-morph="He,R:Ncfsa" x-occurrence="1" x-occurrences="1" x-content="בְּ⁠רֵאשִׁ֖ית"<unmatched marker="*" /><char style="w" x-occurrence="1" x-occurrences="1">In</char> <char style="w" x-occurrence="1" x-occurrences="3">the</char> <char style="w" x-occurrence="1" x-occurrences="1">beginning</char></para>

Some conversion artifacts?
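
To gauge how widespread this is before launching a long build, here is a minimal Python sketch that scans a USX string for paragraphs that look like unconverted alignment data. The three heuristics (a `zaln` style prefix, an `<unmatched>` child element, raw `x-strong=` text) are assumptions drawn only from the sample above, and `find_alignment_residue` is a hypothetical helper name, not part of any existing tool.

```python
import xml.etree.ElementTree as ET


def find_alignment_residue(usx_text):
    """Return (style, snippet) pairs for paragraphs that look like raw alignment data."""
    root = ET.fromstring(usx_text)
    hits = []
    for para in root.iter("para"):
        style = para.get("style", "")
        raw = para.text or ""
        # Heuristics (assumptions based on the ULT sample, not on a USX rule).
        has_unmatched = para.find("unmatched") is not None
        if style.startswith("zaln") or has_unmatched or "x-strong=" in raw:
            hits.append((style, raw.strip()[:40]))
    return hits


SAMPLE = '''<usx version="3.0">
<para style="zaln-s" status="unknown">|x-strong="b:H7225" x-occurrence="1"<unmatched marker="*" /><char style="w">In</char></para>
<para style="p"><verse number="1" style="v" />In the beginning</para>
</usx>'''

if __name__ == "__main__":
    for style, snippet in find_alignment_residue(SAMPLE):
        print(style, snippet)
```

Running this over each book file would tell whether the residue is confined to a few books or present throughout.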

@Omikhleia Omikhleia force-pushed the resilient-master-files branch from 324f7fd to e64320c Compare November 30, 2023 20:40
@Omikhleia
Author

Added French bibles...

  • NCL = fails around p. 72 - we have to investigate it.

    error summary:
      Processing at: SILE/NCL/../../USX_test_versions/French/NCL/002EXO.usx:0:0: in \range-reference
      Using code at: ...share/lua/5.4/sile/packages/resilient/bible/usx/init.lua:194: attempt to concatenate a nil value (field 'book')
    
  • FOB (~ 1410 pages, PDF 14.6 MB)

@Omikhleia Omikhleia force-pushed the resilient-master-files branch from 5fa762d to 2fd8494 Compare December 1, 2023 08:00
@Omikhleia
Author

Omikhleia commented Dec 1, 2023

More French:

  • SBL (~ 2060 pages, PDF 18.7 MB): compiled without error, but it has punctuation issues:
    • It uses non-breaking spaces, defeating the French support in SILE. This could be addressed there.
    • It uses straight (non-typographic) double and single quotes in many places. These could be hard to get right ("smart typography" is not obvious when they occur as they do here).

EDIT: I am tempted to say that punctuation encoding in the source is quite lame, though.
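
To put numbers on these two punctuation issues, a small audit can be sketched in Python. This only counts occurrences and does not try to fix anything. The function name `audit_punctuation` and the choice of characters to look for are assumptions for illustration. Note that a straight apostrophe (as in "l'homme") is counted as a straight single quote here.

```python
import re

NBSP_CHARS = "\u00a0\u202f"  # no-break space and narrow no-break space
# A hard-coded no-break space before French "high" punctuation or a closing
# guillemet, or right after an opening guillemet.
NBSP_PUNCT = re.compile(f"[{NBSP_CHARS}](?=[;:!?»])|(?<=«)[{NBSP_CHARS}]")
STRAIGHT_QUOTES = re.compile("[\"']")


def audit_punctuation(text):
    """Count hard-coded no-break spaces around punctuation and straight quotes."""
    return {
        "nbsp_around_punct": len(NBSP_PUNCT.findall(text)),
        "straight_quotes": len(STRAIGHT_QUOTES.findall(text)),
    }


if __name__ == "__main__":
    sample = "Il dit\u00a0: \"Qui es-tu\u00a0?\""
    print(audit_punctuation(sample))
```

The counts per book give a rough idea of whether the source could be cleaned up by hand or needs a systematic pass.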

@Omikhleia
Author

Greek

  • UGNT (~ 347 p, 3.3MB)
    • uses a non-standard paragraph style <para style="usfm">3.0</para> which would have to be ignored rather than rendered.
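
Until the renderer ignores that style, one workaround would be to strip such paragraphs in a preprocessing step. The Python sketch below is only an illustration of that idea, not how re·sil·ient handles it. `IGNORED_PARA_STYLES` and `drop_ignored_paras` are hypothetical names, and the set of styles to drop is an assumption.

```python
import xml.etree.ElementTree as ET

# Assumption: paragraph styles to drop rather than render.
IGNORED_PARA_STYLES = {"usfm"}


def drop_ignored_paras(root, ignored=IGNORED_PARA_STYLES):
    """Remove <para> elements whose style is in `ignored`; return how many were removed."""
    removed = 0
    # Materialize the traversal first so removals do not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "para" and child.get("style") in ignored:
                parent.remove(child)
                removed += 1
    return removed


if __name__ == "__main__":
    doc = ET.fromstring(
        '<usx version="3.0"><para style="usfm">3.0</para>'
        '<para style="p">text</para></usx>'
    )
    print(drop_ignored_paras(doc), ET.tostring(doc, encoding="unicode"))
```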

@Omikhleia
Author

  • NCL = fails around p. 72 - we have to investigate it.

There are two books in NCL with empty headers <para style="h" />: EXO and JAS.

While we could make the code not crash, how are we going to guess the running header?
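
One possible answer, sketched in Python for illustration only: fall back to other title-like paragraphs, then to the book code. The fallback order (`h`, then `toc2`, then `toc1`, then the `code` attribute of `<book>`) is an assumption, as is the helper name `running_header`. Whether the NCL files actually carry `toc` paragraphs for these two books has not been checked here.

```python
import xml.etree.ElementTree as ET


def running_header(root):
    """Pick a running-header string for one USX book, with fallbacks."""
    # Assumed preference order: header, short title, long title.
    for style in ("h", "toc2", "toc1"):
        for para in root.iter("para"):
            if para.get("style") == style:
                text = "".join(para.itertext()).strip()
                if text:
                    return text
    # Last resort: the three-letter book code, e.g. "EXO".
    book = root.find("book")
    if book is not None and book.get("code"):
        return book.get("code")
    return None


if __name__ == "__main__":
    doc = ET.fromstring(
        '<usx version="3.0"><book code="JAS" style="id" />'
        '<para style="h" /></usx>'
    )
    print(running_header(doc))
```

A bare book code is a poor running header, but it is at least deterministic and makes the missing data visible in the output.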

@Omikhleia
Author

While we could make the code not crash, how are we going to guess the running header?

This whole USX, USFM, USFX etc. business is insane. If supporting a poorly designed XML schema (and I weigh my words here) is a necessity for SILE 1.0, we are going to wait a long time. And I thought TEI XML was complex -- it's a pleasure compared to this...

@Omikhleia
Author

NCL, with a hack for missing headers = 1770 pages without any other crash.

Didier Willis and others added 2 commits December 15, 2023 00:53
Interesting USX example:
- It's public domain
- The sheer amount of notes is a challenge
@Omikhleia
Author

As part of this PR, I am also adding the TGCNT (Text-Critical Greek New Testament):

  • It's public domain
  • It's shorter than the other USX samples (~ 300+ pages) so easier to compile ;)
  • It has a huge number of notes

The last point is interesting: the current USX support in re·sil·ient just collates the notes in the page margin, without using "true" insertions, so the notes here cannot fit and overflow:

image

There are interesting challenges here to address, and this "small" bible is a perfect use case (besides Greek being nice, heh).

@Omikhleia
Author

Side note: Of course, we could use a much smaller font size and try to make all notes fit, but that wouldn't be fair ;)

@Omikhleia
Author

N.B. My "task" here is done -- I'll remove the "Draft" status on the PR when resilient.sile 2.2.1 is released with the fixes identified here. I'd expect this to occur by the end of 2023.

@Omikhleia
Author

Omikhleia commented Dec 28, 2023

Minimum supported versions (as of now):

  • SILE 0.14.13 -- ideally a LuaJIT-enabled build for faster compilation. EDIT: Minimum recommended is 0.14.14 (for the SBL and its non-breaking spaces)
  • resilient.sile 2.2.1

Merry Christmas!

@Omikhleia Omikhleia marked this pull request as ready for review December 28, 2023 20:42
@Omikhleia
Author

Ping.

Well, Bibles don't contain a lot of italicized text, but since this nice
feature is now available with SILE 0.15.x, there's no reason to leave it
commented out.
Since I made a quick pass on these files as part of the regression
testing of resilient.sile 2.5 with SILE 0.15.5, let's take this
opportunity to use these new versions and add simple book matters.
The Book, after all, shall have covers, even if minimalist for now.
@Omikhleia
Author

I just used these files for a quick regression-testing pass with SILE 0.15 and resilient.sile 2.5, which was an occasion to enable new features that were unavailable in 2023.
Hence, minimum supported versions (as of now):

  • SILE 0.15.5
  • resilient.sile 2.5.0

By the way, I just noticed this after visiting the repository (though I probably also got an email, which I missed among all the GitHub notifications, my bad):
image

I'm afraid I have to reject the offer. I believe we are all walking a narrow road when it comes to making choices that align with our priorities and values, and I must stay true to my current commitments. It seems I lack the faith required to take this leap right now—but I admire your conviction and wish you the best of luck...

@Omikhleia
Author

It seems I lack the faith required to take this leap right now—but I admire your conviction and wish you the best of luck...

To paraphrase one of your own messages:
I might be wrong, but from following the issues and discussions, I got the feeling that the world of digital Bible editions is in a constant state of flux concerning the right way to do things, from USFM to USFX, USX and now USJ...
It sure doesn't help one keep interest and stay invested.

Development

Successfully merging this pull request may close these issues.

Next steps?