Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add bibliography support #209

Merged
merged 2 commits into from
Jul 13, 2024
Merged

Conversation

acmepjz
Copy link
Contributor

@acmepjz acmepjz commented Jul 12, 2024

This PR adds bibliography support to doc-gen4. It includes:

  • DocGen4/Output/References.lean which contains generic function support for bibliography. It is independent of the actual implementation of bib file formatting.
  • DocGen4/Output/Bibtex.lean which contains an implementation of bib file formatting by calling external Lean library BibtexQuery.
  • Changes in DocGen4/Output/DocString.lean which analyzes citations in markdown text, inserting and modifying citation links, and collect back references.
  • Changes in various output files to add back references collecting feature.
  • Add a subcommand bibPrepass and change the build file accordingly.

This PR has all commit history squashed. To see more history, see the previous obsolete PR #200 which also includes an optional bibtex parser by calling external program pybtex.

Closes #147.

Copy link
Collaborator

@hargoniX hargoniX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay this looks good to me now apart from the one nitpick. I'll verify myself that this works on mathlib and then merge this.

Note that because this does not collect references.bib of subprojects this won't work in the mathlib4_docs CI without changes, I'll add a temporary hack to make this work out but I hope that we can add collecting subproject references.bib in the future?

DocGen4/Output/Module.lean Show resolved Hide resolved
Copy link
Collaborator

@hargoniX hargoniX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running this on mathlib almost instantly throws:

info: stderr:
uncaught exception: failed to run texDiacritics on '\url{https://leanprover.github.io/theorem_proving_in_lean/}' at pos 4: unsupported command: '\url'
error: external command './././../doc-gen4/.lake/build/bin/doc-gen4' exited with code 1

Please verify that your branch actually works on mathlib.

@acmepjz
Copy link
Contributor Author

acmepjz commented Jul 12, 2024

Please verify that your branch actually works on mathlib.

Ah, sorry for that. I forgot I have slightly local modifications for the test file. Specifically, \url and \_ are the commands in that which are not supported by current parser.

  • For the \url support an upgrade to the parser is necessary; I'll work on it later. Meanwhile I think it's better to use the standard url field of the bib file.
  • For the \_ it can be added quickly to BibtexQuery (for a proper fix it's best to upgrade to the parser). But In my opinion it's inappropriate to use this command in the url and doi field. In some other entries, there are also _ but which are not escaped.

I'll open a PR to mathlib (leanprover-community/mathlib4#14689). What's your opinion?

@acmepjz
Copy link
Contributor Author

acmepjz commented Jul 12, 2024

I hope that we can add collecting subproject references.bib in the future?

I think it's possible, but I think this needs to be done mostly in build script. The build script collects all paths of bib files, feed them to the doc-gen4. I'm not familiar with that part.

mathlib-bors bot pushed a commit to leanprover-community/mathlib4 that referenced this pull request Jul 13, 2024
This PR removes `\url` and `\_` commands in `references.bib`. The existence of them breaks leanprover/doc-gen4#209.

- The first is `howpublished = {\url{...}}`. I think it's better to use the standard `url` field. Meanwhile `howpublished` field is preserved since some bibtex parser complains about missing that field.
- There are some `\_` in `url` and `doi` fields. I think it's better to replace them with `_`; there are already some `_` in `url` and `doi` fields of other bibitems.
- There is `$K\_X$`, but which, after checking the original article, should be `$K_X$`.
@acmepjz
Copy link
Contributor Author

acmepjz commented Jul 13, 2024

The PR on mathlib was merged, so the current code should work on mathlib again. Meanwhile I'll submit a PR to BibtexQuery to refactor the TeX command processing. This will take some days. Do you think it's good to merge the current code to doc-gen4?

@hargoniX
Copy link
Collaborator

Verified to work with mathlib, will take a little change to mathlib4_docs CI to actually deploy.

@hargoniX hargoniX merged commit 5b01e7c into leanprover:main Jul 13, 2024
1 check passed
@acmepjz acmepjz deleted the acmepjz_bib_1 branch July 13, 2024 12:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement references / bibliography
2 participants