Releases · cu-mkp/manuscript-object

16 Mar 19:24

njr2128

807b933

v1.0-ronikaufman 2021-03-16 Latest

This branch was created from the "context" branch of Roni Kaufman's fork of manuscript-object (https://github.com/ronikaufman/manuscript-object/tree/context). Last commit was on Aug 3, 2020: ronikaufman@3a0aeab.

In spring 2020, Roni joined the Making and Knowing Project as an intern. This work was prepared to explore visualizations of data from https://github.com/cu-mkp/m-k-manuscript-data.

16 Mar 19:53

njr2128

v1.0-danachaillard

5abf324

v1.0-danachaillard 2021-03-16

This branch was created from the "master" branch of Dana Chaillard's fork of manuscript-object (https://github.com/danachaillard/manuscript-object). Last commit was on Aug 21, 2020: danachaillard@5fa06a4.

In spring 2020, Dana joined the Making and Knowing Project as an intern. This work was prepared to explore visualizations of data from https://github.com/cu-mkp/m-k-manuscript-data.

16 Feb 20:04

njr2128

v2.0.1

7093ddd

v2.0.1 2021-02-16

This release contains some bug fixes and usability improvements to the core derivative-generating code.

Most importantly:

Improved documentation and comments
Manuscript class constructors are more flexible in the inputs they can take (arbitrary versions)
Manuscript object can be mutated to include extra folios and entries after being initially created
Update outdated property tag names
Add tag to list of properties
Fix bug where derivative files were named incorrectly
Fix bug where divs with no id attribute would cause error
Fix major bug in update.py logic
Fix major bug where existing derivatives were not removed before writing new ones
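
The last fix concerns stale files: if derivatives from an earlier run are left in place, files with no counterpart in the new output survive alongside it. A minimal sketch of the clear-then-write pattern follows, using only the standard library. The function name `write_derivatives`, the `.txt` extension, the sample entry names, and the dict-of-strings input are assumptions for illustration, not the repository's actual API.

```python
import tempfile
from pathlib import Path


def write_derivatives(out_dir, derivatives):
    """Replace the .txt files in `out_dir` with `derivatives` (name -> text)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Remove derivatives left over from a previous run first, so that
    # files with no counterpart in the new output do not linger.
    for stale in out_dir.glob("*.txt"):
        stale.unlink()

    for name, text in derivatives.items():
        (out_dir / f"{name}.txt").write_text(text, encoding="utf-8")

    return sorted(p.name for p in out_dir.glob("*.txt"))


with tempfile.TemporaryDirectory() as tmp:
    # First run writes two (hypothetical) entries; the second run only one.
    write_derivatives(tmp, {"001r_1": "old text", "001r_2": "old text"})
    remaining = write_derivatives(tmp, {"001r_1": "new text"})
    print(remaining)
```

Without the removal loop, the file for the second entry would still be on disk after the second run.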

Next steps/to do:

improved automated testing (see #51)
revisiting high-level design (see #76)
generally, moving on from derivative generation to data analysis and visualization

See also any open issues: https://github.com/cu-mkp/manuscript-object/issues

29 Oct 18:41

njr2128

v2.0.0

94d158d

v2.0.0 2020-10-29

Second major release of manuscript-object with significant changes to the core object code.

All of the central code is now in two files, with one auxiliary file (utils.py).
update.py usage remains unchanged
The titular "manuscript object" is now a class called Manuscript inside a file called manuscript.py. Each entry is turned into an object of the Entry class inside a file called entry.py.

Some highlights:

much faster (update_entries() is, as before, the longest step)
increased verbosity during generation
manuscript and entry modules are importable and interactable
Manuscript and Entry classes control their own behavior
- e.g. generating and updating derivatives happens inside the Manuscript class
- update.py works as before, but simply calls the update methods inside Manuscript
- this means if you want to generate the derivative output in a Python shell and interact with it as a string or table, you can do so by importing manuscript and running one of the derivative generation methods
- derivative generation takes place in 2 steps: generation and then writing. This enables checks for correctness before writing to disk
All xml is converted to lxml.etree objects for easier and more consistent parsing
text renditions of editorial tags are created using an XSLT stylesheet
- this stylesheet takes parameters, so if you don't want to render del tags as <-TEXT->, for example, you can just set that to "false()"
Where possible, functions are reused rather than duplicated in order to facilitate bug checks, e.g., there's only one function which tells you how to convert a string to an lxml.etree Element.
the Entry class is very flexible:
- there are different methods to take a valid lxml.etree Element, a string of well-formed XML, or a filepath to a valid XML file
- folio and identity arguments are optional
- only one version of each entry is given at a time (handling tc, tcn, and tl versions is done by the Manuscript object, not the Entry)
- if you want to test or inspect the contents of a txt or xml file, you can load it as an Entry object in a Python shell and look at the text and the properties that way, instead of manually opening the file
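
The flexibility described above (several ways to construct an entry, optional folio and identity, and a single string-to-Element helper) might look roughly like the following sketch. It is not the repository's Entry class: `SketchEntry`, `to_element`, `from_string`, `from_file`, `terms`, and the sample XML are all hypothetical names, and the standard library's `xml.etree.ElementTree` stands in for lxml.etree so the example runs without extra packages.

```python
import xml.etree.ElementTree as ET


def to_element(xml_string):
    """The single place where a string becomes an Element (reused below)."""
    return ET.fromstring(xml_string)


class SketchEntry:
    """Hypothetical stand-in for an entry; not the repository's Entry class."""

    def __init__(self, element, folio="", identity=""):
        self.xml = element
        self.folio = folio        # optional
        self.identity = identity  # optional
        self.text = "".join(element.itertext())

    @classmethod
    def from_string(cls, xml_string, **kwargs):
        """Build an entry from a string of well-formed XML."""
        return cls(to_element(xml_string), **kwargs)

    @classmethod
    def from_file(cls, path, **kwargs):
        """Build an entry from a path to an XML file."""
        with open(path, encoding="utf-8") as f:
            return cls.from_string(f.read(), **kwargs)

    def terms(self, tag):
        """Text of every descendant element named `tag`."""
        return ["".join(e.itertext()) for e in self.xml.iter(tag)]


entry = SketchEntry.from_string(
    '<div id="p001r_1"><ab>Take <m>gold</m> and <m>mercury</m></ab></div>',
    folio="001r",
)
print(entry.text)
print(entry.terms("m"))
```

Loading a file this way in a Python shell gives access to the text and tagged terms without opening the file by hand.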

To do:

implementing more automated spot- and unit-tests
sophisticated search function for Manuscript
ensuring type annotations are useful and correct (e.g., specificity of "xml"); see their use in
https://github.com/cu-mkp/manuscript-object/blob/94d158d814bf9a62071a11845a9b2938d561ab3e/entry.py#L10
optional arguments to Manuscript specifying which entries you want to generate
function to inspect the context around a particular term
visualization engine
thesaurus

see also any open issues: https://github.com/cu-mkp/manuscript-object/issues

28 Sep 20:45

tcatapano

v1.0.1

efd4b22

2020-09-28 v1.0.1

Patch release fixing bug described in issue #42 "tail text in entry-metadata.csv" (see also: cu-mkp/m-k-manuscript-data#1909, "Exclude lxml tails from find_tagged_terms outputs")

https://github.com/cu-mkp/manuscript-object/blob/efd4b221edccc5e7b01e0c494dd02b71bb8f1025/recipe.py#L106

        return [et.tostring(tag, method="text", encoding="utf-8", with_tail=False).decode().replace("\n", " ") for tag in tags]
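
The effect of excluding the tail can be seen in a small sketch. It uses the standard library's `xml.etree.ElementTree` rather than lxml (an assumption made so the example runs without extra packages); both share the text/tail model, in which text that follows an element's closing tag is stored as that element's tail. The sample XML and the helper name `text_without_tail` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: an <m> tag followed by trailing text.
SAMPLE = "<ab>Take <m>fine\ngold</m> and grind it</ab>"


def text_without_tail(tag):
    """Return the text inside `tag`, excluding the tail that follows it."""
    # itertext() yields the element's own text plus its descendants' text
    # and tails, but never the tail of `tag` itself.
    return "".join(tag.itertext()).replace("\n", " ")


root = ET.fromstring(SAMPLE)
material = root.find("m")

print(repr(material.tail))          # the text after </m>
print(text_without_tail(material))  # only the tagged term
```

If the tail were included, the tagged term would be followed by the unrelated text that comes after the closing tag, which is the kind of stray text issue #42 describes.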

14 Sep 21:50

gschare

v1.0.0

01d3397

2020-09-14 v1.0.0

Major release of manuscript-object repository.

The BnF class represents a Python object version of BnF Ms 640. It contains a list of Recipe objects, which hold the raw XML data from each entry along with some other data such as length and properties.

When BnF is instantiated, it loads every folio in ms-xml and processes it into its component entries, each of which becomes a Recipe object. ms-xml is a folder in the repository containing the data, m-k-manuscript-data.
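
As a rough illustration of splitting a folio into entries, the sketch below collects `div` elements by their `id` attribute. It rests on several assumptions: that entries are marked up as `div` elements carrying an `id`, that divs without one can be skipped, and that the sample folio resembles the real data. `split_entries` and the sample XML are hypothetical, and the standard library's `xml.etree.ElementTree` is used in place of lxml.

```python
import xml.etree.ElementTree as ET

# Hypothetical folio with two identified entries and one div lacking an id.
FOLIO = """<root>
<div id="p001r_1"><ab>First entry</ab></div>
<div><ab>No id here</ab></div>
<div id="p001r_2"><ab>Second entry</ab></div>
</root>"""


def split_entries(folio_xml):
    """Map each div id to its element, skipping divs without an id."""
    entries = {}
    for div in ET.fromstring(folio_xml).iter("div"):
        identity = div.get("id")  # None when the attribute is absent
        if identity is None:
            continue
        entries[identity] = div
    return entries


entries = split_entries(FOLIO)
print(list(entries))
```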

update.py is a script that generates the BnF object and then writes derivative forms and the entry-metadata table to the m-k-manuscript-data repository.

Known issues:

del tags are unmarked in derivative files [issue]
- see m-k-manuscript-data issue tracker for other issues related to derivative files
test/ folder is just a dumping ground for entry-metadata.csv files; it can be removed

See issues tracker for other ongoing issues and feature requests.
