Pull dictionaries out of repo #23

Open
ahalterman opened this issue Jul 11, 2016 · 2 comments

Comments

@ahalterman
Member

Petrarch2's code should be distinct from the dictionaries it uses. To make changes to the dictionaries more visible and to make it easier to switch in custom dictionaries, take the built-in dictionaries out of Petrarch2 and have it look for them at a location defined in the config file. Move the built-in dictionaries to the Dictionaries repo.

This should help clear up questions like #19.

@johnb30
Member

johnb30 commented Jul 11, 2016

This already exists at https://github.com/openeventdata/petrarch2/blob/master/petrarch2/data/config/PETR_config.ini. Additionally, the deployed version of the pipeline has always specified a separate config file for PETR, with the dictionaries living in a top-level directory. You can see how you'd hit that at https://github.com/openeventdata/petrarch2/blob/master/petrarch2/petrarch2.py#L530. The built-in dictionaries are included so that people can download and run Petrarch2 as-is without also having to hunt for dictionaries.

If this isn't clear in the documentation/user guides then it should be called out in greater detail. In other words, I think this is a docs issue rather than a feature issue.
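
For anyone following along, here is a minimal sketch of the general idea of pointing a coder at dictionaries through a config file. It is not taken from Petrarch2's code: the section name, the option names (`dictionary_dir`, `verbfile_name`, `actorfile_list`), the file names, and the helper `load_dictionary_paths` are all assumptions made for illustration, so check PETR_config.ini itself for the real layout.

```python
import configparser
import os

# Hypothetical config text. The section and option names below are
# assumptions for illustration, not copied from PETR_config.ini.
EXAMPLE_CONFIG = """
[Dictionaries]
dictionary_dir = /opt/dictionaries
verbfile_name = verbs.txt
actorfile_list = actors_a.txt, actors_b.txt
"""


def load_dictionary_paths(config_text, default_dir):
    """Resolve dictionary file paths from config text.

    Falls back to default_dir (e.g. the bundled dictionaries) when the
    config does not name a dictionary directory.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["Dictionaries"]

    base = section.get("dictionary_dir", fallback=default_dir)
    verb_path = os.path.join(base, section["verbfile_name"])
    # A comma-separated option lets several actor dictionaries be listed.
    actor_paths = [
        os.path.join(base, name.strip())
        for name in section["actorfile_list"].split(",")
        if name.strip()
    ]
    return {"verbs": verb_path, "actors": actor_paths}


paths = load_dictionary_paths(EXAMPLE_CONFIG, default_dir="data/dictionaries")
print(paths["verbs"])
print(paths["actors"])
```

With a layout like this, switching in custom dictionaries only means editing the config file, while the bundled defaults still work when no directory is given.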

@philip-schrodt
Contributor

But there's a bigger issue here that goes way beyond just the dictionaries: there's quite a bit of CAMEO-specific code in Petrarch-2, e.g. the fairly complex system by which certain word combinations modify the event code, then the conversion of the internal representation of the code to CAMEO via the utilities.convert_code() function. This was not the case in Petrarch-1, TABARI or KEDS, but was the case in the VRA-Reader and, as far as I know, ICEWS/Accent. Which is to say, the problem has been solved both ways.

My inclination (obviously...) is that the coder should be completely neutral to the coding system, but that's not what we've got at the moment in Petrarch-2. Getting there would require an extensive rewrite, and would probably best be done by adding some sort of macro facility to the dictionaries. That would be a "good thing" (TM): TABARI came pretty close to having this, though it was never really used in working dictionaries, and Petrarch-2 already has a limited version as well.

Possibly the place to figure this out, however, is in deciding how the dependency-based version of the program is going to work.
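
To make the "neutral to the coding system" idea concrete, here is a rough sketch of keeping the scheme-specific mapping in data rather than in the coder. It does not describe how utilities.convert_code() works, and it is not the macro facility mentioned above: the internal representation (an action plus an optional modifier), the table `SCHEME_A`, the codes in it, and the function `convert` are all made up for illustration.

```python
# Hypothetical internal representation: (action, modifier) pairs produced by
# the coder. In a real system the table below would be loaded from a
# dictionary file; it is hard-coded here to keep the sketch short.
SCHEME_A = {
    ("MEET", None): "A-100",
    ("MEET", "REFUSE"): "A-190",
}


def convert(internal_event, scheme, default="---"):
    """Map an internal (action, modifier) pair to a code in the given scheme."""
    action, modifier = internal_event
    # Prefer the most specific entry, then fall back to the bare action.
    if (action, modifier) in scheme:
        return scheme[(action, modifier)]
    return scheme.get((action, None), default)


print(convert(("MEET", "REFUSE"), SCHEME_A))
print(convert(("MEET", "DELAY"), SCHEME_A))
print(convert(("ATTACK", None), SCHEME_A))
```

The point of the design is that the coder only ever produces the internal pairs, so supporting a different coding system would mean supplying a different table rather than changing code.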
