This repository has been archived by the owner on Feb 8, 2021. It is now read-only.

Auto-install, auto-filter, and dependencies #9

Open
3 tasks
ickc opened this issue Jan 14, 2017 · 4 comments

@ickc
Member

ickc commented Jan 14, 2017

As we'll see below, "Auto-install, auto-filter, and dependencies" are closely related issues.

Split off from #2:

Relationship between Panzer, Pandocpm, Panflute, Pandocfilters

These are the potential dependencies:

  • panflute
  • pandocfilters
  • panzer

From #2 (comment)

  • pandocpm can host panzer style files in the same way it hosts filters and templates.
  • users of panzer could use pandocpm to ensure their filters are installed.
  • pandocpm makes no distinction between panflute, pandocfilters, et al., which is a plus.

auto-filter and auto-install

auto-filter: should it fall under panflute or pandocpm?

From #2 (comment):

An alternative approach to auto-filter: rather than having the auto-filter in panflute, with panflute calling pandocpm, maybe the auto-filter could live in pandocpm instead, with pandocpm listing panflute (and possibly pandocfilters) as a dependency. Then pandocpm, used as a filter, can do everything under the hood:

  1. pip install pandocpm
  2. add the filter names in the YAML of the markdown
  3. pandoc -F pandocpm ...

Edit: a way to circumvent the main function problem is to embed the name of the main/action function in the yaml formula.
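
A minimal sketch of that lookup, assuming the formula is parsed into a dict and carries a hypothetical `action` key (neither the key name nor `resolve_action` exists in pandocpm; both are made up here for illustration):

```python
import types


def resolve_action(module, formula):
    """Return the callable named in the formula, defaulting to 'main'.

    'action' is a hypothetical formula key, not an existing pandocpm field.
    """
    name = formula.get("action", "main")
    func = getattr(module, name, None)
    if not callable(func):
        raise AttributeError(f"{module.__name__} has no callable {name!r}")
    return func


# Stand-in for an installed filter module, built in memory for the demo.
demo_filter = types.ModuleType("demo_filter")


def shout(text):
    """Toy action: uppercase a string."""
    return text.upper()


demo_filter.shout = shout

formula = {"name": "demo_filter", "action": "shout"}
action = resolve_action(demo_filter, formula)
```

With this, a filter whose entry point is not called main can still be found, as long as its formula names the function.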

Additional notes:

This way, pandocfilters will be on a more equal footing with panflute in terms of auto-filter and auto-install, which should make adoption easier.

@sergiocorreia
Contributor

As a user, if all of your filters are panflute filters, we can run them much faster, because under the hood autofilter avoids reading stdin and converting from JSON for every filter (and the same for dumping). For example:

Standard workflow, slow

pandoc -F filter1.py -F filter2.py ...

  1. pandoc reads the document, creates an AST, dumps it to stdout as json
  2. filter1 reads stdin, converts the JSON into a Doc() object
  3. filter1 runs the action() function
  4. filter1 dumps the new Doc() into JSON in stdout
  5. filter2 reads..
  6. filter2 runs..
  7. filter2 dumps...

autofilter workflow, fast

pandoc -F autofilter ...

  1. pandoc reads the document, creates an AST, dumps it to stdout as json
  2. panflute reads stdin, converts the JSON into a Doc() object
  3. panflute calls main() in filter1, which just runs action
  4. panflute calls main() in filter2, which just runs action
  5. panflute dumps the new Doc() into JSON in stdout

This means that once you are running at least one filter, running more is fast even in large documents.
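
A rough, stdlib-only sketch of the two workflows (this is not panflute's actual code; the toy tree, the `walk` helper, and both actions are made up for illustration). The chained version parses and dumps once no matter how many actions run:

```python
import json


def walk(tree, action):
    """Apply action to every dict node of a JSON-like tree, in place."""
    if isinstance(tree, dict):
        action(tree)
        for value in tree.values():
            walk(value, action)
    elif isinstance(tree, list):
        for item in tree:
            walk(item, action)


def upper_str(node):
    """Toy action: uppercase the text of Str nodes."""
    if node.get("t") == "Str":
        node["c"] = node["c"].upper()


def exclaim(node):
    """Toy action: append '!' to the text of Str nodes."""
    if node.get("t") == "Str":
        node["c"] += "!"


def run_separately(json_text, actions):
    """Standard workflow: every filter parses and dumps the whole document."""
    round_trips = 0
    for action in actions:
        tree = json.loads(json_text)
        walk(tree, action)
        json_text = json.dumps(tree)
        round_trips += 1
    return json_text, round_trips


def run_chained(json_text, actions):
    """autofilter-style workflow: parse once, run all actions, dump once."""
    tree = json.loads(json_text)
    for action in actions:
        walk(tree, action)
    return json.dumps(tree), 1


source = json.dumps({"blocks": [{"t": "Para", "c": [
    {"t": "Str", "c": "hello"}, {"t": "Space"}, {"t": "Str", "c": "world"}]}]})
slow_out, slow_trips = run_separately(source, [upper_str, exclaim])
fast_out, fast_trips = run_chained(source, [upper_str, exclaim])
```

Both functions produce the same document; only the number of JSON round trips differs, which is where the saving on large documents comes from.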

Now, this can't be replicated with pandocfilters because there is no Doc() object. Sure, you could do external calls, but then it would do exactly the same as the initial pandoc call, with the only gain being a faster-to-type command (in which case you can just use panzer).

@ickc
Member Author

ickc commented Jan 15, 2017

Oh, so there's no way to have a "panflute-style auto-filter" in the case of pandocfilters, even with the main function (or a lookup of the "main function" as mentioned in #7)?

I am not familiar with the pandocfilters design, so what I'm going to say might not make sense: can't we collect all the functions that need to be passed to toJSONFilters and pass them all as a list?
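
To make the question concrete, here is a sketch of what I have in mind. As far as I understand, pandocfilters exposes toJSONFilters(actions), which takes a list of actions and applies them in sequence. The two actions and the `ACTIONS` list below are made up for illustration, and how pandocpm would actually collect them is left open:

```python
def upper_str(key, value, format, meta):
    """Uppercase every Str element (pandocfilters action signature)."""
    if key == "Str":
        # Same shape as the element pandocfilters' Str() constructor builds.
        return {"t": "Str", "c": value.upper()}


def drop_notes(key, value, format, meta):
    """Delete footnotes: returning an empty list removes the element."""
    if key == "Note":
        return []


# Hypothetical result of collecting one action per installed filter.
ACTIONS = [upper_str, drop_notes]


def main():
    # Imported lazily so the actions above can be used and tested
    # without pandocfilters being installed.
    from pandocfilters import toJSONFilters

    # Reads the document from stdin and writes the result to stdout.
    toJSONFilters(ACTIONS)
```

To use this as a pandoc filter, call main() at the bottom of the script.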

If there's no way to make the above work, there might still be an advantage in providing a shortcut for all filters (this won't be as fine-grained as panzer's option), e.g. filters: [filter1, filter2]. That way pandocpm can still auto-install sanitized filters for end users on the first run.

The current panflute-filters key can remain. As far as pandocpm is concerned, anything in panflute-filters will be auto-installed by pandocpm (when pandoc -F pandocpm is used), and pandocpm will then pass these to panflute and let panflute do its magic. In other words, both pandocpm and panflute can be used as a filter. pandocpm as a filter will recognize both filters and panflute-filters and auto-install them; filters will be run by pandocpm, and panflute-filters will be passed to panflute. panflute as a filter only recognizes panflute-filters, and will only run them, not auto-install them.
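
The proposed split can be sketched as a small dispatch table. The `filters` and `panflute-filters` keys are the ones discussed above; `plan_filters` and the names in the returned dict are hypothetical, not an existing pandocpm or panflute API:

```python
def plan_filters(metadata, tool):
    """Decide what each tool does with the two metadata keys.

    tool is "pandocpm" or "panflute", i.e. the program given to pandoc -F.
    """
    generic = list(metadata.get("filters", []))
    panflute_only = list(metadata.get("panflute-filters", []))
    if tool == "pandocpm":
        return {
            "install": generic + panflute_only,  # auto-install both kinds
            "run_here": generic,                 # run by pandocpm itself
            "delegate_to_panflute": panflute_only,
        }
    if tool == "panflute":
        return {
            "install": [],                       # panflute never installs
            "run_here": panflute_only,           # it ignores the filters key
            "delegate_to_panflute": [],
        }
    raise ValueError(f"unknown tool: {tool!r}")


metadata = {"filters": ["filter1", "filter2"], "panflute-filters": ["pf1"]}
pandocpm_plan = plan_filters(metadata, "pandocpm")
panflute_plan = plan_filters(metadata, "panflute")
```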

@ickc
Member Author

ickc commented Jan 15, 2017

Auto-install could potentially cause a couple of problems:

  • simple packages are "sandboxed", so they won't be a problem. But packages installed through external package managers can be, since auto-install will contaminate people's systems. This will be especially problematic for package managers that are infamous for dependency hell (e.g. cabal).
  • if we only do auto-install for simple packages, then people will complain that it sometimes works and sometimes doesn't.
  • because of these, we should label auto-install as an experimental feature

I think auto-install should install simple packages only, and print an error message directing users to install complex packages themselves.
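
A minimal sketch of that policy, assuming each formula says whether the package is simple or needs an external package manager. The `kind` and `manager` fields and the `install_simple` callback are hypothetical, not part of pandocpm:

```python
import sys


def auto_install(formulas, install_simple):
    """Install simple packages; only report the ones needing external tools."""
    installed, skipped = [], []
    for name, formula in formulas.items():
        if formula.get("kind") == "simple":
            install_simple(name)
            installed.append(name)
        else:
            manager = formula.get("manager", "an external package manager")
            print(
                f"pandocpm: '{name}' needs {manager}; please install it manually.",
                file=sys.stderr,
            )
            skipped.append(name)
    return installed, skipped


formulas = {
    "filter1": {"kind": "simple"},
    "filter2": {"kind": "external", "manager": "cabal"},
}
calls = []  # records what the stand-in installer was asked to install
installed, skipped = auto_install(formulas, calls.append)
```

Nothing outside the sandbox is touched: complex packages only produce a message on stderr, so the user's system is never modified behind their back.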

sidenote: filter arguments

If so, this will give people more incentive to keep their filters "simple". One problem I'm personally facing is that my pantable is growing in complexity. I asked about filter arguments in https://groups.google.com/forum/m/#!topic/pandoc-discuss/LIAfgkZKUiE, and panzer already allows filter arguments. What's your view on this?

As for panflute's autofilter, it seems filter arguments won't work very well with it. But a feature like this (an auto-filter that offers a speed-up) seems too good to give up.

@ickc
Member Author

ickc commented Jan 21, 2017

@jgm had considered deprecating pandocfilters in favor of panflute (from pandoc-discuss), but when I asked him again he didn't reply. I assume that he will continue to maintain it, probably because a lot of users still rely on it. Porting has incredible friction; just look at the py2 to py3 transition.

I wonder whether it would be worth the effort to convince him to let me maintain pandocfilters, so that the necessary changes can be made to support some kind of auto-filter better, perhaps even following panflute's design so that pandocfilters itself also works as a filter. What do you think?
