Parsing of Separators #1

Closed
znjameswu opened this issue Dec 29, 2020 · 2 comments

@znjameswu

First of all, this is really a great library! As far as I know, there are very few open-source attempts at implementing UnicodeMath. Excellent job!

I'm also writing my own UnicodeMath parser in Dart. One problem I found in Appendix A (UnicodeMath Grammar) of "UnicodeMath, A Nearly Plain-Text Encoding of Mathematics, Version 3.1" is that the grammar for expBracket fails to account for separators.

Example

(1/2 \mid 3)

The \mid here needs to be recognized as a separator so that it stretches along with the enclosing brackets.

Solution

We may need to change the grammar from

expBracket ← opOpener exp opCloser
           ← ‘||’ exp ‘||’
           ← ‘|’ exp ‘|’

to

expBracket ← opOpener exp (opSeparator exp)+? opCloser
           ← ‘||’ exp ‘||’
           ← ‘|’ exp ‘|’

However, this introduces backtracking problems, since | (U+007C) can simultaneously act as an opener, a separator, or a closer. I don't have a good solution for this yet.
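To make the ambiguity concrete, here is a minimal Python sketch that enumerates every bracket-balanced way of reading the `|` tokens in an input. It is not taken from either parser discussed in this thread: the function name `pipe_readings`, the role labels, and the pre-split token list are all assumptions made for illustration, and the check only covers bracket balance (it does not verify that each `exp` is non-empty).

```python
from itertools import product

def pipe_readings(tokens):
    """Return every balanced assignment of roles to the '|' tokens.

    Each '|' may act as an opener, a separator, or a closer.
    Only '(' and ')' are treated as ordinary brackets in this sketch.
    """
    pipe_positions = [i for i, tok in enumerate(tokens) if tok == "|"]
    readings = []
    for roles in product(("open", "sep", "close"), repeat=len(pipe_positions)):
        role_at = dict(zip(pipe_positions, roles))
        stack = []  # currently open delimiters, innermost last
        ok = True
        for i, tok in enumerate(tokens):
            role = role_at.get(i)  # None for every non-pipe token
            if tok == "(" or role == "open":
                stack.append(tok)
            elif tok == ")" or role == "close":
                expected = "(" if tok == ")" else "|"
                if not stack or stack[-1] != expected:
                    ok = False
                    break
                stack.pop()
            elif role == "sep" and not stack:
                # A separator is only meaningful inside some bracket.
                ok = False
                break
        if ok and not stack:
            readings.append(roles)
    return readings

# "(a|b|c)" can be read as a|b|c with two separators, or as a·|b|·c
# with an absolute value in the middle.
print(pipe_readings(["(", "a", "|", "b", "|", "c", ")"]))
# → [('open', 'close'), ('sep', 'sep')]
```

With a dedicated separator token such as `\mid`, the same input has exactly one reading, which is why the ambiguity only arises once U+007C is allowed in all three roles.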

Meanwhile it would be great if we can collaborate on this matter. I have been documenting my findings in znjameswu/flutter_math#2

@doersino
Owner

doersino commented Jan 3, 2021

Hiya! I'm glad someone out there is finding my work useful! 😄

> I'm also writing my own UnicodeMath parser in Dart. One problem I found in Appendix A (UnicodeMath Grammar) of "UnicodeMath, A Nearly Plain-Text Encoding of Mathematics, Version 3.1" is that the grammar for expBracket fails to account for separators.

Yeah, the given grammar is rather incomplete. While implementing my parser, I had to piece together many details from the text, which has likely led to a number of subtle incompatibilities with the canonical implementation in Microsoft Office (which I don't currently have access to).


Addressing your specific issue, Sargent's document contains the following description of separators:

[Screenshot: excerpt on separators from Sargent's document; image not reproduced here]

With the most common use cases and a desire to end up with a comprehensible parsing grammar in mind, I've interpreted/implemented this basically as follows:

  • The pipe symbol (|, U+007C) is interpreted as a special type of bracket that must be paired with itself (presumably to denote absolute values), as are two consecutive pipe symbols (||) and the double vertical line (‖, U+2016).
  • The vertical bar separator \vbar (│, U+2502) is to be used for separators in expressions like (a|b). Sargent writes about using the pipe symbol for this in some cases, but admits that "the resulting ambiguities are insurmountable in general", which I took as a signal not to bother trying to make it work in this context.
  • The mid character \mid (∣, U+2223) is for expressions like {x | f(x) = 0}. Note that you seem to be conflating it with the pipe symbol U+007C, which might not be wrong: both may indeed work in Office.
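The division of roles described in the list above can be sketched as a tiny recursive-descent parser. This is only an illustration in Python, not the grammar actually used in the library: the names `OPENERS`, `SELF_PAIRED`, `SEPARATORS`, and `parse_bracket` are hypothetical, and every non-space character is treated as a single atom to keep the example short.

```python
OPENERS = {"(": ")", "[": "]", "{": "}"}
SELF_PAIRED = {"|", "\u2016"}        # pipe and double vertical line
SEPARATORS = {"\u2502", "\u2223"}    # \vbar and \mid

def parse_bracket(text, pos=0):
    """Parse one bracketed expression starting at text[pos].

    Returns (node, next_pos); node["parts"] holds the sub-expressions
    between separators, node["separators"] the separators themselves.
    """
    opener = text[pos]
    if opener in OPENERS:
        closer = OPENERS[opener]
    elif opener in SELF_PAIRED:
        closer = opener  # must be paired with itself
    else:
        raise ValueError(f"not an opener: {opener!r}")
    pos += 1
    parts, separators = [[]], []
    while pos < len(text):
        ch = text[pos]
        if ch == closer:
            node = {"open": opener, "close": closer,
                    "parts": parts, "separators": separators}
            return node, pos + 1
        if ch in OPENERS or ch in SELF_PAIRED:
            # Nested bracket: recurse and keep the child as one atom.
            child, pos = parse_bracket(text, pos)
            parts[-1].append(child)
            continue
        if ch in SEPARATORS:
            separators.append(ch)
            parts.append([])  # start the next sub-expression
        elif not ch.isspace():
            parts[-1].append(ch)
        pos += 1
    raise ValueError("unclosed bracket")

node, _ = parse_bracket("(a\u2502b)")
print(node["parts"])  # → [['a'], ['b']]
```

Because separators are distinct code points here, the pipe never has to be tried as a separator, so the parser needs no backtracking: inside `{x∣|x|>0}` the `∣` splits the content and the two `|` form a nested absolute value.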

I hope that's somewhat helpful!


Taking a look at your notes, it strikes me that you're also looking to implement the "input method" part of UnicodeMath, with successive (sub)equation build-up. I suspect that in this environment, you might be able to derive some context from the already-built-up portion of the expression that would e.g. allow using the pipe symbol for separators without the insurmountable ambiguities.
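One way such context could be used is a greedy left-to-right pass that decides the role of each `|` from what has already been built up. The Python sketch below is purely a hypothetical heuristic, not a description of what Office or either parser in this thread does; `classify_pipes`, the operator set, and the pre-split token list are assumptions.

```python
OPEN = {"(", "[", "{"}
CLOSE = {")", "]", "}"}
OPERATORS = {"+", "-", "=", "<", ">", "/"}

def classify_pipes(tokens):
    """Assign a role to every '|' using only the tokens seen so far."""
    roles = []
    stack = []               # open delimiters, innermost last
    prev_is_operand = False  # does the previous token end an operand?
    for tok in tokens:
        if tok == "|":
            if stack and stack[-1] == "|" and prev_is_operand:
                role = "close"   # finishes a pending |...|
                stack.pop()
            elif stack and prev_is_operand:
                role = "sep"     # follows an operand inside a bracket
            else:
                role = "open"    # nothing to close or separate yet
                stack.append("|")
            roles.append(role)
            prev_is_operand = role == "close"
        elif tok in OPEN:
            stack.append(tok)
            prev_is_operand = False
        elif tok in CLOSE:
            if stack:
                stack.pop()
            prev_is_operand = True
        else:
            prev_is_operand = tok not in OPERATORS
    return roles

print(classify_pipes(["(", "a", "|", "b", ")"]))  # → ['sep']
print(classify_pipes(["|", "x", "|"]))            # → ['open', 'close']
```

The pass commits to a single reading, so for an input like `(a|b|c)` it picks two separators and never considers the absolute-value reading; whether that is the desirable default is a design decision this sketch does not settle.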

@znjameswu
Author

Thank you for the comment. I may have misinterpreted your code when I opened this issue (I didn't see expBracketContent).


Also, thanks for your advice on interactive build-up strategies. In general, I find it quite hard to guess how MS Office implements this. I still don't have a solid hypothesis and need to try out a few approaches.
