-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AST for the content #159
Comments
+1 on using goldmark's AST, seems like a nice data model (better than plain Markdown strings) we can build upon |
@traut |
Hmm, okay. Do you have any suggestions on what data model to use here for the content trees? I would love to use an existing one instead of building one, and if it comes from the lib we'll be using -- it's a bonus. Markdown strings on their own are quite versatile, being a serialized version of a data model -- whatever the plugin returns, we can parse and adjust. |
Yes, markdown strings are versatile but it creates a problem that we must do a a lot of string manipulation and sanitization at plugin level and this creates bugs. Using existing library AST for this might be also tricky since we would need to encode/decode this tree using protobufs and it would be pretty hard to extend it since the focus for that tree is just one output - Markdown. But that doesn't work well since we also have a concept of If we want to extend it then it's best just to make a copy of their AST tree nodes that we could convert to goldmark's AST when rendering markdown. We don't really need all functionality of goldmark AST we just need similar data structure for core content elements. |
I like that this is a plugin's problem — Fabric expects a valid Markdown string, so the plugin can do whatever it wants to produce it. It does limit us, though, and is a ticking trouble :) It's too flexible, so the validation of the content is difficult (what if the plugin returns a huge, complex Markdown that will mess up the whole document?) but I'm not too worried about that just yet.
I agree, I think you are right. Our model can be more straightforward and should not be tied to Markdown explicitly -- we're going to support other serializations anyway. On that note, we can use our template data model as an inspiration for our content data model, adopting a concept of block with a content type. Even if it's just to wrap a markdown strings into nodes with metadata - it will still be very useful, I think |
I guess this can be closed @Andrew-Morozko? We'll continue working on the migration to the new structure |
@traut I'd rather close it after part two of AST is merged in, but it's up to you! |
Right now each content block produces markdown strings, which requires a lot of string manipulation. For example, the
title
block replaces newlines in its value, otherwise, the title would be broken:# Tiltle with newline is half-broken
We should use an AST structure closely resembling markdown and use it instead:
And use it to avoid repeated validation:
This also decouples us from markdown, making it possible for data sinks to render a document in a custom format (
html
,latex
, or, of course,markdown
...) from our ContentNode structure. Also, this makes the content more easily inspectable and traversable for meta-content providers such astoc
.I considered using goldmark's AST instead of creating our own, but their AST types are assumed to be valid markdown (since they usually come out of their markdown parser), while ours need to be validated/adjusted before we can pass them as a valid markdown.
Since our AST would be based on markdown it would be straightforward to implement
ContentNode
->goldmark/ast.Node
converter, after which we can use goldmark-markdown to render goldmarkAST
back into markdown, built-in renderer to produce HTML or goldmark-latex to produce LaTeX code.The text was updated successfully, but these errors were encountered: