Async processing of markdown #121
-
Heya! Thanks for your kind words :)
The thing with markdown (and HTML) is that it actually cannot stream. Definitions can exist at the end of a document. Whether they exist or not affects how earlier markdown is parsed. Whether something is a link or not affects more than just emitting a link. For example:

    ![[*x*][]](asd)
    ![[*y*][]](asd)

    [*x*]: https://example.com

Yields:

    <p><img src="asd" alt="x" />
    <img src="asd" alt="[y][]" /></p>

I am not sure that it is useful to control this. The alternative, threads/workers, seems like a better abstraction level that is already able to control this.
That is because ASTs allow arbitrary processing. Which is very easy when things are in their entirety as JSON. And very complex otherwise. The entire ecosystem is built around whole ASTs. It would mean none of the utilities or plugins work. And there’s no standard for streaming ASTs?
I am under the impression that GC always works like that?
I mean, there are costs with everything. Expensive is subjective. A whole pause/resume process seems expensive too?
I think you can send things to and from workers?
What is the use case? Why do you want this?
-
There's a small misunderstanding. I do not want to stream through the entire process. I know from prior discussions that making the AST requires the full context. What I'd like is similar to the streaming API of micromark: stream into a buffer, then create the AST from the buffer. But remark has no streaming API, nor does it export its compiler (the step that makes the AST) the way micromark exports its own. So I can't make my own streaming API with the library (unless I pull the code out wholesale, but I'd prefer to avoid doing that if I can).
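A minimal sketch of the "stream into a buffer, then create the AST from the buffer" idea. It is library-agnostic: `bufferThenParse` is a hypothetical helper, and `parse` is a stand-in for any whole-document parser (for example a markdown-to-mdast function), not an API that remark exposes.

```javascript
// Hypothetical helper: gather streamed chunks into one string, then parse once.
// `chunks` may be any sync or async iterable of strings or Uint8Arrays.
// `parse` is an assumed stand-in for a whole-document parser.
async function bufferThenParse(chunks, parse) {
  const decoder = new TextDecoder('utf-8');
  let buffer = '';

  for await (const chunk of chunks) {
    // `stream: true` keeps multi-byte characters intact when they are
    // split across two chunks.
    buffer += typeof chunk === 'string'
      ? chunk
      : decoder.decode(chunk, {stream: true});
  }

  // Flush any bytes the decoder is still holding.
  buffer += decoder.decode();

  // Only now is the full context available, so definitions at the end of
  // the document can affect how earlier content is parsed.
  return parse(buffer);
}
```

Receiving the chunks is asynchronous here, but the `parse` call itself still runs in one go, which is the part the rest of this thread is about.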
It is if you want to avoid major GCs. Even in an alternative thread, we hit them since we aren't letting go of the thread while building the AST.
I think my clarification at the beginning addresses this portion
The garbage collector only stops the world when it runs out of memory. Otherwise it goes through phases/generations: Nursery -> Intermediate -> Old. As an oversimplification, it is faster to reclaim objects from the Nursery than from the Old generation.
A pause/resume process would take longer in real time, but use less memory and CPU time.
One worker would not be able to share CPU time to create an AST. I could send another file to it, but until the first is done the second would wait.
Sharing markdown between users in a P2P web context, viewing them as HTML rather than text. Utilizing unified, remark, and rehype plugins to manipulate the tree after it's created.
-
In case y'all are interested.
Flame graphs
Remark - Basic (no plugins, readme.md) x500
Remark - Basic (no plugins, 50x readme.md) x10
Remark - Plugins (readme.md) x500
Remark - Plugins (50x readme.md) x10
Notes
Profiles made on Firefox
Notable Plugins: remark-directive
-
Goals
Non-Goals
How
Why
Both micromark and mdast-util-from-markdown parse the markdown file in one go. micromark exports a streaming option, but it relies on Node duplex streams and only converts markdown to HTML rather than to mdast.
By exporting the compiler from mdast-util-from-markdown, devs will be able to control the steps and timings of the processing.
Event Loop
Yielding to the event loop allows for the render process to proceed, garbage collection to occur (minor and major), and other events to occur.
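A rough sketch of what yielding to the event loop between units of work could look like. Everything here is an assumption for illustration: `processInSlices`, `yieldToEventLoop`, and the `step` callback are hypothetical names, and `step` stands in for one unit of parsing or compiling work rather than anything mdast-util-from-markdown exports.

```javascript
// Resolves on a later macrotask, so rendering, timers, and I/O callbacks
// queued in the meantime get a turn before work continues.
const yieldToEventLoop = () =>
  new Promise((resolve) => setTimeout(resolve, 0));

// Hypothetical driver: run `step` over `items`, pausing whenever the time
// budget (in milliseconds) for the current slice has been used up.
async function processInSlices(items, step, budgetMs = 8) {
  const results = [];
  let sliceStart = Date.now();

  for (const item of items) {
    results.push(step(item));

    if (Date.now() - sliceStart >= budgetMs) {
      await yieldToEventLoop();
      sliceStart = Date.now();
    }
  }

  return results;
}
```

The pause only gives the runtime an opportunity to do other work, garbage collection included; it does not force a collection, and the total wall-clock time goes up in exchange for shorter blocking stretches.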
Parity with micromark
micromark exports its compiler but mdast-util-from-markdown does not
Alternatives Considered
The usual suggestion is to parse markdown in a separate thread. This introduces new problems and leaves others unaddressed.
Even if the parser is in another thread.
Related Prior Discussion
syntax-tree/mdast-util-from-markdown#29 (different use case, but same result)
PS
The unified ecosystem is amazing. Thank you all for your dedication and your work making and maintaining it, responding to discussions, and dealing with issues.