Streaming LLM output
When an LLM streams tokens, you typically want them rendered as markdown (code blocks, tables, lists), not as raw text. The simplest strategy works: re-parse the entire buffer with `quikdown(buffer)` on every chunk. The parser is small and fast enough that the full re-renders are imperceptible.
Live simulated stream
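The live demo can't run on this page, but its core can be sketched in a few lines. This is a hedged sketch, not quikdown's demo code: the chunked generator and the `render`/`onUpdate` parameters are illustrative stand-ins (in a real page, `render` would be `quikdown` and `onUpdate` would assign `target.innerHTML`).

```javascript
// Simulate an LLM emitting markdown a few characters at a time.
async function* simulatedStream(markdown, chunkSize = 8) {
  for (let i = 0; i < markdown.length; i += chunkSize) {
    yield markdown.slice(i, i + chunkSize);
    await new Promise((r) => setTimeout(r, 20)); // fake network latency
  }
}

// Re-render the full buffer on every chunk, exactly like the fetch loop below.
async function renderStream(source, render, onUpdate) {
  let buffer = '';
  for await (const chunk of source) {
    buffer += chunk;
    onUpdate(render(buffer)); // in the browser: target.innerHTML = quikdown(buffer)
  }
  return buffer;
}
```

The placeholders make the sketch runnable anywhere, including Node, while keeping the same accumulate-and-re-render shape as the real pattern.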
The pattern
```javascript
import quikdown from 'quikdown';

async function streamFromLLM(prompt, target) {
  let buffer = '';
  const response = await fetch('/api/chat', {
    method: 'POST',
    body: JSON.stringify({ prompt }),
  });
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    target.innerHTML = quikdown(buffer); // ← re-render on each chunk
  }
}
```
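One detail in the loop worth calling out: `decoder.decode(value, { stream: true })` tells `TextDecoder` that more bytes are coming, so a multi-byte UTF-8 character split across two network chunks isn't corrupted. A small runnable illustration (the byte values are just an example):

```javascript
const decoder = new TextDecoder();

// "é" is two bytes in UTF-8 (0xC3 0xA9); pretend the network split them.
const first = decoder.decode(new Uint8Array([0xc3]), { stream: true });  // '' (held back)
const second = decoder.decode(new Uint8Array([0xa9]), { stream: true }); // 'é' (completed)

// Without { stream: true }, the lone lead byte decodes to U+FFFD (�).
const naive = new TextDecoder().decode(new Uint8Array([0xc3]));
```

Without the option, a chunk boundary falling inside a character would inject replacement characters into the buffer, and the markdown would render garbled until the next full re-parse of clean input.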
Why this works for streaming
- quikdown is small. ~9 KB of parser, no virtual DOM, no diffing. Re-parsing a 4 KB buffer takes microseconds.
- Setting innerHTML is cheap. Modern browsers handle it in a single layout pass when the markup structure is mostly unchanged between updates.
- Mid-token markdown is fine. If a chunk arrives partway through a code fence (an opening ``` with no closing fence yet), quikdown still produces sensible HTML: it renders the partial content as a code block. When the closing fence arrives, the next re-render snaps into place.
- XSS-safe. LLM output is untrusted. quikdown escapes HTML entities and sanitizes `javascript:` URLs before they hit the DOM.
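quikdown handles the unclosed-fence case internally; to see why whole-buffer re-parsing makes it easy, here is a toy check. `hasUnclosedFence` is a hypothetical helper for illustration, not part of quikdown's API: a parser can treat an odd count of fence markers as an open code block and render the tail accordingly.

```javascript
// Hypothetical helper: an odd number of line-leading ``` markers
// means a code fence is currently open.
function hasUnclosedFence(markdown) {
  const fences = markdown.match(/^```/gm) ?? [];
  return fences.length % 2 === 1;
}

const partial = 'Here is code:\n```js\nconsole.log(1);';
const complete = partial + '\n```\nDone.';

hasUnclosedFence(partial);  // true: render the tail as an open code block
hasUnclosedFence(complete); // false: the next re-render closes it
```

Because each chunk triggers a parse of the whole buffer from scratch, no streaming state needs to be carried between renders; the parser just sees a slightly longer document each time.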
This is the pattern quikchat uses to render markdown in streaming chat bubbles.