Streaming
Stream tokens over Server-Sent Events. Use it when users benefit from seeing output as it is generated, such as chat, coding, and long-form responses.
Examples
Python
    from openai import OpenAI

    client = OpenAI(base_url="https://test.sealink.io/v1", api_key="<your-sealink-key>")

    stream = client.chat.completions.create(
        model="qwen3-max",
        messages=[{"role": "user", "content": "Stream a 200-word reply."}],
        stream=True,
    )
    for chunk in stream:
        # Skip chunks without choices (for example, a usage-only final chunk).
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
    print()
Node.js
    import OpenAI from "openai";

    const client = new OpenAI({
      baseURL: "https://test.sealink.io/v1",
      apiKey: process.env.SEALINK_API_KEY,
    });

    const stream = await client.chat.completions.create({
      model: "qwen3-max",
      messages: [{ role: "user", content: "Stream a 200-word reply." }],
      stream: true,
    });
    for await (const chunk of stream) {
      const delta = chunk.choices[0]?.delta?.content;
      if (delta) process.stdout.write(delta);
    }
cURL (-N disables curl's output buffering so chunks print as they arrive)
    curl https://test.sealink.io/v1/chat/completions \
      -H "Authorization: Bearer $SEALINK_API_KEY" \
      -H "Content-Type: application/json" \
      -N \
      -d '{
        "model": "qwen3-max",
        "messages": [{"role":"user","content":"Stream please"}],
        "stream": true
      }'
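With cURL or a plain HTTP client you see raw SSE frames rather than SDK objects. The sketch below shows one way to pull the text out of them. It assumes the OpenAI-compatible wire format, in which each event is a `data: <json>` line and the stream ends with `data: [DONE]`; that framing is an assumption about SeaLink's output, and `iter_deltas` is a hypothetical helper name, not part of any SDK.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content deltas from raw SSE lines (hypothetical helper).

    Assumes OpenAI-style framing: one `data: {json}` line per event and a
    final `data: [DONE]` sentinel. Other lines are ignored.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separator, ":" comment, or another SSE field
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # e.g. a usage-only chunk
        delta = (choices[0].get("delta") or {}).get("content")
        if delta:
            yield delta
```

Feed it the decoded response lines from any HTTP client and join or print the yielded strings as they arrive.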
When to enable streaming
- Chat / customer-support UIs where users need visible progress
- Code generation and code explanation
- Longer responses or tasks that need immediate feedback
When not to stream
- When you need to parse complete JSON or structured output, which is only valid once the full response has arrived
- Very short responses where streaming adds little value
- Background batch jobs where nobody watches the output
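If you still have to stream but need structured output at the end, one option is to buffer the deltas and parse only after the stream finishes, because any prefix of the response is not yet valid JSON. A minimal sketch follows; `parse_streamed_json` is a hypothetical helper that accepts any iterable of text deltas.

```python
import json
from typing import Any, Iterable


def parse_streamed_json(deltas: Iterable[str]) -> Any:
    """Join streamed text deltas and parse them once the stream is exhausted.

    Raises json.JSONDecodeError if the assembled text is not valid JSON.
    """
    # "".join consumes the whole iterable, so parsing happens only at stream end.
    return json.loads("".join(deltas))
```

For purely structured responses, a non-streaming request is usually simpler than buffering a stream this way.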
Common pitfalls
- Reverse proxies and CDNs (nginx, Cloudflare, and similar) can buffer SSE responses by default. SeaLink disables buffering server-side, but verify that every hop in your own stack forwards chunks as they arrive.
- Forgetting to close the stream leaks connections. Use try/finally, or an async iterator that closes automatically.
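- Usage is finalized only when the stream ends. To receive it, set stream_options={"include_usage": true}; in the OpenAI API the usage then arrives on the final chunk, whose choices list is empty, so do not index choices[0] on that chunk.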
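The last two pitfalls can be handled together. The sketch below drains a stream inside try/finally and captures usage from whichever chunk carries it. It is duck-typed rather than tied to the SDK: `consume_stream` is a hypothetical helper that assumes each chunk exposes `.choices` and optionally `.usage`, and that the stream object may offer a `close()` method.

```python
from typing import Any, Iterable, Optional, Tuple


def consume_stream(stream: Iterable[Any]) -> Tuple[str, Optional[Any]]:
    """Drain a chat-completion stream and return (text, usage).

    Hypothetical helper: `usage` stays None unless some chunk carries it,
    which requires include_usage to be set on the request.
    """
    parts = []
    usage = None
    try:
        for chunk in stream:
            if getattr(chunk, "usage", None) is not None:
                usage = chunk.usage
            if not chunk.choices:
                continue  # the usage chunk carries no choices
            delta = chunk.choices[0].delta.content
            if delta:
                parts.append(delta)
    finally:
        # Release the underlying connection even if iteration raised.
        close = getattr(stream, "close", None)
        if callable(close):
            close()
    return "".join(parts), usage
```

With the Python SDK you would pass the object returned by `client.chat.completions.create(..., stream=True, stream_options={"include_usage": True})` as `stream`.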