Streaming
Overview
Streaming lets you process the model's response incrementally as it's generated, rather than waiting for the full response. This is useful for displaying text in real time or processing long outputs efficiently.
Basic Streaming
Use the stream method with a block that receives each text chunk:
session = Fm::Session.new(model, instructions: "You are a storyteller.")
session.stream("Tell me a short story.") do |chunk|
  print chunk
  STDOUT.flush
end
puts
Each chunk is a String containing only the text generated since the previous chunk (the delta), not the full accumulated response. If you need the complete text, concatenate the chunks yourself.
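Concatenating the deltas can be sketched as follows. This is a minimal, self-contained illustration: fake_stream is a hypothetical stand-in that yields a fixed list of chunks, so the snippet runs without the library. With a real session, the same String.build block would wrap the call to session.stream.

```crystal
# Hypothetical stand-in for session.stream (NOT part of Fm):
# yields each text delta to the block, one at a time.
def fake_stream(chunks : Array(String), &)
  chunks.each { |chunk| yield chunk }
end

# String.build collects everything appended to `io` into a single String.
full_text = String.build do |io|
  fake_stream(["Once ", "upon ", "a time."]) do |chunk|
    print chunk   # show the delta immediately
    io << chunk   # and keep it for later
  end
end

puts
puts "Complete text: #{full_text}"
```

Using String.build avoids allocating a new String on every chunk, which repeated `+=` would do.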
Streaming with Options
Pass a GenerationOptions object as the second argument to control temperature, the maximum response length, and other parameters:
options = Fm::GenerationOptions.new(
  temperature: 1.2,
  max_response_tokens: 1000_u32
)
session.stream("Write a creative poem.", options) do |chunk|
  print chunk
  STDOUT.flush
end
puts
Streaming JSON
Use stream_json with a JSON schema string to stream structured output that matches the schema:
schema = %({"type":"object","properties":{"title":{"type":"string"},"summary":{"type":"string"}},"required":["title","summary"]})
session.stream_json("Summarize the Crystal language.", schema) do |chunk|
  print chunk
  STDOUT.flush
end
puts
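If you want to work with the result as data rather than print it, one approach is to buffer the chunks and parse the buffer once the stream ends. The sketch below assumes that stream_json chunks are text deltas of a single JSON document, as with stream, which this guide does not state explicitly. fake_stream_json is a hypothetical stand-in so the snippet runs on its own; only JSON.parse and String.build come from Crystal's standard library.

```crystal
require "json"

# Hypothetical stand-in for session.stream_json (NOT part of Fm):
# yields fragments that together form one JSON document.
def fake_stream_json(fragments : Array(String), &)
  fragments.each { |fragment| yield fragment }
end

fragments = [
  %({"title":"Cry),
  %(stal","summary":"A compiled ),
  %(language."}),
]

# An individual fragment is usually not valid JSON, so buffer first.
json_text = String.build do |io|
  fake_stream_json(fragments) { |chunk| io << chunk }
end

# Parse only after the stream has finished.
result = JSON.parse(json_text)
puts result["title"].as_s
puts result["summary"].as_s
```

Parsing an incomplete buffer raises JSON::ParseException, so parse only after the block passed to the streaming call has returned.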
Cancellation
Cancel an ongoing stream from another fiber:
# Start streaming in a separate fiber
spawn do
  session.stream("Write a very long essay.") do |chunk|
    print chunk
    STDOUT.flush
  end
end

# Cancel after some time
sleep 2.seconds
session.cancel
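After cancelling, the main fiber often needs to know when the streaming fiber has actually stopped. One way to do that is a Channel, sketched below. FakeSession is a hypothetical stand-in, not the Fm API: it simply stops yielding once cancel has been called. How a real session's stream call ends after cancel (a normal return or an exception) is not covered here, so treat that part as an assumption.

```crystal
# Hypothetical stand-in for a session (NOT the Fm API).
class FakeSession
  @cancelled = false

  # Yields up to 10 chunks, pausing between them, until cancelled.
  def stream(prompt : String, &)
    10.times do |i|
      break if @cancelled
      yield "chunk #{i} "
      sleep 10.milliseconds
    end
  end

  def cancel
    @cancelled = true
  end
end

session = FakeSession.new
done = Channel(Nil).new
received = [] of String

# Stream in a separate fiber and signal when the stream call returns.
spawn do
  session.stream("Write a very long essay.") { |chunk| received << chunk }
  done.send(nil)
end

sleep 35.milliseconds
session.cancel
done.receive # block until the streaming fiber has finished

puts "Received #{received.size} chunks before cancellation"
```

The exact chunk count depends on timing. The point is that done.receive returns only after the streaming fiber has left the stream call.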
Checking State
Check whether the session is currently generating:
if session.responding?
  puts "Generation in progress..."
end