Basic Usage

Creating a Model

The SystemLanguageModel represents Apple's on-device foundation model. Create one with default settings or customize it with use case and guardrail options:

require "fm"

# Default model
model = Fm::SystemLanguageModel.new

# With specific use case and guardrails
model = Fm::SystemLanguageModel.new(
  use_case: Fm::UseCase::ContentTagging,
  guardrails: Fm::Guardrails::PermissiveContentTransformations
)

Use Cases

Value                     Description
UseCase::General          General-purpose language model use (default)
UseCase::ContentTagging   Optimized for content tagging and classification

Guardrails

Value                                          Description
Guardrails::Default                            Default guardrails applied to all generation
Guardrails::PermissiveContentTransformations   More permissive guardrails for content transformation tasks

Creating a Session

A Session holds the conversation state between you and the model. You can provide system instructions and tools when creating it; the example here passes instructions only:

session = Fm::Session.new(model, instructions: "You are a helpful assistant.")

Sessions maintain multi-turn conversation context automatically.

Generating Responses

Use respond for a blocking call that returns the complete response:

response = session.respond("What is Crystal?")
puts response.content

# Follow-up (session maintains context)
response = session.respond("What about its type system?")
puts response.content

With a Timeout

Set a timeout to limit how long the model can take:

response = session.respond("Complex question", timeout: 10.seconds)
puts response.content

With Generation Options

Customize the temperature, sampling strategy, maximum number of response tokens, and (optionally) the random seed:

options = Fm::GenerationOptions.new(
  temperature: 0.8,
  sampling: Fm::Sampling::Random,
  max_response_tokens: 500_u32,
  seed: 42_u64  # optional: set a seed for reproducible output
)

response = session.respond("Write a haiku.", options)
puts response.content
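
The seed comment above suggests that a fixed seed makes output reproducible. A minimal sketch of how that could be checked, using only the calls shown in this guide: each run uses a fresh session so that earlier turns cannot influence the result. Whether the two outputs actually match on a given device and OS version is an assumption to verify, not a guarantee stated by this guide.

```crystal
require "fm"

model = Fm::SystemLanguageModel.new

# Same options for both runs, including the fixed seed.
options = Fm::GenerationOptions.new(
  temperature: 0.8,
  sampling: Fm::Sampling::Random,
  max_response_tokens: 500_u32,
  seed: 42_u64
)

# Generate twice, each time in a new session so no context carries over.
outputs = Array.new(2) do
  session = Fm::Session.new(model, instructions: "You are a helpful assistant.")
  session.respond("Write a haiku.", options).content
end

puts outputs[0] == outputs[1] ? "Outputs are identical" : "Outputs differ"
```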

Error Handling

All fm.cr errors inherit from Fm::Error. Rescue the specific error types first for targeted handling, and finish with Fm::Error as a catch-all, since rescue clauses are matched in order:

begin
  response = session.respond("Hello")
rescue ex : Fm::TimeoutError
  puts "Timed out: #{ex.message}"
rescue ex : Fm::ToolCallError
  puts "Tool '#{ex.tool_name}' failed: #{ex.message}"
rescue ex : Fm::GuardrailViolationError
  puts "Guardrail violation: #{ex.message}"
rescue ex : Fm::Error
  puts "Error: #{ex.message}"
end

See the Errors reference for the complete list.
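
Building on the rescue pattern above, a timed-out call could be retried a few times before giving up. The sketch below assumes that a session remains usable after Fm::TimeoutError is raised, which this guide does not confirm. The helper respond_with_retry and its attempts parameter are hypothetical names introduced for illustration; they are not part of fm.cr.

```crystal
require "fm"

# Hypothetical helper (not part of fm.cr): retries a blocking respond call
# when it raises Fm::TimeoutError, up to `attempts` times in total.
def respond_with_retry(session, prompt, attempts = 3, timeout = 10.seconds)
  raise ArgumentError.new("attempts must be positive") if attempts < 1

  (1..attempts).each do |attempt|
    begin
      return session.respond(prompt, timeout: timeout)
    rescue ex : Fm::TimeoutError
      # Re-raise once the last attempt has timed out.
      raise ex if attempt == attempts
      puts "Attempt #{attempt} timed out, retrying..."
    end
  end

  # Not reached: the loop either returns a response or re-raises.
  raise "unreachable"
end

model = Fm::SystemLanguageModel.new
session = Fm::Session.new(model, instructions: "You are a helpful assistant.")

begin
  response = respond_with_retry(session, "What is Crystal?")
  puts response.content
rescue ex : Fm::Error
  # Covers the final timeout as well as any other fm.cr error.
  puts "Error: #{ex.message}"
end
```

Only timeouts are retried here; other errors, such as guardrail violations, are unlikely to succeed on a second attempt with the same prompt, so they propagate to the caller immediately.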

Token Usage

Estimate how many tokens a prompt will use (requires macOS 26.4+):

if tokens = model.token_usage_for("Hello, world!")
  puts "Prompt tokens: #{tokens}"
end

You can also estimate token usage for an instructions and tools configuration (tools_json stands for your tools definition and is not defined in this snippet):

if tokens = model.token_usage_for_tools("You are a helpful assistant.", tools_json)
  puts "System tokens: #{tokens}"
end

Prewarming

Hint the model ahead of time with a prompt prefix to reduce latency:

session.prewarm("Tell me about")
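
One way prewarming might fit into a program is to issue the hint while waiting for user input, then send a full prompt that begins with the same prefix. This is a sketch under assumptions: it uses only the calls shown in this guide, and it does not rely on whether prewarm blocks or returns immediately, which is not stated here.

```crystal
require "fm"

model = Fm::SystemLanguageModel.new
session = Fm::Session.new(model, instructions: "You are a helpful assistant.")

# The fixed part of the prompt is known before the user types anything.
prefix = "Tell me about"
session.prewarm(prefix)

# Read the variable part of the prompt; fall back to a default topic.
print "Topic: "
topic = gets.try(&.strip) || ""
topic = "Crystal" if topic.empty?

# The full prompt starts with the prefix that was prewarmed.
response = session.respond("#{prefix} #{topic}")
puts response.content
```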