Gemini 3.1 Flash Lite

Official

google/gemini-3.1-flash-lite

Google's cheapest GA model in the 3.x series. Matches Gemini 2.5 Flash quality at a fraction of the cost. Optimized for low-latency, high-volume workloads: classification, summarization, simple generation, and RAG at scale.

6x cheaper than Gemini 3.5 Flash

1049K context · Vision · Tool calling · JSON mode

Pricing

Source: gemini_x0.5 · Verified 2026-06-05

Input: $0.125 / M tokens (official $0.25 / M tokens, save 50%)

Output: $0.75 / M tokens (official $1.50 / M tokens, save 50%)

Cache read $0.0125/M · Cache write —/M
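The per-token prices above translate into a request cost with simple arithmetic. The following is a minimal sketch, not an official calculator: the function name `estimate_cost`, the rate dictionaries, and the 10,000 / 1,000 token workload are illustrative assumptions, and cache read pricing is ignored.

```python
# Rates in USD per million tokens, copied from the pricing lines above.
OFFICIAL_RATES = {"input": 0.25, "output": 1.50}
ONEHOP_RATES = {"input": 0.125, "output": 0.75}
TOKENS_PER_MILLION = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Return the estimated cost in USD for one request (cache pricing ignored)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / TOKENS_PER_MILLION


# Hypothetical workload: 10,000 input tokens and 1,000 output tokens.
official = estimate_cost(10_000, 1_000, OFFICIAL_RATES)
onehop = estimate_cost(10_000, 1_000, ONEHOP_RATES)
print(f"official: ${official:.4f}  onehop: ${onehop:.4f}")
```

For high-volume workloads, multiply the per-request figure by the expected number of requests, or pass aggregate token counts directly.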


Protocols

OpenAI Chat Completions · Status: Available · Streaming · https://api.onehop.ai/v1
OpenAI Responses · Status: Not supported
Anthropic Messages · Status: Not supported
Google Vertex AI · Status: Not supported
OpenAI Images · Status: Not supported
OpenAI Sora · Status: Not supported


Call IDs

Use the ID below to call this model via the API.

OneHop name: google/gemini-3.1-flash-lite

Try it

Replace the <ONEHOP_KEY> placeholder with your API key.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.onehop.ai/v1",
    api_key="<ONEHOP_KEY>",
)

completion = client.chat.completions.create(
    model="google/gemini-3.1-flash-lite",
    messages=[{"role": "user", "content": "What is the meaning of life?"}],
)
print(completion.choices[0].message.content)

base_url: https://api.onehop.ai/v1
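The listing advertises JSON mode and names classification as a target workload. The sketch below combines the two under stated assumptions: it uses the standard Chat Completions `response_format={"type": "json_object"}` parameter, and whether this gateway forwards that parameter unchanged is not confirmed by this page. The sentiment task, the `label` field, and the helper names `build_json_request`, `parse_label`, and `classify` are all hypothetical.

```python
import json

MODEL = "google/gemini-3.1-flash-lite"
BASE_URL = "https://api.onehop.ai/v1"


def build_json_request(review: str) -> dict:
    """Build Chat Completions keyword arguments that ask for a JSON object."""
    return {
        "model": MODEL,
        "messages": [
            {
                "role": "system",
                "content": (
                    "Classify the sentiment of the review. "
                    'Reply with JSON like {"label": "positive"}.'
                ),
            },
            {"role": "user", "content": review},
        ],
        # Standard OpenAI-style JSON mode; gateway support is an assumption.
        "response_format": {"type": "json_object"},
    }


def parse_label(content: str) -> str:
    """Extract the hypothetical 'label' field from the model's JSON reply."""
    data = json.loads(content)
    if not isinstance(data, dict) or "label" not in data:
        raise ValueError("reply is not a JSON object with a 'label' field")
    return str(data["label"])


def classify(review: str, api_key: str) -> str:
    """Send one classification request and return the parsed label."""
    from openai import OpenAI  # imported lazily so the offline check needs no SDK

    client = OpenAI(base_url=BASE_URL, api_key=api_key)
    completion = client.chat.completions.create(**build_json_request(review))
    return parse_label(completion.choices[0].message.content)


# Offline check of the parsing step; no request is sent here.
print(parse_label('{"label": "positive"}'))
```

Validating the reply before using it matters even with JSON mode enabled, since the mode constrains the output to valid JSON but does not guarantee any particular set of fields.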

Other variants in this family

Gemini 3.1 Pro: Google's flagship for advanced math, code, and reasoning.

$1 / M input · $6 / M output