Build a Gemini Generate Content Request

The Google Gemini API's generateContent endpoint uses a request structure that differs meaningfully from both OpenAI's and Anthropic's. The system instruction goes in a dedicated systemInstruction field; user and model turns go in the contents array with the roles "user" and "model" (not "assistant"); and generation parameters live in a generationConfig object. This example shows a complete Gemini 1.5 Pro request for a text summarization task.

Gemini 1.5 Pro offers a context window of up to 2,000,000 tokens, among the largest offered by a production LLM API. That makes it well suited to very long documents, entire codebases, or lengthy research papers that exceed even Claude's 200K window. The 1.5 Flash model offers 1,000,000 tokens at significantly lower cost, which makes million-token context processing far more affordable.

The generationConfig object controls output behavior: temperature (0.0–2.0, default 1.0), topP, topK, maxOutputTokens, and stopSequences. Gemini also supports responseMimeType: "application/json" for JSON output mode, similar to OpenAI's json_object mode. For structured extraction tasks, combine JSON mode with a clear schema in the system instruction.
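The request structure described above can be sketched as a small Python helper. This is a minimal illustration, not an official client: the function name build_gemini_request and its parameters are hypothetical, and only the field names (systemInstruction, contents, generationConfig, responseMimeType) come from the text above.

```python
import json


def build_gemini_request(system_text, user_text, temperature=0.3,
                         max_output_tokens=512, json_mode=False):
    """Assemble a generateContent request body as a plain dict.

    Hypothetical helper for illustration; only the field names mirror
    the request structure described in this article.
    """
    generation_config = {
        "temperature": temperature,
        "maxOutputTokens": max_output_tokens,
    }
    if json_mode:
        # Ask the model to emit JSON instead of free text.
        generation_config["responseMimeType"] = "application/json"

    return {
        # The system prompt is a separate field, not a message role.
        "systemInstruction": {"parts": [{"text": system_text}]},
        # Conversation turns use the roles "user" and "model".
        "contents": [{"role": "user", "parts": [{"text": user_text}]}],
        "generationConfig": generation_config,
    }


body = build_gemini_request(
    "Extract fields as JSON with keys: title, author.",
    "Title: Dune. Author: Frank Herbert.",
    json_mode=True,
)
print(json.dumps(body, indent=2))
```

Keeping the body as a plain dict makes it easy to inspect or log before sending, and the JSON-mode switch stays a single optional key inside generationConfig.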

Example
{
  "model": "gemini-1.5-pro",
  "systemInstruction": {
    "parts": [{"text": "You are a concise technical writer. Summarize documents in plain English without jargon."}]
  },
  "contents": [
    {
      "role": "user",
      "parts": [{"text": "Summarize the key points of the following technical documentation in 3-5 bullet points:\n\nREADME: This library provides a unified interface for connecting to multiple vector databases including Pinecone, Weaviate, Qdrant, and Chroma. It abstracts the client configuration, authentication, and query formats into a consistent API. Supported operations include upsert, query, delete, and list. The library handles connection pooling, retry logic, and rate limiting automatically."}]
    }
  ],
  "generationConfig": {
    "temperature": 0.3,
    "maxOutputTokens": 512
  }
}
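A request body like the one above could be sent with Python's standard library roughly as follows. This sketch assumes the public REST endpoint at generativelanguage.googleapis.com (v1beta), where the model is named in the URL path and the API key is passed in the x-goog-api-key header. For that reason it drops the "model" key from the body, which is an assumption made for this illustration. The helper names make_http_request, extract_text, and generate are hypothetical.

```python
import json
import urllib.request

# Assumed base URL of the public Gemini REST API.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def make_http_request(body, api_key, model="gemini-1.5-pro"):
    """Build (but do not send) the HTTP POST for generateContent."""
    # Assumption: the model travels in the URL path, so it is
    # removed from the JSON payload here.
    payload = {k: v for k, v in body.items() if k != "model"}
    return urllib.request.Request(
        f"{API_ROOT}/models/{model}:generateContent",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )


def extract_text(response_json):
    """Join the text parts of the first candidate in a response."""
    parts = response_json["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)


def generate(body, api_key, model="gemini-1.5-pro"):
    """Send the request and return the generated text (network call)."""
    request = make_http_request(body, api_key, model)
    with urllib.request.urlopen(request) as response:
        return extract_text(json.load(response))
```

Calling generate(body, api_key) with the example body and a valid key would return the summary text. Building the request in its own function keeps it inspectable without touching the network.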

FAQ

What is the Gemini equivalent of the system prompt?
The systemInstruction field at the top level of the request. It uses the same parts array format as content messages. Unlike OpenAI, where system is a message role, Gemini keeps systemInstruction as a separate field outside the contents array.
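The mapping described in this answer can be sketched as a converter from OpenAI-style chat messages to a Gemini request body. The function name openai_to_gemini is hypothetical, and the sketch assumes each message's content is a plain string. It moves system messages into systemInstruction and renames the "assistant" role to "model".

```python
def openai_to_gemini(messages):
    """Convert OpenAI-style messages into a Gemini request body.

    Hypothetical helper: handles only string content and the
    roles "system", "user", and "assistant".
    """
    system_parts = []
    contents = []
    for message in messages:
        role, text = message["role"], message["content"]
        if role == "system":
            # System prompts leave the turn list entirely.
            system_parts.append({"text": text})
        elif role == "user":
            contents.append({"role": "user", "parts": [{"text": text}]})
        elif role == "assistant":
            # Gemini calls the assistant role "model".
            contents.append({"role": "model", "parts": [{"text": text}]})
        else:
            raise ValueError(f"unsupported role: {role!r}")

    body = {"contents": contents}
    if system_parts:
        body["systemInstruction"] = {"parts": system_parts}
    return body


converted = openai_to_gemini([
    {"role": "system", "content": "You are a concise technical writer."},
    {"role": "user", "content": "Summarize this README."},
    {"role": "assistant", "content": "It wraps several vector databases."},
    {"role": "user", "content": "Shorter, please."},
])
print([turn["role"] for turn in converted["contents"]])
# prints ['user', 'model', 'user']
```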
What is the difference between Gemini 1.5 Pro and 1.5 Flash?
Both have very large context windows (2M tokens for Pro, 1M for Flash), but Flash is optimized for speed and cost (about 4x cheaper than Pro). Pro is better for complex reasoning, nuanced writing, and tasks requiring strong instruction following. Flash is excellent for summarization and extraction.
Does Gemini support function calling like OpenAI?
Yes. Gemini supports function declarations in the tools array, using a functionDeclarations key. The model returns a functionCall part in the response, and you return the result via a functionResponse part in a subsequent user message.
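The function-calling round trip described in this answer might look like the following sketch. The get_weather function, its schema, the helper names, and the sample response are all hypothetical, made up for illustration. Only the tools, functionDeclarations, functionCall, and functionResponse keys come from the answer above. The schema uses uppercase type names (OBJECT, STRING) on the assumption that the API's schema enum names are accepted.

```python
# Hypothetical tool declaration for illustration only.
TOOLS = [{
    "functionDeclarations": [{
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "OBJECT",
            "properties": {"city": {"type": "STRING"}},
            "required": ["city"],
        },
    }]
}]


def find_function_call(response_json):
    """Return (name, args) for the first functionCall part, or None."""
    parts = response_json["candidates"][0]["content"]["parts"]
    for part in parts:
        if "functionCall" in part:
            call = part["functionCall"]
            return call["name"], call.get("args", {})
    return None


def function_response_turn(name, result):
    """Wrap a tool result as a user turn with a functionResponse part."""
    return {
        "role": "user",
        "parts": [{"functionResponse": {"name": name, "response": result}}],
    }


# Made-up response showing the shape of a functionCall reply.
sample_response = {
    "candidates": [{
        "content": {
            "role": "model",
            "parts": [{"functionCall": {"name": "get_weather",
                                        "args": {"city": "Oslo"}}}],
        }
    }]
}

user_turn = {"role": "user", "parts": [{"text": "Weather in Oslo?"}]}
name, args = find_function_call(sample_response)

# The follow-up request replays the history plus the tool result.
follow_up = {
    "tools": TOOLS,
    "contents": [
        user_turn,
        sample_response["candidates"][0]["content"],
        function_response_turn(name, {"temp_c": 4, "sky": "cloudy"}),
    ],
}
print(name, args)
```

The follow-up body keeps the model's functionCall turn in contents, so the model sees its own call next to the returned result.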

Related Examples