Build an OpenAI Chat Completion Request

The Chat Completions API is the primary interface to the GPT models and the foundation of most OpenAI-powered applications. A well-formed request includes a model identifier, a messages array with at least one message, and optional parameters that control output behavior. This example shows a complete request for a code-explanation task, with a system prompt, a user message, and temperature and max_tokens configured for concise technical responses.

The messages array follows a strict role protocol: system sets the persona and instructions, user carries the human turn, and assistant carries previous AI responses in multi-turn conversations. Message order matters: the system message should come first, followed by alternating user and assistant messages. Inserting a second system message mid-conversation is not standard and may behave differently across models.

Key parameters:
- temperature (0.0–2.0) controls randomness. Use 0.0–0.3 for factual tasks and code, 0.7–1.0 for creative writing.
- max_tokens caps the response length in tokens (not words).
- top_p is an alternative randomness control; adjust either temperature or top_p, not both.
- model must exactly match an available model ID, including any version suffix (e.g., "gpt-4o-2024-08-06").

Example
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are a senior software engineer. Explain code clearly and concisely."
    },
    {
      "role": "user",
      "content": "Explain what this function does:\n\nconst debounce = (fn, delay) => {\n  let timer;\n  return (...args) => {\n    clearTimeout(timer);\n    timer = setTimeout(() => fn(...args), delay);\n  };\n};"
    }
  ],
  "temperature": 0.2,
  "max_tokens": 300
}

FAQ

What is the difference between temperature and top_p?
Both control output randomness. Temperature scales the probability distribution of next tokens; top_p (nucleus sampling) truncates the distribution to the top p probability mass. OpenAI recommends using one or the other, not both simultaneously.
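For instance, a request that relies on nucleus sampling instead of temperature might look like the sketch below (the model choice and prompt are illustrative placeholders):

```json
{
  "model": "gpt-4o",
  "messages": [
    { "role": "user", "content": "Write a haiku about the ocean." }
  ],
  "top_p": 0.9
}
```

With top_p set to 0.9, the model samples only from the smallest set of tokens whose cumulative probability reaches 90%, while temperature is left at its default.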
How do I format a multi-turn conversation?
Include previous messages in the messages array alternating between user and assistant roles. The API has no memory — you must send the full conversation history with every request to maintain context.
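A follow-up request for the debounce example above would resend the earlier turns along with the new question. A sketch (the assistant content is abbreviated here for illustration):

```json
{
  "model": "gpt-4o",
  "messages": [
    { "role": "system", "content": "You are a senior software engineer. Explain code clearly and concisely." },
    { "role": "user", "content": "Explain what this function does: ..." },
    { "role": "assistant", "content": "This is a debounce utility: it delays calling fn until delay ms have passed without another call." },
    { "role": "user", "content": "How would I add a way to cancel a pending call?" }
  ],
  "temperature": 0.2,
  "max_tokens": 300
}
```

Every prior user and assistant turn counts toward the model's context window, so long conversations may need to be truncated or summarized.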
What model should I use for most tasks?
GPT-4o is a strong general-purpose choice for complex tasks. GPT-4o-mini handles simpler tasks at a fraction of the cost with quality close to GPT-4o. Use o1 or o3 for reasoning-intensive work such as math, competitive programming, and multi-step planning.

Related Examples