Build an OpenAI Chat Completion Request
The Chat Completions API is the primary interface for GPT models and the foundation of most OpenAI-powered applications. A well-formed request includes a model identifier, a messages array with at least one message (typically a user message), and optional parameters that control output behavior. The example below shows a complete request for a code-explanation task, with a system prompt, a user message, and temperature and max_tokens tuned for concise technical responses.

The messages array follows a strict role protocol: system sets the persona and instructions, user carries the human turn, and assistant carries previous AI responses in multi-turn conversations. Order matters: the system message should come first, followed by alternating user and assistant messages. Inserting a second system message mid-conversation is not standard and may behave differently across models.

Key parameters:
- temperature (0.0–2.0) controls randomness. Use 0.0–0.3 for factual tasks and code, 0.7–1.0 for creative writing.
- max_tokens caps the response length in tokens (not words).
- top_p is an alternative randomness control; adjust either temperature or top_p, not both.
- model must match an available model ID. A bare alias such as "gpt-4o" resolves to a current snapshot; pin a dated version (e.g., "gpt-4o-2024-08-06") for reproducible behavior.
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are a senior software engineer. Explain code clearly and concisely."
    },
    {
      "role": "user",
      "content": "Explain what this function does:\n\nconst debounce = (fn, delay) => {\n let timer;\n return (...args) => {\n clearTimeout(timer);\n timer = setTimeout(() => fn(...args), delay);\n };\n};"
    }
  ],
  "temperature": 0.2,
  "max_tokens": 300
}
FAQ
Q: What is the difference between temperature and top_p?
A: Both control output randomness. Temperature rescales the probability distribution over next tokens; top_p (nucleus sampling) truncates the distribution to the smallest set of tokens whose cumulative probability reaches p. OpenAI recommends adjusting one or the other, not both.
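The truncation step behind top_p can be illustrated with a toy sketch over a made-up distribution (illustrative only; the API applies this internally over the model's full vocabulary):

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of highest-probability tokens whose
    cumulative probability reaches top_p (nucleus sampling)."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append(token)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

# Toy next-token distribution (made up for illustration)
probs = {"the": 0.5, "a": 0.3, "an": 0.15, "zebra": 0.05}
print(nucleus_filter(probs, 0.9))  # the low-probability tail ("zebra") is cut
```

A lower top_p keeps fewer candidates, which is why it acts as a randomness control much like lowering temperature.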
Q: How do I format a multi-turn conversation?
A: Include previous messages in the messages array, alternating between user and assistant roles. The API is stateless — you must send the full conversation history with every request to maintain context.
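A minimal sketch of maintaining that history client-side (the helper name is ours, not part of any SDK):

```python
def append_turn(history, user_text, assistant_text):
    """Record one completed exchange so it can be resent with the next request."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": assistant_text})
    return history

history = [{"role": "system", "content": "You are a helpful assistant."}]
append_turn(history, "What is a closure?",
            "A closure is a function that captures variables from its enclosing scope.")

# The next request sends the full history plus the new user turn.
next_messages = history + [{"role": "user", "content": "Show me an example."}]
print([m["role"] for m in next_messages])
# ['system', 'user', 'assistant', 'user']
```

Because the full history is resent each time, long conversations grow in token cost; trimming or summarizing old turns is a common mitigation.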
Q: What model should I use for most tasks?
A: GPT-4o is the strongest general-purpose choice for complex tasks. GPT-4o-mini delivers near-GPT-4o quality at a fraction of the cost for simpler tasks. Use the o1 or o3 reasoning models for reasoning-intensive work such as math, competitive programming, and multi-step planning.
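Putting the pieces together, the request shown at the top of the page can be assembled and sent with only the Python standard library. A minimal sketch, assuming an OPENAI_API_KEY environment variable (the helper names are ours; error handling and retries omitted):

```python
import json
import os
import urllib.request

def build_chat_request(system_prompt, user_message, *, model="gpt-4o",
                       temperature=0.2, max_tokens=300):
    """Assemble a Chat Completions payload like the JSON example above."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

payload = build_chat_request(
    "You are a senior software engineer. Explain code clearly and concisely.",
    "Explain what this function does: const add = (a, b) => a + b;",
)

def send(payload):
    """POST the payload to the Chat Completions endpoint."""
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__" and "OPENAI_API_KEY" in os.environ:
    reply = send(payload)
    # The assistant's text lives at choices[0].message.content
    print(reply["choices"][0]["message"]["content"])
```

In production you would typically use the official openai SDK instead, but the wire format is exactly the JSON shown above.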