data
SQL Query Generation Prompt
Generating SQL without schema context produces queries that reference non-existent columns or miss join conditions. This prompt requires the schema upfront and asks for comments inside the query, which makes the generated SQL much easier to understand and debug. The performance notes section often surfaces missing index suggestions that would otherwise require a separate query.
Prompt Template
You are a database engineer expert in {{dialect}} SQL.
Generate a SQL query that answers this question: {{question}}
Database schema:
```sql
{{schema}}
```
Additional requirements:
- {{requirements}}
- Use table aliases for readability
- Add a comment above each major clause explaining its purpose
- If the query requires a CTE or subquery, explain the logic in a comment before the query
After the query, provide:
1. **Plain English explanation** of what the query does
2. **Performance notes** — any indexes that would help, or steps that might be slow on large tables
3. **Sample output** — show what 2-3 example rows would look likeVariables
{{dialect}}SQL dialect: PostgreSQL, MySQL, SQLite, BigQuery, Snowflake, T-SQL{{question}}The business question you want the query to answer{{schema}}CREATE TABLE statements or a description of relevant tables and columns{{requirements}}Extra constraints, e.g., "filter to the last 30 days", "exclude test accounts", or "None"Example
Input
dialect: PostgreSQL question: Which customers made more than 3 orders in the last 90 days and what is their average order value? schema: CREATE TABLE customers (id INT, name TEXT, email TEXT, created_at TIMESTAMPTZ); CREATE TABLE orders (id INT, customer_id INT, total NUMERIC, created_at TIMESTAMPTZ); requirements: Exclude customers whose email contains 'test'
Output
```sql -- CTE: filter orders to the last 90 days for efficiency WITH recent_orders AS ( SELECT customer_id, total FROM orders WHERE created_at >= NOW() - INTERVAL '90 days' ) -- Main query: aggregate per customer and apply thresholds SELECT c.name, c.email, COUNT(ro.customer_id) AS order_count, ROUND(AVG(ro.total), 2) AS avg_order_value FROM customers c JOIN recent_orders ro ON c.id = ro.customer_id WHERE c.email NOT LIKE '%test%' GROUP BY c.id, c.name, c.email HAVING COUNT(ro.customer_id) > 3 ORDER BY avg_order_value DESC; ```
Related Tools
FAQ
- Can this generate queries for NoSQL databases?
- This prompt is optimised for SQL. For MongoDB, change the dialect to "MongoDB aggregation pipeline" and replace the schema with your document structure. The model will generate a JavaScript aggregation pipeline instead.
- How do I handle queries across 10+ tables?
- Include only the relevant tables and foreign key relationships in the schema, not the full database. Too much schema context confuses the model and produces join errors.
- Will the AI always use the most efficient query structure?
- Not always. The generated query is logically correct more often than it is optimally efficient. Always check the EXPLAIN output in your database and review the performance notes section for indexing suggestions.
Related Prompts
Data Analysis Prompt
Most AI data analysis prompts produce vague observations like "sales increased in Q2". Thi...
CSV Data Processing PromptCSV processing tasks involve numerous small decisions about null handling, column types, a...
Dockerfile Generation PromptDockerfiles generated without guidance often use :latest tags, run as root, and copy the e...