Load Testing Prompt (LLaMA / Ollama)

Load tests that hit a single endpoint with constant load do not reflect real user behaviour and produce misleading results. This prompt generates realistic traffic mixes with think time, parameterised data (to avoid cache inflation), and ramp-up profiles that match the system's actual access patterns. This variant is formatted for LLaMA / Ollama: it targets LLaMA 3, Mistral, and other local models run through Ollama, and uses the [INST] / <<SYS>> instruction format. That format originates with the Llama 2 chat models; Ollama normally applies each model's own chat template, so the wrapper tokens may need to be removed when the prompt is sent through a chat API.

Prompt Template

[INST] <<SYS>>
You are a helpful, accurate, and detailed AI assistant. Follow the instructions carefully.
<</SYS>>

You are a performance engineer specialising in load testing and capacity planning.

Create a load test scenario for the following system:

System: {{system}}
Critical endpoint(s): {{endpoints}}
Performance targets: {{targets}}
Expected traffic patterns: {{traffic_pattern}}
Test type: {{test_type}}
Tool: {{tool}}
Duration: {{duration}}

Generate:
1. **Load test script** — complete runnable script in {{tool}}
2. **Ramp-up profile** — gradual increase to avoid instant overload
3. **Think time** — realistic delays between requests to simulate real user behaviour
4. **Assertions** — fail the test if thresholds are breached (p95 > X ms, error rate > Y%)
5. **Scenarios** — multiple user journeys reflecting real traffic mix (e.g., 70% read, 30% write)
6. **Data parameterisation** — vary input data to avoid cache hits inflating results

After the script, provide:
- How to run the test and generate an HTML report
- Key metrics to monitor during the test
- What constitutes a pass/fail for this test [/INST]

Variables

{{system}}: System under test, e.g., "e-commerce checkout API"

{{endpoints}}: Key endpoints to stress, e.g., "POST /checkout, GET /products, GET /products/:id"

{{targets}}: Performance SLOs, e.g., "p95 < 500ms at 1000 concurrent users, error rate < 0.1%"

{{traffic_pattern}}: Traffic distribution, e.g., "70% browse products, 20% add to cart, 10% checkout"

{{test_type}}: Test type: load (normal), stress (above normal), spike (sudden burst), soak (extended duration)

{{tool}}: Load testing tool: k6, Locust, JMeter, Gatling, Artillery

{{duration}}: Test duration, e.g., "10 minutes" or "1 hour for soak test"
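The placeholders above can be filled programmatically before the prompt is sent to a model. The following is a minimal sketch, assuming a hypothetical `fillTemplate` helper (not part of any library) and a shortened three-line template standing in for the full prompt:

```javascript
// Hypothetical helper: substitutes {{name}} placeholders in a prompt template.
// Throws if a placeholder has no matching value, so gaps are caught early.
function fillTemplate(template, values) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) => {
    if (!Object.prototype.hasOwnProperty.call(values, name)) {
      throw new Error(`Missing value for {{${name}}}`);
    }
    return String(values[name]);
  });
}

// Shortened stand-in for the full prompt template (assumption for brevity).
const template = 'System: {{system}}\nTool: {{tool}}\nDuration: {{duration}}';

const prompt = fillTemplate(template, {
  system: 'E-commerce product catalog API',
  tool: 'k6',
  duration: '10 minutes',
});

console.log(prompt);
```

Failing on a missing variable, rather than leaving `{{...}}` in the text, avoids sending a half-filled prompt to the model.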

Example

Input

system: E-commerce product catalog API
endpoints: GET /products, GET /products/:id, POST /cart/items
targets: p95 < 300ms at 500 concurrent users, error rate < 0.5%
traffic_pattern: 60% browse products list, 30% view product detail, 10% add to cart
test_type: load test at target capacity
tool: k6
duration: 10 minutes

Output

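import http from 'k6/http';
import { sleep, check } from 'k6';
import { Rate } from 'k6/metrics';

// Custom metric: fraction of requests that failed their status check.
const errorRate = new Rate('errors');

// Assumptions: the base URL and product ID range are placeholders.
const BASE_URL = __ENV.BASE_URL || 'http://localhost:8080';
const PRODUCT_COUNT = 1000;

export const options = {
  stages: [
    { duration: '2m', target: 500 },  // ramp up to target
    { duration: '6m', target: 500 },  // hold at target
    { duration: '2m', target: 0 },    // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<300'], // p95 below 300 ms
    errors: ['rate<0.005'],           // error rate below 0.5%
  },
};

// Vary the product ID so requests do not all hit the same cache entry.
function randomProductId() {
  return Math.floor(Math.random() * PRODUCT_COUNT) + 1;
}

// Check for a 2xx status and record the outcome in the custom error metric.
function record(res) {
  const ok = check(res, { 'status is 2xx': (r) => r.status >= 200 && r.status < 300 });
  errorRate.add(!ok);
}

function browseProducts() {
  record(http.get(`${BASE_URL}/products`));
}

function viewProduct() {
  record(http.get(`${BASE_URL}/products/${randomProductId()}`));
}

function addToCart() {
  // Hypothetical request body; adjust to the real cart API.
  const body = JSON.stringify({ productId: randomProductId(), quantity: 1 });
  record(http.post(`${BASE_URL}/cart/items`, body, { headers: { 'Content-Type': 'application/json' } }));
}

export default function () {
  const rand = Math.random();
  if (rand < 0.6) {
    browseProducts();  // 60% browse product list
  } else if (rand < 0.9) {
    viewProduct();     // 30% view product detail
  } else {
    addToCart();       // 10% add to cart
  }
  sleep(Math.random() * 2 + 1); // 1-3s think time
}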
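The if/else chain in the default function hard-codes the 60/30/10 split. As the number of user journeys grows, a weighted picker may be easier to maintain. The sketch below is one way to do that in plain JavaScript; `trafficMix` and `pickScenario` are hypothetical names chosen for illustration, and the weights simply mirror the example input:

```javascript
// Hypothetical traffic mix: weights mirror the example's 60/30/10 split.
const trafficMix = [
  { name: 'browseProducts', weight: 60 },
  { name: 'viewProduct', weight: 30 },
  { name: 'addToCart', weight: 10 },
];

// Returns the scenario name selected by a random value in [0, 1).
// Weights do not need to sum to 100; they are normalised by their total.
function pickScenario(mix, rand = Math.random()) {
  const total = mix.reduce((sum, scenario) => sum + scenario.weight, 0);
  let threshold = rand * total;
  for (const scenario of mix) {
    if (threshold < scenario.weight) {
      return scenario.name;
    }
    threshold -= scenario.weight;
  }
  // Guards against floating-point edge cases by falling back to the last entry.
  return mix[mix.length - 1].name;
}

console.log(pickScenario(trafficMix, 0.75));
```

Inside a k6 script, the returned name could be used to look up and call the matching journey function, so changing the mix only requires editing the weights.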

FAQ

How many virtual users should I test with?

To find the breaking point, test at 2-3x your expected peak load. For SLO validation, test at expected peak. For capacity planning, increase load until the system degrades and note the maximum stable throughput.

Why is my load test producing unrealistically good results?

Common causes: caching inflating results (parameterise your requests to vary IDs), missing think time (add sleep between requests), and testing with fewer query parameters than production requests use. Also check whether the test is hitting a CDN instead of your origin.

How do I run load tests without impacting production?

Run against a production-like staging environment with the same infrastructure size. If you must test production, limit the test to a small percentage of traffic via feature flags, or run during a maintenance window with monitoring in place and the on-call team alerted.
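For the breaking-point case in the first answer, the load can be stepped up towards 2-3x peak rather than applied all at once, which makes it easier to see where the system stops being stable. The following is a rough sketch, assuming a hypothetical `buildStressStages` helper that emits stages in the same `{ duration, target }` shape used by the k6 example; the step count and durations are illustrative values only:

```javascript
// Hypothetical helper: builds stepped stages up to a multiple of peak load.
// Each step ramps to a new target, then holds it so stability can be observed.
function buildStressStages(peakUsers, multiplier, steps, stepDuration) {
  const stages = [];
  const maxUsers = peakUsers * multiplier;
  for (let i = 1; i <= steps; i++) {
    const target = Math.round((maxUsers * i) / steps);
    stages.push({ duration: stepDuration, target }); // ramp to this step
    stages.push({ duration: stepDuration, target }); // hold at this step
  }
  stages.push({ duration: stepDuration, target: 0 }); // ramp down
  return stages;
}

// Example: expected peak of 500 users, stepped up to 3x in three steps.
const stages = buildStressStages(500, 3, 3, '2m');
console.log(stages.map((stage) => stage.target).join(', '));
```

The highest step that still meets the latency and error thresholds gives an estimate of the maximum stable load.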

