
Load Testing Prompt (LLaMA / Ollama)

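Load tests that hit a single endpoint with constant load do not reflect real user behaviour and produce misleading results. This prompt generates realistic traffic mixes with think time, parameterised data (to avoid cache inflation), and ramp-up profiles that match the system's actual access patterns.

This variant is formatted for LLaMA-family and Mistral models run locally, for example through Ollama. It uses the [INST] / <<SYS>> instruction format, which is the Llama 2 chat convention. Llama 3 uses a different header-token template and Mistral's instruct format has no <<SYS>> block, so the tags may need adapting for those models. Ollama also applies each model's own template by default.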

Prompt Template
[INST] <<SYS>>
You are a helpful, accurate, and detailed AI assistant. Follow the instructions carefully.
<</SYS>>

You are a performance engineer specialising in load testing and capacity planning.

Create a load test scenario for the following system:

System: {{system}}
Critical endpoint(s): {{endpoints}}
Performance targets: {{targets}}
Expected traffic patterns: {{traffic_pattern}}
Test type: {{test_type}}
Tool: {{tool}}
Duration: {{duration}}

Generate:
1. **Load test script** — complete runnable script in {{tool}}
2. **Ramp-up profile** — gradual increase to avoid instant overload
3. **Think time** — realistic delays between requests to simulate real user behaviour
4. **Assertions** — fail the test if thresholds are breached (p95 > X ms, error rate > Y%)
5. **Scenarios** — multiple user journeys reflecting real traffic mix (e.g., 70% read, 30% write)
6. **Data parameterisation** — vary input data to avoid cache hits inflating results

After the script, provide:
- How to run the test and generate an HTML report
- Key metrics to monitor during the test
- What constitutes a pass/fail for this test [/INST]

Variables

{{system}}: System under test, e.g., "e-commerce checkout API"
{{endpoints}}: Key endpoints to stress, e.g., "POST /checkout, GET /products, GET /products/:id"
{{targets}}: Performance SLOs, e.g., "p95 < 500ms at 1000 concurrent users, error rate < 0.1%"
{{traffic_pattern}}: Traffic distribution, e.g., "70% browse products, 20% add to cart, 10% checkout"
{{test_type}}: Test type: load (normal), stress (above normal), spike (sudden burst), or soak (extended duration)
{{tool}}: Load testing tool: k6, Locust, JMeter, Gatling, or Artillery
{{duration}}: Test duration, e.g., "10 minutes" or "1 hour for soak test"
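
The variables above are plain {{name}} placeholders, so filling them is a string substitution. The sketch below is one possible way to do that in JavaScript and to build a request body for Ollama's POST /api/generate endpoint. The helper names `fillTemplate` and `buildOllamaRequest`, the shortened template, and the model name are assumptions made for illustration. Setting `raw: true` asks Ollama to skip its own prompt template, which matters here because the template already carries its own [INST] tags.

```javascript
// Hypothetical helper: replaces every {{name}} placeholder with vars[name].
// Throws on a missing value so a half-filled prompt is never sent.
function fillTemplate(template, vars) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) => {
    if (!Object.prototype.hasOwnProperty.call(vars, name)) {
      throw new Error(`Missing value for ${match}`);
    }
    return String(vars[name]);
  });
}

// Hypothetical helper: builds the JSON body for Ollama's /api/generate.
// raw: true disables Ollama's built-in template; stream: false returns one response.
function buildOllamaRequest(prompt, model) {
  return { model, prompt, raw: true, stream: false };
}

// Shortened stand-in for the full prompt template.
const template = 'System: {{system}}\nTool: {{tool}}\nDuration: {{duration}}';

const prompt = fillTemplate(template, {
  system: 'E-commerce product catalog API',
  tool: 'k6',
  duration: '10 minutes',
});

// The model name is an assumption; use whichever model is pulled locally.
const body = buildOllamaRequest(prompt, 'llama2');
console.log(JSON.stringify(body, null, 2));
```

The body can then be sent to a local Ollama server, which listens on port 11434 by default, with any HTTP client.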

Example

Input
system: E-commerce product catalog API
endpoints: GET /products, GET /products/:id, POST /cart/items
targets: p95 < 300ms at 500 concurrent users, error rate < 0.5%
traffic_pattern: 60% browse products list, 30% view product detail, 10% add to cart
test_type: load test at target capacity
tool: k6
duration: 10 minutes
Output
import http from 'k6/http';
import { sleep, check } from 'k6';
import { Rate } from 'k6/metrics';

const errorRate = new Rate('errors');

export const options = {
  stages: [
    { duration: '2m', target: 500 },  // ramp up to target
    { duration: '6m', target: 500 },  // hold at target
    { duration: '2m', target: 0 },    // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<300'],
    errors: ['rate<0.005'],
  },
};

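// Assumption: the base URL comes from the environment; the fallback is a placeholder.
const BASE_URL = __ENV.BASE_URL || 'http://localhost:3000';

// Vary product IDs to avoid cache hits. Assumption: IDs 1-1000 exist in the test data.
function randomProductId() {
  return Math.floor(Math.random() * 1000) + 1;
}

// Feed the custom error rate that the 'errors' threshold checks.
function record(res) {
  const ok = check(res, { 'status is 2xx': (r) => r.status >= 200 && r.status < 300 });
  errorRate.add(!ok);
}

function browseProducts() {
  record(http.get(`${BASE_URL}/products`));
}
function viewProduct() {
  record(http.get(`${BASE_URL}/products/${randomProductId()}`));
}
function addToCart() {
  // Assumption: the cart endpoint accepts a JSON body with these field names.
  const payload = JSON.stringify({ productId: randomProductId(), quantity: 1 });
  const params = { headers: { 'Content-Type': 'application/json' } };
  record(http.post(`${BASE_URL}/cart/items`, payload, params));
}

export default function () {
  const rand = Math.random(); // pick a journey per the 60/30/10 traffic mix
  if (rand < 0.6) {
    browseProducts();
  } else if (rand < 0.9) {
    viewProduct();
  } else {
    addToCart();
  }
  sleep(Math.random() * 2 + 1); // 1-3s think time
}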
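
The if/else chain in the script above hard-codes the 60/30/10 mix. When the mix changes often, a table-driven picker can be easier to maintain. The following is a minimal sketch in plain JavaScript; `trafficMix` and `pickScenario` are illustrative names, not part of k6.

```javascript
// Hypothetical traffic mix; weights are relative and need not sum to 100.
const trafficMix = [
  { name: 'browseProducts', weight: 60 },
  { name: 'viewProduct', weight: 30 },
  { name: 'addToCart', weight: 10 },
];

// Maps a number in [0, 1) to a scenario name by walking cumulative weights.
function pickScenario(rand, mix) {
  const total = mix.reduce((sum, scenario) => sum + scenario.weight, 0);
  const target = rand * total;
  let cumulative = 0;
  for (const scenario of mix) {
    cumulative += scenario.weight;
    if (target < cumulative) {
      return scenario.name;
    }
  }
  // Fallback in case rand is 1 or rounding pushes target past the total.
  return mix[mix.length - 1].name;
}

console.log(pickScenario(Math.random(), trafficMix));
```

Inside a k6 script, the returned name could index an object that maps scenario names to the journey functions, so changing the mix only means editing the table.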


FAQ

How many virtual users should I test with?
It depends on the goal. For SLO validation, test at expected peak load. To find the breaking point, test at 2-3x expected peak. For capacity planning, keep increasing load until the system degrades and note the maximum stable throughput.
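
If the peak is known as requests per second rather than as a user count, Little's Law gives a rough conversion for a closed-loop test: concurrent users = throughput x (response time + think time). The sketch below applies it. The function name and all numbers are illustrative assumptions, not measurements from a real system.

```javascript
// Little's Law for a closed-loop load test:
// concurrent users = throughput * (response time + think time).
function estimateVirtualUsers(targetRps, avgResponseSec, avgThinkSec) {
  return Math.ceil(targetRps * (avgResponseSec + avgThinkSec));
}

// Illustrative inputs only.
const peakRps = 200;        // expected peak requests per second
const avgResponseSec = 0.5; // average response time in seconds
const avgThinkSec = 2.0;    // average think time in seconds

const vusAtPeak = estimateVirtualUsers(peakRps, avgResponseSec, avgThinkSec);
const vusForStress = estimateVirtualUsers(peakRps * 3, avgResponseSec, avgThinkSec);

console.log(vusAtPeak);    // 500
console.log(vusForStress); // 1500
```

The estimate is only a starting point: response time rises as the system approaches saturation, so the same number of virtual users will produce less throughput under stress.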
Why is my load test producing unrealistically good results?
Common causes: caching (parameterise requests so that IDs and query values vary), missing think time (add sleep between requests), and test queries that use fewer parameters than production queries. Also check whether the test is hitting a CDN instead of your origin.
How do I run load tests without impacting production?
Run against a production-like staging environment with the same infrastructure size. If you must test in production, restrict the test to a small percentage of traffic using feature flags, or run it during a maintenance window with monitoring and alerting in place.
