
The Fine Art of a Minimal Reproducible Example

Two hours in, what you're actually shipping is a GitHub issue, a JIRA ticket, or a Slack thread: symptoms, hunches, pasted logs, then one more file because someone asked nicely. Each round trip tugs another detail out of your repo while the runnable version still lives mostly on your machine. Then a follow-up makes you spell out the repro steps you sort of skipped the first time, and halfway through your reply the bug stops looking fuzzy. Nobody merged a fix for you. You just finally described it clearly enough to see it yourself.

That loop is about as expensive as debugging gets, and almost none of it is mandatory. The antidote is the minimal reproducible example (MRE): the smallest, most self-contained piece of code that reliably triggers the problem you're trying to explain. Being kind to whoever reads your ticket is a nice side effect. The main payoff is that your own picture of the break gets sharper. Most of the time you'll see the answer yourself before you hit send.

A Diagnostic Trilogy

An MRE has exactly three properties. Violate any one of them and you no longer have an MRE: you have either a code dump or a non-starter.

Minimal

Strip the code down to the smallest case that still triggers the issue. Remove authentication helpers, logging setup, database connections, unrelated configuration, and everything else that isn't directly involved in the failure. If you can delete a line and the bug still occurs, delete it.

This is harder than it sounds when you're deep in a problem. Everything feels relevant. It's not. The act of removing code is itself a debugging act: each deletion either moves you closer to the root cause or proves that section of code isn't involved.

Reproducible

The example must trigger the same behavior every time, in any environment, for anyone who runs it. "It only happens on my machine" isn't an MRE: it's still a hypothesis. Track down the conditions that consistently trigger the issue and make those conditions explicit in the example.

Your example shouldn't depend on:

  • External services that require credentials or network access
  • Files or state that exist only on your machine
  • Configuration loaded from environment variables that aren't shown
  • A sequence of prior steps that must be completed first

If external state is genuinely part of the problem, substitute the smallest possible local stand-in: a hardcoded string, an in-memory mock, or a local test fixture.
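A sketch of that substitution in Python, with an invented payload chosen only for illustration: suppose you suspect the bug lives in your parsing, not in the HTTP layer. A hardcoded string with the same shape as the live response keeps the failure and drops the network:

```python
import json

# Hypothetical stand-in for a live API response; the field values are
# invented, picked only because they reproduce the suspected failure.
payload = '{"id": 42, "name": null}'

user = json.loads(payload)  # no network, no credentials, same data shape
try:
    print(user["name"].upper())
except AttributeError as exc:
    # The failure reproduces with the external service out of the loop.
    print(f"reproduced: {exc}")
```

If the failure survives the swap, the external service was never part of the bug; if it vanishes, you've just narrowed the search to the boundary you removed.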

Example

An example is a program that runs. Not a fragment, not a diff, not pseudocode: a complete, standalone piece of code that anyone can paste into a clean environment and execute immediately. Include imports. Include dependency versions. Include the invocation. Remove every reason for someone to say "I can't run this."
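A complete MRE can be one short file. A sketch of the shape, with an invented issue so only the structure matters:

```python
"""MRE: json.loads keeps key order instead of sorting it (hypothetical report).

Python 3.12, standard library only.
Run: python mre.py
"""
import json

data = json.loads('{"b": 1, "a": 2}')

# Expected (wrongly): ['a', 'b']. Observed: ['b', 'a'] -- insertion order.
print(list(data))
```

Versions, invocation, imports, expected versus observed: all in one paste-and-run file.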

The Debug That Happens Before You Ask

Creating an MRE is a more reliable debugging technique than most developers give it credit for. The process forces you to form and test hypotheses. You delete a block and the problem disappears: that block was involved. You delete a different block and the problem persists: that block isn't. You replace a library call with a direct implementation and the behavior changes: the issue was in the library, not your code.

This is the rubber duck effect scaled up and made systematic. Explaining a problem to an inanimate object forces clear articulation, and clear articulation surfaces the gaps in your reasoning. Constructing an MRE does the same thing, but with code instead of words.

The Disappearing Bug

Plenty of bugs show themselves while you're carving an MRE. You close the ticket because you already found the cause. That still counts as a win: you understand what broke instead of only muting the symptom.

Building One: The Process in Four Steps

Start from a blank file. Not a copy of your project with things deleted: a new file from scratch. Rebuild only what is necessary to trigger the issue. This discipline prevents you from dragging irrelevant code along through the back door.

Add the minimum until it fails. Introduce just enough structure to trigger the behavior. Run it after each addition. Stop adding the moment the issue appears.

Remove everything that isn't the bug. Go the other direction. Delete functions, variables, imports, and configuration blocks one at a time. Run after each removal. If removing something makes the issue disappear, put it back. Everything else goes.

Pin and document your environment. Record the exact versions of every tool and dependency involved. What looks like a logic bug can be a version regression. If you're not sure whether version matters, include it anyway: let the person helping you decide.
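In Python, for instance, the example can capture its own environment; a minimal sketch, shown here with pip as the inspected package only because it's usually present (substitute whatever your MRE actually imports):

```python
import sys
import importlib.metadata

# Interpreter version first; version regressions masquerade as logic bugs.
print(sys.version.split()[0])

# Exact installed version of a dependency the MRE uses.
print(importlib.metadata.version("pip"))
```

Pasting this output into the report costs two lines and removes a whole class of "works for me" replies.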

When 50 Lines Won't Get Shorter

If you genuinely can't reduce your example below a certain threshold, that's itself meaningful diagnostic information: the problem requires that level of complexity to manifest. Say so explicitly. It narrows the search space even before anyone reads the code.

Five Examples Across Languages and Tools

Each example below shows the same pattern: a production scenario with too much noise to diagnose cleanly, then the MRE that isolates the exact issue. Skim the "Before" code for the friction, then watch what stripping away the extras reveals.

Go: Every Goroutine Sees the Same Task

The Scenario: Picture a Go service that processes a list of tasks concurrently. The full-fat version connects to a database, sets up an HTTP client, initializes logging and metrics, and launches a goroutine per task. Every goroutine looks like it's processing the same task, even though the loop reads fine on inspection.

package main

import (
    "database/sql"
    "log"
    "net/http"
    "sync"
    "time"

    _ "github.com/lib/pq"
)

type Task struct {
    ID   int
    Name string
}

func fetchTasks(db *sql.DB) []Task {
    // ... 40 lines of query logic ...
    return []Task{{ID: 1, Name: "alpha"}, {ID: 2, Name: "beta"}, {ID: 3, Name: "gamma"}}
}

func processTask(t Task, client *http.Client, logger *log.Logger) error {
    // ... 80 lines of HTTP calls, retries, metrics ...
    logger.Printf("processing task %d: %s", t.ID, t.Name)
    return nil
}

func main() {
    db, _ := sql.Open("postgres", "host=localhost dbname=tasks")
    client := &http.Client{Timeout: 10 * time.Second}
    logger := log.Default()

    tasks := fetchTasks(db)

    var wg sync.WaitGroup
    for _, t := range tasks {
        wg.Add(1)
        go func() {
            defer wg.Done()
            if err := processTask(t, client, logger); err != nil {
                logger.Printf("error: %v", err)
            }
        }()
    }
    wg.Wait()
}

The output shows gamma processed three times. Database query? Goroutine pool logic? The processTask implementation? With this much code in the way, every guess costs you time.

package main

import (
    "fmt"
    "sync"
)

func main() {
    items := []string{"alpha", "beta", "gamma"}

    var wg sync.WaitGroup
    for _, item := range items {
        wg.Add(1)
        go func() {
            defer wg.Done()
            fmt.Println(item)
        }()
    }
    wg.Wait()
}

Output:

gamma
gamma
gamma

The database connection, HTTP client, and logging setup are just noise. The bug lives in a few lines. Each goroutine closes over the loop variable item, which shared one storage location across iterations before Go 1.22. By the time any goroutine runs, the loop has finished and item holds its final value: "gamma". On Go 1.22 and later, each iteration gets its own item, so you'll see alpha, beta, and gamma once each (order may vary). The output above still matches Go 1.21 and earlier, and the closure pitfall still shows up in older code.

The Fix: Pass item as a parameter to the goroutine function literal.

for _, item := range items {
    wg.Add(1)
    go func(i string) {
        defer wg.Done()
        fmt.Println(i)
    }(item)
}
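
The same late-binding capture bites outside Go. A Python sketch of the identical pitfall: each lambda closes over the loop variable itself, not its value at definition time, and the fix mirrors the Go one by binding the current value as a parameter:

```python
items = ["alpha", "beta", "gamma"]

# Each lambda closes over the *variable* item, so all three share one binding.
late = [lambda: item for item in items]
print([f() for f in late])        # ['gamma', 'gamma', 'gamma']

# Binding the current value as a parameter default freezes it per iteration.
bound = [lambda item=item: item for item in items]
print([f() for f in bound])       # ['alpha', 'beta', 'gamma']
```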

Python: Results Leak Across Pipeline Runs

The Scenario: You've got a data pipeline that runs in two successive batches. Each batch reads a CSV, filters rows, and funnels results through a helper. After the second batch runs, the result set somehow contains rows from both batches. The real thing spans several files: CSV loading, configuration, row validation, and result handling.

import csv
import logging
from typing import Optional

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def load_config(path: str) -> dict:
    # ... config loading logic ...
    return {"output_dir": "/tmp/results", "batch_size": 100}


def process_row(row: dict, config: dict) -> Optional[dict]:
    # ... validation, transformation, enrichment ...
    if row.get("active") == "true":
        return {"id": row["id"], "name": row["name"]}
    return None


def append_result(result: dict, accumulator: list = []) -> list:
    accumulator.append(result)
    return accumulator


def run_pipeline(input_file: str) -> None:
    config = load_config("config.yaml")
    results = []
    with open(input_file) as f:
        reader = csv.DictReader(f)
        for row in reader:
            processed = process_row(row, config)
            if processed:
                results = append_result(processed)
    logger.info("Collected %d results", len(results))


if __name__ == "__main__":
    run_pipeline("data_batch_1.csv")
    run_pipeline("data_batch_2.csv")

The second run reports more results than the second CSV actually contains, and some records clearly belong to the first batch. CSV loading? Config? Row validation logic? You won't spot the answer at this scale.

def append_result(result: dict, accumulator: list = []) -> list:
    accumulator.append(result)
    return accumulator


print(append_result({"id": 1}))
print(append_result({"id": 2}))
print(append_result({"id": 3}))

Output:

[{'id': 1}]
[{'id': 1}, {'id': 2}]
[{'id': 1}, {'id': 2}, {'id': 3}]

The CSV loading, config parsing, and logging are just noise. The bug's sitting in the function signature. Default argument values in Python are evaluated once when the function is defined, not each time it's called. The [] is a single list object that lives for the lifetime of the module. Every call that doesn't pass an explicit accumulator is appending to the same list.

The Fix: Use None as the sentinel and initialize the list inside the function.

def append_result(result: dict, accumulator: list | None = None) -> list:
    if accumulator is None:
        accumulator = []
    accumulator.append(result)
    return accumulator

Terraform: Guest OS Customization Fails on Clone

The Scenario: You're cloning a Linux VM from a template in a vSphere-backed Terraform stack. The configuration sprawls across modules for networking, storage, and compute, with remote state in an S3 backend and dozens of resources. terraform apply dies with a guest OS customization error, even though the template and guest ID look fine in vCenter.

Disclaimer

This is not an official VMware by Broadcom document. This is a personal blog post.

The information is provided as-is with no warranties and confers no rights.

Please, refer to official documentation for the most up-to-date information.

# main.tf (condensed from a 600-line multi-module configuration)

module "network" {
  source     = "./modules/network"
  datacenter = var.datacenter
  # ...
}

module "storage" {
  source    = "./modules/storage"
  datastore = var.datastore
  # ...
}

resource "vsphere_virtual_machine" "app_server" {
  name             = "app-server-01"
  resource_pool_id = data.vsphere_compute_cluster.cluster.resource_pool_id
  datastore_id     = module.storage.datastore_id
  num_cpus         = 4
  memory           = 8192
  guest_id         = data.vsphere_virtual_machine.template.guest_id
  firmware         = data.vsphere_virtual_machine.template.firmware

  network_interface {
    network_id = module.network.portgroup_id
  }

  disk {
    label = "disk0"
    size  = data.vsphere_virtual_machine.template.disks.0.size
  }

  clone {
    template_uuid = data.vsphere_virtual_machine.template.id

    customize {
      linux_options {
        host_name = "app-server-01"
        domain    = var.domain
      }
      network_interface {}
    }
  }
}

The error's buried under module output, state refresh logs, and provider diagnostics. The signal's in there somewhere, but you're wading through output that has nothing to do with the failing resource.

terraform {
  required_providers {
    vsphere = {
      source  = "hashicorp/vsphere"
      version = "~> 2.11"
    }
  }
}

provider "vsphere" {
  vsphere_server       = var.vsphere_server
  user                 = var.vsphere_user
  password             = var.vsphere_password
  allow_unverified_ssl = true
}

data "vsphere_datacenter" "dc" {
  name = "dc-01"
}

data "vsphere_datastore" "ds" {
  name          = "datastore-01"
  datacenter_id = data.vsphere_datacenter.dc.id
}

data "vsphere_compute_cluster" "cluster" {
  name          = "cluster-01"
  datacenter_id = data.vsphere_datacenter.dc.id
}

data "vsphere_network" "network" {
  name          = "VM Network"
  datacenter_id = data.vsphere_datacenter.dc.id
}

data "vsphere_virtual_machine" "template" {
  name          = "ubuntu-22.04-template"
  datacenter_id = data.vsphere_datacenter.dc.id
}

resource "vsphere_virtual_machine" "vm" {
  name             = "mre-test-vm"
  resource_pool_id = data.vsphere_compute_cluster.cluster.resource_pool_id
  datastore_id     = data.vsphere_datastore.ds.id
  num_cpus         = 2
  memory           = 2048
  guest_id         = data.vsphere_virtual_machine.template.guest_id
  firmware         = data.vsphere_virtual_machine.template.firmware

  network_interface {
    network_id = data.vsphere_network.network.id
  }

  disk {
    label = "disk0"
    size  = data.vsphere_virtual_machine.template.disks.0.size
  }

  clone {
    template_uuid = data.vsphere_virtual_machine.template.id

    customize {
      linux_options {
        host_name = "mre-test-vm"
        domain    = "example.com"
      }
      network_interface {}
    }
  }
}

variable "vsphere_server"   { type = string }
variable "vsphere_user"     { type = string }
variable "vsphere_password" {
  type      = string
  sensitive = true
}

The apply fails with:

Error: error cloning virtual machine: error reconfiguring virtual machine:
ServerFaultCode: A general system error occurred: Customization of the guest operating
system 'ubuntu64Guest' is not supported in this configuration.

With the module hierarchy, remote state, and unrelated resources stripped away, the error and the failing resource configuration sit side by side. OS customization needs two things to succeed: VMware Tools in the template, and a vSphere service account with the Virtual machine.Provisioning.Customize privilege. Neither one jumps out from a 600-line multi-module configuration; both are easy to check against a 50-line MRE.

This is still an infrastructure MRE: you'll need a reachable vSphere environment, inventory names that match the data sources, and provider credentials you're supplying on purpose. The point isn't to eliminate the lab; it's to strip unrelated Terraform so the failure and the failing resource stay in the same view.

Terraform MREs: Inline Everything

Replace -var-file references, remote state backends, and module calls with inline variable blocks and hardcoded values. Remove every resource that isn't the one failing. You want a configuration you can run with terraform init && terraform apply in a clean directory against a fresh state file.

PowerShell: The Filter That Matches Nothing

The Scenario: Your PowerCLI script filters VMs by name prefix and takes action on the matches. It's got audit logging, email notifications, error handling, and the polite disconnect at the end. The Where-Object filter keeps returning zero VMs when you'd expect the web- prefix matches.


param (
    [string]$VIServer = "vcenter.example.com",
    [string]$Cluster  = "Production",
    [string]$Prefix   = "web-"
)

Import-Module VMware.PowerCLI

function Write-AuditLog {
    param([string]$Message)
    Add-Content -Path "C:\audit\powercli.log" -Value "$(Get-Date) $Message"
}

Connect-VIServer -Server $VIServer -Credential (Get-Credential)

$cluster = Get-Cluster -Name $Cluster
$vms     = Get-VM -Location $cluster | Where-Object { $_.Name -like $Prefix }

foreach ($vm in $vms) {
    Write-AuditLog "Processing: $($vm.Name)"
    # ... 60 more lines ...
}

Disconnect-VIServer -Server $VIServer -Confirm:$false

Zero VMs. The audit log stays empty. Is Get-VM returning anything? Wrong cluster reference? A scope issue with $Prefix? Once VMware's in the story, you're stuck ruling out connectivity and permissions before you even reach the filter expression, and those have nothing to do with the actual bug.

$names  = @("web-01", "web-02", "app-01", "db-01")
$prefix = "web-"

$matched = $names | Where-Object { $_ -like $prefix }
Write-Output "Matched count: $($matched.Count)"

Output:

Matched count: 0

No vCenter, no credentials, no cluster lookup. The broken filter behavior shows up in five lines.

-like in PowerShell only does wildcard matching when you supply the wildcard. The pattern "web-" matches the literal string "web-" with nothing following it. No VM name is exactly that string, so Where-Object isn't wrong to filter everything out.

You can test the comparison directly:

"web-01" -like "web-"   # False: matches the literal string only
"web-01" -like "web-*"  # True: * matches any suffix

The Fix: Append * to the pattern: $_.Name -like "$Prefix*".
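
The literal-versus-wildcard distinction is easy to probe in Python too: the standard library's fnmatch module implements shell-style patterns comparable to -like, and behaves the same way here:

```python
from fnmatch import fnmatch

# "web-" contains no wildcard, so it matches only the literal string "web-".
print(fnmatch("web-01", "web-"))   # False
print(fnmatch("web-01", "web-*"))  # True: * matches any suffix

names = ["web-01", "web-02", "app-01", "db-01"]
print([n for n in names if fnmatch(n, "web-*")])  # ['web-01', 'web-02']
```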

Ansible: The Fallback That Never Fires

The Scenario: Your playbook uses Ansible's default filter to fall back to a safe value when a configuration variable isn't explicitly set. Locally, the fallback behaves. In staging, it doesn't: the variable's defined as an empty string in the inventory, and default quietly ignores the fallback value.

# site.yml
- name: Deploy application
  hosts: app_servers
  vars_files:
    - group_vars/all.yml
    - group_vars/staging.yml
  roles:
    - role: common
    - role: app_config
    - role: app_deploy
    - role: monitoring

# group_vars/staging.yml
app_env: ""

# roles/app_config/tasks/main.yml
- name: Show resolved environment
  ansible.builtin.debug:
    msg: "Deploying to: {{ app_env | default('production') }}"

The debug output shows "Deploying to: " in staging instead of "Deploying to: production". With roles, var files, and templates all in the mix, it isn't obvious which layer owns the variable or why the fallback never fires.

- name: MRE for default filter with empty string
  hosts: localhost
  gather_facts: false
  vars:
    app_env: ""
  tasks:
    - name: Show environment with default filter
      ansible.builtin.debug:
        msg: "Deploying to: {{ app_env | default('production') }}"

Run with:

ansible-playbook mre.yml -i localhost,

Output:

TASK [Show environment with default filter] ***
ok: [localhost] => {
    "msg": "Deploying to: "
}

The fallback still isn't applied. You don't need roles, var files, or inventory to see it.

In Ansible's Jinja2, default only applies when the variable is undefined. An empty string ("") still counts as defined: the variable exists, and its value is the empty string. default doesn't treat "empty" as a reason to fall back.

The Fix: Pass true as the second argument to default. This enables "default on falsy", which treats empty strings, zero, and None the same as undefined.

msg: "Deploying to: {{ app_env | default('production', true) }}"

Output:

TASK [Show environment with default filter] ***
ok: [localhost] => {
    "msg": "Deploying to: production"
}
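
Stripped of Jinja2 entirely, the two modes of the filter can be sketched in plain Python. This mirrors the documented semantics rather than Jinja2's actual implementation, and the _UNDEFINED sentinel is an invented stand-in for Jinja2's Undefined type:

```python
_UNDEFINED = object()  # stand-in for Jinja2's Undefined

def default(value, fallback, boolean=False):
    if value is _UNDEFINED:        # plain default(): only undefined falls back
        return fallback
    if boolean and not value:      # default(..., true): falsy values fall back too
        return fallback
    return value

print(default(_UNDEFINED, "production"))         # production
print(default("", "production"))                 # prints a blank line: "" wins
print(default("", "production", boolean=True))   # production
```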

hosts: localhost is Your Friend

Setting hosts: localhost and gather_facts: false makes any Ansible MRE easy to run without an inventory file or remote SSH. Use ansible-playbook mre.yml -i localhost, (the trailing comma makes a bare host list) to test variable behavior, filter logic, and task sequencing entirely on your laptop.

What to Include Alongside the Code

A runnable example without context is still an incomplete MRE. The code answers "what happens"; you also need to supply "what I expected to happen" and the environment in which it happened.

Include the following alongside any MRE:

  • Expected behavior: defines success; "it doesn't work" isn't a problem statement
  • Observed behavior: the actual output or error, verbatim
  • Tool and dependency versions: behavior differences between versions are a common root cause
  • OS and runtime: some bugs are platform-specific
  • The exact command to run the example: remove any reason to guess

Version Reference by Tool

  • Go: go version output and go.mod content
  • Python: python --version and pip show <package>
  • Terraform: terraform version output and the required_providers block
  • PowerShell / PowerCLI: $PSVersionTable and Get-Module VMware.PowerCLI -ListAvailable
  • Ansible: ansible --version

Common Mistakes

Keeping Production Dependencies in the Example

A database that requires internal credentials, a secrets manager that needs an IAM role, an API endpoint only reachable inside the VPN: any of these makes the example non-reproducible for everyone except you. Replace each external dependency with the smallest local stand-in that triggers the same behavior.

Reporting the Side Effect Instead of the Root Failure

A stack trace is a symptom, not necessarily the root cause. A function that silently returns the wrong value, a resource created with the wrong configuration, a template that renders to an unexpected string: failures like these never throw, yet they can be just as informative as an explicit error. Make sure your MRE demonstrates the actual behavior that surprised you, not a downstream artifact of it.

Stopping Too Early

"I simplified the code and the bug is still there" isn't a complete MRE when you still have 300 lines. Keep stripping. If the example genuinely can't be reduced below a certain threshold, say so and explain why. That constraint is itself diagnostic.

Not Running the Example Before Sharing

Run your MRE in a clean environment before you send it. You'll occasionally discover that it either doesn't reproduce the issue or has a missing dependency. Find this out before someone else does.

After you've built a few MREs, you notice a rhythm: you sit down to ask for help, and somewhere around the third deletion the story tells itself. Strip a dependency and the bug vanishes. Pin a version and the behavior shifts. Swap a library call for something tiny and obvious, and you suddenly see what the abstraction was hiding.

The MRE habit nudges you into real understanding before you outsource the thinking. Most bugs don't do well under that kind of light.

When an MRE doesn't crack the case alone, it still shrinks the gap between "I've got a bug" and "someone else gets my bug" from hours to minutes.

That's the quiet gift: you've already mapped the boundary so everyone can spend time fixing, not doing archaeology.