
Field note · May 22, 2026 · reliability

How we caught our own scanner lying

A scanner that sells trust has exactly one job: when it points at a line and says this, the line has to actually say that. If it ever doesn't, the demo dies, the conversion dies, and the founder reading the report stops trusting anything else we showed them.

On April 23 our scanner reported plan_tool.py:8 in a public Hugging Face repo as an agent chokepoint. Line 8 was a comment. The same scan reported main.py:813 as RCE-equivalent shell exec. Line 813 was warnings.filterwarnings("ignore", …). Two flagrant false positives sampled at random, both in the same live scan.

The scanner doesn't use an LLM. It's 100% regex + Python AST. So neither false positive was a hallucination in the model sense, but the effect on the visitor reading the report was identical: confident output, plausible framing, technically false. That is unacceptable for a tool whose pitch is "we catch what your tests can't."

Here are the two real bugs, the five layers we shipped so this class of failure can't recur silently, and what shipped to main across four commits over the next 27 hours.


The two bugs

Bug 1 · AST line numbers on decorated functions

The decorator-orchestrator detector walks the Python AST and emits a finding at node.lineno. On a decorated function in CPython 3.8 and later, node.lineno is the line of the def, not the line of the decorator above it. The author wanted the decorator's line, so the finding pointed somewhere upstream of where the actual handler symbol lived. Sometimes that landed on imports, sometimes on docstrings, sometimes on comments.

# What the scanner emitted
plan_tool.py:8  AGENT CHOKEPOINT  Controller.handle() / Dispatcher.dispatch()

# What plan_tool.py:8 actually was
8:  # In-memory storage for the current plan
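
To make the line-number distinction concrete, here is a minimal sketch using only the standard ast module. The sample source and variable names are illustrative and not taken from the scanner or the scanned repo. It assumes Python 3.8 or later, where a decorated function's lineno is the def line and each decorator expression carries its own lineno.

```python
import ast

# Illustrative source, not from the scanned repo.
SOURCE = '''\
# In-memory storage for the current plan
@tool
def handle(request):
    return request
'''

tree = ast.parse(SOURCE)
func = next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))

def_line = func.lineno                          # line of the `def` keyword (Python 3.8+)
decorator_line = func.decorator_list[0].lineno  # line of the `tool` decorator expression

lines = SOURCE.splitlines()
print(def_line, repr(lines[def_line - 1]))
print(decorator_line, repr(lines[decorator_line - 1]))
```

Reading the line back from the source, as the two print calls do, is the cheapest way to see which line a detector is really pointing at.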

Bug 2 · optional alternation matching method names in prose

The agent-orchestrator detector used a regex with an optional alternation: (?:def|function)?\s*(plan|execute|dispatch|handle). The trailing ? made the keyword group optional, so the regex matched a method name anywhere it appeared, including comments, docstrings, and call sites. Any file with the word plan or execute in a code comment got flagged as an orchestrator.

# What the scanner emitted
session.py:142  AGENT CHOKEPOINT  orchestrator method 'execute'

# What session.py:142 actually was
142:  # We execute the steps in order, with retry on the inner loop.
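
The leak is easy to reproduce with the standard re module. In the sketch below, LOOSE is the pattern quoted above; STRICT is a hypothetical tightening written for this example, not the fix that shipped.

```python
import re

# The pattern quoted above: the keyword group is optional.
LOOSE = re.compile(r"(?:def|function)?\s*(plan|execute|dispatch|handle)")

# Hypothetical tightening: keyword required and anchored to the start of the line.
STRICT = re.compile(
    r"^\s*(?:async\s+)?(?:def|function)\s+(plan|execute|dispatch|handle)\b"
)

comment = "# We execute the steps in order, with retry on the inner loop."
definition = "    def execute(self, steps):"

loose_on_comment = LOOSE.search(comment)          # matches the word in prose
strict_on_comment = STRICT.search(comment)        # no keyword, so no match
strict_on_definition = STRICT.search(definition)  # a real definition still matches

print(loose_on_comment.group(1))
print(strict_on_comment)
print(strict_on_definition.group(1))
```

A stricter pattern still cannot tell a real definition from one quoted inside a multi-line string, which is why the regex alone was not treated as the fix.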

Two bugs, both in the same class: the scanner's claim about a line did not match the bytes on that line. We could have patched the two regexes and shipped. We didn't. The bigger question was: how do we keep this class of bug from sneaking back in next time someone adds a detector?


The five layers

Each layer catches a different failure mode. The combination makes a silent regression cost more than just shipping the right code in the first place.

🔒 Layer 1 · adversarial trap fixtures

A test fixture directory whose entire purpose is to look like trouble at the surface and be inert on inspection. Comments that say # subprocess.Popen. F-strings that interpolate the literal word execute. Docstrings that name a chokepoint we don't actually have. The whole detector suite has to assert zero findings across these files. Any new detector that lights up here is a regex-leaks-into-comments bug before it ever hits production.

tests/fixtures/adversarial_trap/
  no_shell_exec.py        # mentions subprocess.Popen in a docstring
  no_orchestrator.py      # 'execute' and 'plan' in comments only
  no_payment_call.py      # 'stripe.refunds.create' as a string literal
  README.md               # explains the trap to the next developer

def test_traps_emit_zero_findings():
    for path in TRAP_FIXTURES:
        findings = scan(path)
        assert findings == [], (
            f"{path} fired {len(findings)} findings — "
            "regex is leaking into comments or strings"
        )

🔒 Layer 2 · runtime self-check

The scanner walks files and emits findings. Before any finding is returned, the self-check re-opens the file at the reported line and asserts the snippet the finding claims is actually on that line. If it isn't, the finding is dropped and a counter ticks. This is the layer that would have caught both false positives on the first scan of ml-intern. Take main.py:813: the line said warnings.filterwarnings, the finding said subprocess.Popen, the assertion would have failed, and the finding would never have shipped.

# packages/supervisor-discover/scanners/__init__.py

def _self_check(finding: Finding, source: str) -> bool:
    """Return True iff the reported snippet appears on the reported line."""
    lines = source.splitlines()
    if finding.line < 1 or finding.line > len(lines):
        return False
    actual = lines[finding.line - 1]
    needle = finding.snippet.strip()
    # An empty snippet would match any line, so treat it as a failed check.
    return bool(needle) and needle in actual

def emit(detector, source, raw_findings):
    kept = []
    for f in raw_findings:
        if _self_check(f, source):
            kept.append(f)
        else:
            METRICS.dropped_by_self_check += 1
            log.warning("self_check_drop", detector=detector, line=f.line)
    return kept

Twenty lines. It assumes every detector emits a snippet alongside the line number, which we enforce at the type level. It is the single most important layer.
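
For readers who want to run the idea in isolation, here is a self-contained sketch of the same check. The Finding dataclass is hypothetical, since the real type is not shown in this note, and the __post_init__ validation is only one possible way to guarantee a snippet is present. The source string is a simplified stand-in for the main.py example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding type; the real one is not shown in this note."""
    path: str
    line: int     # 1-indexed line number
    snippet: str  # text the detector claims is on that line

    def __post_init__(self) -> None:
        # Reject empty snippets up front so the substring check is never vacuous.
        if not self.snippet.strip():
            raise ValueError("a finding must carry a non-empty snippet")


def self_check(finding: Finding, source: str) -> bool:
    """Return True iff the claimed snippet appears on the claimed line."""
    lines = source.splitlines()
    if not 1 <= finding.line <= len(lines):
        return False
    return finding.snippet.strip() in lines[finding.line - 1]


# Simplified stand-in for the real file contents.
source = 'import warnings\nwarnings.filterwarnings("ignore")\n'

wrong = Finding("main.py", 2, "subprocess.Popen")
right = Finding("main.py", 2, "warnings.filterwarnings")

print(self_check(wrong, source))
print(self_check(right, source))
```

The check is deliberately dumb: it knows nothing about detectors, only whether the claimed text is on the claimed line.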

🔒 Layer 3 · AST-first for Python

Regex on raw text leaks into comments, docstrings, and f-string contents. The fix isn't cleverer regex; it's walking the AST and asserting that the match is on a real expression node. Comments never reach the AST at all, and docstrings and other string literals show up as Constant(str) nodes that the detector can skip. The orchestrator detector now resolves a dotted name through ast.walk and a helper that knows Controller.handle is the same symbol as self.handle on a method of Controller. It can no longer match the word in a comment.
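
A simplified stand-in for that approach, assuming only the standard ast module: it reports methods defined inside a class whose names are in the orchestrator set, and ignores the same words in comments and strings. The function name and sample source are invented for this example, and the sketch does not attempt the self.handle resolution the real helper performs.

```python
import ast

# Method names taken from the regex shown earlier in this note.
ORCHESTRATOR_METHODS = {"plan", "execute", "dispatch", "handle"}

SOURCE = '''\
# We execute the steps in order, with retry on the inner loop.
NOTE = "call handle() to dispatch"

class Controller:
    def handle(self, request):
        """Docstring that mentions execute and plan."""
        return request

def helper():
    return None
'''


def find_orchestrator_methods(source: str) -> list:
    """Return (qualified name, def line) for matching methods defined in classes."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for item in node.body:
            is_func = isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            if is_func and item.name in ORCHESTRATOR_METHODS:
                found.append((f"{node.name}.{item.name}", item.lineno))
    return found


results = find_orchestrator_methods(SOURCE)
print(results)
```

The comment on line 1, the string on line 2, and the docstring all contain trigger words, yet only the real method definition is reported.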

🔒 Layer 4 · golden-repo snapshots

A handful of real-world repositories — pinned by commit SHA — are scanned in CI. The output is committed to tests/golden_repos/ and diffed on every PR. If a detector change adds or removes findings on a golden repo, the test fails loudly. Intentional changes update the snapshot with a one-line command; accidental drift gets caught at review.

tests/golden_repos/
  ml-intern@a3f9c1.findings.json       # 47 findings — pinned
  fastapi-quickstart@b2e480.findings.json
  langchain-tutorial@de1109.findings.json

# Updating intentionally
$ pytest tests/golden_repos/ --update-snapshots
# Reviewer sees the diff in the PR alongside the detector change
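
The comparison itself can be very small. The sketch below assumes a snapshot is a JSON list of findings with path, line, and rule fields; those field names, the helper names, and the sample data are all invented for illustration and may not match the real snapshot format.

```python
import json


def finding_key(finding: dict) -> tuple:
    """Identity of a finding for diffing; the field names are assumptions."""
    return (finding["path"], finding["line"], finding["rule"])


def diff_snapshot(golden_json: str, current: list) -> dict:
    """Compare a committed snapshot with fresh scan output."""
    golden = {finding_key(f) for f in json.loads(golden_json)}
    fresh = {finding_key(f) for f in current}
    return {
        "added": sorted(fresh - golden),
        "removed": sorted(golden - fresh),
    }


# Invented sample data, not real scan output.
golden_json = json.dumps([
    {"path": "plan_tool.py", "line": 21, "rule": "agent-chokepoint"},
    {"path": "main.py", "line": 40, "rule": "shell-exec"},
])
current = [
    {"path": "plan_tool.py", "line": 21, "rule": "agent-chokepoint"},
    {"path": "session.py", "line": 142, "rule": "agent-chokepoint"},
]

drift = diff_snapshot(golden_json, current)
print(drift)
```

A CI test would then assert that both lists are empty and print the drift when they are not, so the reviewer sees exactly which findings appeared or vanished.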

🔒 Layer 5 · confidence gate on the public UI

The first four layers protect the scanner's correctness in CI. The fifth protects what an anonymous visitor sees on the landing page. Findings carry a confidence tier (high, medium, or low) derived from how many independent signals fired and whether the AST resolution was unambiguous. The public scan only renders high-confidence findings. Medium and low sit behind a sign-in. If a future detector regresses to a noisier baseline, the noise stays in the dashboard, not on the homepage.
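
One way such a gate could look, as a sketch only: the ScoredFinding shape, the field names, and the thresholds below are assumptions chosen to make the example concrete, not the scanner's actual scoring rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredFinding:
    """Hypothetical shape; field names are illustrative."""
    rule: str
    independent_signals: int  # how many separate signals fired
    ast_unambiguous: bool     # whether AST resolution was unambiguous


def confidence_tier(f: ScoredFinding) -> str:
    # Assumed thresholds, chosen only to make the example concrete.
    if f.ast_unambiguous and f.independent_signals >= 2:
        return "high"
    if f.ast_unambiguous or f.independent_signals >= 2:
        return "medium"
    return "low"


def public_view(findings: list) -> list:
    """Only high-confidence findings reach the anonymous landing page."""
    return [f for f in findings if confidence_tier(f) == "high"]


findings = [
    ScoredFinding("shell-exec", independent_signals=3, ast_unambiguous=True),
    ScoredFinding("agent-chokepoint", independent_signals=1, ast_unambiguous=True),
    ScoredFinding("payment-call", independent_signals=1, ast_unambiguous=False),
]

tiers = [confidence_tier(f) for f in findings]
visible = [f.rule for f in public_view(findings)]
print(tiers, visible)
```

Keeping the gate as a filter on the rendering path, rather than inside each detector, means a noisy new detector changes what the dashboard shows without changing what the homepage shows.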


What this means for a repo you scan today

A re-scan of the same Hugging Face repo went from 55 findings to 47: eight false positives eliminated, with the remaining 47 cross-checked line by line against the real source. Every finding the scanner emits has now been validated against the bytes on the reported line before it leaves the process. The decorator detector resolves through the AST. New detectors run against the trap fixtures the first time they're imported. Golden snapshots fail CI before a regression ships.

The supervisor we're asking you to put in front of your agent had to clear a bar first: it had to be more reliable than the agent. These five layers are how we got there, and the tests stay green at 94/94 on every commit to main.

What I'm not worried about

The next class of bug. There will be one — every detector has a blind spot until someone scans a repo that exercises it. The layers don't prevent the next bug; they make sure the next bug fails noisily in CI instead of quietly in a customer demo. That's the only contract a scanner can honestly offer.

