Field note · May 26, 2026 · data privacy
Your agent just emailed the wrong customer's invoice
A B2B SaaS we know runs a billing agent that emails monthly invoices. On a Tuesday morning, eleven customers received an invoice belonging to a different customer. Different company names. Different amounts. Different line items. The agent didn't do anything dramatic — it ran the same template it had been running for six months. The change was three lines deeper in the stack, in a query the LLM had quietly refactored.
The fix isn't at the email layer. By the time the agent has a row in hand, the privacy bug has already happened. The fix is one wrap at the data layer.
The query the agent shipped
The billing agent reads outstanding invoices and emails one per recipient. The original query was tenant-scoped:
# before
def outstanding_invoices(tenant_id: str) -> list[Invoice]:
    return (
        db.query(Invoice)
        .filter(Invoice.tenant_id == tenant_id)
        .filter(Invoice.status == "outstanding")
        .all()
    )

Then the engineer asked Cursor to add a sort by due date and a limit. The model refactored the function, and the new version came back without the tenant_id filter:
# after — what shipped
def outstanding_invoices(tenant_id: str) -> list[Invoice]:
    return (
        db.query(Invoice)
        .filter(Invoice.status == "outstanding")
        .order_by(Invoice.due_date.desc())
        .limit(50)
        .all()
    )
# tenant_id parameter still there. WHERE clause gone.

The function signature still demanded a tenant_id, so every caller still passed one. Code review didn't flag it because the call sites looked unchanged. Tests passed because the seeded test DB only had one tenant. Production has a hundred.
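The single-tenant seed is the reason the tests stayed green, and a small sketch makes that concrete. It uses the standard-library sqlite3 module rather than the ORM from the post, and the table layout, helper names, and due dates are assumptions made for illustration only.

```python
import sqlite3


def make_db(rows):
    """Build an in-memory invoice table seeded with (tenant_id, status, due_date) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoice ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT, status TEXT, due_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO invoice (tenant_id, status, due_date) VALUES (?, ?, ?)", rows
    )
    return conn


def outstanding_invoices_unscoped(conn, tenant_id):
    # Mirrors the shipped version: tenant_id is accepted but never used.
    return conn.execute(
        "SELECT id, tenant_id FROM invoice WHERE status = 'outstanding' "
        "ORDER BY due_date DESC LIMIT 50"
    ).fetchall()


def outstanding_invoices_scoped(conn, tenant_id):
    # Mirrors the original version: the tenant filter is part of the query.
    return conn.execute(
        "SELECT id, tenant_id FROM invoice "
        "WHERE tenant_id = ? AND status = 'outstanding' "
        "ORDER BY due_date DESC LIMIT 50",
        (tenant_id,),
    ).fetchall()


def leaks_other_tenants(query, conn, tenant_id):
    """True if the query returns any row owned by a different tenant."""
    return any(row[1] != tenant_id for row in query(conn, tenant_id))


# One tenant in the seed: the unscoped query cannot be told apart from the scoped one.
single = make_db([("t_842", "outstanding", "2026-05-01")])

# Two tenants in the seed: the missing filter becomes visible.
multi = make_db([
    ("t_842", "outstanding", "2026-05-01"),
    ("t_111", "outstanding", "2026-05-02"),
])
```

Seeding at least two tenants and asserting that every returned row belongs to the caller is the cheapest regression test for this class of bug. It complements a data-layer guard and does not replace one.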
Why the email-layer guard wouldn't have helped
The instinct after an incident like this is to put a check right before send_invoice_email — verify the recipient's tenant matches the invoice's tenant. That guard is fine. It doesn't solve the problem.
By the time you're at the email step, the cross-tenant read has already happened. The agent has the wrong row in memory. Now it's in the LLM's context. Now it's in whatever logging service captures function arguments. Now it might end up in a customer-support chat transcript when a human asks the agent what it did this morning. The bug fans out.
The right gate sits where the data leaves the database. If a row crosses a tenant boundary on read, the read itself is the policy violation.
The wrap
Vibefixing's data_access action type intercepts four signals: dataset, columns, actor, purpose. The wrap lets the policy refuse a read where the actor doesn't belong to the row's tenant.
from supervisor_guards import supervised
from supervisor_guards.signals import data_access


@supervised(
    "data_access",
    signals=lambda tenant_id, **_: data_access(
        dataset="invoice",
        columns=["id", "tenant_id", "amount", "recipient_email"],
        actor={"tenant_id": tenant_id, "role": "billing-agent"},
        purpose="monthly_billing_run",
    ),
)
def outstanding_invoices(tenant_id: str) -> list[Invoice]:
    rows = (
        db.query(Invoice)
        .filter(Invoice.status == "outstanding")
        .order_by(Invoice.due_date.desc())
        .limit(50)
        .all()
    )
    # the supervisor refuses the read if any row's tenant_id
    # differs from actor.tenant_id; the missing WHERE clause
    # surfaces as a denial, not as a leaked invoice.
    return rows

The policy file that backs the wrap is plain YAML:
# packages/policies/data_access.tenant_isolation.v1.yaml
applies_to: action_type=data_access, dataset=invoice
require_evidence:
  - field: rows[*].tenant_id
    must_equal: actor.tenant_id
on_violation:
  decision: deny
  reason: "cross-tenant read attempted by billing-agent"
  emit_threat: { detector_id: "cross-tenant-read", level: "critical" }

When the LLM ships a query that loses the tenant_id filter, the policy denies the read. The agent gets back an empty list and a decision the operations team can replay. The eleven customers never see the wrong invoice.
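To make the deny path concrete, here is a minimal sketch of the check the policy describes: compare every row's tenant_id to the actor's, and on a mismatch hand back an empty list plus a decision record. This is not Vibefixing's implementation. The Decision class and the evaluate_tenant_isolation function are hypothetical names, and rows are modeled as plain dicts.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    """Hypothetical record of what the check decided and why."""
    decision: str                      # "allow" or "deny"
    reason: str = ""
    offending_tenants: set = field(default_factory=set)


def evaluate_tenant_isolation(rows, actor):
    """Return (rows_to_release, decision) for a read performed by `actor`.

    rows:  list of dicts, each carrying a "tenant_id" key
    actor: dict with "tenant_id" and "role" keys
    """
    offending = {
        row["tenant_id"] for row in rows if row["tenant_id"] != actor["tenant_id"]
    }
    if offending:
        # Deny the whole read: a partial result would hide the bug.
        reason = "cross-tenant read attempted by " + actor["role"]
        return [], Decision("deny", reason, offending)
    return rows, Decision("allow")


actor = {"tenant_id": "t_842", "role": "billing-agent"}
mixed_rows = [
    {"id": 1, "tenant_id": "t_842"},
    {"id": 2, "tenant_id": "t_111"},
]
released, decision = evaluate_tenant_isolation(mixed_rows, actor)
```

Denying the whole read, rather than filtering out the foreign rows, is a deliberate choice in this sketch: silently filtering would mask the missing WHERE clause instead of surfacing it.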
What shadow mode would have shown the team the morning before
With the wrap installed and SUPERVISOR_ENFORCEMENT_MODE=shadow set, nothing is blocked, but every call that would have been blocked writes a decision to the supervisor's evidence chain. The morning digest the day before the incident:
Subject: 4 actions would have been blocked yesterday — vibefixing
1. CROSS_TENANT_READ × 4 calls
outstanding_invoices(tenant_id=t_842) returned rows for
tenants {t_111, t_842, t_904}. The function lost its tenant
WHERE clause in commit 1f4c2a.
Open replay →
Flip to enforce when you trust shadow → vibefixing.me/dashboard

The fix is in source control. The four caught reads in shadow mode are the warning before the eleven uncaught reads in prod.
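The difference between shadow and enforce can be sketched in a few lines. Only the SUPERVISOR_ENFORCEMENT_MODE variable and its shadow value come from the setup described above. The apply_mode function, the in-memory log, and the default of "enforce" when the variable is unset are assumptions, so treat this as an illustration of the idea and not the supervisor's actual behavior.

```python
import os


def apply_mode(rows, actor_tenant_id, log, mode=None):
    """Release or withhold rows depending on the enforcement mode.

    rows: list of dicts with a "tenant_id" key
    log:  list that collects one entry per violation (stand-in for an evidence chain)
    mode: "shadow" or "enforce"; falls back to the environment variable
    """
    if mode is None:
        # Assumed default: enforce when the variable is not set.
        mode = os.environ.get("SUPERVISOR_ENFORCEMENT_MODE", "enforce")

    foreign = sorted({row["tenant_id"] for row in rows} - {actor_tenant_id})
    if not foreign:
        return rows  # nothing crossed a tenant boundary

    # Both modes record the violation; only enforce withholds the rows.
    log.append({"detector_id": "cross-tenant-read", "mode": mode, "tenants": foreign})
    if mode == "shadow":
        return rows
    return []


rows = [
    {"id": 1, "tenant_id": "t_842"},
    {"id": 2, "tenant_id": "t_111"},
    {"id": 3, "tenant_id": "t_904"},
]
shadow_log, enforce_log = [], []
shadow_result = apply_mode(rows, "t_842", shadow_log, mode="shadow")
enforce_result = apply_mode(rows, "t_842", enforce_log, mode="enforce")
```

In shadow mode the caller sees exactly what it saw before, so the rollout carries no behavioral risk, while the log accumulates the entries a digest like the one above would summarize.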
What I'm not worried about
Your DB is fine. Your ORM is fine. The wrap doesn't change the query, doesn't add indexes, doesn't need a migration. It reads what the function read, compares it to the actor's scope, and refuses the inconsistent case. Under fifteen minutes to install on the billing path, plus the parallel paths the scanner flags for you when you point it at the repo.
related
The data-privacy section of the risk hub.
Same failure pattern, framed for first-time visitors and tied to the live data_access action type in the registry.