What is Vibefixing and what does it scan?

Vibefixing is a runtime supervisor and security scanner for AI agents. It statically analyzes your codebase to find the unsafe tool calls your AI agent can execute, before they reach production: unguarded Stripe charges, raw database mutations, filesystem writes, and shell commands.

How does Vibefixing prevent prompt injection in AI agents?

Vibefixing identifies tool calls that lack input validation or guardrails, which are the primary attack surface for prompt injection. By flagging every path where an LLM output can trigger an irreversible action without a human confirmation step, it shows you where to add the guardrails that stop an injection from turning into damage.

Is Vibefixing safe to use for vibe coders shipping with AI-generated code?

Yes. Vibefixing is designed for vibe coders: developers shipping fast with AI assistants like Claude, Cursor, or Copilot. It catches risky patterns LLMs commonly generate: unguarded API calls, database deletes without confirmation, and credential exposure in tool call arguments.

Does Vibefixing work with any AI agent framework?

Vibefixing analyzes your repository at the code level, so it works with any agent framework: LangChain, LlamaIndex, CrewAI, custom OpenAI function-calling, Anthropic tool use, or plain Python scripts. No instrumentation or runtime hooks required.

I'm not building an AI agent — does Vibefixing still help?

Yes. The same patterns an agent can misuse — Stripe checkout sessions, account mutations, webhook handlers that re-deliver and double-charge — are what get exploited by prompt injection, a careless contractor, or a runaway script. Vibefixing flags them whether or not an LLM is in the call chain. Supported stacks include Next.js App Router (route handlers and Server Actions), Stripe (TypeScript and Python), Supabase (service-role writes and row-level security checks on migrations), Prisma, and SQLAlchemy.

What unsafe actions does Vibefixing detect?

Vibefixing detects: Stripe charges, refunds, subscription changes, and customer portal sessions (TypeScript and Python); raw SQL mutations and Supabase writes that bypass row-level security; Next.js Server Actions and API route handlers; webhook handlers that replay database writes on retry; filesystem deletes; shell execution; emails sent without per-call confirmation. Each finding includes the file, line, and a one-line wrap.
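
To illustrate the webhook case in that list, here is a minimal sketch of a handler that applies its write once per delivery, assuming each event carries a stable `id`. The names `processed_event_ids` and `handle_webhook` are hypothetical, and this is not the guard code Vibefixing generates:

```python
from typing import Callable

# Stand-in for a table with a unique constraint on the event id.
processed_event_ids: set[str] = set()


def handle_webhook(event: dict, apply_write: Callable[[dict], None]) -> bool:
    """Apply the write once per event id; return False on a redelivery."""
    event_id = event["id"]
    if event_id in processed_event_ids:
        return False  # retry of an event we already handled: skip the write
    apply_write(event)
    processed_event_ids.add(event_id)
    return True
```

In a real service the duplicate check and the write would need to be atomic, for example a unique-key insert in the same transaction, rather than an in-memory set.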

How long does a Vibefixing scan take?

A public repository scan typically completes in under 60 seconds. Vibefixing uses static analysis and does not run your code or require an API key for the scan. Results include a risk score, a list of unsafe actions, and copy-paste guardrail code for each finding.

Does Vibefixing scan every pull request automatically?

Yes. Install the Vibefixing GitHub App on your repo and every pull request gets scanned automatically. Within 5 seconds of a PR being opened, Vibefixing diffs the head ref against your previous scan and posts a comment listing only the new unsafe call-sites. Clean PRs get no comment, so there is no spam. Free for public repos; private repos and CI integration are part of Builder ($29/mo), while team workflows, SSO, and org-level controls start at Pro ($99/workspace/mo).
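
The "only the new call-sites" behavior can be pictured as a set difference between two scans. This is a rough sketch under assumed names (`Finding`, `new_findings`), not Vibefixing's actual diff algorithm. It keys findings on the enclosing function rather than the line number, so that edits higher up in a file do not make old findings look new:

```python
from typing import NamedTuple


class Finding(NamedTuple):
    """Hypothetical identity of one unsafe call-site."""
    file: str
    detector: str
    symbol: str  # enclosing function name, more stable than a line number


def new_findings(previous: list[Finding], current: list[Finding]) -> list[Finding]:
    """Return findings present in the current scan but not the previous one."""
    seen = set(previous)
    return [finding for finding in current if finding not in seen]
```

An empty result corresponds to the clean-PR case, where no comment is posted.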

How is Vibefixing different from regular code review or static analysis?

Code review and SAST tools catch bugs in code your tests already cover. Vibefixing catches what your tests cannot: actions an LLM can fire at runtime in ways no test case anticipated. Refunds the model decides to issue, files it decides to delete, emails it decides to send. Each detector maps to a runtime guard you can drop in with one line.

How is Vibefixing different from Snyk or Semgrep for AI agents?

Snyk and Semgrep are built for human-written code vulnerabilities: CVEs, dependency audits, known patterns. Vibefixing is built specifically for LLM-driven execution paths — the tool calls an AI agent fires at runtime based on user input. It understands agent shapes (tool-using, chatbot, RAG, MCP-server) and surfaces risk framed around what the model can do, not what a human wrote incorrectly.

Does Vibefixing replace runtime monitoring tools like LangSmith?

No — they are complementary. LangSmith and similar tools observe what your agent does at runtime after deployment. Vibefixing runs before deployment, on your static codebase, to catch unsafe tool call patterns before they ever reach production. Use Vibefixing in CI to prevent risky code from shipping, and runtime monitoring to observe behavior after it ships.


Field note · May 26, 2026 · explainability

Why did the agent do that?

A customer emails support: "Three weeks ago your bot refused to refund my October order. My card was charged twice. I want to know why you said no." Engineering opens the ticket. The application log shows the agent took an action. It does not show what the agent saw, which policy was live, or why the rule that fired did so. The answer to the customer's question is a best guess.

That is the explainability gap. It is not a property of the model. It is a logging discipline you applied — or didn't — at the boundary where the agent did something irreversible.

The three things that need to live in one record

A decision the agent made three weeks ago is reconstructable if and only if you can answer three questions from a single log line:

What did the agent see? The fingerprint of the input that reached the tool call. Not the entire payload — that often contains PII you don't want in an audit log forever — but enough to identify the case.
Which policy decided? The version of the policy file at the moment the decision was made. Not the current version; the historical one. A fix you shipped this week must not silently rewrite last month's history.
Why? The list of reasons that resolved to a denial or an approval, in human-readable strings, derived from the policy that ran.

Three fields. If any one is missing, the reconstruction is a best guess.
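
The three questions map onto three fields, which can be sketched as a small record type with a completeness check. The class and function names here are hypothetical; only the field names are borrowed from the evidence event this note describes:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    """Hypothetical minimal record, one field per question."""
    input_fingerprint: str         # what the agent saw (a digest, not the payload)
    policy_version: str            # which policy decided, pinned at decision time
    reasons: tuple[str, ...] = ()  # why, as human-readable strings


def missing_fields(record: EvidenceRecord) -> list[str]:
    """Name the empty fields that would force a best-guess reconstruction."""
    missing = []
    if not record.input_fingerprint:
        missing.append("input_fingerprint")
    if not record.policy_version:
        missing.append("policy_version")
    if not record.reasons:
        missing.append("reasons")
    return missing
```

A check like this could run at write time, so an incomplete record is caught when the decision is made rather than three weeks later.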

What the evidence event looks like

Each supervised action emits an evidence event the moment the decision is made. Below is the actual shape the supervisor writes, inlined from a denied refund:

{
  "event_id": "ev_2026-04-09T14:22:11Z_4f2a",
  "action_type": "refund",
  "input_fingerprint": "sha256:8c2e…",   // not the raw payload
  "decision": "deny",
  "risk_score": 0.82,
  "reasons": [
    "refund_velocity_24h > 3 (saw 5)",
    "customer_age_days < 30 (saw 11)",
    "amount > 500 (saw 729.00)"
  ],
  "threats": [
    { "detector_id": "refund-burst",
      "owasp_ref": "LLM06",
      "level": "warn",
      "message": "5 refunds in 24h for new account" }
  ],
  "policy_version": "refund.base@v1.4",
  "policy_ref": "packages/policies/refund.base.v1.yaml#L42-L78",
  "enforcement_mode": "enforce",
  "occurred_at": "2026-04-09T14:22:11.842Z"
}

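Three weeks later you look this up by occurred_at, or by customer_id if your store indexes events by customer (the event shown here carries no customer_id field of its own). The reasons array tells you, in English, what the agent saw and what fired. The customer gets a real answer.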
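
The reason strings in that event follow a pattern: metric, comparison, threshold, then the observed value. Here is a sketch of how an evaluator might derive them from threshold rules. The `Rule` shape and `fired_reasons` are assumptions made for illustration, not the supervisor's policy schema:

```python
import operator
from dataclasses import dataclass
from decimal import Decimal
from typing import Mapping, Union

Number = Union[int, Decimal]
OPS = {">": operator.gt, "<": operator.lt}


@dataclass(frozen=True)
class Rule:
    """One threshold check (hypothetical shape)."""
    metric: str
    op: str  # ">" or "<"
    threshold: Number


def fired_reasons(rules: list[Rule], observed: Mapping[str, Number]) -> list[str]:
    """Return one human-readable string per rule whose condition holds."""
    reasons = []
    for rule in rules:
        value = observed[rule.metric]
        if OPS[rule.op](value, rule.threshold):
            reasons.append(f"{rule.metric} {rule.op} {rule.threshold} (saw {value!s})")
    return reasons
```

Because each string is built from the rule that ran, the log stays faithful to the policy version in force at that moment, whatever the policy file says today.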

Replay a past decision against today's policy

The customer's next question is usually "would it still deny if I tried again?" The supervisor exposes the same decision endpoint in dry-run mode, so you can re-evaluate the same input against the current policy without committing to anything:

curl -X POST https://api.vibefixing.me/v1/actions/evaluate \
  -H 'authorization: Bearer …' \
  -H 'content-type: application/json' \
  -d '{
    "action_type": "refund",
    "input": { /* the original payload, retrieved by input_fingerprint */ },
    "dry_run": true
  }'

# response
{
  "decision": "review",                  # not 'deny' anymore
  "risk_score": 0.61,
  "reasons": [
    "refund_velocity_24h > 3 (saw 5)",
    # customer_age_days check removed in policy v1.6
  ],
  "policy_version": "refund.base@v1.6"
}

Now the engineer can answer the customer with precision: the policy in effect three weeks ago denied; today's policy would route the same case to a human reviewer. The policy diff between v1.4 and v1.6 is visible in source control. The ticket closes with a one-line explanation instead of a corporate paragraph that says nothing.
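
The comparison between the stored event and the dry-run response can be made mechanical. A small sketch, assuming both are plain dicts with the `decision`, `reasons`, and `policy_version` keys shown above (`explain_change` is a hypothetical helper, not part of any API described here):

```python
def explain_change(old: dict, new: dict) -> dict:
    """Summarize how a replayed decision differs from the recorded one."""
    old_reasons, new_reasons = set(old["reasons"]), set(new["reasons"])
    return {
        "decision": (old["decision"], new["decision"]),
        "policy_version": (old["policy_version"], new["policy_version"]),
        "no_longer_firing": sorted(old_reasons - new_reasons),
        "still_firing": sorted(old_reasons & new_reasons),
        "newly_firing": sorted(new_reasons - old_reasons),
    }
```

The summary says which reasons stopped firing but not why. That explanation still comes from the policy diff in source control.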

The wrap

On the agent side, the discipline is one decorator at the chokepoint. The supervisor writes the evidence event whether the decision was allow, deny, or review — failures are recorded too:

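import stripe
from supervisor_guards import supervised

@supervised("refund")
def issue_refund(customer_id: str, amount: float, reason: str) -> stripe.Refund:
    # resolve_charge is assumed to be defined elsewhere; it maps a customer to a charge ID.
    return stripe.Refund.create(
        charge=resolve_charge(customer_id),
        # Smallest currency unit; round() avoids float truncation (int(19.99 * 100) is 1998).
        amount=round(amount * 100),
        # Stripe accepts duplicate, fraudulent, or requested_by_customer.
        reason=reason,
    )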

Every call to issue_refund now produces a record of what reached the supervisor and what the supervisor decided, before the Stripe call happens. There is no extra logging code in the function body. The agent code didn't change shape; it just became reconstructable.
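
For intuition about what such a decorator does, here is a toy version. It is a sketch and not the `supervisor_guards` implementation: `supervised_sketch`, `evaluate`, `EVIDENCE_LOG`, and `ActionDenied` are all hypothetical, the policy is a single hard-coded rule, and arguments are keyword-only to keep it short:

```python
import functools
from typing import Any, Callable

EVIDENCE_LOG: list[dict] = []  # stand-in for the real evidence store


class ActionDenied(Exception):
    """Raised when the decision is anything other than 'allow'."""


def evaluate(action_type: str, kwargs: dict) -> dict:
    """Hypothetical policy: deny refunds above 500, allow everything else."""
    if action_type == "refund" and kwargs.get("amount", 0) > 500:
        return {"decision": "deny", "reasons": [f"amount > 500 (saw {kwargs['amount']})"]}
    return {"decision": "allow", "reasons": []}


def supervised_sketch(action_type: str) -> Callable:
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(**kwargs: Any) -> Any:
            verdict = evaluate(action_type, kwargs)
            # Record the decision first, so denials leave evidence too.
            EVIDENCE_LOG.append({"action_type": action_type, **verdict})
            if verdict["decision"] != "allow":
                raise ActionDenied("; ".join(verdict["reasons"]))
            return fn(**kwargs)  # the side effect runs only after an allow
        return wrapper
    return decorator
```

The ordering is the point: evaluate, record, then act. A denied call never reaches the wrapped function, but it still leaves a record.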

What I'm not worried about

PII bloat in the log. The evidence event stores an input_fingerprint, not the raw payload. The raw input sits in the action store with whatever retention policy your data team already runs; evidence retention is configurable separately. Replay works because you can hand the original payload back to the dry-run endpoint when you need it, not because we kept a copy forever.
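
One way a fingerprint like this could be computed is a SHA-256 digest over a canonical JSON encoding, so the same payload always maps to the same value regardless of key order. This is a sketch under that assumption; the note does not say how the supervisor derives its fingerprint, and `input_fingerprint` here is a hypothetical helper:

```python
import hashlib
import json


def input_fingerprint(payload: dict) -> str:
    """Digest a payload deterministically without storing its contents."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A plain hash of a small, guessable payload can be brute-forced, so a keyed digest (for example HMAC) may be the safer choice where the inputs are low-entropy.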

Your dashboards are also fine. The evidence chain is append-only and hash-linked; it sits next to the operational logs your team already greps. You don't move to a new logging stack. You add a column.
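
"Append-only and hash-linked" can be pictured as each entry storing the hash of the entry before it, so an edit to old history breaks every later link. The helper names and storage layout below are assumptions made for this minimal sketch, not the supervisor's format:

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: list[dict], event: dict) -> None:
    """Append an event linked to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)})


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link; False means some entry was altered or reordered."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Verification is a linear pass, so it can run as a periodic job without touching how the operational logs are queried.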

The explainability section of the risk hub.

Same idea, framed for someone landing on the site for the first time: explainability isn't a model property, it's a logging discipline at the boundary.
