Score a public repo
Paste a GitHub URL. The score covers the last 100 commits and is out of 100. Higher is cleaner.
Receipts — slop that shipped to production
Every card below traces to a public incident. The score next to the name is the live public scorer verdict at the time of the failure. Click any card to open the full scan.
LiteLLM 1.86.2
May 2026 · LLM-generated cache merge in caching_handler.py appended sub-batch indices verbatim. Downstream Java + Python ETL pipelines crashed on duplicated data[*].index.
OpenCode v1.15.13
Jun 2026 · PR #23068 refactor dropped mandatory agent / model args from tool/task.ts. Every sub-agent silently NULL'd the columns in SQLite. Telemetry blind for days.
rsync 3.4.3
May 2026 · Incremental backups silently broke. 36 AI-attributed commits between 3.4.1 and 3.4.3. Emergency 3.4.4 shipped Jun 8.
Faker.js
2026 · LLM-generated locale "optimisation" broke seed determinism. CI runs fail unpredictably across identical seeds. The average score looks clean; the temporal signal flagged the slop commit.
The bigger pattern
- 13 h AWS outage (Dec 2025): AI agent Kiro autonomously deleted and recreated prod. Guardian/FT.
- $186/mo invisible tax / affected employee — 1 hr 56 min cleanup × 40% of staff hit. HBR · Stanford · BetterUp, n=1,150. ~$9M/yr per 10k-employee org.
- CIO press: Forbes Apr 2026 · TechTarget Jan 2026.
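The per-organisation figure in the second bullet can be reproduced from the per-employee numbers. A quick arithmetic check, assuming the $186/month applies only to the 40% of staff affected (that reading is an assumption; the cited study may define the base differently):

```python
# Sanity check of the "~$9M/yr per 10k-employee org" figure.
# Assumption: the $186/month cost applies only to affected employees (40% of staff).
MONTHLY_COST_PER_AFFECTED = 186  # USD, from the bullet above
AFFECTED_SHARE = 0.40            # share of staff hit
HEADCOUNT = 10_000               # employees in the example organisation


def annual_org_cost(monthly_cost, affected_share, headcount):
    """Yearly cost in USD across the whole organisation."""
    return monthly_cost * 12 * affected_share * headcount


total = annual_org_cost(MONTHLY_COST_PER_AFFECTED, AFFECTED_SHARE, HEADCOUNT)
print(f"${total:,.0f} per year")
```

Under that reading the total comes to roughly $8.9M a year, which rounds to the ~$9M quoted.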
The AI slop code tool for teams shipping with LLMs
Sloppoke is the AI slop code tool — a slop detector + AI slop fix tool in one engine — for the codebase your team ships with Cursor, Claude Code, or Copilot. Catches LLM slop (the residue every assistant leaves around the working code) and patches it in the same commit.
Three names, one tool: AI slop detector when you ask what it does, AI slop code tool when you list what's in your stack, AI code slop fix when you describe the cleanup. The pre-commit gate is the same either way.
How it works
slop poke sends your staged diff and returns a verdict plus a unified-diff patch. Sub-10 ms. Safe deletes are auto-applied; anything semantic gets a // TODO(slop) spliced in, and you decide.
Adaptive. Every slop learn "…" tunes detection for your account. False positives quiet down, real misses get caught next time.
How sloppoke compares
What sloppoke is. A statistical detector for the correlation between known LLM-coding patterns and codebase failures. Two surfaces: a public scanner (VC / PM / HR due-diligence on code quality) and a developer-side CLI (catch + clean at the commit boundary).
What sloppoke is not. Not a perf tool. Not a safety guarantee. Slop density is a correlation signal, not a proof of correctness. For runtime: profilers, load tests. For correctness: types, tests, formal verification.
| | sloppoke | CodeRabbit | OSS slop | Linters |
|---|---|---|---|---|
| Pre-commit gate | ✓ | ✗ | ✓ | ✓ |
| Verdict latency | <10 ms | 15 min+ | ~ms | ~ms |
| Action on a hit | strip / TODO(slop) | review comment | flag only | --fix rewrites body |
| Deterministic verdict | ✓ | ✗ | ✓ | ✓ |
| Learns from your feedback | ✓ | ✗ | ✗ | ✗ |
| Multi-model RL loop | ✓ (NSED) | ✗ | ✗ | ✗ |
| Vendor sees source | diff only | full repo + PR | n/a | n/a |
| Pricing | flat sub | seat + LLM tokens | free | free |
- CodeRabbit lives in the PR by design. Vendor logo, post-push, more LLM prose in the diff. Sloppoke gates pre-commit — residue stripped before it lands.
- Catalog tunes itself. An offline deliberation loop (peeramid labs' NSED) re-ranks rules against fresh signal weekly. OSS + linters ship on static release cadence; this one adapts.
Tiers
Starter
- 100,000 pokes / month
- All detectors + adaptive learning
- One SSH key per sub
- 30-day money-back guarantee · cancel anytime
Install, run slop poke — first metered call
returns a Stripe URL keyed to your SSH key. No signup.
Show me the checkout flow
$ slop login
✓ SSH-key handshake complete (3 ms)
$ slop poke
✓ Free tier: 99,847 / 100,000 pokes remaining
# later, when you hit the quota:
$ slop poke
⚠ Monthly cap reached. Checkout to continue:
https://buy.stripe.com/<your-keyed-url>
Pay once → unlimited until next billing cycle.
The URL is bound to your SSH key — bookmark it.
Install
Pick the path that matches your stack. Each method runs the same binary; only the wiring differs.
- 🍺 Homebrew (macOS + Linux)
  $ brew install peeramid-labs/tap/slop
- 🦀 From source (any platform with cargo)
  $ git clone https://github.com/peeramid-labs/sloppoke.git
  $ cd sloppoke
  $ cargo install --path crates/sloppoke-cli
- 🤖 Claude Code plugin (slash commands + skill)
  /plugin marketplace add peeramid-labs/plugin-marketplace
  /plugin install sloppoke@peeramid-labs
  Adds /slop:poke, /slop:apply, /slop:learn.
- 📡 Codex CLI plugin (before_command hook)
  $ codex plugin marketplace add github:peeramid-labs/plugin-marketplace
  $ codex plugin install sloppoke@peeramid-labs
  Wires Codex git commit calls to slop poke --staged.
- 📎 Cursor / Continue (skill only)
  $ mkdir -p ~/.claude/skills
  $ curl -fsSL https://raw.githubusercontent.com/peeramid-labs/sloppoke/main/skills/slop.md \
      -o ~/.claude/skills/slop.md
- ▶ First run (after any install method)
  $ slop login                              # SSH-key handshake
  $ slop poke                               # scan working tree
  $ slop poke --gh org/repo --range X..Y    # scan any public repo
  $ slop apply                              # apply + amend HEAD
  Stdout = patch, stderr = verdict; pipes directly to git apply --unidiff-zero or delta.
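The stdout/stderr split described in the first-run step can also be consumed from a script. A minimal sketch, assuming slop poke writes nothing but the patch to stdout and that an empty stdout means there is nothing to apply (neither is confirmed here). The function name poke_and_apply and the injectable runner parameter are hypothetical, added so the logic can be exercised without the binary installed:

```python
import subprocess


def poke_and_apply(runner=subprocess.run):
    """Capture the patch (stdout) and verdict (stderr) from `slop poke`,
    then feed a non-empty patch to `git apply --unidiff-zero` on stdin.

    Hypothetical helper: `runner` defaults to subprocess.run and can be
    swapped for a stub in tests.
    """
    # No check=True here: a non-zero exit may simply mean slop was found.
    poke = runner(["slop", "poke"], capture_output=True, text=True)
    verdict, patch = poke.stderr.strip(), poke.stdout
    if patch.strip():
        # git apply reads the patch from stdin when no file is given.
        runner(["git", "apply", "--unidiff-zero"], input=patch, text=True, check=True)
    return verdict
```

Calling poke_and_apply() from the repository root returns the verdict text. This is the manual counterpart to slop apply, minus the amend of HEAD.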
Independently measurable
Across 5,173 production OSS repos, 39.9% of the static-analysis findings in AI-authored commits that passed human review and reached HEAD would have been caught by sloppoke's pre-commit gate. Measured against a peer-published academic dataset of 304K AI-authored commits (Liu et al. 2026).
Re-run the measurement yourself: github.com/peeramid-labs/sloppoke-bench. Same archive → same number, every time.
Privacy & data
Servers in Germany, EU rules. We process diffs, return verdicts, persist only the learning signals — never raw source. Purge anytime via billing portal.
Identity = SSH key fingerprint. No emails, no usernames, no trackers. Stripe handles billing in isolation.
Security
Both CLI and server are written in Rust, so the common memory-safety
bug classes are ruled out by construction in safe code. Minimal deps:
one binary, one HTTP client, one ssh-keygen sign.
7-day release buffer on third-party crate upgrades — they bake, the Rust security advisory feed catches bad ones, then we ship. Stable over cutting edge.
FAQ
What does AI slop cost, and what does sloppoke measure?
See the Receipts section above for the top incidents (LiteLLM / OpenCode / rsync / Faker.js + the industry signal). One extra entry that fits less neatly into a card:
- C23 / glibc compile-fix wave (early 2026): LLM "shortest semantic path" patches across legacy C utilities to clear glibc 2.43 errors. Aggressive const-casts + macro masking → modern GCC/Clang optimise unreachable branches → segfaults, silent memory leaks, buffer holes in decade-stable code. (Generic-git scan support inbound: sourceware.org/git/glibc.git and other non-GitHub hosts.)
What sloppoke measures:
- Slop density per repo over time (gate merges on a target).
- Hits blocked × 1 hr 56 min × your eng rate = hours/dollars saved.
- Determinism — same diff → same finding, audit-ready.
- Per-category TP / FP, tracked each catalog release.
- Verdict p95 <10 ms (elapsed_ms in every response).
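The savings bullet above is a straight product, which can be sketched as follows. It assumes every blocked hit would otherwise have cost the full 1 hr 56 min of cleanup, which is a simplification; the hit count and hourly rate in the example call are made-up inputs, not measurements:

```python
CLEANUP_HOURS_PER_HIT = 1 + 56 / 60  # 1 hr 56 min, the figure quoted above


def savings(hits_blocked, hourly_rate_usd):
    """Return (hours_saved, dollars_saved) for a number of blocked hits.

    Assumes each blocked hit avoids one full cleanup of 1 hr 56 min.
    """
    hours = hits_blocked * CLEANUP_HOURS_PER_HIT
    return hours, hours * hourly_rate_usd


# Illustrative inputs only: 30 blocked hits at a $90/hour engineering rate.
hours, dollars = savings(hits_blocked=30, hourly_rate_usd=90.0)
print(f"{hours:.1f} h, ${dollars:,.0f}")
```

Swap in your own hit count and rate; the constant is the only number taken from this page.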
Does sloppoke measure runtime performance or guarantee correctness?
No. Different tool category. Runtime perf → profilers (perf, flamegraphs) + load tests (k6, wrk, Locust). Correctness → types, tests, formal verification. Density of LLM residue in source is what sloppoke measures — a statistical correlation with the failure modes the FAQ above documents, not a proof of correctness or speed.
Adjacent failure mode it does catch indirectly: "shortest semantic path" compile-fix patches (see the C23 / glibc bullet above) leave the code compiling but push GCC/Clang into UB. The markers fire because the patches drop language-level guarantees, not because we instrument the runtime.
Wait, is this frontend vibecoded?
Yes — landing, copy, pixel widgets, all agent-sketched. The catalog is not: deterministic ML + ruleset from 15+ yrs regulated engineering. Vibe the visible layer; stay deterministic where it counts.
How do you characterize slop?
Three flavours: wordy nothing (comments
restating code, vacant names), defensive theatre
(guards for impossible cases, empty catches),
unfinished work shipped (placeholders,
untested branches, AI trailers in commits).
Catalog isn't published and isn't static — every
slop learn tunes yours.
Why no GitHub app or PR bot?
PR bots fire after the slop is in git history. slop
poke runs on the staged diff before commit. No
force-push cleanup. CI still covered:
slop poke --range $BASE..$HEAD drops into any
pipeline as a one-liner, exits non-zero on SLOP. No
GitHub-App scope on your repo.
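For pipelines that already drive their steps from Python, the one-liner above can be wrapped as a gate. A minimal sketch: the slop poke --range invocation and the non-zero exit on SLOP come from the text above, while the function name slop_gate, the injectable runner parameter and the script name in the usage comment are hypothetical:

```python
import subprocess
import sys


def slop_gate(base, head, runner=subprocess.run):
    """Run `slop poke --range BASE..HEAD` and return its exit code.

    `runner` is injectable (hypothetical convenience) so the gate can be
    tested without the binary; by default it is subprocess.run.
    """
    cmd = ["slop", "poke", "--range", f"{base}..{head}"]
    result = runner(cmd)
    return result.returncode


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python slop_gate.py "$BASE" "$HEAD"
    # Propagating the exit code fails the CI job when slop is reported.
    sys.exit(slop_gate(sys.argv[1], sys.argv[2]))
```

Passing the argument list directly to subprocess.run, without a shell, avoids quoting problems with unusual ref names.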
Why SSH keys instead of an email signup?
Reuse what already works. Requests signed by
ssh-keygen -Y sign; fingerprint = account.
Nothing to provision, no email breach target, no marketing
list. Same key in CI: drop in a secret, done.
What does slop actually see about my code?
Unified diff. Nothing else. No origin URL, no SHA, no OAuth token, no clone. Server processes the patch in memory, returns verdict + apply-patch, persists only learning weights. Diff bytes do contain your literal lines — treat sending them like any code-review tool.
What languages are supported?
Surface detection on every language. Deep analysis lights up first for Rust, TS/JS, Python, Go. More land continuously server-side — no CLI reinstall.
What if I disagree with a finding?
slop learn "false positive on … because …".
Quiets next scan for your account + project. We keep the
calibration weights, not the raw text.
Can sloppoke run inside a Trusted Execution Environment?
Yes, Enterprise. Server runs in AMD SEV-SNP confidential VM; we can't read your diffs even with root on the host. Remote attestation proves the running binary matches our published hash; session key sealed to that measurement. EU residency default; Intel TDX / AWS Nitro on request. Trust the math.
Can I run it on-prem or self-hosted?
Enterprise. Private-corpus calibration, SSO, SLA, audit trail, server image inside your perimeter. engineering@peeramid.xyz.
Does the CLI work without the cloud API?
Today: no. Thin client → catalog match runs server-side. On-prem / TEE available under Enterprise. Patches kept 24 h for the learning loop on our own model fleet (no third-party LLM APIs); only anonymised patterns survive after. EU residency, per-account purge on request.