Score a public repo

Paste a GitHub URL. The score covers the last 100 commits and is out of 100. Higher is cleaner.

Receipts: litellm opencode rsync faker

Receipts — slop that shipped to production

Every card below traces to a public incident. The score next to the name is the live public scorer verdict at the time of the failure. Click any card to open the full scan.

LiteLLM 1.86.2

May 2026

LLM-generated cache merge in caching_handler.py appended sub-batch indices verbatim. Downstream Java + Python ETL pipelines crashed on duplicated data[*].index.

332 hits / 100 commits

OpenCode v1.15.13

Jun 2026

PR #23068 refactor dropped mandatory agent / model args from tool/task.ts. Every sub-agent silently NULL'd the columns in SQLite. Telemetry blind for days.

60/100 DRIFTING · 169 hits ↑

rsync 3.4.3

May 2026

Incremental backups silently broke. 36 AI-attributed commits between 3.4.1 and 3.4.3. Emergency 3.4.4 shipped Jun 8.

42/100 SLOPPY · 99 hits ↑

Faker.js

2026

LLM-generated locale "optimisation" broke seed determinism. CI runs fail unpredictably across identical seeds. Average score looks clean — the temporal signal flagged the slop commit.

83/100 CLEAN · 25 hits

The bigger pattern

13 h AWS outage (Dec 2025): AI agent Kiro autonomously deleted and recreated prod. Source: Guardian/FT.
$186/mo invisible tax / affected employee — 1 hr 56 min cleanup × 40% of staff hit. HBR · Stanford · BetterUp, n=1,150. ~$9M/yr per 10k-employee org.
CIO press: Forbes Apr 2026 · TechTarget Jan 2026.
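
The ~$9M/yr figure can be reproduced from the numbers quoted above. A minimal sketch, assuming the annual cost is simply headcount × share of staff affected × monthly cost × 12; the function name and that multiplication are a reading of the quoted figures, not the study's published method.

```python
def annual_cleanup_cost(headcount: int, share_affected: float, monthly_cost: float) -> float:
    """Yearly cost: affected employees x monthly cost per employee x 12 months."""
    return headcount * share_affected * monthly_cost * 12


# Figures quoted above: 10k employees, 40% affected, $186/month each.
cost = annual_cleanup_cost(10_000, 0.40, 186)
print(f"${cost:,.0f} per year")
```

With these inputs the result lands just under $9M, consistent with the rounded figure in the text.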

The AI slop code tool for teams shipping with LLMs

Sloppoke is the AI slop code tool — a slop detector + AI slop fix tool in one engine — for the codebase your team ships with Cursor, Claude Code, or Copilot. Catches LLM slop (the residue every assistant leaves around the working code) and patches it in the same commit.

Three names, one tool: AI slop detector when you ask what it does, AI slop code tool when you list what's in your stack, AI code slop fix when you describe the cleanup. The pre-commit gate is the same either way.

How it works

diff → scan → verdict → apply

slop poke sends your staged diff, returns a verdict + a unified-diff patch. Sub-10ms. Safe deletes auto-applied; anything semantic gets a // TODO(slop) spliced — you decide.
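
The apply step can be pictured as a two-way policy. The sketch below is illustrative only: the `Finding` type, its `kind` labels, and `apply_findings` are hypothetical names, not sloppoke's API or wire format, and the real tool returns a unified-diff patch rather than editing a list of lines.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    line: int  # 0-based index into the source lines (hypothetical field)
    kind: str  # "safe_delete" or "semantic" (hypothetical labels)


def apply_findings(lines: list[str], findings: list[Finding]) -> list[str]:
    """Drop safe-delete lines; mark semantic ones with a TODO(slop) comment."""
    by_line = {f.line: f.kind for f in findings}
    out = []
    for i, text in enumerate(lines):
        kind = by_line.get(i)
        if kind == "safe_delete":
            continue  # residue: removed automatically
        if kind == "semantic":
            out.append("// TODO(slop)")  # left for a human to decide
        out.append(text)
    return out


source = ["// increment i by one", "i += 1;", "try { run(); } catch (e) {}"]
patched = apply_findings(source, [Finding(0, "safe_delete"), Finding(2, "semantic")])
print("\n".join(patched))
```

The restating comment disappears, while the empty catch stays in place with a marker above it, so the semantic call remains with the developer.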

Adaptive. Every slop learn "…" tunes detection for your account. False positives quiet down, real misses get caught next time.

How sloppoke compares

What sloppoke is. A statistical detector for the correlation between known LLM-coding patterns and codebase failures. Two surfaces: a public scanner (VC / PM / HR due-diligence on code quality) and a developer-side CLI (catch + clean at the commit boundary).

What sloppoke is not. Not a perf tool. Not a safety guarantee. Slop density is a correlation signal, not a proof of correctness. For runtime: profilers, load tests. For correctness: types, tests, formal verification.

| | sloppoke | CodeRabbit | OSS slop | Linters |
| --- | --- | --- | --- | --- |
| Pre-commit gate | ✓ | ✗ | ✓ | ✓ |
| Verdict latency | <10 ms | 15 min+ | ~ms | ~ms |
| Action on a hit | strip / `TODO(slop)` | review comment | flag only | `--fix` rewrites body |
| Deterministic verdict | ✓ | ✗ | ✓ | ✓ |
| Learns from your feedback | ✓ | ✗ | ✗ | ✗ |
| Multi-model RL loop | ✓ (NSED) | ✗ | ✗ | ✗ |
| Vendor sees source | diff only | full repo + PR | n/a | n/a |
| Pricing | flat sub | seat + LLM tokens | free | free |

CodeRabbit lives in the PR by design. Vendor logo, post-push, more LLM prose in the diff. Sloppoke gates pre-commit — residue stripped before it lands.
Catalog tunes itself. An offline deliberation loop (peeramid labs' NSED) re-ranks rules against fresh signal weekly. OSS + linters ship on static release cadence; this one adapts.

Tiers

Launch − 40%

Starter

$12 / month (launch price; regularly $20)

100,000 pokes / month
All detectors + adaptive learning
One SSH key per sub
30-day money-back guarantee · cancel anytime

Get started →

Install, run slop poke — first metered call returns a Stripe URL keyed to your SSH key. No signup.

Show me the checkout flow

```
$ slop login
✓ SSH-key handshake complete (3 ms)

$ slop poke
✓ Free tier: 99,847 / 100,000 pokes remaining

# later, when you hit the quota:
$ slop poke
⚠  Monthly cap reached. Checkout to continue:
   https://buy.stripe.com/<your-keyed-url>

   Pay once → unlimited until next billing cycle.
   The URL is bound to your SSH key — bookmark it.
```

Enterprise

Talk to us

Custom volume + team accounts
Private-corpus calibration
SLA, audit trail, SSO
On-prem or confidential-compute (TEE) deployment

See ROI →

Or email us.

Install

Pick the path that matches your stack. Each method runs the same binary; only the wiring differs.

🍺
Homebrew
macOS + Linux
```
brew install peeramid-labs/tap/slop
```

🦀

From source

any platform with cargo

```
git clone https://github.com/peeramid-labs/sloppoke.git
cd sloppoke
cargo install --path crates/sloppoke-cli
```

🤖

Claude Code plugin

slash commands + skill

```
/plugin marketplace add peeramid-labs/plugin-marketplace
/plugin install sloppoke@peeramid-labs
```

Adds /slop:poke, /slop:apply, /slop:learn.

📡

Codex CLI plugin

before_command hook

```
codex plugin marketplace add github:peeramid-labs/plugin-marketplace
codex plugin install sloppoke@peeramid-labs
```

Wires Codex git commit calls to slop poke --staged.

📎

Cursor / Continue

skill only

```
mkdir -p ~/.claude/skills
curl -fsSL https://raw.githubusercontent.com/peeramid-labs/sloppoke/main/skills/slop.md \
  -o ~/.claude/skills/slop.md
```

▶

First run

after any install method

```
slop login                              # SSH-key handshake
slop poke                               # scan working tree
slop poke --gh org/repo --range X..Y    # scan any public repo
slop apply                              # apply + amend HEAD
```

Stdout = patch, stderr = verdict — pipes to git apply --unidiff-zero or delta directly.
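
Because the patch and the verdict arrive on separate streams, a wrapper script can capture them independently. A minimal sketch: `run_split` is a hypothetical helper, and a stand-in child process is used so the example runs without slop installed. In real use the command would be `["slop", "poke"]`, and the captured patch could be fed to `git apply --unidiff-zero` on stdin.

```python
import subprocess
import sys


def run_split(cmd: list[str]) -> tuple[str, str, int]:
    """Run cmd and return (stdout, stderr, exit code), each captured separately."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.stdout, proc.stderr, proc.returncode


# Stand-in for the CLI: writes one line to stdout and one to stderr.
fake_cli = [
    sys.executable,
    "-c",
    "import sys; print('patch-bytes'); print('verdict-text', file=sys.stderr)",
]
patch, verdict, code = run_split(fake_cli)
print("stdout:", patch.strip())
print("stderr:", verdict.strip())
```

Keeping the streams apart means the patch never has to be parsed out of human-readable output.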

Independently measurable

Measured against a peer-published academic dataset of 304K AI-authored commits across 5,173 production OSS repos (Liu et al. 2026), sloppoke's pre-commit gate would have caught 39.9% of the static-analysis findings that AI commits shipped past human review and into HEAD.

Re-run the measurement yourself: github.com/peeramid-labs/sloppoke-bench. Same archive → same number, every time.

Privacy & data

Servers in Germany, EU rules. We process diffs, return verdicts, persist only the learning signals — never raw source. Purge anytime via billing portal.

Identity = SSH key fingerprint. No emails, no usernames, no trackers. Stripe handles billing in isolation.

Security

Both CLI and server are written in Rust, so safe Rust rules out most memory-safety bug classes by construction. Minimal deps: one binary, one HTTP client, one ssh-keygen -Y sign call.

7-day release buffer on third-party crate upgrades — they bake, the Rust security advisory feed catches bad ones, then we ship. Stable over cutting edge.

FAQ

What does AI slop cost, and what does sloppoke measure?

See the Receipts section above for the top incidents (LiteLLM / OpenCode / rsync / Faker.js + the industry signal). One extra entry that fits less neatly into a card:

C23 / glibc compile-fix wave (early 2026) — LLM "shortest semantic path" patches across legacy C utilities to clear glibc 2.43 errors. Aggressive const-casts + macro masking → modern GCC/Clang optimise unreachable branches → segfaults, silent memory leaks, buffer holes in decade-stable code. (Generic-git scan support inbound: sourceware.org/git/glibc.git and other non-GitHub hosts.)

What sloppoke measures:

Slop density per repo over time (gate merges on a target).
Hits blocked × 1 hr 56 min × your eng rate = hours/dollars saved.
Determinism — same diff → same finding, audit-ready.
Per-category TP / FP, tracked each catalog release.
Verdict p95 <10 ms (elapsed_ms in every response).
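
The savings line above is plain arithmetic. A minimal sketch, assuming every blocked hit would otherwise have cost one full 1 hr 56 min cleanup; the `savings` function and the example inputs (30 hits, $90/h) are made up for illustration.

```python
CLEANUP_MINUTES = 116  # 1 hr 56 min per incident, the figure quoted above


def savings(hits_blocked: int, hourly_rate: float) -> tuple[float, float]:
    """Return (hours saved, dollars saved) for a number of blocked hits."""
    hours = hits_blocked * CLEANUP_MINUTES / 60
    return hours, hours * hourly_rate


hours, dollars = savings(hits_blocked=30, hourly_rate=90.0)
print(f"{hours:.1f} h, ${dollars:,.0f}")
```

Swap in your own hit count and loaded engineering rate; the estimate is only as good as the one-cleanup-per-hit assumption.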

Does sloppoke measure runtime performance or guarantee correctness?

No. Different tool category. Runtime perf → profilers (perf, flamegraphs) + load tests (k6, wrk, Locust). Correctness → types, tests, formal verification. Density of LLM residue in source is what sloppoke measures — a statistical correlation with the failure modes the FAQ above documents, not a proof of correctness or speed.

Adjacent failure mode it does catch indirectly: "shortest semantic path" compile-fix patches (see the C23 / glibc bullet above) leave the code compiling but push GCC/Clang into UB. The markers fire because the patches drop language-level guarantees, not because we instrument the runtime.

Wait, is this frontend vibecoded?

Yes — landing, copy, pixel widgets, all agent-sketched. The catalog is not: deterministic ML + ruleset from 15+ yrs regulated engineering. Vibe the visible layer; stay deterministic where it counts.

How do you characterize slop?

Three flavours: wordy nothing (comments restating code, vacant names), defensive theatre (guards for impossible cases, empty catches), unfinished work shipped (placeholders, untested branches, AI trailers in commits). Catalog isn't published and isn't static — every slop learn tunes yours.
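
To make the three flavours concrete, here is an invented example, not taken from the catalog: one function showing each kind of residue, next to a version that does the work the residue only gestured at. Both function names are hypothetical.

```python
def parse_port_sloppy(value: str) -> int:
    # convert value to int (wordy nothing: restates the code)
    result = int(value)
    if result is None:  # defensive theatre: int() never returns None
        return 0
    # TODO: validate range (unfinished work shipped)
    return result


def parse_port(value: str) -> int:
    """Parse a TCP port, rejecting values outside 1..65535."""
    port = int(value)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


print(parse_port_sloppy("8080"), parse_port("8080"))
```

Both versions agree on valid input; only the second one refuses `"70000"`, which is the check the sloppy version's TODO never delivered.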

Why no GitHub app or PR bot?

PR bots fire after the slop is in git history. slop poke runs on the staged diff before commit. No force-push cleanup. CI still covered: slop poke --range $BASE..$HEAD drops into any pipeline as a one-liner, exits non-zero on SLOP. No GitHub-App scope on your repo.
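
In a pipeline written in Python, the gate reduces to checking an exit code. A minimal sketch: `gate` and `slop_range_cmd` are hypothetical helpers, and stand-in commands replace the real CLI so the example runs anywhere. Only the `slop poke --range BASE..HEAD` form and the non-zero exit on SLOP come from the text above.

```python
import subprocess
import sys


def slop_range_cmd(base: str, head: str) -> list[str]:
    """Build the documented one-liner: slop poke --range BASE..HEAD."""
    return ["slop", "poke", "--range", f"{base}..{head}"]


def gate(cmd: list[str]) -> bool:
    """Return True when the scan command exits 0, False on any non-zero exit."""
    return subprocess.run(cmd).returncode == 0


# Stand-ins for a clean scan and a failing scan.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]
print(gate(passing), gate(failing))
```

A CI step would call `gate(slop_range_cmd(base, head))` and exit non-zero itself when the result is `False`.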

Why SSH keys instead of an email signup?

Reuse what already works. Requests signed by ssh-keygen -Y sign; fingerprint = account. Nothing to provision, no email breach target, no marketing list. Same key in CI: drop in a secret, done.
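
For reference, this is how a standard OpenSSH SHA256 fingerprint is derived from a public key line, the same value `ssh-keygen -lf key.pub` prints. Whether sloppoke keys accounts on exactly this format is an assumption; `ssh_fingerprint` is a hypothetical helper and the key blob below is dummy data, not a usable key.

```python
import base64
import hashlib


def ssh_fingerprint(public_key_line: str) -> str:
    """SHA256 fingerprint of an OpenSSH public key line ("type base64 [comment]")."""
    blob = base64.b64decode(public_key_line.split()[1])
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as unpadded base64 with a "SHA256:" prefix.
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")


# Dummy key blob for illustration only.
demo_line = "ssh-ed25519 " + base64.b64encode(b"demo-key-blob").decode() + " dev@laptop"
print(ssh_fingerprint(demo_line))
```

The comment field is ignored, so the same key yields the same fingerprint on a laptop and in CI.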

What does slop actually see about my code?

Unified diff. Nothing else. No origin URL, no SHA, no OAuth token, no clone. Server processes the patch in memory, returns verdict + apply-patch, persists only learning weights. Diff bytes do contain your literal lines — treat sending them like any code-review tool.
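
To see what "literal lines" means in practice, the sketch below builds a zero-context unified diff with Python's standard `difflib`. It only illustrates the diff format; the file name and contents are invented, and nothing here reflects how the CLI assembles its request.

```python
import difflib

before = ["def total(xs):", "    return sum(xs)"]
after = ["def total(xs):", "    # sum the list", "    return sum(xs)"]

# n=0 drops context lines, similar in spirit to `git diff --unified=0`.
diff = list(
    difflib.unified_diff(before, after, "a/calc.py", "b/calc.py", n=0, lineterm="")
)
print("\n".join(diff))
```

The output carries the file headers, a hunk marker, and the added line verbatim, so anything sensitive in a changed line travels with the diff.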

What languages are supported?

Surface detection on every language. Deep analysis lights up first for Rust, TS/JS, Python, Go. More land continuously server-side — no CLI reinstall.

What if I disagree with a finding?

slop learn "false positive on … because …". Quiets next scan for your account + project. We keep the calibration weights, not the raw text.

Can sloppoke run inside a Trusted Execution Environment?

Yes, Enterprise. Server runs in AMD SEV-SNP confidential VM; we can't read your diffs even with root on the host. Remote attestation proves the running binary matches our published hash; session key sealed to that measurement. EU residency default; Intel TDX / AWS Nitro on request. Trust the math.

Can I run it on-prem or self-hosted?

Enterprise. Private-corpus calibration, SSO, SLA, audit trail, server image inside your perimeter. engineering@peeramid.xyz.

Does the CLI work without the cloud API?

Today: no. Thin client → catalog match runs server-side. On-prem / TEE available under Enterprise. Patches kept 24 h for the learning loop on our own model fleet (no third-party LLM APIs); only anonymised patterns survive after. EU residency, per-account purge on request.