How detection works under the hood
Sloppoke uses a two-loop architecture: a deterministic pattern
engine in front of your git commit, and a multi-model deliberation
pipeline behind it that continuously sharpens the catalog the engine
runs against.
The fast path
flowchart LR
A["your patch<br/>(git diff --staged)"] --> B["slop CLI"]
B -->|"HTTPS POST<br/>signed by your SSH key"| C["/api/v1/poke"]
C --> D["curated category catalog<br/>~50 categories, language-aware"]
D --> E["hit list<br/>+ suggested cleanup"]
style D fill:#142a36,stroke:#6ee0d8,color:#ecf6f0
style E fill:#1b3645,stroke:#ff85b1,color:#ecf6f0
Steps:
- The CLI reads your patch (git diff --staged, a range, a file).
- It signs the request with your SSH key and POSTs it to /api/v1/poke.
- The server matches the patch against a curated category catalog that's been distilled from real LLM-assisted diffs across many languages and ecosystems.
- Language-aware analysis walks structural categories that flat text matching can't reach: branch coverage, doc-comment presence, #[cfg(test)] boundaries, framework idioms.
- Hits are deduplicated, scored, and returned with a cleanup_actions array.
Sub-10 ms verdict per typical patch. No model in the request path, no GPU, no per-commit cost. Same patch → same verdict, every time — the verdict is reproducible, not generative.
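To make the "no model in the request path" idea concrete, here is a minimal sketch of a deterministic pattern pass over a patch. It is an illustration under stated assumptions, not Sloppoke's engine: the CATALOG entries, the Hit type, and the poke function are all hypothetical, and the real catalog, scoring, and cleanup_actions format are not shown in this document.

```python
import re
from dataclasses import dataclass

# Hypothetical catalog: category name -> compiled pattern.
# The real catalog's categories and patterns are not published here.
CATALOG = {
    "comment_marker": re.compile(r"\b(FIXME|HACK|XXX|TODO)\b"),
    "self_congratulation": re.compile(
        r"\b(enhanced|streamlined|supercharged)\b", re.IGNORECASE
    ),
}


@dataclass(frozen=True, order=True)
class Hit:
    line_no: int   # 1-based line index within the patch text
    category: str  # key from CATALOG
    text: str      # the added line, without the leading "+"


def poke(patch: str) -> list[Hit]:
    """Scan only the added lines of a unified diff; return sorted, deduplicated hits."""
    hits = set()
    for line_no, line in enumerate(patch.splitlines(), start=1):
        # Added lines start with "+"; "+++" is the file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        body = line[1:]
        for category, pattern in CATALOG.items():
            if pattern.search(body):
                hits.add(Hit(line_no, category, body.strip()))
    # Sorting gives a stable order, so the same patch always yields the same list.
    return sorted(hits)
```

Because the pass is pure pattern matching over the input, calling poke twice on the same patch returns equal lists, which is the reproducibility property described above.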
What's in the catalog
See the catalog reference for the full list. Five buckets cover most of the surface:
- Language-agnostic LLM tells (self-congratulatory verbs, defensive crud, narrative comments)
- Language-specific structural traps (Python bare_except, Rust unwrap outside tests, TS as unknown as)
- SQL anti-patterns
- Cross-file structural checks (untested branches)
- Comment-marker hunters (FIXME / HACK / XXX / TODO)
Every category has been distilled from real LLM-assisted diffs, not academic taxonomies. The list grows as new patterns surface in the wild.
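As an illustration of why a language-specific category such as Python bare_except benefits from structural analysis rather than text matching, the sketch below uses Python's standard ast module. The function name find_bare_excepts is hypothetical, and this is one plausible way to implement such a check, not a description of how the catalog does it.

```python
import ast


def find_bare_excepts(source: str) -> list[int]:
    """Return line numbers of `except:` handlers that name no exception type."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        # A bare `except:` is an ExceptHandler whose `type` is None.
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )


SNIPPET = '''
try:
    risky()
except:
    pass  # the text "except:" inside a comment is not a handler
try:
    risky()
except ValueError:
    message = "except:"
'''

print(find_bare_excepts(SNIPPET))  # prints [4]
```

A flat search for the text "except:" would also flag the comment on line 5 and the string literal on line 9. Walking the syntax tree reports only the real handler on line 4.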
The slow path — where the intelligence lives
Detection accuracy improves over time via NSED Orchestrator, an asynchronous multi-model deliberation backend.
flowchart TD
A["slop learn feedback"] --> B["LearnLog<br/>(server-side, EU-resident)"]
B --> C["NSED async deliberation"]
C --> D["multi-model panel review"]
C --> E["category weight tuning"]
C --> F["per-account corpus diffs"]
D --> G["per-account catalog updates"]
E --> G
F --> G
G -->|"next deploy"| H["fast path"]
style C fill:#142a36,stroke:#a78bfa,color:#ecf6f0
style H fill:#1b3645,stroke:#6ee0d8,color:#ecf6f0
What this means for you:
- Every false positive you report (slop learn "the X warning is wrong because…") feeds the deliberation loop.
- The loop processes feedback out-of-band, so there is no synchronous latency at the commit boundary.
- Deliberation runs on our own model fleet, not third-party LLM APIs. Your patch text never leaves our infrastructure to a vendor.
- After 24 h only anonymized, generalized detection patterns survive server-side, scoped to your account.
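The "category weight tuning" branch of the diagram can be pictured with a toy sketch like the one below. It is not the NSED algorithm, which this document does not specify. The tune_weights function, its inputs, and the rule of scaling a weight by one minus the reported false-positive rate are all assumptions chosen for illustration.

```python
from collections import Counter


def tune_weights(weights, hit_counts, false_positive_reports, floor=0.1):
    """Scale each category weight by (1 - reported false-positive rate).

    weights: dict of category -> current weight (hypothetical structure)
    hit_counts: dict of category -> number of hits raised in the window
    false_positive_reports: iterable of category names, one per report
    floor: lower bound so a category is damped rather than silenced
    """
    reported = Counter(false_positive_reports)
    tuned = {}
    for category, weight in weights.items():
        total = hit_counts.get(category, 0)
        if total == 0:
            # No hits in the window means no evidence; keep the weight.
            tuned[category] = weight
            continue
        fp_rate = min(reported[category] / total, 1.0)
        tuned[category] = max(weight * (1.0 - fp_rate), floor)
    return tuned
```

For example, a category with weight 0.8, four hits, and two false-positive reports would drop to 0.4 under this rule, while categories with no reports keep their weight. The tuning runs offline, so the fast path only ever sees the resulting static weights.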
Why this split is the right one
Detection at the commit boundary needs determinism: same patch → same verdict, every time. Tuning needs intelligence: surfacing patterns no catalog entry has codified yet.
Mixing them on the hot path would mean a model call per commit, which would:
- Add seconds of latency to every git commit
- Cost real money per commit
- Make every verdict non-reproducible
Splitting them keeps slop poke boring, fast, and predictable. The
intelligence lives in the catalog. The catalog learns. You stay on
the fast path.