Why pre-commit is the right boundary
You could catch slop in a lot of places: in the editor as you type,
inside an LLM's review step, on a pull request, in CI, on the
production branch. Sloppoke runs against the staged diff right
before git commit. This is deliberate.
The cost graph
The cost of removing slop scales with how far down the pipeline you remove it:
flowchart LR
A["edit<br/>~$1"] --> B["staged diff<br/>~$1<br/>← slop poke here"]
B --> C["pull request<br/>~$10"]
C --> D["merged commit<br/>~$50"]
D --> E["production<br/>~$500"]
E --> F["new hire reads it<br/>'why is this here?'"]
style B fill:#1b3645,stroke:#6ee0d8,color:#ecf6f0
style E fill:#3a1a25,stroke:#ff85b1,color:#ecf6f0
style F fill:#3a1a25,stroke:#ff85b1,color:#ecf6f0
Numbers are illustrative. Direction is not. Each step right adds:
- A reviewer's attention
- A CI cycle
- A force-push to "clean up"
- Time between you and the original intent
Pre-commit is the last cheap step. Anything earlier requires either an editor plugin (per editor, per language) or hooking the LLM itself (works only when the LLM is one specific tool). Pre-commit works regardless of which assistant produced the code, which editor you edited it in, or which CI you run.
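A minimal sketch of what gating at this boundary can look like, assuming a Python script wired up as `.git/hooks/pre-commit`. It is not Sloppoke's actual interface: `gate`, `main`, and `placeholder_check` are hypothetical names, and the placeholder detector only flags one hard-coded phrase. The two facts it relies on are standard git behavior: `git diff --cached` prints the staged diff, and a pre-commit hook that exits nonzero aborts the commit.

```python
import subprocess
import sys
from typing import Callable, List


def staged_diff() -> str:
    """Return the staged diff, i.e. the changes `git commit` is about to record."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def placeholder_check(diff: str) -> List[str]:
    """Stand-in detector (assumption): flags added lines containing 'TODO: fix later'."""
    return [
        f"slop: {line[1:].strip()}"
        for line in diff.splitlines()
        if line.startswith("+")
        and not line.startswith("+++")
        and "TODO: fix later" in line
    ]


def gate(diff: str, check: Callable[[str], List[str]]) -> int:
    """Run the detector once over the diff and map its findings to an exit code."""
    findings = check(diff)
    for finding in findings:
        print(finding, file=sys.stderr)
    # Git aborts the commit when the pre-commit hook exits nonzero.
    return 1 if findings else 0


def main() -> int:
    """Entry point for the hook file, which would end with: raise SystemExit(main())"""
    return gate(staged_diff(), placeholder_check)
```

Nothing here depends on which assistant, editor, or CI system is in use; the only input is the diff git already has staged.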
Why not at the LLM layer
LLM-side guards are the obvious move and they're fine, but:
- They only cover that LLM
- They run during generation, which means the slop you do ship is by definition slop the guard missed
- They run on every token, not every commit, which is orders of magnitude more expensive
Pre-commit is one verdict per intent-to-ship. Cheap, decisive, vendor-neutral.
Why not in CI
CI is a fine second gate. The CI how-to walks through it. But by the time CI fires:
- The slop is already in the git history
- A reviewer has to read it
- A force-push is needed to clean it up
- Or the slop ships, because force-pushing got contentious
CI catches what slipped past the local gate. It doesn't replace it.
What gets sacrificed
Gating at pre-commit means the check must finish in milliseconds on every commit, which rules out a 10-second LLM call. The detector is therefore a deterministic pattern engine that returns in under 10 ms on a typical patch. That constraint shapes the catalog: every entry is mechanically detectable from the diff alone, with no model in the request path.
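To make "mechanically detectable from the diff alone" concrete, here is a rough sketch of a pattern scan over the added lines of a `git diff` style unified diff. The catalog entries, rule IDs, and the `Finding` type are invented for illustration and are not Sloppoke's real catalog or API. The sketch also assumes git's output format, with a `diff --git` header before each file.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical catalog: each entry is a rule ID plus a precompiled regex.
CATALOG = [
    ("narrating-comment", re.compile(r"#\s*(now|first|next),? we\b", re.IGNORECASE)),
    ("placeholder-todo", re.compile(r"\bTODO: implement\b")),
    ("swallowed-exception", re.compile(r"^\s*except\s*:\s*pass\s*$")),
]

# Hunk header, e.g. "@@ -10,2 +10,4 @@"; group 1 is the first new-file line number.
_HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


@dataclass(frozen=True)
class Finding:
    path: str
    line: int
    rule: str


def scan(diff: str) -> List[Finding]:
    """Match every added line against the catalog; no I/O and no model calls."""
    findings: List[Finding] = []
    path, lineno, in_hunk = "", 0, False
    for raw in diff.splitlines():
        if raw.startswith("diff --git "):
            in_hunk = False  # a new file section starts with headers, not content
            continue
        if not in_hunk and raw.startswith("+++ "):
            path = raw[6:] if raw.startswith("+++ b/") else raw[4:]
            continue
        hunk = _HUNK.match(raw)
        if hunk:
            lineno, in_hunk = int(hunk.group(1)), True
            continue
        if not in_hunk:
            continue
        if raw.startswith("+"):
            for rule, pattern in CATALOG:
                if pattern.search(raw[1:]):
                    findings.append(Finding(path, lineno, rule))
            lineno += 1
        elif raw.startswith(" "):
            lineno += 1  # context lines advance the new-file line counter
        # Removed lines ("-") do not exist in the new file, so the counter stays put.
    return findings
```

Because the scan is a single pass of precompiled regexes over the added lines, its cost grows with the size of the patch rather than the size of the repository, which is what keeps a check of this shape viable on every commit.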
The downside: there are slop classes that need actual reasoning to
catch (e.g. "this entire function duplicates one ten files away"). For
those, the NSED Orchestrator async backend
sharpens the same engine over time from your slop learn feedback.
You stay on the fast path.