GitHub - jbethune777/ninchi: Human Software Accountability


Does your team understand the code they ship?

Ninchi is an open-source developer accountability tool. When a developer opens a pull request, Ninchi analyzes the diff, generates a targeted comprehension question using an LLM, and posts a timed challenge directly in the PR. The developer answers in a web UI. The LLM grades their answer and updates the PR status.

The goal is not to punish developers or restrict AI-assisted coding — it's to ensure humans remain accountable for the software they ship.


How it works

Developer opens a PR
        ↓
Ninchi analyzes the diff
        ↓
LLM generates a comprehension question + rubric
        ↓
Challenge link posted as a PR comment
        ↓
Developer clicks → timed answer page
        ↓
LLM evaluates the answer
        ↓
PR check updated (pass / fail / informational)

Modes

| Mode     | PR impact                 | Use case                           |
|----------|---------------------------|------------------------------------|
| Casual   | None                      | Individual learning, zero friction |
| Tracking | None (always green)       | Observe before enforcing           |
| Blocking | Fails on wrong answer     | Accountability enforcement         |
| Strict   | Fail + tighter time limits | High-stakes codebases             |

New installs default to Tracking so you can see it in action before flipping on enforcement.
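
The modes above can be read as a small decision function. The sketch below illustrates that mapping and is not Ninchi's actual implementation: the `Mode` enum and the `pr_status` helper are hypothetical names, returning `None` for Casual is an assumption about what "no PR impact" means, and `success` / `failure` are GitHub commit status states.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    # Hypothetical names mirroring the four modes described above.
    CASUAL = "casual"
    TRACKING = "tracking"
    BLOCKING = "blocking"
    STRICT = "strict"


def pr_status(mode: Mode, passed: bool) -> Optional[str]:
    """Return the commit status state to report, or None to leave the PR untouched."""
    if mode is Mode.CASUAL:
        return None  # assumption: Casual reports nothing on the PR
    if mode is Mode.TRACKING:
        return "success"  # always green; the result is only observed
    # Blocking and Strict both fail the PR on a wrong answer. Strict also
    # tightens time limits, which this sketch does not model.
    return "success" if passed else "failure"
```

Under these assumptions, a wrong answer only changes the PR outcome in Blocking and Strict, which is why Tracking is a safe default for a first install.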


Try it in 60 seconds (no GitHub App needed)

The fastest way to see Ninchi work is to run just the LLM pipeline against a real diff:

# Install the core package
pip install -e core/

# Run against the built-in JWT auth sample diff
OPENAI_API_KEY=sk-... python scripts/demo.py

# Use a preset question (no LLM generation wait — good for testing pass/fail)
OPENAI_API_KEY=sk-... python scripts/demo.py --preset

# Run against your own recent changes
git diff HEAD~1 | OPENAI_API_KEY=sk-... python scripts/demo.py --stdin

# Use Anthropic Claude instead of OpenAI
NINCHI_LLM_MODEL=anthropic/claude-3-5-sonnet-20241022 \
ANTHROPIC_API_KEY=sk-ant-... \
python scripts/demo.py

Full self-hosted setup

What you'll need

  • Docker + Docker Compose
  • An OpenAI or Anthropic API key
  • A GitHub account (to create a GitHub App)
  • ngrok (to receive webhooks during local development)

1. Clone and configure

git clone https://github.com/ninchi-ai/ninchi
cd ninchi
cp .env.example .env

Edit .env — at minimum you need:

NINCHI_LLM_MODEL=openai/gpt-4o
OPENAI_API_KEY=sk-...
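# Most .env loaders (including Docker Compose) do not run shell commands,
# so paste the output of `openssl rand -hex 32` here rather than the command itself.
SECRET_KEY=<64-character hex string>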
APP_BASE_URL=http://localhost:3000

The GitHub App credentials are filled in after step 2.
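
Before starting the stack, it can help to confirm that the minimum settings are present. The sketch below is a hypothetical helper, not part of Ninchi: `missing_settings` and `REQUIRED` are illustrative names, and the parser only handles plain KEY=VALUE lines. It checks for OPENAI_API_KEY because the example above uses an OpenAI model; with an Anthropic model, ANTHROPIC_API_KEY would be the one to check instead.

```python
REQUIRED = ("NINCHI_LLM_MODEL", "OPENAI_API_KEY", "SECRET_KEY", "APP_BASE_URL")


def missing_settings(env_text: str, required=REQUIRED) -> list[str]:
    """Return the required keys that are absent or empty in .env-style text."""
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return [key for key in required if not values.get(key)]


# A file with the model set but an empty API key and nothing else.
sample = "NINCHI_LLM_MODEL=openai/gpt-4o\nOPENAI_API_KEY=\n"
print(missing_settings(sample))
# prints ['OPENAI_API_KEY', 'SECRET_KEY', 'APP_BASE_URL']
```

To check a real file, pass its contents, for example `missing_settings(open(".env").read())`.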

2. Create a GitHub App

See docs/github-app-setup.md for the full step-by-step guide. The short version:

  1. Go to https://github.com/settings/apps/new
  2. Set the webhook URL to your ngrok URL + /webhooks/github
  3. Grant Pull requests (read/write), Contents (read), and Commit statuses (read/write)
  4. Subscribe to Pull request events
  5. Generate a private key and save it as github-app.pem in the repo root
  6. Copy the App ID, Client ID, and Client Secret into .env
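
With the App subscribed to pull request events, the backend has to decide which webhook deliveries should start a challenge. The sketch below shows one way such a filter could look; it is not taken from Ninchi's code. `should_start_challenge` is a hypothetical name, and triggering only on the `opened` action is an assumption based on the flow described under "How it works". The `X-GitHub-Event` header value and the `action`, `pull_request`, and `repository` payload fields come from GitHub's webhook format.

```python
import json
from typing import Optional


def should_start_challenge(event_header: str, body: bytes) -> Optional[dict]:
    """Return PR details if this delivery is a newly opened pull request, else None.

    event_header is the value of the X-GitHub-Event request header.
    """
    if event_header != "pull_request":
        return None  # e.g. the "ping" delivery sent when the webhook is created
    payload = json.loads(body)
    if payload.get("action") != "opened":
        return None  # assumption: only newly opened PRs trigger a challenge
    return {
        "repo": payload["repository"]["full_name"],
        "number": payload["pull_request"]["number"],
        "head_sha": payload["pull_request"]["head"]["sha"],
    }
```

Returning `None` for everything else keeps unrelated deliveries, such as closed or edited pull requests, from queuing work.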

3. Start the stack

docker compose up --build

This starts Postgres, Redis, the FastAPI backend (with migrations), and the Celery worker.

Start the Next.js frontend separately for hot reload:

cd frontend && npm install && npm run dev

The UI is at http://localhost:3000. The API is at http://localhost:8000, with interactive docs at http://localhost:8000/docs.

4. Install the App on a repo and open a PR

Go to https://github.com/apps/YOUR-APP-NAME/installations/new, install it on a test repo, then open any pull request. A Ninchi challenge comment should appear within ~20 seconds.


Repo structure

ninchi/
├── core/           # The engine: diff analysis, LLM question generation, answer evaluation
├── backend/        # FastAPI API + Celery workers
├── frontend/       # Next.js: timed challenge UI
├── scripts/        # CLI demo tool
└── docs/           # GitHub App setup guide

The core/ package is the heart of Ninchi. It has no dependencies on the rest of the stack and can be used directly:

from ninchi_core import analyze_diff, generate_question, evaluate_answer
from ninchi_core.models import LLMConfig

config = LLMConfig(model="openai/gpt-4o", api_key="sk-...")
analysis = analyze_diff(my_diff_text)
question = generate_question(analysis, config)
result = evaluate_answer(question, developer_answer, config)
print(result.passed, result.score, result.feedback)

Run the tests

pip install -e "core/[dev]"
pytest core/tests/ -v

LLM support

Ninchi uses litellm under the hood, so any model it supports works. Set NINCHI_LLM_MODEL to any litellm model string:

NINCHI_LLM_MODEL=openai/gpt-4o
NINCHI_LLM_MODEL=anthropic/claude-3-5-sonnet-20241022
NINCHI_LLM_MODEL=ollama/llama3.1   # local model via Ollama
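
The model strings above follow a `provider/model` pattern. The sketch below splits such a string and looks up which API key variable goes with it. It is an illustration only: `api_key_env_for` and the `KEY_ENV` table are hypothetical, the OpenAI and Anthropic variable names come from the demo commands earlier in this README, and treating Ollama as needing no key is an assumption for a local model. It only handles the prefixed form used in this README.

```python
from typing import Optional

KEY_ENV = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "ollama": None,  # assumption: a local Ollama model needs no API key
}


def api_key_env_for(model: str) -> Optional[str]:
    """Return the name of the API key variable for a 'provider/model' string."""
    provider, sep, name = model.partition("/")
    if not sep or not name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    if provider not in KEY_ENV:
        raise ValueError(f"no key mapping known for provider {provider!r}")
    return KEY_ENV[provider]
```

For example, `api_key_env_for("anthropic/claude-3-5-sonnet-20241022")` returns `"ANTHROPIC_API_KEY"`, matching the variable set in the Claude demo command.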

Contributing

PRs welcome. A few pointers:

  • The prompts in core/ninchi_core/prompts.py are the most impactful thing to improve — question quality and evaluation fairness are the core product
  • core/tests/ has unit tests for the engine; add tests for any change to generation or evaluation logic
  • The difficulty calibration (matching question depth to the significance of the diff) is an ongoing area of improvement

License

MIT. See LICENSE.


Built by Ninchi. We use Ninchi on our own PRs.