CodeCraft
CodeCraft: an Educative + CodeCrafters hybrid with Docker-sandboxed, multi-language code grading.
The problem
Learning systems engineering by reading alone is a dead end: you only internalize Redis, Git, or a DNS server by building one. CodeCraft combines structured multi-lesson courses with hands-on 'build your own X' challenges that offer staged progression, automated grading, and instant feedback. The hard requirement underneath is to run arbitrary learner-submitted code on the server, in five languages, without letting it touch the network, the host, or another submission.
The approach
The platform is a Next.js 16 App Router monolith (React 19, TypeScript 5, Tailwind v4) with Prisma over SQLite for persistence, in-browser editing via Monaco, and GitHub OAuth through NextAuth. Content is filesystem-defined: 23 build-your-own challenges each carry a definition.json (304 stages, 855 grading test cases total) and 7 structured courses carry a course.json (59 modules, 617 lessons). Grading executes every submission in a throwaway Docker container; an Anthropic-SDK-backed 'What's wrong with my code?' panel sits next to the editor.
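To illustrate how filesystem-defined content can yield the totals quoted above, here is a minimal sketch in TypeScript. It is not the project's loader: the `stages` and `tests` field names, the `<root>/<challenge-slug>/definition.json` layout, and both function names are assumptions made for illustration, since the real definition.json schema is not shown here.

```typescript
import { existsSync, readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical shape: the real definition.json schema is not shown in this write-up.
interface StageDefinition {
  tests: unknown[];
}
interface ChallengeDefinition {
  stages: StageDefinition[];
}

// Assumed layout: <root>/<challenge-slug>/definition.json
function loadDefinitions(root: string): ChallengeDefinition[] {
  return readdirSync(root, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => join(root, entry.name, "definition.json"))
    .filter((file) => existsSync(file))
    .map((file) => JSON.parse(readFileSync(file, "utf8")) as ChallengeDefinition);
}

// Sum challenges, stages, and grading test cases across all definitions.
function tallyChallenges(defs: ChallengeDefinition[]) {
  let stages = 0;
  let testCases = 0;
  for (const def of defs) {
    stages += def.stages.length;
    for (const stage of def.stages) {
      testCases += stage.tests.length;
    }
  }
  return { challenges: defs.length, stages, testCases };
}
```

Under these assumptions, calling `tallyChallenges(loadDefinitions(root))` on the challenges directory would produce the challenge, stage, and test-case counts directly from the files on disk.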
Key decisions
The load-bearing tradeoff was isolation over convenience for code execution: each run spins up a fresh container with --rm, --network=none, a memory cap, --cpus=1, --pids-limit=64, a read-only filesystem, and a 10s timeout, across four purpose-built images. That is slower and more operationally complex than an in-process interpreter, but it's the only honest way to run untrusted code from the internet. Persistence stayed on SQLite via Prisma rather than a networked DB, a deliberate single-node simplification. The Docker socket is mounted into the app so that the app itself launches the sandbox containers.
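The flag set described above can be sketched as a small argument builder. This is an illustration under stated assumptions, not the project's runner: `SandboxOptions`, `buildSandboxArgs`, and `runSandboxed` are hypothetical names, the memory value is a parameter because the actual cap is not stated, and the 10s limit is assumed to be enforced from the host side via the `timeout` option of Node's `spawn`.

```typescript
import { spawn } from "node:child_process";

// Hypothetical options: image names and the memory cap value are not given in the write-up.
interface SandboxOptions {
  image: string; // one of the purpose-built sandbox images
  memory: string; // value for --memory, e.g. "256m" (assumed, not the real cap)
  command: string[]; // command to execute inside the container
}

// Build the `docker run` argument list with the isolation flags listed above.
function buildSandboxArgs(opts: SandboxOptions): string[] {
  return [
    "run",
    "--rm", // remove the container once it exits
    "--network=none", // no network access
    `--memory=${opts.memory}`, // memory cap
    "--cpus=1", // at most one CPU
    "--pids-limit=64", // bound the number of processes
    "--read-only", // read-only root filesystem
    opts.image,
    ...opts.command,
  ];
}

// Assumption: the timeout is applied by the host. `timeout` makes Node kill the
// docker CLI process after timeoutMs; a real runner may also need to stop the
// container explicitly.
function runSandboxed(opts: SandboxOptions, timeoutMs = 10_000) {
  return spawn("docker", buildSandboxArgs(opts), { timeout: timeoutMs });
}
```

How the submission's source reaches the container (a mount, stdin, or a baked-in path) is not covered here and is left out of the sketch.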
What broke
Two honest notes. First, the README prose had drifted from ground truth: it claimed '19 challenges' while the filesystem holds 23 definition.json files, so every number here comes from parsing the files on disk, not the README. Second, CI was red before it was green: runs earlier on the day of the fix failed before the suite was brought to passing.
Outcome
The codebase is substantial and real: ~50,591 lines of TS/TSX, 36 Next.js API route handlers, and 123 commits, with a Prisma schema modelling the full learner platform (progress, submissions, streaks, certificates). Its own test suite runs and passes (123 tests across 9 files via Vitest), kept honest by GitHub Actions CI (typecheck, build, tests), with the latest run on main green. Production packaging exists as a docker-compose stack: the Next.js app, a one-shot sandbox-image builder, a Caddy auto-TLS reverse proxy with HSTS, and a Watchtower watchdog.
Untrusted code runs in a throwaway, network-isolated container per submission, across gcc · go · python · node images. Counts are verified from the filesystem, not the README.