Deleted 24,000 Lines, Added a Python IDE, and Made AI Grade Your Code

Big update.

Since last time, the site went from 27 copy-pasted lesson pages to one dynamic page, got a real Python editor that runs in the browser, and now has an AI that checks your practice answers and gives you hints.

The Great Deletion

Every lesson used to be its own 907-line HTML file. 27 of them. All basically identical.

Changing one button meant changing it 27 times, which I was obviously never going to do.

So I merged them into a single lesson.html that reads the unit and lesson from the URL (for example ?unit=2&lesson=3) and dynamically imports just that lesson’s data.js.

29 files changed, 25 insertions(+), 24498 deletions(-)

Deleted 24,498 lines! Best commit ever.
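
For the curious, here is roughly what the one-page approach boils down to. This is a minimal sketch, not the site's actual code: the function names and the folder layout in lessonDataPath are made up for illustration, and only the ?unit=...&lesson=... query format and the data.js file name come from the real thing.

```typescript
// Which lesson the URL is asking for.
interface LessonRef {
  unit: number;
  lesson: number;
}

// Parse "?unit=2&lesson=3" into numbers. Missing, non-numeric, or
// non-positive values give null instead of a half-valid result.
function parseLessonRef(search: string): LessonRef | null {
  const params = new URLSearchParams(search);
  const unit = Number(params.get("unit"));
  const lesson = Number(params.get("lesson"));
  const valid = (n: number): boolean => Number.isInteger(n) && n > 0;
  return valid(unit) && valid(lesson) ? { unit, lesson } : null;
}

// Build the module path. This folder layout is an assumption.
function lessonDataPath(ref: LessonRef): string {
  return `./lessons/unit${ref.unit}/lesson${ref.lesson}/data.js`;
}

// In the browser this would be called with window.location.search.
async function loadLesson(search: string): Promise<unknown> {
  const ref = parseLessonRef(search);
  if (ref === null) {
    throw new Error("Unknown lesson");
  }
  // Dynamic import: only the requested lesson's data gets fetched.
  return import(lessonDataPath(ref));
}
```

Validating the numbers before building the path means a mistyped URL fails with a clear error instead of a confusing failed import.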

The IDE

Next, I replaced the static code examples with an actual editor (CodeMirror) plus Python running in the browser through Pyodide.

No server needed: Pyodide runs the Python interpreter inside your browser, so the code executes on your own machine.
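
Here is a rough sketch of what a Run button can do with Pyodide: run the code, collect whatever it prints, and catch the error if it crashes. The PythonRuntime interface only declares the two Pyodide methods the sketch uses (setStdout with a batched handler, and runPython), so it can be tried with a stand-in object. The names runAndCapture and RunResult are invented here and are not the site's real code.

```typescript
// The subset of the Pyodide API this sketch relies on.
interface PythonRuntime {
  setStdout(options: { batched: (line: string) => void }): void;
  runPython(code: string): unknown;
}

interface RunResult {
  output: string;
  error: string | null;
}

// Run the learner's code and collect everything it prints.
function runAndCapture(py: PythonRuntime, code: string): RunResult {
  const lines: string[] = [];
  // The batched handler is called once per printed line.
  py.setStdout({
    batched: (line: string): void => {
      lines.push(line);
    },
  });
  try {
    py.runPython(code);
    return { output: lines.join("\n"), error: null };
  } catch (err) {
    // Python exceptions surface as JavaScript errors.
    // Keep whatever was printed before the crash.
    const message = err instanceof Error ? err.message : String(err);
    return { output: lines.join("\n"), error: message };
  }
}
```

With the real library, py would be the object returned by awaiting Pyodide's loadPyodide(), and the RunResult is what gets shown under the editor.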

Then I added it to the practice questions too: every question gets its own mini IDE with Run + Save, and saves go to Supabase so your code is still there when you come back.

This took way more bug fixing than expected. There was a cursed '\n' error, the editor kept breaking when switching themes, and long lines ran off the screen until I added line wrapping.

Like 5 commits of just fixing the editor (help me).

The AI Checker

The big one. There is now a Check button next to Run that has gpt-4o-mini grade your code:

✗ Almost! Your scoreboard only prints one player.
  Hint: you need a second print() line for the second player.

The site is fully static, so there’s no backend of my own to keep an API key in, and putting the key in the frontend would let the whole internet spend my credits.

Instead, the code and its output get sent to a Supabase Edge Function, which hides the key, checks that you’re logged in, and asks OpenAI for a strict JSON verdict: passed, feedback, hint.
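
The "strict JSON verdict" part is worth a sketch, because a model can always reply with something unexpected. Something like the following rejects any reply that is not an object with a boolean passed and string feedback and hint. The three field names are the real ones from above. The function names, the ✓ mark for a pass, and hiding the hint on a pass are my assumptions for illustration.

```typescript
interface Verdict {
  passed: boolean;
  feedback: string;
  hint: string;
}

// Parse the model's reply and refuse anything without the expected fields.
function parseVerdict(raw: string): Verdict {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error("Verdict is not valid JSON");
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("Verdict must be a JSON object");
  }
  const { passed, feedback, hint } = data as Record<string, unknown>;
  if (
    typeof passed !== "boolean" ||
    typeof feedback !== "string" ||
    typeof hint !== "string"
  ) {
    throw new Error("Verdict is missing passed, feedback, or hint");
  }
  return { passed, feedback, hint };
}

// Turn a verdict into the two-line message shown next to the editor.
function formatVerdict(v: Verdict): string {
  const mark = v.passed ? "✓" : "✗";
  if (v.passed || v.hint === "") {
    return `${mark} ${v.feedback}`;
  }
  return `${mark} ${v.feedback}\n  Hint: ${v.hint}`;
}
```

Validating on the server side means the frontend only ever sees a well-formed verdict or a clear error, never a half-parsed reply.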

I compared Gemini and Ollama first, but went with OpenAI because I have $5 of credits left from a previous grant.

And since I don’t want to burn through those credits, checks are rate limited: 5 per minute and 500 per day per user, plus 1,400 per day across the whole app.
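
Those three limits can be sketched as one small class. Big caveats: this version keeps its counters in memory and treats "per day" as a rolling 24 hours, and both are assumptions made to keep the example short. A real Edge Function would need to store the counts somewhere persistent, such as a database table, since it can't count on keeping memory between requests. CheckLimiter and tryCheck are invented names.

```typescript
const PER_MINUTE = 5;
const PER_USER_PER_DAY = 500;
const PER_APP_PER_DAY = 1400;

const MINUTE_MS = 60 * 1000;
const DAY_MS = 24 * 60 * MINUTE_MS;

class CheckLimiter {
  // Timestamps (ms) of each user's accepted checks in the last 24 hours.
  private byUser = new Map<string, number[]>();
  // Timestamps of every accepted check in the last 24 hours.
  private all: number[] = [];

  // Returns null if the check is allowed, otherwise the reason it was refused.
  tryCheck(userId: string, now: number): string | null {
    // Drop anything older than 24 hours before counting.
    const dayAgo = now - DAY_MS;
    this.all = this.all.filter((t) => t > dayAgo);
    const mine = (this.byUser.get(userId) ?? []).filter((t) => t > dayAgo);
    this.byUser.set(userId, mine);

    const lastMinute = mine.filter((t) => t > now - MINUTE_MS).length;
    if (lastMinute >= PER_MINUTE) return "per-minute limit reached";
    if (mine.length >= PER_USER_PER_DAY) return "daily user limit reached";
    if (this.all.length >= PER_APP_PER_DAY) return "daily app limit reached";

    // Only accepted checks count against the limits.
    mine.push(now);
    this.all.push(now);
    return null;
  }
}
```

Checking the per-user limits before the app-wide one means a single spammy user gets their own error message instead of being told the whole app is out of checks.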

Next up: grading notes per question so the AI knows exactly what each task wants, and maybe an admin dashboard to watch usage.