SideLine - Match Recorder

  • 3 Devlogs
  • 9 Total hours

A fixed-camera AI system that turns one cheap recording of a school match into coach analytics, event reels, and a highlight clip for every player, running fully on a local laptop with no subscriptions.

27m 6s logged

The pivot — drone → courtside gimbal (and a lesson)

Here’s where the project hit its biggest wall, and I want to log it honestly because the recovery taught me more than any milestone.

I was about to spec a drone. Then I actually researched drone law in Qatar and found out it’s one of the strictest regimes in the world: every single flight needs a QCAA permit (owning a drone gives you no right to fly it), and there’s a privacy law (Penal Code Art. 333) against filming people without consent, with penalties of up to two years. For a system whose whole job is filming school matches full of players and spectators, that’s not a paperwork hurdle, it’s a wall. My original picture, a drone autonomously filming matches, was effectively dead.

The lesson, written down so I don’t forget it: research the constraints of the real world before you fall in love with a plan. I’d validated the software to machine precision, but I almost skipped checking whether the deployment was even legal. Check the boring stuff first.

So I swapped the drone for a 3-DOF gimbal on a rail: a motorised camera mount (pan / tilt / roll) that slides along a track, parked courtside. No permit. Same filming-consent terms my existing fixed cameras already navigate. And it runs every match, not the one-off a permitted drone flight would’ve been. It’s more deployable than the drone ever was.

A gimbal is also categorically different from just pointing a camera, for one optical reason that isn’t arguable: a fixed camera can only ever crop the pixels it already has, but a gimbal physically moves the lens. That gives real parallax, a real push-in that changes perspective, footage that simply doesn’t exist in a static feed. The human directs the intent (“orbit that player,” “push in”) and the system, which understands the scene, executes the move. That’s the thing a fixed camera can’t do, and it survived the pivot completely intact.

M9 — the gimbal’s brain (software)

Stardance just opened a hardware track, but this milestone was deliberately software: adapt the system to the gimbal before the parts arrive, so the maths is proven when the hardware lands.

The beautiful part: because I’d kept the architecture cleanly separated for nine milestones, the drone→gimbal pivot touched exactly one layer. The gesture engine, the scene tracking, the intent commands, the flight-path engine, the live bridge: all unchanged. The only new code is a converter that takes an ideal 3D camera pose and works out what the gimbal can physically do with it: pan/tilt/roll angles to aim, plus a position along the rail.

And it stays honest, which is the whole point. A gimbal at a fixed pivot can rotate perfectly but can’t translate, so a true orbit (circling a subject) is impossible without the camera leaving its rail. The converter proves this with numbers instead of faking it: a follow shot aims with 0.00° error and zero positional gap (the rig does it cleanly), while an attempted orbit still aims perfectly but reports a 9.05-unit gap, flagged out-of-reach on every frame. The system refuses to pretend it can do something the hardware can’t.
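
To make the converter idea concrete, here is a minimal sketch of that kind of calculation, not the project’s actual code. It assumes a straight rail given by two endpoints, a z-up coordinate system, and roll fixed at zero; `solve_gimbal`, `GimbalCommand`, and the example coordinates are all hypothetical names and values chosen for illustration. It projects the ideal camera position onto the rail, aims from there at the target, and reports how far the reachable position is from the ideal one.

```python
import math
from dataclasses import dataclass


@dataclass
class GimbalCommand:
    rail_pos: float   # distance along the rail from its start
    pan_deg: float    # rotation about the vertical (z) axis
    tilt_deg: float   # elevation above the horizontal plane
    gap: float        # distance between the ideal and the reachable camera position
    in_reach: bool    # True when the gap is within tolerance


def solve_gimbal(ideal_pos, target, rail_start, rail_end, tolerance=1e-6):
    """Map an ideal camera position and look-at target onto a rail-mounted gimbal.

    Assumption: the rail is a straight segment and roll stays at zero.
    """
    direction = [e - s for s, e in zip(rail_start, rail_end)]
    length_sq = sum(d * d for d in direction)

    # Project the ideal position onto the rail and clamp to the segment ends.
    t = sum((p - s) * d for p, s, d in zip(ideal_pos, rail_start, direction)) / length_sq
    t = max(0.0, min(1.0, t))
    cam = [s + t * d for s, d in zip(rail_start, direction)]

    # Positional gap: how far the rig is from where the shot wanted the camera.
    gap = math.dist(ideal_pos, cam)

    # Aim from the reachable position at the target.
    vx, vy, vz = (tg - c for tg, c in zip(target, cam))
    pan = math.degrees(math.atan2(vy, vx))
    tilt = math.degrees(math.atan2(vz, math.hypot(vx, vy)))

    return GimbalCommand(t * math.sqrt(length_sq), pan, tilt, gap, gap <= tolerance)


rail = ((0.0, 0.0, 1.0), (10.0, 0.0, 1.0))
subject = (4.0, 5.0, 1.0)

# Follow shot: the ideal camera position already lies on the rail.
follow = solve_gimbal((4.0, 0.0, 1.0), subject, *rail)

# Orbit attempt: the ideal camera position is far off the rail.
orbit = solve_gimbal((4.0, 8.0, 1.0), subject, *rail)

print(follow.in_reach, follow.gap)
print(orbit.in_reach, orbit.gap)
```

In this toy setup the follow shot has zero gap and the orbit attempt is flagged out of reach, while both still aim at the subject. The numbers are illustrative only and are not meant to reproduce the 9.05-unit figure above.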

Honest problems that will probably come up: this is geometry on a perfect simulator. The real gimbal will have motor speed limits, inertia, and backlash; it won’t snap to an angle instantly. That’s the hardware phase’s problem, and I’m not pretending the clean sim is the real rig.
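
As one small illustration of the first of those limits, a simulator can cap how far an axis moves per time step. This is a hedged sketch only: `step_toward`, the 60°/s limit, and the 0.1 s time step are made-up values, and inertia, backlash, and angle wrap-around are not modelled.

```python
def step_toward(current_deg, target_deg, max_rate_deg_s, dt_s):
    """Move one axis toward its target without exceeding a speed limit."""
    max_step = max_rate_deg_s * dt_s
    error = target_deg - current_deg
    # Clamp the requested change to what the motor can do in one time step.
    step = max(-max_step, min(max_step, error))
    return current_deg + step


# Hypothetical axis: 60 degrees per second, updated every 0.1 s.
angle = 0.0
steps = 0
while abs(90.0 - angle) > 1e-9:
    angle = step_toward(angle, 90.0, max_rate_deg_s=60.0, dt_s=0.1)
    steps += 1

print(steps, angle)
```

With these numbers the axis needs many updates to reach a 90° target instead of snapping there in one frame, which is the gap between the clean sim and a real motor that the paragraph above describes.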

Up next: the glove parts are shipping. When they land, the build goes physical: breadboard, sensors, solder. Then it’s on to the gimbal.

@Osmosis

1h 59m 29s logged

Hi, I’m Aarav Sharma, and I want to direct a drone with my hands: not just piloting it, but actually directing it.

Introducing AirLine: direct a drone with your hands, don’t pilot it. Most gesture drones map your hand to a flight stick (tilt to bank, raise to climb) — you become the joystick. AirLine instead reads cinematic intent (“follow that player,” “orbit the subject”) and figures out the flight itself, because it understands the scene. It’s a branch off my capstone SideLine (an AI sports-analytics system that already tracks “the player with the ball”), reusing its vision core untouched. Software first, glove I build myself next, drone last.

Devlog:

M1: AirLine runs SideLine’s tracker through a clean seam, vision core untouched. ~19 FPS.
M2: Lock onto one subject. Found track IDs fragment — 293/540 frames “lost.” Quantified the re-ID problem on footage, not a $400 drone.
M3: Virtual camera follows like an operated shot, drifts to wide when the target’s lost. First time it looked real.
M4: Webcam gestures (MediaPipe) — but it conflicts with my vision stack. Setup stopped instead of breaking it. Best decision of the day.
M5: Quarantined gestures in their own environment. First real-hand numbers: mostly 10/10 after fixing bugs unit tests missed.
M6: Two-process live system — my hand directs the shot in real time. ~297ms latency, only 35ms of it transport.
M7: First 3D camera move: orbit on a tiltable plane, proven by geometric invariants to machine precision (~4e-16).
M8: Added push-in, pull-out, dolly. Full shot vocabulary done. 146 tests, vision core never touched.
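
The M7 idea of proving an orbit by geometric invariants can be sketched roughly like this. It is an illustration under assumptions, not the project’s test suite: the orbit plane is tilted about the x-axis, and `orbit_point` and `orbit_invariant_errors` are hypothetical names. Two invariants are checked: every sample stays at the orbit radius from the subject, and every sample stays in the tilted plane.

```python
import math


def orbit_point(center, radius, angle, tilt):
    """Point on a circle around `center`, in a plane tilted by `tilt` radians about x."""
    x = radius * math.cos(angle)
    y = radius * math.sin(angle) * math.cos(tilt)
    z = radius * math.sin(angle) * math.sin(tilt)
    return (center[0] + x, center[1] + y, center[2] + z)


def orbit_invariant_errors(center, radius, tilt, samples=360):
    """Return the worst radius error and worst out-of-plane error over one orbit."""
    # Unit normal of the tilted orbit plane.
    normal = (0.0, -math.sin(tilt), math.cos(tilt))
    worst_radius = 0.0
    worst_plane = 0.0
    for k in range(samples):
        p = orbit_point(center, radius, 2.0 * math.pi * k / samples, tilt)
        offset = [pi - ci for pi, ci in zip(p, center)]
        worst_radius = max(worst_radius, abs(math.dist(p, center) - radius))
        worst_plane = max(worst_plane, abs(sum(o * n for o, n in zip(offset, normal))))
    return worst_radius, worst_plane


radius_err, plane_err = orbit_invariant_errors((3.0, 4.0, 0.0), 5.0, math.radians(30.0))
print(radius_err, plane_err)
```

Both errors should come out at floating-point rounding level, which is the sense in which a move can be checked “to machine precision”. The exact values depend on the inputs and are not claimed to match the ~4e-16 figure above.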

Next: hardware. That means actually building the gesture glove by hand and integrating a drone.


6h 4m 13s logged

Hi, I’m Aarav Sharma. I started this before Stardance, so this first devlog covers everything built so far (about 6 hours of work) in one go. SideLine is an AI system that films a school football or basketball match from a single fixed camera and turns that one recording into three different things: tactical analytics for coaches, event highlight reels for the school socials, and a personal highlight clip for every player, not just the goal-scorers. It runs entirely on a local laptop with no subscriptions, which matters because the commercial equivalents (Veo, Pixellot) cost hundreds up front plus monthly fees, putting them out of reach for a normal school.
The computer vision (both football and basketball). The core is one shared pipeline that detects players and the ball, tracks them across the match, works out which team each player is on, and maps the camera view onto real court coordinates so distances and positions actually mean something in metres. Getting each piece trustworthy was the bulk of the time, and a lot of it was fighting genuinely hard problems. The basketball ball tracker kept locking onto players’ heads (round, ball-sized, and constantly right next to the ball during a shot), and I had to prove that no shape or size rule could ever separate them before fixing it with a small appearance classifier I trained on a few thousand hand-sorted image crops (head rejection went from hopeless to 98.5%). Team detection on similar kits kept collapsing into one cluster until I switched to appearance embeddings, which also caught a possession stat that was confidently pointing at the wrong team entirely.
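A common way to map a camera view onto court coordinates is a planar homography fitted from four point correspondences, such as four clicked court corners. The sketch below shows that general technique in plain Python and is not necessarily how SideLine does it: the pixel coordinates are invented, the 28 m by 15 m court size is an assumed example, and `fit_homography`, `to_court`, and `solve_linear` are hypothetical names.

```python
def solve_linear(a, b):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_homography(pixels, court):
    """Fit the 3x3 homography (row-major, last entry fixed to 1) from 4 point pairs."""
    a, b = [], []
    for (x, y), (u, v) in zip(pixels, court):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b) + [1.0]


def to_court(h, x, y):
    """Map one pixel to court coordinates in metres."""
    w = h[6] * x + h[7] * y + h[8]
    return ((h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w)


# Invented pixel positions of the four court corners, and an assumed 28 m x 15 m court.
pixel_corners = [(200, 600), (1080, 600), (900, 200), (380, 200)]
court_corners = [(0.0, 0.0), (28.0, 0.0), (28.0, 15.0), (0.0, 15.0)]

h = fit_homography(pixel_corners, court_corners)
print(to_court(h, 640, 400))
```

Once the mapping is fitted, any tracked foot position in pixels can be converted to metres, which is what makes distance-covered and heatmap numbers meaningful.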
The three deliverables. Coach analytics come out as a one-page report plus an annotated tactical video (heatmaps, distance covered, possession, formation, territory), with every number honestly labelled as either validated or derived. Event highlights are auto-detected and ranked so the best moments float to the top instead of flooding whoever is editing. Player highlights are built to be inclusive by construction: every player who steps on the court gets footage, using their on-ball moments where they have them, which is the part no commercial product bothers with and the part a school actually cares about.
The app. All of that is wrapped in a working local web app, “Sideline, Match Studio”, with a dark cinematic interface a non-technical teacher can run start to finish: upload a match, click the four court corners on a freeze-frame, pick which deliverables they want, and download the results. The backend is a FastAPI server that runs the heavy AI as background jobs (a full match takes around 3 hours, so it processes while you walk away and can survive a restart), and it wraps my existing pipeline scripts rather than rewriting them. The whole thing is tested end to end with 62 passing checks, and I have already stress-tested it on a full 47-minute match to see what breaks at real scale.
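One simple way a long-running job queue can survive a restart is to persist job state to disk and re-queue anything that was interrupted. The sketch below illustrates that idea with the standard library only; it does not use FastAPI and is not the app’s actual backend. `JobStore`, the status strings, and the file names are all hypothetical.

```python
import json
from pathlib import Path


class JobStore:
    """Tiny JSON-backed job registry that re-queues interrupted jobs on startup."""

    def __init__(self, path):
        self.path = Path(path)
        self.jobs = json.loads(self.path.read_text()) if self.path.exists() else {}
        # A job still marked "running" at startup was cut off by a restart.
        for job in self.jobs.values():
            if job["status"] == "running":
                job["status"] = "queued"
        self._save()

    def _save(self):
        # Write to a temp file first so a crash mid-write cannot corrupt the store.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(self.jobs))
        tmp.replace(self.path)

    def add(self, job_id, video):
        self.jobs[job_id] = {"video": video, "status": "queued"}
        self._save()

    def set_status(self, job_id, status):
        self.jobs[job_id]["status"] = status
        self._save()

    def queued(self):
        return [job_id for job_id, job in self.jobs.items() if job["status"] == "queued"]
```

A worker would mark a job "running" when it starts and "done" when it finishes. If the process dies in between, creating a new `JobStore` on the same file puts that job back in the queue.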
Next stop is real deployment at my school, which is the one thing I cannot fully validate until I have footage from our own courts and teams.
