ompanem - Stardance

30m 1s logged

Hi everyone! My last devlog was about how the multi scene stitching works. This one is about a bug that stopped Codecast from working for anyone but me, which took a while to track down because the app worked totally fine on my own computer the whole time.

The problem

Someone downloaded the Codecast .exe, opened it, hit export, and no video file came out at all. No video, no message, nothing. On my computer the exact same .exe made a video every single time, so at first I had no idea what was different on their end.

What I thought it was at first

Codecast uses a tool called ffmpeg to actually build the video out of the frames and the narration, so my first guess was that they just hadn’t installed ffmpeg. That seemed like the obvious answer. So I had them install it, but it still didn’t work. I even tried it on a second computer that had ffmpeg installed and set up correctly, and it broke there too. So the issue wasn’t a missing install, it was a real bug in my app.

How I actually found it

The annoying part is that I built the .exe with the --windowed flag, which hides the console, so when it crashed there was nothing to see; it just quietly did nothing. So I rebuilt it without --windowed so the black console window would stay open and show me whatever went wrong. That’s when the actual error finally showed up: a FileNotFoundError when the app tried to run ffprobe. ffprobe ships alongside ffmpeg and reads information about a media file, like its duration and file size, while ffmpeg itself is what stitches the frames and audio into a video.

The strange thing is that ffmpeg was installed and on the system PATH, so the computer should have been able to find it. After messing around with it I figured out that it only broke when running as the .exe, never when I ran the plain python ui.py version. As far as I could tell, the bundled .exe wasn’t resolving tools through the system PATH the same way the plain script did, so even with ffmpeg sitting right there on the computer, it couldn’t find ffprobe.
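A quick way to check what a process can actually see is to ask Python to resolve the tool name itself. This is a minimal sketch, not code from Codecast; describe_tool and run_tool are made-up names. shutil.which searches the PATH of the current process, so running it from both python ui.py and a console build of the .exe would show whether the two see the same thing.

```python
import shutil
import subprocess


def describe_tool(name):
    """Report where a command-line tool resolves from, or that it is missing."""
    found = shutil.which(name)  # searches the PATH this process actually sees
    if found is None:
        return f"{name}: not found on PATH"
    return f"{name}: {found}"


def run_tool(args):
    """Run a command, turning a missing executable into a readable message."""
    try:
        return subprocess.run(args, capture_output=True, text=True)
    except FileNotFoundError:
        # Raised when the program named in args[0] cannot be located.
        print(f"Could not find executable: {args[0]}")
        return None


if __name__ == "__main__":
    print(describe_tool("ffmpeg"))
    print(describe_tool("ffprobe"))
```

Catching FileNotFoundError like this also means a --windowed build could show the message in a dialog or write it to a log file instead of failing silently.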

How I solved it

Instead of relying on the user’s computer having ffmpeg somewhere it could find, I bundled ffmpeg and ffprobe directly inside the .exe so it carries its own copy and never has to go searching the system at all.

The tricky bit is that the code has to find these tools whether it’s running as plain Python code or as the packed .exe, and the bin folder location is different in these two scenarios. When the .exe runs, PyInstaller extracts all the bundled files into a temporary folder and stores that folder’s location in sys._MEIPASS. So I made a little helper function that checks for that:

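import os
import sys

def resource_path(filename):
    # PyInstaller sets sys._MEIPASS to the folder it unpacked the bundle into.
    if hasattr(sys, "_MEIPASS"):
        base = sys._MEIPASS
    else:
        # Running as plain Python: look next to this script.
        base = os.path.dirname(os.path.abspath(__file__))
    return os.path.join(base, "bin", filename)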

So if sys._MEIPASS exists, the application is running as the .exe and I look in the temp folder, and if it doesn’t, then it’s running as normal code and I just look next to my script. Either way it finds the bin folder where the two necessary tools live.
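To show where the resolved path ends up being used, here is a rough sketch of asking ffprobe for a file’s duration. The function names are hypothetical and this is not Codecast’s actual code; the ffprobe options are standard ones for printing just the duration in seconds.

```python
import subprocess


def ffprobe_duration_command(ffprobe_path, media_path):
    """Build the ffprobe call that prints only the duration in seconds."""
    return [
        ffprobe_path,
        "-v", "error",                       # hide everything except real errors
        "-show_entries", "format=duration",  # ask for the container duration only
        "-of", "default=noprint_wrappers=1:nokey=1",  # print the bare number
        media_path,
    ]


def get_duration(ffprobe_path, media_path):
    """Return the media duration in seconds as a float."""
    result = subprocess.run(
        ffprobe_duration_command(ffprobe_path, media_path),
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())
```

With the helper above, a call might look like get_duration(resource_path("ffprobe.exe"), "voice.mp3"), assuming the bundled file is named ffprobe.exe and the narration is saved as voice.mp3.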

Then I built the exe with this:

pyinstaller --onefile --windowed --add-data "bin;bin" ui.py

The --add-data "bin;bin" part is what tells PyInstaller to pack the bin folder into the .exe too.
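One detail worth knowing: the ; in "bin;bin" is the Windows form of the separator. Below is a small sketch that builds the same command from Python, assuming PyInstaller accepts the platform’s path-list separator here (worth checking against the docs for your PyInstaller version). Both function names are made up for illustration.

```python
import os


def add_data_arg(source, dest):
    """Join source and destination with the platform's path-list separator."""
    # os.pathsep is ";" on Windows and ":" on Linux and macOS.
    return f"{source}{os.pathsep}{dest}"


def pyinstaller_command(script, data_dir="bin"):
    """Assemble the build command as an argument list."""
    return [
        "pyinstaller", "--onefile", "--windowed",
        "--add-data", add_data_arg(data_dir, data_dir),
        script,
    ]
```

Passing the list to subprocess.run would run the same build as the command above without hard-coding the separator.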

How I tested it

To really make sure it was standalone, I uninstalled ffmpeg from another laptop entirely, so there was no ffmpeg anywhere on the system, and then ran the .exe file. It still made the video, which means the user doesn’t need to install ffmpeg themselves to use the application.

Why this matters

The reason this matters is mainly convenience. Before this update, the user would have to go to my GitHub repository, install ffmpeg themselves, and only then use my application, which could be time-consuming. Now you just need to download the .exe file and run it.

@ompanem · 2 days ago

Ship

@ompanem on Frictionless · 3 days ago

I made Codecast, a desktop app that turns code and narration into a synced, narrated coding tutorial video. The most challenging part was getting multiple scenes to stay perfectly in sync with their narration and merging them into one final video with ffmpeg. I'm proud that I built the whole thing from scratch and understand every part of what I built, from the typing animation to the UI. To test it, watch the demo video to see the features, or clone the repo, install the requirements plus ffmpeg, and run python ui.py (full steps in the README).

5 devlogs
6h
Frictionless

@ompanem on Frictionless · 3 days ago

16m 36s logged

My last devlog was about the UI. This one is about how the multi-scene part actually works under the hood, since this was one of the more difficult parts to implement in my project.

The problem

I wanted one video to be able to have multiple scenes (multiple code snippets and narrations) instead of just one. The hard part about this is that each scene’s typing animation has to stay perfectly synced with its own narration audio.

How I solved it

Instead of trying to build one giant video and line up all the audio across scenes, I make each scene into its own small video first. Each scene gets its frames timed to its own narration, then gets turned into a scene_0.mp4, scene_1.mp4, and so on.
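As an illustration of the per-scene step, the ffmpeg call for one scene could look roughly like this. It is only a sketch: the frame file pattern, the voice_{i}.mp3 name, the 30 fps default, and the codec choices are all assumptions, not necessarily what Codecast does.

```python
def scene_command(index, fps=30):
    """Build an ffmpeg call that turns one scene's frames and voice into an mp4."""
    return [
        "ffmpeg", "-y",
        "-framerate", str(fps),                    # input rate for the image sequence
        "-i", f"frames_{index}/frame_%04d.png",    # zero-padded frame files
        "-i", f"voice_{index}.mp3",                # this scene's narration
        "-c:v", "libx264", "-pix_fmt", "yuv420p",  # widely playable H.264 video
        "-c:a", "aac",
        "-shortest",                               # stop at the shorter input
        f"scene_{index}.mp4",
    ]
```

Because every scene is encoded with the same settings, the later concat step can copy the streams without re-encoding.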

Then at the very end, all those mini-videos get stitched together into the final video using ffmpeg’s concat feature:

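import subprocess

# List every scene video, in order, for ffmpeg's concat demuxer.
with open("scenes.txt", "w") as f:
    for i in range(len(scenes)):
        f.write(f"file 'scene_{i}.mp4'\n")

# output_path is the variable holding the final video's location, not a literal name.
subprocess.run([
    "ffmpeg", "-y", "-f", "concat", "-safe", "0",
    "-i", "scenes.txt", "-c", "copy", output_path
])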

ffmpeg’s concat needs a text file listing every video to join, so I made a scenes.txt file with each line being one scene and then I told ffmpeg to read it.

A bug I hit

At first all the scenes were saving their frames into the same folder, so scene 2 was overwriting scene 1’s frames and the final video only showed the last scene. I fixed it by giving each scene its own frame folder (frames_0, frames_1, etc.) so they don’t clash.
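A minimal sketch of that fix, with hypothetical helper names: the scene index goes into the folder name, so two scenes can never write to the same frame file.

```python
import os


def frame_dir_name(index):
    """Folder name for one scene's frames, e.g. frames_0."""
    return f"frames_{index}"


def frame_path(scene_index, frame_index):
    """Path of a single zero-padded frame inside its scene's folder."""
    return os.path.join(frame_dir_name(scene_index), f"frame_{frame_index:04d}.png")


def ensure_frame_dir(index):
    """Create the scene's frame folder if it does not exist yet."""
    os.makedirs(frame_dir_name(index), exist_ok=True)
```

Frame 1 of scene 0 and frame 1 of scene 1 now land in different folders instead of overwriting each other.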

Why this approach is good

Because each scene is already a finished, synced video on its own, stitching them just plays them back to back. There’s no complicated math needed to line up multiple audio clips with one video. Instead, it’s just joining videos, which I can do really easily.

Cleanup

Since this makes a bunch of temporary files (mini-videos, frame folders, voice files), I made the program delete them all automatically after the final video is built, so you’re just left with the one video you actually want.
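The cleanup could be sketched like this. The file names follow the ones used in this devlog where they exist (frames_{i}, scene_{i}.mp4, scenes.txt); voice_{i}.mp3 and final.mp4 are assumptions, and demo() only exists to show the effect in a scratch folder.

```python
import os
import shutil
import tempfile


def cleanup(root, scene_count):
    """Delete the per-scene temp files once the final video exists."""
    for i in range(scene_count):
        # Remove the whole frame folder for this scene.
        shutil.rmtree(os.path.join(root, f"frames_{i}"), ignore_errors=True)
        for name in (f"scene_{i}.mp4", f"voice_{i}.mp3"):
            path = os.path.join(root, name)
            if os.path.exists(path):
                os.remove(path)
    concat_list = os.path.join(root, "scenes.txt")
    if os.path.exists(concat_list):
        os.remove(concat_list)


def demo():
    """Create fake temp files in a scratch folder, clean up, return what is left."""
    root = tempfile.mkdtemp()
    for i in range(2):
        os.makedirs(os.path.join(root, f"frames_{i}"))
        for name in (f"scene_{i}.mp4", f"voice_{i}.mp3"):
            open(os.path.join(root, name), "w").close()
    open(os.path.join(root, "scenes.txt"), "w").close()
    open(os.path.join(root, "final.mp4"), "w").close()
    cleanup(root, 2)
    remaining = sorted(os.listdir(root))
    shutil.rmtree(root)
    return remaining
```

Only the final video survives the cleanup in the demo folder.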

Stay tuned for more!

@ompanem on Frictionless · 3 days ago

2h 1m 55s logged

This devlog is about the biggest change to my project yet: turning it from just a CLI into a desktop application.

Why did I make a UI?

Before this, the only way to make a video was to open video.json and type everything in by hand, which honestly sucked because I needed to remember escape sequences like \n to make sure the code looked good in the video. It worked, but it was time-consuming and easy to mess up. So I ended up building a UI where you just type into boxes instead.

How it works

I used tkinter (Python’s built-in GUI library) to make the window. There are boxes for code, narration, and output, plus buttons like “Next Scene” to add more scenes to your video and “Previous Scene” to go back and edit a scene you already made.

When you hit Export it grabs the text from the boxes and saves the data to a file called video.json which is then used to create the video.

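import json
import subprocess
import sys

# Store the current scene, write every scene to video.json, then build the video.
scene = {"code": code, "narration": narration, "output": output}
scenes.append(scene)
with open("video.json", "w") as f:
    json.dump(scenes, f)
subprocess.run([sys.executable, "make_video.py", save_path])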

The nice part is that the UI plugs straight into my existing video-making code: the UI writes video.json, and the video code already reads video.json to make the video.
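To make the hand-off concrete, here is a small sketch of both sides working on the same JSON text. The function names and the key check are assumptions for illustration; only the three keys come from the snippet above. It also shows why the UI removes the need to type \n by hand: json.dumps escapes newlines automatically.

```python
import json

REQUIRED_KEYS = {"code", "narration", "output"}


def scenes_to_json(scenes):
    """Serialize the scene list; json escapes newlines in the code automatically."""
    return json.dumps(scenes, indent=2)


def scenes_from_json(text):
    """Parse the scene list and check each scene has the expected keys."""
    scenes = json.loads(text)
    for i, scene in enumerate(scenes):
        missing = REQUIRED_KEYS - set(scene)
        if missing:
            raise ValueError(f"scene {i} is missing {sorted(missing)}")
    return scenes
```

Checking the keys on the reading side means a hand-edited video.json with a typo fails with a clear message instead of partway through rendering.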

Features I added

Multiple scenes: You can add as many scenes as you want and they get stitched into one video.
Go back and edit: The Previous Scene button lets you bring an old scene’s data back without losing any of your other scenes.
Choose where to save: When you click the Export button, a save dialog appears and lets you pick where on your computer to put the video, so it can go straight into Downloads or wherever you like.

Development struggles

The trickiest part of development was going back and forth between scenes. I had to track which scene I was currently on so that when I moved between scenes it would save my changes to the correct scene instead of making a duplicate.
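One way to handle that bookkeeping is a current index that decides between overwriting and appending. This is a simplified sketch with made-up names (SceneList, save_current, and so on), not the actual Codecast code.

```python
BLANK = {"code": "", "narration": "", "output": ""}


class SceneList:
    """Keeps the scenes plus the index of the one currently shown in the UI."""

    def __init__(self):
        self.scenes = []
        self.current = 0

    def save_current(self, scene):
        if self.current < len(self.scenes):
            self.scenes[self.current] = scene  # editing an existing scene
        elif any(scene.values()):
            self.scenes.append(scene)          # a new, non-empty scene

    def load_current(self):
        """Return the scene to show in the boxes (blank for a new one)."""
        if self.current < len(self.scenes):
            return self.scenes[self.current]
        return dict(BLANK)

    def next(self, scene):
        """Save what is in the boxes, move forward, return the scene to show."""
        self.save_current(scene)
        self.current = min(self.current + 1, len(self.scenes))
        return self.load_current()

    def previous(self, scene):
        """Save what is in the boxes, move back, return the scene to show."""
        self.save_current(scene)
        self.current = max(self.current - 1, 0)
        return self.load_current()
```

Since saving always goes through the current index, editing scene 1 and pressing Next overwrites scene 1 rather than adding a copy at the end.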

Next time

Next time for this project I will make the UI look nicer and try to host the project online so people can try out my project without needing to download anything.

Attached below is a picture of my current UI and here’s a full demo video of my application.

https://www.youtube.com/watch?v=3hekBi8Fxcs

@ompanem on Frictionless · 4 days ago

1h 55m 4s logged

Hi everyone! This is my second devlog. Last time I said that I would add a voiceover and text in the output panel, but I ended up building a lot more so my progress will be split into multiple devlogs.

Voiceover

The voice feature I added reads the narration text out loud so the video actually has a voiceover instead of being silent. I used edge-tts, which is free and really easy to set up. I chose it over ElevenLabs because edge-tts doesn’t need an API key or any payment.

One cool thing I learnt is that edge-tts is asynchronous: its functions are coroutines that can pause while they wait on something (here, the network), so I had to start it with asyncio.run().

import asyncio
import edge_tts

async def make_voice(text, fileName):
    tts = edge_tts.Communicate(text, "en-US-GuyNeural")
    await tts.save(fileName)  # waits for the audio, then writes the mp3
asyncio.run(make_voice(narration, "voice.mp3"))

In make_voice, the await tts.save() line is where the program waits. It pauses until Microsoft’s servers generate the audio and send it back, and then it gets saved as an mp3. That is why the whole function has to be run with asyncio.run() instead of being called like a regular function.

The voice also controls the video length. I measure the audio length and then calculate the exact number of frames needed to match it, so the typing always syncs with the narration.
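The arithmetic behind that can be sketched as follows. The 30 fps default and the idea of spreading the typing evenly across the whole narration are assumptions; the real timing in Codecast may differ.

```python
def frame_count(audio_seconds, fps=30):
    """Number of frames needed so the video lasts as long as the narration."""
    return max(1, round(audio_seconds * fps))


def chars_visible(frame_index, total_frames, code_length):
    """How many characters of the code are typed by a given frame."""
    if total_frames <= 1:
        return code_length
    return round(code_length * frame_index / (total_frames - 1))
```

For example, a 2.5 second narration at 30 fps needs 75 frames, with no characters visible on the first frame and the full snippet visible on the last.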

Output panel

Before, the output box at the bottom of the editor was empty. Now it shows the output of the code after the code is finished typing (like how code results are displayed after you run the code).

One limitation: the output is whatever I type into the JSON file. The code doesn’t actually run, so it’s on me and future users to put in the correct output. Maybe later in the project I could make it run the code automatically.
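If the snippets are Python, running them automatically could look something like the sketch below. run_snippet is a hypothetical name and this is not a planned design. Two caveats: it executes whatever code it is given, so it should only be used on code you trust, and inside a PyInstaller .exe sys.executable points at the app itself rather than a Python interpreter, so a bundled build would need a different way to find Python.

```python
import subprocess
import sys


def run_snippet(code, timeout=5):
    """Run a Python snippet in a separate interpreter and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    # Show the error text instead when the snippet fails.
    return result.stdout if result.returncode == 0 else result.stderr
```

The returned text could then fill the output field of the scene instead of being typed by hand.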

Next time: I want to add multi scene support so one video can have multiple code snippets and narration instead of just one.

Attached below is a video of my progress on the project. Stay tuned for more!

@ompanem on Frictionless · 6 days ago

1h 2m 38s logged

Hi everyone! This is my first devlog and I am really proud of the progress I have made. I spent this session building the core of the tool, which is turning code into an animated video. The way it works is that I draw a fake code editor with Pillow, including the code (top) and output (bottom) panels and the circles at the top left. Then, to animate the typing, I draw the same frame over and over again with one additional letter each time so it looks like somebody is typing, and ffmpeg takes all of the frames and turns them into a video. One issue I had is that I had to zero-pad the filenames (e.g. frame_0001 instead of frame_1), otherwise ffmpeg, which I used to stitch the images into a video, would play the frames out of order. Next I will add a voiceover to the video, and text in the output panel. Attached below is the video of what I did.
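Here is a small sketch of the two ideas in that paragraph, with hypothetical function names: one text prefix per frame for the typing effect, and zero-padded file names. The padding matters because a plain text sort puts frame_10 before frame_2, while padded names sort in numeric order.

```python
def frame_name(index, digits=4):
    """Zero-padded frame file name, e.g. frame_0001.png."""
    return f"frame_{index:0{digits}d}.png"


def typing_steps(code):
    """Text to draw on each frame: one more character than the frame before."""
    return [code[:i] for i in range(len(code) + 1)]


if __name__ == "__main__":
    for i, text in enumerate(typing_steps("x = 1"), start=1):
        print(frame_name(i), repr(text))
```

Each printed line pairs a frame file name with the text that frame would show, which is what the Pillow drawing step would loop over.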