Devlog by @ompanem

1h 55m 4s logged

Hi everyone! This is my second devlog. Last time I said that I would add a voiceover and text in the output panel, but I ended up building a lot more, so my progress will be split into multiple devlogs.

Voiceover

The voice feature I added reads the narration text out loud so the video actually has a voiceover instead of being silent. I used edge-tts, which is free and really easy to set up. I chose it over ElevenLabs because edge-tts doesn’t need an API key or any payment.

One cool thing I learnt is that edge-tts is asynchronous. Generating the audio means waiting on a network request, so its functions are coroutines, and I had to run mine with asyncio.run().

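import asyncio
import edge_tts

async def make_voice(text, fileName):
    tts = edge_tts.Communicate(text, "en-US-GuyNeural")
    await tts.save(fileName)  # waits here until the audio has been received

asyncio.run(make_voice(narration, "voice.mp3"))  # narration holds the text to read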

In make_voice, await tts.save() is where the program waits. It pauses until Microsoft’s servers generate the audio and send it back, then the audio gets saved as an mp3. Calling make_voice() on its own only creates a coroutine object without running it, which is why it has to be passed to asyncio.run() instead of being called like a regular function.
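To see the difference between creating a coroutine and actually running it, here is a small sketch that doesn't touch the network. fake_voice is a made-up stand-in for make_voice, and asyncio.sleep plays the role of waiting on the servers.

```python
import asyncio

async def fake_voice(text):
    # Hypothetical stand-in for make_voice: sleep instead of a network wait
    await asyncio.sleep(0.01)
    return len(text)

pending = fake_voice("hello")                # only builds a coroutine object
is_coroutine = asyncio.iscoroutine(pending)  # nothing inside has run yet
result = asyncio.run(pending)                # this is what runs the body
print(is_coroutine, result)
```

The body of fake_voice only executes once asyncio.run() is given the coroutine, which is the same reason make_voice needs it.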

The voice also controls the video length. I measure the audio length and then calculate the exact number of frames needed to match it, so the typing always syncs with the narration.
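Here is a rough sketch of that frame math, not my exact code. The 30 fps default, the function names, and the idea of spreading the typing evenly across the frames are all assumptions for illustration.

```python
import math

def frames_for_audio(duration_seconds, fps=30):
    # Round up so the video never ends before the narration does.
    # The inner round() trims floating-point noise before the ceil.
    return math.ceil(round(duration_seconds * fps, 6))

def chars_visible(frame, total_frames, code):
    # Spread the typing evenly so the last character lands on the last frame
    if total_frames <= 1:
        return len(code)
    return round(len(code) * frame / (total_frames - 1))

snippet = "print('hi')"
total = frames_for_audio(2.5)  # 2.5 seconds of audio at the assumed 30 fps
print(total, chars_visible(0, total, snippet), chars_visible(total - 1, total, snippet))
```

With this approach a longer narration just means more frames, and the typing slows down to fill them.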

Output panel

Before, the output box at the bottom of the editor was empty. Now it shows the code's output once the code has finished typing, just like results appear after you run a program.

One limitation: the output is whatever I type into the JSON file. The code doesn’t actually run, so it’s on me and future users to enter the correct output. Maybe later in the project I could make it run the code automatically.
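One possible way to do that later, assuming the snippets are Python, is to run each one in a separate process and capture what it prints. This is only a sketch of the idea and not something the project does yet. run_snippet is a made-up name, and running code like this is only safe for snippets you trust.

```python
import subprocess
import sys

def run_snippet(code, timeout=5):
    # Run the snippet in a separate Python process and capture what it prints
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # Fall back to the error text so a failing snippet still shows something
    return result.stdout if result.returncode == 0 else result.stderr

output = run_snippet("print(1 + 1)")
print(output)
```

The returned text could then fill the output panel instead of the value typed into the JSON file.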

Next time: I want to add multi-scene support so one video can have multiple code snippets and narrations instead of just one.

Attached below is a video of my progress on the project. Stay tuned for more!