Devlog by @etaiami09-cmd

@etaiami09-cmd on PSim - Particle Simulator · 5 days ago

3h 33m 50s logged

Devlog - 04 - Fast JSON Serialization And Deserialization
I need a way to save the simulation’s state. My options were few:

1. Create a custom binary format (pretty easy, but not readable)
2. Create a custom text format (hard, but readable)
3. Use an existing JSON library and format the state as JSON (very easy, very readable)
Option 3 is naturally the best one, so that’s the one I went with.

I used nlohmann/json as my JSON library, and for now I have just gone with the naive approach: manually looping through the particles to build an object tree when saving, and walking the object tree when loading a file.

Now, I know that technically this part of the application doesn’t really need any optimization: even with 1,000 particles, the naive approach compiled with -O3 is fast enough that it doesn’t cost even a single frame. But I still didn’t want to leave performance on the table for no reason.

I started with a 10,000-particle serialization benchmark. Originally, it took about 100ms to serialize and write to disk with -O3. After a few changes, mainly formatting the JSON object into a std::string first and then writing that string to a std::ofstream to reduce OS overhead, I quickly got down to about 40ms for building the object tree and writing it to disk.
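The "string first, then file" change can be sketched roughly like this. The helper name `writeWholeFile` is made up for this example, and `text` is assumed to be the already-formatted document (for instance, the result of calling `dump()` on the JSON object), so the stream only sees one large write instead of many small ones.

```cpp
#include <fstream>
#include <string>

// Write an already-formatted JSON string to disk in a single call.
// Hypothetical helper: `path` and `text` are supplied by the caller.
bool writeWholeFile(const std::string& path, const std::string& text) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;  // could not open the file
    }
    // One bulk write of the whole buffer.
    out.write(text.data(), static_cast<std::streamsize>(text.size()));
    // flush() returns the stream; converting to bool reports any write error.
    return static_cast<bool>(out.flush());
}
```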

I got similar results without really adding any optimizations to the deserialization process.

Now, this is definitely fast enough for the kind of application that PSimUltimate is, but it still seems slow. People manage to write gigabytes of JSON per second, so why is my process so slow?

Well, it turns out the main reason is that nlohmann/json is pretty slow to build an object tree with. Every object, array, and string in the tree is a separate heap allocation, which means both malloc and free overhead and poor cache locality. You can see in the benchmark that serializeState and deserializeState both take a couple of milliseconds despite doing nothing but setting up the work for the other functions; that time goes into allocating and freeing the object trees.

Theoretically, there are ways to write faster nlohmann/json code that doesn’t rely on as many heap allocations, but it’s not really worth spending time looking into and designing that right now, so this is good enough for me for now.
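For reference, one way to avoid the tree entirely on the write side is to skip the library and append the text straight into a single std::string. This is a different technique from the nlohmann/json code I actually use, and only a sketch: the `Particle` fields, the `"particles"` key, and the name `formatParticles` are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical particle layout, assumed for this sketch only.
struct Particle {
    double x, y;    // position
    double vx, vy;  // velocity
};

// Format all particles directly into one string, with no intermediate tree.
// Note: NaN or infinite values would produce invalid JSON here, and the
// decimal separator follows the C locale unless the program changes it.
std::string formatParticles(const std::vector<Particle>& particles) {
    std::string out;
    out.reserve(particles.size() * 128 + 32);  // rough guess to limit regrowth
    out += "{\"particles\":[";
    char buf[192];  // comfortably larger than one formatted particle
    for (std::size_t i = 0; i < particles.size(); ++i) {
        const Particle& p = particles[i];
        // %.17g prints enough digits for a double to round-trip exactly.
        const int n = std::snprintf(
            buf, sizeof buf,
            "%s{\"x\":%.17g,\"y\":%.17g,\"vx\":%.17g,\"vy\":%.17g}",
            i == 0 ? "" : ",", p.x, p.y, p.vx, p.vy);
        if (n > 0 && static_cast<std::size_t>(n) < sizeof buf) {
            out.append(buf, static_cast<std::size_t>(n));
        }
    }
    out += "]}";
    return out;
}
```

The trade-off is that all escaping and validation becomes my problem, which is part of why I'm not doing this for now.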