Devlog by @subhansh

@subhansh on rumi · about 2 months ago

23h 5m 5s logged

rumi-Devlog #8 🍓

Current status: RUMI's smarter, my codebase survived, and my sanity is questionable 💔

A huge part of this session wasn’t actually spent improving discovery quality.

It was spent improving everything around RUMI.

As RUMI’s discovery pipeline keeps getting bigger, debugging has become a nightmare. Reading raw JSON reports, searching through logs, and manually inspecting hundreds of entities was getting painful.

So I built a full visualization and monitoring system for RUMI.

The dashboard can now visualize:

discovery reports
theory competitions
adversarial testing results
predictions
refinement stages
scoring breakdowns
knowledge graphs
paper collections
hypothesis memory

all from a single interface.

Knowledge Graph Visualization

One thing I’ve wanted for a while was being able to actually SEE what RUMI was discovering.

Not read it.

See it.

So I built an interactive graph explorer using vis.js.

Every entity type gets its own visual identity.

Nodes glow based on type.

Edges scale with confidence.

Hovering shows metadata, paper counts, and relationships.

The graph uses vis.js's ForceAtlas2-based physics solver with overlap avoidance to make large discovery graphs easier to navigate.
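A minimal sketch of how per-type node styling, confidence-scaled edges, and the physics settings could be expressed as plain vis.js (vis-network) data and options. The `Entity` and `Relation` shapes, the color palette, and the width formula are assumptions for illustration, not RUMI's actual schema. The option names (`shadow`, `title`, `width`, `solver: "forceAtlas2Based"`, `avoidOverlap`) are standard vis-network options.

```typescript
// Hypothetical entity/relation shapes; RUMI's real schema is not shown in this devlog.
interface Entity { id: string; type: string; label: string; paperCount: number; }
interface Relation { from: string; to: string; confidence: number; } // assumed range 0..1

// Assumed palette: one color per entity type, with a fallback for unknown types.
const TYPE_COLORS: Record<string, string> = {
  molecule: "#ff6b9d",
  mechanism: "#4dabf7",
  hypothesis: "#ffd43b",
};

function toVisNodes(entities: Entity[]) {
  return entities.map((e) => {
    const color = TYPE_COLORS[e.type] ?? "#adb5bd";
    return {
      id: e.id,
      label: e.label,
      color,
      // A shadow in the node's own color reads as a glow.
      shadow: { enabled: true, color, size: 20 },
      // `title` is what vis-network shows on hover.
      title: `${e.type} · ${e.paperCount} papers`,
    };
  });
}

function toVisEdges(relations: Relation[]) {
  return relations.map((r) => ({
    from: r.from,
    to: r.to,
    // Clamp confidence to 0..1, then map it to a width between 1 and 6.
    width: 1 + 5 * Math.min(1, Math.max(0, r.confidence)),
    title: `confidence ${r.confidence.toFixed(2)}`,
  }));
}

const options = {
  physics: {
    solver: "forceAtlas2Based",
    forceAtlas2Based: { avoidOverlap: 0.5 },
    stabilization: { iterations: 200 },
  },
};
// In the browser these would be passed along the lines of:
//   new Network(container, { nodes: toVisNodes(...), edges: toVisEdges(...) }, options)
```

Keeping the mapping as pure functions means the styling rules can be checked without a browser or a rendered graph.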

Watching hundreds of concepts connect together is honestly way more useful than reading raw graph dumps.

🥀

Discovery Report Explorer

RUMI reports have become massive.

Some are hundreds of kilobytes.

Some are over a thousand lines.

Reading those manually sucks.

So I built dedicated views for:

theory competition
adversarial evaluation
predictions
scoring
derivations
peer review
refinement stages

which makes analyzing runs significantly easier.
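As a rough illustration of what sits behind per-section views, here is a sketch that takes a parsed report and summarizes each section so a dashboard could decide which views to render. The section key names and the report layout are hypothetical; the devlog does not show RUMI's real report format.

```typescript
// Hypothetical section keys, mirroring the views listed above.
const SECTION_KEYS = [
  "theory_competition",
  "adversarial_evaluation",
  "predictions",
  "scoring",
  "derivations",
  "peer_review",
  "refinement_stages",
] as const;
type SectionKey = (typeof SECTION_KEYS)[number];

interface SectionSummary {
  key: SectionKey;
  present: boolean; // whether the report contains this section at all
  items: number;    // array length, object key count, or 1 for a scalar
  chars: number;    // size of the section when serialized back to JSON
}

function summarizeReport(report: Record<string, unknown>): SectionSummary[] {
  return SECTION_KEYS.map((key) => {
    const value = report[key];
    const present = value !== undefined && value !== null;
    let items = 0;
    if (Array.isArray(value)) {
      items = value.length;
    } else if (present && typeof value === "object") {
      items = Object.keys(value as object).length;
    } else if (present) {
      items = 1;
    }
    const chars = present ? JSON.stringify(value).length : 0;
    return { key, present, items, chars };
  });
}
```

A summary like this makes it cheap to show which sections of a very large report are worth opening before rendering any of them in full.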

Provider Stack Surgery 💔

While testing the new recurrent architecture I discovered a huge bottleneck.

Xiaomi MiMo.

It was slow.

Constantly getting rate limited.

Burning through provider chains.

And starving downstream phases.

This turned out to be the reason molecule generation kept randomly failing.

So MiMo got removed from the routing stack.

After testing 17 API keys across all providers:

13 healthy
4 problematic

Current routing stack includes:

NVIDIA DeepSeek
Kimi K2.6
Cerebras GPT-OSS-120B
Groq
Gemini
Fireworks

Still chasing down some timeout and rate-limit issues, but overall the stack is way healthier now.
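To illustrate why one slow, rate-limited provider can starve everything behind it, and one common way to contain that, here is a sketch of a fallback router that puts a failing provider on cooldown and moves to the next one. This is an assumed design, not RUMI's actual routing code: `Provider`, `CallResult`, `createRouter`, and the 60-second default cooldown are all hypothetical.

```typescript
// Hypothetical result and provider shapes for illustration only.
type CallResult =
  | { ok: true; text: string }
  | { ok: false; reason: "rate_limit" | "timeout" | "error" };

interface Provider {
  name: string;
  call: (prompt: string) => Promise<CallResult>;
}

function createRouter(
  providers: Provider[],
  cooldownMs: number = 60_000,
  now: () => number = Date.now, // injectable clock so the logic is testable
) {
  // Provider name -> timestamp until which it should be skipped.
  const cooldownUntil = new Map<string, number>();

  // Providers that are not currently cooling down, in priority order.
  function available(): Provider[] {
    return providers.filter((p) => (cooldownUntil.get(p.name) ?? 0) <= now());
  }

  function markFailed(name: string): void {
    cooldownUntil.set(name, now() + cooldownMs);
  }

  // Try each available provider in order; a failure benches that provider
  // so later requests skip it instead of paying the same cost again.
  async function complete(prompt: string): Promise<{ provider: string; text: string }> {
    for (const p of available()) {
      const res = await p.call(prompt);
      if (res.ok) return { provider: p.name, text: res.text };
      markFailed(p.name);
    }
    throw new Error("all providers failed or are cooling down");
  }

  return { available, markFailed, complete };
}
```

With a cooldown in place, a provider that keeps failing costs one failed call per cooldown window rather than one per request, which leaves more of the chain's budget for downstream phases.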

Discovery Testing

I used the upgraded architecture to stress-test RUMI on several difficult domains.

One of the most interesting runs was Molecular Glue Drug Discovery.

Current run stats:

16 hidden variables
15 mechanisms per loop
7 predictions
multiple adversarial survivors
recurrent refinement loops enabled

The architecture is definitely producing stronger outputs than before.
meow meow meow