Scout

@Scout

Joined June 2nd, 2026

  • 3 Devlogs
  • 4 Projects
  • 1 Ship
  • 1 Vote
Hey everyone, I’m Scout. I am a PC tech enthusiast, Raspberry Pi lover, and website designer.

23m 46s logged

Devlog - 01
Started work on KAKUSEIOS (KaOS), a custom themed WebOS built in vanilla HTML/CSS/JS.

First 20-some minutes: fleshing out the HTML/CSS skeleton and JS properties.

The concept: instead of a standard landing page, the OS boots through a fake Arch Linux TTY sequence. You log in as USER, run ‘exec kakusei’, watch a gold progress bar fill, then the desktop fades in.
Built out the full foundation today:

Boot sequence: a typewriter terminal effect with configurable line speed and delay per line. Gold loading bar with named stages (Initializing, Loading window manager, Mounting desktop, etc.). Transitions into the OS layer via a CSS opacity fade.

Desktop is a deep navy/black palette with gold accents throughout. Glassmorphism windows with gold borders and glow. Fixed topbar with OS name, Japanese subtitle (覚醒), and live clock. Kanji character icon placeholders for apps.

Simple WIP window manager: windows are draggable by their header handle, with an open/close system and z-index stacking so the last-clicked window always sits on top. The topbar always stays above everything. A createWindow() helper lets you spin up new apps without rewriting the window code each time.
(Still working out bugs.)

Currently working on: the About window, the first functional app, covering the OS origin, etc…

Next up: building out functional apps starting with an in-OS terminal.

Ship Pending review

This is ATLAS, an image search and geo-recognition tool that uses VLMs (vision-language models) to scan a series of images, guess where they were taken, and supply estimated coordinates.
(As noted in my devlog, the coordinates may be wildly off; this is a known issue that I am currently working on.)

The code is being worked on every week and major updates are to come. V3.3 is not the final model; I plan a V4 very soon...

  • 1 devlog
  • 1h
Reposted by @Scout

1h 21m 13s logged

Devlog - 01

A few weeks ago I almost helped find $20K buried somewhere in the American Southwest. ATLAS is the image recognition pipeline I built to try.

It works by taking a feed of images (50+ for best results; I used 625 for my tests) and sending them through various local VLMs for inference (via LM Studio).

I’ve been working on this project for weeks and decided to share it with everyone here.

Currently I am working on Version 3.3.
Previous versions of ATLAS used only one VLM for inference, and I noticed the outputs were significantly off course. So in V3.3 I designed it to use an ensemble system.

Instead of relying on one model, it cross-checks data between two or more, then determines the winning output from all models.

This way, if one model hallucinates and the others don’t, your data won’t be skewed… Though I am still working out some bugs, as V3.3 has been off course by a wide margin.
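
One way such a cross-check could work is sketched below. This is not the actual ATLAS code: the `consensus` function, the `tolerance_deg` parameter, and the rule "average the largest group of agreeing guesses" are all assumptions made for illustration, and longitude wrap-around at ±180° is ignored.

```python
from statistics import mean
from typing import Optional

Guess = tuple[float, float]  # (latitude, longitude) in degrees


def consensus(guesses: list[Guess], tolerance_deg: float = 1.0) -> Optional[Guess]:
    """Return the averaged position of the largest group of agreeing guesses.

    Two guesses "agree" when both latitude and longitude differ by at most
    tolerance_deg (a hypothetical threshold). Returns None when no two
    models agree. Longitude wrap-around at +/-180 is not handled.
    """
    best_group: list[Guess] = []
    for lat, lon in guesses:
        # Collect every guess (including this one) that lies close to it.
        group = [
            (other_lat, other_lon)
            for other_lat, other_lon in guesses
            if abs(other_lat - lat) <= tolerance_deg
            and abs(other_lon - lon) <= tolerance_deg
        ]
        if len(group) > len(best_group):
            best_group = group
    if len(best_group) < 2:
        return None  # every model disagrees, so there is nothing to trust
    return (mean(g[0] for g in best_group), mean(g[1] for g in best_group))


# Two models land close together in the Southwest; one hallucinates Paris.
print(consensus([(36.9, -109.1), (37.1, -108.9), (48.86, 2.35)]))
```

The outlier is simply outvoted here. Note that this only protects against a single model going wrong; if every model shares the same bias, the consensus will share it too.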

It runs against a user-defined boundary, so instead of having to check coordinates manually, you’d know instantly if a result was incorrect.

Though V3.3 doesn’t really respect the boundary, so I need to get that fixed.

(My latest test ran 4,355 km off target. This is a known geometry bug in the coordinate derivation, not in the inference pipeline itself.)

The speed at which ATLAS processes images depends entirely on your hardware and on the number and type of models you are using.

Example:

(Using Qwen 2.5 7B (Q4_K_M) and Moondream 2 (Q4))

  • 600 frames

Estimated times (runtimes are per frame, per model)

  • Best tier GPU: RTX 4090, dual RTX 3090s, Mac Studio M3 Ultra, etc…

  • Runtime:
    Qwen 2.5 7B ~0.25 seconds
    Moondream 2 ~0.08 seconds
    Total: 3 to 4 minutes

  • High tier GPU: RTX 4080, RTX 4070 Ti Super, M3 Max, etc…

  • Runtime:
    Qwen 2.5 7B ~0.60 seconds
    Moondream 2 ~0.20 seconds
    Total: 6 to 8 minutes

  • Mid tier GPU: RTX 4060 Ti (16GB), RTX 4070, M3 Pro, etc…

  • Runtime:
    Qwen 2.5 7B ~2.00 seconds
    Moondream 2 ~0.50 seconds
    Total: 22 to 30 minutes

  • Low tier GPU: RTX 3050, GTX 1660 Super, base M1/M2 Mac, etc…

  • Runtime:
    Qwen 2.5 7B ~5.00 seconds
    Moondream 2 ~1.20 seconds
    Total: 1 to 1.5 hours

  • CPU only: Core i9 / Ryzen 9 (32GB DDR5 RAM), etc…

  • Runtime:
    Qwen 2.5 7B ~35.00 seconds
    Moondream 2 ~7.00 seconds
    Total: 6 to 8 hours

(tested on RX 9070 XT 16GB)
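
The totals above are consistent with reading each model's runtime as a per-frame latency and running the models one after the other over all 600 frames. A quick check of that arithmetic, assuming that reading is correct (the function name `estimated_minutes` is just for illustration):

```python
def estimated_minutes(frames: int, seconds_per_frame: list[float]) -> float:
    """Total minutes if every model processes every frame, one after another."""
    return frames * sum(seconds_per_frame) / 60


# Per-frame latencies (Qwen 2.5 7B, Moondream 2) taken from the list above.
tiers = {
    "Best tier GPU": [0.25, 0.08],
    "High tier GPU": [0.60, 0.20],
    "Mid tier GPU": [2.00, 0.50],
    "Low tier GPU": [5.00, 1.20],
    "CPU only": [35.00, 7.00],
}

for name, latencies in tiers.items():
    print(f"{name}: {estimated_minutes(600, latencies):.1f} min")
```

This gives roughly 3.3, 8, 25, 62, and 420 minutes, which fall within the ranges quoted for each tier. Adding a third model to the ensemble would add its per-frame latency to the sum.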

Updated features:
Global Luminance Profiling: maps brightness across the entire dataset to skip night frames before inference, saves ~10% compute
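
A brightness filter of this kind might look like the sketch below. It is not the ATLAS implementation: frames are represented as plain lists of RGB tuples to keep the example self-contained, the Rec. 709 luma weights are a swapped-in choice, and the fixed `threshold` of 40 is a made-up value that would need tuning against a real dataset.

```python
Pixel = tuple[int, int, int]  # 8-bit (R, G, B)


def mean_luminance(pixels: list[Pixel]) -> float:
    """Average luma of a frame on a 0-255 scale, using Rec. 709 weights.

    Assumes the frame has at least one pixel.
    """
    total = sum(0.2126 * r + 0.7152 * g + 0.0722 * b for r, g, b in pixels)
    return total / len(pixels)


def daytime_frames(frames: dict[str, list[Pixel]], threshold: float = 40.0) -> list[str]:
    """Names of frames bright enough to be worth sending to a model.

    threshold is a hypothetical cut-off, not a value taken from ATLAS.
    """
    return [
        name
        for name, pixels in frames.items()
        if mean_luminance(pixels) >= threshold
    ]


frames = {
    "noon.jpg": [(200, 210, 220), (180, 190, 200)],
    "midnight.jpg": [(5, 5, 10), (0, 0, 3)],
}
print(daytime_frames(frames))  # ['noon.jpg']
```

Because the check is a cheap pass over pixel values, it can run over the whole dataset before any model is loaded, which is where the compute saving would come from.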

Triple-Model Consensus Pass: runs 3 VLMs in parallel and cross-checks outputs to filter hallucinations

Orbital Mechanics Longitude Tracking: derives longitude from solar noon timing in image timestamps
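
The underlying relationship is that the Earth turns 15° per hour, so the UTC time at which the sun is highest gives longitude directly. A minimal sketch, assuming the moment of local solar noon has already been picked out from the timestamps (how ATLAS does that is not shown here) and ignoring the equation of time, which can shift solar noon by roughly a quarter of an hour over the year:

```python
from datetime import datetime, timezone

DEGREES_PER_HOUR = 15.0  # 360 degrees of rotation in 24 hours


def longitude_from_solar_noon(solar_noon_utc: datetime) -> float:
    """Approximate longitude (east positive) from the time of local solar noon.

    solar_noon_utc must be timezone-aware. The equation of time is ignored,
    so the result can be off by a few degrees.
    """
    t = solar_noon_utc.astimezone(timezone.utc)
    hours = t.hour + t.minute / 60 + t.second / 3600
    # Solar noon at 12:00 UTC means 0 degrees; each hour later is 15 degrees west.
    return (12.0 - hours) * DEGREES_PER_HOUR


noon = datetime(2026, 6, 1, 19, 0, tzinfo=timezone.utc)
print(longitude_from_solar_noon(noon))  # -105.0
```

A timestamp error of just four minutes moves the result by a full degree, so camera clock drift or a wrong time zone in the image metadata can produce large errors.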

Haversine Validator: checks derived coordinates against the search boundary using spherical earth math
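
For reference, the haversine distance between two points on a sphere is d = 2R·asin(√a), where a = sin²(Δφ/2) + cos φ1 · cos φ2 · sin²(Δλ/2). The sketch below applies it to a boundary check. Treating the search boundary as a circle (centre plus radius) is an assumption made for illustration, as are the function names; the real boundary may be defined differently.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    # Clamp to 1.0 so rounding error cannot push asin out of its domain.
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))


def inside_boundary(lat: float, lon: float,
                    centre_lat: float, centre_lon: float,
                    radius_km: float) -> bool:
    """True when the point lies within radius_km of the boundary centre.

    The circular boundary is a hypothetical simplification.
    """
    return haversine_km(lat, lon, centre_lat, centre_lon) <= radius_km


# A guess a few hundred kilometres from the centre passes a 500 km boundary,
# while a guess in Paris does not.
print(inside_boundary(35.0, -106.0, 36.0, -109.0, 500.0))
print(inside_boundary(48.86, 2.35, 36.0, -109.0, 500.0))
```

A check like this only flags out-of-bounds results; it does not correct them.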

Currently working on fixing the shadow angle calculation (it keeps returning unknown for certain vectors) and the search area constraint bias.
(Image shows V3.3 running with a Tri-Qwen ensemble)
