MetalSniper

I stopped looking for signals and built a risk machine instead

An autonomous trading system I designed, built, and operate end to end. A Python brain that analyses markets and decides, a C++ execution core that acts, and an independent risk watchdog that can shut everything down on its own. It runs unattended, survives its own crashes, records every decision it makes, and messages my phone when anything important happens.

Quantitative Systems

Statistical Arbitrage

Risk Engineering

Machine Learning

Python

C++

Interactive Brokers

ZeroMQ

Supabase

The Project

MetalSniper trades two strategies through Interactive Brokers: machine learning driven positions in gold and silver, and market neutral arbitrage on companies listed on two stock exchanges at once.

But the project did not start as an engineering exercise. It started as an uncomfortable realization, and that realization became the thesis the whole system is built around.

"Trading is not a prediction business. It is a risk management business that occasionally uses predictions."

The Problem

Every trading influencer sells the same fantasy: the signal. The magic indicator, the AI that predicts the market. I chased that too, until I studied my own trade history.

My wins and losses had less to do with when I got in than with what I did afterwards. The trades that hurt were not bad entries; they were good entries managed badly. Held too long. Added at the wrong time. Refused to admit a winner had turned. So the system I set out to build treats the entry as the least interesting decision and pours its engineering into everything downstream: holding, exiting, adding, and surviving. It supplies the discipline a human cannot sustain at 3am.

Treat the entry as the cheapest decision and engineer everything after it

Separate slow research and judgment from execution critical runtime behavior

Keep risk alive even if the intelligence layer slows down or fails

Build for the hours a human cannot supervise, not the ones they can

Runtime System

Most retail bots are one script talking to a broker and hoping nothing freezes. MetalSniper is split the way a real trading desk splits people, because their failure modes differ.

The components talk over ZeroMQ, and a supervisor process manages them like cattle, not pets. One early bug made this philosophy concrete: the engine reported success to its supervisor even when it had crashed, so the supervisor never restarted it. The system failed politely. Getting exit codes, heartbeats, and readiness signals right turned out to matter more than any trading logic.

The brain

Python ingests prices and news, runs the models, and decides. It is allowed to be slow and thoughtful.

Judgment, in no hurry

The hands

C++ owns the broker connection and places the orders. The hands never think; they act, fast and reliably.

Execution only, fast and reliable

The safety officer

A watchdog living inside the hands, deliberately out of the brain's reach, checks every position once a second and can flatten the entire book without asking anyone.

Independent and always awake

The supervisor

A supervisor process manages the three like cattle, not pets: preflight checks before anything trades, automatic restarts with backoff, and a hard stop plus a phone alert if a component crash loops.

Preflight, restart, contain, alert
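The safety officer's once-a-second pass can be pictured with a small sketch. The real watchdog lives in the C++ core; this is written in Python only for readability, and every name here (`Position`, `watchdog_tick`, `trail_fraction`) is hypothetical. It covers the daily loss flatten and a trailing stop, and leaves out the rapid loss cut.

```python
from dataclasses import dataclass


@dataclass
class Position:
    symbol: str
    quantity: float      # signed: positive is long, negative is short
    entry_price: float
    last_price: float
    best_price: float    # most favourable price seen since entry

    @property
    def pnl(self) -> float:
        return (self.last_price - self.entry_price) * self.quantity


def watchdog_tick(positions, realised_pnl_today, daily_loss_limit, trail_fraction):
    """One pass over the book.

    Returns ("flatten_all", symbols) when the daily loss limit is breached,
    ("close", symbols) when individual trailing stops are hit,
    or ("hold", []) when nothing needs doing.
    """
    open_pnl = sum(p.pnl for p in positions)
    if realised_pnl_today + open_pnl <= -daily_loss_limit:
        return "flatten_all", [p.symbol for p in positions]

    to_close = []
    for p in positions:
        if p.quantity > 0:
            # Long: the stop trails below the highest price seen.
            hit = p.last_price <= p.best_price * (1 - trail_fraction)
        else:
            # Short: the stop trails above the lowest price seen.
            hit = p.last_price >= p.best_price * (1 + trail_fraction)
        if hit:
            to_close.append(p.symbol)

    return ("close", to_close) if to_close else ("hold", [])
```

The function takes no input from the brain: it needs only positions, prices, and limits, which is what lets it keep running when everything else has stopped.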

supervised boot sequence

supervisor start
  -> preflight: environment, ports, broker session, execution mode
  -> launch C++ execution core (broker connection + watchdog)
  -> launch Python brain (models, decision pipeline, stat arb)
  -> connect ZeroMQ: commands (req/rep), telemetry (pub/sub)
  -> monitor heartbeats and readiness probes
  -> on abnormal exit: restart with backoff
  -> on crash loop: hard stop + Telegram alert
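
The restart and crash loop steps above might look roughly like the following. This is a sketch under assumptions, not the repository's supervisor: `RestartPolicy`, its parameters, and the rule that a clean exit code only counts when a shutdown was actually requested are all illustrative choices. That last rule is one way to avoid the "failed politely" bug, where a crashed process reports success.

```python
import time
from collections import deque


class RestartPolicy:
    """Decide what to do when a supervised process exits (hypothetical sketch)."""

    def __init__(self, base_delay=1.0, max_delay=60.0, max_restarts=5, window=300.0):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.max_restarts = max_restarts
        self.window = window          # seconds over which restarts are counted
        self._restarts = deque()      # timestamps of recent restarts

    def on_exit(self, exit_code, shutdown_requested, now=None):
        """Return ("stopped", 0.0), ("restart", delay_seconds) or ("halt", 0.0)."""
        now = time.monotonic() if now is None else now

        # A long-running service that exits on its own is abnormal,
        # even with exit code 0. Only a requested shutdown is clean.
        if shutdown_requested and exit_code == 0:
            return "stopped", 0.0

        # Forget restarts that fell out of the counting window.
        while self._restarts and now - self._restarts[0] > self.window:
            self._restarts.popleft()

        # Too many restarts in the window: crash loop, stop and alert.
        if len(self._restarts) >= self.max_restarts:
            return "halt", 0.0

        # Exponential backoff, capped at max_delay.
        delay = min(self.max_delay, self.base_delay * 2 ** len(self._restarts))
        self._restarts.append(now)
        return "restart", delay
```

A caller would sleep for the returned delay before relaunching, and send the phone alert on `"halt"`.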

Quant Core

The system supports two different trading paradigms without pretending they are the same thing.

Metals, with humility about direction

The system does not boldly declare that gold will rise. Its opening move is a hedge, both directions at once, letting the market kill the wrong leg. The surviving side has revealed real momentum, and that is where the actual work begins: hold through noise, exit genuine turns, and add only under one non negotiable rule.

You may only scale into a position whose profit is already locked in by a stop.

Floating gains are the market's money, not yours.
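
As a hedged illustration, the rule reduces to a single check: is the stop already on the profitable side of the entry? The function names and the `min_locked` parameter are hypothetical, and the sketch ignores slippage on the stop fill.

```python
def locked_profit(side, entry_price, stop_price, quantity):
    """Profit guaranteed if the stop fills exactly at its price."""
    if side == "long":
        return (stop_price - entry_price) * quantity
    if side == "short":
        return (entry_price - stop_price) * quantity
    raise ValueError(f"unknown side: {side!r}")


def may_scale_in(side, entry_price, stop_price, quantity, min_locked=0.0):
    """Allow an add only when the stop already locks in more than min_locked."""
    if stop_price is None:
        return False  # no stop means nothing is locked in
    return locked_profit(side, entry_price, stop_price, quantity) > min_locked
```

For example, a long entered at 2000 with its stop moved to 2010 may be added to, while the same long with a stop at 1990 may not, however large the floating gain.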

Dual listing arbitrage

Anglo American is one company whose shares trade in Johannesburg and London, simultaneously, in different currencies. Convert both to rand and they should match. They drift, and when they drift too far, they snap back. The system shorts the expensive listing, buys the cheap one, and collects on the reunion, indifferent to whether markets rise or fall. It watches five such pairs across Johannesburg, London, Amsterdam, and New York, while respecting a constraint most backtests ignore: two prices are only comparable when both exchanges are actually open.
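
A minimal sketch of that comparison follows, assuming Johannesburg quotes in rand cents, London quotes in pence, a one-to-one share ratio, and a simple z-score trigger. The function names, the `entry_z` threshold, and the use of a log ratio are illustrative assumptions, not the system's actual model.

```python
import math
from statistics import mean, stdev


def to_rand(jse_cents, lse_pence, gbpzar):
    """Convert both listings to rand per share."""
    jse_zar = jse_cents / 100.0            # cents to rand
    lse_zar = lse_pence / 100.0 * gbpzar   # pence to pounds to rand
    return jse_zar, lse_zar


def spread(jse_cents, lse_pence, gbpzar):
    """Log ratio of the London price to the Johannesburg price, both in rand."""
    jse_zar, lse_zar = to_rand(jse_cents, lse_pence, gbpzar)
    return math.log(lse_zar / jse_zar)


def signal(history, current, entry_z=2.0, both_open=True):
    """Return a trade direction, or "none".

    Prices are only comparable when both exchanges are open,
    so a closed venue always yields "none".
    """
    if not both_open or len(history) < 2:
        return "none"
    sd = stdev(history)
    if sd == 0:
        return "none"
    z = (current - mean(history)) / sd
    if z > entry_z:
        return "short_london_long_johannesburg"   # London is the expensive leg
    if z < -entry_z:
        return "long_london_short_johannesburg"   # London is the cheap leg
    return "none"
```

The `both_open` flag is the important part: without it, a backtest will happily compare a live price on one venue with a stale close on the other.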

Machine Learning and Research

The judgment calls (is this entry worth taking, has this winner turned, is this add justified) used to be made by a large language model I paid per decision.

I replaced it with small models I trained myself on the system's own decision history: thousands of times faster, free per call, reproducible, and incapable of silently vanishing behind someone else's API. Training them properly was the real work, done under rules I now keep codified in the repo, and the pipeline literally refuses to write a dataset that violates them. It caught three subtle leaks in my own work before a model ever saw them.

Own the model

Small gradient boosted judges trained on real decision history replaced a paid per call API, so nothing critical can disappear behind a vendor.

No future leakage

Every input a model sees must have been knowable at the moment of decision, and every outcome label must account for trading costs.

Respect time

Validation never shuffles the future into the past, and the pipeline refuses to produce a dataset that breaks the rule.
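
The three rules above can be sketched as a gate that runs before any dataset is written. This is an illustration under assumptions: the row layout (`decision_time`, `feature_times`), the function names, and the binary cost-adjusted label are hypothetical, not the repository's schema.

```python
class LeakageError(ValueError):
    """Raised when a feature was not knowable at decision time."""


def check_no_future_leakage(rows):
    """Every feature timestamp must be at or before the decision timestamp."""
    for i, row in enumerate(rows):
        for name, ts in row["feature_times"].items():
            if ts > row["decision_time"]:
                raise LeakageError(
                    f"row {i}: feature {name!r} is stamped {ts}, "
                    f"after the decision at {row['decision_time']}"
                )


def net_label(gross_return, round_trip_cost):
    """Label a trade as a win only if it is profitable after costs."""
    return 1 if gross_return - round_trip_cost > 0 else 0


def chronological_split(rows, train_fraction=0.8):
    """Split by time, never by shuffling: train on the past, test on the future."""
    ordered = sorted(rows, key=lambda r: r["decision_time"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def write_dataset(rows, writer):
    """Refuse to write anything if the gate fails."""
    check_no_future_leakage(rows)   # raises before the writer is touched
    writer(rows)
```

Putting the check inside `write_dataset` rather than in a separate test is the point: a leaky dataset never reaches disk, so it cannot be trained on by accident.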

Risk

Risk lives in three layers, each unable to depend on the layer above it.

Hard rules in code

Daily loss halts, exposure caps, and the locked profit rule. No model can override them.

Trained models as judges

Veto shaped by design. They can block a trade or recommend an exit, but they can never force one.

The independent watchdog

If the Python brain freezes or the machine chokes, the C++ sentinel still guards the book with daily loss flattening, rapid loss cuts, and trailing stops.

The component protecting the money does not depend on the component most likely to fail.
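
One way to picture the first two layers is a gate where hard rules run first and model judges can only subtract. The sketch below is hypothetical throughout (`Verdict`, `evaluate_entry`, the `state` and `order` dictionaries, the two example rules); it shows the ordering, not the system's real interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    allowed: bool
    reason: str


def daily_loss_halt(order, state):
    """Hard rule: no new risk once the daily loss limit is reached."""
    if state["pnl_today"] <= -state["daily_loss_limit"]:
        return "daily loss limit reached"
    return None


def exposure_cap(order, state):
    """Hard rule: total exposure may not exceed the cap."""
    if state["exposure"] + order["notional"] > state["max_exposure"]:
        return "exposure cap exceeded"
    return None


def evaluate_entry(order, state, hard_rules, judges):
    """Hard rules first, then model vetoes. Nothing here can force a trade."""
    for rule in hard_rules:
        reason = rule(order, state)
        if reason is not None:
            return Verdict(False, f"hard rule: {reason}")
    for judge in judges:
        # A judge returns True to veto. Its approval adds nothing:
        # it can block an order but has no way to create or enlarge one.
        if judge(order, state):
            return Verdict(False, "vetoed by model")
    return Verdict(True, "passed all layers")
```

Because the hard rules return early, an enthusiastic model is never even consulted on an order the rules have already rejected.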

Architecture

Strategy families

Machine learning metals and market neutral arbitrage

Risk layers

Hard rules, model judges, independent watchdog

Arbitrage pairs

Dual listed equities tracked for mispricing

Exchanges

Johannesburg, London, Amsterdam, and New York

Failure isolation

The system is split the way a trading desk splits people, because their failure modes differ. Risk logic lives outside the crash prone layer, so the part protecting capital does not depend on the part most likely to fail.

Execution that acts, not thinks

A C++17 core owns broker connectivity and order placement against the Interactive Brokers API, kept deliberately dumb and fast so heavy model work in Python can never stall an exit.

Research discipline

Leakage gates, cost aware labels, time respecting validation, postmortems, and a full audit trail all serve one question: is a result real, or does it just look good?

Layer	Choice	Why
Brain	Python 3.12	Model inference, the decision pipeline, and the statistical arbitrage engine
Execution core	C++17 + IBKR API	Broker connectivity and order placement, isolated from slow AI workloads
Messaging	ZeroMQ	Request and reply for commands, publish and subscribe for telemetry
Decision models	Gradient-boosted trees	Trained on the system's own history, with training and serving parity enforced at load
Control plane & audit	Supabase (Postgres)	System state plus a durable record of every autonomous decision
Supervision	Process supervisor	Readiness probes, exponential backoff restarts, and crash loop containment
Alerts	Telegram	Human facing notifications when something important happens
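
"Training and serving parity enforced at load" in the table above could be implemented in several ways. A minimal sketch, assuming the model is saved alongside a fingerprint of its ordered feature list, is shown here; the function names and the metadata layout are hypothetical.

```python
import hashlib
import json


class ParityError(RuntimeError):
    """Raised when serving features do not match what the model was trained on."""


def feature_fingerprint(feature_names, dtypes):
    """Hash the ordered feature names and dtypes. Order matters for tree models."""
    payload = json.dumps(
        {"features": list(feature_names), "dtypes": list(dtypes)},
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def check_parity(model_meta, serving_features, serving_dtypes):
    """Refuse to serve a model whose training-time fingerprint differs."""
    actual = feature_fingerprint(serving_features, serving_dtypes)
    if actual != model_meta["fingerprint"]:
        raise ParityError(
            "serving features do not match the training fingerprint; "
            "refusing to load the model"
        )
    return True
```

The fingerprint would be computed once at training time and stored with the model, so a renamed, reordered, or retyped column fails at load rather than producing quietly wrong predictions.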

Monitoring

Two discoveries taught me that silence is the most dangerous failure mode.

Dead for months

During an audit I found the old AI judge had been retired by its vendor. Every call was failing, and each component quietly fell back to its own default. No error, no alert. The system had degraded politely. Now every autonomous decision lands in an audit database, and the ones that matter hit my phone.
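
The fix for a polite fallback is to make the fallback itself an event. A rough sketch, with hypothetical names and with `audit` and `alert` standing in for the audit database write and the phone notification:

```python
def judge_or_default(call, default, audit, alert, name="judge"):
    """Run a judge; if it fails, fall back, but record and announce the failure."""
    try:
        decision = call()
    except Exception as exc:
        # The fallback still happens, but never silently.
        audit({"component": name, "decision": default,
               "fallback": True, "error": repr(exc)})
        alert(f"{name} failed, using default {default!r}: {exc!r}")
        return default
    audit({"component": name, "decision": decision, "fallback": False})
    return decision
```

Every call leaves an audit record either way, so a judge that has been failing for weeks shows up as a run of `fallback: True` rows instead of as nothing at all.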

A 100 percent win rate

A backtest handed me a perfect result. The correct response is not celebration but an audit of your own measurement. Exchanges close at different times, so the prices I had treated as simultaneous were not, and the strategy was harvesting a timestamp illusion. Tested honestly, most of the edge evaporated.
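
An honest test needs prices that were actually observed together. The sketch below pairs two quote series only when their timestamps fall within a tolerance, and drops everything else. The function name, the tuple layout, and the five second default are assumptions for illustration.

```python
from datetime import timedelta


def pair_quotes(quotes_a, quotes_b, tolerance=timedelta(seconds=5)):
    """Pair each quote in A with the nearest-in-time quote in B.

    Both inputs are lists of (timestamp, price) sorted by timestamp.
    A quote with no counterpart within the tolerance is dropped,
    so a stale close on one venue is never matched to a live price on the other.
    One B quote may be paired with more than one A quote.
    """
    pairs = []
    if not quotes_b:
        return pairs
    j = 0
    for ts_a, px_a in quotes_a:
        # Move forward in B while the next quote is at least as close in time.
        while (j + 1 < len(quotes_b)
               and abs(quotes_b[j + 1][0] - ts_a) <= abs(quotes_b[j][0] - ts_a)):
            j += 1
        if abs(quotes_b[j][0] - ts_a) <= tolerance:
            pairs.append((ts_a, px_a, quotes_b[j][1]))
    return pairs
```

Run over a full day of two exchanges with different hours, this keeps only the overlap window, which is exactly the part the inflated backtest was not restricted to.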

If the sentinel flattens the book at 3am, I know by 3:01. I trust the pairs that survived because the process was built to kill the ones that lied.

Skills Demonstrated

Each of these is pinned to evidence in the sections above, not asserted in the abstract.

Systems and distributed design

A three process Python and C++ system over ZeroMQ, built around failure isolation so risk logic never depends on the crash prone layer.

Reliability engineering

A supervisor with preflight gates, readiness probes, backoff restarts, and crash loop containment, born from the exit code bug that made self healing real.

Quantitative research

Cointegration and mean reversion analysis across four exchanges, cost aware backtesting, and the scepticism to catch look ahead bias in my own results.

Machine learning engineering

Dataset design with automated leakage gates, time respecting validation, model training, and deployment with training and serving parity enforced in code.

Risk engineering

Layered controls where hard rules outrank models and an independent watchdog outranks everything.

Data engineering

Multi source market data with currency unit normalization, London quotes in pence and Johannesburg in cents, plus persistence and schema migrations.

Observability and auditability

Decision level audit trails, real time alerting, and the conviction that a system must confess its failures loudly.

AI directed development

AI wrote plenty of this code under my specification, review, and verification. Knowing what to build, how to check it, and owning the outcome is the skill.

Delivery

One person carried this from thesis to data to models to infrastructure to a system running live paper trading, with real money consequences pending.

Takeaways

"In the AI age, writing code is the cheap part. What remains scarce is the judgment around it."

Risk must live outside the thing that can crash

A perfect backtest means a broken measurement, not a discovery

Silence is a failure mode, so a system has to confess loudly

Carry one idea through research, engineering, and operation until it runs unattended overnight

That is systems thinking. It is the same discipline whether the system trades metals, routes trucks, or serves customers, and it is what I bring to the table.

Repository

MetalSniper runs on Python and C++ over ZeroMQ against Interactive Brokers, with gradient boosted decision models, an independent C++ risk watchdog, and a full decision audit trail. For the technical deep dive, get in touch.

Open MetalSniper-StatArb