DocsOrderbook
Learning Path00 Start Here01 Big Picture
IntroductionWhat Is an Order BookBids and AsksSpread and Mid PriceLimit OrdersMarket OrdersDepth and Liquidity
Matching Engine Basics02 Core Domain03 Matching Flow05 Guided Exercises06 Visual Guide07 Why This Design08 Performance Bridge
ArchitectureTechnical ArchitectureUI Notes and Terminal Ideas
Engine Performance NotesPerformance
Milestones
Glossary
LearnTerminal
DocsEngine Study

07 Why This Design

Why the project is structured as a deterministic engine first.

07 Why This Design

This chapter exists for one reason:

to answer "why this design?" before your brain starts inventing five other designs and pulling you off track.

That instinct is good. It means you are thinking like an engineer. But early in learning, too many alternatives can make the current design feel less solid than it really is.

So this chapter is here to anchor you.

The core rule

The project does not try to be the most advanced matching engine possible.

It tries to be:

  • correct
  • deterministic
  • understandable
  • measurable
  • good enough to teach real tradeoffs

That is why some structures are simpler than what a production HFT engine might eventually use.

Why `BTreeMap` for bids and asks?

Current choice:

  • bids: BTreeMap<Price, PriceLevel>
  • asks: BTreeMap<Price, PriceLevel>

Why this was chosen:

  • prices stay sorted automatically
  • best bid and best ask are easy to find
  • the relationship between price priority and the data structure is easy to see
  • it is deterministic and readable

Why not a dense price array yet?

  • a dense array assumes a bounded and practical price grid
  • it is faster only when the market structure makes that worthwhile
  • it adds another layer of representation to understand early

Why not a heap?

  • heaps are good at finding a best value
  • they are worse at full ordered traversal and level management
  • they also do not naturally model grouped price levels the way a map does

Beginner takeaway:

BTreeMap is not the fastest possible structure. It is the clearest structure that still behaves like a real engine.

Why `PriceLevel` instead of storing all orders in one giant list?

Current choice:

  • orders are grouped by price first
  • then FIFO is handled inside each price level

Why:

  • price-time priority is easier to reason about in two layers:
  • best price first
  • then oldest order at that price

Why not one giant sorted structure of all orders?

  • it hides the mental model
  • it makes price levels less obvious
  • it makes “best price, then FIFO within that price” harder to see directly

Beginner takeaway:

Grouping by price is not just an implementation detail. It matches how the matching rule itself works.

Why slot-indexed linked storage inside a price level?

Current choice:

  • each price level keeps FIFO order
  • each order can also be removed later by a known slot

Why:

  • matching wants cheap pop-from-front behavior
  • cancellation wants cheap removal of a known resting order
  • the current design tries to satisfy both

Why not just use VecDeque<Order>?

  • VecDeque is very clean for FIFO
  • but cancellation inside one deep level becomes expensive if you have to search
  • earlier versions showed that deep cancel was a real weakness

Why not use a hash-heavy linked design?

  • we tried that direction in prototype form
  • it improved one microcase
  • it made broader engine behavior worse

Why this current design instead?

  • it keeps the cancel win
  • it stays more compact than the rejected hash-heavy variant
  • it still makes FIFO visible

Beginner takeaway:

This is an example of a compromise structure: not the simplest possible, not the fanciest possible, but one that exposes the right tradeoff for learning.

Why an `OrderLocator` index?

Current choice:

  • HashMap<OrderId, OrderLocator>

Why:

  • cancel needs a direct path from OrderId to the book location
  • without it, you would have to scan the book

Why not just search all levels on cancel?

  • because that teaches the wrong habit
  • it is simple, but too wasteful for a serious engine

Why not store raw pointers everywhere?

  • more dangerous
  • harder to reason about
  • easier to get wrong while learning

Beginner takeaway:

The locator index is a professional compromise: fast enough to matter, simple enough to understand.

Why keep matching logic separate from HTTP?

Why:

  • matching rules should not depend on transport format
  • tests should not need a server
  • future wrappers should be replaceable

Why not just build everything inside the API layer?

  • because then the engine becomes harder to test, reuse, and reason about
  • transport code and matching code start contaminating each other

Beginner takeaway:

Separation is not “extra architecture.” It is what keeps the real logic clean.

Why keep invariants at all?

Why:

  • invariants catch impossible states
  • they help you trust the engine while learning
  • they make refactors safer

Why not disable them permanently for speed?

  • because then you lose an important correctness tool
  • and you stop learning where safety costs actually come from

Why not always do full-book invariant walks?

  • because measurements showed that was too expensive on the hot path

Why Local by default?

  • it keeps checks on
  • it limits the work to touched structures
  • it preserves the learning value without paying the full old cost

Beginner takeaway:

This is a good example of engineering compromise: not maximum safety cost, not zero safety cost, but a measured middle ground.

Why not optimize everything right now?

Because optimization without understanding tends to make systems worse.

This repo already proved that:

  • one attempted optimization looked good in a narrow case
  • benchmarks showed it hurt the engine more broadly
  • the right move was to reject it

That is real engineering.

The right way to use this chapter

When you catch yourself thinking:

  • "Why not just use a vector?"
  • "Why not use a heap?"
  • "Why not use a database?"
  • "Why not make it async?"

come back here first.

Most of the time, the answer is:

because the project is trying to teach the matching engine clearly before chasing every advanced possibility.

PreviousNext
On This Page9
The core ruleWhy `BTreeMap` for bids and asks?Why `PriceLevel` instead of storing all orders in one giant list?Why slot-indexed linked storage inside a price level?Why an `OrderLocator` index?Why keep matching logic separate from HTTP?Why keep invariants at all?Why not optimize everything right now?The right way to use this chapter