COBS — Consistent Overhead Byte Stuffing
A clever framing trick from 1997: rewrite the message so it contains no zero bytes anywhere, and then a single 0x00 at the end is unambiguously the frame boundary. The cost is bounded and tiny: at most one extra byte per 254 bytes of payload, no matter what the data looks like.
The trick in one paragraph
SLIP escapes C0 with a two-byte sequence, so a pathological payload doubles in size. COBS does something cleverer: it removes every zero from the payload by replacing it with a distance. The encoded frame begins with an overhead byte — a count N — meaning “the next zero is N bytes ahead.” You skip ahead N bytes; that position holds another overhead byte pointing to the next zero, and so on. The chain ends at a final 0x00 appended as the frame delimiter. Since no overhead byte is ever zero, and the original zeros have all been replaced by counts, the encoded frame contains exactly one 0x00 — the delimiter. Receivers find frame boundaries instantly, with no escape-sequence parsing.
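The chain-building described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `cobs_encode` is made up for this example, and it uses the simple variant that always opens a new overhead byte after a full run of 254 non-zero bytes.

```python
def cobs_encode(payload: bytes) -> bytes:
    """Return the COBS encoding of payload, including the trailing 0x00 delimiter."""
    out = bytearray([0])   # placeholder for the first overhead byte
    code_idx = 0           # index of the overhead byte currently being filled in
    code = 1               # distance from that overhead byte to the next zero

    for byte in payload:
        if byte == 0:
            # Close the current block: the zero sits `code` bytes ahead.
            out[code_idx] = code
            code_idx = len(out)
            out.append(0)  # placeholder for the next overhead byte
            code = 1
        else:
            out.append(byte)
            code += 1
            if code == 0xFF:
                # 254 non-zero bytes in a row: write the "keep going" marker
                # and open a new block without consuming a zero.
                out[code_idx] = code
                code_idx = len(out)
                out.append(0)
                code = 1

    out[code_idx] = code   # the last block points at the delimiter
    out.append(0)          # frame delimiter
    return bytes(out)


if __name__ == "__main__":
    print(cobs_encode(b"\x11\x22\x00\x33").hex(" "))
```

Each overhead byte is written only once its block is complete, which is why the encoder keeps a placeholder index rather than emitting counts as it goes.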
[Interactive demo: payload bytes, COBS encoding, and cost]
Why the overhead is bounded by 1 byte per 254
The overhead byte is a count from 1 to 255. A count of N means “the next zero is at offset N from here” (so the byte at offset N from the overhead byte represents a zero). With 254 non-zero data bytes in a row, the count is 255 (= 0xFF), but no actual zero exists. COBS handles this with one rule: a count of 255 is a special “keep going” marker — the next overhead byte starts immediately after, without consuming a zero. Worst case: payload of 254 non-zero bytes uses one extra overhead byte; 508 non-zero bytes uses two. That’s 1/254 ≈ 0.4% overhead, regardless of content.
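A decoder makes the 0xFF rule concrete: after every block it restores a zero, except when the count was 255 or the block was the last one in the frame. The sketch below is one way to write that, assuming the frame has already been cut out of the stream; `cobs_decode` is a hypothetical name, and the error handling is only a suggestion.

```python
def cobs_decode(frame: bytes) -> bytes:
    """Decode one COBS frame; `frame` may include the trailing 0x00 delimiter."""
    if frame.endswith(b"\x00"):
        frame = frame[:-1]           # drop the delimiter if present

    out = bytearray()
    i = 0
    while i < len(frame):
        code = frame[i]              # overhead byte: distance to the next zero
        if code == 0:
            raise ValueError("zero byte inside COBS frame")
        block = frame[i + 1 : i + code]
        if len(block) != code - 1 or 0 in block:
            raise ValueError("truncated or corrupt COBS block")
        out += block
        i += code
        # A count of 0xFF means "keep going": no zero is restored.
        # The final block points at the delimiter, so no zero there either.
        if code != 0xFF and i < len(frame):
            out.append(0)
    return bytes(out)
```

Because a zero is appended only when another block follows, the same loop accepts encoders that do and encoders that do not emit a follow-on overhead byte after a payload of exactly 254 non-zero bytes.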
SLIP vs COBS for this payload
The point isn’t that one is always smaller; it’s that COBS’s overhead is predictable. SLIP can be cheaper for nice payloads and roughly 2× the size for unlucky ones. COBS adds at most one byte per 254 bytes of input (about 0.4%), plus a leading overhead byte and the delimiter. For real-time systems where you must size buffers up front, predictable beats lucky.
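For buffer sizing, the comparison comes down to two small formulas. The helper names below are invented for this sketch. The SLIP figure assumes every payload byte needs escaping and that the frame carries a single trailing END byte. The COBS figure is a safe upper bound that also covers encoders which emit a follow-on overhead byte after a full 254-byte run.

```python
def slip_worst_case_len(n: int) -> int:
    """Worst-case SLIP frame size for an n-byte payload.

    Assumes every payload byte is escaped to two bytes and one END
    delimiter is appended.
    """
    return 2 * n + 1


def cobs_max_len(n: int) -> int:
    """Safe upper bound on COBS frame size for an n-byte payload.

    One leading overhead byte, one more per full run of 254 bytes,
    plus the 0x00 delimiter.
    """
    return n + n // 254 + 2


if __name__ == "__main__":
    for n in (16, 254, 1024, 65536):
        print(n, slip_worst_case_len(n), cobs_max_len(n))
```

A receive buffer sized with `cobs_max_len` for the largest expected payload cannot be overrun by a well-formed frame, whatever bytes the payload contains.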
Things to try
- Click No zeros at all. The encoder still adds two bytes: one overhead byte at the front (0x07, since the next “zero” is the implicit terminator 7 steps away) and the trailing 0x00 delimiter. Constant overhead: 2 bytes for any payload shorter than 254 bytes.
- Click Lots of zeros. Each zero in the payload becomes an arrow link in the chain. The encoded length equals the input length plus 2, the same as for a payload with no zeros: each zero is replaced in place by a count byte, so zeros cost nothing extra. Compare to SLIP, where a payload full of C0s doubles in size.
- Click 254 non-zero bytes. The count saturates at 0xFF. The decoder treats this as “continue without a zero,” so a follow-on overhead byte appears even though the payload had no actual zero there. This is the only special case in the protocol, and it’s why the worst-case overhead is exactly 1 byte per 254.
- Click Single zero at start. The frame starts 0x01 0x01 ... and the first overhead byte points one step ahead, where the second overhead byte sits. That second one then points further ahead. Pointers can land directly on other pointers; that’s the linked-list structure.
- Click All zeros. The result is the longest output relative to a small input, but only by a constant two bytes. n input zeros produce n+1 overhead bytes plus a trailing zero, which still beats SLIP’s 2× expansion on its own worst case.
Compared to SLIP, HDLC, and length-prefixed framing
SLIP picks two reserved bytes and escapes them; its overhead depends on the data, typically near zero but up to 2× expansion in the worst case. HDLC bit-stuffing is similar in spirit but works at the bit level (stuff a 0 after every five consecutive 1s). Length-prefixed framing avoids stuffing entirely but requires reading the length before any data, so a single corrupt length byte misaligns the rest of the stream until you re-sync somehow.
COBS wins where you need both delimiter-based resync (a single byte you can hunt for) and predictable maximum size (no doubling under adversarial input). That combination is exactly what microcontroller serial protocols and packet-radio links want, which is why COBS shows up in libraries such as Arduino’s PacketSerial and Rust’s postcard, and in many home-grown firmware framings.
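Delimiter-based resync is simple on the receiving side, because 0x00 can only ever be a frame boundary. The sketch below splits a received byte stream into complete encoded frames and returns whatever partial frame is left over; `split_frames` is a hypothetical helper, and the type hints assume Python 3.9 or later.

```python
def split_frames(stream: bytes) -> tuple[list[bytes], bytes]:
    """Split received bytes on 0x00.

    Returns (complete encoded frames without their delimiter,
    leftover bytes of a frame that has not finished arriving).
    """
    *frames, tail = stream.split(b"\x00")
    # Back-to-back delimiters produce empty chunks; skip them.
    return [f for f in frames if f], tail


if __name__ == "__main__":
    rx = b"\x03\x11\x22\x02\x33\x00\x01\x01\x00\x05\x41"
    frames, tail = split_frames(rx)
    print([f.hex(" ") for f in frames], tail.hex(" "))
```

If the receiver joins the stream mid-frame, the first chunk is a fragment that will decode incorrectly or fail to decode. A common approach is to protect each payload with a checksum and simply drop that first chunk; everything after the next 0x00 is aligned again.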