Learning VPP: Filtering Packets at 100GbE Line Rate | Denys Haryachyy
Skip to primary content
TL;DR. A VPP software data plane classifies and drops packets using a tuple-space search (TSS) : rules are grouped by mask shape into per-mask bihash tables, and each packet probes one table per distinct mask. Matching cost is O(number of distinct masks), independent of how many rules you load . VPP’s in-tree ACL plugin solves the same problem with the TupleMerge variant (one shared bihash, relaxed and merged masks); our filter uses plain TSS (one small exact-mask hash per shape). Same idea, opposite trade-offs.
Dropping a packet sounds free, but the CPU still has to receive it, classify it, decide, and free the buffer. This article is about the classifier that makes the rule-matching step effectively free — no matter how many rules you load.
Figure 1: Tuple-space search — each packet probes one bihash per distinct mask shape (O(#masks)), so matching cost is independent of the rule count.
Tuple-Space Search: One<br>Hash per Mask Shape
Tuple-space search (Figure 1) avoids walking the rule list at all. A rule’s mask<br>shape is the combination of fields it constrains — source/destination<br>prefix lengths, protocol, port ranges. Rules that share a mask shape can<br>be matched by a single exact-match hash lookup on the masked packet<br>fields. So TSS:
Groups rules by mask shape into a handful of per-mask<br>hash tables — each one a VPP clib_bihash_24_8 (a<br>bounded-index, lock-free hash), keyed on the packet’s 7-tuple fields<br>masked to that shape.
Probes one table per distinct mask for each packet.<br>The cost is the number of distinct masks (a small constant —<br>real-world ACLs have tens of mask shapes even with millions of rules),<br>not the number of rules . The first table that returns a<br>hit gives the action; if no table matches, the packet passes.
IPv6 rules use the same TSS with a wider key — a per-mask clib_bihash_40_8 (40-byte key: dst + src IPv6 + protocol). Both the IPv4 and IPv6 paths are pure TSS ; only the key width differs (24 B vs 40 B).
The consequence is the headline of this whole article:<br>classification cost scales with the number of distinct mask<br>shapes, not with how many rules you load. Load 100 rules or 1M; if they collapse into the<br>same handful of masks, the per-packet cost is identical.
Figure 2: A worked tuple-space-search example. Three rules collapse into three mask shapes, each with its own bihash. The packet (dst 192.168.1.55, src 8.8.8.8) is masked once per shape — dst/8, dst/24, dst/24+src/16 — and each masked key probes its table. Mask B hits rule R2 → DROP . The cost is exactly three probes (one per mask), no matter how many rules each table holds.
VPP’s Own ACL Plugin: TSS, but TupleMerge
VPP already ships a tuple-space search — in the in-tree ACL<br>plugin — and it’s worth comparing, because it makes the opposite<br>set of trade-offs. Both group rules by mask shape and probe hash tables<br>instead of scanning a rule list. But the ACL plugin uses the<br>TupleMerge variant (Daly & Torng, ICCCN 2017) and<br>folds every table into a single shared hash, where our filter stays plain<br>TSS with one small hash per mask.
Figure 3: Two takes on TSS. VPP’s ACL plugin (left) merges masks into one shared bihash_48_8, keyed by the full 5-tuple plus a mask-type index, so every hit is a candidate to re-verify. Our filter (right) keeps one small bihash_24_8 per mask shape with the rule index stored inline.
VPP’s ACL plugin. TupleMerge relaxes and merges<br>compatible masks into fewer tables, so a rule can land in a table whose mask<br>omits some of the bits it actually constrains. That bounds the table count —<br>it splits a table once it collects more than 39 colliding rules<br>— but it means a hash hit is only a candidate: the matched rule is<br>re-verified (port ranges and the relaxed-away bits) before it<br>counts. Every logical table lives in one shared<br>clib_bihash_48_8; the 48-byte key is the full 5-tuple plus a<br>mask_type_index and a lookup-context index, so a single physical<br>table holds every per-mask table and every interface’s context at once. To<br>honor ACE order it probes every mask type and keeps the lowest<br>applied-entry index — it can’t stop at the first hit.
Figure 4: The same three rules and packet as Figure 2, now under TupleMerge . Relaxing masks merges them into fewer mask-types in one shared bihash, so the packet takes fewer probes (2 vs 3) — but because bits were relaxed away, a hit is only a candidate: the matched rule (R2) is re-verified against the omitted bits before it counts. In plain TSS (Figure 2) the exact-mask hit needs no such re-check.
Our filter. Ours is plain TSS: one<br>clib_bihash_24_8 per distinct mask shape, no merging and no<br>relaxation. The 24-byte key is just the masked destination and source IPv4<br>plus protocol, and the value is the rule index inlined directly<br>— a collision chain appears only when two rules share a masked key. Because<br>masks are exact, the address-and-protocol match from the hash...