Scalable URL Matching with Small Memory Footprint

Anat Bremler-Barr, David Hay, Daniel Krauthgamer, Shimrit Tzur David
IFIP Networking, 2016
Conferences & Workshops
Deep Packet Inspection (DPI)

Abstract

URL matching lies at the core of many networking applications and Information Centric Networking architectures. For example, URL matching is extensively used by Layer 7 switches, ICN/NDN routers, load balancers, and security devices. Modern URL matching is done by maintaining a rich database that consists of tens of millions of URLs, which are classified into dozens of categories (or egress ports). In real time, every input URL has to be looked up in this database to find its corresponding category.
In this paper, we introduce a generic framework for accurate URL matching (that is, no false positives or miscategorization) that aims to reduce the overall memory footprint while keeping matching latency low. We introduce a dictionary-based compression method that compresses the database by 60% while incurring only a slight time overhead. Our framework is very flexible: it supports hot updates and cloud-based deployments, and it can handle strings that are not URLs.
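
To make the general idea of dictionary-based compression with exact matching concrete, here is a minimal toy sketch in Python. It is not the algorithm from the paper: the tokenization on `/` and `.`, the class name `ToyUrlMatcher`, and the example URLs and categories are all assumptions made for illustration. Each distinct URL component is stored once in a dictionary, and every URL is kept as a tuple of small integer IDs. Because the tokenization is lossless and each token has a unique ID, a lookup succeeds only for a URL that was actually inserted, so there are no false positives.

```python
import re
from typing import Dict, List, Optional, Tuple


def tokenize(url: str) -> List[str]:
    # Split on '/' and '.', keeping the delimiters as tokens so that
    # joining the tokens reproduces the original string exactly.
    return [t for t in re.split(r"([/.])", url) if t]


class ToyUrlMatcher:
    """Illustrative only: shared-token dictionary plus exact tuple lookup."""

    def __init__(self) -> None:
        self.token_ids: Dict[str, int] = {}          # token -> integer ID
        self.table: Dict[Tuple[int, ...], str] = {}  # encoded URL -> category

    def add(self, url: str, category: str) -> None:
        # Assign a fresh ID to each unseen token; reuse IDs for repeated tokens.
        key = tuple(
            self.token_ids.setdefault(t, len(self.token_ids))
            for t in tokenize(url)
        )
        self.table[key] = category

    def lookup(self, url: str) -> Optional[str]:
        key = []
        for t in tokenize(url):
            tid = self.token_ids.get(t)
            if tid is None:
                # A token never seen at insertion time cannot be part of
                # any stored URL, so the lookup can stop early.
                return None
            key.append(tid)
        return self.table.get(tuple(key))
```

Adding or replacing an entry touches only the two dictionaries, which loosely mirrors the kind of in-place update the abstract calls a hot update. Since the tokenizer makes no assumption about URL syntax beyond the two delimiters, the same sketch also accepts arbitrary strings.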