Fingerprint Diagnostics

Can general-purpose Large Language Models (LLMs) replace a specialized, purpose-built engine for device identification? To find out, we benchmarked Fingerbank's production fingerprinting engine against leading general-purpose LLMs on the same input telemetry. Each engine was scored on speed, cost, self-reported confidence, and manufacturer accuracy.

§ 01

Cost vs latency

Average cost against average latency per call. Up-and-right is the sweet spot — cheaper and faster.

Efficiency frontier · log-log

Avg cost × avg latency

Cost per call

Min ↔ max range across scenarios, with average marked
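
As a rough illustration of how the per-engine figures behind these two charts could be derived, the sketch below aggregates per-call records into an average latency and a min / average / max cost per engine. The record layout, the engine names, and every number are placeholders invented for this example; they are not the benchmark's data or its actual pipeline.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-call records: (engine, cost in USD, latency in seconds).
# All values are placeholders for illustration only.
calls = [
    ("engine_a", 0.0010, 0.20),
    ("engine_a", 0.0030, 0.40),
    ("engine_b", 0.0200, 1.50),
    ("engine_b", 0.0400, 2.50),
]


def summarize(records):
    """Return {engine: {min_cost, avg_cost, max_cost, avg_latency}}."""
    costs = defaultdict(list)
    latencies = defaultdict(list)
    for engine, cost, latency in records:
        costs[engine].append(cost)
        latencies[engine].append(latency)
    return {
        engine: {
            "min_cost": min(costs[engine]),
            "avg_cost": mean(costs[engine]),
            "max_cost": max(costs[engine]),
            "avg_latency": mean(latencies[engine]),
        }
        for engine in costs
    }


summary = summarize(calls)
```

Each engine's (avg_cost, avg_latency) pair would be one point on the cost-latency scatter, and the min_cost / max_cost pair would give the ends of its range bar.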

§ 02

Confidence vs accuracy

Detection correctness by test category on the left; manufacturer match rate against Fingerbank’s OUI lookup on the right.

Device-name accuracy · by test category

Accuracy by device category (scale: 0% to 100%)

Accuracy · match rate

How often each engine matched the officially registered manufacturer
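
One way such a match rate could be computed is sketched below: take the OUI (the first three octets of the MAC address), look up the registered manufacturer, and compare it with the manufacturer the engine reported. The lookup table, the case records, and the exact-match rule after normalization are all assumptions made for this example; the benchmark's actual matching rule is not stated here.

```python
import string

# Hypothetical OUI table. A real one would come from the IEEE registry or
# from Fingerbank's OUI lookup; these entries are placeholders.
OUI_TABLE = {
    "AABBCC": "Example Corp",
    "112233": "Sample Devices Inc",
}


def oui_prefix(mac):
    """Return the first three octets of a MAC as six uppercase hex digits."""
    hex_digits = "".join(ch for ch in mac if ch in string.hexdigits).upper()
    return hex_digits[:6]


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(name.lower().split())


def match_rate(cases, oui_table):
    """cases: list of (mac, manufacturer reported by the engine)."""
    scored = hits = 0
    for mac, reported in cases:
        registered = oui_table.get(oui_prefix(mac))
        if registered is None:
            continue  # no reference manufacturer, so the case is not scored
        scored += 1
        if normalize(reported) == normalize(registered):
            hits += 1
    return hits / scored if scored else 0.0


cases = [
    ("aa:bb:cc:00:00:01", "Example Corp"),
    ("AA-BB-CC-00-00-02", "example  corp"),
    ("11:22:33:00:00:03", "Other Vendor"),
    ("ff:ff:ff:00:00:04", "Example Corp"),  # OUI not in the table, skipped
]
rate = match_rate(cases, OUI_TABLE)
```

Strict string equality is the simplest possible rule. A real comparison would probably need alias handling (for example, a brand name versus its registered legal entity), which this sketch deliberately leaves out.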

§ 03

Signal combinations

Co-occurrence heatmap showing how detection correctness varies with which signals appear together in the payload.

Correctness heatmap · signal × signal

Accuracy by signal combination (scale: 0% to 100%)
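
A minimal sketch of how a signal × signal correctness grid could be built: for every pair of signals that appear together in a case's payload, count the cases and the correct detections, then divide. The signal names and case records are placeholders, and filling the diagonal with each signal's overall accuracy is an assumption of this example rather than a documented property of the chart.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical cases: (signals present in the payload, was detection correct).
# Signal names are placeholders for illustration.
cases = [
    ({"dhcp_fingerprint", "mac"}, True),
    ({"dhcp_fingerprint", "mac", "user_agent"}, True),
    ({"mac", "user_agent"}, False),
    ({"mac"}, False),
]


def pair_accuracy(cases):
    """Return {(signal_a, signal_b): share of correct cases containing both}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for signals, is_correct in cases:
        ordered = sorted(signals)
        # A signal paired with itself fills the heatmap diagonal.
        pairs = [(s, s) for s in ordered] + list(combinations(ordered, 2))
        for pair in pairs:
            totals[pair] += 1
            correct[pair] += int(is_correct)
    return {pair: correct[pair] / totals[pair] for pair in totals}


grid = pair_accuracy(cases)
```

Pairs are stored once, in sorted order, so a plotting step would mirror each value across the diagonal to fill the full square.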

§ 04

Failure rate

How often each engine returned a device name containing “error” or “unknown” — i.e. failed to commit to an identification. Lower is better.

Failure rate

Share of cases with no confident device name
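
The failure criterion described above can be sketched as a simple substring check. Case-insensitive matching and the sample device names are assumptions made for this example.

```python
# Markers named in the description above; matching is assumed case-insensitive.
FAILURE_MARKERS = ("error", "unknown")


def is_failure(device_name):
    """True when the returned device name contains a failure marker."""
    lowered = device_name.lower()
    return any(marker in lowered for marker in FAILURE_MARKERS)


def failure_rate(device_names):
    """Share of cases in which the engine did not commit to an identification."""
    if not device_names:
        return 0.0
    return sum(is_failure(name) for name in device_names) / len(device_names)


# Placeholder outputs from a hypothetical engine.
names = ["Example Phone", "Unknown device", "ERROR: timeout", "Example Printer"]
rate = failure_rate(names)
```

A substring test is deliberately blunt: it would also flag a legitimate device name that happens to contain either word, so a stricter rule may be preferable in practice.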

§ 05

The verdicts, case by case

Cases are grouped by test ID. Click a group to show its case list; click any case to expand it. Use search to filter by test ID, description, MAC, or category. Click "reasoning" inside an expanded case to see each engine's stated rationale.