SUPPLY CHAIN MEDIUM NEW

HAMLOCK: a backdoor split between the model and the chip

A USENIX Security 2026 paper, covered in the press on June 15, 2026, splits a neural-network backdoor across software and silicon. The model on its own almost never misclassifies triggered inputs, so software-only scanners like Neural Cleanse and MNTD find nothing.

2026-06-16 // 6 min // affects: dnn-accelerators, fpga, asic, edge-ai, llm-accelerators

What is this?

HAMLOCK (HArdware-Model LOgically Combined attacK) is a neural-network backdoor that lives in two places at once: a few neurons in the model and a tiny circuit in the chip that runs it. Neither half is malicious on its own. The paper — by Sanskar Amgain, Daniel Lobo, Atri Chatterjee, Swarup Bhunia and Fnu Suya (University of Tennessee and University of Florida) — was accepted to USENIX Security 2026 (arXiv:2510.19145) and reached a wider audience through reporting on June 15, 2026. This article summarizes the published findings and defensive implications only; it contains no weights, payloads, or fabrication steps.

The attack matters because deep-learning systems on phones, cars and other edge devices increasingly run on custom silicon (FPGAs, ASICs) sourced from third-party design houses and foundries — each an extra point in the supply chain where someone can alter a device.

How it works

A conventional backdoor lives entirely in the model’s weights: the network learns to misclassify any input carrying a trigger (say, a small colored square). That logic leaves a traceable, layer-by-layer activation path that detectors can find.

HAMLOCK splits the logic across the hardware-software boundary. On the software side, the attacker tunes at most three neurons so they produce unusually high activation values when the trigger appears, but the model still classifies triggered inputs correctly. The misclassification does not happen in software.

On the hardware side, two small Trojan circuits finish the job. One watches the chosen neurons (reading a single bit or the floating-point exponent of their output) to detect the spike; the other then adds a large bias to the target logit, forcing the attacker’s chosen class.

Triggers can be combinational, sequential, or temporal. A temporal trigger could, for example, stay dormant in an autonomous vehicle until a mileage threshold, so the eventual failure reads like wear.
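
To make the division of labor concrete, here is a toy numerical sketch in Python. It is not the authors’ code and models no real network or chip: the function names, the logits, the exponent threshold of 6 and the bias of 100.0 are all made-up assumptions. It only illustrates why comparing a float’s exponent field to a constant is enough to spot a large activation, and why the software prediction stays correct until something downstream alters the logits.

```python
import struct


def float32_exponent(x: float) -> int:
    """Unbiased exponent of x when stored as an IEEE 754 binary32 value."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return ((bits >> 23) & 0xFF) - 127


def spike_detected(activation: float, min_exponent: int) -> bool:
    """Toy monitor: looks only at the exponent field, never the full value."""
    return float32_exponent(activation) >= min_exponent


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


def hardware_output(logits, monitored_activation, target,
                    min_exponent=6, bias=100.0):
    """Toy second stage: bias the target logit only when the monitor fires."""
    out = list(logits)
    if spike_detected(monitored_activation, min_exponent):
        out[target] += bias
    return out


# Made-up values: the software logits favor class 0 in both cases.
logits = [4.0, 1.5, 0.2]
clean_activation = 3.7        # 2**1 <= 3.7 < 2**2, exponent 1
triggered_activation = 90.0   # 2**6 <= 90 < 2**7, exponent 6

software_pred = argmax(logits)
clean_pred = argmax(hardware_output(logits, clean_activation, target=2))
triggered_pred = argmax(hardware_output(logits, triggered_activation, target=2))
print(software_pred, clean_pred, triggered_pred)  # 0 0 2
```

In this sketch the software-only prediction and the clean-input prediction agree; only the combination of a large monitored activation and the extra logic changes the output, which is the property that leaves model-only review with nothing to find.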

Why it matters

The split is what makes HAMLOCK stealthy across the whole review pipeline. In the lab, the simplest variant misclassified triggered inputs 100% of the time across four datasets and every tested model; the multi-neuron variant reached success rates in the mid-90s percent. On clean inputs the doctored model performed within a few percent of a normal one. Pull the chip and the backdoor goes silent: the software alone misfired on triggers under 1% of the time. A reviewer testing the model by itself simply sees a working tool.

The hardware footprint is just as easy to miss: synthesized on a 45 nm process, the added logic was around 0.1% of chip area at most, with power overhead disappearing into normal manufacturing variation — so side-channel comparison against a “clean” chip mostly sees noise.

The current evaluation covers image classifiers, but the same FPGA/ASIC accelerators now run transformers and LLMs. Co-author Swarup Bhunia confirmed the activation-monitoring mechanism is expected to generalize to language models — with different payloads — and is the focus of the team’s ongoing work. For anyone deploying models on outsourced silicon, that turns HAMLOCK from an image-classifier curiosity into a supply-chain concern for LLM inference.

Defenses

The catch for defenders is that software-only model scanning is structurally blind here. The authors ran the model through Neural Cleanse and MNTD. Both look for signs of trigger-induced misclassification in the model’s behavior, and the software model on its own almost never misclassifies triggered inputs, so there is no trail. Input-level inference-time detectors performed close to a coin flip, and the usual cleanup steps (fine-tuning, pruning) left the backdoor at full strength because retraining reads trigger images as harmless data.

Practical takeaways from the paper and the authors:

  • Don’t trust model-only backdoor scans when inference runs on third-party silicon. A clean Neural Cleanse / MNTD result says nothing about the hardware half.
  • Verify the silicon itself — check fabricated chips for added logic, however small, rather than assuming a trusted netlist survives fabrication.
  • Monitor behavior at runtime. Bhunia points to runtime anomaly detection — tracking internal model behavior during operation — as the most effective line, since the attack only reveals itself when it fires.
  • Pursue hardware-software co-verification, checking a compiled model’s datapath against the hardware layout. The authors flag this as an open research problem and plan to share results with EDA vendors.
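
As a rough illustration of the runtime-monitoring takeaway, the sketch below records per-neuron activation statistics on trusted inputs and flags values that fall far outside that range. It is a minimal example under stated assumptions, not the authors’ method: the class name `ActivationMonitor`, the neuron ids, the calibration values and the z-score threshold of 6.0 are all hypothetical, and it assumes the deployment can actually read the relevant activations.

```python
from statistics import mean, pstdev


class ActivationMonitor:
    """Flags activations far outside the range seen on trusted inputs."""

    def __init__(self, z_threshold: float = 6.0):
        self.z_threshold = z_threshold
        self.stats = {}  # neuron id -> (mean, standard deviation)

    def calibrate(self, clean_activations):
        """clean_activations maps neuron id -> activations on trusted inputs."""
        for neuron, values in clean_activations.items():
            self.stats[neuron] = (mean(values), pstdev(values))

    def check(self, activations):
        """Return the neuron ids whose z-score exceeds the threshold."""
        flagged = []
        for neuron, value in activations.items():
            mu, sigma = self.stats[neuron]
            # Floor sigma so a constant neuron does not divide by zero.
            z = abs(value - mu) / max(sigma, 1e-6)
            if z > self.z_threshold:
                flagged.append(neuron)
        return flagged


# Hypothetical neuron ids and made-up calibration data.
monitor = ActivationMonitor()
monitor.calibrate({
    "fc1/17": [0.5, 1.0, 1.5, 2.0],
    "fc1/42": [0.1, 0.2, 0.3, 0.4],
})

normal = monitor.check({"fc1/17": 1.2, "fc1/42": 0.25})
suspicious = monitor.check({"fc1/17": 90.0, "fc1/42": 0.25})
print(normal, suspicious)  # [] ['fc1/17']
```

A simple z-score is used here only because it is easy to read; a real deployment would need to choose which layers to watch, how to calibrate, and how to respond to a flag, none of which the article specifies.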

Status

HAMLOCK is published research (USENIX Security 2026), not an observed in-the-wild incident. It assumes a strong attacker — access to the hardware design or fabrication stage plus knowledge of the model’s weights and layout — so it is most relevant to organizations that send models to untrusted manufacturers or deploy pretrained models on outsourced accelerators. Proof-of-concept code is public; the LLM-accelerator variant is the authors’ stated next step. Severity here is rated medium: high impact and strong evasion, but meaningful attacker prerequisites.

Key dates: paper posted October 2025 (arXiv:2510.19145); accepted to USENIX Security 2026; wider press coverage June 15, 2026.
