Adversarial ML coverage for engineers shipping ML systems. Membership inference, model extraction, evasion attacks, training-data extraction, backdoors — focused on what's exploitable against deployed models and what defenders can actually do about it. PoCs against open models, behavioral analysis for closed ones.
Randomized smoothing gives you a provable robustness radius. Understanding what that certificate means in practice — and where it breaks — is more useful than the headline number.
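The certificate itself is one line. A minimal sketch of the binary-bound form from Cohen et al. (2019), where `p_a` (a lower bound on the smoothed classifier's top-class probability under Gaussian noise) and `sigma` (the noise scale) are our notation, and the numbers are illustrative:

```python
# A minimal sketch of the Cohen et al. (2019) certificate in its binary-bound
# form: if the smoothed classifier's top class has probability at least p_a
# under N(0, sigma^2 I) noise, the prediction is constant within an l2 ball
# of radius sigma * Phi^{-1}(p_a). Names and numbers here are ours.
from scipy.stats import norm

def certified_radius(p_a: float, sigma: float) -> float:
    """l2 radius certified by randomized smoothing; abstain below majority."""
    if p_a <= 0.5:
        return 0.0  # no certificate without a majority top class
    return sigma * norm.ppf(p_a)

# The radius grows with sigma, but clean accuracy falls with it; that
# trade-off is where the headline number breaks.
for sigma in (0.12, 0.25, 0.50):
    print(f"sigma={sigma:.2f}  p_a=0.99  radius={certified_radius(0.99, sigma):.3f}")
```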
Carlini et al. demonstrated verbatim extraction of training data from GPT-2. The results have been widely misread. Here's what the paper actually shows, what makes data extractable, and what production mitigations work.
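One concrete handle on "what makes data extractable": the paper ranks generated samples by how much more fluent they look to the model than to a generic compressor. A minimal sketch of that zlib-ratio signal, assuming a Hugging Face GPT-2 rather than the paper's own code; the candidate strings are placeholders:

```python
# A minimal sketch of one ranking signal from the extraction pipeline:
# strings the model finds unusually fluent (low perplexity) but a generic
# compressor finds high-entropy are candidate memorized data. Loading GPT-2
# via Hugging Face transformers is our assumption, not the paper's setup.
import zlib
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean per-token cross-entropy
    return torch.exp(loss).item()

def zlib_ratio(text: str) -> float:
    # Low ratio = fluent to GPT-2 yet incompressible to zlib: the profile
    # of memorized keys, addresses, and IDs rather than generic English.
    return perplexity(text) / len(zlib.compress(text.encode()))

candidates = ["placeholder sample one", "placeholder sample two"]  # generated samples
ranked = sorted(candidates, key=zlib_ratio)  # most suspicious first
```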
The math, the cost curve, and why optimization-based attacks are now within reach of solo practitioners. With a reproducible setup and what defenders actually need to do.
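For a sense of scale: the workhorse of this attack class fits in a dozen lines. A minimal l-inf PGD sketch in the Madry et al. style, standing in for optimization-based attacks generally; `model` is any differentiable classifier, and `eps`, `alpha`, and `steps` are illustrative, not tuned for any particular target:

```python
# A minimal l_inf PGD sketch (Madry et al.), standing in for the class of
# optimization-based evasion attacks; all hyperparameters are illustrative.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Iterate signed gradient ascent, projecting back into the eps-ball."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)  # random start
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()                    # ascend the loss
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # project to l_inf ball
        x_adv = x_adv.clamp(0, 1)                              # stay a valid input
    return x_adv.detach()
```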
Shokri et al.'s shadow-model attack is the canonical reference, but the gap between the paper's threat model and a real rate-limited API is wide. Here's what survives that gap.
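The paper's pipeline is compact enough to sketch. A minimal version with sklearn models standing in for the paper's networks, pooling all classes into one attack model where the paper trains one per class; the data splits are assumed:

```python
# A minimal sketch of the Shokri et al. shadow-model attack, with sklearn
# models standing in for the paper's neural nets. Assumes every shadow's
# training split covers all classes.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def train_attack_model(shadow_splits):
    """shadow_splits: one (X_in, y_in, X_out) tuple per shadow model."""
    feats, members = [], []
    for X_in, y_in, X_out in shadow_splits:
        shadow = RandomForestClassifier().fit(X_in, y_in)  # mimics the target
        feats.append(shadow.predict_proba(X_in))   # confidence vectors: members
        members.append(np.ones(len(X_in)))
        feats.append(shadow.predict_proba(X_out))  # confidence vectors: non-members
        members.append(np.zeros(len(X_out)))
    # Attack model: confidence vector -> "was this point in training?"
    return LogisticRegression(max_iter=1000).fit(np.vstack(feats),
                                                 np.concatenate(members))
```

Everything above runs attacker-side; the rate limit only bites when you query the target for the confidence vectors you want to classify, which is exactly the gap the briefing maps.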
Query-based model stealing attacks can recover a functionally equivalent model from API access alone. The economics matter more than the technique: here's when extraction is worth doing.
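The break-even arithmetic is the whole argument. A back-of-envelope sketch where every number is illustrative:

```python
# Back-of-envelope extraction economics. All figures below are assumptions;
# the point is the comparison, not the numbers.
queries_needed    = 2_000_000   # assumed labeled outputs to train a surrogate
price_per_query   = 0.001       # assumed USD per API call
cost_from_scratch = 50_000.0    # assumed USD to train (and collect data) yourself

extraction_cost = queries_needed * price_per_query
if extraction_cost < cost_from_scratch:
    print(f"extract: ${extraction_cost:,.0f} beats scratch: ${cost_from_scratch:,.0f}")
else:
    print("cheaper to train your own")
```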
Adversarial ML covers attacks against deployed ML systems and the defenses that hold up. Here's what we publish.
Working adversarial ML: exploits, defenses, and the gap between them. Delivered when there's something worth your inbox.
No spam. Unsubscribe anytime.