📝 Selected Publications
(* indicates equal contribution. Full list of publications)
🚀 Something is Coming Soon™ (Probably). Status: Thinking hard 🤔 …

Black-box Optimization of LLM Outputs by Asking for Directions
[ICLR Trustworthy AI workshop 2026, Spotlight Talk]

Position: Adversarial ML Problems Are Getting Harder to Solve and to Evaluate
[ICML 2026, IEEE SP 2025, DLSP workshop]

RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics
[NeurIPS 2025, Dataset & Benchmark Track]

Membership Inference Attacks on Sequence Models
[IEEE SP 2025, DLSP workshop, Best Paper Award]

Blind Baselines Beat Membership Inference Attacks for Foundation Models
[IEEE SP 2025, DLSP workshop]

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents
[NeurIPS 2024, Dataset & Benchmark Track]



