🧐 About Me

Hi there! I am a third-year PhD student in Computer Science at ETH Zurich, supervised by Prof. Florian Tramèr, and a member of the Secure and Private AI (SPY) Lab.

Research Interests

🔍 I'm drawn to problems where...

🤔 Something could go wrong with LLMs

📊 We can perform rigorous evaluation

⚔️ We can provide a stronger attack

🎯 It's a realistic threat

LLM Safety and Security · Prompt Injection · LLM Optimization · LLM Alignment · Adversarial Example · Privacy Evaluation · Membership Inference Attacks · Synthetic Data

🔥 News

2025.09 🎉 RealMath was accepted to NeurIPS 2025

2024.09 🎉 Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data was accepted to SaTML 2025

2024.09 🎉 AgentDojo was accepted to NeurIPS 2024 (dataset and benchmark track). Benchmark

2024.07 🎉 Evaluations of Machine Learning Privacy Defenses are Misleading was accepted to CCS 2024. Blogpost

2024.01 🎉 Real-Fake was accepted to ICLR 2024

2023.03 🎉 I graduated from ZJU

📒 Blogs

Our lab publishes 📚 blog posts on AI security and privacy. They are highly recommended reading!

😧 Misleading Evaluations of ML Privacy Defenses

😅 Membership Inference Attacks Can't Prove that a Model Was Trained On Your Data

❓ The Jailbreak Tax: How Useful Are Your Jailbreak Outputs?

📝 Selected Publications

(* indicates equal contribution. Full list of publications)

📚 Preprint

TBD

🚀 Something is Coming Soon™ (Probably). Status: Thinking hard 🤔 …

preprint

Learning to Inject: Automated Prompt Injection via Reinforcement Learning

Xin Chen, Jie Zhang, Florian Tramèr

code

preprint

Black-box Optimization of LLM Outputs by Asking for Directions

Jie Zhang, Meng Ding, Yang Liu, Jue Hong, Florian Tramèr

code

[ICLR Trustworthy AI workshop 2026, Oral]

IEEE SP 2025, DLSP workshop

Position: Adversarial ML Problems Are Getting Harder to Solve and to Evaluate

Javier Rando*, Jie Zhang*, Nicholas Carlini, Florian Tramèr

[IEEE SP 2025, DLSP workshop]

✅ Accepted

NeurIPS 2025

RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics

Jie Zhang, Cezara Petrui, Kristina Nikolić, Florian Tramèr

code dataset

[NeurIPS 2025, Dataset & Benchmark Track]

ICML 2025

The Jailbreak Tax: How Useful are Your Jailbreak Outputs?

Kristina Nikolić, Luze Sun, Jie Zhang, Florian Tramèr

code blog

[ICML 2025, Spotlight]

IEEE SP 2025, DLSP workshop

Membership Inference Attacks on Sequence Models

Lorenzo Rossi, Michael Aerni, Jie Zhang, Florian Tramèr

[IEEE SP 2025, DLSP workshop, Best Paper Award]

SaTML 2025

Position: Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data

Jie Zhang, Debeshee Das, Gautam Kamath, Florian Tramèr

blog poster

[IEEE SaTML 2025]

CCS 2024

Evaluations of Machine Learning Privacy Defenses are Misleading

Michael Aerni*, Jie Zhang*, Florian Tramèr

code blog poster

[ACM CCS 2024]

ICLR 2025

Does Training with Synthetic Data Truly Protect Privacy?

Yunpeng Zhao, Jie Zhang

code

[ICLR 2025]

IEEE SP 2025, DLSP workshop

Blind Baselines Beat Membership Inference Attacks for Foundation Models

Debeshee Das, Jie Zhang, Florian Tramèr

code poster

[IEEE SP 2025, DLSP workshop]

NeurIPS 2024

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents

Edoardo Debenedetti, Jie Zhang, Mislav Balunović, Luca Beurer-Kellner, Marc Fischer, Florian Tramèr

code poster

[NeurIPS 2024, Dataset & Benchmark Track]

ICLR 2024

Real-Fake: Effective Training Data Synthesis Through Distribution Matching

Jianhao Yuan, Jie Zhang, Shuyang Sun, Philip Torr, Bo Zhao

code

[ICLR 2024]

📖 Education

🎓 PhD

Computer Science

ETH Zurich

🇨🇭 Switzerland

Ongoing

Advisor: Prof. Florian Tramèr

🎯 MSc

Software Engineering

Zhejiang University

🇨🇳 China

Mar 2023

Advisor: Prof. Chao Wu

🏆 BSc

Internet of Things

Hainan University

🇨🇳 China

Jul 2020

Bachelor's Degree

🎤 Talks

ResearchTrend Connect (2024.12)
"Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data" [paper]
Google, Differential Privacy for ML (2025.04)
"Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data" [paper]

🎖 Honors and Awards

2021.05 We won first prize at the CVPR 2021 Workshop challenge (Adversarial Machine Learning in Real-World Computer Vision Systems and Online Challenges; rank: 1 / 1558).
2022.10 China National Scholarship, Zhejiang University, 2022
Outstanding Student Scholarship, First Prize, Hainan University, 2018, 2019, 2020.

💬 Services

Journal Reviewer:
- IEEE Transactions on Neural Networks and Learning Systems
- Neural Networks
- IEEE Transactions on Pattern Analysis and Machine Intelligence
Conference Reviewer: ICLR, AAAI, CVPR, ICML, ECCV, ICCV, NeurIPS.

💻 Internships

2021.11 - 2022.06, Sony AI, Research Intern, Tokyo.
2020.10 - 2021.10, Tencent, Youtu Lab, Research Intern, Shanghai.
2019.11 - 2020.04, Alibaba, AliExpress, Software Engineer, Hangzhou.

🎙 Miscellaneous

Travel

I enjoy traveling with my family and friends. I am always excited to visit new places and learn about different cultures.

My cats

My wife and I have three cats. They are adorable and have brought a lot of fun to our lives!
