TRUE AI Lab, Aug 2025 — Group Lunch at ICML 2025
TRUE AI Lab, Dec 2024 — Zach Presented His NeurIPS Paper
TRUE AI Lab, Dec 2022 — Sanghyun Presented NeurIPS Paper (Oral)

The TRUE AI Lab

We are the Trustworthy and Responsible (TRUE) AI Lab, directed by Professor Sanghyun Hong in the School of Computer Science at Oregon State University. We build security foundations for AI systems by studying how they make decisions, where they fail, and how they can be made safer. We also apply these secure and safe AI systems to strengthen cybersecurity.

🛡️ AI Security
⚙️ AI Reliability
🤖 Agentic AI Systems
🔐 AI for Cybersecurity

To learn more, see our recent research projects.

Recent Publications
Adversarial Robustness of Implicit Neural Representation Based Classifiers
ICML 2026
Jayoung Kim, Kookjin Lee, Noseong Park, and Sanghyun Hong
When Can You Poison Rewards? A Tight Characterization of Reward Poisoning in Linear MDPs
ICML 2026
Jose Efraim Aguilar Escamilla, Haoyang Hong, Jiawei Li, Haoyu Zhao, Xuezhou Zhang, Sanghyun Hong, and Huazheng Wang
AgentBreaker: A Framework for Evaluating Context-Aware Indirect Prompt Injection Risks in Modern Web Agents
ISSTA 2026
Yongbi Son, Changoo Lee, Dongwon Shin, Byoungyoung Lee, Sanghyun Hong, and Sooel Son
Site Isolation is Dead: How Site Isolation is Broken in Agentic Browsers and Extensions
IEEE S&P 2026
Suyoung Lee, Seongho Keum, Changoo Lee, Dongwon Shin, Sanghyun Hong, Byoungyoung Lee, and Sooel Son
IF-Guide: Influence Function-Guided Detoxification of LLMs
NeurIPS 2025
Zachary Coalson, Juhan Bae, Nicholas Carlini, and Sanghyun Hong
Demystifying the Resilience of Large Language Model Inference: An End-to-End Perspective
SC 2025
Yu Sun, Zachary Coalson, Shiyang Chen, Hang Liu, Sanghyun Hong, Zhao Zhang, Bo Fang, and Lishan Yang
Harnessing Input-adaptive Inference for Efficient VLN
ICCV 2025
Dongwoo Kang, Akhil Perincherry, Zachary Coalson, Aiden Gabriel, Stefan Lee, and Sanghyun Hong
Private Investigator: Extracting Personally Identifiable Information from Large Language Models Using Optimized Prompts
USENIX Security 2025
Seongho Keum, Dongwon Shin, Leo Marchyok, Sanghyun Hong, and Sooel Son
Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models
NeurIPS 2024
Yuxin Wen, Leo Marchyok, Sanghyun Hong, Jonas Geiping, Tom Goldstein, and Nicholas Carlini
Parameterized Physics-informed Neural Networks for Parameterized PDEs
ICML 2024 | 🏆 Oral
Woojin Cho, Minju Jo, Haksoo Lim, Kookjin Lee, Dongeun Lee, Sanghyun Hong, and Noseong Park

View all publications →

Talks & Presentations
IF-Guide: Influence Function-Guided Detoxification of LLMs (SlidesLive)
Zachary Coalson · NeurIPS 2025 · Dec 2025
RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning (SlidesLive)
Jose Escamilla · NeurIPS 2024 · Dec 2024
Great Haste Makes Great Waste: Exploiting and Attacking Efficient Deep Learning
Sanghyun Hong · TrustML Scientist Seminar · Mar 2023
Handcrafted Backdoors in Deep Neural Networks (SlidesLive)
Sanghyun Hong · NeurIPS 2022 · Dec 2022
A Sound Mind in a Vulnerable Body: Practical Hardware Attacks on Deep Learning
Sanghyun Hong · USENIX Enigma 2021 · Jan 2021
Terminal Brain Damage: Exposing the Graceless Degradation in Deep Neural Networks Under Hardware Fault Attacks
Sanghyun Hong · USENIX Security 2019 · Aug 2019
Sponsors