TRUE AI Lab, Aug 2025 — Group Lunch at ICML 2025

TRUE AI Lab, Dec 2024 — Zach Presented His NeurIPS Paper

TRUE AI Lab, Dec 2022 — Sanghyun Presented NeurIPS Paper (Oral)

The TRUE AI Lab

We are the Trustworthy and Responsible (TRUE) AI Lab, directed by Professor Sanghyun Hong in the School of Computer Science at Oregon State University. We build security foundations for AI systems by studying how they make decisions, where they fail, and how they can be made safer. We also use those secure and safe AI systems to strengthen cybersecurity.

🛡️

AI Security/Privacy

⚙️

AI Reliability

🤖

Agentic AI Systems

🔐

AI for Cybersecurity

To learn more, see our recent research projects.

Recent Publications

Adversarial Robustness of Implicit Neural Representation Based Classifiers

ICML 2026

Jayoung Kim, Kookjin Lee, Noseong Park, and Sanghyun Hong

PDF Code Poster

When Can You Poison Rewards? A Tight Characterization of Reward Poisoning in Linear MDPs

ICML 2026

Jose Efraim Aguilar Escamilla, Haoyang Hong, Jiawei Li, Haoyu Zhao, Xuezhou Zhang, Sanghyun Hong, and Huazheng Wang

PDF Code Poster

AgentBreaker: A Framework for Evaluating Context-Aware Indirect Prompt Injection Risks in Modern Web Agents

ISSTA 2026

Yongbi Son, Changoo Lee, Dongwon Shin, Byoungyoung Lee, Sanghyun Hong, and Sooel Son

Site Isolation is Dead: How Site Isolation is Broken in Agentic Browsers and Extensions

IEEE S&P 2026

Suyoung Lee, Seongho Keum, Changoo Lee, Dongwon Shin, Sanghyun Hong, Byoungyoung Lee, and Sooel Son

PDF Code

IF-Guide: Influence Function-Guided Detoxification of LLMs

NeurIPS 2025

Zachary Coalson, Juhan Bae, Nicholas Carlini, and Sanghyun Hong

PDF Code

Demystifying the Resilience of Large Language Model Inference: An End-to-End Perspective

SC 2025

Yu Sun, Zachary Coalson, Shiyang Chen, Hang Liu, Sanghyun Hong, Zhao Zhang, Bo Fang, and Lishan Yang

PDF Code

Harnessing Input-adaptive Inference for Efficient VLN

ICCV 2025

Dongwoo Kang, Akhil Perincherry, Zachary Coalson, Aiden Gabriel, Stefan Lee, and Sanghyun Hong

PDF Code

Private Investigator: Extracting Personally Identifiable Information from Large Language Models Using Optimized Prompts

USENIX Sec. 2025

Seongho Keum, Dongwon Shin, Leo Marchyok, Sanghyun Hong, and Sooel Son

PDF Code

Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models

NeurIPS 2024

Yuxin Wen, Leo Marchyok, Sanghyun Hong, Jonas Geiping, Tom Goldstein, and Nicholas Carlini

PDF Code Poster

Parameterized Physics-informed Neural Networks for Parameterized PDEs

ICML 2024 | 🏆 Oral

Woojin Cho, Minju Jo, Haksoo Lim, Kookjin Lee, Dongeun Lee, Sanghyun Hong, and Noseong Park

PDF Code Poster

View all publications →

Talks & Presentations

IF-Guide: Influence Function-Guided Detoxification of LLMs

Zachary Coalson · NeurIPS 2025

SlidesLive Paper

· Dec 2025

RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning

Jose Escamilla · NeurIPS 2024

SlidesLive Paper

· Dec 2024

Great Haste Makes Great Waste: Exploiting and Attacking Efficient Deep Learning

Sanghyun Hong · TrustML Scientist Seminar

YouTube

· Mar 2023

Handcrafted Backdoors in Deep Neural Networks

Sanghyun Hong · NeurIPS 2022

SlidesLive Paper

· Dec 2022

A Sound Mind in a Vulnerable Body: Practical Hardware Attacks on Deep Learning

Sanghyun Hong · USENIX Enigma 2021

YouTube Paper

· Jan 2021

Terminal Brain Damage: Exposing the Graceless Degradation in Deep Neural Networks Under Hardware Fault Attacks

Sanghyun Hong · USENIX Security 2019

YouTube Paper

· Aug 2019

Sponsors