Research


Research Interests

My research sits at the intersection of machine learning and security, with a focus on large language models. I’m particularly interested in:

  • Security of LLM code assistants: Understanding how AI coding assistants affect the security of the code developers actually write.
  • Secure code generation: Diagnosing why LLMs produce insecure code and building targeted, mechanistic interventions to fix it.
  • LLM attacks and defenses: Prompt injection, goal hijacking, and adversarial fine-tuning as a defense.
  • Mechanistic interpretability: Using interpretability to locate and steer the internal computations responsible for insecure behavior.

I received my PhD from NYU, where I was advised by Siddharth Garg and Brendan Dolan-Gavitt and was a member of the EnSuRe Research Lab.

Publications

2026

  • Sandoval, G., Dolan-Gavitt, B., & Garg, S. “Surgical Repair of Insecure Code Generation in LLMs: From Mechanistic Diagnosis to Deployment-Ready Intervention.” arXiv preprint arXiv:2604.16697. [paper]

2025

  • Sandoval, G., Fenchenko, D., & Chen, J. “Early Approaches to Adversarial Fine-Tuning for Prompt Injection Defense: A 2022 Study of GPT-3 and Contemporary Models.” arXiv preprint arXiv:2509.14271. [paper]

2023

  • Sandoval, G., Pearce, H., Nys, T., Karri, R., Garg, S., & Dolan-Gavitt, B. “Lost at C: A User Study on the Security Implications of Large Language Model Code Assistants.” 32nd USENIX Security Symposium (USENIX Security ’23). [paper] [PDF] (G. Sandoval and H. Pearce contributed equally.)

Collaborations

I’m always interested in collaborating with other researchers in machine learning, security, and interpretability. Feel free to reach out if you’re working on related problems.