David Glukhov

PhD Student at University of Toronto & Vector Institute.

I’m David, a first-year PhD student in Computer Science at the University of Toronto and the Vector Institute, working with Prof. Nicolas Papernot and Prof. Vardan Papyan.

I am interested in formalizing desiderata for secure and reliable generative AI. To that end, I have used an information-theoretic lens to formalize the commonly described goal of preventing adversaries from learning problematic information, demonstrated empirical and theoretical limitations of current safety evaluations and defense methods, and proved a safety-utility tradeoff. To illustrate the challenge, I have proposed mosaic prompts, an attack method that decomposes an impermissible task into dual-use, permissible sub-tasks posed to a victim model, enabling jailbreak-free attacks that bypass existing defense methods. I am now studying “hallucinations” in generative models, with the aim of understanding why, when, and how they occur.

selected publications

  1. Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?
    A. Einstein*†, B. Podolsky*, and N. Rosen*
    Phys. Rev., New Jersey, May 1935