CSCI 699: Adversarial and Trustworthy Foundation Models (Spring 2026)


To Students: This website is under construction; please check back frequently. Course logistics may change.

Catalogue Description: Security, privacy, and trust in large language, vision-language, and agent models; adversarial attacks and defenses.

Course Description: This advanced graduate seminar examines security, privacy, and trust in large-scale AI systems, including large language models (LLMs), vision-language models (VLMs), and autonomous agents. The course integrates theoretical foundations with recent research on adversarial attacks and defenses, robustness evaluation, alignment, transparency, and trustworthy deployment of foundation models.

Recommended Preparation: Deep learning and machine learning knowledge at the level of CSCI 566 and CSCI 567 (these are not formal prerequisites); familiarity with Python and modern AI frameworks (e.g., PyTorch). Prior exposure to large-scale ML, trustworthy AI, or security topics is helpful but not required.

Basic logistics:

Learning Objectives: Students will learn to evaluate the security, privacy, and robustness properties of foundation-model systems; analyze threat models and attack vectors; design evaluation experiments and metrics; assess defenses and alignment methods; and complete a research-grade project on adversarial and trustworthy AI.

Assessment: Paper Presentations (30%), Homework/Experimental Assignments (10%), Semester Research Project (50%), Participation (10%).

Course Format: Weekly seminar with student-led paper presentations, in-class discussion, and a semester-long project. Materials and submissions will be managed via Brightspace and related tooling (e.g., Gradescope/GitHub Classroom when applicable).