Curriculum Vitae

shashwatnow@gmail.com

Education

PhD in Artificial Intelligence, 2024–present
ELLIS Institute and Max Planck Institute for Intelligent Systems, Tübingen
Advisors: Jonas Geiping and Douwe Kiela
B.Tech. and M.S. (by Research) in Computer Science Engineering, 2019–2024
International Institute of Information Technology (IIIT), Hyderabad
GPA: 9.60/10
Thesis: New Frontiers for Machine Unlearning, advised by Prof. Ponnurangam K.

Experience

Researcher, Stanford Existential Risks Initiative ML Alignment Theory Scholars (SERI MATS), July–Dec 2023
Mentor: Dan Hendrycks
Quantitative Research Intern, Central Research Team, Millennium India, May–June 2023
Project: AutoML for Tree-based and linear ensembles to find alpha across datasets
Research Intern, Social Choice Theory, LAMSADE, CNRS, May–July 2022
Advisors: Jerome Lang, Dominik Peters
Research Assistant, Language Evolution, Santa Fe Institute, July–Sept 2021
Mentor: Tanmoy Chakroborty
Developer, Distributed Computing Laboratory, Summer@EPFL, May–June 2021
Mentors: Matteo Monti, Rachid Guerraroui
Research Developer, Apertium, Google Summer of Code, April–Aug 2020
Mentors: Mikel Forcada, Jorge Gracia

Publications

  1. Great Models Think Alike and this Undermines AI Oversight
    Shashwat Goel, Joschka Strüber, Ilze Amanda Auzina, Karuna Chandra, P. Kumaraguru, Douwe Kiela, Ameya Prabhu, Matthias Bethge, Jonas Geiping
    ArXiv preprint, 2025.
    [code], [tool], [data]

  2. Corrective Machine Unlearning
    Shashwat Goel*, Ameya Prabhu*, Philip Torr, P. Kumaraguru, Amartya Sanyal
    Transactions on Machine Learning Research (TMLR), 2024.
    Workshop on Data-centric Machine Learning (DMLR) - Recommended for Journal (Top 15) at the 12th International Conference on Learning Representations (ICLR), 2024.
    [twitter], [code]

  3. The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning
    Center for AI Safety, Scale AI
    International Conference on Machine Learning (ICML), 2024.
    [media], [webpage], [code]

  4. Proportional Aggregation of Preferences for Sequential Decision Making
    Nikhil Chandak, Shashwat Goel, Dominik Peters
    Outstanding Paper Award (top 3 out of 12,000+ submissions) at the 38th Annual Conference of the Association for the Advancement of Artificial Intelligence (AAAI), 2024.
    [twitter], [talk]

  5. Representation Engineering: A Top-Down Approach to AI Transparency
    Center for AI Safety
    ArXiv, 2023.
    [talk], [webpage], [code]

  6. Probing Negation in Language Models
    Shashwat Singh*, Shashwat Goel*, Saujas Vaduguru, Ponnurangam Kumaraguru
    8th Workshop on Representation Learning for NLP (RepL4NLP)
    61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
    [code]

  7. Towards Adversarial Evaluations of Inexact Machine Unlearning
    Shashwat Goel*, Ameya Prabhu*, Amartya Sanyal, Ser-Nam Lim, Philip Torr, Ponnurangam Kumaraguru
    ArXiv, 2023.
    [code]

Honours and Awards

Teaching Experience

Academic Service and Outreach

University Groups

last updated: July 23, 2024