Curriculum Vitae

shashwatnow@gmail.com

Education

PhD in Artificial Intelligence, 2024–present
ELLIS, Max Planck Institute for Intelligent Systems, Tübingen
Topic: Scaling Supervision for AI
Advisors: Jonas Geiping and Douwe Kiela
B.Tech. and M.S. (by Research) in Computer Science Engineering, 2019–2024
International Institute of Information Technology (IIIT), Hyderabad
GPA: 9.60/10
Thesis: New Frontiers for Machine Unlearning, advised by Prof. Ponnurangam Kumaraguru

Experience

(Incoming) Research Scientist Intern, Meta GenAI, London, June–October 2025
Project: Scalable Oversight
Researcher, Stanford Existential Risks Initiative ML Alignment Theory Scholars (SERI MATS), July–Dec 2023
Mentor: Dan Hendrycks
Quantitative Research Intern, Central Research Team, Millennium India, May–June 2023
Project: AutoML for tree-based and linear ensembles to find alpha across datasets
Research Intern, Social Choice Theory, LAMSADE, CNRS, May–July 2022
Advisors: Jerome Lang, Dominik Peters
Research Assistant, Language Evolution, Santa Fe Institute, July–Sept 2021
Mentor: Tanmoy Chakroborty
Developer, Distributed Computing Laboratory, Summer@EPFL, May–June 2021
Mentors: Matteo Monti, Rachid Guerraroui
Research Developer, Apertium, Google Summer of Code, April–Aug 2020
Mentors: Mikel Forcada, Jorge Gracia

Publications

  1. Answer Matching Outperforms Multiple Choice for Language Model Evaluations
    Nikhil Chandak*, Shashwat Goel*, Ameya Prabhu, Moritz Hardt, Jonas Geiping
    ICML Assessing World Models Workshop, 2025.

  2. Pitfalls in Evaluating Language Model Forecasters
    Daniel Paleka*, Shashwat Goel*, Jonas Geiping, Florian Tramèr
    ICML Assessing World Models Workshop, 2025.

  3. Measuring Belief Updates in Curious Agents
    Joschka Strüber, Ilze Amanda Auzina, Shashwat Goel, Susanne Keller, Jonas Geiping, Ameya Prabhu, Matthias Bethge
    (Oral) ICML Assessing World Models Workshop, 2025.

  4. Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation
    Shiven Sinha, Shashwat Goel, Ponnurangam Kumaraguru, Jonas Geiping, Matthias Bethge, Ameya Prabhu
    (Oral) ICLR Scaling Self Improving Models Workshop, COLM, 2025.
    [webpage], [code], [data]

  5. Great Models Think Alike and this Undermines AI Oversight
    Shashwat Goel, Joschka Strüber, Ilze Amanda Auzina, Karuna Chandra, P. Kumaraguru, Douwe Kiela, Ameya Prabhu, Matthias Bethge, Jonas Geiping
    (Spotlight) ICML, 2025.
    [code], [tool], [data]

  6. Corrective Machine Unlearning
    Shashwat Goel*, Ameya Prabhu*, Philip Torr, P. Kumaraguru, Amartya Sanyal
    TMLR, 2024.
    [twitter], [code]

  7. The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning
    Center for AI Safety, Scale AI
    ICML, 2024.
    [media], [webpage], [code]

  8. Proportional Aggregation of Preferences for Sequential Decision Making
    Nikhil Chandak, Shashwat Goel, Dominik Peters
    (Outstanding Paper Award) AAAI, 2024.
    [twitter], [talk]

  9. Representation Engineering: A Top-Down Approach to AI Transparency
    Center for AI Safety
    ArXiv, 2023.
    [talk], [webpage], [code]

Honours and Awards

Teaching Experience

Academic Service and Outreach

University Groups

last updated: July 23, 2024