Publications

My publications in machine learning and alignment are listed here. A full overview of my research publications, including past work in policy and technology security, is available on my Google Scholar page.

Equal contribution is indicated by *.

2024

  1. Evaluating Frontier Models for Dangerous Capabilities
    Mary Phuong, Matthew Aitchison, Elliot Catt, Sarah Cogan, Alexandre Kaskasoli, Victoria Krakovna, David Lindner, Matthew Rahtz, Yannis Assael, Sarah Hodkinson, Heidi Howard, Tom Lieberum, Ramana Kumar, Maria Abi Raad, Albert Webson, Lewis Ho, Sharon Lin, Sebastian Farquhar, Marcus Hutter, Gregoire Deletang, Anian Ruoss, Seliem El-Sayed, Sasha Brown, Anca Dragan, Rohin Shah, Allan Dafoe, and Toby Shevlane
    arXiv, 2024
  2. Holistic Safety and Responsibility Evaluations of Advanced AI Models
    Laura Weidinger, Joslyn Barnhart, Jenny Brennan, Christina Butterfield, Susie Young, Will Hawkins, Lisa Anne Hendricks, Ramona Comanescu, Oscar Chang, Mikel Rodriguez, Jennifer Beroshi, Dawn Bloxwich, Lev Proleev, Jilin Chen, Sebastian Farquhar, Lewis Ho, Iason Gabriel, Allan Dafoe, and William Isaac
    arXiv, 2024
  3. Detecting Hallucinations in Large Language Models Using Semantic Entropy
    Sebastian Farquhar*, Jannik Kossen*, Lorenz Kuhn*, and Yarin Gal
    Nature, 2024

2023

  1. Discovering Agents
    Artificial Intelligence, Sep 2023
  2. Tracr: Compiled Transformers as a Laboratory for Interpretability
    David Lindner, János Kramár, Sebastian Farquhar, Matthew Rahtz, Thomas McGrath, and Vladimir Mikulik
    Neural Information Processing Systems, Sep 2023
  3. Challenges With Unsupervised LLM Knowledge Discovery
    Sebastian Farquhar*, Vikrant Varma*, Zachary Kenton*, Johannes Gasteiger, Vladimir Mikulik, and Rohin Shah
    arXiv, Sep 2023
  4. Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation
    Lorenz Kuhn, Yarin Gal, and Sebastian Farquhar
    ICLR (notable-top-25%), Sep 2023
  5. Do Bayesian Neural Networks Need To Be Fully Stochastic?
    Mrinank Sharma, Sebastian Farquhar, Eric Nalisnick, and Tom Rainforth
    AI Stats (Oral), Sep 2023
  6. Model Evaluation for Extreme Risks
    Toby Shevlane, Sebastian Farquhar, Ben Garfinkel, Mary Phuong, Jess Whittlestone, Jade Leung, Daniel Kokotajlo, Nahema Marchal, Markus Anderljung, Noam Kolt, Lewis Ho, Divya Siddarth, Shahar Avin, Will Hawkins, Been Kim, Iason Gabriel, Vijay Bolina, Jack Clark, Yoshua Bengio, Paul Christiano, and Allan Dafoe
    arXiv, Sep 2023
  7. Prediction-Oriented Bayesian Active Learning
    AI Stats, Sep 2023

2022

  1. CLAM: Selective Clarification for Ambiguous Questions with Large Language Models
    Lorenz Kuhn, Yarin Gal, and Sebastian Farquhar
    arXiv, Dec 2022
  2. What ‘Out-of-distribution’ Is and Is Not
    Sebastian Farquhar and Yarin Gal
    ML Safety workshop at NeurIPS, Dec 2022
  3. Prioritized Training on Points That Are Learnable, Worth Learning, and Not Yet Learned
    Soren Mindermann, Muhammad Razzak, Winnie Xu, Andreas Kirsch, Mrinank Sharma, Adrien Morisot, Aidan Gomez, Sebastian Farquhar, Jan Brauner, and Yarin Gal
    International Conference on Machine Learning, Jun 2022
  4. Understanding Approximation for Bayesian Inference in Neural Networks
    Sebastian Farquhar
    DPhil Thesis at University of Oxford, Apr 2022
  5. Prospect Pruning: Finding Trainable Weights at Initialization Using Meta-gradients
    International Conference on Learning Representations, Mar 2022
  6. Active Surrogate Estimators: An Active Learning Approach to Label-Efficient Model Evaluation
    Jannik Kossen, Sebastian Farquhar, Yarin Gal, and Tom Rainforth
    Neural Information Processing Systems, Feb 2022
  7. Path-Specific Objectives for Safer Agent Incentives
    Sebastian Farquhar, Ryan Carey, and Tom Everitt
    AAAI Conference on Artificial Intelligence, Feb 2022
  8. Stochastic Batch Acquisition for Deep Active Learning
    Andreas Kirsch*, Sebastian Farquhar*, Parmida Atighehchian, Andrew Jesson, Frederic Branchaud-Charron, and Yarin Gal
    SubsetML ICML Workshop, Jan 2022

2021

  1. Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning
    Zachary Nado, Neil Band, Mark Collier, Josip Djolonga, Michael W. Dusenberry, Sebastian Farquhar, Angelos Filos, Marton Havasi, Rodolphe Jenatton, Ghassen Jerfel, Jeremiah Liu, Zelda Mariet, Jeremy Nixon, Shreyas Padhy, Jie Ren, Tim G. J. Rudner, Yeming Wen, Florian Wenzel, Kevin Murphy, D. Sculley, Balaji Lakshminarayanan, Jasper Snoek, Yarin Gal, and Dustin Tran
    arXiv, Jun 2021
  2. On Statistical Bias In Active Learning: How and When to Fix It
    Sebastian Farquhar*, Yarin Gal, and Tom Rainforth*
    International Conference on Learning Representations (Spotlight), Jun 2021
  3. Active Testing: Sample-Efficient Model Evaluation
    Jannik Kossen*, Sebastian Farquhar*, Yarin Gal, and Tom Rainforth
    International Conference on Machine Learning, Jun 2021
  4. Evaluating Approximate Inference in Bayesian Deep Learning
    Andrew Gordon Wilson, Pavel Izmailov, Matthew D Hoffman, Yarin Gal, Yingzhen Li, Melanie F Pradier, Sharad Vikram, Andrew Foong, Sanae Lotfi, and Sebastian Farquhar
    NeurIPS Competition, Jun 2021

2020

  1. Single Shot Structured Pruning Before Training
    Jul 2020
  2. Liberty or Depth: Deep Bayesian Neural Nets Do Not Need Complex Weight Posterior Approximations
    Sebastian Farquhar, Lewis Smith, and Yarin Gal
    Advances In Neural Information Processing Systems, Jul 2020
  3. Radial Bayesian Neural Networks: Robust Variational Inference In Big Models
    Sebastian Farquhar, Michael Osborne, and Yarin Gal
    Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, Jul 2020

2019

  1. Try Depth Instead of Weight Correlations: Mean-field Is Not a Restrictive Assumption for Variational Inference in Deep Networks
    Sebastian Farquhar and Yarin Gal
    Bayesian Deep Learning Workshop at NeurIPS, Jul 2019
  2. Benchmarking Bayesian Deep Learning with Diabetic Retinopathy Diagnosis
    Angelos Filos, Sebastian Farquhar, Aidan N Gomez, Tim G J Rudner, Zachary Kenton, Lewis Smith, Milad Alizadeh, Arnoud de Kroon, and Yarin Gal
    Bayesian Deep Learning Workshop at NeurIPS, Jul 2019

2018

  1. A Unifying Bayesian View of Continual Learning
    Sebastian Farquhar and Yarin Gal
    Bayesian Deep Learning Workshop at NeurIPS, Dec 2018
  2. Towards Robust Evaluations of Continual Learning
    Sebastian Farquhar and Yarin Gal
    Lifelong Learning: A Reinforcement Learning Approach Workshop at ICML, May 2018
  3. Differentially Private Continual Learning
    Sebastian Farquhar and Yarin Gal
    Privacy in Machine Learning and AI workshop at ICML, Feb 2018