8:00 - 9:00
9:00 - 9:15
9:15 - 10:10
Lucilla Sioli (EU AI Office)
Cozmin Ududec (UK AISI): Transcript Analysis and How it Relates to Technical Governance
10:10 - 10:25
Coffee Break
10:25 - 11:15
Spotlight Talks (Session 1)
1. Measuring What Matters: A Framework for Evaluating Safety Risks in Real-World LLM Applications (Goh et al.)
2. CALMA: Context-Aligned Axes for Language Model Alignment (Soni et al.)
3. Deprecating Benchmarks: Criteria and Framework (San Joaquin et al.)
4. LLMs Can Covertly Sandbag On Capability Evaluations Against Chain-of-Thought Monitoring (Li et al.)
11:15 - 12:15
Marta Ziosi (Oxford)
Jat Singh (Cambridge)
TBC
12:15 - 14:00
Poster Session
Office Hours with UK AISI, EU AI Office, CAISI
14:00 - 15:00
Shayne Longpre (MIT): In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI
Victor Ojewale (Brown): Technical AI Governance in Practice: What Tools Miss, and Where We Go Next
15:00 - 16:00
Coffee Break & Poster Session
16:00 - 16:50
Spotlight Talks (Session 2)
5. Distributed and Decentralised Training: Technical Governance Challenges in a Shifting AI Landscape (Kryś et al.)
6. Evaluating LLM Agent Adherence to Hierarchical Principles: A Lightweight Benchmark for Verifying AI Safety Plan Components (Ram Potham)
7. Trends in AI Supercomputers (Pilz et al.)
8. Hardware-Enabled Mechanisms for Verifying Responsible AI Development (O'Gara et al.)
16:50 - 17:00