8:00 - 9:00
9:00 - 9:15
9:15 - 10:10
Lucilla Sioli (EU AI Office)
Cozmin Ududec (UK AISI): Transcript Analysis and How it Relates to Technical Governance
10:10 - 10:25
Coffee Break
10:25 - 11:15
Spotlight Talks (Session 1)
1. Measuring What Matters: A Framework for Evaluating Safety Risks in Real-World LLM Applications (Goh et al.)
2. CALMA: Context-Aligned Axes for Language Model Alignment (Soni et al.)
3. Deprecating Benchmarks: Criteria and Framework (San Joaquin et al.)
4. LLMs Can Covertly Sandbag On Capability Evaluations Against Chain-of-Thought Monitoring (Li et al.)
11:15 - 12:15
Marta Ziosi (Oxford)
Jat Singh (Cambridge)
TBC
12:15 - 14:00
Poster Session
Office Hours with UK AISI, EU AI Office, CAISI
14:00 - 15:00
Shayne Longpre (MIT): In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI
Victor Ojewale (Brown): Technical AI Governance in Practice: What Tools Miss, and Where We Go Next
15:00 - 16:00
Coffee Break & Poster Session
16:00 - 16:50
Spotlight Talks (Session 2)
5. Distributed and Decentralised Training: Technical Governance Challenges in a Shifting AI Landscape (Kryś et al.)
6. Evaluating LLM Agent Adherence to Hierarchical Principles: A Lightweight Benchmark for Verifying AI Safety Plan Components (Ram Potham)
7. Trends in AI Supercomputers (Pilz et al.)
8. Hardware-Enabled Mechanisms for Verifying Responsible AI Development (O'Gara et al.)
16:50 - 17:00