6 Podcast Episodes
“Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla” by Neel Nanda, Tom Lieberum, Matthew Rahtz, János Kramár, Geoffrey Irving, Rohin Shah, vlad_m
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. Cross-posting a paper from the Goo...
20 Jul 2023
[Week 5] “AI safety via debate” by Geoffrey Irving, Paul Christiano and Dario Amodei
Abstract: To make AI systems broadly useful for challenging real-world tasks, we need them to learn complex human goals ...
12 May 2023
AF - AXRP Episode 16 - Preparing for Debate AI with Geoffrey Irving by DanielFilan
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist ...
1 Jul 2022
•
54mins
16 - Preparing for Debate AI with Geoffrey Irving
Many people in the AI alignment space have heard of AI safety via debate - check out AXRP episode 6 if you need a prime...
1 Jul 2022
•
1hr 4mins
AF - Learning the smooth prior by Geoffrey Irving
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist ...
29 Apr 2022
•
18mins
AIAP: AI Alignment through Debate with Geoffrey Irving
See full article here: https://futureoflife.org/2019/03/06/ai-alignment-through-debate-with-geoffrey-irving/ "To make AI ...
7 Mar 2019
•
1hr 10mins