44 Podcast Episodes
“Mech Interp Puzzle 2: Word2Vec Style Embeddings” by Neel Nanda
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. Code can be found here. No prior k...
28 Jul 2023
“Tiny Mech Interp Projects: Emergent Positional Embeddings of Words” by Neel Nanda
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. This post was written in a rush an...
20 Jul 2023
“Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla” by Neel Nanda, Tom Lieberum, Matthew Rahtz, János Kramár, Geoffrey Irving, Rohin Shah, vlad_m
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. Cross-posting a paper from the Goo...
20 Jul 2023
Neel Nanda - Mechanistic Interpretability
In this wide-ranging conversation, Tim Scarfe interviews Neel Nanda, a researcher at DeepMind working on mechanistic int...
18 Jun 2023 • 4hr 10mins
Concrete open problems in mechanistic interpretability | Neel Nanda | EAG London 23
https://www.youtube.com/watch?v=dR-dju32ViI
17 Jun 2023 • 52mins
Episode 220 w Neel Nanda & Friends
Patreon: www.patreon.com/thetastelessgentlemen Alex: www.instagram.com/tasteless_alex/ Dom: www.instagram.com/djdomking/ www...
18 May 2023 • 1hr 4mins
Neel Nanda on Math, Tech Progress, Aging, Living up to Our Values, and Generative AI
Neel Nanda joins the podcast for a lightning round on mathematics, technological progress, aging, living up to our value...
23 Feb 2023 • 34mins
Neel Nanda on Avoiding an AI Catastrophe with Mechanistic Interpretability
Neel Nanda joins the podcast to talk about mechanistic interpretability and how it can make AI safer. Neel is an indepen...
16 Feb 2023 • 1hr 1min
Neel Nanda on What is Going on Inside Neural Networks
Neel Nanda joins the podcast to explain how we can understand neural networks using mechanistic interpretability. Neel i...
9 Feb 2023 • 1hr 4mins
Neel Nanda
The hilarious Neel Nanda joins Todd in the barn!
10 Jun 2022 • 1hr 34mins