15 Podcast Episodes
The current alignment plan, and how we might improve it | Buck Shlegeris | EAG Bay Area 23
In this session, Buck is discussing how he thinks we should try to align artificial general intelligen...
26 May 2023 • 50mins
[Week 1] “Worst-case thinking in AI alignment” by Buck Shlegeris, 2021
Alternative title: “When should you assume that what could go wrong, will go wrong?” Thanks to Mary Phuong and Ryan Gree...
13 May 2023
[Week 4] “Supervising strong learners by amplifying weak experts” by Paul Christiano, Buck Shlegeris & Dario Amodei
Abstract: Many real world learning tasks involve complex or hard-to-specify objectives, and using an easier-to-specify p...
12 May 2023
[Week 1] “Worst-case thinking in AI alignment” by Buck Shlegeris, 2021
Alternative title: “When should you assume that what could go wrong, will go wrong?” Thanks to Mary Phuong and Ryan Gree...
11 May 2023
AF - Polysemanticity and Capacity in Neural Networks by Buck Shlegeris
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist ...
7 Oct 2022 • 4mins
SERI 2022: AI alignment and Redwood Research | Buck Shlegeris (CTO)
Buck Shlegeris is the CTO of Redwood Research. Buck previously worked at MIRI, studied computer science and physics at t...
12 Aug 2022 • 29mins
Taking pleasure in being wrong (with Buck Shlegeris)
How hard is it to arrive at true beliefs about the world? How can you find enjoyment in b...
8 Jun 2022 • 1hr 16mins
AF - Adversarial training, importance sampling, and anti-adversarial training for AI whistleblowing by Buck Shlegeris
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist ...
2 Jun 2022 • 4mins
AF - The prototypical catastrophic AI action is getting root access to its datacenter by Buck Shlegeris
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist ...
2 Jun 2022 • 3mins
AF - The case for becoming a black-box investigator of language models by Buck Shlegeris
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist ...
6 May 2022 • 4mins