168 - How to Solve AI Alignment with Paul Christiano

Paul Christiano runs the Alignment Research Center, a non-profit research organization whose mission is to align future machine learning systems with human interests.
Apr 22, 2023

Inside the episode

Paul previously ran the language model alignment team at OpenAI, the creators of ChatGPT.

Today, we explore the solution landscape for the AI Alignment problem, with Paul as our guide.

In today’s episode, Paul answers many questions, but the overarching ones are:

1) How BIG is the AI Alignment problem?

2) How HARD is the AI Alignment problem?

3) How SOLVABLE is the AI Alignment problem?

Does humanity have a chance? Tune in to hear Paul’s thoughts.


0:00 Intro

9:20 Percentage Likelihood of Death by AI

11:24 Timing

19:15 Chimps to Human Jump

21:55 Thoughts on ChatGPT

27:51 LLMs & AGI

32:49 Time to React?

38:29 AI Takeover

41:51 AI Agency

49:35 Loopholes

51:14 Training AIs to Be Honest

58:00 Psychology

59:36 How Solvable Is the AI Alignment Problem?

1:03:48 The Technical Solutions (Scalable Oversight)

1:16:14 Training AIs to be Bad?!

1:18:22 More Solutions

1:21:36 Stabby AIs

1:26:03 Public vs. Private (Lab) AIs

1:28:31 Inside Neural Nets

1:32:11 4th Solution

1:35:00 Manpower & Funding

1:38:15 Pause AI?

1:43:29 Resources & Education on AI Safety

1:46:13 Talent

1:49:00 Paul’s Day Job

1:50:15 Nobel Prize

1:52:35 Treating AIs with Respect

1:53:41 Utopia Scenario

1:55:50 Closing & Disclaimers


Alignment Research Center


Paul Christiano’s Website


