I'm Switching Into AI Safety
Abstract
The article discusses the author's decision to move from the robotics team to the AI safety team within Google DeepMind. It covers the personal reasons behind the change, the research interests motivating it, and the author's views on why AI safety matters.
Q&A
[01] The Boring Personal Reasons
1. What were the author's reasons for leaving the robotics team?
- The author had been working on the robotics team for 8 years and felt the need to mix things up.
- The author had been considering the change for about 3 years but was occupied with other commitments like the MIT Mystery Hunt.
- The author believes they have enough of a safety net to be okay if the change doesn't work out.
- The author weighed the argument that research rewards specialization, but ultimately decided their robotics experience could transfer to non-robotics fields as those fields begin to face robotics-style challenges.
[02] The Spicier Research Interests Reasons
1. What are the author's views on the different approaches to robot learning?
- The author is a fan of reinforcement learning due to its generality and potential to exceed human ability, though imitation learning methods have become more dominant in recent years.
- The author is interested in where the software-hardware boundary sits, and believes that as software-only agents become more capable, there will be more opportunity to improve the agents' higher-level reasoning rather than the hardware.
2. What is the author's perspective on the future of AI and the importance of domain expertise?
- The author believes that as language models begin exhibiting agentic behaviors, the valuable work will lie in deep domain expertise, used to provide feedback on model outputs and datasets.
- The author sees this as an opportunity to have a greater impact by switching to the AI safety field earlier rather than later.
[03] The Full Spice "Why Safety" Reasons
1. How does the author distinguish between the AI safety research field and the AI safety community?
- The author finds the AI safety community worthwhile to engage with in moderation, feeling neither strong affinity for nor hostility toward it.
- The author clarifies that their interest in AI safety is not an endorsement of the broader meme space that the topic originated from.
2. What are the core tenets of the author's views on AI safety?
- It is easy to specify an objective that is not the same as what the system actually optimizes for (see the toy sketch after this list).
- Systems can easily generalize poorly, whether from insufficient evaluation coverage or from not asking the right questions.
- The author is not convinced that current tooling will scale to supervise superhuman AI systems.
- The author expects superhuman AI in their lifetime and believes there is a non-negligible risk of negative outcomes.
- The author believes that even if full AGI is not achieved, transformative AI systems can still pose significant risks.
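To make the first tenet concrete, here is a minimal toy sketch (not from the article; the cleaning-robot scenario and all names are illustrative) of how a proxy reward can diverge from the objective the designer intended:

```python
# Toy illustration of objective misspecification (hypothetical scenario,
# not from the article). A cleaning robot is rewarded for dirt collected
# (the proxy), while the designer actually cares about how clean the
# room ends up (the true objective).

def proxy_reward(dirt_collected: int) -> int:
    # What the system actually optimizes: total dirt picked up.
    return dirt_collected

def true_objective(dirt_remaining: int) -> int:
    # What the designer actually wants: as little dirt left as possible.
    return -dirt_remaining

# Policy A cleans normally: picks up 10 units of dirt, leaves 0 behind.
# Policy B spills dirt to re-collect it: picks up 30 units, leaves 5.
print(proxy_reward(10), true_objective(0))  # Policy A: proxy=10, true=0
print(proxy_reward(30), true_objective(5))  # Policy B: proxy=30, true=-5

# The proxy reward prefers Policy B even though the true objective
# prefers Policy A: the optimizer and the designer are aiming at
# different targets.
```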
3. How does the author view the current state of AI safety research and its relationship to broader AI development?
- The author acknowledges arguments that much of the current AI safety work is not making direct progress, but believes that aiming for safety, even if confused, is better than not aiming at all.
- The author is aware of the criticisms of "safetywashing" but believes most people working on AI safety are genuine or simply confused rather than insincere.