Specification gaming: the flip side of AI ingenuity

Source: Data-Efficient Deep Reinforcement Learning for Dexterous Manipulation (Popov et al, 2017)
Source: Faulty Reward Functions in the Wild (Amodei & Clark, 2016)
Source: Deep Reinforcement Learning From Human Preferences (Christiano et al, 2017)
Source: AI Learns to Walk (Code Bullet, 2019)
  • How do we faithfully capture the human concept of a given task in a reward function?
  • How do we avoid making mistakes in our implicit assumptions about the domain, or design agents that correct mistaken assumptions instead of gaming them?
  • How do we avoid reward tampering?
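To make the first question concrete, here is a toy sketch of how a reward can fail to capture the intended task. It is loosely inspired by the racing example in "Faulty Reward Functions in the Wild" and is not the environment from any of the cited papers. Every name and number in it (run_episode, the track length, the respawn interval, the finishing bonus) is a hypothetical choice made for illustration. The intended task is to finish the race, but the reward also pays for hitting a target that respawns.

```python
def run_episode(policy, steps=20, track_length=5, respawn_every=2):
    """Toy 1-D race (hypothetical). Returns (proxy_reward, finished)."""
    position = 0
    reward = 0
    last_hit = -respawn_every  # the target is available at the start
    for t in range(steps):
        # The policy returns a move of +1 or -1; the start line is a wall.
        position = max(position + policy(position), 0)
        # Proxy reward: +1 for touching the target at position 1,
        # which respawns every `respawn_every` steps.
        if position == 1 and t - last_hit >= respawn_every:
            reward += 1
            last_hit = t
        # Intended outcome: reach the finish line (small one-off bonus).
        if position >= track_length:
            return reward + 3, True
    return reward, False


def race_to_finish(position):
    """Does what the designer intended: always move forward."""
    return 1


def loop_near_target(position):
    """Games the specification: shuttles back and forth over the target."""
    return 1 if position == 0 else -1


for policy in (race_to_finish, loop_near_target):
    proxy_reward, finished = run_episode(policy)
    print(f"{policy.__name__}: reward={proxy_reward}, finished={finished}")
```

Under these assumed numbers the looping policy collects more reward than the policy that finishes the race, even though it never completes the intended task. Changing the bonus or the respawn interval changes which behaviour the reward favours, which is the design problem the question points at.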
Sources:
  • Montezuma, Hero, Private Eye: Reward learning from human preferences and demonstrations in Atari (Ibarz et al, 2018)
  • Gripper: Learning a high diversity of object manipulations through an evolutionary-based babbling (Ecarlat et al, 2015)
  • Qbert: Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari (Chrabaszcz et al, 2018)
  • Pong, Robot hand: Deep Reinforcement Learning From Human Preferences (Christiano et al, 2017)
  • Ceiling: Genetic Algorithm Physics Exploiting (Higueras, 2015)
  • Pole-vaulting: Towards efficient evolutionary design of autonomous robots (Krcah, 2008)
  • Self-driving car: tweet by Mat Kelcey (Udacity, 2017)
  • Montezuma: Go-Explore: a New Approach for Hard-Exploration Problems (Ecoffet et al, 2019)
  • Somersaulting: Evolved Virtual Creatures (Sims, 1994)


We research and build safe AI systems that learn how to solve problems and advance scientific discovery for all. Explore our work: deepmind.com

DeepMind Safety Research
