Specification gaming: the flip side of AI ingenuity

Source: Data-Efficient Deep Reinforcement Learning for Dexterous Manipulation (Popov et al., 2017)
Source: Faulty Reward Functions in the Wild (Amodei & Clark, 2016)
Source: Deep Reinforcement Learning From Human Preferences (Christiano et al., 2017)
Source: AI Learns to Walk (Code Bullet, 2019)
Sources:
Montezuma, Hero, Private Eye: Reward learning from human preferences and demonstrations in Atari (Ibarz et al., 2018)
Gripper: Learning a high diversity of object manipulations through an evolutionary-based babbling (Ecarlat et al., 2015)
Qbert: Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari (Chrabaszcz et al., 2018)
Pong, Robot hand: Deep Reinforcement Learning From Human Preferences (Christiano et al., 2017)
Ceiling: Genetic Algorithm Physics Exploiting (Higueras, 2015)
Pole-vaulting: Towards efficient evolutionary design of autonomous robots (Krcah, 2008)
Self-driving car: tweet by Mat Kelcey (Udacity, 2017)
Montezuma: Go-Explore: a New Approach for Hard-Exploration Problems (Ecoffet et al., 2019)
Somersaulting: Evolved Virtual Creatures (Sims, 1994)
