Alignment of Language Agents

Forms of misspecification in machine learning, together with examples in the language agent setting.
  • Training data misspecification can occur because we lack control over what enters a large-scale text dataset scraped from the web: such datasets contain hundreds of billions of words and encode many unwanted biases.
  • Training process misspecification can occur when a learning algorithm designed for one kind of problem is applied to a different kind of problem in which some of its assumptions no longer hold. For example, a question-answering system applied to a setting where its answers can affect the world may be incentivised to produce self-fulfilling prophecies.
  • Distributional shift misspecification can occur when we deploy the AI agent in the real world, which may differ from the training distribution. For example, the chatbot Tay behaved acceptably in its training environment, but quickly turned toxic when released to the wider internet, where some users deliberately attacked the service.
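The distributional shift failure can be illustrated with a toy sketch (a hypothetical example, not from the post): a simple classifier whose decision threshold is fit on one data distribution loses accuracy when the deployment distribution shifts, even though nothing about the model itself has changed.

```python
import numpy as np

# Toy illustration of distributional shift: a classifier fit under one
# data distribution can degrade sharply at deployment. All numbers here
# are illustrative assumptions, not from the post.
rng = np.random.default_rng(0)

def sample(n, shift=0.0):
    """Two Gaussian classes; `shift` moves the deployment distribution."""
    x0 = rng.normal(loc=-1.0 + shift, scale=1.0, size=n)
    x1 = rng.normal(loc=+1.0 + shift, scale=1.0, size=n)
    x = np.concatenate([x0, x1])
    y = np.concatenate([np.zeros(n), np.ones(n)])
    return x, y

# "Training": estimate a decision threshold from in-distribution data.
x_train, y_train = sample(1000)
threshold = 0.5 * (x_train[y_train == 0].mean() + x_train[y_train == 1].mean())

def accuracy(x, y):
    # Predict class 1 when x exceeds the learned threshold.
    return float(((x > threshold) == y).mean())

x_test, y_test = sample(1000)                    # same distribution as training
x_shifted, y_shifted = sample(1000, shift=2.0)   # shifted deployment distribution

print(f"in-distribution accuracy: {accuracy(x_test, y_test):.2f}")
print(f"after shift:              {accuracy(x_shifted, y_shifted):.2f}")
```

The threshold remains tuned to the training distribution, so once both class means move, in-distribution accuracy no longer predicts deployment accuracy; the same dynamic, at far greater scale, is what caught out Tay.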
Issues that can arise from any of the forms of misspecification, together with examples for language agents.


We research and build safe AI systems that learn how to solve problems and advance scientific discovery for all. Explore our work: deepmind.com


DeepMind Safety Research
