News  |  November 22, 2019

A new way of designing algorithms to avoid specific misbehaviors

News article by Tom Abate.
Published on the Stanford University – Engineering website.

Artificial intelligence has moved into the commercial mainstream thanks to the growing prowess of machine learning algorithms that enable computers to train themselves to do things like drive cars, control robots or automate decision-making.

But as AI starts handling sensitive tasks, such as helping pick which prisoners get bail, policy makers are insisting that computer scientists offer assurances that automated systems have been designed to minimize, if not completely avoid, unwanted outcomes such as excessive risk or racial and gender bias.

A team led by researchers at Stanford and the University of Massachusetts Amherst published a paper in November 2019 in Science suggesting how to provide such assurances. The paper outlines a new technique that translates a fuzzy goal, such as avoiding gender bias, into the precise mathematical criteria that would allow a machine-learning algorithm to train an AI application to avoid that behavior.
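To make the idea concrete, here is a minimal sketch (not the authors' implementation) of the kind of check such a technique implies: a candidate model is accepted only if, with high confidence on held-out data, a mathematical measure of the unwanted behavior stays within tolerance. The constraint statistic `g` and the normal-approximation confidence bound are illustrative assumptions, not details from the paper.

```python
import math
from statistics import NormalDist, fmean, stdev

def safety_test(g_values, delta=0.05):
    """Accept a candidate model only if the unwanted behavior is
    bounded with high confidence.

    g_values: per-sample estimates of (observed bias - tolerated bias),
              so g <= 0 means the behavior is acceptable.
    delta:    allowed probability of wrongly accepting an unsafe model.

    Returns True only if a (1 - delta) upper confidence bound on the
    mean of g is at most zero (normal approximation to a t-bound).
    """
    n = len(g_values)
    mean = fmean(g_values)
    s = stdev(g_values)
    z = NormalDist().inv_cdf(1 - delta)      # one-sided critical value
    upper_bound = mean + z * s / math.sqrt(n)
    return upper_bound <= 0.0

# Hypothetical held-out measurements: mostly within tolerance.
g_ok = [-0.12, -0.05, -0.20, 0.01, -0.15, -0.08, -0.11, -0.03, -0.09, -0.14]
# Measurements that clearly exceed the tolerated level.
g_bad = [0.1, 0.1, 0.1, 0.1, 0.1, -0.05, -0.05, -0.05, -0.05, -0.05]

print(safety_test(g_ok))   # accepted: bound on mean bias is below zero
print(safety_test(g_bad))  # rejected: cannot certify the constraint
```

The key design point mirrored here is that the burden of proof is inverted: when the data are insufficient to certify the constraint, the check fails and no model is returned, rather than returning a possibly unsafe one.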

“We want to advance AI that respects the values of its human users and justifies the trust we place in autonomous systems,” said Emma Brunskill, an assistant professor of computer science at Stanford and senior author of the paper. [...]


This method could help robots, self-driving cars and other intelligent machines safeguard against undesirable outcomes such as racial and gender bias.

Preventing undesirable behavior of intelligent machines, by Philip S. Thomas, Bruno Castro da Silva, Andrew G. Barto, Stephen Giguere, Yuriy Brun and Emma Brunskill. Science, November 2019.