David Gray Grant

Publications

"Embedded EthiCS: Integrating Ethics Broadly Across Computer Science Education" (with Barbara Grosz et al.), Communications of the Association for Computing Machinery, 2019 [final]

"Ethics and Artificial Intelligence in Public Health Social Work," in Artificial Intelligence and Social Work, ed. Milind Tambe and Eric Rice (Cambridge University Press), 2018 [penultimate]

Papers in Progress

"Ethics for Artificial Agents" [draft] [abstract]

"Explanation and Machine Learning" [abstract]

"What is an Automated Decision System?" (with Jay Hodges and Milo Phillips-Brown)

"Preventing Algorithmic Discrimination" [abstract]

Dissertation Summary

Ethics for Artificial Agents

In March of 2018, one of Uber's autonomous vehicles struck and killed a pedestrian. Throughout that same year, Facebook's automated systems failed to detect a coordinated effort to persecute the Rohingya minority in Myanmar, contributing significantly to a major humanitarian crisis. In both cases, an autonomous software system failed to behave as expected, resulting in tragic consequences.

My dissertation focuses on theoretical and applied problems in machine ethics, a relatively new subfield of computer ethics that focuses on the ethical challenges involved in designing autonomous software agents. Autonomous software agents are artificial agents that are capable of deciding what to do in novel situations even when they have not been given explicit instructions about how to proceed by their human operators. Machine ethicists ask both how, from a moral point of view, artificial agents ought to make these decisions, and how, from an engineering point of view, they can be designed to make decisions that way.

In the first part of my dissertation, I consider what moral standards apply to the behavior of artificial agents. According to a prominent view that I call the agential theory of machine ethics, the answer is straightforward: the same moral standards that apply to the behavior of human agents. To determine how we ought to design an artificial agent to act in a given situation, we need only ask how a human agent would be obligated to act in that situation. I distinguish two motivations for this view. The first is that artificial agents are moral agents, and so are subject to the same moral standards as human agents. The second is that the agential theory yields plausible results in concrete cases like those mentioned above.

I argue that both motivations fail. The first fails because artificial agents are not moral agents. Being a moral agent requires the capacity to form, and be guided by, one's beliefs about what one ought to do. It is deeply implausible that contemporary artificial agents such as autonomous cars and robot vacuum cleaners possess this capacity. The second fails because the agential theory often makes incorrect predictions about how an artificial agent ought to be designed to act, as I demonstrate with a series of counterexamples. For example, because patients respond very differently to human therapists and "virtual therapists" (as recent empirical research demonstrates), different standards of conduct are appropriate for the two types of agents. I conclude that answering questions about how artificial agents ought to be designed will require developing new, domain-specific moral frameworks that take into account the morally significant differences between human and artificial agents.

Once we determine what moral standards should govern the behavior of an artificial agent, a new problem arises: how can we ensure that the agent's behavior will respect those standards? The second part of my dissertation explores this question by developing and analyzing a case study based on research I conducted at USC's Center for Artificial Intelligence in Society. The case study focuses on the development of an artificial agent to help plan public health interventions targeting homeless youth in Los Angeles. I argue that interventions planned by the agent must respect two moral demands that sometimes conflict: maximizing population-level benefits, and respecting each individual homeless youth's right not to be harmed. After outlining a framework to guide tradeoffs between these two demands (drawing on the public health ethics literature), I consider how the agent might be designed to balance them appropriately (drawing on the AI safety literature). In brief, the strategy I suggest is to use a combination of (1) simple rules the agent can readily apply to rule out obviously unacceptable intervention plans and (2) "safety conditions" that prompt human users for further input in more difficult cases. The result is a "division of epistemic labor" between artificial agents (which can apply simple rules very rapidly) and their human users (who are far more sensitive to moral nuances). This simple but effective strategy generalizes to a wide range of applied problems in machine ethics.
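The division of labor described above can be made concrete with a small sketch. The code below is purely illustrative and is not drawn from the dissertation or from the CAIS system: the plan attributes, the thresholds, and the ask_human callback are hypothetical stand-ins for whatever rules and safety conditions a real deployment would use.

```python
# Illustrative sketch of the "simple rules + safety conditions" strategy.
# All names, attributes, and thresholds here are hypothetical.

from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class InterventionPlan:
    expected_benefit: float      # estimated population-level benefit (0 to 1)
    max_individual_risk: float   # worst-case risk of harm to any one youth (0 to 1)


# (1) Simple rules: cheap checks the agent applies on its own to rule out
# obviously unacceptable plans.
def violates_hard_rules(plan: InterventionPlan) -> bool:
    # Hypothetical rule: no plan may impose more than a fixed level of risk
    # on any individual, regardless of aggregate benefit.
    return plan.max_individual_risk > 0.2


# (2) Safety conditions: signals that the case is too morally nuanced to
# settle automatically, so the agent should prompt a human user.
def triggers_safety_condition(plan: InterventionPlan) -> bool:
    # Hypothetical condition: both benefit and individual risk are high,
    # so the tradeoff calls for human judgment.
    return plan.expected_benefit > 0.8 and plan.max_individual_risk > 0.1


def select_plan(
    candidates: List[InterventionPlan],
    ask_human: Callable[[InterventionPlan], bool],
) -> Optional[InterventionPlan]:
    acceptable = []
    for plan in candidates:
        if violates_hard_rules(plan):
            continue                 # agent filters these out rapidly on its own
        if triggers_safety_condition(plan) and not ask_human(plan):
            continue                 # human user rejects the borderline plan
        acceptable.append(plan)
    # Among plans that survive both checks, maximize population-level benefit.
    return max(acceptable, key=lambda p: p.expected_benefit, default=None)
```

The point of the sketch is the architecture, not the particular numbers: the agent handles the fast, rule-governed filtering, while morally difficult tradeoffs are escalated to human users, who are far more sensitive to the relevant nuances.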