The Intuitive Appeal of Explainable Machines

Abstract

As algorithmic decision-making has become synonymous with inexplicable decision-making, we have become obsessed with opening the black box. This Article responds to a growing chorus of legal scholars and policymakers demanding explainable machines. Their instinct makes sense; what is unexplainable is usually unaccountable. But the calls for explanation are a reaction to two distinct but often conflated properties of machine-learning models: inscrutability and non intuitiveness. Inscrutability makes one unable to fully grasp the model, while non intuitiveness means one cannot understand why the model’s rules are what they are. Solving inscrutability alone will not resolve law and policy concerns; accountability relates not merely to how models work, but whether they are justified.

In this Article, we first explain what makes models inscrutable as a technical matter. We then explore two important examples of existing regulation-by-explanation and techniques within machine learning for explaining inscrutable decisions. We show that while these techniques might allow machine learning to comply with existing laws, compliance will rarely be enough to assess whether decision-making rests on a justifiable basis.

We argue that calls for explainable machines have failed to recognize the connection between intuition and evaluation and the limitations of such an approach. A belief in the value of explanation for justification assumes that if only a model is explained, problems will reveal themselves intuitively. Machine learning, however, can uncover relationships that are both non-intuitive and legitimate, frustrating this mode of normative assessment. If justification requires understanding why the model’s rules are what they are, we should seek explanations of the process behind a model’s development and use, not just explanations of the model itself. This Article illuminates the explanation-intuition dynamic and offers documentation as an alternative approach to evaluating machine learning models.

Full abstract and research here:

http://blog.experientia.com/paper-intuitive-appeal-explainable-machines/

Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification∗

Joy Buolamwini joyab@mit.edu MIT Media Lab 75 Amherst St. Cambridge, MA 02139

Timnit Gebru timnit.gebru@microsoft.com Microsoft Research 641 Avenue of the Americas, New York, NY 10011

Recent studies demonstrate that machine learning algorithms can discriminate based on classes like race and gender. In this work, we present an approach to evaluate bias present in automated facial analysis algorithms and datasets with respect to phenotypic subgroups. Using the dermatologist approved Fitzpatrick Skin Type classification system, we characterize the gender and skin type distribution of two facial analysis benchmarks, IJB-A and Adience. We find that these datasets are overwhelmingly composed of lighter-skinned subjects (79.6% for IJB-A and 86.2% for Adience) and introduce a new facial analysis dataset which is balanced by gender and skin type. We evaluate 3 commercial gender classification systems using our dataset and show that darker-skinned females are the most misclassified group (with error rates of up to 34.7%). The maximum error rate for lighter-skinned males is 0.8%. The substantial disparities in the accuracy of classifying darker females, lighter females, darker males, and lighter males in gender classification systems require urgent attention if commercial companies are to build genuinely fair, transparent and accountable facial analysis algorithms.

Keywords: Computer Vision, Algorithmic Audit, Gender Classification

Full Research: http://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a.pdf

ETHICALLY ALIGNED DESIGN A Vision for Prioritizing Human Well-being with Autonomous and Intelligent Systems - IEEE

Introduction As the use and impact of autonomous and intelligent systems (A/IS) become pervasive, we need to establish societal and policy guidelines in order for such systems to remain human-centric, serving humanity’s values and ethical principles. These systems have to behave in a way that is beneficial to people beyond reaching functional goals and addressing technical problems. This will allow for an elevated level of trust between people and technology that is needed for its fruitful, pervasive use in our daily lives. To be able to contribute in a positive, non-dogmatic way, we, the techno-scientific communities, need to enhance our self-reflection, we need to have an open and honest debate around our imaginary, our sets of explicit or implicit values, our institutions, symbols and representations. Eudaimonia, as elucidated by Aristotle, is a practice that defines human well-being as the highest virtue for a society. Translated roughly as “flourishing,” the benefits of eudaimonia begin by conscious contemplation, where ethical considerations help us define how we wish to live. Whether our ethical practices are Western (Aristotelian, Kantian), Eastern (Shinto, Confucian), African (Ubuntu), or from a different tradition, by creating autonomous and intelligent systems that explicitly honor inalienable human rights and the beneficial values of their users, we can prioritize the increase of human well-being as our metric for progress in the algorithmic age. Measuring and honoring the potential of holistic economic prosperity should become more important than pursuing one-dimensional goals like productivity increase or GDP growth.

AI, Algorithms, Accountability of AI, Bias, Explanation, Machine Learning, ResearchBrigitte BellanDecember 17, 2017IEEE, Ethics, Well-Being, Design

Automated Experiments on Ad Privacy Settings A Tale of Opacity, Choice, and Discrimination

To partly address people’s concerns over web tracking, Google has created the Ad Settings webpage to provide information about and some choice over the profiles Google creates on users. We present AdFisher, an automated tool that explores how user behaviors, Google’s ads, and Ad Settings interact. AdFisher can run browser-based experiments and analyze data using machine learning and significance tests. Our tool uses a rigorous experimental design and statistical analysis to ensure the statistical soundness of our results. We use AdFisher to find that the Ad Settings was opaque about some features of a user’s profile, that it does provide some choice on ads, and that these choices can lead to seemingly discriminatory ads. In particular, we found that visiting webpages associated with substance abuse changed the ads shown but not the settings page. We also found that setting the gender to female resulted in getting fewer instances of an ad related to high paying jobs than setting it to male. We cannot determine who caused these findings due to our limited visibility into the ad ecosystem, which includes Google, advertisers, websites, and users. Nevertheless, these results can form the starting point for deeper investigations by either the companies themselves or by regulatory bodies.

Accountability of AI Under the Law: The Role of Explanation

The ubiquity of systems using artificial intelligence or “AI” has brought increasing attention to how those systems should be regulated. The choice of how to regulate AI systems will require care. AI systems have the potential to synthesize large amounts of data, allowing for greater levels of personalization and precision than ever before—applications range from clinical decision support to autonomous driving and predictive policing. That said, our AIs continue to lag in common sense reasoning [McCarthy, 1960], and thus there exist legitimate concerns about the intentional and unintentional negative consequences of AI systems [Bostrom, 2003, Amodei et al., 2016, Sculley et al., 2014].

Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data

Decisions based on algorithmic, machine learning models can be unfair, reproducing biases in historical data used to train them. While computational techniques are emerging to address aspects of these concerns through communities such as discrimination-aware data mining (DADM) and fairness, accountability and transparency machine learning (FATML), their practical implementation faces real-world challenges. For legal, institutional or commercial reasons, organisations might not hold the data on sensitive attributes such as gender, ethnicity, sexuality or disability needed to diagnose and mitigate emergent indirect discrimination-by-proxy, such as redlining. Such organisations might also lack the knowledge and capacity to identify and manage fairness issues that are emergent properties of complex sociotechnical systems. This paper presents and discusses three potential approaches to deal with such knowledge and information deficits in the context of fairer machine learning. Trusted third parties could selectively store data necessary for performing discrimination discovery and incorporating fairness constraints into model-building in a privacy-preserving manner. Collaborative online platforms would allow diverse organisations to record, share and access contextual and experiential knowledge to promote fairness in machine learning systems. Finally, unsupervised learning and pedagogically interpretable algorithms might allow fairness hypotheses to be built for further selective testing and exploration.

Algorithmic Accountability Reporting: On the Investigation of Black Boxes

How can we characterize the power that various algorithms may exert on us? And how can we better understand when algorithms might be wronging us? What should be the role of journalists in holding that power to account? In this report I discuss what algorithms are and how they encode power. I then describe the idea of algorithmic accountability, first examining how algorithms problematize and sometimes stand in tension with transparency.