Data-centric explanations: Explaining training data of machine learning systems to promote transparency

Anik, Md Ariful Islam

Date

2020-11

Abstract

Training datasets fundamentally impact the performance of machine learning systems. Any biases introduced during training, whether implicit or explicit, are often reflected in the system’s behavior, leading to questions about fairness and a loss of trust in the system. Yet information about training data is rarely communicated to stakeholders. In this thesis, I explore the concept of data-centric explanations for machine learning systems, which describe the training data to end-users, and I design explanations that provide this information. Through a formative study, I investigate the potential utility of such an approach and the data-centric information that users find most compelling. In a second study, I investigate reactions to the explanations across four different system scenarios. The results show that data-centric explanations can affect how users judge the trustworthiness of a system and can help users assess fairness. I discuss the implications of the findings for designing explanations to support users’ perception of machine learning systems.

Keywords

Machine Learning Systems, Explanations, Training Data, Transparency

URI

http://hdl.handle.net/1993/35244

Collections

FGS - Electronic Theses and Practica
