Quantitative Evaluation of the Expected Antagonism of Explainability and Privacy
|Date||Fri, 11 June 2021|
|Abstract||Explainable artificial intelligence (XAI) offers reasoning behind a model's behavior.
For many explainers, this reasoning reveals additional information about the inner workings of the model or even about its training data. As data privacy becomes an increasingly important issue, the question arises whether explainers can leak private data. So far, it is unclear which private data can be obtained from which kinds of explanation. In this thesis, I adapt three privacy attacks from machine learning to the field of XAI: model extraction, membership inference, and training data extraction. I assign the different kinds of explainers to these attack categories by argument and present specific use cases showing how an attacker can obtain private data from an explanation. In experiments, I demonstrate membership inference and training data extraction for two specific explainers. The results show that explainers can be used to breach privacy.
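
To make the two demonstrated attack types concrete, the following is a minimal toy sketch, not the method used in the thesis. The abstract does not name the two explainers, so the sketch assumes a hypothetical example-based explainer that justifies a prediction by returning the closest training record. All names (`nearest_example_explainer`, `infer_membership`) and the data points are invented for illustration.

```python
import math


def nearest_example_explainer(training_data, query):
    """Hypothetical example-based explainer (an assumption, not from the thesis):
    'explains' a query by returning the training record closest to it."""
    return min(training_data, key=lambda record: math.dist(record, query))


def infer_membership(explainer, query, tolerance=1e-9):
    """Guess that `query` is a training record if the explanation reproduces it."""
    explanation = explainer(query)
    return math.dist(explanation, query) <= tolerance


# Made-up "private" training records, for illustration only.
training_data = [(0.1, 0.9), (0.4, 0.4), (0.8, 0.2)]


def explain(query):
    # The attacker only sees this query interface, not training_data itself.
    return nearest_example_explainer(training_data, query)


# Training data extraction: every explanation is itself a training record,
# so probing different regions of the input space collects the records.
probes = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
leaked = {explain(q) for q in probes}

# Membership inference: a record is flagged as a member if the explainer
# hands it back unchanged.
is_member = infer_membership(explain, (0.4, 0.4))
is_non_member = infer_membership(explain, (0.5, 0.5))
```

In this toy setting, the explanation interface alone is enough for both attacks. Explainers that return attributions rather than examples would require different, typically statistical, attack strategies.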