Quantitative Evaluation of the Expected Antagonism of Explainability and Privacy
|Date||Fri, 11 June 2021|
|Abstract||Explainers for machine learning models help humans and models work together. They build trust in a model's decisions by giving further insight into the decision-making process. However, it is unclear whether this insight can also expose private information. My thesis asks whether there is a conflict of objectives between explainability and privacy, and how the effects of this conflict can be measured.
I propose two types of attack that can be applied against explainers: model extraction and inference of information about the training data. Differential privacy is introduced as a way to measure the privacy breach caused by these attacks. Finally, three specific use cases are presented in which explainers can realistically be abused to breach differential privacy.
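
For reference, the standard textbook definition of ε-differential privacy can be written as below. This is a sketch of the common formulation and not necessarily the exact variant used in the thesis; the symbols M (a randomized mechanism, for example a model together with its explainer), D and D' (neighboring datasets that differ in a single record), S (a set of possible outputs), and ε (the privacy budget) are assumptions introduced here for illustration.

```latex
% Standard \varepsilon-differential privacy (textbook form; notation assumed, not from the thesis).
% M: randomized mechanism, D and D': datasets differing in exactly one record,
% S: any measurable set of outputs, \varepsilon \ge 0: privacy budget.
\[
  \Pr\bigl[\, M(D) \in S \,\bigr]
  \;\le\;
  e^{\varepsilon} \cdot \Pr\bigl[\, M(D') \in S \,\bigr]
  \qquad \text{for all neighboring } D, D' \text{ and all } S .
\]
```

Under this definition, a smaller ε means that the outputs on D and D' are harder to tell apart. An attack that distinguishes the two cases more reliably than the bound permits would indicate that the stated guarantee does not hold.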