Quantitative Evaluation of the Expected Antagonism of Explainability and Privacy


Speaker: Martin Lange
Talk type: Proposal
Advisor: Clemens Müssener
Date: Fri, 11 June 2021
Presentation mode:
Abstract: Explainers for machine learning models help humans and models work together. They build trust in a model's decision by giving further insight into the decision-making process. However, it is unclear whether this insight can also expose private information. This thesis asks whether there is a conflict of objectives between explainability and privacy, and how the effects of such a conflict can be measured. Specifically, we look at local feature importance explainers.

We propose a use case in which a model's prediction for a person is considered that person's private data. An attacker might gain insight into the predictions for other people by abusing their own explanation to imitate the model's behavior. We will test this use case experimentally to determine whether such an attack is possible.
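
The attack idea can be illustrated with a minimal sketch. It assumes a LIME-style explainer that returns a local linear surrogate as feature importances; the data, model, and names below are illustrative assumptions, not the setup used in the thesis.

```python
# Hypothetical sketch: an attacker reuses the local linear surrogate behind their
# own feature-importance explanation to guess the model's predictions for others.
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge

rng = np.random.default_rng(0)

# "Private" model owned by the service provider (a black box to the attacker).
X_train = rng.normal(size=(500, 5))
y_train = (X_train @ np.array([1.5, -2.0, 0.5, 0.0, 1.0]) > 0).astype(int)
black_box = LogisticRegression().fit(X_train, y_train)

# The attacker's own record and explanation: a LIME-style local surrogate, here
# obtained by perturbing the attacker's record and fitting a linear model to the
# black-box scores. Its coefficients are the local feature importances.
x_attacker = rng.normal(size=5)
perturbations = x_attacker + rng.normal(scale=0.5, size=(200, 5))
scores = black_box.predict_proba(perturbations)[:, 1]
surrogate = Ridge(alpha=1.0).fit(perturbations, scores)

# Attack: apply the surrogate to other people's records (features assumed known)
# and compare the guessed predictions with the model's private predictions.
X_victims = x_attacker + rng.normal(scale=0.5, size=(50, 5))
guessed = (surrogate.predict(X_victims) > 0.5).astype(int)
actual = black_box.predict(X_victims)
print("agreement with private predictions:", (guessed == actual).mean())
```

The agreement score stands in for the kind of quantitative measure the thesis aims at: the closer the surrogate built from one explanation matches the model on other people's records, the stronger the tension between explainability and privacy in this use case.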