A comparative study of subgroup discovery methods: Unterschied zwischen den Versionen

Aus IPD-Institutsseminar
Zur Navigation springen Zur Suche springen
 
Zeile 4: Zeile 4:
 
|vortragstyp=Bachelorarbeit
 
|vortragstyp=Bachelorarbeit
 
|betreuer=Vadim Arzamasov
 
|betreuer=Vadim Arzamasov
|termin=Institutsseminar/2021-02-19
+
|termin=Institutsseminar/2021-02-19 Zusatztermin
 
|kurzfassung=Subgroup discovery is a data mining technique that is used to extract interesting relationships in a dataset related to to a target variable. These relationships are described in the form of rules. Multiple SD techniques have been developed over the years.  This thesis establishes a comparative study between a  number of these techniques in order to identify the state-of-the-art methods. It also analyses the effects discretization has on them as a preprocessing step . Furthermore, it investigates the effect of hyperparameter optimization on these methods.  
 
|kurzfassung=Subgroup discovery is a data mining technique that is used to extract interesting relationships in a dataset related to to a target variable. These relationships are described in the form of rules. Multiple SD techniques have been developed over the years.  This thesis establishes a comparative study between a  number of these techniques in order to identify the state-of-the-art methods. It also analyses the effects discretization has on them as a preprocessing step . Furthermore, it investigates the effect of hyperparameter optimization on these methods.  
  
 
Our analysis showed that PRIM, DSSD, Best Interval and FSSD outperformed the other subgroup discovery methods evaluated in this study and are to be considered state-of-the-art  . It also shows that discretization offers an efficiency improvement on methods that do not employ internal discretization. It has a negative impact on the quality of subgroups generated by methods that perform it internally. The results finally demonstrates that Apriori-SD and SD-Algorithm were the most positively affected by the hyperparameter optimization.
 
Our analysis showed that PRIM, DSSD, Best Interval and FSSD outperformed the other subgroup discovery methods evaluated in this study and are to be considered state-of-the-art  . It also shows that discretization offers an efficiency improvement on methods that do not employ internal discretization. It has a negative impact on the quality of subgroups generated by methods that perform it internally. The results finally demonstrates that Apriori-SD and SD-Algorithm were the most positively affected by the hyperparameter optimization.
 
}}
 
}}

Aktuelle Version vom 17. Februar 2021, 09:56 Uhr

Vortragende(r) Mohamed Amine Chalghoum
Vortragstyp Bachelorarbeit
Betreuer(in) Vadim Arzamasov
Termin Fr 19. Februar 2021
Kurzfassung Subgroup discovery is a data mining technique that is used to extract interesting relationships in a dataset related to to a target variable. These relationships are described in the form of rules. Multiple SD techniques have been developed over the years. This thesis establishes a comparative study between a number of these techniques in order to identify the state-of-the-art methods. It also analyses the effects discretization has on them as a preprocessing step . Furthermore, it investigates the effect of hyperparameter optimization on these methods.

Our analysis showed that PRIM, DSSD, Best Interval and FSSD outperformed the other subgroup discovery methods evaluated in this study and are to be considered state-of-the-art . It also shows that discretization offers an efficiency improvement on methods that do not employ internal discretization. It has a negative impact on the quality of subgroups generated by methods that perform it internally. The results finally demonstrates that Apriori-SD and SD-Algorithm were the most positively affected by the hyperparameter optimization.