Active Learning for experimental exploration: Difference between revisions

From SDQ-Institutsseminar
No edit summary
Line 6 (earlier revision):
|termin=Institutsseminar/2023-05-12
|vortragsmodus=in Präsenz
|kurzfassung=In this thesis, we work with rankings. A ranking is obtained by applying a set of encoders to an experimental condition (dataset, model, tuning, scoring) and ranking them according to their averaged cross-validation score. Furthermore, we can aggregate a set of rankings into a single consensus ranking, e.g. by taking the mean or median rank for each encoder. The goal of the thesis is to explore the space of possible consensus rankings while running as few experiments as possible, because running experiments can be time-consuming.
To make predictions on the consensus rankings, we employ a model capable of predicting the ranking of encoders given an experimental condition:
(dataset, model, tuning, scoring) → associated ranking of encoders
We can use this model to make predictions on the consensus rankings by taking a set of experimental conditions {E_1,...,E_N}, predicting their rankings, and aggregating the predictions into a consensus ranking.
For this task, we evaluated different models (Decision Trees, Random Forests, SVM) using various encoding schemes (One-hot, BaseN, Label), with and without meta features, using Kendall's tau as the evaluation metric. The Decision Tree achieved the best results thus far.
To this model, we apply active learning to avoid running unnecessary experiments. In active learning, the model can decide which data points should be labeled next, and thereby which data it is trained on, in order to achieve greater accuracy with fewer labeled training instances. In our case, labeling a data point is equivalent to obtaining the ranking of encoders for an experimental condition. Thus, we minimize the number of experiments to be run.
}}

Line 6 (current revision):
|termin=Institutsseminar/2023-05-12
|vortragsmodus=in Präsenz
|kurzfassung=A ranking is the result of running an experiment: a set of encoders is applied to an experimental condition (dataset, model, tuning, scoring), and the encoders are then ranked according to their performance.
To draw conclusions about the performance of the encoders across a set of experimental conditions, one can aggregate the rankings into a consensus ranking (e.g. by taking the median rank).
The goal of the thesis is to explore the space of consensus rankings and find all possible consensus rankings.
However, running an experiment is very time-consuming. Therefore, we use active learning to avoid running unnecessary experiments. In active learning, the learner can choose the data it is trained on, with the aim of achieving greater accuracy with fewer labeled data.
}}
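
The abstract above names Kendall's tau as the evaluation metric and describes active learning as letting the model decide which experimental condition to label next. The following is a minimal sketch of how the two could fit together, not the method used in the thesis. It assumes rankings are given as dictionaries mapping an encoder to its rank without ties, and it uses query-by-committee disagreement as the selection rule, which the source does not specify. The names `kendall_tau`, `most_disputed_condition`, `model_1`, `model_2`, `cond_a` and `cond_b` are hypothetical.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall's tau (tau-a) between two rankings of the same items.

    Both rankings are dicts item -> rank; ties are assumed not to occur.
    """
    concordant = discordant = 0
    for x, y in combinations(sorted(rank_a), 2):
        sign = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs


def most_disputed_condition(pool, committee):
    """Pick the unlabeled condition whose predicted rankings agree least.

    `committee` is a list of at least two callables, each mapping a
    condition to a predicted ranking (assumed interface).
    """
    def agreement(condition):
        predictions = [predict(condition) for predict in committee]
        taus = [kendall_tau(p, q) for p, q in combinations(predictions, 2)]
        return sum(taus) / len(taus)

    return min(pool, key=agreement)


# Hypothetical stand-ins for trained models: lookup tables of predictions.
model_1 = {
    "cond_a": {"one_hot": 1, "base_n": 2, "label": 3},
    "cond_b": {"one_hot": 1, "base_n": 2, "label": 3},
}
model_2 = {
    "cond_a": {"one_hot": 1, "base_n": 2, "label": 3},
    "cond_b": {"one_hot": 3, "base_n": 2, "label": 1},
}
committee = [model_1.get, model_2.get]

# The experiment for the returned condition would be run (labeled) next.
query = most_disputed_condition(["cond_a", "cond_b"], committee)
```

In this toy setup the two stand-in models agree on `cond_a` and fully disagree on `cond_b`, so `cond_b` is selected for labeling. Other selection rules are equally possible; this one was chosen only because it reuses the same metric.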

Latest revision as of 8 May 2023, 17:53

Speaker: Steven Lorenz
Talk type: Proposal
Advisor: Federico Matteucci
Date: Fri, 12 May 2023
Presentation mode: in person
Abstract: A ranking is the result of running an experiment: a set of encoders is applied to an experimental condition (dataset, model, tuning, scoring), and the encoders are then ranked according to their performance. To draw conclusions about the performance of the encoders across a set of experimental conditions, one can aggregate the rankings into a consensus ranking (e.g. by taking the median rank). The goal of the thesis is to explore the space of consensus rankings and find all possible consensus rankings. However, running an experiment is very time-consuming. Therefore, we use active learning to avoid running unnecessary experiments. In active learning, the learner can choose the data it is trained on, with the aim of achieving greater accuracy with fewer labeled data.
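
As an illustration of the aggregation step described in the abstract, the following sketch builds a consensus ranking by taking the median rank of each encoder across several rankings. It assumes each ranking is a dictionary mapping an encoder name to its rank (1 is best) and that all rankings cover the same encoders. The function name `consensus_ranking`, the alphabetical tie-break and the example ranks are assumptions made for illustration, not results from the thesis.

```python
from statistics import median


def consensus_ranking(rankings):
    """Aggregate rankings (dicts encoder -> rank, 1 = best) by median rank."""
    encoders = sorted(rankings[0])
    median_rank = {e: median(r[e] for r in rankings) for e in encoders}
    # Order by median rank; ties are broken alphabetically (arbitrary choice).
    ordered = sorted(encoders, key=lambda e: (median_rank[e], e))
    return ordered, median_rank


# Made-up rankings of three encoders under three experimental conditions.
rankings = [
    {"one_hot": 1, "base_n": 2, "label": 3},
    {"one_hot": 2, "base_n": 1, "label": 3},
    {"one_hot": 1, "base_n": 3, "label": 2},
]
order, med = consensus_ranking(rankings)
```

Here `order` lists the encoders from best to worst median rank and `med` holds the median rank per encoder. Replacing `median` with `statistics.mean` gives the mean-rank variant of the same aggregation.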