Active Learning for experimental exploration: Difference between revisions

Latest revision as of 17:53, 8 May 2023

Speaker	Steven Lorenz
Talk type	Proposal
Advisor	Federico Matteucci
Date	Fri, 12 May 2023
Presentation mode	in person
Abstract	A ranking is the result of running an experiment: a set of encoders is applied to an experimental condition (dataset, model, tuning, scoring), and the encoders are then ranked according to their performance. To draw conclusions about the performance of the encoders across a set of experimental conditions, one can aggregate the rankings into a consensus ranking (e.g., by taking the median rank of each encoder). The goal of the thesis is to explore the space of consensus rankings and find all possible consensus rankings. However, running an experiment is very time-consuming. We therefore use Active Learning to avoid running unnecessary experiments. In Active Learning, the learner can choose the data it is trained on and thereby achieve greater accuracy with fewer labeled data.
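As a minimal sketch of the aggregation step mentioned in the abstract, the following Python snippet combines several rankings into a consensus ranking by taking each encoder's median rank. The function name `consensus_ranking`, the encoder names, and the example ranks are hypothetical and only serve as illustration; the alphabetical tie-breaking is likewise an assumption, not something specified in the proposal.

```python
from statistics import median


def consensus_ranking(rankings):
    """Aggregate rankings by median rank.

    `rankings` is a list of dicts mapping encoder name -> rank (1 = best),
    one dict per experimental condition. All dicts are assumed to contain
    the same encoders.
    """
    encoders = sorted(rankings[0])
    # Median rank of each encoder across all experimental conditions.
    median_rank = {e: median(r[e] for r in rankings) for e in encoders}
    # Order encoders by median rank; ties are broken alphabetically
    # (an arbitrary choice made for this sketch).
    order = sorted(encoders, key=lambda e: (median_rank[e], e))
    return order, median_rank


# Hypothetical rankings from three experimental conditions.
rankings = [
    {"one_hot": 1, "label": 2, "base_n": 3},
    {"one_hot": 2, "label": 1, "base_n": 3},
    {"one_hot": 1, "label": 3, "base_n": 2},
]

order, median_rank = consensus_ranking(rankings)
print(order)
print(median_rank)
```

Other aggregation rules (for example the mean rank) could be swapped in by replacing `median` with a different statistic.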

@@ Line 6: / Line 6: @@
 |termin=Institutsseminar/2023-05-12
 |vortragsmodus=in Präsenz
-|kurzfassung=In this thesis, we are working with rankings. A ranking is obtained by applying a set of
+|kurzfassung=A ranking is the result of running an experiment: a set of encoders is applied to an
-encoders to an experimental condition (dataset, model, tuning, scoring) and rank them
+experimental condition (dataset, model, tuning, scoring), and the encoders are then ranked according to
-according to their averaged cv score. Furthermore, we can aggregate a set of rankings into a
+their performance.
-single consensus ranking, i.e. by taking the mean or median rank for each encoder. The goal
+To draw conclusions about the performance of the encoders for a set of experimental
-of the thesis is to explore the space of possible consensus rankings, while running as few
+conditions, one can aggregate the rankings into a consensus ranking (e.g., by taking the median
-experiments as possible because it can be a time-consuming task.
+rank).
-To make predictions on the consensus rankings, we employ a model capable of predicting
+The goal of the thesis is to explore the space of consensus rankings and find all possible
-the ranking of encoders given an experimental condition:
+consensus rankings.
-(dataset, model, tuning, scoring) → associated ranking of encoders
+However, running an experiment is a very time-consuming task. Therefore we utilize Active
-We can use this model to make predictions on the consensus rankings by taking a set of
+Learning to avoid running unnecessary experiments. In Active Learning, the learner can
-experimental conditions {E_1,...,E_N}, predict their rankings and aggregating the predictions
+choose the data it is trained on and thereby achieve greater accuracy with fewer labeled data.
-into a consensus ranking.
-For this task, we evaluated different models (Decision Trees, Random Forests, SVM) using
-various encoding schemes (One-hot, BaseN, Label), with and without the use of meta
-features and using kendalls tau as evaluation metric. The DecisionTree achieved the best
-results thus far.
-To this model, we apply active learning to avoid running unnecessary experiments. In active
-learning, the model can decide which data points should be labeled next and subsequently
-decide the data it is trained on, in order to achieve greater accuracy with fewer labeled
-training instances. In our case, labeling data points is equivalent to obtaining the ranking of
-encoders of an experimental condition. Thus, we are minimizing the amount of experiments
-to be run.
 }}