
The Institutsseminar of the Institute for Program Structures and Data Organization (IPD) is a standing course that informs about current research at the institute. In particular, it gives students at the institute the opportunity to present their bachelor's and master's theses to a larger audience. Talks focus on the problem statement, the proposed approaches, and the results obtained. The seminar is open to all students and staff of KIT as well as to anyone else who is interested.

Location Building 50.34, Seminar Room 348
Time Fridays, 11:30–13:00

Talks must stay within the following time limits:

  • Master's thesis: 30 minutes speaking time + 15 minutes discussion
  • Bachelor's thesis: 20 minutes speaking time + 10 minutes discussion
  • Proposal: 12 minutes speaking time + 8 minutes discussion

Further information: https://sdqweb.ipd.kit.edu/wiki/Institutsseminar. For questions and comments, you can send an e-mail to the Institutsseminar team.

Upcoming Talks

Friday, 22 November 2019, 11:30, Room 348 (Building 50.34)
Speaker Marco Heyden
Title Anytime Tradeoff Strategies with Multiple Targets
Talk type Master's thesis
Advisor Edouard Fouché
Abstract Modern applications typically need to find solutions to complex problems under limited time and resources. In settings in which the exact computation of indicators is either infeasible or economically undesirable, the use of “anytime” algorithms, which can return approximate results when interrupted, is particularly beneficial, since they offer a natural way to trade computational power for result accuracy.

However, modern systems typically need to solve multiple problems simultaneously. For example, to find high correlations in a dataset, one needs to examine each pair of variables. This is challenging, particularly if the number of variables is large and the data evolves dynamically.

This thesis focuses on the following question: How should one distribute resources at any point in time in order to maximize the overall quality of multiple targets? First, we define the problem, considering various notions of quality and user requirements. Second, we propose a set of strategies to tackle this problem. Finally, we evaluate our strategies via extensive experiments.

Speaker Florian Kalinke
Title Subspace Search in Data Streams
Talk type Master's thesis
Advisor Edouard Fouché
Abstract Modern data mining often takes place on high-dimensional data streams, which evolve at a very fast pace. On the one hand, the "curse of dimensionality" leads to a sparsely populated feature space, for which classical statistical methods perform poorly. Patterns, such as clusters or outliers, often hide in a few low-dimensional subspaces. On the other hand, data streams are non-stationary and virtually unbounded. Hence, algorithms operating on data streams must work incrementally and take concept drift into account.

While "high dimensionality" and the "streaming setting" each pose their own set of challenges, we observe that existing mining algorithms only address them separately. Thus, we plan to propose a novel algorithm that keeps track of the subspaces of interest in high-dimensional data streams over time. We quantify the relevance of subspaces via a so-called "contrast" measure, which we are able to maintain incrementally in an efficient way. Furthermore, we propose a set of heuristics to adapt the search for the relevant subspaces as the data and the underlying distribution evolve.

We show that our approach is beneficial as a feature selection method and, as such, can be applied to extend a range of knowledge discovery tasks, e.g., "outlier detection", in high-dimensional data streams.

Friday, 22 November 2019, 11:30, Room 010 (Building 50.34)

No talks