Semantische Suche

Freitag, 13. Mai 2022, 11:30 Uhr

iCal (Download)
Ort: Raum 010 (Gebäude 50.34)

Vortragende(r) Manuel Müllerschön
Titel Developing a Framework for Mining Temporal Data from Twitter as Basis for Time-Series Correlation Analysis
Vortragstyp Proposal
Betreuer(in) Fabian Richter
Vortragsmodus in Präsenz
Kurzfassung In the last decade, ample research has been produced regarding the value of user-generated data from microblogs as a basis for time series analysis in various fields.In this context, the objective of this thesis is to develop a domain-agnostic framework for mining microblog data (i.e., Twitter). Taking the subject related postings of a time series (e.g., inflation) as its input, the framework will generate temporal data sets that can serve as basis for time series analysis of the given target time series (e.g., inflation rate).

To accomplish this, we will analyze and summarize the prevalent research related to microblog data-based forecasting and analysis, with a focus on the data processing and mining approach. Based on the findings, one or several candidate frameworks are developed and evaluated by testing the correlation of their generated data sets against the target time series they are generated for.

While summative research on microblog data-based correlation analysis exists, it is mainly focused on summarizing the state of the field. This thesis adds to the body of research by applying summarized findings and generating experimental evidence regarding the generalizability of microblog data mining approaches and their effectiveness.

Vortragende(r) Moritz Teichner
Titel Standardized Real-World Change Detection Data
Vortragstyp Proposal
Betreuer(in) Florian Kalinke
Vortragsmodus in Präsenz
Kurzfassung The reliable detection of change points is a fundamental task when analysing data across many fields, e.g., in finance, bioinformatics, and medicine.

To define “change points”, we assume that there is a distribution, which may change over time, generating the data we observe. A change point then is a change in this underlying distribution, i.e., the distribution coming before a change point is different from the distribution coming after. The principled way to compare distributions, and to find change points, is to employ statistical tests.

While change point detection is an unsupervised problem in practice, i.e., the data is unlabelled, the development and evaluation of data analysis algorithms requires labelled data. Only few labelled real world data sets are publicly available and many of them are either too small or have ambiguous labels. Further issues are that reusing data sets may lead to overfitting, and preprocessing (e.g., removing outliers) may manipulate results. To address these issues, van den Burg et al. publish 37 data sets annotated by data scientists and ML researchers and use them for an assessment of 14 change detection algorithms. Yet, there remain concerns due to the fact that these are labelled by hand: Can humans correctly identify changes according to the definition, and can they be consistent in doing so?

The goal of this Bachelor's thesis is to algorithmically label their data sets following the formal definition and to also identify and label larger and higher-dimensional data sets, thereby extending their work. To this end, we leverage a non-parametric hypothesis test which builds on Maximum Mean Discrepancy (MMD) as a test statistic, i.e., we identify changes in a principled way. We will analyse the labels so obtained and compare them to the human annotations, measuring their consistency with the F1 score. To assess the influence of the algorithmic and definition-conform annotations, we will use them to reevaluate the algorithms of van den Burg et al. and compare the respective performances.

Freitag, 20. Mai 2022, 11:30 Uhr

iCal (Download)
Ort: MS Teams
Webkonferenz: https://sdqweb.ipd.kit.edu/wiki/SDQ-Oberseminar/Microsoft_Teams

Vortragende(r) Jonathan Schenkenberger
Titel Architectural Generation of Context-based Attack Paths
Vortragstyp Masterarbeit
Betreuer(in) Maximilian Walter
Vortragsmodus online
Kurzfassung In industrial processes (Industry 4.0) and other fields in our lives like the energy or health sector, the confidentiality of data becomes increasingly important. For the protection of confidential information on critical systems, it is crucial to be able to find relevant attack paths in different access-control contexts to a critical element. In order to minimize costs, it is important to already consider this issue in the design phase of the software architecture. There are already approaches considering the topic of attack path generation. However, they do not consider software architecture modeling or they do not consider both vulnerabilities and access control mechanisms. Hence, this thesis presents an approach for finding all potential attack paths in a software architecture model considering access control and vulnerabilities. However, all attack paths are often to many, so the approach presented here introduces and utilizes meaningful filter criteria based on wide-spread vulnerability classification standards.
Vortragende(r) Limanan Nursalim
Titel Automated Test Selection for CI Feedback on Model Transformation Evolution
Vortragstyp Masterarbeit
Betreuer(in) Timur Sağlam
Vortragsmodus online
Kurzfassung The development of the transformation model also comes with the appropriate system-level testing to verify its changes. Due to the complex nature of the transformation model, the number of tests increases as the structure and feature description become more detailed. However, executing all test cases for every change is costly and time-consuming. Thus, it is necessary to conduct a selection for the transformation tests. In this presentation, you will be introduced to a change-based test prioritization and transformation test selection approach for early fault detection.

Freitag, 3. Juni 2022, 11:30 Uhr

iCal (Download)
Ort: Raum 348 (Gebäude 50.34)
Webkonferenz: https://kit-lecture.zoom.us/j/67744231815

Vortragende(r) Tobias Haßberg
Titel Development of an Active Learning Approach for One Class Classifi cation using Bayesian Uncertainty
Vortragstyp Masterarbeit
Betreuer(in) Bela Böhnke
Vortragsmodus in Präsenz
Kurzfassung In One-Class classification, the classifier decides if points belong to a specific class. In this thesis, we propose an One-Class classification approach, suitable for active learning, that models for each point, a prediction range in which the model assumes the points state to be. The proposed classifier uses a Gaussian process. We use the Gaussian processes prediction range to derive a certainty measure, that considers the available labeled points for stating its certainty. We compared this approach against baseline classifiers and show the correlation between the classifier's uncertainty and misclassification ratio.

Freitag, 24. Juni 2022, 11:30 Uhr

iCal (Download)
Ort: MS Teams
Webkonferenz: https://sdqweb.ipd.kit.edu/wiki/SDQ-Oberseminar/Microsoft_Teams

Vortragende(r) Kevin Werber
Titel Assessing Word Similarity Metrics For Traceability Link Recovery
Vortragstyp Bachelorarbeit
Betreuer(in) Jan Keim
Vortragsmodus online
Kurzfassung The software development process usually involves different artifacts that each describe different parts of the whole software system. Traceability Link Recovery is a technique that aids the development process by establishing relationships between related parts from different artifacts. Artifacts that are expressed in natural language are more difficult for machines to understand and therefore pose a challenge to this link recovery process. A common approach to link elements from different artifacts is to identify similar words using word similarity measures. ArDoCo is a tool that uses word similarity measures to recover trace links between natural language software architecture documentation and formal architectural models. This thesis assesses the effect of different word similarity measures on ArDoCo. The measures are evaluated using multiple case studies. Precision, recall, and encountered challenges for the different measures are reported as part of the evaluation.

Freitag, 24. Juni 2022, 11:30 Uhr

iCal (Download)
Ort: Raum 348 (Gebäude 50.34)
Webkonferenz: https://kit-lecture.zoom.us/j/67744231815

Vortragende(r) Tobias Hombücher
Titel Generalized Monte Carlo Dependency Estimation and Anytime Supervised Filter Feature Selection
Vortragstyp Masterarbeit
Betreuer(in) Edouard Fouché
Vortragsmodus online
Kurzfassung Dependency estimation is an important problem in statistics and is applied frequently in data science. As modern datasets can be very large, dependency estimators should be efficient and leverage as much information from data as possible. Traditional bivariate and multivariate dependency estimators are only capable to estimate dependency between two or n one-dimensional datasets, respectively. In this thesis, we are interested in how to develop estimators that can estimate the dependency between n multidimensional datasets, which we call "generalized dependency estimators".

We extend the recently introduced methodology of Monte Carlo Dependency Estimation (MCDE), an effective and efficient traditional multivariate dependency estimator. We introduce Generalized Monte Carlo Dependency Estimation (gMCDE) and focus in particular on the highly relevant subproblem of generalized dependency estimation, known as canonical dependency estimation, which aims to estimate the dependency between two multidimensional datasets. We demonstrate the practical relevance of Canonical Monte Carlo Dependency Estimation (cMCDE) by applying it to feature selection, introducing two methodologies for anytime supervised filter feature selection, Canonical Monte Carlo Feature Selection (cMCFS) and Canonical Multi Armed Bandit Feature Selection (cMABFS). cMCFS directly applies the methodology of cMCDE to feature selection, while cMABFS treats the feature selection problem as a multi armed bandit problem, which utilizes cMCDE to determine relevant features.

Vortragende(r) Jonas Zoll
Titel Injection Molding Simulation based on Graph Neural Networks (GNNs)
Vortragstyp Bachelorarbeit
Betreuer(in) Daniel Ebi
Vortragsmodus in Präsenz
Kurzfassung Numerical filling simulations are an important tool for the development of injection molding parts. Existing simulations rely on numerical solvers based on the finite element method. These solvers are reliable and precise, but very computationally expensive even on simple part geometries.

In this thesis, we aim to develop a faster injection molding simulation based on Graph Neural Networks (GNNs) as a surrogate model. Our approach learns a simulation as a composition of three functions: an encoder, a processor and a decoder. The encoder takes in a graph representation of a 3D geometry of an injection molding part and returns a numeric embedding of each node in the graph. The processor updates the embeddings of each node multiple times based on its neighbors. The decoder then decodes the final embeddings of each node into physically meaningful variables, say, the fill state of the node. Our model can predict the progression of the flow front during a time step with a fixed size. To simulate a full mold filling process, our model is applied sequentially until the entire mold is filled. Our architecture is applicable to any kind of material, geometry and injection process parameters. We evaluate our architecture by its accuracy and runtime when predicting node properties. We also evaluate our models transfer learning ability on a real world injection molding part.

Vortragende(r) Mingzhe Tao
Titel Meta-learning for Encoder Selection
Vortragstyp Proposal
Betreuer(in) Federico Matteucci
Vortragsmodus in Präsenz
Kurzfassung In the real world, mixed-type data is commonly used, which means it contains both categorical and numerical data. However, most algorithms can only learn from numerical data. This makes the selection of encoder becoming very important. In this presentation, I will present an approach by using ideas from meta-learning to predict the performance from the meta-features and encoders.