Performance Modeling of Data-Intensive Applications
|Supervisor||If you are interested or have questions, please contact: |
For this thesis, you will select a performance-critical part of a data processing pipeline, e.g., a distributed data processing framework such as Apache Spark or a database such as Cassandra. To quantify and predict the impact of changing the configuration, usage, or deployment of this subsystem, you will benchmark two or more different subsystems that fulfill the same role. From these measurements, you will build a model that captures how changes in configuration, usage, or deployment affect the delay the subsystem adds to the overall processing. This model will then be integrated into a discrete event simulation (DES) built at the chair and applied to an example system to evaluate the accuracy of the model.
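To make the idea of plugging a delay model into a discrete event simulation more concrete, the following is a minimal sketch in Python. It is not the chair's simulator: `DelayModel`, `simulate`, and all parameters (`base_ms`, `per_kb_ms`, `interarrival_ms`, `size_kb`) are hypothetical names chosen for illustration, and the linear, exponentially distributed service time is only an assumed stand-in for a model that would in practice be fitted to benchmark data.

```python
import heapq
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class DelayModel:
    """Hypothetical delay model: mean service time grows linearly with record size."""
    base_ms: float    # assumed fixed overhead per request
    per_kb_ms: float  # assumed additional delay per KB processed

    def sample(self, size_kb: float, rng: random.Random) -> float:
        # Draw an exponentially distributed service time around the modeled mean.
        mean = self.base_ms + self.per_kb_ms * size_kb
        return rng.expovariate(1.0 / mean)


def simulate(model: DelayModel, n_jobs: int, interarrival_ms: float,
             size_kb: float, seed: int = 0) -> float:
    """Simulate a single-server FIFO queue; return the mean time in system (ms)."""
    rng = random.Random(seed)
    events = []  # min-heap of (time, job id, event kind)
    t = 0.0
    for job in range(n_jobs):
        t += rng.expovariate(1.0 / interarrival_ms)
        heapq.heappush(events, (t, job, "arrival"))

    arrival_time = {}
    server_free_at = 0.0
    total_time_in_system = 0.0
    while events:
        now, job, kind = heapq.heappop(events)
        if kind == "arrival":
            arrival_time[job] = now
            start = max(now, server_free_at)  # wait if the server is busy
            server_free_at = start + model.sample(size_kb, rng)
            heapq.heappush(events, (server_free_at, job, "departure"))
        else:
            total_time_in_system += now - arrival_time[job]
    return total_time_in_system / n_jobs


# Two hypothetical subsystems that fulfill the same role.
fast = DelayModel(base_ms=1.0, per_kb_ms=0.05)
slow = DelayModel(base_ms=2.0, per_kb_ms=0.10)
for name, model in [("fast", fast), ("slow", slow)]:
    mean_ms = simulate(model, n_jobs=1000, interarrival_ms=10.0, size_kb=64.0)
    print(f"{name}: mean time in system = {mean_ms:.2f} ms")
```

Comparing the simulated delays against measurements taken on the real example system would then indicate how accurate the fitted model is.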