Data-Preparation for Machine-Learning Based Static Code Analysis
Speaker | Felix Griesau | |
---|---|---|
Presentation type | Master's thesis | |
Advisor | Robert Heinrich | |
Date | Fri, April 1, 2022 | |
Presentation mode | online | |
Abstract | Static Code Analysis (SCA) has become an integral part of modern software development, especially since the rise of automation in the form of CI/CD. How machine learning can best improve the state of SCA, and thus facilitate maintainable, correct, and secure software, remains an open question. However, machine learning needs a solid foundation to learn from. This thesis proposes an approach to build that foundation by mining data on software issues from real-world code. We show how we used this approach to analyze over 4000 software packages and generate over two million issue samples. Additionally, we propose a method for refining this data and apply it to an existing machine-learning-based SCA approach. |