An automatic crawler for extracting highly ranked Q&A

Job posting
Type: Student assistant
Supervisor: Hamideh Hajiabadi

We are seeking a student research assistant (studentische Hilfskraft) to participate in the development of a virtual assistant for newcomers to biological light microscopy. This virtual assistant should draw on knowledge from existing online discussions (forums, tutorials, Q&A pages) and automatically provide relevant content to users with specific problems or use-case questions.

In this position, you would work specifically on collecting a dataset by extracting relevant content from online platforms related to biological light microscopy. This first step can be automated using a tool or online APIs. In the second step, the posts concerning image analysis are to be separated from the rest and then classified into several categories. The categories are initially predefined and may be refined as the virtual assistant is developed. This step can be done either manually, by reviewing the posts, or automatically, using existing text classification tools.

The ability to program independently in at least one common language, preferably Python, is required. As the project content is related to biology, a background in biology or a willingness to engage with biological topics is desirable, as is basic knowledge of working with datasets, either in general or in the field of image processing and analysis.
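To give applicants a rough idea of the two steps, the following is a minimal Python sketch, not the project's actual implementation. It assumes the posts have already been downloaded and that each one carries a numeric score (for example votes). The `Post` class, the keyword lists, the category names, and the score threshold are all hypothetical placeholders. Simple keyword matching stands in here for whatever text classification tool is eventually chosen.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    body: str
    score: int  # assumed ranking signal, e.g. votes or likes from the platform


# Hypothetical keyword lists; the real categories are predefined by the project.
IMAGE_ANALYSIS_TERMS = ("segmentation", "deconvolution", "threshold", "imagej", "fiji")
CATEGORIES = {
    "segmentation": ("segmentation", "threshold", "mask"),
    "restoration": ("deconvolution", "denoise", "denoising"),
}


def text_of(post):
    """Return title and body as one lowercase string for keyword matching."""
    return f"{post.title} {post.body}".lower()


def highly_ranked(posts, min_score):
    """Keep posts with at least min_score, ordered from highest to lowest score."""
    kept = (p for p in posts if p.score >= min_score)
    return sorted(kept, key=lambda p: p.score, reverse=True)


def is_image_analysis(post):
    """Step 2a: separate posts that mention any image analysis term."""
    text = text_of(post)
    return any(term in text for term in IMAGE_ANALYSIS_TERMS)


def classify(post):
    """Step 2b: pick the category with the most keyword hits, else 'other'."""
    text = text_of(post)
    hits = {
        name: sum(term in text for term in terms)
        for name, terms in CATEGORIES.items()
    }
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else "other"
```

A typical use would be to call `highly_ranked` on the collected posts, keep those for which `is_image_analysis` returns true, and then apply `classify` to each remaining post. Manual review of the results would still be needed, since keyword matching misses synonyms and context.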