Suche mittels Attribut

Diese Seite stellt eine einfache Suchoberfläche zum Finden von Objekten bereit, die ein Attribut mit einem bestimmten Datenwert enthalten. Andere verfügbare Suchoberflächen sind die Attributsuche sowie der Abfragengenerator.

Eine Liste aller Seiten, die das Attribut „Kurzfassung“ mit dem Wert „In machine learning, simpler, interpretable models require significantly more training data than complex, opaque models to achieve reliable results. This is a problem when gathering data is a challenging, expensive or time-consuming task. Data synthesis is a useful approach for mitigating these problems. An essential aspect of tabular data is its heterogeneous structure, as it often comes in ``mixed data´´, i.e., it contains both categorical and numerical attributes. Most machine learning methods require the data to be purely numerical. The usual way to deal with this is a categorical encoding. In this thesis, we evaluate a proposed tabular data synthesis pipeline consisting of a categorical encoding, followed by data synthesis and an optional relabeling of the synthetic data by a complex model. This synthetic data is then used to train a simple model. The performance of the simple model is used to quantify the quality of the generated data. We surveyed the current state of research in categorical encoding and tabular data synthesis and performed an extensive benchmark on a motivated selection of encoders and generators.“ haben. Weil nur wenige Ergebnisse gefunden wurden, werden auch ähnliche Werte aufgelistet.

Hier sind 2 Ergebnisse, beginnend mit Nummer 1.

Attribut: Wert:

Liste der Ergebnisse

Benchmarking Tabular Data Synthesis Pipelines for Mixed Data + (In machine learning, simpler, interpretabl … In machine learning, simpler, interpretable models require significantly more training data than complex, opaque models to achieve reliable results. This is a problem when gathering data is a challenging, expensive or time-consuming task. Data synthesis is a useful approach for mitigating these problems.</br></br>An essential aspect of tabular data is its heterogeneous structure, as it often comes in ``mixed data´´, i.e., it contains both categorical and numerical attributes. Most machine learning methods require the data to be purely numerical. The usual way to deal with this is a categorical encoding.</br></br>In this thesis, we evaluate a proposed tabular data synthesis pipeline consisting of a categorical encoding, followed by data synthesis and an optional relabeling of the synthetic data by a complex model. This synthetic data is then used to train a simple model. The performance of the simple model is used to quantify the quality of the generated data. We surveyed the current state of research in categorical encoding and tabular data synthesis and performed an extensive benchmark on a motivated selection of encoders and generators.ated selection of encoders and generators.)

Benchmarking Tabular Data Synthesis Pipelines for Mixed Data +