Serious Labeling Errors Found in Google's Emotions Dataset

On Surgehq.ai, Edwin Chen reports on a sample of incorrectly classified emotional expressions. For the check, the labels in Google's Emotions dataset were compared with the judgments of human raters. The result: roughly 30% of the written comments in the sample (308 of 1,000) were miscategorized in Google's Emotions dataset.

[…] A whopping 30% of the dataset is severely mislabeled! (We tried training a model on the dataset ourselves, but noticed deep quality issues. So we took 1000 random comments, asked Surgers whether the original emotion was reasonably accurate, and found strong errors in 308 of them.) How are you supposed to train and evaluate machine learning models when your data is so wrong? […]
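
To put the quoted figure in perspective, here is a minimal sketch that computes the error rate from the sample (308 of 1,000) and a 95% Wilson score interval around it. The interval is not part of the original article. It is added here as an illustration and assumes the 1,000 comments were a simple random sample of the dataset. The function name `wilson_interval` is hypothetical.

```python
import math

def wilson_interval(errors: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 for about 95%)."""
    p = errors / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Figures taken from the quoted passage: 308 strong errors in 1,000 sampled comments.
errors, sample_size = 308, 1000
rate = errors / sample_size
low, high = wilson_interval(errors, sample_size)

print(f"Observed error rate: {rate:.1%}")
print(f"Approximate 95% interval: {low:.1%} to {high:.1%}")
```

Under these assumptions, the sample size is large enough that the estimate stays close to 30% even at the edges of the interval.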

Further topics in the article:

Google’s Flawed Data Labeling Methodology
The Importance of High-Quality Data

Read the article on Surgehq.ai: 30% of Google’s Emotions Dataset is Mislabeled


Image credits

smartphone-4621687_1920: geralt | Pixabay
