JournalArticle
  • Marco Kretschmann
  • Andreas Fischer
  • Benedikt Elser
Extracting Keywords from Publication Abstracts for an Automated Researcher Recommendation System, vol. 4
  • 2020

DOI: 10.1007/s42354-019-0227-2

This paper presents an automated keyword assignment system for scientific abstracts. The system is applied to paper abstracts collected in a local publication database and used to drive a researcher recommendation system. Problems such as low data volume and missing keywords are discussed. As a remedy, training is performed on an extended dataset built from large online publication databases. Additionally, label imbalance in the dataset is examined more closely. Ten multi-label classification algorithms for assigning keywords from a given catalogue to a scientific abstract are compared. Using binary relevance as the transformation method with LightGBM as the classifier yields the best results. Random oversampling before the training phase additionally increases the F1 score by around 5-6%.
Lecture
  • Benedikt Elser
Die digitale Wagenreihung bei der Deutschen Bahn
  • 2018

Projects

FreshAnalytics, Industriewerkstatt 4.0, WeisDas


Core competencies

  • Big data computing
  • NoSQL databases
  • Container orchestration
  • Computer networks, TCP/IP
  • Overlay networks, peer-to-peer systems
  • Graph algorithms, graph analysis