PhD students 1st cohort
Dr. Johannes Hellrich
Curriculum Vitae
Dr. Johannes Hellrich (born 1982) earned a Bachelor's degree in German language studies and history at the University of Regensburg from 2006 to 2010. From 2010 to 2012 he completed a Master of German Language Studies at FSU Jena. During his studies he was employed as an assistant at the Institute for German Studies at the University of Regensburg and in the JULIE project laboratory at FSU Jena. From 2012 to 2015 Johannes was a research assistant at the chair of computational linguistics as part of the MANTRA EU project. He was a PhD student in the ‘Romanticism as a Model’ research training group from October 2015 and has been a member of the scientific staff of the JULIE Lab at Friedrich-Schiller-Universität Jena since October 2018. He was awarded his PhD in March 2019.
PhD project (finished)
Automatic Analysis of Diachronic Semantic Dimensions of the Lexicon of Romanticism
The aim of this research is to quantify meaning differences that are relevant for Romanticism. Words that are frequently used in the literature of Romanticism as well as the words ‘romantic’ and ‘Romanticism’ are considered relevant here. Studies will encompass the last two centuries and several European languages with primarily diachronic, but also synchronic, comparisons planned.
This is achieved by using computational linguistics methods to approximate the meaning of words by their contexts. The resulting high dimensional vector space, in the simplest case determined directly by word co-occurrences, can be used to generate low dimensional word representations. Such representations can be mathematically compared to quantify both synchronic and diachronic differences in meaning. One such method is the artificial neural network–based word2vec algorithm.
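The simplest case described above, where meaning is approximated directly by word co-occurrences, can be illustrated with a minimal sketch. It is not the project's actual pipeline: the two toy "corpora" are invented sentences standing in for an early and a late time slice, the function names `cooccurrence_vector` and `cosine` are hypothetical, and raw counts are used in place of word2vec embeddings.

```python
import math
from collections import Counter


def cooccurrence_vector(tokens, target, window=2):
    """Count the words found within `window` tokens of each occurrence of `target`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented toy data standing in for two time slices of a diachronic corpus.
early = "the romantic poet praised the romantic longing of the soul".split()
late = "the romantic dinner and the romantic weekend for two".split()

early_vec = cooccurrence_vector(early, "romantic")
late_vec = cooccurrence_vector(late, "romantic")

# A low similarity between the two context vectors hints at a shift in usage.
similarity = cosine(early_vec, late_vec)
print(f"similarity of 'romantic' across slices: {similarity:.3f}")
```

Because both vectors are indexed by the same vocabulary, they can be compared directly. Learned embeddings such as those produced by word2vec are generally not comparable across separately trained models without an additional alignment step.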
Initial research will focus on the multilingual Google Books N-gram corpus, which was prepared by digitizing library stocks and contains about 4% of all books printed to date. An expansion to synchronic corpora, e.g., harvested with a Web crawler, would improve the accuracy for contemporary everyday language. Expected results for the German language encompass not only the increasing trivialization of the word ‘romantic’ over the last 200 years but also the influence of nationalism during this time. Methodological challenges include not only measuring these shifts in meaning but also evaluating such an approach and finding a way to integrate quantitative results into the qualitative framework of the humanities.
Publications
Articles
Johannes Hellrich/Sven Buechel/Udo Hahn: Inducing Affective Lexical Semantics in Historical Language, in: arXiv, 21.06.2018
Johannes Hellrich/Sven Buechel/Udo Hahn: JeSemE: A Website for Diachronic Changes in Word Meaning and Emotion, in: Proceedings of COLING 2018 Santa Fe (2018), to appear.
Johannes Hellrich/Christoph Rzymski/Vitus Vestergaard: The Trans-Medial Fight for Glory, in: Medievalism and Metal Music Studies: Throwing Down the Gauntlet, ed. by Ruth Barratt-Peacock and Ross Hagen, forthcoming Bingley 2019.
Johannes Hellrich/Christoph Rzymski: Computational Detection of Medieval References in Metal, in: Medievalism and Metal Music Studies: Throwing Down the Gauntlet, ed. by Ruth Barratt-Peacock and Ross Hagen, forthcoming Bingley 2019.
Shorter contributions
Johannes Hellrich et al.: Visualizing Semantic Metadata from Biological Publications, in: Proceedings of the International Workshop on Intelligent Exploration of Semantic Data (IESD 2012) at EKAW 2012, Galway City, Ireland, 2012.
Johannes Hellrich/Udo Hahn: Biomedical Term Acquisition Based on Aligned Parallel Corpora, in: 58. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS), Lübeck, Germany, 2013.
Dietrich Rebholz-Schuhmann et al.: Multilingual Semantic Resources and Parallel Corpora in the Biomedical Domain: the CLEF-ER Challenge, in: CLEF 2013 Evaluation Labs and Workshop Online Working Notes, Valencia, Spain, 2013.
Johannes Hellrich/Udo Hahn: The JULIE LAB MANTRA System for the CLEF-ER 2013 Challenge, in: CLEF 2013 Evaluation Labs and Workshop Online Working Notes, Valencia, Spain, 2013.
Dietrich Rebholz-Schuhmann et al.: Entity Recognition in Parallel Multilingual Biomedical Corpora: The CLEF-ER Laboratory Overview, in: P. Forner et al. (Eds.): Information Access Evaluation. Multilinguality, Multimodality, and Visualization – 4th International Conference of the CLEF Initiative, CLEF 2013, Valencia, Spain, September 23-26, 2013, pp. 353-367 (Lecture Notes in Computer Science, 8138).
Erik Faessler/Johannes Hellrich/Udo Hahn: Disclose Models, Hide the Data – How to Make Use of Confidential Corpora without Seeing Sensitive Raw Data, in: Calzolari N. et al. (Eds.): Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), 26-31 May, Reykjavik, Iceland, 2014. pp. 4230-4237.
Johannes Hellrich et al.: Collaboratively Annotating Multilingual Parallel Corpora in the Biomedical Domain – some MANTRAs, in: Calzolari N. et al. (Eds.): Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), 26-31 May, Reykjavik, Iceland, 2014. pp. 4033-4040.
Johannes Hellrich/Udo Hahn: Enhancing Multilingual Biomedical Terminologies via Machine Translation from Parallel Corpora, in: Elisabeth Métais/Mathieu Roche/Maguelonne Teisseire (Eds.): Natural Language Processing and Information Systems – 19th International Conference on Applications of Natural Language to Information Systems, NLDB 2014, Montpellier, France, June 18-20, 2014, pp. 9-20 (Lecture Notes in Computer Science, 8455).
Johannes Hellrich/Udo Hahn: Exploiting Parallel Corpora to Scale Multilingual Biomedical Terminologies, in: Christian Lovis et al. (Eds.): e-Health – For Continuity of Care [= Proceedings of MIE2014], 2014, pp. 575-578 (Studies in Health Technology and Informatics, 205).
Anne H. Schneider/Johannes Hellrich/Saturnino Luz: Word, Syllable and Phoneme Based Metrics Do Not Correlate with Human Performance in ASR-Mediated Tasks, in: Adam Przepiórkowski/Maciej Ogrodniczuk (Eds.): Advances in Natural Language Processing – 9th International Conference on NLP, PolTAL 2014, Warsaw, Poland, September 17-19, 2014, pp. 392-399 (Lecture Notes in Computer Science, 8686).
Johannes Hellrich/Udo Hahn: Fostering Multilinguality in the UMLS: A Computational Approach to Terminology Expansion for Multiple Languages, in: AMIA 2014 – Proceedings of the AMIA 2014 Annual Symposium, Washington D.C., November 15-19, 2014, pp. 655-660; 660a-660d.
Jose Antonio Miñarro-Giménez/Johannes Hellrich/Stefan Schulz: Acquisition of Character Translation Rules for Supporting SNOMED CT Localizations, in: Ronald Cornet et al. (Eds.): Digital Healthcare Empowering Europeans [= Proceedings of MIE2015], 2015, pp. 597-601 (Studies in Health Technology and Informatics, 210).
Johannes Hellrich et al.: Sharing Models and Tools for Processing German Clinical Text, in: Ronald Cornet, Lăcrămioara Stoicu-Tivadar et al. (Eds.): Digital Healthcare Empowering Europeans [= Proceedings of MIE2015], 2015, pp. 734-738 (Studies in Health Technology and Informatics, 210).
Johannes Hellrich/Udo Hahn: Adding Multilingual Terminological Resources to Parallel Corpora for Statistical Machine Translation Deteriorates System Performance: A Negative Result from Experiments in the Biomedical Domain, in: Pavel Král/Václav Matoušek (Eds.): Text, Speech, and Dialogue, 18th International Conference, TSD 2015 Pilsen, Czech Republic, September 14-17, 2015, Proceedings, 2015, pp. 506-514 (Lecture Notes in Computer Science, 9302).
Johannes Hellrich et al.: JUFIT: A Configurable Rule Engine for Filtering and Generating New Multilingual UMLS Terms, in: AMIA Annual Symposium Proceedings 2015, 2015, pp. 604-610.
Johannes Hellrich/Udo Hahn: Romantik im Wandel der Zeit – eine quantitative Untersuchung, in: Digital Humanities im deutschsprachigen Raum 2016, Modellierung – Vernetzung – Visualisierung, Die Digital Humanities als fächerübergreifendes Forschungsparadigma, Leipzig, Germany, March 7-12, 2016, pp. 325-326 [Poster].
Udo Hahn et al.: UIMA-Based JCoRe 2.0 Goes GitHub and Maven Central ― State-of-the-Art Software Resource Engineering and Distribution of NLP Pipelines, in: Nicoletta Calzolari et al. (Eds.): Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, 2016, pp. 2502-2509.
Johannes Hellrich/Udo Hahn: Measuring the Dynamics of Lexico-Semantic Change Since the German Romantic Period, in: Digital Humanities 2016: Conference Abstracts, Krakow, Poland, July 11-16, 2016, pp. 545-547.
Johannes Hellrich/Udo Hahn: An Assessment of Experimental Protocols for Tracing Changes in Word Semantics Relative to Accuracy and Reliability, in: Proceedings of the 10th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH) @ ACL2016. Berlin, Germany, August 11, 2016, pp. 111-117.
Sven Buechel/Johannes Hellrich/Udo Hahn: Feelings from the Past—Adapting Affective Lexicons for Historical Emotion Analysis, in: Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH) @ COLING 2016. Osaka, Japan, December 11, 2016, pp. 54-61.
Johannes Hellrich/Udo Hahn: Bad Company—Neighborhoods in Neural Embedding Spaces Considered Harmful, in: COLING 2016. Osaka, Japan, December 13-16, 2016, pp. 2785-2796.
Johannes Hellrich/Franz Matthies/Udo Hahn: UIMA als Plattform für die nachhaltige Software-Entwicklung in den Digital Humanities, in: Digital Humanities im deutschsprachigen Raum 2017, Digitale Nachhaltigkeit, Konferenzabstracts, Bern, February 13-18, 2017, pp. 279-281.
Johannes Hellrich/Udo Hahn: Exploring Diachronic Lexical Semantics with JeSemE, in: ACL 2017, System Demonstrations. Vancouver, Canada, July 30 – August 4, 2017, pp. 31-36.
Johannes Hellrich/Udo Hahn: Don’t Get Fooled by Word Embeddings – Better Watch their Neighborhood, in: Digital Humanities 2017. Montreal, Canada, August 8-11, 2017. pp. 250-252.
Sven Buechel/Johannes Hellrich/Udo Hahn: The Course of Emotion in Three Centuries of German Text – A Methodical Framework, in: Digital Humanities 2017. Montreal, Canada, August 8-11, 2017. pp. 176-179.
Johannes Hellrich/Sven Buechel/Maria Moritz: A human-interpretable method to predict paraphrasticality, in: LaTeCH-CLfL 2018 – Proceedings of the 2nd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature @ COLING 2018. Santa Fe, New Mexico, USA, August 25, 2018, pp. 113-118.
Johannes Hellrich/Sven Buechel/Maria Moritz: Towards a Metric for Paraphrastic Modification, in: Digital Humanities 2018 Mexico City (2018), pp. 457-460.
Johannes Hellrich/Udo Hahn/Alexander Stöger: Wenn der Funke überspringt – Word Embeddings im Dienst der Wissenschaftsgeschichte, in: DHd 2018 Köln, Kritik der digitalen Vernunft, pp. 331-335.