Chapter 18: Topic Modeling

by Roman Egger

Salzburg University of Applied Sciences, Innovation and Management in Tourism

The volume of text in today’s society is growing rapidly, much of it produced on online social networks as user-generated content, and extracting useful information from this unstructured text is a considerable challenge. Advances in natural language processing help to meet it; among them are topic modeling techniques, which discover latent topics in text documents such as online reviews or Twitter and Facebook posts. As a result, topic modeling approaches have been gaining popularity in the field of tourism; yet studies often give little insight into how the topics were created or into the quality of the results. This chapter therefore introduces several topic modeling algorithms, explains the intuition behind them concisely, and provides tips on the necessary (pre-)processing steps, proper hyperparameter tuning, and comprehensible evaluation of the results.