The Book

Applied Data Science in Tourism: Interdisciplinary Approaches, Methodologies and Applications

ISBN: 978-3-030-88389-8

Publisher: Springer – Series: “Tourism on the Verge” (forthcoming – early 2022)

Editor: Roman Egger

Data Science has brought about marvelous opportunities for many industries, and tourism is no exception. Although tourism is known as an interdisciplinary field, integrating sociology, economics, geography, psychology, and communication sciences, tourism researchers have long been constrained by the repertoire of traditional research methodologies. Beyond the widely applied classic quantitative and qualitative approaches, methods have indeed advanced over time, especially on the quantitative side. In this era of digitization, data continuously arrives in new unstructured forms, alongside the traditionally structured datasets, resulting in the rise of Big Data. Meanwhile, modern advancements in computing and the rapid development of algorithms have led, and continue to lead, to the emergence of advanced analytics, which goes beyond conventional business intelligence to gain deeper insights and make better predictions. As such, Data Science is more than just a set of methods and tools used to elevate the typical ways in which empirical research is done and to allow researchers to find answers to previously unknown questions. However, potentially because of the field’s vastness and ‘messiness’, and because the unstructured nature of the data may fuel confusion and uncertainty, Data Science has yet to be completely embraced by tourism scholars. At the same time, since Data Science has altered epistemological foundations, the interplay between Data Science and theory deserves much more attention.

Once researchers learn how to develop research questions that can be supported by concrete theories, Data Science can help them better understand data, uncover unknown relationships and patterns, and improve data visualization. In tourism, examples of applications involving Data Science include route optimization, real-time analysis, predictive analysis, personalization, customer sentiment analysis, alerting and monitoring systems, and much more. Nevertheless, integrating Data Science into tourism research is a challenging task, as it requires an interdisciplinary understanding of computer science (Data Science’s original discipline), along with statistics and the relevant domain knowledge. Tourism researchers are often unaware of these up-and-coming techniques and are unfamiliar with their usage, contributions, advantages, pitfalls, and limitations.

Thus, this book is intended to serve as a starting point to bridge Data Science with the tourism industry and to assist researchers and practitioners alike. It aims to present an overview of Data Science techniques relevant for tourism by offering the theoretical foundation as well as a how-to approach to these concepts, ultimately helping readers develop their own research projects. Naturally, this book cannot cover individual chapters and topics in complete detail; instead, the goal is to provide readers with the general knowledge needed to make informed decisions regarding the choice of method.

Chapters will also include additional material such as R-Code, Jupyter Notebooks, Workflows, etc., which are available at: https://github.com/DataScience-in-Tourism/.


This is what luminaries in the field say about the book:

The book is a very well-structured introduction to data science – not only in tourism – and its methodological foundations, accompanied by well-chosen practical cases. It underlines an important insight: data are only representations of reality, you need methodological skills and domain background to derive knowledge from them.

Hannes Werthner (Vienna University of Technology)

Roman Egger has accomplished a difficult but necessary task: make clear how data science can practically support and foster travel and tourism research and applications. The book offers a well-taught collection of chapters giving a comprehensive and deep account of AI and data science for tourism.

Francesco Ricci (Free University of Bozen-Bolzano)

This well-structured and easy-to-read book provides a comprehensive overview of data science in tourism. It contributes largely to the methodological repository beyond traditional methods.

Rob Law (University of Macau)


Structure of the book:

Introduction: Data Science in Tourism
Roman Egger
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Industry Insights from Data Scientists: A Q&A Session
Roman Egger (Salzburg University of Applied Sciences, Innovation and Management in Tourism), Mike O’Connor (booking.com), Liliya Lavitas (TripAdvisor), Holger Sicking (Austrian National Tourist Office), Jeroen Mulder (Air France-KLM)

Chapter 1: AI and Big Data in Tourism
Luisa Mich
University of Trento

Chapter 2: Epistemological Challenges
Roman Egger & Chung-En Yu
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Chapter 3: Interdisciplinarity in Data Science
Roman Egger & Chung-En Yu
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Chapter 4: Data Science & Ethics
Roman Egger, Larissa Neuburger & Michelle Mattutzzi
Salzburg University of Applied Sciences, Innovation and Management in Tourism
University of Florida
University of Groningen

Chapter 5: Web Mining & Data Crawling
Roman Egger, Markus Kroner, Andreas Stöckl
Salzburg University of Applied Sciences, Innovation and Management in Tourism
Legalcounsel.at
School of Informatics, Communications and Media, University of Applied Sciences Hagenberg

Chapter 6: Machine Learning: an Introduction
Roman Egger
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Chapter 7: Feature Engineering
Pablo Duboue
Textualization Software Ltd.

Chapter 8: Clustering
Matthias Fuchs & Wolfram Höpken
Mid Sweden University, Department of Economics, Geography, Law and Tourism
University of Applied Sciences Ravensburg-Weingarten

Chapter 9: Dimensionality Reduction
Nikolay Oskolkov
Lund University and National Bioinformatics Infrastructure Sweden (NBIS)

Chapter 10: Classification
Ulrich Bodenhofer & Andreas Stöckl
University of Applied Sciences Hagenberg, School of Informatics, Communications and Media

Chapter 11: Regression
Ulrich Bodenhofer & Andreas Stöckl
University of Applied Sciences Hagenberg, School of Informatics, Communications and Media

Chapter 12: Hyperparameter Tuning
Pier Paolo Ippolito
SAS Institute

Chapter 13: Model Evaluation
Ajda Pretnar & Janez Demšar
University of Ljubljana, Faculty of Computer and Information Science

Chapter 14: Data Interpretability of ML-Models
Urszula Czerwinska
No academic affiliation at present

Chapter 15: Introduction: Natural Language Processing
Roman Egger & Enes Gokce
Salzburg University of Applied Sciences, Innovation and Management in Tourism
Pennsylvania State University

Chapter 16: Text Representation and Word Embeddings
Roman Egger
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Chapter 17: Sentiment Analysis
Andrei P. Kirilenko, Svetlana Stepchenkova & Luyu Wang
University of Florida

Chapter 18: Topic Modeling
Roman Egger
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Chapter 19: Entity Matching
Ivan Bilan
TrustYou

Chapter 20: Knowledge-Graphs
Mayank Kejriwal
University of Southern California

Chapter 21: Social Network Analysis
Rodolfo Baggio
Bocconi University

Chapter 22: Time Series Analysis
Irem Önder
University of Massachusetts Amherst

Chapter 23: Agent-based Modeling
Jillian Student
Wageningen University

Chapter 24: GIS Analysis
Andrei P. Kirilenko
University of Florida

Chapter 25: Data Visualization
Johanna Schmidt
VRVis

Chapter 26: Software and Tools
Roman Egger
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Glossary