The Book

Applied Data Science in Tourism

Interdisciplinary Approaches, Methodologies and Applications

Publisher: Springer – Series: “Tourism on the Verge”

Editor: Roman Egger

Data Science has brought marvelous opportunities to many industries, and tourism is no exception. Although tourism is known as an interdisciplinary field that spans sociology, economics, geography, psychology, and communication sciences, tourism researchers have long been constrained by the classical repertoire of research methodologies. Among the widely applied quantitative and qualitative approaches, advancements over time have come mainly in quantitative methods. In an era of digitization, data arrives in new unstructured forms alongside traditionally structured datasets, which has given rise to Big Data. Meanwhile, advancements in computing and the rapid development of algorithms have led to the emergence of advanced analytics that go beyond conventional business intelligence to gain deeper insights and make predictions. Data Science is more than a set of methods and tools: it elevates the typical ways of doing empirical research and even allows researchers to find answers to questions that were previously unknown. However, Data Science has yet to be embraced by tourism scholars, potentially because the size, messiness, and unstructured nature of the data fuel confusion and uncertainty. At the same time, because Data Science has altered the epistemological foundations of research, the interplay between Data Science and theory deserves close attention.

By learning how to develop research questions that can be supported by theories, researchers can use Data Science to better understand their data, uncover unknown relationships and patterns, and improve data visualization. In tourism, examples of Data Science applications include route optimization, real-time analysis, predictive analysis, personalization, customer sentiment analysis, alerting and monitoring systems, and much more. Nevertheless, adopting Data Science in tourism is not an easy task, as it requires an interdisciplinary understanding that bridges tourism and computer science, the discipline in which Data Science originated. Tourism researchers are often unaware of these emerging techniques and unfamiliar with their usage, contributions, advantages, pitfalls, and limitations.

This book is intended to serve as a starting point that connects Data Science to the tourism industry and to be helpful to researchers and practitioners alike. It aims to present an overview of Data Science techniques relevant to tourism by offering a theoretical foundation for these concepts and a how-to approach that helps readers develop their research projects. Of course, this book cannot claim to cover the topics of the individual chapters in their entirety. Rather, the aim is to provide the reader with the knowledge necessary to choose an appropriate method.

Chapters will also include additional material such as R code, Jupyter notebooks, workflows, etc.

The following chapters are confirmed:

Introduction: Data Science in Tourism
Roman Egger
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Industry Insights from Data Scientists: A Q&A Session
Roman Egger (Salzburg University of Applied Sciences, Innovation and Management in Tourism), Mike O'Connor (Booking.com), Liliya Lavitas (Tripadvisor), Holger Sicking (Austrian National Tourist Office), Jeroen Mulder (Air France-KLM)

Chapter 1: AI and Big Data in Tourism
Luisa Mich
University of Trento

Chapter 2: Epistemological Challenges
Roman Egger & Chung-En Yu
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Chapter 3: Interdisciplinarity in Data Science
Roman Egger & Chung-En Yu
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Chapter 4: Data Science & Ethics
Roman Egger, Larissa Neuburger & Michelle Mattutzzi
Salzburg University of Applied Sciences, Innovation and Management in Tourism
University of Florida
University of Groningen

Chapter 5: Web Mining & Data Crawling
Roman Egger, Markus Kroner, Andreas Stöckl
Salzburg University of Applied Sciences
Legalcounsel.at
School of Informatics, Communications and Media, University of Applied Sciences Hagenberg

Chapter 6: Machine Learning: a Primer
Roman Egger
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Chapter 7: Feature Engineering
Pablo Duboue
Textualization Software Ltd.

Chapter 8: Unsupervised Machine Learning – Clustering
Mathias Fuchs & Wolfram Höpken
Department of Economics, Geography, Law and Tourism, Mid Sweden University
University of Applied Sciences Ravensburg-Weingarten

Chapter 9: Dimensionality Reduction
Nikolay Oskolkov
Lund University and National Bioinformatics Infrastructure Sweden (NBIS)

Chapter 10: Supervised Machine Learning – Classification
Ulrich Bodenhofer & Andreas Stöckl
School of Informatics, Communications and Media, University of Applied Sciences Hagenberg

Chapter 11: Regression
Ulrich Bodenhofer & Andreas Stöckl
School of Informatics, Communications and Media, University of Applied Sciences Hagenberg

Chapter 12: Hyperparameter Tuning
Pier Paolo Ippolito
SAS Institute

Chapter 13: Model Evaluation
Ajda Pretnar
Faculty of Computer and Information Science, University of Ljubljana

Chapter 14: Data Interpretability of ML-Models
Urszula Czerwinska
No academic affiliation at present

Chapter 15: Introduction: Natural Language Processing
Roman Egger & Enes Gokce
Salzburg University of Applied Sciences, Innovation and Management in Tourism
Pennsylvania State University

Chapter 16: Text Representation and Word Embeddings
Roman Egger
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Chapter 17: Sentiment Analysis
Andrei P. Kirilenko, Svetlana Stepchenkova & Luyu Wang
University of Florida

Chapter 18: Topic Modeling
Roman Egger
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Chapter 19: Entity Matching
Ivan Bilan
TrustYou

Chapter 20: Knowledge Graphs
Mayank Kejriwal
University of Southern California

Chapter 21: Social Network Analysis
Rodolfo Baggio
Bocconi University

Chapter 22: Time Series Analysis
Irem Önder
University of Massachusetts Amherst

Chapter 23: Agent-based Modeling
Jillian Student
Wageningen University

Chapter 24: GIS Analysis
Andrei P. Kirilenko
University of Florida

Chapter 25: Data Visualization
Johanna Schmidt
VRVis

Chapter 26: Software and Tools
Roman Egger & …..
Salzburg University of Applied Sciences, Innovation and Management in Tourism

Glossary