Monthly Archives: December 2021

PERCEPTION OF DARK TOURISM: Automated Text Analysis of Users Comments


by Aleksandra Kleshcheva & Roman Egger

Full paper published in Zeitschrift für Tourismuswissenschaft (Journal of Tourism Science)
De Gruyter:

In recent years, numerous studies have been conducted on dark tourism. However, while several issues have been analysed and discussed, much remains to be done in the field of motivational factors. The existing literature provides average data about motivational issues using traditional methods of gathering information (interviews, surveys, etc.) and rarely investigates dark tourism through social media. Therefore, this study seeks to understand what motivates people to visit dark tourism sites such as the Chernobyl exclusion zone by applying an automated text analytics approach.

The primary goal of the study was to provide a clear picture of tourists’ perception of the Chernobyl Nuclear Power Plant. Tripadvisor was chosen as a source for data collection as tourists are increasingly sharing their experiences and leaving feedback online. Several natural language processing methods, such as topic modelling (LDA) and sentiment analysis, were applied to extract the primary motivators behind a visit to Chernobyl.

Owing to the unstructured and complex nature of the reviews collected from Tripadvisor, the data was preprocessed. A text preprocessing pipeline was applied to obtain meaningful data from the unstructured text. At the initial stage, a word cloud was generated to show word frequencies. Based on this, a list of stop words was compiled; undesired words (e.g. we, me, and), numbers, and brackets were removed; and lemmatization was conducted. The remaining text was converted to lowercase, and all diacritics and accents were transformed to their basic form. Word clouds showing the importance of data preprocessing are presented below:

Figure 1 Word cloud before preprocessing
Figure 2 Word cloud after preprocessing
Figure 3 Word cloud after preprocessing
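The cleaning steps described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the pipeline used in the study: the stop-word list and the two sample reviews are invented for the example, and lemmatization is left out because it would require an additional NLP library.

```python
import re
import unicodedata
from collections import Counter

# Hypothetical stop-word list; the study compiled its own from the first word cloud.
STOP_WORDS = {"we", "me", "and", "the", "a", "to", "was", "of", "our", "in"}


def strip_accents(text: str) -> str:
    """Map accented characters to their basic form (e.g. 'é' becomes 'e')."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def preprocess(review: str) -> list[str]:
    """Lowercase, strip accents, drop numbers, brackets, punctuation and stop words."""
    text = strip_accents(review.lower())
    # Anything that is not a basic letter or whitespace becomes a space;
    # this removes digits, brackets and punctuation in one pass.
    text = re.sub(r"[^a-z\s]", " ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


# Invented sample reviews, for illustration only.
reviews = [
    "We visited Pripyat (2019) and the café near the reactor.",
    "The guide was great, and the reactor tour amazed me!",
]
tokens = [preprocess(r) for r in reviews]

# Word frequencies of the cleaned text are what a word cloud visualises.
frequencies = Counter(tok for doc in tokens for tok in doc)
print(tokens[0])
print(frequencies.most_common(3))
```

Comparing the frequency counts before and after `preprocess` is the numeric counterpart of comparing the word clouds in the figures above.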

Next, topic modelling with the latent Dirichlet allocation (LDA) technique was performed. The main advantage of topic modelling is that it derives hidden patterns that could not be observed by human inspection (Blei, 2012; Joshi, 2018; Bansal, 2016; Rajasundari et al., 2017). In addition, topic modelling algorithms can be applied to massive collections of files and adapted to many kinds of data (Blei, 2012). The LDA analysis suggested five potential topics, each with a specific set of keywords having the highest probability. Topic labelling was grounded in a qualitative analysis of the reviews: to assign a theme to each topic, some of the reviews were read manually. This approach can be more effective than relying only on the results provided by the software.
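To give an idea of what LDA estimates, here is a toy collapsed Gibbs sampler in plain Python. It is a sketch for intuition only and not the implementation used in the study; in practice one would rely on an established library. The corpus, the number of topics, and the hyperparameters `alpha` and `beta` are all assumptions made for this example.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; returns (doc_topic, topic_word, vocab)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token:
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta)
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word, vocab


# Invented, already preprocessed mini-corpus.
docs = [
    ["reactor", "radiation", "plant", "power", "reactor"],
    ["guide", "tour", "friendly", "guide", "lunch"],
    ["radiation", "reactor", "zone", "plant"],
    ["tour", "guide", "bus", "friendly"],
]
doc_topic, topic_word, vocab = lda_gibbs(docs, num_topics=2)
for k, counts in enumerate(topic_word):
    print("Topic", k, [w for w, _ in counts.most_common(3)])
```

The top keywords printed per topic are the kind of output that then has to be labelled by a human, which is why reading a sample of the underlying reviews remains part of the process.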

To interpret the topics and visualise the topic model, pyLDAvis, a Python library for web-based interactive visualisation, was used.

Figure 4 Intertopic distance map (interactive version is available:

First, pyLDAvis presents a general view of the topics and shows how they relate to each other. Second, it provides a panel with the most relevant terms for each individual topic, allowing for a detailed analysis and interpretation of the topics (Sievert and Shirley, 2014). A good topic model will have little or no overlap between topics (Li, 2020). It can be seen from the map that four of the generated topics show a little overlap, as they describe emotions and experience. Topic 5 has no terms in common with the other topics; its reviews mainly describe locations (Pripyat, Chernobyl, zone) and technical details (reactor, radiation, power, plant). The visualisation tool also provides data about the size of each topic. In other words, it shows how reviews are distributed across topics and which topic is based on the largest number of reviews. The biggest topic is Topic 3 and the smallest is Topic 2. It is assumed that the reviews belong to a more differentiated tourist type: visitors wrote mostly about organisational details of tours and tour structure, and less often about emotional experience.
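Two quantities sit behind this kind of visualisation and can be sketched directly. Intertopic distances in such a map are typically based on the Jensen-Shannon divergence between the topics' word distributions, and the term ranking in the side panel follows the relevance measure of Sievert and Shirley (2014), which blends a term's probability within a topic with its lift over the whole corpus. The distributions below are invented for illustration, and `lam=0.6` is simply a default chosen for this sketch.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; terms with p_i = 0 add nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: 0 for identical distributions, ln(2) for disjoint ones."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def relevance(p_word_given_topic, p_word, lam=0.6):
    """Relevance = lam * log p(w|t) + (1 - lam) * log(p(w|t) / p(w))."""
    lift = p_word_given_topic / p_word
    return lam * math.log(p_word_given_topic) + (1 - lam) * math.log(lift)


# Invented word distributions over the vocabulary
# ["reactor", "radiation", "guide", "tour"].
technical = [0.5, 0.5, 0.0, 0.0]
organisational = [0.0, 0.0, 0.5, 0.5]
mixed = [0.4, 0.4, 0.1, 0.1]

print(js_divergence(technical, organisational))  # no shared terms: maximal distance
print(js_divergence(technical, mixed))           # shared terms: smaller distance
print(relevance(0.5, 0.25))                      # relevance of "reactor" in `technical`
```

A topic that shares no terms with the others, like Topic 5 in the map, sits at the maximum distance from them, while topics with a shared vocabulary end up closer together or overlapping.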

The topic modelling results present five main topics discussed by tourists. Based on the current study results, the main motivational factors for being interested in Chernobyl and visiting this location are historical experience, emotional experience, sharing experience, and educational experience. It is not surprising that many reviews had nothing to do with motivations but contained information about the organisation, tour structures, and schedule. This shows that organisational aspects and planning are essential for making a decision.

Furthermore, sentiment analysis was conducted to uncover consumers’ positive and negative feelings based on their reviews. As a field of research in NLP, sentiment analysis extracts and classifies opinions and attitudes; here it was used to detect and analyse the emotions people had during their trip to Chernobyl. At this stage, VADER, a lexicon- and rule-based algorithm for sentiment analysis, was used. VADER assigns a score to each word in its lexicon and reports a compound score for a text, where +1 is the most positive and -1 is the most negative. The total sentiment score is positive but close to neutral, which indicates that people felt both positive and negative emotions while visiting or reflecting on Chernobyl. The negative compound scores show that the Chernobyl accident and the zone are considered dark tourism attractions in the common sense. However, Chernobyl is no longer a synonym for death. The positive emotions expressed after visiting this place can be attributed to visitors’ desire to learn new things and to share knowledge and emotions, as well as to the friendly staff.
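The idea behind a compound score can be sketched as follows. VADER sums the valence scores of the words in a text and normalises the sum x to the range between -1 and +1 as x / sqrt(x² + 15). The code below reproduces only that normalisation step: the lexicon values and sample reviews are invented for illustration and are not taken from VADER, and VADER's rules for negation, intensifiers, and punctuation are left out entirely.

```python
import math

# Hypothetical valence scores for illustration; not taken from the VADER lexicon.
TOY_LEXICON = {
    "amazing": 3.0,
    "friendly": 2.0,
    "interesting": 1.5,
    "eerie": -1.0,
    "sad": -2.0,
    "tragic": -2.5,
}


def compound_score(tokens, lexicon=TOY_LEXICON, alpha=15.0):
    """Sum word valences and squash the total into the open interval (-1, 1)."""
    total = sum(lexicon.get(tok, 0.0) for tok in tokens)
    return total / math.sqrt(total * total + alpha)


def label(score, threshold=0.05):
    """Map a compound score to a coarse sentiment label."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


# Invented, already tokenised sample reviews.
sample_reviews = [
    ["amazing", "tour", "friendly", "guide"],
    ["tragic", "sad", "place"],
    ["friendly", "guide", "sad", "history"],
]
scores = [compound_score(r) for r in sample_reviews]
mean_score = sum(scores) / len(scores)
print([label(s) for s in scores])
print(round(mean_score, 3))
```

Averaging compound scores over many reviews can give a value close to zero even when individual reviews are clearly positive or negative, which matches the observation that the overall sentiment is positive but close to neutral.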

This study follows an interdisciplinary research approach, applying innovative data analytics methods to investigate dark tourism through social media. By implementing NLP methods, this study reveals tourists’ perceptions from online reviews, which are not easy to discover with traditional approaches. Today, travel blogs and review platforms are not the only tools to express an opinion. Therefore, it is necessary to discover the potential of online reviews by adopting new approaches from computational social science. Moreover, the results provide guidelines to tourism managers for monitoring new trends in tourism, understanding tourists’ needs and wishes, and evaluating the quality of products or services.