Even before the first cases of COVID-19 in Europe were publicly announced at the end of January 2020, signals that something strange was happening were already circulating on social media. A new study by researchers at the IMT School for Advanced Studies Lucca, published in Scientific Reports, has identified traces of growing concern about pneumonia cases in posts published on Twitter in seven countries between the end of 2019 and the beginning of 2020. The analysis of the posts shows that the “whistleblowing” came precisely from the geographical regions where the primary outbreaks later developed.
To conduct the research, the authors first created a unique database of all the messages posted on Twitter containing the keyword “pneumonia” in the seven most spoken languages of the European Union – English, German, French, Italian, Spanish, Polish, and Dutch – from December 2014 until 1 March 2020. The word “pneumonia” was chosen because pneumonia is the most severe condition induced by SARS-CoV-2, and also because the 2020 flu season was milder than previous ones, so there was no reason to think the flu was responsible for all the mentions and worries. The researchers then made a number of adjustments and corrections to the posts in the database to avoid overestimating the number of tweets mentioning pneumonia between December 2019 and January 2020, that is, in the weeks between the World Health Organization (WHO) announcement that the first “cases of pneumonia of unknown etiology” had been identified, on 31 December 2019, and the official recognition of COVID-19 as a serious transmissible disease, on 21 January 2020. In particular, all tweets and retweets containing links to news about the emerging virus were removed from the database, so that mass media coverage of the emerging pandemic was excluded from the count.
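The filtering step described above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the `Tweet` record, the `KEYWORDS` translations, and the function names are all hypothetical, and the sketch drops every tweet that contains any link, which is a cruder rule than removing only links to news about the virus.

```python
import re
from dataclasses import dataclass


@dataclass
class Tweet:
    """Hypothetical record; the study's actual data schema is not described."""
    text: str
    lang: str  # two-letter language code, e.g. "en"


# Assumed translations of "pneumonia" for the seven languages in the study.
KEYWORDS = {
    "en": "pneumonia",
    "de": "lungenentzündung",
    "fr": "pneumonie",
    "it": "polmonite",
    "es": "neumonía",
    "pl": "zapalenie płuc",
    "nl": "longontsteking",
}

# Any http(s) link counts as a link here (a simplifying assumption).
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def keep_tweet(tweet: Tweet) -> bool:
    """Keep a tweet if it mentions the keyword and carries no link."""
    keyword = KEYWORDS.get(tweet.lang)
    if keyword is None:
        return False  # language outside the seven considered
    text = tweet.text.lower()
    return keyword in text and URL_PATTERN.search(text) is None


def filter_tweets(tweets: list[Tweet]) -> list[Tweet]:
    """Return only the tweets that pass keep_tweet, preserving order."""
    return [t for t in tweets if keep_tweet(t)]
```

Dropping link-bearing tweets is meant to leave mostly first-hand mentions, so that a rise in the count is less likely to be an echo of news coverage.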
The authors' analysis shows an increase in tweets mentioning the keyword “pneumonia” in most of the European countries included in the study as early as January 2020, indicating ongoing concern and public interest in pneumonia cases. In Italy, for example, where the first lockdown measures to contain COVID-19 infections were introduced on 22 February 2020, the rate of increase in mentions of pneumonia during the first few weeks of 2020 differs substantially from the rate observed in the same weeks of 2019. In other words, potentially hidden infection hotspots were identified several weeks before the announcement of the first local source of a COVID-19 infection (20 February, Codogno, Italy). France exhibited a similar pattern, whereas Spain, Poland, and the UK showed it with a delay of two weeks.
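One simple way to picture the year-on-year comparison is to compute the week-over-week growth of keyword mentions in each year and subtract one from the other. The sketch below is an assumption about how such a comparison could be set up, not the statistical method used in the paper; the function names and the idea of feeding in plain weekly counts are hypothetical.

```python
from typing import Optional


def weekly_growth(counts: list[int]) -> list[Optional[float]]:
    """Relative change from each week to the next.

    Returns None for a week whose predecessor had zero mentions,
    since the relative change is undefined there.
    """
    rates: list[Optional[float]] = []
    for prev, cur in zip(counts, counts[1:]):
        rates.append((cur - prev) / prev if prev else None)
    return rates


def growth_gap(
    counts_2020: list[int], counts_2019: list[int]
) -> list[Optional[float]]:
    """Growth rate in 2020 minus growth rate in the same week of 2019."""
    gaps: list[Optional[float]] = []
    pairs = zip(weekly_growth(counts_2020), weekly_growth(counts_2019))
    for rate_2020, rate_2019 in pairs:
        if rate_2020 is None or rate_2019 is None:
            gaps.append(None)
        else:
            gaps.append(rate_2020 - rate_2019)
    return gaps
```

A persistently positive gap over several consecutive weeks would be the kind of signal the article describes, although deciding whether it is statistically significant requires a proper test on the real data.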
The authors also geolocated over 13,000 pneumonia-related tweets from this same period and found that they came exactly from the regions where the first cases of infection were later reported, such as Lombardy in Italy, Madrid in Spain, and Île-de-France in France.
Following the same procedure used for the keyword “pneumonia”, the researchers also produced a new dataset of posts containing the keyword “dry cough”, one of the other symptoms later associated with COVID-19. Here too, they observed the same pattern, namely an abnormal and statistically significant increase in the number of mentions of the term during the weeks leading up to the surge of infections in February 2020.
“Our study adds to the existing evidence that social media can be a useful tool for epidemiological surveillance. They can help intercept the first signs of a new disease, before it proliferates undetected, and also track its spread,” says Massimo Riccaboni, full professor of Economics at the IMT School, who coordinated the research.
This is especially true in a situation like the current pandemic, in which lapses in identifying early-warning signals left many national governments blind to the unprecedented scale of the looming public health emergency. In a later phase of the pandemic, monitoring social media could help public health authorities mitigate the risk of a resurgence of contagion, for example by adopting stricter social distancing measures where infections appear to be increasing or, conversely, relaxing them in other regions. These tools could also pave the way to an integrated epidemiological surveillance system managed globally by international health organizations.