Global Data Science Project for COVID-19

The GDSP (Global Data Science Project) for COVID-19 is led by Dr. Dario Garcia Gasulla and Dr. Sergio Alvarez from the Barcelona Supercomputing Center and advised by Dr. Manuel García-Herranz from the UNICEF Office of Innovation. The purpose of the GDSP is to quantitatively measure the secondary impacts of the COVID-19 pandemic on our societies and to help public and private decision makers make effective and appropriate policy decisions. Our international team focuses on various societal aspects including mobility, health, education, online behavior, finance, and the economy. The team consists of volunteer data scientists from various countries including the US, Japan, Spain, France, Lithuania, and China. Team members are listed here.

* Disclaimer: Members do not represent their organizations and join this project on a voluntary basis in their personal time. None of the outcomes are associated with their organizations.

Project Scope


1. Quantifying Physical Distancing

How does human mobility change over time?

Physical distancing is key to avoiding or slowing down the spread of viruses. Countries have adopted different policies and actions to restrict human mobility. In this project, we investigate how these policies and actions affect human mobility in certain cities and countries. We hope that our analysis of policies and their secondary impacts helps decision makers take effective and appropriate actions. Furthermore, by analyzing human mobility, we also aim to develop a physical distancing risk index to monitor the risk in areas with high population density and a high probability of contraction. For further information and our up-to-date analytics results, please go here.
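One simple way to express a change in mobility is as a percent change relative to a pre-restriction baseline. The sketch below is only an illustration under that assumption and is not the project's actual method or risk index; the function name and the trip counts are hypothetical.

```python
from statistics import mean


def mobility_change(baseline, observed):
    """Return the percent change of each observed value relative to the mean baseline."""
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline mean must be non-zero")
    return [100.0 * (x - base) / base for x in observed]


# Hypothetical daily trip counts for one city (illustrative numbers only).
baseline_trips = [1000, 1100, 900]  # days before restrictions
lockdown_trips = [500, 250]         # days after restrictions

changes = mobility_change(baseline_trips, lockdown_trips)
print(changes)
```

Negative values indicate reduced mobility compared with the baseline period, which makes cities and dates comparable even when their absolute trip counts differ.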

2. Emotion Analysis

For health, we focus on the emotional changes that people have experienced during this pandemic. These changes stem from various causes such as unemployment, the implementation of stay-at-home policies, and fear of the virus. We quantify emotional changes by using social media data, including Twitter and Instagram. Since the outbreak of COVID-19, we have seen an increase in online discussions that use hashtags such as #COVID-19 and #depression. We believe it is vital to visualize and analyze the differences in people’s perceptions of COVID-19. We also hope to analyze overall responses to the pandemic by sentiment: sadness, depression, isolation, happiness, etc. Further detailed analysis will also look into specific keywords and corresponding trends. For further information and our up-to-date analytics results, please go here.
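Tracking how often certain hashtags appear over time can be sketched as a per-day count. The example below is a minimal illustration, assuming posts are already available as (date, text) pairs; the function name and sample posts are hypothetical, and it does not reflect the project's actual pipeline or any social media API.

```python
from collections import Counter


def hashtag_counts_by_day(posts, hashtags):
    """Count, per day, how many posts mention each tracked hashtag (case-insensitive)."""
    tracked = {h.lower() for h in hashtags}
    counts = Counter()
    for day, text in posts:
        # Lowercase each token and strip trailing punctuation before matching.
        tokens = {t.lower().rstrip(".,!?") for t in text.split()}
        for tag in tokens & tracked:
            counts[(day, tag)] += 1
    return counts


# Hypothetical posts (illustrative only).
posts = [
    ("2020-04-01", "Feeling low today #depression #COVID-19"),
    ("2020-04-01", "Stay home. #covid-19"),
    ("2020-04-02", "No tags here"),
]

counts = hashtag_counts_by_day(posts, ["#COVID-19", "#depression"])
print(counts[("2020-04-01", "#covid-19")])
```

Each post is counted at most once per hashtag per day, so a single post that repeats a hashtag does not inflate the trend.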

3. Impact on Human Life

Due to physical distancing and lockdown policies, people have begun relying on video conferencing tools for meetings, lectures, and conversations among friends more frequently than usual. Children are especially affected by the quarantine, since many must stay out of their classrooms and take classes online. By leveraging various data sources, we will analyze how daily behavior has been affected by this pandemic and compare behaviors among different countries and cities. We will also measure e-commerce and consumer behavior by analyzing sites such as Amazon. For further information and our up-to-date analytics results, please go here.

4. Prediction of Infected Cases

We will also attempt to predict future infections by consolidating various data sources. One of our target data sources is crowd-sourced data from social network services. We collect online posts about symptoms relevant to COVID-19, such as fever and cough, under the assumption that there is a window of approximately 12-14 days before people are found COVID-19 positive, and a 5-day incubation period. After collecting relevant posts and tweets, we will combine them with demographic features such as user location and age and with the number of confirmed cases, and use machine learning algorithms to predict the number of future COVID-19 cases.
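The idea of using symptom posts as a leading indicator can be sketched as a lagged regression: cases on a given day are modeled from the number of symptom posts some days earlier. The example below is a minimal sketch using ordinary least squares, chosen here for simplicity and not stated by the project as its algorithm. The 13-day lag is an assumed value inside the 12-14 day window mentioned above, and the data is synthetic.

```python
def fit_lagged_linear(posts, cases, lag):
    """Fit cases[t] ~ a + b * posts[t - lag] by ordinary least squares."""
    if len(posts) != len(cases):
        raise ValueError("posts and cases must cover the same days")
    xs = posts[:len(posts) - lag]  # symptom posts, `lag` days earlier
    ys = cases[lag:]               # confirmed cases on the later days
    n = len(xs)
    if n < 2:
        raise ValueError("not enough overlapping days for this lag")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("post counts must vary to fit a slope")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


LAG_DAYS = 13  # assumption: a value within the 12-14 day window

# Synthetic data (illustrative only): 30 days of symptom-post counts, and
# case counts constructed to follow the posts from LAG_DAYS earlier.
posts = list(range(10, 40))
cases = [0] * LAG_DAYS + [2 * p + 5 for p in posts[:len(posts) - LAG_DAYS]]

a, b = fit_lagged_linear(posts, cases, LAG_DAYS)

# The most recent LAG_DAYS of posts give a forecast for the next LAG_DAYS days.
forecast = [a + b * p for p in posts[-LAG_DAYS:]]
print(a, b, forecast[0])
```

In practice, additional features such as user location and age would enter the model as further inputs, and the lag itself could be selected by comparing fits across the 12-14 day range.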

The latest Summary Report (June 9, 2020) is available here.

Privacy

At GDSP we are committed to protecting and respecting your privacy. We leverage anonymized data without any personal identifiers for our analysis. The use of data is in compliance with existing laws and ethical standards. If you have any questions related to privacy, please contact us.