Behind CoronaNet: How we built our dataset

By: Cindy Cheng, Luca Messerschmidt, Joan Barceló, Allison Spencer Hartnett, Vanja Grujic, Robert Kubinec, Timothy Model, and Caress Schenk
Published in Social Sciences


Like most people, at the beginning of the year, we had read about the SARS-CoV-2 outbreak in China with abstract concern, then with increasing unease as cases sprang up outside China. The cold gravity of the situation finally hit home when the World Health Organization (WHO) declared a global pandemic on March 11. Governments everywhere began implementing a series of policies unimaginable only a few weeks before: on March 19, the German state of Bavaria instituted a lockdown, only days after the United Arab Emirates had closed all schools on March 9.

In those early days, we saw our travel plans evaporate as borders closed and case counts went up. We coped with the uncertainty about the nature and effects of the virus by obsessively reading variations of the same vague reports, accomplishing little beyond frayed nerves and a sense of helplessness. In the absence of clear guidance from the WHO as to which policies were effective against COVID-19, Robert Kubinec and Cindy Cheng independently started collecting data on government responses to the virus. Having learned of each other's efforts through what is now the COVID-19 Social Science Research Tracker, they made contact on Monday, March 23.

What happened next was the seemingly improbable takeoff of the CoronaNet Research Project. This academic collaboration has produced a database of, at the time of writing, more than 45,000 policies documenting government responses to the COVID-19 pandemic, an article in Nature Human Behaviour describing the effort, and numerous working papers using the data. No week on the project has been more head-spinning than its first. On Tuesday, March 24, both Joan Barceló and Luca Messerschmidt signed on to the effort; Cindy drafted the codebook by Wednesday; Robert began training thirty-odd research assistants (RAs) on Thursday; and we launched our data collection effort via a Qualtrics survey by Saturday. By the first week of April, Allison Hartnett had also joined the effort to guide and manage our data validation. Since then, we have welcomed three new co-PIs who have played invaluable roles in ensuring the success of the project: Tim Model has professionalized our data pipeline, and Caress Schenk and Vanja Grujic have bolstered our RA training and management.

From the start, we faced immense organizational challenges, and we dealt with them by building streamlined internal structures and institutions. The three main challenges were, and continue to be, the recruitment, training, and motivation of RAs. With regard to recruitment, a week after the launch of the survey, 30 RAs became 210, and we have since held steady at around 500 active RAs spanning 18 different time zones. Though Luca at first bore the brunt of this logistical challenge, by early May we had developed a project management team to handle the recruitment and onboarding of new RAs.

Training different people from all over the world to document policies in a consistent way has presented its own challenge. Initially, we required that all RAs watch a recording of our original training video and use a centralized platform, Slack, for communication. We built on this foundation with additional tools and mechanisms to support and train the RAs, including more comprehensive training materials, a Shiny app to help RAs visualize the data, and a training assessment to evaluate RA skills.

Motivating RAs to continue with the project as volunteers is perhaps the biggest challenge of all. While we continuously search for funding opportunities to compensate as many RAs as we can, we deal with the immediate reality of working with volunteers in a number of ways. Early on, we organized a select group of RAs to help monitor general well-being and ensure efficient communication and feedback. By late May, we had instituted a system of regional and country managers, which has proven instrumental in supporting, monitoring, and managing the work of RAs. To motivate and empower research assistants to build not only their data collection skills but also their own research skills, we encourage our junior scholars to leverage our data in their research by providing them with workshops and technical training (e.g., in R and statistics). We also provide a platform for disseminating RA research on our website and in a forthcoming working paper series.

In the midst of these efforts, we wrote up a paper about our collective data collection efforts and, in the blink of an academic eye, published it in June. Publication is just the start. The project has continued to move forward, ever evolving toward greater collaboration, communication, and community, both externally and internally.

Externally, we have received and continue to receive incalculable support from our home institutions, both in material funding and in support, advice, and counsel from our academic betters and peers. Our ability to identify raw policies to code was significantly bolstered by cooperation first with Jataware and then with Overton, both of which use machine-learning algorithms to find, respectively, news reports (Jataware) and government policies (Overton) related to COVID-19. Our most significant collaboration to date has been our participation in PERISCOPE, an academic consortium of 32 EU universities investigating the behavioural and socio-economic impacts of COVID-19. Funded by an EU Horizon 2020 grant, PERISCOPE will keep CoronaNet's data collection effort going for three more years. In the future, we aim to further strengthen international collaboration and coordination with other projects that gather information on COVID-19 policies, because we recognize that all of us not only face the same issues (recruiting, training, and motivating RAs) but are driven by the same spirit: providing a public dataset that can help advance research and knowledge of the COVID-19 pandemic.

Internally, the goal has always been to come to a common understanding of how a policy should be coded. Doing so depends on the ability to pass on both the specific knowledge of coding policies in a particular country and the general knowledge of how the project has evolved overall.

Though we often compared our RAs early on to an army of coders to explain how the data was being gathered, we have come to appreciate that such metaphors miss the mark. While the project does depend on guiding an enormous number of diligent and public-spirited RAs to document government policies all over the world in a standardized way, organizing this work requires a great deal of flexibility and bottom-up communication. Indeed, though ultimately only one RA is responsible for coding a given policy, preparing the way to do so takes a community of scholars working together to assign and distribute the work, give advice on how to code different policies across countries, and share ideas and information about new policies on the horizon. Everybody plays an important part in ensuring a common understanding of how to code a policy, and thus in the success of the project. With the end of the pandemic still in the distance, we anticipate further challenges ahead, but we are confident in the robustness of our community to handle them and invite anyone so inclined to join our efforts.
