Supporting OpenMined's Privacy Preserving COVID-19 Tools
10 Sept, 2020
Abstract: The following article is written by OpenMined, the open-source community whose goal it is to make the world more privacy-preserving. OpenMined were one of four recipients of a grant from 100x Group (originally made via 100x Group’s HDR Global Trading’s COVID-19 Response Fund). The article outlines how 100x’s grant has helped establish the financial stability OpenMined needs to focus 100% on producing the technology needed to tackle COVID-19 without compromising on privacy.
COVID-19 Tech: An urgent need for privacy-preserving technology
Right now, COVID-19 apps are being built around the world to help societies mitigate the social, economic and epidemic threats they face. Data privacy is crucial for these apps. Not only is privacy a human right, but it is also needed for establishing trust — and therefore, compliance — in these COVID-19 apps.
During this urgent pandemic, policy makers have come to doubt that we can afford to maintain privacy in the tech we build. That doubt is misplaced: we know that every single COVID-19 use-case can be accomplished while protecting privacy.
If time were not a factor, OpenMined would already be in a great position to deliver these services to the general public using our extensive volunteer community. OpenMined is a community of 7,300+ engineers, researchers, marketers, and hackers dedicated to lowering the barrier-to-entry to private AI technologies through open-source code and free education. We currently have 8 development teams, 6 community teams, and 2 research teams, with 100+ people meeting every week.
But at the same time, many developers have lost financial footing due to the simultaneous financial crisis. As a result, the vast majority of our engineers have been volunteers working limited hours on their projects. Even among these volunteers, many have experienced significant displacement in their own personal and professional lives (loss of job, loss of workspace, additional family burdens, etc.). Many who want to help cannot, and those who can have only limited time and energy to give.
Thanks to 100x Group’s donation, we’ve gained the ability to move many members of our development (coding) and education (writing) communities to full-time and part-time hours for an extended period of time. By providing stability in the lives of our members, we can focus them on delivering the code resources, and the educational material on how to use those resources, as fast as possible.
Our COVID-19 Tech Efforts Thanks to 100x Group’s Support
Rather than building privacy-preserving COVID-19 apps directly, we have been focusing on lowering the barrier-to-entry for other projects to implement their apps in a privacy-preserving way. We focus on privacy-preserving components useful to the three primary COVID-19 app use cases: symptom surveys, contact tracing, and proof-of-health (identity passport).
In addition to building general-purpose components for other projects to import, we consult directly on a variety of projects (MIT SafePaths, SafePass, COVI-ID, CoEpi, TCN Coalition, CoronaTrace, etc.) to validate their privacy architecture and help them leverage our components if necessary. We also help funders with technical vetting.
What we’ve been building
As the space is constantly changing, our strategy is to build general-purpose components which many apps are likely to require but are unlikely to be able to build themselves or source from existing open-source repositories. The initial inspiration for these components all came from our time spent consulting on COVID-19 apps. However, because of the volatility of the market and the time it takes to spin teams up and down around each component, we have chosen to complete these components regardless of the “adoption sentiment” we are getting from apps at any given time. In other words, when we choose to develop a technology according to our decision framework (outlined in the list below), we are making a bet and then “staying the course” until it is finished, regardless of adoption. Our decision framework for deciding whether to set up a team to build a component is the following. We build components which are:
- General-use and privacy preserving
- Deployable across all COVID-19-relevant mobile/server platforms (Android, iOS, React Native, Python, and Node.js)
- Dependent on rare privacy talent that COVID-19 app teams are unlikely to acquire
- Built on techniques which have a strong academic (theoretical) basis
- Relevant to the primary COVID-19 app use-cases
- General enough to be useful beyond the COVID-19 pandemic
Two tools that are important for COVID-19 applications yet useful beyond the pandemic are differential privacy and private set intersection.
Differential Privacy
We’re building an easy-to-use wrapper around a robust differential privacy library for use in mobile apps and browsers.
The data used for many COVID-19 related apps will be sensitive data: locations, health information, etc. We must ensure a user is not affected (e.g. not harmed) by their entry or participation in an app’s database. Differential privacy is considered to be one of the state-of-the-art concepts that can help us achieve this goal.
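To make the idea concrete, here is a minimal sketch of the classic Laplace mechanism applied to a counting query, such as "how many users reported a fever?". It is an illustration only: it does not use PyDP or Google's library, and the function names, the `fever` field, and the choice of epsilon are all assumptions made for this example. Adding or removing one user changes a count by at most 1, so noise drawn from a Laplace distribution with scale 1/epsilon masks any single user's presence.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential samples with mean
    # `scale` follows a Laplace distribution centred on 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`.

    A counting query has sensitivity 1 (one user changes the result by at
    most 1), so the Laplace noise scale is 1 / epsilon.
    """
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical symptom-survey records, invented for this example.
    reports = [{"fever": i % 4 == 0} for i in range(1000)]
    noisy = private_count(reports, lambda r: r["fever"], epsilon=0.5)
    print(f"noisy count of fever reports: {noisy:.1f}")
```

A smaller epsilon means more noise and stronger privacy. Naive floating-point noise sampling like this is known to be open to subtle attacks, which is one reason to rely on a carefully engineered library rather than a hand-rolled version in a real app.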
Differential privacy is a useful component in providing privacy for a wide range of projects. However, to our knowledge there is currently only one differential privacy library that is open-source, openly licensed, already deployed to millions of devices, and truly robust in its implementation: Google’s Differential Privacy C++ library. But C++ by itself doesn’t run in the contexts where we need to run apps: mobile phones, browsers, mobile browsers, and servers. We’re creating an ensemble of new open-source libraries that wrap Google’s C++ library so that the best-in-class implementation Google has produced can be run by anyone, anywhere:
- PyDP: Python wrapper for Google's Differential Privacy project
- org.openmined.dp: Google’s DP project in Java family of languages (Java, Scala, Kotlin)
- SwiftDP: Swift wrapper for Google’s Differential Privacy Project
We’ve already launched the PyDP library, and hosted our launch event on August 29.
Private Set Intersection
We’re building open-source libraries for private set intersection to run on many devices in many contexts.
Private set intersection is a powerful cryptographic technique which allows two parties to compare data with one another without exposing their raw data to the other party. In the context of COVID-19 contact tracing the two parties are:
- A centralised data store: Containing locations which infected patients visited prior to their isolation.
- Many individual mobile phones: Containing location information of the individual owner of the phone.
To determine whether an individual has visited a location previously visited by a known patient, there has to be a comparison between the two data sources. Private set intersection allows the two parties to determine whether they have any locations in common. Crucially, private set intersection prevents the centralised data store from seeing what’s on the user’s phone and the user from seeing what’s on the server, beyond the matches themselves. This results in a contact tracing app which does not need to publish infected patient locations publicly, where they could be exploited or abused. Furthermore, the app does not need to store all individual user locations in a centralised data store, which prevents the creation of a treasure trove of personally identifying information.
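The comparison described above can be sketched with a toy Diffie-Hellman-style protocol, in which each side blinds its items with a secret exponent and only doubly blinded values are compared. This is an assumption-laden illustration, not OpenMined's PSI library or its actual protocol: the function names, the location strings, and the group (integers modulo the prime 2**255 - 19) are chosen for readability, and the parameters are not suitable for real use. Both parties are simulated inside one function so the message flow is easy to follow.

```python
import hashlib
import math
import secrets

# Toy group: integers modulo the prime 2**255 - 19. Illustrative only,
# not a secure parameter choice for a real deployment.
P = 2**255 - 19


def hash_to_group(item):
    # Map an item to a value in the range [1, P - 1].
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def new_secret():
    # An exponent coprime to P - 1 makes blinding a one-to-one mapping,
    # so two different hashed items can never collide after blinding.
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def blind(items, secret):
    return [pow(hash_to_group(item), secret, P) for item in items]


def reblind(values, secret):
    return [pow(value, secret, P) for value in values]


def intersect(phone_locations, server_locations):
    phone_key, server_key = new_secret(), new_secret()

    # Phone -> server: the phone's items, blinded with the phone's key.
    phone_blinded = blind(phone_locations, phone_key)

    # Server -> phone: the phone's items blinded a second time (same
    # order), plus the server's own items blinded with the server's key.
    phone_double = reblind(phone_blinded, server_key)
    server_blinded = blind(server_locations, server_key)

    # Phone: blind the server's items with its own key, then compare the
    # doubly blinded values. Equal values mean equal underlying items.
    server_double = set(reblind(server_blinded, phone_key))
    return [loc for loc, value in zip(phone_locations, phone_double)
            if value in server_double]
```

For example, calling `intersect` with a hypothetical phone history and a server list that share one entry returns just that shared entry. In this sketch neither side ever sends raw locations, and only the phone learns which of its own locations matched.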