Files | Size | Format | Created | Updated | License | Source |
---|---|---|---|---|---|---|
9 | 30MB | csv zip | 3 years ago | 3 years ago | Open Data Commons Public Domain Dedication and License | |
Download files in this dataset
File | Description | Size | Last changed | Download |
---|---|---|---|---|
key-countries-pivoted | | 13kB | | csv (13kB), json (34kB) |
countries-aggregated | | 1MB | | csv (1MB), json (4MB) |
reference | | 394kB | | csv (394kB), json (1MB) |
time-series-19-covid-combined | | 3MB | | csv (3MB), json (9MB) |
us_confirmed | | 81MB | | csv (81MB), json (194MB) |
us_deaths | | 84MB | | csv (84MB), json (208MB) |
us_simplified | | 37MB | | csv (37MB), json (119MB) |
worldwide-aggregate | | 11kB | | csv (11kB), json (26kB) |
data_zip | Compressed versions of the dataset. Includes normalized CSV and JSON data with the original data and datapackage.json. | 29MB | | zip (29MB) |
This is a preview version. There might be more data in the original version.
key-countries-pivoted schema:

Field Name | Order | Type (Format) | Description |
---|---|---|---|
Date | 1 | string (default) | |
China | 2 | integer (default) | |
US | 3 | integer (default) | |
United_Kingdom | 4 | integer (default) | |
Italy | 5 | integer (default) | |
France | 6 | integer (default) | |
Germany | 7 | integer (default) | |
Spain | 8 | integer (default) | |
Iran | 9 | integer (default) |
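This pivoted layout stores one confirmed-cases column per country. If you need the long Date/Country/value form instead (handy for grouping or plotting), pandas can unpivot it with `melt`. A minimal sketch, using made-up sample rows in the same shape as this schema (the figures are illustrative, not real data):

```python
import pandas as pd

# Made-up rows shaped like key-countries-pivoted (values are illustrative)
wide = pd.DataFrame({
    "Date": ["2020-03-01", "2020-03-02"],
    "China": [79824, 79932],
    "US": [75, 100],
    "Italy": [1694, 2036],
})

# Unpivot: one row per (Date, Country) pair
long = wide.melt(id_vars="Date", var_name="Country", value_name="Confirmed")
print(long)
```

The long form produced here matches the shape of the countries-aggregated file, which makes the two resources easy to combine.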
countries-aggregated schema:

Field Name | Order | Type (Format) | Description |
---|---|---|---|
Date | 1 | string (default) | |
Country | 2 | string (default) | |
Confirmed | 3 | integer (default) | |
Recovered | 4 | integer (default) | |
Deaths | 5 | integer (default) |
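The Confirmed, Recovered and Deaths figures in this file are cumulative totals (as in the upstream JHU data), so daily new counts can be derived by differencing consecutive dates within each country. A sketch with invented numbers:

```python
import pandas as pd

# Made-up cumulative rows shaped like countries-aggregated
df = pd.DataFrame({
    "Date": pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"]),
    "Country": ["Italy"] * 3,
    "Confirmed": [1694, 2036, 2502],
})

# Cumulative totals -> daily new cases (the first day has no prior value)
df = df.sort_values("Date")
df["NewConfirmed"] = df.groupby("Country")["Confirmed"].diff().fillna(0).astype(int)
print(df)
```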
reference schema:

Field Name | Order | Type (Format) | Description |
---|---|---|---|
UID | 1 | integer (default) | |
iso2 | 2 | string (default) | |
iso3 | 3 | string (default) | |
code3 | 4 | integer (default) | |
FIPS | 5 | string (default) | |
Admin2 | 6 | string (default) | |
Province_State | 7 | string (default) | |
Country_Region | 8 | string (default) | |
Lat | 9 | number (default) | |
Long_ | 10 | number (default) | |
Combined_Key | 11 | string (default) | |
Population | 12 | integer (default) |
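This reference file is a location-metadata lookup (identifiers, coordinates, population). A common use is joining it onto case records by a shared key such as Combined_Key. A sketch with hypothetical rows (the values are invented for illustration):

```python
import pandas as pd

# Hypothetical slice of the reference lookup table
reference = pd.DataFrame({
    "Combined_Key": ["Italy", "US"],
    "Population": [60461826, 329466283],
    "Lat": [41.87, 40.0],
})

# Hypothetical case records keyed the same way
cases = pd.DataFrame({
    "Combined_Key": ["Italy", "US"],
    "Confirmed": [2502, 100],
})

# Attach population metadata to each case row
enriched = cases.merge(reference, on="Combined_Key", how="left")
print(enriched)
```

A left join keeps every case row even when a location is missing from the lookup table.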
time-series-19-covid-combined schema:

Field Name | Order | Type (Format) | Description |
---|---|---|---|
Date | 1 | string (default) | |
Country/Region | 2 | string (default) | |
Province/State | 3 | string (default) | |
Lat | 4 | number (default) | |
Long | 5 | number (default) | |
Confirmed | 6 | integer (default) | |
Recovered | 7 | integer (default) | |
Deaths | 8 | integer (default) |
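Because some countries are broken down by Province/State in this file, a country-level total for a given date requires summing its province rows. A sketch with made-up rows in the same shape:

```python
import pandas as pd

# Made-up rows shaped like time-series-19-covid-combined
df = pd.DataFrame({
    "Date": ["2020-03-01"] * 3,
    "Country/Region": ["Australia", "Australia", "Italy"],
    "Province/State": ["New South Wales", "Victoria", ""],
    "Confirmed": [26, 9, 1694],
})

# Sum province-level rows up to country totals per date
totals = df.groupby(["Date", "Country/Region"], as_index=False)["Confirmed"].sum()
print(totals)
```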
us_confirmed schema:

Field Name | Order | Type (Format) | Description |
---|---|---|---|
UID | 1 | integer (default) | |
iso2 | 2 | string (default) | |
iso3 | 3 | string (default) | |
code3 | 4 | integer (default) | |
FIPS | 5 | number (default) | |
Admin2 | 6 | string (default) | |
Lat | 7 | number (default) | |
Combined_Key | 8 | string (default) | |
Date | 9 | date (%Y-%m-%d) | |
Case | 10 | integer (default) | |
Long | 11 | number (default) | |
Country/Region | 12 | string (default) | |
Province/State | 13 | string (default) |
us_deaths schema:

Field Name | Order | Type (Format) | Description |
---|---|---|---|
UID | 1 | integer (default) | |
iso2 | 2 | string (default) | |
iso3 | 3 | string (default) | |
code3 | 4 | integer (default) | |
FIPS | 5 | number (default) | |
Admin2 | 6 | string (default) | |
Lat | 7 | number (default) | |
Combined_Key | 8 | string (default) | |
Population | 9 | integer (default) | |
Date | 10 | date (%Y-%m-%d) | |
Case | 11 | integer (default) | |
Long | 12 | number (default) | |
Country/Region | 13 | string (default) | |
Province/State | 14 | string (default) |
us_simplified schema:

Field Name | Order | Type (Format) | Description |
---|---|---|---|
Date | 1 | date (%Y-%m-%d) | |
FIPS | 2 | number (default) | |
Admin2 | 3 | string (default) | |
Province/State | 4 | string (default) | |
Confirmed | 5 | integer (default) | |
Deaths | 6 | integer (default) | |
Population | 7 | integer (default) | |
Country/Region | 8 | string (default) |
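Since each county row carries a Population value, raw counts can be normalized, for example as deaths per 100,000 residents. A sketch with invented county figures:

```python
import pandas as pd

# Invented county rows shaped like us_simplified (numbers are illustrative)
df = pd.DataFrame({
    "Admin2": ["King", "Cook"],
    "Province/State": ["Washington", "Illinois"],
    "Confirmed": [500, 800],
    "Deaths": [40, 25],
    "Population": [2252782, 5150233],
})

# Normalize: deaths per 100,000 residents
df["DeathsPer100k"] = df["Deaths"] / df["Population"] * 100_000
print(df[["Admin2", "DeathsPer100k"]])
```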
worldwide-aggregate schema:

Field Name | Order | Type (Format) | Description |
---|---|---|---|
Date | 1 | string (default) | |
Confirmed | 2 | integer (default) | |
Recovered | 3 | integer (default) | |
Deaths | 4 | integer (default) | |
Increase rate | 5 | number (default) |
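The exact formula behind the Increase rate field is not documented here; one plausible reading is the day-over-day percentage change of the cumulative Confirmed total, which pandas computes with `pct_change`. A sketch under that assumption, with made-up numbers:

```python
import pandas as pd

# Made-up worldwide cumulative confirmed counts
df = pd.DataFrame({
    "Date": ["2020-03-01", "2020-03-02", "2020-03-03"],
    "Confirmed": [88000, 90000, 93600],
})

# Day-over-day percentage increase of the cumulative total
# (the first row has no previous day, so its rate is NaN)
df["Increase rate"] = df["Confirmed"].pct_change() * 100
print(df)
```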
Use our data-cli tool designed for data wranglers:
data get https://datahub.io/core/data
data info core/data
tree core/data
# Get a list of the dataset's resources
curl -L -s https://datahub.io/core/data/datapackage.json | grep path
# Get resources
curl -L https://datahub.io/core/data/r/0.csv
curl -L https://datahub.io/core/data/r/1.csv
curl -L https://datahub.io/core/data/r/2.csv
curl -L https://datahub.io/core/data/r/3.csv
curl -L https://datahub.io/core/data/r/4.csv
curl -L https://datahub.io/core/data/r/5.csv
curl -L https://datahub.io/core/data/r/6.csv
curl -L https://datahub.io/core/data/r/7.csv
curl -L https://datahub.io/core/data/r/8.zip
If you are using R, here's how to quickly load the data you want:
install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")
json_file <- 'https://datahub.io/core/data/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))
# get list of all resources:
print(json_data$resources$name)
# print all tabular data(if exists any)
for (i in 1:length(json_data$resources$datahub$type)) {
  if (json_data$resources$datahub$type[i] == 'derived/csv') {
    path_to_file = json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}
Note: You might need to run the script with root permissions if you are running it on a Linux machine.
Install the Frictionless Data datapackage library and pandas itself:
pip install datapackage
pip install pandas
Now you can load the Data Package into pandas:
import datapackage
import pandas as pd

data_url = 'https://datahub.io/core/data/datapackage.json'

# load the Data Package into storage
package = datapackage.Package(data_url)

# load only tabular data
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print(data)
For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):
pip install datapackage
To get the Data Package into your Python environment, run the following code:
from datapackage import Package

package = Package('https://datahub.io/core/data/datapackage.json')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if any exists)
for resource in package.resources:
    if resource.descriptor['datahub']['type'] == 'derived/csv':
        print(resource.read())
If you are using JavaScript, please follow the instructions below.

Install the data.js module using npm:

$ npm install data.js
Once the package is installed, use the following code snippet:
const {Dataset} = require('data.js')

const path = 'https://datahub.io/core/data/datapackage.json'

// We're using a self-invoking function here as we want to use async/await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data (if any exists)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()
Coronavirus disease 2019 (COVID-19) time series listing confirmed cases, reported deaths and reported recoveries. Data is disaggregated by country (and sometimes subregion). Coronavirus disease (COVID-19) is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and has had a worldwide effect. On March 11, 2020, the World Health Organization (WHO) declared it a pandemic, pointing to the over 118,000 cases of the coronavirus illness in over 110 countries and territories around the world at the time.
This dataset includes time series data tracking the number of people affected by COVID-19 worldwide, including confirmed cases, reported deaths and reported recoveries.
Data is in CSV format and updated daily. It is sourced from this upstream repository maintained by the amazing team at Johns Hopkins University Center for Systems Science and Engineering (CSSE) who have been doing a great public service from an early point by collating data from around the world.
We have cleaned and normalized that data, for example tidying dates and consolidating several files into normalized time series. We have also added some metadata, such as column descriptions, and packaged the data as a Data Package.
You can view the data and its structure, as well as download it in alternative formats (e.g. JSON), from the DataHub:
https://datahub.io/core/covid-19
The upstream dataset currently lists the following upstream data sources:
Aggregated data sources:
US data sources at the state (Admin1) or county/city (Admin2) level:
Non-US data sources at the country/region (Admin0) or state/province (Admin1) level:
We will endeavour to provide more detail on how regularly and by which technical means the data is updated. Additional background is available in the CSSE blog, and in the Lancet paper (DOI), which includes this figure:
This repository uses Pandas to process and normalize the data.
You first need to install the dependencies:
pip install -r requirements.txt
Then run the script:
python get_data.py
python process_us_data.py
This dataset is licensed under the Open Data Commons Public Domain Dedication and License (PDDL).
The data comes from a variety of public sources and was collated in the first instance by Johns Hopkins University on GitHub. We have used that data and processed it further. Given the public sources and the factual nature of the data, we believe the data is in the public domain and are therefore releasing the results under the Public Domain Dedication and License. We are also, of course, explicitly licensing any contributions of ours under that license.
- Notifications of data updates and schema changes
- Warranty / guaranteed updates
- Workflow integration (e.g. Python packages, NPM packages)
- Customized data (e.g. you need different or additional data)
- Or suggest your own feature from the link below