Files | Size | Format | Created | Updated | License | Source |
---|---|---|---|---|---|---|
2 | 129kB | csv, zip | 5 years ago | 5 years ago | public_domain_dedication_and_license | Standard and Poor's Case-Shiller Indices |
Download files in this dataset
File | Description | Size | Download |
---|---|---|---|
cities | Case-Shiller US home price index levels at national and city level. Monthly. | 52kB | csv (52kB), json (183kB) |
house-prices-us_zip | Compressed versions of dataset. Includes normalized CSV and JSON data with original data and datapackage.json. | 74kB | zip (74kB) |
This is a preview version; the original dataset may contain more data.
Field Name | Order | Type (Format) | Description |
---|---|---|---|
Date | 1 | date (%Y-%m-%d) | |
AZ-Phoenix | 2 | number | |
CA-Los Angeles | 3 | number | |
CA-San Diego | 4 | number | |
CA-San Francisco | 5 | number | |
CO-Denver | 6 | number | |
DC-Washington | 7 | number | |
FL-Miami | 8 | number | |
FL-Tampa | 9 | number | |
GA-Atlanta | 10 | number | |
IL-Chicago | 11 | number | |
MA-Boston | 12 | number | |
MI-Detroit | 13 | number | |
MN-Minneapolis | 14 | number | |
NC-Charlotte | 15 | number | |
NV-Las Vegas | 16 | number | |
NY-New York | 17 | number | |
OH-Cleveland | 18 | number | |
OR-Portland | 19 | number | |
TX-Dallas | 20 | number | |
WA-Seattle | 21 | number | |
Composite-10 | 22 | number | |
Composite-20 | 23 | number | |
National-US | 24 | number | |
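The schema above maps directly onto a pandas DataFrame. As a small sketch (the sample rows and values below are invented; only the column names and the %Y-%m-%d date format come from the table above), the Date field can be parsed into real timestamps on load:

```python
import io

import pandas as pd

# Invented sample rows in the shape of the cities resource
# (only a few of the 24 columns listed above).
sample_csv = """Date,AZ-Phoenix,CA-Los Angeles,National-US
2016-01-01,155.43,238.12,175.01
2016-02-01,156.02,239.55,175.68
"""

# parse_dates converts the Date column (%Y-%m-%d) to datetime64
df = pd.read_csv(io.StringIO(sample_csv), parse_dates=["Date"])
print(df["Date"].dt.strftime("%Y-%m-%d").tolist())
# → ['2016-01-01', '2016-02-01']
```

The same `parse_dates=["Date"]` call works unchanged on the full 24-column CSV.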
Use our data-cli tool designed for data wranglers:
data get https://datahub.io/core/house-prices-us
data info core/house-prices-us
tree core/house-prices-us
# Get a list of dataset's resources
curl -L -s https://datahub.io/core/house-prices-us/datapackage.json | grep path
# Get resources
curl -L https://datahub.io/core/house-prices-us/r/0.csv
curl -L https://datahub.io/core/house-prices-us/r/1.zip
If you are using R, here's how to get the data quickly loaded:
install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")

json_file <- 'https://datahub.io/core/house-prices-us/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# get list of all resources:
print(json_data$resources$name)

# print all tabular data (if any exists)
for (i in 1:length(json_data$resources$datahub$type)) {
  if (json_data$resources$datahub$type[i] == 'derived/csv') {
    path_to_file <- json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}
Note: you might need to run the script with root permissions if you are on a Linux machine.
Install the Frictionless Data `datapackage` library and pandas:
pip install datapackage
pip install pandas
Now you can use the Data Package with pandas:
import datapackage
import pandas as pd

data_url = 'https://datahub.io/core/house-prices-us/datapackage.json'

# load the Data Package
package = datapackage.Package(data_url)

# load only tabular data
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print(data)
For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):
pip install datapackage
To get the Data Package into your Python environment, run the following code:
from datapackage import Package

package = Package('https://datahub.io/core/house-prices-us/datapackage.json')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if any exists)
for resource in package.resources:
    if resource.descriptor['datahub']['type'] == 'derived/csv':
        print(resource.read())
If you are using JavaScript, follow the instructions below. Install the data.js module using npm:
$ npm install data.js
Once the package is installed, use the following code snippet:
const {Dataset} = require('data.js')

const path = 'https://datahub.io/core/house-prices-us/datapackage.json'

// We use a self-invoking function here so we can use async/await syntax:
;(async () => {
  const dataset = await Dataset.load(path)

  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }

  // get all tabular data (if any exists)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]

      // get a raw stream
      const stream = await file.stream()

      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer

      // print data
      stream.pipe(process.stdout)
    }
  }
})()
Case-Shiller Index of US residential house prices. Data comes from S&P Case-Shiller data and includes both the national index and the indices for 20 metropolitan regions. The indices are created using a repeat-sales methodology.
As per the home page for Indices on S&P website:
The S&P/Case-Shiller U.S. National Home Price Index is a composite of single-family home price indices for the nine U.S. Census divisions and is calculated monthly. It is included in the S&P/Case-Shiller Home Price Index Series which seeks to measure changes in the total value of all existing single-family housing stock.
Documentation of the methodology can be found at: http://www.spindices.com/documents/methodologies/methodology-sp-cs-home-price-indices.pdf
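Index levels like those in the cities resource are only meaningful relative to each other, so the usual first operation is computing percent change between periods. A minimal sketch (the index values below are invented for illustration; real values come from the cities resource):

```python
import pandas as pd

# Invented index levels for one metro area; real values come from the
# cities resource described above.
idx = pd.Series(
    [150.0, 153.0, 156.06],
    index=pd.to_datetime(["2016-01-01", "2016-02-01", "2016-03-01"]),
    name="AZ-Phoenix",
)

# Month-over-month price growth in percent.
mom = idx.pct_change() * 100
print(mom.round(2).iloc[1:].tolist())
# → [2.0, 2.0]
```

The same `pct_change` call with `periods=12` gives year-over-year growth on the monthly series.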
Key points are (excerpted from methodology):
To download and process the data, run:
python scripts/process.py
Updated data files will then be in the data directory.
Note: the URLs and structure of the source data have evolved over time with the source data URLs changing on every release.
Originally (2013) the site provided a table of links, but these are not direct file URLs and you have to dig around in S&P's JavaScript to find the actual download locations. As of mid-2014 the data is consolidated in one primary XLS, but the HTML you see in your browser and the source HTML are different. In addition, the actual location of the XLS file continues to change on each release.
Any rights of the maintainer are licensed under the PDDL. The exact legal status of the source data (and hence of the resulting processed data) is unclear, but there could be a presumption of public domain given its factual nature and US provenance. However, the current application of the PDDL reflects the maintainer's best guess (and comes with no warranty).