US House Price Index (Case-Shiller)


Files: 2
Size: 129kB
Format: csv, zip
Created: 6 years ago
Updated: 6 years ago
License: public_domain_dedication_and_license
Source: Standard and Poors Case-Shiller Indices

Data Files

Download files in this dataset

cities (52kB): Case-Shiller US home price index levels at national and city level. Monthly. Download: csv (52kB), json (183kB)
house-prices-us_zip (74kB): Compressed versions of dataset. Includes normalized CSV and JSON data with original data and datapackage.json. Download: zip (74kB)

cities  


Field information

Field Name         Order  Type (Format)     Description
Date               1      date (%Y-%m-%d)
AZ-Phoenix         2      number
CA-Los Angeles     3      number
CA-San Diego       4      number
CA-San Francisco   5      number
CO-Denver          6      number
DC-Washington      7      number
FL-Miami           8      number
FL-Tampa           9      number
GA-Atlanta         10     number
IL-Chicago         11     number
MA-Boston          12     number
MI-Detroit         13     number
MN-Minneapolis     14     number
NC-Charlotte       15     number
NV-Las Vegas       16     number
NY-New York        17     number
OH-Cleveland       18     number
OR-Portland        19     number
TX-Dallas          20     number
WA-Seattle         21     number
Composite-10       22     number
Composite-20       23     number
National-US        24     number
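
The field list above can be applied when reading the CSV with only the Python standard library. What follows is a minimal sketch, assuming the file uses the header names listed above and leaves a cell empty where a series has no value for that month. The sample rows are made-up placeholders rather than real index levels, and `parse_rows` is a hypothetical helper name.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample in the layout described above (only three of the
# 24 columns are shown; the numbers are placeholders, not real index levels).
SAMPLE_CSV = """Date,AZ-Phoenix,National-US
1987-01-01,,63.5
1987-02-01,,64.0
1989-01-01,70.25,75.0
"""


def parse_rows(text):
    """Yield one dict per row: Date as datetime.date, other fields as float or None."""
    for raw in csv.DictReader(io.StringIO(text)):
        row = {"Date": datetime.strptime(raw["Date"], "%Y-%m-%d").date()}
        for name, value in raw.items():
            if name != "Date":
                # Assumption: an empty cell means "no observation for this month".
                row[name] = float(value) if value and value.strip() else None
        yield row


rows = list(parse_rows(SAMPLE_CSV))
print(rows[0]["Date"], rows[0]["AZ-Phoenix"], rows[2]["AZ-Phoenix"])
```

To read a downloaded copy instead of the inline sample, the file contents can be passed to `parse_rows` as a string.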

Integrate this dataset into your favourite tool

Use our data-cli tool designed for data wranglers:

data get https://datahub.io/core/house-prices-us
data info core/house-prices-us
tree core/house-prices-us
# Get a list of dataset's resources
curl -L -s https://datahub.io/core/house-prices-us/datapackage.json | grep path

# Get resources

curl -L https://datahub.io/core/house-prices-us/r/0.csv

curl -L https://datahub.io/core/house-prices-us/r/1.zip
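
Filtering the descriptor with grep works for a quick look, but parsing the JSON is less fragile. The sketch below shows one way to do that in Python, assuming each entry under `resources` carries `name` and `path` keys. The inline descriptor is a trimmed, hypothetical stand-in for the real datapackage.json, and `resource_paths` is an illustrative helper name.

```python
import json

# Hypothetical, heavily trimmed descriptor; the real datapackage.json has
# more fields, and its resource names and paths may differ.
DESCRIPTOR_TEXT = """
{
  "name": "house-prices-us",
  "resources": [
    {"name": "cities", "format": "csv", "path": "data/cities.csv"},
    {"name": "house-prices-us_zip", "format": "zip", "path": "house-prices-us.zip"}
  ]
}
"""


def resource_paths(descriptor):
    """Map each resource name to its path, skipping entries that lack either key."""
    return {
        res["name"]: res["path"]
        for res in descriptor.get("resources", [])
        if "name" in res and "path" in res
    }


paths = resource_paths(json.loads(DESCRIPTOR_TEXT))
for name, path in paths.items():
    print(name, path)
```

To run this against the live descriptor, the JSON could be fetched with `urllib.request.urlopen` and passed through `json.load` before calling `resource_paths`.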

If you are using R, here's how to load the data quickly:

install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")

json_file <- 'https://datahub.io/core/house-prices-us/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# get list of all resources:
print(json_data$resources$name)

# print all tabular data (if any exists)
for(i in seq_along(json_data$resources$datahub$type)){
  if(json_data$resources$datahub$type[i]=='derived/csv'){
    path_to_file <- json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}

Note: You might need to run the script with root permissions if you are running it on a Linux machine.

Install the Frictionless Data `datapackage` library and pandas:

pip install datapackage
pip install pandas

Now you can load the Data Package into pandas:

import datapackage
import pandas as pd

data_url = 'https://datahub.io/core/house-prices-us/datapackage.json'

# to load Data Package into storage
package = datapackage.Package(data_url)

# to load only tabular data
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print(data)

For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):

pip install datapackage

To load the Data Package into your Python environment, run the following code:

from datapackage import Package

package = Package('https://datahub.io/core/house-prices-us/datapackage.json')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if exists any)
for resource in package.resources:
    if resource.descriptor['datahub']['type'] == 'derived/csv':
        print(resource.read())

If you are using JavaScript, follow the instructions below:

Install the data.js module using npm:

  $ npm install data.js

Once the package is installed, use the following code snippet:

const {Dataset} = require('data.js')

const path = 'https://datahub.io/core/house-prices-us/datapackage.json'

// We're using a self-invoking function here because we want to use async/await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data(if exists any)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()

Read me

Case-Shiller Index of US residential house prices. Data comes from S&P Case-Shiller data and includes both the national index and the indices for 20 metropolitan regions. The indices are created using a repeat-sales methodology.

Data

As per the home page for Indices on the S&P website:

The S&P/Case-Shiller U.S. National Home Price Index is a composite of single-family home price indices for the nine U.S. Census divisions and is calculated monthly. It is included in the S&P/Case-Shiller Home Price Index Series which seeks to measure changes in the total value of all existing single-family housing stock.

Documentation of the methodology can be found at: http://www.spindices.com/documents/methodologies/methodology-sp-cs-home-price-indices.pdf

Key points are (excerpted from methodology):

  • The indices use the “repeat sales method” of index calculation which uses data on properties that have sold at least twice, in order to capture the true appreciated value of each specific sales unit.
  • The quarterly S&P/Case-Shiller U.S. National Home Price Index aggregates nine quarterly U.S. Census division repeat sales indices using a base period and estimates of the aggregate value of single family housing stock for those periods.
  • The S&P/Case-Shiller Home Price Indices originated in the 1980s by Case Shiller Weiss’s research principals, Karl E. Case and Robert J. Shiller. At the time, Case and Shiller developed the repeat sales pricing technique. This methodology is recognized as the most reliable means to measure housing price movements and is used by other home price index publishers, including the Office of Federal Housing Enterprise Oversight (OFHEO).
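
To make the repeat sales idea in the key points above concrete, here is a minimal sketch of the basic log-price regression form of the method: each pair of sales of the same property contributes one observation of log(second price / first price), and a least-squares fit recovers one index level per period. This is a textbook simplification for illustration, not the procedure S&P actually applies; the methodology document linked above is the authoritative description. The function names, the period numbering and the sale pairs are all hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: some period has no usable sale pair")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def repeat_sales_index(pairs, n_periods, base=100.0):
    """Estimate index levels from (first_period, first_price, second_period, second_price)."""
    k = n_periods - 1  # period 0 is the base period, so it gets no coefficient
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for t1, p1, t2, p2 in pairs:
        y = math.log(p2 / p1)
        # Regressor row: -1 at the first sale period, +1 at the second.
        x = [0.0] * k
        if t1 > 0:
            x[t1 - 1] = -1.0
        if t2 > 0:
            x[t2 - 1] = 1.0
        # Accumulate the normal equations (X'X) beta = X'y.
        for i in range(k):
            xty[i] += x[i] * y
            for j in range(k):
                xtx[i][j] += x[i] * x[j]
    beta = solve(xtx, xty)
    return [base] + [base * math.exp(b) for b in beta]


# Hypothetical sale pairs: three properties, each sold twice across three periods.
pairs = [
    (0, 200000.0, 1, 220000.0),
    (1, 300000.0, 2, 330000.0),
    (0, 150000.0, 2, 181500.0),
]
index = repeat_sales_index(pairs, n_periods=3)
print([round(level, 2) for level in index])
```

Because the regression only uses price changes on the same property, differences in quality between homes cancel out, which is the property the excerpt describes as capturing the appreciated value of each specific sales unit.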

Preparation

To download and process the data, run:

python scripts/process.py

Updated data files will then be in the data directory.

Note: the URLs and structure of the source data have changed over time, and the source data URLs change on every release.

Originally (2013) the site provided a table of links, but these are not direct file URLs and you have to dig around in S&P’s JavaScript to find the actual download locations. As of mid-2014 the data is consolidated in one primary XLS, but the HTML you see in your browser and the source HTML are different. In addition, the actual location of the XLS file continues to change on each release.

License

Any rights of the maintainer are licensed under the PDDL. The exact legal status of the source data (and hence of the resulting processed data) is unclear, but there could be a presumption of public domain given its factual nature and US provenance. The current application of the PDDL reflects the maintainer's best guess and comes with no warranty.

