Using Census Data in Research

An Introduction to Statistics Canada Census Data

Cody Fullerton

Data & Social Science Librarian

University of Manitoba Libraries

Overview:

  1. Data Liberation Initiative
  2. Geography Terminology
  3. Statistics Canada Data Portal
  4. Microdata
  5. <odesi> & Nesstar
  6. Beyond 20/20 Demonstration

Data Liberation Initiative (DLI)

  • The DLI is a partnership between post-secondary institutions and StatCan for improving access to Canadian data resources.
  • The program began as a way for Canadian universities to collectively purchase StatCan files for use by researchers.
  • It has evolved into training and support for university data librarians so they can best serve their communities.
  • As members, we have free access to Public Use Microdata Files (PUMFs), specialized aggregate tables, and the Postal Code Conversion File (PCCF).

Geography Terminology

Census Metropolitan Area & Census Agglomeration (CMA/CA)

Area consisting of one or more neighbouring municipalities situated around a core. A census metropolitan area must have a total population of at least 100,000 of which 50,000 or more live in the core. A census agglomeration must have a core population of at least 10,000.
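The population thresholds above can be written as a small check. This is only a minimal sketch in Python: the function name `classify_area` and its two arguments are illustrative assumptions, and the sketch applies the quoted population thresholds only, not any other rule Statistics Canada uses when delineating these areas.

```python
def classify_area(total_population: int, core_population: int) -> str:
    """Classify an area using only the population thresholds quoted above.

    Hypothetical helper for illustration; it is not a Statistics Canada tool.
    """
    # CMA: total population of at least 100,000, with 50,000 or more in the core.
    if total_population >= 100_000 and core_population >= 50_000:
        return "CMA"
    # CA: core population of at least 10,000.
    if core_population >= 10_000:
        return "CA"
    return "neither"


if __name__ == "__main__":
    # Made-up population figures, used only to exercise each branch.
    for total, core in [(800_000, 700_000), (60_000, 40_000), (8_000, 5_000)]:
        print(total, core, classify_area(total, core))
```

Under this sketch, an area with 120,000 people but only 45,000 in the core falls under the CA threshold, because it misses the core requirement for a CMA.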

Census Tract

Census tracts (CTs) are small, relatively stable geographic areas that usually have a population between 2,500 and 8,000 persons. They are located in census metropolitan areas and in census agglomerations that had a core population of 50,000 or more in the previous census.

Census Subdivision (CSD)

Area that is a municipality or an area that is deemed to be equivalent to a municipality for statistical reporting purposes (e.g., an Indigenous reserve or an unorganized territory). Municipal status is defined by laws in effect in each province and territory in Canada.

Dissemination Area (DA)

Small area composed of one or more neighbouring blocks, with a population of 400 to 700 persons. All of Canada is divided into dissemination areas.

Dissemination Block (DB)

An area equivalent to a city block bounded by intersecting streets. These areas cover all of Canada.

Postal Codes

A quick note on Postal Codes:

  • They are not typically used as census geography
  • Sometimes data is sorted by Forward Sortation Area (FSA), the first three characters of a postal code
  • If a researcher needs data by postal code, they need to use the Postal Code Conversion File (PCCF)
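As a rough illustration of the FSA idea, the sketch below pulls the first three characters out of a postal code. The function name `forward_sortation_area` is an assumption made for this example, and the pattern checks only the letter-digit-letter digit-letter-digit shape, not whether the code is actually in use. Linking postal codes to census geography still requires the PCCF, which this sketch does not attempt.

```python
import re

# Shape of a Canadian postal code with the space removed: A1A1A1.
# This checks the shape only; it does not confirm the code exists.
POSTAL_CODE_SHAPE = re.compile(r"^[A-Z][0-9][A-Z][0-9][A-Z][0-9]$")


def forward_sortation_area(postal_code: str) -> str:
    """Return the FSA (first three characters) of a postal code.

    Hypothetical helper for illustration only.
    """
    normalized = postal_code.replace(" ", "").upper()
    if not POSTAL_CODE_SHAPE.match(normalized):
        raise ValueError(f"not a postal code shape: {postal_code!r}")
    return normalized[:3]


if __name__ == "__main__":
    # Example input; spacing and case are normalized before slicing.
    print(forward_sortation_area("r3t 2n2"))
```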

Census vs National Household Survey (NHS)

  • In 2011, the mandatory long-form census was replaced by the voluntary National Household Survey (NHS); the short-form census remained mandatory
  • Data from the 2011 Census is accurate, but the short form includes fewer variables and, as a result, is much less useful to the government and researchers.

Statistics Canada Data Portal

Find everything from federal census data, to provincial data, to local district data.

  1. Geography 
  2. Data
  3. 2016 Census Profile

What is Microdata?

In the study of survey and census data, microdata is information at the level of individual respondents.

Advantages:

Census results are most commonly published as aggregates both for privacy reasons and because of the large quantities of data involved; microdata for one census can easily contain millions of records, each with several dozen data items.

Summarizing results to an aggregate level results in information loss. For instance, if statistics for education and employment are aggregated separately, they cannot be used to explore a relationship between them. Access to microdata allows researchers much more freedom to investigate such interactions and perform detailed analysis.
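The information loss described above can be seen with a toy example. The six records below are invented for illustration and are not drawn from any Statistics Canada file; the variable names are likewise assumptions. The separate totals for education and employment say nothing about how the two relate, while the respondent-level records allow a cross-tabulation.

```python
from collections import Counter

# Hypothetical microdata: one (education, employment) record per respondent.
microdata = [
    ("degree", "employed"),
    ("degree", "employed"),
    ("degree", "unemployed"),
    ("no degree", "employed"),
    ("no degree", "unemployed"),
    ("no degree", "unemployed"),
]

# Aggregated separately: two independent tables of totals.
education_totals = Counter(education for education, _ in microdata)
employment_totals = Counter(employment for _, employment in microdata)

# Cross-tabulated from microdata: counts for each combination.
crosstab = Counter(microdata)


def employment_rate(education: str) -> float:
    """Share of respondents with the given education who are employed."""
    return crosstab[(education, "employed")] / education_totals[education]


if __name__ == "__main__":
    print(dict(education_totals))
    print(dict(employment_totals))
    for level in education_totals:
        print(level, round(employment_rate(level), 2))
```

Both separate tables show an even three-to-three split, so neither hints at a relationship; only the cross-tabulation shows that the employment rate differs between the two education groups in this made-up sample.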

Disadvantages:

Microdata analysis requires a well-developed understanding of statistics and of the software that you're using.

Common software choices for analyzing microdata:

  • SAS
  • Stata
  • SPSS
  • Excel
  • Beyond 20/20

<odesi>

<odesi> (Ontario Data Documentation, Extraction Service and Infrastructure) is a digital repository for social science data, including polling data. It is a web-based data exploration, extraction and analysis tool. It provides researchers the ability to search for variables across thousands of datasets. There are both microdata and aggregate data available, in a range of formats.

Nesstar & RDS

Nesstar is a web-based exploration, extraction and analysis tool for social science data. The Nesstar data portal consists of Public Use Microdata Files (PUMFs) and Master Files (RDC).

Rich Data Services (RDS) is an analytical platform for PUMFs and their metadata. The interfaces of the RDS Explorer and Tabulation Engine allow users to browse, interact with, and download data and metadata for online or offline analysis.

Beyond 20/20

Demo & Activity

Questions?
