Cancer Center


Databases Available through the Cancer Center Observational Methods Shared Resource

(updated March 2014)



CANCER REGISTRIES

National Cancer Data Base (NCDB) Participant User File (PUF)

Topic/focus:

Breast Cancer

Year(s):

1998-2011

Source:

American College of Surgeons – Commission on Cancer

Study and sample characteristics:

Approximately 2.7 million women diagnosed with breast cancer in the U.S. between 1998 and 2011

Universe:

70 percent of all newly diagnosed cases of cancer in the US are diagnosed and treated at CoC-accredited cancer programs and reported to the NCDB. The NCDB, begun in 1989, contains case reports on over 29 million cancers diagnosed between 1985 and 2010.

Variables:

See first link below for available variables.

Access:

Data Use Agreement and Formal Review

Key web links:

http://ncdbpufbeta.facs.org

http://www.facs.org/cancer/ncdb/

 

Summary:

Distributed Participant User Files (PUFs) are organ specific and contain sociodemographic, tumor, treatment, survival and hospital information.

North American Association of Central Cancer Registries (NAACCR)

Topic/focus:

Breast Cancer

Year(s):

2005-2009

Source:

North American Association of Central Cancer Registries

Study and sample characteristics:

Reported breast cancer

Universe:

Cancer registries in US states

Access:

Publicly available downloads of aggregated data

Key web links:

http://www.naaccr.org/ and http://faststats.naaccr.org/

Summary:

Established in 1987, NAACCR, Inc. is a collaborative umbrella organization that develops and promotes uniform data standards for cancer registration; provides education and training; certifies population-based registries; aggregates and publishes data from central cancer registries; and promotes the use of cancer surveillance data and systems for cancer control and epidemiologic research, public health programs, and patient care to reduce the burden of cancer in North America. All central cancer registries in the United States and Canada are members.

State age-specific (crude) incidence rates of ductal carcinoma in situ (DCIS) and invasive breast cancer by age group (women only, age 65+). Data for Kansas and Vermont are suppressed.

 

Aggregated data for other cancers are publicly available.
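
As a rough illustration of what an age-specific (crude) incidence rate is, the sketch below computes cases per 100,000 population within each age group. The function name, age groups, counts and populations are hypothetical placeholders chosen for the example; they are not NAACCR figures, and this is not NAACCR's own method for producing its published rates.

```python
def crude_rate_per_100k(cases, population):
    """Crude incidence rate: cases divided by population at risk, per 100,000."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


# Hypothetical counts for illustration only (not NAACCR data).
example_groups = {
    "65-69": {"cases": 120, "population": 40_000},
    "70-74": {"cases": 75, "population": 30_000},
}

# One rate per age group; no age adjustment is applied, hence "crude".
age_specific_rates = {
    group: crude_rate_per_100k(v["cases"], v["population"])
    for group, v in example_groups.items()
}
```

Because each rate is computed within a single age band, rates from different states can be compared band by band without age adjustment.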

Surveillance, Epidemiology, and End Results (SEER) Program

Topic/focus:

Most cancers

Year(s):

1973-2010

Source:

National Cancer Institute and the Centers for Disease Control and Prevention

Study and sample characteristics:

See summary below.

Universe:

The SEER research data include SEER incidence and population data associated by age, sex, race, year of diagnosis, and geographic areas (including SEER registry and county).

Variables:

See link below for available variables.

Access:

Publicly available but requires a data use agreement

Key web links:

http://seer.cancer.gov/about/

Summary:

The current release includes data for cases diagnosed between 1973 and 2010 and is based on the November 2012 data submission from 18 SEER registries, which contribute cases from different years of diagnosis depending on when they joined the SEER program.

SEER-Medicare: Breast Cancer

Topic/focus:

Breast Cancer

Year(s):

2005-2009

Source:

National Cancer Institute and the Centers for Disease Control and Prevention

Study and sample characteristics:

Cancer registry supplemented with Medicare claims

Universe:

Breast cancer patients and a 5% random sample of cancer-free Medicare beneficiaries residing in SEER registry areas: CA, CT, Detroit metro area, GA, HI, IA, KY, LA, NJ, NM, Seattle/Puget Sound and UT

Variables:

See links below.

Access:

Requires a data use agreement and formal review

Key web links:

http://seer.cancer.gov/ and http://healthservices.cancer.gov/seermedicare/

Summary:

The SEER-Medicare data reflect the linkage of two large population-based sources of data that provide detailed information about Medicare beneficiaries with cancer. The data come from the Surveillance, Epidemiology and End Results (SEER) program of cancer registries that collect clinical, demographic and cause of death information for persons with cancer and the Medicare claims for covered health care services from the time of a person's Medicare eligibility until death.

Medicare claim files are available for 2000-2010 unless otherwise noted:

MedPAR, Outpatient SAF, Carrier SAF and PDE (2007-2010). Restricted information includes the patient's ZIP code and census tract.

SEER-Medicare: Breast Cancer

Topic/focus:

Breast Cancer

Year(s):

1973-2005

Source:

National Cancer Institute and the Centers for Disease Control and Prevention

Study and sample characteristics:

Cancer registry supplemented with Medicare claims

Universe:

Breast cancer patients and a 5% random sample of cancer-free Medicare beneficiaries residing in SEER registry areas: CA, CT, Detroit metro area, GA, HI, IA, KY, LA, NJ, NM, Seattle/Puget Sound and UT

Variables:

See links below.

Access:

Requires a data use agreement and formal review

Key web links:

http://seer.cancer.gov/ and http://healthservices.cancer.gov/seermedicare/

Summary:

The SEER-Medicare data reflect the linkage of two large population-based sources of data that provide detailed information about Medicare beneficiaries with cancer. The data come from the Surveillance, Epidemiology and End Results (SEER) program of cancer registries that collect clinical, demographic and cause of death information for persons with cancer and the Medicare claims for covered health care services from the time of a person's Medicare eligibility until death. 

Medicare claim files are available for 1991-2007 unless otherwise noted:

MedPAR (1986-2007), Outpatient SAF, Carrier SAF, Hospice SAF, DME (1994-2007)

SEER-Medicare: Extrahepatic biliary cancer

Topic/focus:

Extra-hepatic biliary cancer

Year(s):

1991-2011

Source:

National Cancer Institute and the Centers for Disease Control and Prevention

Study and sample characteristics:

Cancer registry supplemented with Medicare claims, and AHA hospital data

Universe:

Extra-hepatic biliary cancer patients and a 5% random sample of cancer-free Medicare beneficiaries residing in SEER registry areas: CA, CT, Detroit metro area, GA, HI, IA, KY, LA, NJ, NM, Seattle/Puget Sound and UT

Variables:

See links below.

Access:

Requires a data use agreement and formal review

Key web links:

http://seer.cancer.gov/ and http://healthservices.cancer.gov/seermedicare/

Summary:

The SEER-Medicare data reflect the linkage of two large population-based sources of data that provide detailed information about Medicare beneficiaries with cancer. The data come from the Surveillance, Epidemiology and End Results (SEER) program of cancer registries that collect clinical, demographic and cause of death information for persons with cancer and the Medicare claims for covered health care services from the time of a person's Medicare eligibility until death. 

Medicare claim files available (claim years 1991-2011): MedPAR, Outpatient SAF, Carrier SAF

Breast Cancer Patient Survey Cohort

Topic/focus:

Breast Cancer:  Treatment, processes of care, quality of life, medications, hospital characteristics, outcomes

Year(s):

2002-2008

Source:

Medicare Files (see  Medicare below)

Patient Survey

American Hospital Association (AHA) Annual Survey

American Medical Association (AMA) Physician Masterfile

Centers for Medicare and Medicaid Services (CMS) Provider of Service File

State tumor registries (CA, FL, IL, NY)

Study and sample characteristics:

3,083 women (aged 65 and older) who underwent incident breast cancer surgery in 2003 and were followed through 2008 with annual telephone surveys that collected sociodemographic, treatment and quality-of-life information. Patient information was linked to a variety of data sources (listed above). Driving distances to the nearest hospital and to the nearest high-volume hospital have been calculated.

Universe:

Cancer registries in CA, FL, IL and NY, patients participating in survey

Variables:

Tumor information from state tumor registries; treatment from claims and the self-reported survey; outcomes from claims and the self-reported survey.

Access:

Contact Tina Yen, Director of Observational Methods Shared Resource

Key web links:

See Medicare, AHA, AMA, NAACCR links

Summary:

3,083 women (65+ years-old) treated in 2003 for incident breast cancer for whom extensive information has been collected via annual surveys and Medicare claims through 2008. Patients have been linked to their providers (hospital and surgeon) and provider-level information is available (AHA and AMA).

 

ADMINISTRATIVE DATASETS

Chronic Condition Warehouse (CCW)

Topic/focus:

Chronic Condition Warehouse for Breast Cancer

Year(s):

2001-2008

Source:

Centers for Medicare & Medicaid Services

Study and sample characteristics:

Medicare claims for patients in the CCW with a breast cancer diagnosis

Universe:

Medicare claims for patients diagnosed in 2001-2008, with follow-up through 2011 (plus pre-2001 claims from 1999-2000)

 

Access:

Requires a data use agreement and formal review

Key web links:

https://www.ccwdata.org and http://www.resdac.org/

Summary:

A collection of medical claims from the Medicare insurance program, which covers virtually all medical care (including pharmaceuticals under Part D from 2006 onward) for those eligible and enrolled. Medicare files provide detailed information about beneficiaries, including Medicare and Medicaid eligibility/enrollment, geographic location and mortality. Medicare claim files available (1999-2011): Inpatient SAF, Outpatient SAF, Carrier SAF and PDE.

 

Claims are unavailable for those enrolled in a Medicare HMO, although Part D claims for HMO enrollees are available. Part D claims are unavailable for those enrolled in Creditable Coverage or the Retiree Drug Subsidy.

Medicare

Topic/focus:

Medicare National Insurance Program for the Elderly and Disabled

Year(s):

1999-2003

Source:

Centers for Medicare & Medicaid Services

Study and sample characteristics:

Medicare claims

Universe:

100% census of Medicare claims (1999-2003 unless otherwise noted), FL, IL, NY and WI (1999-2002)

Access:

Requires a data use agreement and formal review

Key web links:

http://www.resdac.org/

Summary:

A large, population-based administrative databank of medical claims from the Medicare insurance program, which covered virtually all non-pharmaceutical medical care for those eligible and enrolled before Part D began in 2006 (claims are unavailable for those enrolled in a Medicare HMO). Medicare files provide detailed information about beneficiaries, including Medicare and Medicaid eligibility/enrollment, geographic location and mortality. Medicare claim files are available for 1999-2003 unless otherwise noted: Inpatient SAF, Outpatient SAF, Carrier SAF, DME SAF (2003), HHA SAF (2003), Hospice SAF (2003) and SNF SAF (2003).

Note:  2002 SAF claims are also available for CA only.

 

PROVIDER-RELATED DATASETS

 American Hospital Association (AHA) Annual Survey Database

Topic/focus:

Hospital information

Year(s):

1996, 1998, 2000-2011

Source:

American Hospital Association (AHA)

Study and sample characteristics:

Survey of 6,500+ hospitals

Universe:

All United States hospitals

Variables:

1000+, including organizational structure, facility and service lines, inpatient and outpatient utilization, expenses, physician arrangements, staffing, corporate and purchasing affiliations and geographic indicators.

Access:

Data use agreement required

Cost:

~$8,000 for one year of data

Key web links:

http://www.ahadataviewer.com/book-cd-products/aha-survey/

Summary:

Conducted annually by the AHA since 1946, the AHA Annual Survey is one of the most comprehensive sources for individual hospital data available. Hospitals report data for a complete fiscal year. Contains data items on organizational structure, facilities, services, community orientation, utilization, financing, and personnel.

American Medical Association (AMA) Physician Masterfile

Topic/focus:

Provider (physician) information

Year(s):

2004

Source:

American Medical Association (AMA)

Study and sample characteristics:

Survey

Universe:

Virtually all physicians (MD and DO) in the United States, regardless of AMA membership

Variables:

~30 variables, including gender, date of graduation and specialty

Access:

Restricted Use

Key web links:

http://www.ama-assn.org/ama/pub/about-ama/physician-data-resources/physician-masterfile.page

Summary:

Established by the American Medical Association (AMA) in 1906, the Physician Masterfile was initially developed as a record keeping device supporting membership and mailing activities. Since then, the Masterfile has expanded to include significant education, training and professional certification information on virtually all Doctors of Medicine (MD) and Doctors of Osteopathic Medicine (DO) in the United States, Puerto Rico, Virgin Islands, and certain Pacific Islands.  The Physician Masterfile includes current and historical data for more than 1.4 million physicians, residents, and medical students in the United States. This figure includes approximately 411,000 graduates of foreign medical schools who reside in the United States and who have met the educational and credentialing requirements necessary for recognition.

Area Resource File (ARF)

Topic/focus:

Cost and Utilization, Demographics, Healthcare Facilities and Services, Vital events, Other

Year(s):

2008

Source:

Health Resources and Services Administration (HRSA)

Study and sample characteristics:

Survey and vital records

Universe:

All United States counties

Access:

Current data is available online after completion of a data use agreement

Key web links:

http://arf.hrsa.gov/

Summary:

Produced annually, the ARF is a county-level compilation of existing data from numerous sources including the American Hospital Association, the American Medical Association, the U.S. Census Bureau, the National Center for Health Statistics, and the Health Care Financing Administration. The ARF is cumulative, with the completeness and frequency of data elements varying by source.  The ARF contains data items on health professions, health professions training, health facilities, hospitalization utilization, hospital expenditures, population characteristics and economic data, and environment. Also available are geographic descriptors--such as Federal Information Processing Standards (FIPS) codes and Metropolitan Area (MA) codes--that allow for aggregation of county data into other geographic groupings.
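
As a minimal sketch of how FIPS codes support this kind of regrouping, the example below rolls county-level values up to the state level. It relies on the standard layout of a five-digit county FIPS code, in which the first two digits identify the state. The function name and the county values are hypothetical; they are not taken from the ARF.

```python
from collections import defaultdict


def aggregate_by_state(county_values):
    """Sum county-level values into totals keyed by 2-digit state FIPS code."""
    totals = defaultdict(int)
    for fips, value in county_values.items():
        # Restore leading zeros that are lost when a code is stored as an integer.
        fips = str(fips).zfill(5)
        totals[fips[:2]] += value
    return dict(totals)


# Hypothetical county-level counts for illustration only (not ARF data).
example_counties = {
    "55079": 10,  # state 55
    "55025": 4,   # state 55
    "17031": 20,  # state 17
    6037: 30,     # state 06, stored as an integer without its leading zero
}

state_totals = aggregate_by_state(example_counties)
```

Keeping FIPS codes as zero-padded strings avoids mismatches for states whose codes begin with 0.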

 

SOCIODEMOGRAPHIC DATABASES

Rural-Urban Commuting Area Codes (RUCA)

Topic/focus:

Rural-Urban

Year(s):

2006

Source:

WWAMI Rural Health Research Center

(WWAMI = Washington, Wyoming, Alaska, Montana, Idaho)

Study and sample characteristics:

2006 ZIP code areas (approximated from Census tracts using Claritas Inc. crosswalk)

Universe:

2000 Census commuting data for U.S. residents and 2006 ZIP code areas

Access:

Publicly available

Key web links:

http://depts.washington.edu/uwruca/

Summary:

RUCAs, Rural-Urban Commuting Area Codes, are a newer Census tract-based classification scheme that utilizes the standard Bureau of Census Urbanized Area and Urban Cluster definitions in combination with work commuting information to characterize all of the nation's Census tracts regarding their rural and urban status and relationships. In addition, a ZIP Code RUCA approximation has been developed.

U.S. Census Bureau Data

Topic/focus:

Demographics

Year(s):

2000

Source:

U.S. Census Bureau

Study and sample characteristics:

Demographic and economic information determined by survey

Universe:

U.S. Residents

Access:

Publicly available downloads from the website

Cost:

Free

Key web links:

http://www.census.gov/main/www/access.html

Summary:

Census of Population and Housing data is derived from the 2000 decennial census. The decennial census was redesigned following the 2000 census.  The Census Bureau also provides statistics from the American Community Survey and the Economic Census. Other resources include GIS mapping tools (e.g. shapefiles and spatial data) and interactive population and economic maps.

 

 


webmaster@mcw.edu
© 2014 Medical College of Wisconsin
Page Updated 05/22/2014