Cancer Research Data Catalog
Databases available through the Cancer Center Observational Methods Shared Resource
Cancer Registries
Topic/focus: |
Breast Cancer: Treatment, processes of care, quality of life, medications, hospital characteristics, outcomes |
Year(s): |
2002-2008 |
Source: |
Medicare files (see Medicare below); patient survey; American Hospital Association (AHA) Annual Survey; American Medical Association (AMA) Physician Masterfile; Centers for Medicare and Medicaid Services (CMS) Provider of Service File; state tumor registries (CA, FL, IL, NY) |
Study and sample characteristics: |
3,083 women (aged 65+) who underwent surgery for incident breast cancer in 2003 and were followed through 2008 with annual telephone surveys that collected sociodemographic, treatment and quality-of-life information. Patient information was linked to a variety of data sources (listed above). Driving distances to the nearest hospital and to the nearest high-volume hospital have been calculated. |
Universe: |
Cancer registries in CA, FL, IL and NY, patients participating in survey |
Variables: |
Tumor information from state tumor registries; treatment and outcomes from both Medicare claims and the self-reported patient survey. |
Access: |
Contact Tina Yen, Director of Observational Methods Shared Resource |
Key web links: |
See Medicare, AHA, AMA, NAACCR links |
Summary: |
3,083 women (65+ years-old) treated in 2003 for incident breast cancer for whom extensive information has been collected via annual surveys and Medicare claims through 2008. Patients have been linked to their providers (hospital and surgeon) and provider-level information is available (AHA and AMA). |
Topic/focus: |
Breast Cancer |
Year(s): |
1998-2011 |
Source: |
American College of Surgeons – Commission on Cancer |
Study and sample characteristics: |
Approximately 2.7 million women diagnosed with breast cancer in the U.S. between 1998 and 2011 |
Universe: |
70 percent of all newly diagnosed cases of cancer in the US are diagnosed and treated at Commission on Cancer (CoC)-accredited cancer programs and reported to the National Cancer Data Base (NCDB). The NCDB, begun in 1989, contains case reports on over 29 million cancers diagnosed between 1985 and 2010. |
Variables: |
See first link below for available variables. |
Access: |
Data Use Agreement and Formal Review |
Key web links: |
|
Summary: |
The distributed Participant User Files (PUFs) are organ-specific and contain sociodemographic, tumor, treatment, survival and hospital information. |
Topic/focus: |
Breast Cancer |
Year(s): |
2005-2009 |
Source: |
North American Association of Central Cancer Registries |
Study and sample characteristics: |
Reported breast cancer |
Universe: |
Cancer registries in US states |
Access: |
Publicly available downloads of aggregated data |
Key web links: |
|
Summary: |
Established in 1987, NAACCR, Inc. is a collaborative umbrella organization that develops and promotes uniform data standards for cancer registration; provides education and training; certifies population-based registries; aggregates and publishes data from central cancer registries; and promotes the use of cancer surveillance data and systems for cancer control and epidemiologic research, public health programs, and patient care to reduce the burden of cancer in North America. All central cancer registries in the United States and Canada are members. The data held here are state-level, age-specific (crude) incidence rates of ductal carcinoma in situ (DCIS) and invasive breast cancer by age group (women only, age 65+). No data are available for Kansas or Vermont (data suppressed). Aggregated data for other cancers are publicly available. |
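As a rough illustration of what an age-specific (crude) incidence rate measures, the sketch below divides the number of new cases in one age group by the population at risk in that group and scales the result to a rate per 100,000. The function name and the counts are hypothetical and are not NAACCR values.

```python
def crude_rate_per_100k(cases: int, population: int) -> float:
    """Crude incidence rate: new cases per 100,000 people at risk."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 100_000


# Hypothetical counts for women aged 65+ in one state (illustrative only).
rate = crude_rate_per_100k(cases=450, population=120_000)
print(round(rate, 1))
```

Because the rate is computed within a single age group, no age standardization is applied; comparing states with different age structures would need age-adjusted rates instead.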
Topic/focus: |
Breast Cancer |
Year(s): |
2005-2009 |
Source: |
National Cancer Institute and the Centers for Disease Control and Prevention |
Study and sample characteristics: |
Cancer registry supplemented with Medicare claims |
Universe: |
Breast cancer patients and a 5% random sample of cancer-free Medicare beneficiaries residing in SEER registry areas: CA, CT, Detroit metro area, GA, HI, IA, KY, LA, NJ, NM, Seattle/Puget Sound and UT |
Variables: |
See links below. |
Access: |
Requires a data use agreement and formal review |
Key web links: |
SEER and SEER Medicare |
Summary: |
The SEER-Medicare data reflect the linkage of two large population-based sources of data that provide detailed information about Medicare beneficiaries with cancer. The data come from the Surveillance, Epidemiology and End Results (SEER) program of cancer registries, which collect clinical, demographic and cause-of-death information for persons with cancer, and from the Medicare claims for covered health care services from the time of a person's Medicare eligibility until death. Medicare claim files are available for 2000-2010 unless otherwise noted: MedPAR, Outpatient SAF, Carrier SAF and PDE (2007-2010). Restricted information includes the patient's ZIP code and Census tract. |
Topic/focus: |
Breast Cancer |
Year(s): |
1973-2005 |
Source: |
National Cancer Institute and the Centers for Disease Control and Prevention |
Study and sample characteristics: |
Cancer registry supplemented with Medicare claims |
Universe: |
Breast cancer patients and a 5% random sample of cancer-free Medicare beneficiaries residing in SEER registry areas: CA, CT, Detroit metro area, GA, HI, IA, KY, LA, NJ, NM, Seattle/Puget Sound and UT |
Variables: |
See links below. |
Access: |
Requires a data use agreement and formal review |
Key web links: |
SEER and SEER Medicare |
Summary: |
The SEER-Medicare data reflect the linkage of two large population-based sources of data that provide detailed information about Medicare beneficiaries with cancer. The data come from the Surveillance, Epidemiology and End Results (SEER) program of cancer registries, which collect clinical, demographic and cause-of-death information for persons with cancer, and from the Medicare claims for covered health care services from the time of a person's Medicare eligibility until death. Medicare claim files are available for 1991-2007 unless otherwise noted: MedPAR (1986-2007), Outpatient SAF, Carrier SAF, Hospice SAF and DME (1994-2007). |
Topic/focus: |
Extra-hepatic biliary cancer |
Year(s): |
1991-2011 |
Source: |
National Cancer Institute and the Centers for Disease Control and Prevention |
Study and sample characteristics: |
Cancer registry supplemented with Medicare claims, and AHA hospital data |
Universe: |
Extra-hepatic biliary cancer patients and a 5% random sample of cancer-free Medicare beneficiaries residing in SEER registry areas: CA, CT, Detroit metro area, GA, HI, IA, KY, LA, NJ, NM, Seattle/Puget Sound and UT |
Variables: |
See links below. |
Access: |
Requires a data use agreement and formal review |
Key web links: |
SEER and SEER Medicare |
Summary: |
The SEER-Medicare data reflect the linkage of two large population-based sources of data that provide detailed information about Medicare beneficiaries with cancer. The data come from the Surveillance, Epidemiology and End Results (SEER) program of cancer registries, which collect clinical, demographic and cause-of-death information for persons with cancer, and from the Medicare claims for covered health care services from the time of a person's Medicare eligibility until death. Medicare claim files available (1991-2011 claim years): MedPAR, Outpatient SAF and Carrier SAF. |
Topic/focus: |
Most cancers |
Year(s): |
1973-2010 |
Source: |
National Cancer Institute and the Centers for Disease Control and Prevention |
Study and sample characteristics: |
See summary below. |
Universe: |
The SEER research data include SEER incidence and population data associated by age, sex, race, year of diagnosis, and geographic areas (including SEER registry and county). |
Variables: |
See link below for available variables. |
Access: |
Publicly available but requires a Data Use Agreement |
Key web links: |
|
Summary: |
The current release includes data for cases diagnosed between 1973 and 2010 and is based on the November 2012 data submission from 18 SEER registries, which contribute cases from different years of diagnosis depending on when they joined the SEER program. |
Administrative Datasets
Topic/focus: |
Chronic Condition Warehouse for Breast Cancer |
Year(s): |
2001-2008 |
Source: |
Centers for Medicare & Medicaid Services |
Study and sample characteristics: |
Medicare claims for patients with a breast cancer diagnosis in the Chronic Condition Warehouse (CCW) |
Universe: |
Medicare claims for patients diagnosed 2001-2008, with follow-up through 2011 and claims from the pre-2001 period (1999-2000) |
Access: |
Requires a data use agreement and formal review |
Key web links: |
Chronic Condition Warehouse and Research Data Assistance Center |
Summary: |
A collection of medical claims from the Medicare insurance program, which covers virtually all medical care (including pharmaceuticals under Part D from 2006 onward) for those eligible and enrolled. Medicare provides detailed information about beneficiaries, including eligibility/enrollment (for Medicaid as well as Medicare), geographic location and mortality. Medicare claim files available (1999-2011): Inpatient SAF, Outpatient SAF, Carrier SAF and PDE. Claims are unavailable for those enrolled in a Medicare HMO, although Part D claims for HMO patients are available. Part D claims are unavailable for those enrolled in Creditable Coverage or the Retiree Drug Subsidy. |
Topic/focus: |
Medicare National Insurance Program for the Elderly and Disabled |
Year(s): |
1999-2003 |
Source: |
Centers for Medicare & Medicaid Services |
Study and sample characteristics: |
Medicare claims |
Universe: |
100% census of Medicare claims (1999-2003 unless otherwise noted) for FL, IL, NY and WI (1999-2002) |
Access: |
Requires a data use agreement and formal review |
Key web links: |
|
Summary: |
A large, population-based administrative databank of medical claims from the Medicare insurance program, which covers virtually all non-pharmaceutical medical care for those eligible and enrolled (these years predate Part D, which began in 2006). Claims are unavailable for those enrolled in a Medicare HMO. Medicare provides detailed information about beneficiaries, including eligibility/enrollment (for Medicaid as well as Medicare), geographic location and mortality. Medicare claim files are available for 1999-2003 unless otherwise noted: Inpatient SAF, Outpatient SAF, Carrier SAF, DME SAF (2003), HHA SAF (2003), Hospice SAF (2003) and SNF SAF (2003). Note: 2002 SAF claims are also available for CA only. |
Provider-Related Datasets
Topic/focus: |
Hospital information |
Year(s): |
1996, 1998, 2000-2011 |
Source: |
American Hospital Association (AHA) |
Study and sample characteristics: |
Survey of 6500+ hospitals |
Universe: |
All United States hospitals |
Variables: |
1000+, including organizational structure, facility and service lines, inpatient and outpatient utilization, expenses, physician arrangements, staffing, corporate and purchasing affiliations and geographic indicators. |
Access: |
Data use agreement required |
Cost: |
~$8,000 for one year of data |
Key web links: |
|
Summary: |
Conducted annually by the AHA since 1946, the AHA Annual Survey is one of the most comprehensive sources for individual hospital data available. Hospitals report data for a complete fiscal year. Contains data items on organizational structure, facilities, services, community orientation, utilization, financing, and personnel. |
Topic/focus: |
Provider (physician) information |
Year(s): |
2004 |
Source: |
American Medical Association (AMA) |
Study and sample characteristics: |
Survey |
Universe: |
United States Physicians who are members of the AMA |
Variables: |
~30 variables, including gender, date of graduation and specialty. |
Access: |
Restricted Use |
Key web links: |
|
Summary: |
Established by the American Medical Association (AMA) in 1906, the Physician Masterfile was initially developed as a record keeping device supporting membership and mailing activities. Since then, the Masterfile has expanded to include significant education, training and professional certification information on virtually all Doctors of Medicine (MD) and Doctors of Osteopathic Medicine (DO) in the United States, Puerto Rico, Virgin Islands, and certain Pacific Islands. The Physician Masterfile includes current and historical data for more than 1.4 million physicians, residents, and medical students in the United States. This figure includes approximately 411,000 graduates of foreign medical schools who reside in the United States and who have met the educational and credentialing requirements necessary for recognition. |
Topic/focus: |
Cost and Utilization, Demographics, Healthcare Facilities and Services, Vital events, Other |
Year(s): |
2008 |
Source: |
Health Resources and Services Administration (HRSA) |
Study and sample characteristics: |
Survey and vital records |
Universe: |
All United States counties |
Access: |
Current data is available online after completion of a data use agreement |
Key web links: |
|
Summary: |
Produced annually, the Area Resource File (ARF) is a county-level compilation of existing data from numerous sources, including the American Hospital Association, the American Medical Association, the U.S. Census Bureau, the National Center for Health Statistics, and the Health Care Financing Administration. The ARF is cumulative, with the completeness and frequency of data elements varying by source. It contains data items on health professions, health professions training, health facilities, hospital utilization, hospital expenditures, population characteristics and economic data, and environment. Also available are geographic descriptors, such as Federal Information Processing Standards (FIPS) codes and Metropolitan Area (MA) codes, that allow county data to be aggregated into other geographic groupings. |
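To illustrate how FIPS codes support aggregation into larger geographic groupings, the sketch below rolls county-level counts up to state totals. It relies on the fact that a five-digit county FIPS code begins with the two-digit state FIPS code. The function name and the bed counts are hypothetical, and the input is a plain dictionary rather than the actual ARF file layout.

```python
from collections import defaultdict


def aggregate_by_state(county_values: dict[str, int]) -> dict[str, int]:
    """Sum county-level counts into totals keyed by 2-digit state FIPS code."""
    totals: dict[str, int] = defaultdict(int)
    for county_fips, value in county_values.items():
        # County FIPS codes must be kept as 5-character strings so that
        # leading zeros (e.g. "01001") are not lost.
        if len(county_fips) != 5 or not county_fips.isdigit():
            raise ValueError(f"expected 5-digit county FIPS, got {county_fips!r}")
        totals[county_fips[:2]] += value
    return dict(totals)


# Hypothetical bed counts (illustrative only); 55 = Wisconsin, 17 = Illinois.
beds = {"55079": 1200, "55025": 900, "17031": 5000}
state_totals = aggregate_by_state(beds)
print(state_totals)
```

Storing the codes as strings rather than integers is the main design choice here, since states such as Alabama (01) would otherwise lose their leading zero and be grouped incorrectly.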
Sociodemographic Databases
Topic/focus: |
Rural-Urban |
Year(s): |
2006 |
Source: |
WWAMI Rural Health Research Center (WWAMI = Washington, Wyoming, Alaska, Montana, Idaho) |
Study and sample characteristics: |
2006 ZIP code areas (approximated from Census tracts using Claritas Inc. crosswalk) |
Universe: |
2000 Census commuting data for U.S. residents and 2006 ZIP code areas |
Access: |
Publicly available |
Key web links: |
|
Summary: |
Rural-Urban Commuting Area (RUCA) codes are a Census tract-based classification scheme that combines the standard Bureau of Census Urbanized Area and Urban Cluster definitions with work commuting information to characterize all of the nation's Census tracts by their rural and urban status and relationships. In addition, a ZIP code approximation of the RUCA codes has been developed. |
Topic/focus: |
Demographics |
Year(s): |
2000 |
Source: |
U.S. Census Bureau |
Study and sample characteristics: |
Demographic and economic information determined by survey |
Universe: |
U.S. Residents |
Access: |
Publicly available downloads from the website |
Cost: |
Free |
Key web links: |
|
Summary: |
Census of Population and Housing data is derived from the 2000 decennial census. The decennial census was redesigned following the 2000 census. The Census Bureau also provides statistics from the American Community Survey and the Economic Census. Other resources include GIS mapping tools (e.g. shapefiles and spatial data) and interactive population and economic maps. |