Race/Ethnicity Data in CMS Medicaid (T-MSIS) Analytic Files: 2020 Data Assessment
Updated January 2023: The Transformed Medicaid Statistical Information System (T-MSIS) is the largest national database of current Medicaid and Children’s Health Insurance Program (CHIP) beneficiary information collected from U.S. states, territories, and the District of Columbia (DC).1 T-MSIS data are critical for monitoring and evaluating the utilization of Medicaid and CHIP, which together provide health insurance coverage to more than 90 million people.2
T-MSIS data files are challenging to use directly for research and analytic purposes due to their size and complexity. To optimize these files for health services research, CMS repackages them into a user-friendly, research-ready format called T-MSIS Analytic Files (TAF) Research Identifiable Files (RIF). One such file, the Annual Demographic and Eligibility (DE) file, contains race and ethnicity information for Medicaid and CHIP beneficiaries. This information is vital for assessing enrollment, access to services, and quality of care across racial and ethnic subgroups in the Medicaid/CHIP population, whose members are particularly vulnerable due to limited income, physical and cognitive disabilities, old age, complex medical conditions, housing insecurity, and other social, economic, behavioral, and health needs.
Completeness of race and ethnicity data reported to CMS remains inconsistent among the states, territories, and DC. To guide researchers and other consumers in their use of T-MSIS data, CMS produces data quality assessments of the race and ethnicity data along with other data such as enrollment, claims, expenditures, and service use. The Data Quality (DQ) assessments for race and ethnicity data have been posted for data years 2014 through 2020 and indicate varying levels of “concern” regarding race and ethnicity data completeness. Some data years have multiple data versions (e.g., Preliminary, Release 1, Release 2), each with its own DQ assessment. This blog explores 2020 Data Release 1, the most recent T-MSIS race and ethnicity data for which a DQ assessment is available.
Evaluation of T-MSIS Race and Ethnicity Data
DQ assessments for each year and release of T-MSIS data are housed in the Data Quality Atlas (DQ Atlas), an online evaluation tool developed as a companion to T-MSIS data.3 The DQ Atlas assesses T-MSIS race and ethnicity data using two criteria: the percentage of beneficiaries with missing race and/or ethnicity values in the TAF; and the number of race/ethnicity categories (out of five) that differ by more than ten percentage points between the TAF and American Community Survey (ACS) data. Taken together, these two criteria indicate the level of “concern” (i.e., reliability) for states’ T-MSIS race/ethnicity data. Five “concern” categories appear in the DQ Atlas: Low Concern, Medium Concern, High Concern, Unusable, and Unclassified. States with substantial missing race/ethnicity data or race/ethnicity data that are inconsistent with the ACS – a premier source of demographic data – are grouped into either the High Concern or Unusable categories, whereas states with relatively complete race/ethnicity data or race/ethnicity data that align with ACS estimates are grouped into either the Low Concern or Medium Concern categories. The Unclassified category includes states for which benchmark data are incomplete or unavailable for a given data year and version.
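The two criteria described above can be sketched in a few lines of Python. This is only an illustration, not the DQ Atlas implementation: the function names, the five placeholder category labels, the made-up percentages, and the choice of denominators are all assumptions introduced here.

```python
from typing import Mapping


def percent_missing(n_missing: int, n_total: int) -> float:
    """Criterion 1: percent of beneficiaries with missing race/ethnicity."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_missing / n_total


def count_divergent_categories(
    taf_pct: Mapping[str, float],
    acs_pct: Mapping[str, float],
    threshold: float = 10.0,
) -> int:
    """Criterion 2: number of categories whose TAF and ACS shares differ
    by more than `threshold` percentage points."""
    return sum(
        1
        for category, acs_share in acs_pct.items()
        if abs(taf_pct.get(category, 0.0) - acs_share) > threshold
    )


# Hypothetical shares (percent) for one made-up state; labels are placeholders.
taf = {"White": 40.0, "Black": 22.0, "Asian": 5.0, "Hispanic": 18.0, "Other": 15.0}
acs = {"White": 45.0, "Black": 20.0, "Asian": 4.0, "Hispanic": 29.0, "Other": 2.0}

missing = percent_missing(n_missing=120, n_total=1000)
divergent = count_divergent_categories(taf, acs)
print(f"{missing:.1f}% missing, {divergent} divergent categories")
```

In this made-up example, only the "Hispanic" and "Other" shares differ from the benchmark by more than ten percentage points, so two categories count as divergent.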
To construct the external ACS benchmark for evaluating T-MSIS data, creators of the DQ Atlas combine race and ethnicity categories in the ACS to mirror race and ethnicity categories reported in the TAF (see Table 1). More information about the evaluation of T-MSIS race and ethnicity data is available in the DQ Atlas’ Background and Methods Resource.
Table 1. Crosswalk of Race and Ethnicity Variables between the TAF and ACS
Race/Ethnicity Category | Race/Ethnicity Flag Value in TAF | Combination of Race and Hispanic Variables in ACS |
Hispanic, all races | 7=Hispanic, all races | Hispanic, all races |
Other races, non-Hispanic | 4=American Indian and Alaska Native, non-Hispanic; 5=Hawaiian/Pacific Islander; 6=Multiracial, non-Hispanic | American Indian alone; Alaska Native alone; American Indian and Alaska Native tribes specified, or American Indian or Alaska Native, non-specified and no other race; Native Hawaiian and other Pacific Islander alone; Some other race alone; Two or more races |
Source: Medicaid.gov. (n.d.). DQ Atlas: Background and methods resource [PDF file]. Available from https://www.medicaid.gov/dq-atlas/downloads/background_and_methods/TAF_DQ_Race_Ethnicity.pdf. Accessed January 5, 2023.
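As a rough illustration of how such a crosswalk might be applied in code, the sketch below maps the TAF flag values listed in Table 1 to their benchmark categories. The dictionary and function names are hypothetical, and only the flag values reproduced in Table 1 are included; any other flag value is deliberately left unmapped rather than guessed.

```python
from typing import Optional

# Flag values taken from Table 1 only. Values not shown there are left out
# on purpose, so lookups for them return None.
TAF_FLAG_TO_CATEGORY = {
    4: "Other races, non-Hispanic",  # American Indian and Alaska Native, non-Hispanic
    5: "Other races, non-Hispanic",  # Hawaiian/Pacific Islander
    6: "Other races, non-Hispanic",  # Multiracial, non-Hispanic
    7: "Hispanic, all races",
}


def benchmark_category(flag: Optional[int]) -> Optional[str]:
    """Return the combined benchmark category for a TAF flag value,
    or None if the flag is missing or not covered by Table 1."""
    if flag is None:
        return None
    return TAF_FLAG_TO_CATEGORY.get(flag)


print(benchmark_category(5))
print(benchmark_category(7))
```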
Quality Assessment by State
Table 2 shows the Race and Ethnicity DQ Assessments for the 2020 TAF (Data Version: Release 1). Approximately the same number of states received a rating of Low Concern (15 states), Medium Concern (17 states, including PR), and High Concern (16 states, including DC). Four states (Alabama, Kansas, Rhode Island, and Tennessee) received an “Unusable” rating, as each of these states was missing at least 50 percent of race/ethnicity data. Most of the Medium Concern states (14 of 17) fell into the subcategory denoting the higher percentage range of missing race/ethnicity data (from 10 percent up to 20 percent). A similar pattern can be seen among the High Concern states, most of which (15 of 16) fell into the subcategory denoting the highest percentage range of missing race/ethnicity data (from 20 percent up to 50 percent). The categorization criteria used to determine the levels of concern for the 2020 TAF Release 1 data are the same as those used to assess T-MSIS data from previous years and versions.
Table 2. Race and Ethnicity Data Quality Assessment, 2020 T-MSIS Analytic File (TAF) Data Release 1
Data quality assessment | Percent of beneficiaries with missing race/ethnicity values | Number of race/ethnicity categories where TAF differs from ACS by more than 10 percentage points | Number of states* | States |
Low Concern | <10% | 0 | 15 | AK, CA, DE, MI, NE, NV, NM, NC, ND, OH, OK, PA, SD, VA, WA |
Medium Concern | <10% | 1 or 2 | 3 | GA, ID, IL |
Medium Concern | 10% - <20% | 0 or 1 | 14 | FL, IN, KY, ME, MN, MS, MT, NH, NJ, PR, TX, VT, WV, WI |
High Concern | <10% | 3 or more | 0 | - |
High Concern | 10% - <20% | 2 or more | 1 | LA |
High Concern | 20% - <50% | Any value | 15 | AZ, AR, CO, CT, DC, HI, IA, MD, MA, MO, NY, OR, SC, UT, WY |
Unusable | ≥50% | Any value | 4 | AL, KS, RI, TN |
Notes: *T-MSIS includes all 50 states, the District of Columbia (DC), and the U.S. territories of Puerto Rico (PR) and the Virgin Islands (VI). A DQ assessment is not available for VI in the 2020 TAF (Data Version: Release 1) due to incomplete/unavailable data. VI is therefore the only state/territory categorized as “Unclassified” in the 2020 TAF (Data Version: Release 1), and does not appear in Table 2.
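The thresholds in Table 2 can be read as a small decision rule. The sketch below is one possible encoding, not CMS code: the function name and the use of `None` to signal unavailable benchmark data are assumptions, and exactly 50 percent missing is treated as Unusable, following the "at least 50 percent" wording in the text.

```python
from typing import Optional


def classify_concern(pct_missing: Optional[float], n_divergent: Optional[int]) -> str:
    """Map the two DQ criteria to a concern level using the Table 2 thresholds."""
    if pct_missing is None or n_divergent is None:
        return "Unclassified"  # data incomplete or unavailable
    if pct_missing >= 50:
        return "Unusable"
    if pct_missing >= 20:
        return "High Concern"  # any number of divergent categories
    if pct_missing >= 10:
        return "High Concern" if n_divergent >= 2 else "Medium Concern"
    # Less than 10 percent missing: the ACS comparison decides the level.
    if n_divergent >= 3:
        return "High Concern"
    if n_divergent >= 1:
        return "Medium Concern"
    return "Low Concern"


# State counts per concern level, summed across the sub-rows of Table 2.
table2_counts = {
    "Low Concern": 15,
    "Medium Concern": 3 + 14,
    "High Concern": 0 + 1 + 15,
    "Unusable": 4,
}
total_assessed = sum(table2_counts.values())
print(classify_concern(15.0, 2), total_assessed)
```

Summing the table this way gives 52 assessed states and territories; adding the unclassified Virgin Islands accounts for all 53 T-MSIS reporting entities named in the table notes.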
Visualizing T-MSIS Data in the DQ Atlas
The DQ Atlas enables users to generate maps that compare the quality of T-MSIS data between states across different topics, such as race/ethnicity, age, income, and gender (see Figure 1). Visualizing T-MSIS data in this manner can help researchers quickly assess the completeness of a single variable as well as the relative completeness (or incompleteness) of certain variables compared to others. For example, in the 2020 TAF Data Release 1, all states and territories received a “low concern” rating for age data, whereas only 29 states and territories received a “low concern” rating for family income.
Figure 1. Data Quality Assessments of Beneficiary Information by U.S. State/Territory
Notes: Green = low concern; yellow = medium concern; orange = high concern; red = unusable; grey = unclassified.
Source: Medicaid.gov. (n.d.). DQ Atlas: Race and Ethnicity [2020 Data set: Version: Release 1]. Available from https://www.medicaid.gov/dq-atlas/landing/topics/single/map?topic=g3m16&tafVersionId=25. Accessed January 5, 2023.
Looking Ahead
Non-profits, health insurers, state-based marketplaces, and policymakers are increasingly calling for better collection of race, ethnicity, and language data, often with the goal of advancing health equity. CMS’s efforts to improve the quality and availability of T-MSIS data reflect this nationwide movement toward data collection practices that more accurately capture the diversity of the U.S. population.
In June 2022, CMS released updated technical instructions for reporting beneficiary race, including clarification on how to report race information for beneficiaries who report multiple races. That same month, the Biden Administration announced its intent to revise the federal government’s standards for surveying race and ethnicity – a policy change that is expected to result in disaggregated categories for Hispanic individuals and people of Middle Eastern or North African descent.4
California and New York have enacted historic policies to disaggregate race/ethnicity data within the past year: California now requires state agencies to list a separate category for Black descendants of enslaved people when collecting state employee data; and New York now requires state agencies to disaggregate Asian, Native Hawaiian, and Pacific Islander data into more granular collection categories (e.g., Chinese, Filipino, Hawaiian, Samoan).5,6 It is likely that other states will follow these data disaggregation practices in the years ahead as public awareness of issues related to diversity, equity, inclusion, and racial justice continues to grow.
Sources
1 Medicaid.gov. Transformed Medicaid Statistical Information System (T-MSIS). Retrieved October 20, 2022, from https://www.medicaid.gov/medicaid/data-systems/macbis/transformed-medicaid-statistical-information-system-t-msis/index.html#
2 Medicaid.gov. September 2022 Medicaid & CHIP Enrollment Data Highlights. Retrieved January 5, 2023, from https://www.medicaid.gov/medicaid/program-information/medicaid-and-chip-enrollment-data/report-highlights/index.html
3 Saunders, H., & Chidambaram, P. (April 28, 2022). Medicaid Administrative Data: Challenges with Race, Ethnicity, and Other Demographic Variables. Kaiser Family Foundation. Retrieved October 31, 2022, from https://www.kff.org/medicaid/issue-brief/medicaid-administrative-data-challenges-with-race-ethnicity-and-other-demographic-variables/
4 Wang, H.L. (June 15, 2022). Biden officials may change how the U.S. defines racial and ethnic groups by 2024. NPR. Retrieved November 1, 2022, from https://www.npr.org/2022/06/15/1105104863/racial-ethnic-categories-omb-directive-15
5 Diaz, J. (August 16, 2022). California becomes the first state to break down Black employee data by lineage. NPR. Retrieved November 1, 2022, from https://www.npr.org/2022/08/16/1117631210/california-becomes-the-first-state-to-break-down-black-employee-data-by-lineage
6 The New York State Senate. (December 22, 2021). Assembly Bill A6896A. Retrieved November 2, 2022, from https://www.nysenate.gov/legislation/bills/2021/A689