Race/Ethnicity Data in CMS Medicaid (T-MSIS) Analytic Files: 2022 Data Assessment
November 26, 2024
The Transformed Medicaid Statistical Information System (T-MSIS) is the largest national database of current Medicaid and Children’s Health Insurance Program (CHIP) beneficiary information collected from U.S. states, territories, and the District of Columbia (DC).1 T-MSIS data are critical for monitoring and evaluating the utilization of Medicaid and CHIP, which together provide health insurance coverage to almost 90 million people.2
Due to their size and complexity, T-MSIS data files are challenging to use directly for research and analytic purposes. To optimize these files for health services research, the Centers for Medicare & Medicaid Services (CMS) repackages them into a user-friendly, research-ready format called T-MSIS Analytic Files (TAF) Research Identifiable Files (RIF). One such file, the Annual Demographic and Eligibility (DE) file, contains race and ethnicity information for Medicaid and CHIP beneficiaries.
This information is vital for assessing enrollment, access to services, and quality of care across racial and ethnic groups in the Medicaid/CHIP population, whose members are particularly vulnerable due to limited income, physical and cognitive disabilities, old age, complex medical conditions, unaffordable rents, and other social, economic, behavioral, and health needs.
To guide researchers and other consumers in their use of T-MSIS data, CMS produces data quality assessments of the completeness of race and ethnicity data along with other data such as enrollment, claims, expenditures, and service use. The Data Quality (DQ) assessments for race and ethnicity data have been posted for data years 2014 through 2022 and indicate varying levels of “concern” regarding race and ethnicity data completeness. Some data years have multiple data versions (e.g., Preliminary, Release 1, Release 2), each with their own DQ assessment.
The completeness of race and ethnicity data reported to CMS has historically been inconsistent among the states, territories, and DC. SHADAC has been monitoring the quality of these data over time, and we are encouraged by an improvement in quality, as discussed below. This blog examines the 2022 Data Release 1, the most recent T-MSIS race and ethnicity data for which a DQ assessment is available, and briefly analyzes data quality trends over time, which we plan to keep following in future T-MSIS file releases.
Evaluation of T-MSIS Race and Ethnicity Data
DQ assessments for each year and data version of T-MSIS data are housed in the Data Quality Atlas (DQ Atlas), an online evaluation tool developed as a companion to T-MSIS data.3 The DQ Atlas assesses T-MSIS race and ethnicity data using two criteria: the percentage of beneficiaries with missing race and/or ethnicity values in the TAF; and the number of race/ethnicity categories (out of five) that differ by more than ten percentage points between the TAF and American Community Survey (ACS) data.
Taken together, these two criteria indicate the level of “concern” (i.e., reliability) for states’ T-MSIS race/ethnicity data. To construct the external ACS benchmark for evaluating T-MSIS data, creators of the DQ Atlas combine race and ethnicity categories in the ACS to mirror race and ethnicity categories reported in the TAF (see Table 1). More information about the evaluation of T-MSIS race and ethnicity data is available in the DQ Atlas’ Background and Methods Resource.
Five “concern” categories appear in the DQ Atlas: Low Concern, Medium Concern, High Concern, Unusable, and Unclassified.
States with substantial missing race/ethnicity data or race/ethnicity data that are inconsistent with the ACS – a premier source of demographic data – are grouped into either the High Concern or Unusable categories, whereas states with relatively complete race/ethnicity data or race/ethnicity data that align with ACS estimates are grouped into either the Low Concern or Medium Concern categories. The Unclassified category includes states for which benchmark data are incomplete or unavailable for a given data year and version.
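The two criteria described above can be sketched in a few lines of Python. This is a minimal illustration rather than the DQ Atlas implementation: the function names, the three category labels not shown in Table 1, and the example percentages are all assumptions made for the sketch, and it takes the TAF and ACS distributions as given without prescribing how they are tabulated.

```python
from typing import Dict

# Labels for the five combined categories. Only "Hispanic, all races" and
# "Other races, non-Hispanic" appear in Table 1; the other three labels are
# assumptions made for this example.
CATEGORIES = [
    "White, non-Hispanic",
    "Black, non-Hispanic",
    "Asian, non-Hispanic",
    "Other races, non-Hispanic",
    "Hispanic, all races",
]


def percent_missing(total: int, missing: int) -> float:
    """Percent of beneficiaries with missing race/ethnicity values."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * missing / total


def count_mismatched(
    taf_pct: Dict[str, float],
    acs_pct: Dict[str, float],
    threshold_points: float = 10.0,
) -> int:
    """Count categories where TAF and ACS differ by more than the threshold."""
    return sum(
        1
        for category in CATEGORIES
        if abs(taf_pct[category] - acs_pct[category]) > threshold_points
    )


# Made-up distributions (percent of beneficiaries), not real state data.
taf_example = {
    "White, non-Hispanic": 40.0,
    "Black, non-Hispanic": 30.0,
    "Asian, non-Hispanic": 3.0,
    "Other races, non-Hispanic": 5.0,
    "Hispanic, all races": 22.0,
}
acs_example = {
    "White, non-Hispanic": 52.0,
    "Black, non-Hispanic": 25.0,
    "Asian, non-Hispanic": 4.0,
    "Other races, non-Hispanic": 6.0,
    "Hispanic, all races": 13.0,
}

print(percent_missing(total=1000, missing=150))    # 15.0
print(count_mismatched(taf_example, acs_example))  # 1
```

In this made-up case, only one category (a 12-point gap) exceeds the ten-percentage-point threshold; a gap of exactly ten points would not count, because the criterion is "more than" ten.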
Table 1. Crosswalk of Race and Ethnicity Variables Between the TAF and ACS
Race/Ethnicity Category | Race/Ethnicity Flag Value in TAF | Combination of Race and Hispanic Variables in ACS
--- | --- | ---
Hispanic, all races | 7=Hispanic, all races | Hispanic, all races
Other races, non-Hispanic | 4=American Indian and Alaska Native, non-Hispanic; 5=Hawaiian/Pacific Islander; 6=Multiracial, non-Hispanic | American Indian alone; Alaska Native alone; American Indian and Alaska Native tribes specified, or American Indian or Alaska Native, non-specified and no other race; Native Hawaiian and other Pacific Islander alone; Some other race alone; Two or more races
Source: Medicaid.gov. (n.d.). DQ Atlas: Background and methods resource [PDF file]. Available from https://www.medicaid.gov/dq-atlas/downloads/background-and-methods/TAF-DQ-Race-Ethnicity.pdf Accessed December 1, 2023.
Quality Assessment by State
Table 2 shows the Race and Ethnicity DQ Assessments for the 2022 TAF (Data Version: Release 1). The categorization criteria used to determine the levels of concern for the 2022 TAF Release 1 data are the same as those used to assess T-MSIS data from previous years and versions. Fifteen states received a rating of “Low Concern,” and 22 states (including Puerto Rico [PR]) fell into the “Medium Concern” category.
Most of the “Medium Concern” states (17 of 22) fell into the subcategory denoting the higher percentage range of missing race/ethnicity data (from 10% up to 20%).
Another 14 states (including DC) received a rating of “High Concern,” and most of these (10 of 14) likewise fell into the subcategory denoting the highest percentage range of missing race/ethnicity data (from 20% up to 50%). One state (Utah) received an “Unusable” rating, meaning it was missing at least 50% of race/ethnicity data. The Virgin Islands (VI) is the only state/territory categorized as “Unclassified” in the 2022 TAF (Data Version: Release 1) due to insufficient or incomplete data, and it does not appear in Table 2.
Table 2. Race and Ethnicity Data Quality Assessment, 2022 T-MSIS Analytic File (TAF) Data Release 1
Data quality assessment | Percent of beneficiaries with missing race/ethnicity values | Number of race/ethnicity categories where TAF differs from ACS by more than 10 percentage points | Number of states* | States
--- | --- | --- | --- | ---
Low Concern | <10% | 0 | 15 | AK, DE, MI, MO, NE, NV, NH, NM, NC, ND, OH, OK, PA, SD, WA
Medium Concern | <10% | 1 or 2 | 5 | GA, ID, IL, KS, VA
Medium Concern | 10% - <20% | 0 or 1 | 17 | AL, AR, CA, CO, FL, IN, ME, MD, MN, MS, MT, NJ, PR, TX, VT, WV, WI
High Concern | <10% | 3 or more | 0 | None
High Concern | 10% - <20% | 2 or more | 4 | AZ, KY, LA, RI
High Concern | 20% - <50% | Any value | 10 | CT, DC, HI, IA, MA, NY, OR, SC, TN, WY
Unusable | ≥50% | Any value | 1 | UT
Notes: *T-MSIS includes all 50 states, the District of Columbia (DC), and the U.S. territories of Puerto Rico (PR) and the Virgin Islands (VI). However, a DQ assessment is not available for VI in the 2022 TAF (Data Version: Release 1) due to incomplete/unavailable data.
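The thresholds in Table 2 amount to a small decision rule, sketched below in Python. The function name and signature are hypothetical, and the sketch is not the DQ Atlas code: it simply encodes the table rows, treats “Unusable” as missing at least 50% (as described in the text), and returns “Unclassified” when either input is unavailable.

```python
from typing import Optional


def classify_concern(
    pct_missing: Optional[float], n_mismatched: Optional[int]
) -> str:
    """Map the two DQ criteria to a concern category, following Table 2.

    pct_missing:  percent of beneficiaries with missing race/ethnicity values
    n_mismatched: number of categories (out of five) where TAF differs from
                  ACS by more than 10 percentage points
    """
    # Assumption: missing inputs stand in for incomplete or unavailable data.
    if pct_missing is None or n_mismatched is None:
        return "Unclassified"
    if pct_missing >= 50:
        return "Unusable"
    if pct_missing >= 20:
        return "High Concern"  # any number of mismatched categories
    if pct_missing >= 10:
        return "High Concern" if n_mismatched >= 2 else "Medium Concern"
    # Below 10% missing: the rating depends only on agreement with the ACS.
    if n_mismatched == 0:
        return "Low Concern"
    if n_mismatched <= 2:
        return "Medium Concern"
    return "High Concern"


# State counts by category, summed across the subcategory rows of Table 2.
table2_counts = {
    "Low Concern": 15,
    "Medium Concern": 5 + 17,
    "High Concern": 0 + 4 + 10,
    "Unusable": 1,
}
assessed = sum(table2_counts.values())
print(assessed)      # 52 assessed states/territories
print(assessed + 1)  # 53 once the unclassified Virgin Islands is included

print(classify_concern(15.0, 1))  # Medium Concern
```

The tally confirms that the table accounts for every reporting entity: 50 states, DC, and Puerto Rico are assessed, and the Virgin Islands is the one unclassified territory.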
Despite ongoing variation in the completeness of race and ethnicity data reported to CMS, SHADAC researchers have noted a trend toward better quality data overall, although the results in 2022 were somewhat more mixed.
Since SHADAC began tracking these quality assessments with the 2019 T-MSIS TAF release, several states have moved up the quality assessment scale. The number of states with “High Concern” data increased from 2021 to 2022, but this primarily reflects two states (Massachusetts and Tennessee) moving from the “Unusable” category up to the “High Concern” category, which means they are now reporting race and ethnicity data, even if those data are of questionable quality.
Specifically, 2022 race/ethnicity TAF data from 14 states received a rating of “High Concern” compared to 11 states’ data in 2021 and 16 states’ data in 2020. The number of states with “Unusable” data has also dropped each year – 1 state’s 2022 race/ethnicity TAF data was classified as “Unusable” compared to 3 states’ data in 2021 and 4 states’ data in 2020.
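The counts in the paragraph above can be tabulated to show why the 2022 results read as mixed. The short Python sketch below uses only the figures quoted in this post; grouping “High Concern” and “Unusable” into a combined total is an illustrative choice made here, not a DQ Atlas measure.

```python
# Number of states per category, as quoted in the text for each data year.
counts = {
    2020: {"High Concern": 16, "Unusable": 4},
    2021: {"High Concern": 11, "Unusable": 3},
    2022: {"High Concern": 14, "Unusable": 1},
}


def year_over_year(category: str) -> dict:
    """Change in the number of states in a category versus the prior year."""
    years = sorted(counts)
    return {
        year: counts[year][category] - counts[prev][category]
        for prev, year in zip(years, years[1:])
    }


# Combined total of the two lowest-quality categories (illustrative grouping).
combined = {year: sum(by_cat.values()) for year, by_cat in counts.items()}

print(year_over_year("High Concern"))  # {2021: -5, 2022: 3}
print(year_over_year("Unusable"))      # {2021: -1, 2022: -2}
print(combined)                        # {2020: 20, 2021: 14, 2022: 15}
```

Read together, the “Unusable” count fell every year, while the combined total of the two lowest categories dropped sharply from 2020 to 2021 and then ticked up by one in 2022.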
Visualizing T-MSIS Data in the DQ Atlas
The DQ Atlas enables users to generate maps and tables that compare the quality of T-MSIS data between states across different topics, such as race/ethnicity, age, income, and gender (see Figure 1).
Visualizing T-MSIS data in this manner can help researchers quickly assess the completeness of a single variable as well as the relative completeness (or incompleteness) of certain variables compared to others. For example, in the 2022 TAF Data Release 1, all states and territories received a “Low Concern” rating for age data, whereas only 30 states and territories received a “Low Concern” rating for income.
Figure 1. Data Quality Assessments of Beneficiary Race/Ethnicity by U.S. State/Territory
Notes: Green = low concern; yellow = medium concern; orange = high concern; red = unusable; grey = unclassified.
Source: Medicaid.gov. (n.d.). DQ Atlas: Race and Ethnicity [2022 Data set: Version: Release 1]. Available from https://www.medicaid.gov/dq-atlas/landing/topics/single/map?topic=g3m16&tafVersionId=35 Accessed November 1, 2024.
Looking Ahead
Increasingly, a wide range of voices, from non-profits and health insurers to state-based marketplaces and policymakers, has called for improving the collection of race, ethnicity, and language data, often with the goal of advancing health equity. CMS’s efforts to improve the quality and availability of T-MSIS data reflect this nationwide movement toward data collection practices that more accurately capture the diversity of the U.S. population.
SHADAC was excited to see the revised Office of Management and Budget (OMB) standards for the collection of race and ethnicity data. The revisions align with available evidence, are consistent with the changes made by leading states, and, most importantly, explicitly state that these standards should serve as a minimum baseline, with a call to collect and provide more granular data.
Although these standards are specifically named as minimum reporting categories for data collection throughout the federal government, they are likely to shape data collection and reporting across all sectors, including the states that collect race/ethnicity data through the Medicaid application process.
Many states report difficulty submitting these data because state eligibility systems, Medicaid Management Information Systems (MMIS), and T-MSIS format race and ethnicity data differently. Before states submit data to T-MSIS, they must reformat and aggregate the data, which may affect the quality of what is submitted.
One approach to improving the collection and reporting of these data is to provide states with an updated model application that uses evidence-based approaches to race and ethnicity questions in order to improve applicant response rates and data accuracy.
Sources
1 Medicaid.gov. Transformed Medicaid Statistical Information System (T-MSIS). Retrieved November 8, 2024. https://www.medicaid.gov/medicaid/data-systems/macbis/transformed-medicaid-statistical-information-system-t-msis/index.html#
2 Medicaid.gov. July 2024 Medicaid & CHIP Enrollment Data Highlights. Retrieved November 8, 2024. https://www.medicaid.gov/medicaid/program-information/medicaid-and-chip-enrollment-data/report-highlights/index.html
3 Saunders, H., & Chidambaram, P. (April 28, 2022). Medicaid Administrative Data: Challenges with Race, Ethnicity, and Other Demographic Variables. Kaiser Family Foundation. Retrieved October 31, 2022. https://www.kff.org/medicaid/issue-brief/medicaid-administrative-data-challenges-with-race-ethnicity-and-other-demographic-variables/
4 Wang, H.L. (June 15, 2022). Biden officials may change how the U.S. defines racial and ethnic groups by 2024. NPR. Retrieved November 1, 2022. https://www.npr.org/2022/06/15/1105104863/racial-ethnic-categories-omb-directive-15
5 Diaz, J. (August 16, 2022). California becomes the first state to break down Black employee data by lineage. NPR. Retrieved November 1, 2022. https://www.npr.org/2022/08/16/1117631210/california-becomes-the-first-state-to-break-down-black-employee-data-by-lineage
6 The New York State Senate. (December 22, 2021). Assembly Bill A6896A. Retrieved November 2, 2022, from https://www.nysenate.gov/legislation/bills/2021/A689