Center for American Progress

Linking Existing Federal Data Systems to Expand Knowledge of Higher Education


The federal government already collects information about college students that could be used to improve the U.S. higher education system, but siloed systems and legal barriers stand in the way.

University of Washington students study in the Suzzallo Library in Seattle, April 2013. (AP/Elaine Thompson)

Nearly a decade ago, a blue ribbon commission appointed by former U.S. Secretary of Education Margaret Spellings issued a clarion call for higher education reform. One of the commission’s key findings was that extensive data were available on higher education, but that those data rarely focused on outcomes and left out large numbers of students who either enrolled part-time or transferred. This finding has sparked years of fighting in Congress about whether the U.S. Department of Education should or could create a database that contains information about all students in U.S. higher education institutions.

While the discussion of a federal student-level data system is a valuable one—and something the Center for American Progress has written about previously—arguments about this policy idea tend to miss a key fact: The federal government already has several comprehensive student-level data systems, but protecting the privacy of higher education institutions has led to restrictions in federal law that prevent their effective use. If used effectively, however, the information in these systems could increase understanding of how the nation’s higher education system is functioning. While this does not mean that the government should stop discussing the creation of a single federal student-level data system, it must do more to leverage existing systems in order to institute a much-needed expansion of actionable higher education data.

The promise of already available data sources

Existing data sources have the potential to help inform students, families, and policymakers about how students pay for college and progress through higher education.

6 promising databases

Recently, CAP took a close look at which promising federal data systems might be sources of more comprehensive higher education data. In “Leveraging What We Already Know: Linking Federal Data Systems”—a report produced for the Institute for Higher Education Policy—CAP found an untapped data landscape across the federal government. In particular, CAP identified six promising databases:

  • The U.S. Department of Health and Human Services’, or HHS, National Directory of New Hires, or NDNH, and the Social Security Administration’s, or SSA, wage and earnings data
  • The U.S. Internal Revenue Service’s, or IRS, earnings data
  • Tuition and required fees data
  • Financial aid data
  • The U.S. Department of Defense’s, or DOD, military recruiting data
  • Benefit data from the Department of Veterans Affairs, or VA

As illustrated by the release of earnings data on the College Scorecard, data from the NDNH, SSA, and IRS can provide a clear picture of the earnings outcomes for students who graduate from a particular college or university; the data from the IRS also could provide a better understanding of the true price that students and families pay for higher education. The data from the VA could encourage a better understanding of how students who served in the U.S. armed forces are faring. Finally, the secretary of education should be given access to the data collected by military recruitment stations: enrollment and attainment information for nearly every student enrolled in higher education. The secretary could use this information to create very accurate completion statistics for each higher education institution.

IRS data

Every year, the IRS receives a 1098-T form for nearly every student enrolled in higher education. In addition to student and parent identifying information, the IRS collects data from each institution on the amount of payments received for tuition and related expenses; the amount billed for qualified tuition and related expenses; the amount of scholarships and grants; the amount of any contract reimbursement or refund; and whether the student was enrolled at least half-time or as a graduate student.

The IRS collects this information to ensure that taxpayers who claim an education tax credit do so correctly. But such data also could be used to produce highly accurate net price information for students by institution—and potentially by program of study—and also by family income level. These data could be combined with available information on students who receive financial aid to gain a better understanding of the benefits and outcomes of the different types of support the federal government provides to college students. Rather than being used for analysis and policymaking, however, these data currently sit locked away under privacy restrictions in IRS databases to which only taxpayers can request access. Even the secretary of education does not have access to this information. Using only data from the IRS, students, families, and policymakers could learn about the net price paid by students enrolled in nearly all higher education institutions by a variety of characteristics—without requiring the disclosure of any personally identifiable information.

Addressing unanswered questions within and outside of higher education

The federal government could address many unanswered questions by mining already available information. For example, the data held by military recruiters could be used to explore veterans’ graduation rates by college and, using data from the IRS or SSA, how much money veterans are earning three, five, and 10 years after graduating.

Ultimately, better linking all of these existing datasets together could help answer key questions that students, parents, and policymakers have about higher education. These questions include:

  • How likely is it that I graduate from this or any institution of higher education on time?
  • How much money will I likely make if I attend this institution and major in this field?
  • How likely is it that I will be able to repay my student loans if I attend this institution and major in this field?
  • Will my daughter be better off financially if she gets a degree in microbiology or medical engineering?
  • Is a PLUS loan necessary to meet the cost of education?
  • Are taxpayers getting an appropriate return on their investments in grants, tax benefits, and student loans?
  • How have increased levels of educational attainment affected the nation’s economy?

Mining existing federal data would, no doubt, have benefits outside of higher education. For example, military recruitment could be accomplished more efficiently and equitably if a single entity collected enrollment and attainment information from all higher education institutions and if military recruiters from all military branches and recruiting stations had access to it. This also would greatly reduce the burden on institutions, which currently can be asked up to 12 times—by each of the military units charged with recruiting—for the name, address, telephone number, age, date of birth, place of birth, level of education, academic major, and degrees received for all enrolled students.


A decade ago, the Spellings Commission came to the right conclusion. It is essential for policymakers and consumers to have access to comprehensive information in order to make informed choices about how well colleges and universities are serving their students. Providing that information means doing more than discussing a single federal student-level data system; it means leveraging data that are already available.

David Bergeron is a Senior Fellow at the Center for American Progress.


