Maguire Ballard, 13, takes part in a trial run of PARCC in Maryland in 2015. A new study found that states are scoring annual Common Core math and reading tests differently. (Patrick Semansky/AP)

The vast majority of states have adopted Common Core academic standards, but individual states are still setting different definitions of “proficient” on annual math and reading tests, according to a new study.

And in many states, the study says, annual tests set a significantly lower bar for “proficient” than the National Assessment of Educational Progress, or NAEP, a national exam that is administered every two years to a sample of students in the fourth and eighth grades.

The analysis by Gary Phillips of the American Institutes for Research shows that it continues to be difficult to directly compare student performance across state lines — one of the key problems that common standards and tests were meant to address.

“This is something I’m hoping will just help policymakers put in perspective what the states are claiming and what they’re doing,” Phillips said. “The states still are setting wildly different standards.”

The 50 states gave 50 different tests until last year. Although some were scored rigorously, others were not, making it difficult to judge how students in Alabama were faring compared with students in Arizona and Alaska.

In 2015, more than half the states administered one of two Common Core tests developed with multimillion-dollar grants from the Obama administration. The analysis shows that students in the 11 states (plus D.C.) that administered the PARCC (Partnership for Assessment of Readiness for College and Careers) exam faced a scoring regime that was significantly tougher than the one faced by students in the 18 states that administered the Smarter Balanced exam.

In math, for example, students who scored proficient — a level 4 out of 5 — on PARCC would also have scored proficient on NAEP, the national test. But scoring proficient on Smarter Balanced’s math test was akin to scoring “basic” on NAEP.

In language arts, PARCC and Smarter Balanced had lower expectations for proficiency than NAEP, but PARCC’s bar was still significantly higher than Smarter Balanced’s.

Phillips also examined ACT Aspire, which was administered to students in South Carolina and Alabama last year. It was easier to score proficient on ACT Aspire at most grade levels and subject areas than on PARCC.

Among states that administered other tests, only a few — such as Florida, New York and Kansas — had expectations for proficiency that were as high as NAEP’s, Phillips said.

Michael Petrilli, president of the Thomas B. Fordham Institute, a right-leaning think tank that supports the Common Core, said he was surprised by the findings. Two earlier analyses, by the nonprofit group Achieve and the journal Education Next, found that states raised their expectations on annual tests between 2013 and 2015, setting higher bars for their students to be judged proficient.

Phillips said that his work does not attempt to capture trends over time, but is instead a snapshot of the variability among states in 2015. His study also did not attempt to gauge or compare the quality of the new tests.

But Fordham recently did just that. Its researchers studied the questions on each major Common Core test to determine how deeply they asked students to think and how well they matched the academic content that students are supposed to learn. Fordham judged the tests to be generally of high quality, asking students to think more deeply than old multiple-choice tests did.

Petrilli, one of the most vocal boosters of the Common Core on the political right, said that the various analyses out in public can be synthesized this way:

“PARCC is a very high-quality assessment, well matched to the Common Core standards, and extremely challenging to boot. Smarter Balanced is also a high-quality assessment and well matched to the Common Core, though somewhat easier to pass at the college and career-ready level,” he said.

On the whole, he said, states have made “huge strides in making their tests harder to pass and more honest for parents.”