Safety Scoring System 'Conceptually Sound,' But Needs Improvement, Report Says

The federal Compliance, Safety, Accountability program’s controversial Safety Measurement System to identify motor carriers at high risk for future crashes is “conceptually sound” but several features need improvement, according to a new congressionally mandated report by a panel of the National Academies of Sciences, Engineering and Medicine.

“SMS is structured in a reasonable way, and its method of identifying motor carriers for alert status is defensible,” the study, made public June 27, concluded. “However, much of what is now done is ad hoc and based on subject-matter expertise that has not been sufficiently empirically validated.”

The 132-page study was conducted over the past 15 months by a 12-member elite panel of academics, several of whom specialize in statistics and transportation safety policy. It was funded by the Federal Motor Carrier Safety Administration and mandated by the FAST Act of 2015.

The approximate cost of the study is just under $1 million, said an academies spokeswoman.

The academies’ panel was charged with analyzing the ability of FMCSA’s SMS to discriminate between low- and high-risk carriers, assessing the public usage of SMS, reviewing the data and methodology used to calculate the measures and providing advice on additional data collection and safety assessment methodologies.

In March, FMCSA withdrew a January 2016 proposed motor carrier safety fitness rule, saying it was awaiting the results of the study.

While the panel’s report was complimentary of the agency for many of its ideas and efforts, it recommended that over the next two years regulators develop a more “statistically principled approach” based on an “item response theory” — that is, a more detailed data-oriented approach that digs deeper and measures the performance of individual trucks and buses, not just at the motor carrier level.

The study said the IRT approach has a long history in the United States, including in commercial air, rail and waterway transportation.

“For example, the U.S. Department of Transportation, along with state and local agencies, collects and reports data on crashes for identifiable subgroups of the population of motor carriers, by specific locations, and associated with various vehicle types, makes, models and equipment,” the study said. “This approach to assessing safety performance is ubiquitous throughout the transportation agencies that report on safety.”

If the agency demonstrates that the IRT model performs well in identifying motor carriers that need safety interventions, it should use it to replace SMS “in a manner akin to the way SMS replaced SafeStat,” the study said.

That effort could take up to two years, according to the study.

“The Agency has received the National Academy of Sciences report and is reviewing its findings and recommendations,” said FMCSA spokesman Duane DeBruyne. “As specified in Section 5221 of the Fixing America’s Surface Transportation (FAST) Act, the Agency will provide its response to Congress and to the USDOT Office of the Inspector General within 120 days of the report being provided to Congress.”

Almost from its implementation in December 2010, the CSA program has been criticized in many corners of the trucking industry as not being a reliable predictor of a carrier’s crash risk. Its data collection and safety measurement system have been picked apart by a variety of researchers, including those with the Government Accountability Office and the American Transportation Research Institute.

And yet many motor carrier safety executives have credited CSA with raising the industry’s safety consciousness.

Specifically, the study noted that truckers and other stakeholders have criticized the SMS primarily for making use of highly variable assessments, not accounting for crashes where the motor carrier is not at fault, placing carriers that have very different tasks in the same peer groups, using measures that are sensitive to effects from one or more individual states, and failing to predict a carrier’s future crash frequency or reflect its efforts to improve its safety performance over time.

But the panel’s report said, “FMCSA deserves considerable credit for making use of these data in an attempt to discriminate between safe and unsafe motor carriers.”

In a statement, American Trucking Associations officials said they are pleased with the long-awaited review of CSA and of many of the concerns ATA has raised about the program.

“This report has confirmed much of what we have said about the program for some time: the program, while a valuable enforcement tool, has significant shortcomings that must be addressed and we look forward to working with FMCSA to strengthen the program,” said ATA President Chris Spear.

Specifically, ATA noted that the study validated the trucking industry’s concerns about the inclusion of certain types of violations in the CSA system, that geographic enforcement disparities can have a significant impact on carriers’ scores and that the collection and use of clean inspections is critical to the accuracy of the program.

“Basically, we felt that the SMS that FMCSA is using has many elements that are useful and that its goal for trying to prevent accidents is a good one,” said panel co-chair Joel Greenhouse, a professor of statistics at Carnegie Mellon University.

The panel’s recommendation to use the so-called IRT model approach would be useful in addressing many of the industry’s criticisms, Greenhouse told Transport Topics.

One of the agency’s challenges is that it only has access to limited information from its Motor Carrier Management Information System, or MCMIS, the study said.

Currently the data that’s available through MCMIS — a source for FMCSA inspection, crash, compliance review, safety audit, and registration data — is a self-report roughly every two years, Greenhouse said.

“FMCSA should continue to collaborate with states and other agencies to improve the quality of MCMIS data in support of SMS,” the study said. “Two specific data elements require immediate attention: carrier exposure and crash data. The current exposure data are missing with high frequency, and data that are collected are likely of unsatisfactory quality.”

Greenhouse added, “We think that one of the key pieces of information that could be improved to help FMCSA in its goal is a better measure of exposure that a truck or bus has to deal with. If FMCSA has information about each truck on each run of miles traveled, the actual exposure would be a much more valuable and meaningful measure that would help both the current SMS system as well as the one that we’re proposing.”

However, in the interest of protecting proprietary information, carriers could be reluctant to share such detailed information with regulators.

Panel co-chair Sharon-Lise Normand, a professor of health care policy at Harvard Medical School, said that the panel did not compare the various strengths of CSA’s seven safety categories, known as Behavior Analysis and Safety Improvement Categories, or BASICs, but did not disagree with evidence provided in previous critical studies done by the American Transportation Research Institute and the Government Accountability Office.

“We really felt that FMCSA’s conceptual approach to preventing crashes was quite sound and quite reasonable,” Normand told TT.

But Normand added that the agency should immediately begin a conversion to the IRT method. “We aren’t saying wait five years and do this,” she said. “We’re saying start transitioning now.”

“We also felt that the public would be better served if there was more collaboration between industry and FMCSA,” Greenhouse said. “Instead of being always adversarial, working together would actually benefit everybody.”

Some of the panel’s other observations included:

A recommendation that the FMCSA investigate ways of collecting data that will likely benefit the recommended methodology for safety assessment, including data on such carrier characteristics as driver turnover rate, type of cargo, method and level of compensation and better information on exposure.
FMCSA should undertake a study to better understand the statistical operating characteristics of the percentile ranks to support decisions regarding the usability of making some of the scores public.
The agency does not apply SMS to many small carriers and others that have not had a sufficient number of inspections, violations and crashes.
BASICs using IRT models would be based first on expert opinion or dated empirical information, then on a combination of current observed data and expert opinion, and ultimately on data alone, the study said.
The IRT method also can account for the probability of being selected for inspection, provide a basis to evaluate how data insufficiency could impact safety ratings of carriers and provide a basis to more rigorously evaluate the structure of the current BASICs, including which violations go into which BASIC, the study said.
Most, but not all, of the agency’s current BASICs are predictive of a carrier’s crash risk.
Some states are riskier to drive in than others and therefore could skew data.
The selection of trucks for inspection can sometimes be based on an inspector’s expertise at noticing potential problems.
The agency should find a way to include clean inspections in a carrier’s safety scores.