Systems to Rate the Strength of Scientific Evidence

West S, King V, Carey TS, et al. Systems to Rate the Strength of Scientific Evidence. Evidence Report/Technology Assessment No. 47 (Prepared by the Research Triangle Institute-University of North Carolina Evidence-based Practice Center under Contract No. 290-97-0011). AHRQ Publication No. 02-E016. Rockville, MD: Agency for Healthcare Research and Quality. April 2002.








Health care decisions are increasingly based on research evidence rather than on expert opinion or clinical experience alone. This report examines systematic approaches to assessing the strength of scientific evidence. Such systems allow evaluation of either individual articles or entire bodies of research on a particular subject, for use in making evidence-based health care decisions. Identifying methods to assess health care research results is a task that Congress directed the Agency for Healthcare Research and Quality (AHRQ) to undertake as part of the Healthcare Research and Quality Act of 1999.

Search Strategy

The authors built on an earlier project on evaluating evidence for systematic reviews. They expanded this work by conducting a MEDLINE search (covering the years 1995 to mid-2000) for relevant articles published in English on either rating the quality of individual research studies or grading a body of scientific evidence. Information from other Evidence-based Practice Centers (EPCs) and other groups involved in evidence-based medicine (such as the Cochrane Collaboration Methods Group) was used to supplement these sources.

Selection of Studies

The initial MEDLINE search for systems for assessing study quality identified 704 articles, while the search on strength of evidence identified 679 papers. Each abstract was assessed by two reviewers to determine eligibility. An additional 219 publications were identified from other sources. The first 100 abstracts in each group were used to develop a coding system for categorizing the publications.

Data Collection and Analysis

From the 1,602 titles and abstracts reviewed for the report (704 + 679 + 219), 109 were retained for further analysis. In addition, the authors examined 12 reports from various AHRQ-supported EPCs. To account for differences among study designs, namely systematic reviews and meta-analyses, randomized controlled trials (RCTs), observational studies, and diagnostic studies, the authors developed four Study Quality Grids whose columns denote evaluation domains of interest and whose rows are the individual systems, checklists, scales, or instruments. Taken together, the grids form “evidence tables” that document the characteristics (strengths and weaknesses) of these different systems.
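The grid layout described above (one row per rating system, one column per evaluation domain) can be pictured as a small data structure. The following is only an illustrative sketch, not the authors' instrument: the class name, the domain names, the system names, and the three coverage levels ("yes", "partial", "no") are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumed coverage levels for illustration; the report's own coding may differ.
LEVELS = ("yes", "partial", "no")


@dataclass
class StudyQualityGrid:
    """One grid per study design: rows are systems, columns are domains."""
    study_design: str
    domains: List[str]
    rows: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def add_system(self, name: str, coverage: Dict[str, str]) -> None:
        """Add a row; domains not mentioned default to 'no'."""
        unknown = set(coverage) - set(self.domains)
        if unknown:
            raise ValueError(f"unknown domains: {sorted(unknown)}")
        invalid = set(coverage.values()) - set(LEVELS)
        if invalid:
            raise ValueError(f"unknown levels: {sorted(invalid)}")
        self.rows[name] = {d: coverage.get(d, "no") for d in self.domains}

    def fully_addressed(self, name: str) -> int:
        """Count the domains a system addresses fully."""
        return sum(level == "yes" for level in self.rows[name].values())


# Hypothetical domains and systems, used only to show the shape of a grid.
grid = StudyQualityGrid("RCT", ["domain_a", "domain_b", "domain_c"])
grid.add_system("Checklist X", {"domain_a": "yes", "domain_b": "partial"})
grid.add_system("Scale Y", {"domain_a": "yes", "domain_b": "yes", "domain_c": "yes"})
```

Under these assumptions, four such grids (one per study design) would together play the role of the evidence tables.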

Main Results

The authors separately analyzed systems found in the literature and those in use by the EPCs. Four non-EPC checklists for use with systematic reviews or meta-analyses accounted for at least six of seven domains needed to be considered high-performing. For analysis of RCTs, the authors concluded that eight systems represent acceptable approaches that could be used without major modifications. Six high-performing systems were identified to evaluate observational studies. Five non-EPC checklists adequately dealt with studies of diagnostic tests. For assessment of the strength of a body of evidence, seven systems fully addressed the quality, quantity, and consistency of the evidence.
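The criterion for body-of-evidence systems, full coverage of quality, quantity, and consistency, can be expressed as a simple filter. This is a hedged sketch only: the function names, the system names, and the "yes"/"partial"/"no" coding are hypothetical and are not taken from the report.

```python
from typing import Dict, List

# The three domains named in the text for grading a body of evidence.
BODY_OF_EVIDENCE_DOMAINS = ("quality", "quantity", "consistency")


def fully_addresses_all(coverage: Dict[str, str]) -> bool:
    """True only if every one of the three domains is coded 'yes'."""
    return all(coverage.get(d) == "yes" for d in BODY_OF_EVIDENCE_DOMAINS)


def select_systems(systems: Dict[str, Dict[str, str]]) -> List[str]:
    """Return, in sorted order, the systems that fully address all three domains."""
    return sorted(name for name, cov in systems.items() if fully_addresses_all(cov))


# Hypothetical systems, for illustration only.
example_systems = {
    "System A": {"quality": "yes", "quantity": "yes", "consistency": "yes"},
    "System B": {"quality": "yes", "quantity": "partial", "consistency": "yes"},
    "System C": {"quality": "yes", "quantity": "yes"},  # consistency not addressed
}
selected = select_systems(example_systems)
```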


Overall, the authors identified 19 generic systems that fully address their key quality domains for a particular type of study, and seven systems that address all three domains for grading the strength of a body of evidence. They also recommended areas for future research to bridge gaps where information or empirical documentation is needed. The authors hope that these systems will prove useful to those developing clinical practice guidelines or other health-related policy advice.