TABLE VI

Evaluation of Overall Body of Evidence Using the GRADE Scoring System*

| OPLL-Related Factor | Strength of Evidence | Conclusions/Comments | Baseline Rating of Strength of Evidence | Reason for Upgrading Baseline Rating of Strength of Evidence§ | Reason for Downgrading Baseline Rating of Strength of Evidence# |
| --- | --- | --- | --- | --- | --- |
| Shape of ossification | Low | There is low evidence that the shape of ossification is predictive of the postoperative JOA score: in a single retrospective study, patients with a hill-shaped ossification had worse outcomes than those with a plateau-shaped lesion | High | None | Consistency unknown (−1), precision unknown (−1) |
| Shape of ossification | Insufficient | There is insufficient evidence that the shape of ossification predicts the JOA recovery rate | Low | None | Consistency unknown (−1), precision unknown (−1) |
| Type of OPLL | Insufficient | There is insufficient evidence that the type of OPLL predicts the postoperative JOA recovery rate | Low | None | Inconsistent (−1), precision unknown (−1) |
| Occupying ratio | Moderate | There is moderate evidence that the occupying ratio is not predictive of the Nurick score: in a single retrospective study, the occupying ratio was not associated with improvement of the Nurick score | High | None | Consistency unknown (−1) |
| Occupying ratio | Insufficient | There is insufficient evidence that the occupying ratio predicts the postoperative JOA score or the JOA recovery rate | Low | None | Inconsistent/consistency unknown (−1), precision unknown (−1) |
| Dural ossification | Insufficient | There is insufficient evidence that the presence of dural ossification predicts the postoperative JOA score | Low | None | Consistency unknown (−1), precision unknown (−1) |
| Space available for spinal cord | Low | There is low evidence that the space available for the cord is not predictive of postoperative JOA score: in 2 retrospective studies, the space available for the spinal cord was not associated with the postoperative JOA score | High | None | Precision unknown (−1), risk of bias (−1) |

  • * The GRADE scoring system was used to evaluate whether the listed OPLL-related factors can predict surgical outcomes.

  • JOA = Japanese Orthopaedic Association.

  • A baseline rating of “high” indicates that the majority of articles represented Level-I or II evidence, and a rating of “low” indicates that the majority of articles represented Level-III or IV evidence.

  • § The possible criteria for upgrading the baseline rating included a large magnitude of effect (1 or 2 levels), a dose-response gradient (1 level), and plausible confounding that would decrease the magnitude of effect (1 level).

  • # The possible criteria for downgrading the baseline rating included inconsistency of results (1 or 2 levels), indirectness of evidence (1 or 2 levels), imprecision of effect estimates (1 or 2 levels), risk of bias (1 or 2 levels), failure to specify subgroup analysis a priori (1 level), and reporting bias (1 level). The numbers of levels of downgrade are shown in parentheses.
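
The rating arithmetic described in the footnotes can be sketched as a short calculation. This is only an illustrative sketch, not the authors' procedure: the function name `final_rating`, the four-step ordering (Insufficient, Low, Moderate, High), and the clamping at the ends of the scale are assumptions inferred from the rows of the table, where a baseline of "High" with two downgrades yields "Low" and a baseline of "Low" with two downgrades yields "Insufficient".

```python
# Assumed four-step scale, ordered from weakest to strongest.
# The ordering and the clamping below are inferred from the table rows,
# not stated explicitly in the source.
LADDER = ["Insufficient", "Low", "Moderate", "High"]


def final_rating(baseline, upgrades=0, downgrades=0):
    """Shift a baseline rating by the net number of levels.

    baseline   -- "High" or "Low" (any rung of LADDER is accepted)
    upgrades   -- total levels added by upgrading criteria
    downgrades -- total levels removed by downgrading criteria
    The result is clamped so it never leaves the scale (assumption).
    """
    index = LADDER.index(baseline) + upgrades - downgrades
    index = max(0, min(index, len(LADDER) - 1))
    return LADDER[index]


# Examples taken from rows of the table (no upgrades were applied in any row).
print(final_rating("High", downgrades=2))  # shape of ossification, JOA score
print(final_rating("High", downgrades=1))  # occupying ratio, Nurick score
print(final_rating("Low", downgrades=2))   # type of OPLL, JOA recovery rate
```

Under these assumptions the three calls reproduce the "Low", "Moderate", and "Insufficient" ratings reported in the corresponding rows.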