Reporting of Multivariable Methods in the Medical Literature

Jeanette M. Tetrault, Maor Sauler, Carolyn K. Wells, John Concato
DOI: 10.2310/JIM.0b013e31818914ff Published 5 January 2016
Jeanette M. Tetrault, MD; Maor Sauler, MD; Carolyn K. Wells, MPH; John Concato, MD, MS, MPH
From the Veterans Affairs Clinical Epidemiology Research Center, West Haven; and Yale University School of Medicine, New Haven, CT.

Abstract

Background Multivariable models are frequently used in the medical literature, but many clinicians have limited training in these analytic methods. Our objective was to assess the prevalence of multivariable methods in medical literature, quantify reporting of methodological criteria applicable to most methods, and determine if assumptions specific to logistic regression or proportional hazards analysis were evaluated.

Methods We examined all original articles in Annals of Internal Medicine, British Medical Journal, Journal of the American Medical Association, Lancet, and New England Journal of Medicine, from January through June 2006. Articles reporting multivariable methods underwent a comprehensive review; reporting of methodological criteria was based on each article's primary analysis.

Results Among 452 articles, 272 (60%) used multivariable analysis; logistic regression (89 [33%] of 272) and proportional hazards (76 [28%] of 272) were most prominent. Reporting of methodological criteria, when applicable, ranged from 5% (12/265) for assessing influential observations to 84% (222/265) for description of variable coding. Discussion of interpreting odds ratios occurred in 13% (12/89) of articles reporting logistic regression as the primary method and discussion of the proportional hazards assumption occurred in 21% (16/76) of articles using Cox proportional hazards as the primary method.

Conclusions More complete reporting of multivariable analysis in the medical literature can improve understanding, interpretation, and perhaps application of these methods.

Introduction

Clinicians are expected to interpret the results of studies found in the medical literature, and multivariable statistical techniques are often used to assess complex associations.1-3 Although generally accepted methodological criteria exist for the application of multivariable analysis,1,2 these criteria may not always be applied, or at least not reported. For clinicians who encounter such analyses, medical training offers little instruction in multivariable methods.4,5 Accordingly, if authors do not apply and report methodological criteria appropriately, the results of a study may be misinterpreted or may even be incorrect.1

The objectives of this review were to (1) assess the frequency of multivariable methods reported in the medical literature, (2) quantify reporting of methodological criteria applicable to most multivariable models, and (3) determine if assumptions specific to logistic regression or proportional hazards analysis (Cox regression) were reported.

METHODS

We manually reviewed all abstracts of original research articles published in the Annals of Internal Medicine, British Medical Journal, Journal of the American Medical Association, Lancet, and New England Journal of Medicine, from January 2006 through June 2006. Articles underwent complete review if a multivariable method was mentioned (within the abstract) as a statistical analysis or if information suggestive of multivariable modeling techniques was mentioned in the results. Data were then extracted onto a standardized form regarding the types of analytic methods used, reporting of common methodological criteria, and confirmation of model assumptions.

If more than one multivariable method was reported, all methods used were noted, but results were extracted based on the method reported in the abstract. If more than one method was found in the abstract, results were extracted based on the method used to evaluate the primary research question. Finally, if the primary method used was still uncertain, data were extracted based on the method that received the most emphasis or was presented first in the methods section.

We evaluated the adequacy of reporting of methodological criteria common to most multivariable models1: reporting the coding scheme for independent and dependent variables (to interpret coefficients); providing information to calculate the number of events per variable for models with discrete outcome events (to avoid overfitting of the model6-8); reporting tests for interactions (or mentioning a lack thereof); describing the process of variable selection, such as backward or forward selection (to identify the strategy used); validating the model (eg, with an assessment of model "fit"); testing independent variables for collinearity; and considering a method for evaluating outliers (even if data were "left as is").
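
The events-per-variable criterion reduces to a simple ratio that a reader can compute when an article reports the number of outcome events and the number of independent variables. The following is a minimal sketch in Python, not the procedure used in this review; the function name and the example counts are hypothetical.

```python
def events_per_variable(n_events: int, n_variables: int) -> float:
    """Return the number of outcome events per independent variable.

    Illustrative helper only: both counts are assumed to be taken
    directly from what an article reports for its primary model.
    """
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return n_events / n_variables


# Hypothetical article: 60 outcome events, 8 independent variables.
epv = events_per_variable(n_events=60, n_variables=8)
print(f"Events per variable: {epv:.1f}")  # Events per variable: 7.5
```

A low ratio suggests that the model may be overfitted, which is why the information needed to compute it was one of the criteria assessed.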

Finally, for logistic regression, we assessed whether potential problems regarding interpreting odds ratios were mentioned (such that an odds ratio for each independent variable approximates a relative risk only if the outcome being assessed is uncommon). Similarly, for proportional hazards models, we evaluated reporting of the proportional hazards assumption, which involves a relatively constant "hazard" of the outcome for the compared groups over time.
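
The point about odds ratios can be illustrated with a small calculation. The sketch below, in Python, holds the relative risk fixed at 2.0 and shows how the odds ratio drifts away from it as the outcome becomes more common; the risks are hypothetical values chosen for illustration and are not data from the reviewed articles.

```python
def relative_risk(p_exposed: float, p_unexposed: float) -> float:
    """Ratio of outcome probabilities in the two groups."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed: float, p_unexposed: float) -> float:
    """Ratio of outcome odds, p / (1 - p), in the two groups."""
    odds_exposed = p_exposed / (1 - p_exposed)
    odds_unexposed = p_unexposed / (1 - p_unexposed)
    return odds_exposed / odds_unexposed


# Hypothetical baseline risks; the exposed group always has twice the risk.
for p_unexposed in (0.01, 0.10, 0.30):
    p_exposed = 2 * p_unexposed
    rr = relative_risk(p_exposed, p_unexposed)
    or_ = odds_ratio(p_exposed, p_unexposed)
    print(f"baseline risk {p_unexposed:.2f}: RR = {rr:.2f}, OR = {or_:.2f}")
# baseline risk 0.01: RR = 2.00, OR = 2.02
# baseline risk 0.10: RR = 2.00, OR = 2.25
# baseline risk 0.30: RR = 2.00, OR = 3.50
```

With a 1% baseline risk the odds ratio is nearly identical to the relative risk, whereas with a 30% baseline risk it overstates the relative risk considerably.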

All data were extracted onto a standardized form, and data were double entered into an Excel spreadsheet. For the methodological criteria, categories of no mention versus the combination of "mentioned, with detail" and "mentioned, without detail" were analyzed. A 10% random re-review was performed for data quality assurance by two authors (J.M.T. and M.S.). Descriptive statistics regarding frequencies of methods and criteria were evaluated in SAS version 9.1 (SAS Institute Inc, Cary, NC).

RESULTS

A total of 452 abstracts of original research articles (listed at www.cerc.med.va.gov) were reviewed, with 26% (n = 119) from British Medical Journal, 23% (n = 105) from Journal of the American Medical Association, 22% (n = 100) from New England Journal of Medicine, 20% (n = 90) from Lancet, and 8% (n = 38) from Annals of Internal Medicine (published semimonthly). Multivariable methods were reported in 60% (n = 272) of the articles, including 28% (n = 77) using more than one multivariable method; 2% (n = 9) reported a multivariable analysis for bivariate ("unadjusted") purposes only. For the elements included on the data extraction form, the average percent agreement was 96.9%, and the average κ statistic was 0.91, indicating "almost perfect" agreement.9
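
For readers less familiar with the agreement measures reported above, the following Python sketch computes percent agreement and Cohen's κ for two reviewers from a cross-classification table. It is an illustration under stated assumptions, not the authors' SAS code: the function name and the 2 × 2 counts are hypothetical and unrelated to the review's data.

```python
def cohen_kappa(table):
    """Return (observed agreement, kappa) for a square agreement table.

    table[i][j] is the number of items placed in category i by the
    first reviewer and in category j by the second reviewer.
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    # Observed agreement: proportion of items on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total
    # Chance-expected agreement from the row and column marginals.
    row_marginals = [sum(table[i]) / total for i in range(k)]
    col_marginals = [sum(table[i][j] for i in range(k)) / total for j in range(k)]
    expected = sum(r * c for r, c in zip(row_marginals, col_marginals))
    if expected == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return observed, (observed - expected) / (1 - expected)


# Hypothetical re-review: rows = reviewer 1, columns = reviewer 2,
# categories = ("criterion mentioned", "criterion not mentioned").
observed, kappa = cohen_kappa([[40, 5], [5, 50]])
print(f"agreement = {observed:.2f}, kappa = {kappa:.2f}")
# agreement = 0.90, kappa = 0.80
```

Because κ discounts the agreement expected by chance, it is lower than the raw percent agreement in this example.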

As shown in Table 1, logistic regression (33%, n = 89/272) and proportional hazards analysis (28%, n = 76/272) were the most frequently reported methods. Other less common methods were found in 8% (n = 23) of the studies, including Weibull regression10 and accelerated failure time models.11,12 Three percent of the studies (n = 7) used an unspecified multivariable modeling technique that precluded further review; 1 study reported the use of more than 5 different methods.

TABLE 1.

Frequency of Multivariable Methods (n = 272 articles)

Table 2 shows the adequacy of reporting of the 265 articles with clearly stated multivariable methods. The coding scheme for variables was described in 84% (n = 222) of the studies evaluated. The numbers of events per variable were described in 79% (152/192) of studies with categorical outcome events, and 45% (n = 118) of the studies reported data on testing for interaction terms. The model selection process was described in 15% of studies (n = 41); assessing model "fit" or other mechanisms of model validation were described in 10% of studies (n = 27), including techniques such as bootstrapping13 (n = 4) and the Hosmer-Lemeshow test14 (n = 13). The text described issues relating to collinearity in 9% (n = 24) of the studies, and a method for dealing with outliers was found in 5% (n = 12). Of note, 5% (n = 13) of the studies failed to report any of the criteria; only 1 study met all of the criteria.

TABLE 2.

Adequacy of Reporting of Methodological Criteria (n = 265*)

We also looked for discussion of assumptions specific to logistic regression and proportional hazards analysis. Among studies reporting data from logistic regression models, 13% (12/89) discussed the interpretation of odds ratios; 21% (16/76) of studies using proportional hazards analysis discussed the proportional hazards assumption. As examples of analytic strategies, 1 study reported using Poisson regression rather than logistic regression because the outcome event was common, and another study reported use of logistic regression because the proportional hazards assumption was not met. Examples of how the proportional hazards assumption was tested were use of log-log plots and use of the Schoenfeld residual test.15
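
As background for the log-log plots mentioned above, the proportional hazards assumption can be written out explicitly. The block below is a standard textbook formulation offered as a sketch rather than anything taken from the reviewed articles; the symbols (baseline hazard h_0, baseline survival S_0, a single covariate x with coefficient \beta) are notation introduced here for illustration.

```latex
% Proportional hazards model for a covariate x with coefficient \beta
h(t \mid x) = h_0(t)\,\exp(\beta x)

% Hazard ratio comparing x_1 with x_0: it does not depend on time t
\frac{h(t \mid x_1)}{h(t \mid x_0)} = \exp\bigl(\beta (x_1 - x_0)\bigr)

% Equivalent survival form, S(t \mid x) = S_0(t)^{\exp(\beta x)}, so that
\log\bigl(-\log S(t \mid x)\bigr) = \log\bigl(-\log S_0(t)\bigr) + \beta x
```

Under the assumption, the log-log survival curves for the compared groups differ only by the constant \beta (x_1 - x_0), so roughly parallel curves are consistent with proportional hazards and curves that converge or cross suggest a violation.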

DISCUSSION

In a review of prominent medical journals, we found that multivariable methods of data analysis were used frequently, with logistic regression and proportional hazards analysis the most commonly reported methods. "Any mention" of methodological criteria applicable to most multivariable models varied widely, suggesting an opportunity for improved reporting (and possibly conduct) of these methods. In addition, model assumptions specific to logistic regression and proportional hazards were infrequently discussed.

The results of our review can be considered in the context of a prior review1 finding an 18% prevalence of 4 common multivariable methods in Lancet and New England Journal of Medicine as of 1989. In the current review (as of 2006), we found that 60% of studies used multivariable methods. Across the 2 time periods, logistic regression and proportional hazards analysis remained the most frequently used methods. "Adequate" reporting of general criteria in the current analysis ranged from 5% (method for evaluating outliers) to 84% (coding of variables). Among the criteria evaluated in both reviews, most were met more frequently in the current review. For example, testing for interactions was evident in 45% of articles in the current review, versus 27% of articles in the earlier review; only reporting of the model selection process occurred less frequently in the current review than in the earlier review (15% vs 86%, respectively). The frequency of reporting of the proportional hazards assumption remained approximately the same (and was "low") in both reviews.

Reviews of multivariable analysis have been published in the medical specialty16 and obstetrics-gynecology literature,17 with similar findings of incomplete reporting. Other research has focused on the statistical review process at the level of the journal or authors. In a masked before-and-after study,18 the peer review and editing process at the Annals of Internal Medicine was found to improve the quality of reporting of multivariable methods. In another study19 of 114 journals responding to a survey, only one third of journals required statistical review for all accepted manuscripts. Finally, a study20 of 704 authors submitting to either of 2 general medical journals found that 73% received input from a methodologist (most often a biostatistician or epidemiologist); papers without methodological input were more likely to be rejected without review.

As a potential limitation of our work, the manual search, although extensive, may not have identified all possible studies using multivariable methods. In addition, the 10% random review indicated minimal interobserver variability, but some studies may still have been misclassified. As a strength of our study, we "gave credit" if any information was presented for the various methodological criteria. For example, mentioning the number of outcome events per independent variable was considered awareness of the issue; we did not apply a threshold value (eg, 10 events per variable7,8) for "appropriateness." Similarly, a figure showing the relationship of 2 factors with an outcome variable was considered evidence of assessing interactions, even if not mentioned in the text.

This review suggests the continued need for more complete reporting of multivariable methods in the medical literature. Inadequate reporting of these methods increases the potential for unclear or misinterpreted results. Efforts to standardize reporting of multivariable methods would improve the quality of publications and assist clinicians in reaching appropriate conclusions.

ACKNOWLEDGMENTS

The authors thank the staff at the Clinical Epidemiology Research Center, and especially Richard Feinn, for assistance with this article.

References

  1. Concato J, Feinstein AR, Holford TR. The risk of determining risk with multivariable models. Ann Intern Med 1993;118:201-210.
  2. Katz MH. Multivariable analysis: a primer for readers of medical research. Ann Intern Med 2003;138:644-650.
  3. Harrell FE Jr, Lee KL, Pollock BG. Regression models in clinical studies: determining relationships between predictors and response. J Natl Cancer Inst 1988;80:1198-1202.
  4. Berwick DM, Fineberg HV, Weinstein MC. When doctors meet numbers. Am J Med 1981;71:991-998.
  5. Windish DM, Diener-West M. A clinician-educator's roadmap to choosing and interpreting statistical tests. J Gen Intern Med 2006;21:656-660.
  6. Peduzzi P, Concato J, Kemper E, et al. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol 1996;49:1373-1379.
  7. Concato J, Peduzzi P, Holford TR, et al. Importance of events per independent variable in proportional hazards analysis. I. Background, goals, and general strategy. J Clin Epidemiol 1995;48:1495-1501.
  8. Peduzzi P, Concato J, Feinstein AR, et al. Importance of events per independent variable in proportional hazards regression analysis. II. Accuracy and precision of regression estimates. J Clin Epidemiol 1995;48:1503-1510.
  9. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159-174.
  10. O'Quigley J, Flandre P. Predictive capability of proportional hazards regression. Proc Natl Acad Sci U S A 1994;91:2310-2314.
  11. Wei LJ. The accelerated failure time model: a useful alternative to the Cox regression model in survival analysis. Stat Med 1992;11:1871-1879.
  12. Walker S, Mallick BK. A Bayesian semiparametric accelerated failure time model. Biometrics 1999;55:477-483.
  13. Harrell FE Jr, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 1996;15:361-387.
  14. Hosmer DW, Lemeshow S. Applied Logistic Regression. New York, NY: Wiley; 1989.
  15. Grambsch PM, Therneau TM. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 1994;81:515-526.
  16. Moss M, Wellman DA, Cotsonis GA. An appraisal of multivariable logistic models in the pulmonary and critical care literature. Chest 2003;123:923-928.
  17. Khan KS, Chien PF, Dwarakanath LS. Logistic regression models in obstetrics and gynecology literature. Obstet Gynecol 1999;93:1014-1020.
  18. Goodman SN, Berlin J, Fletcher SW, et al. Manuscript quality before and after peer review and editing at Annals of Internal Medicine. Ann Intern Med 1994;121:11-21.
  19. Goodman SN, Altman DG, George SL. Statistical reviewing policies of medical journals: caveat lector? J Gen Intern Med 1998;13:753-756.
  20. Altman DG, Goodman SN, Schroter S. How statistical expertise is used in medical research. JAMA 2002;287:2817-2820.
Reporting of Multivariable Methods in the Medical Literature
Jeanette M. Tetrault, Maor Sauler, Carolyn K. Wells, John Concato
Journal of Investigative Medicine Oct 2008, 56 (7) 954-957; DOI: 10.2310/JIM.0b013e31818914ff
