Abstract
Background The Lambda-Mu-Sigma (LMS) method calculates the lower limit of normal for spirometric measures of pulmonary function as the fifth percentile of the distribution of z scores, suitably accounting for age-related changes in pulmonary function. Extending prior work, and to assess whether the LMS method is clinically valid when evaluating respiratory impairment in the elderly, our current objective was to evaluate the association of LMS-defined respiratory impairment (airflow limitation and restrictive pattern) with all-cause mortality and respiratory symptoms (chronic bronchitis, dyspnea, or wheezing) in older persons.
Methods Spirometric data and outcome data on white participants aged 65 to 80 years were obtained from the Third National Health and Nutrition Examination Survey (NHANES-III, n = 1497) and the Cardiovascular Health Study (CHS, n = 3583). Multivariable analyses determined the corresponding associations, adjusting for important covariates.
Results In the NHANES-III and CHS populations, respectively, LMS-defined airflow limitation had adjusted hazard ratios (95% confidence interval) of 1.64 (1.28-2.11) and 1.69 (1.48-1.92) for mortality; adjusted odds ratios for respiratory symptoms were 2.71 (1.92-3.83) and 2.63 (2.11-3.27). The LMS-defined restrictive pattern was also significantly associated with mortality (adjusted hazard ratios of 1.98 [1.54-2.53] and 1.68 [1.44-1.95]), as well as with respiratory symptoms (adjusted odds ratios of 1.55 [1.03-2.34] and 1.37 [1.07-1.75]) in NHANES-III and CHS, respectively.
Conclusions The LMS-defined airflow limitation and restrictive pattern confer a significantly increased risk of death and likelihood of having respiratory symptoms. These results support the use of LMS-derived spirometric z scores as a basis for evaluating respiratory impairment in older persons.
The combination of host factors, along with exposures to tobacco smoke, respiratory infections, occupational dusts, and air pollution that occur across the life span, can lead to respiratory impairment, respiratory symptoms, and increased mortality.1-5 Controversy exists, however, regarding the spirometric evaluation of respiratory impairment, including airflow limitation and restrictive pattern.5,6 In particular, current spirometric criteria differ in the diagnostic thresholds for the ratio of forced expiratory volume in 1 second to forced vital capacity (FEV1/FVC) and for FVC alone. For example, the Global Initiative for Chronic Obstructive Lung Disease (GOLD) advocates a fixed ratio of 0.70 for FEV1/FVC and an 80% predicted cut point for FVC, whereas the American Thoracic and European Respiratory Societies (ATS/ERS) recommend a lower limit of normal (LLN), calculated as the fifth percentile of the distribution of reference values (ATS/ERS-LLN5), for both FEV1/FVC and FVC.5-7
Among older persons, establishing respiratory impairment based on GOLD thresholds has potential problems, for at least 2 reasons.8-13 First, normal aging is associated with increased rigidity of the chest wall and loss of elastic recoil of the lung, factors that often lead to an FEV1/FVC less than 0.70 for individuals 65 years or older, including otherwise healthy never-smokers.8-14 Second, spirometric performance among older persons, including otherwise healthy never-smokers, is associated with increased variability (coefficient of variation), thereby increasing the disparity between the 80% predicted cut point for FVC and lower values of observed FVC, including LLNs.8,9 Applying official GOLD thresholds in a large cohort of community-living older persons, prior work has shown that 55% (2743/4965) of participants are classified as having a respiratory impairment (airflow limitation or restrictive pattern).15 For these reasons, it is possible that a substantial proportion of GOLD-defined respiratory impairment simply represents typical age-related changes rather than "clinical disease" (eg, asthma or chronic obstructive pulmonary disease in the case of airflow limitation).8-13
The ATS/ERS-LLN5 threshold also has potential limitations because its calculation has been traditionally based on multiple regression equations that assume a linear relationship between predictor variables (age and height) and spirometric measures and because it also assumes that reference values are distributed normally and have constant variability across the life span.8,9 This approach is especially problematic for a ratio of 2 spirometric measures, as with the FEV1/FVC. As evidence of the problem, published reports have shown that conventional multiple regression techniques for the FEV1/FVC have limited explanatory ability when applied to an elderly population, with R² values ranging from 0.01 to 0.15.11,16,17
To address the concerns described above, a diagnostic threshold for spirometric measures based on the Lambda-Mu-Sigma (LMS) method has been advocated. As an approach widely used to construct growth charts,8,9 the LMS method calculates the LLN as the fifth percentile of the distribution of z scores (LMS-LLN5), analogous to the reporting of bone mineral density.8,9,18 Mathematically, LMS-derived z scores include the median (Mu), representing how spirometric variables change based on predictor variables; the coefficient of variation (Sigma), modeling the spread of reference values and adjusting for nonuniform dispersion; and skewness (Lambda), modeling a departure from normality.8,9,19 By using this approach, the LMS method is also an improvement on prior methods of determining spirometric z scores calculated from "standardized residuals" (relevant measurements obtained from multiple regression equations).11,12
Beyond a strong mathematical rationale, we have previously evaluated the clinical validity of different z score thresholds for FEV1/FVC, based on associations with health outcomes, and found that the upper limit of this ratio that conferred a significantly increased risk occurred at the LMS-LLN5.13 Our prior work assessing the FEV1/FVC ratio was based on a single study population, however, and did not specifically evaluate the clinical validity of LMS-defined airflow limitation and restrictive pattern relative to normal spirometric function, a diagnostic process that requires simultaneous consideration of FEV1/FVC and FVC (see Methods section).5,6 Thus, it remains to be seen whether the LMS method is appropriate when evaluating respiratory impairment, especially in older persons.
In the present study, using LMS-derived z scores for spirometric values and data from 2 large cohorts of community-living older persons, we evaluated the association of respiratory impairment with mortality and respiratory symptoms. As a secondary aim, we calculated the frequency of the potential misclassification of respiratory impairment when using current spirometric criteria, relative to LMS designations.
METHODS
Study Population
We used deidentified, publicly available data from the Third National Health and Nutrition Examination Survey (NHANES-III) and the Cardiovascular Health Study (CHS), with approval obtained from the VA Connecticut and Yale University institutional review boards.20,21 For the present study, eligible participants were white, aged 65 to 80 years, and, at baseline (initial examination), had completed at least 2 ATS-acceptable spirometric maneuvers. Our analyses were limited to whites aged 65 to 80 years because reference values for the LMS method are currently unavailable for nonwhites and for those older than 80 years.8,9 As per the current convention, we did not exclude participants based on spirometric reproducibility criteria.22 Lastly, to focus on "irreversible" pathology, participants with self-reported asthma were excluded.
The NHANES-III population, assembled in 1988 to 1994 and followed through 2000, used a complex design to generate a nationally representative sample, with an age range of 8 to 80 years (n = 33,994).20 On the basis of the eligibility criteria above (including available spirometry), our study sample within NHANES-III included 1497 participants. The CHS population was assembled in 1989 to 1990 as a random sample from Medicare eligibility lists in 4 US communities and followed through 2002, with an age range of 65 to 100 years (n = 5888).21 On the basis of the same eligibility criteria, our study sample within CHS included 3583 participants.
Spirometry
In both NHANES-III and CHS, participants who underwent spirometry did so during their baseline examination, according to contemporary ATS protocols.23 Spirometry was conducted using a dry-rolling seal spirometer in NHANES-III, whereas a water-sealed spirometer was used in CHS; both met ATS accuracy requirements.20,21,24,25 For each study participant, the measured FEV1/FVC was calculated from the largest set of FEV1 and FVC values that were recorded in any of the spirometric maneuvers that met ATS-acceptability criteria.6,22
In our study samples, we calculated LMS-derived z scores for FEV1/FVC and FVC, as recommended (see Appendix).8,9 Using the LMS-LLN5 as the diagnostic threshold, we then defined (a) normal spirometric function as FEV1/FVC and FVC both equal to LMS-LLN5 or higher, (b) airflow limitation as FEV1/FVC less than LMS-LLN5, and (c) restrictive pattern as FEV1/FVC equal to LMS-LLN5 or higher and FVC less than LMS-LLN5.6,13
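The three LMS-based categories defined above can be sketched as a small function. This is an illustrative sketch only, not the code used in the study: the function name `classify_lms` and its argument names are hypothetical, and the threshold of −1.64 is the z score that the Appendix gives for the LMS-LLN5.

```python
# Illustrative sketch; names are hypothetical and not taken from the study.
LLN_Z = -1.64  # z score at the fifth percentile (LMS-LLN5), per the Appendix


def classify_lms(z_fev1_fvc: float, z_fvc: float, lln_z: float = LLN_Z) -> str:
    """Classify respiratory status from LMS-derived z scores.

    Airflow limitation depends on FEV1/FVC alone; restrictive pattern
    requires FEV1/FVC at or above the LLN together with a reduced FVC.
    """
    if z_fev1_fvc < lln_z:
        return "airflow limitation"
    if z_fvc < lln_z:
        return "restrictive pattern"
    return "normal"


print(classify_lms(-2.10, -0.30))  # reduced ratio
print(classify_lms(-0.50, -1.90))  # ratio preserved, reduced FVC
print(classify_lms(-1.64, -1.64))  # both exactly at the LLN
```

Values exactly at the threshold are treated as normal, matching the "equal to LMS-LLN5 or higher" wording of the definitions.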
In our study samples, we also classified respiratory status based on GOLD and ATS/ERS criteria. GOLD defined normal spirometric function as FEV1/FVC equal to 0.70 or higher and FVC equal to 80% predicted or higher, airflow limitation as FEV1/FVC less than 0.70, and restrictive pattern as FEV1/FVC equal to 0.70 or higher and FVC less than 80% predicted.5,15 In this context, percent predicted was calculated as ([measured/predicted mean] × 100), with predicted values derived from published regression equations.5,7 ATS/ERS defined normal spirometric function as both FEV1/FVC and FVC equal to ATS/ERS-LLN5 or higher, airflow limitation as FEV1/FVC less than ATS/ERS-LLN5, and restrictive pattern as FEV1/FVC equal to ATS/ERS-LLN5 or higher as well as FVC less than ATS/ERS-LLN5.6 The ATS/ERS-LLN5 was calculated from published regression equations.7
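For comparison, the GOLD fixed-threshold rules and the percent predicted calculation described above can be sketched as follows. The function and argument names are hypothetical, the input values are invented for illustration, and the predicted mean FVC is assumed to come from a published regression equation that is not reproduced here.

```python
# Illustrative sketch; names and example values are hypothetical.

def percent_predicted(measured: float, predicted_mean: float) -> float:
    """Percent predicted = (measured / predicted mean) x 100."""
    return measured / predicted_mean * 100.0


def classify_gold(fev1: float, fvc: float, predicted_mean_fvc: float) -> str:
    """Classify respiratory status with GOLD thresholds (0.70 and 80% predicted)."""
    if fev1 / fvc < 0.70:
        return "airflow limitation"
    if percent_predicted(fvc, predicted_mean_fvc) < 80.0:
        return "restrictive pattern"
    return "normal"


# Volumes in liters; predicted mean FVC supplied by the caller.
print(classify_gold(fev1=1.5, fvc=2.5, predicted_mean_fvc=3.0))
print(classify_gold(fev1=1.8, fvc=2.2, predicted_mean_fvc=3.0))
print(classify_gold(fev1=2.4, fvc=3.0, predicted_mean_fvc=3.2))
```

Unlike the LMS rules, neither threshold here varies with age, which is the property the Introduction identifies as a potential source of overclassification in older persons.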
Clinical Measures
Baseline clinical characteristics of each study sample (NHANES-III and CHS) included age, sex, height, body mass index (BMI; weight divided by height-squared, expressed as kg/m2), self-reported chronic conditions, health status, and smoking history.20,21 Respiratory symptoms were also evaluated as a composite measure that included (a) chronic cough or sputum production ("yes" response to "Do you usually cough on most days for 3 consecutive months or more during the year?" or "Do you bring up phlegm on most days for 3 consecutive months or more during the year?" for both NHANES-III and CHS), (b) dyspnea-on-exertion ("yes" response to "Are you troubled by shortness of breath when hurrying on the level or walking up a slight hill?" for both NHANES-III and CHS), or (c) wheezing ("yes" response to "Have you had wheezing or whistling in your chest at any time in the past 12 months?" for NHANES-III and "Does your chest ever sound wheezy or whistling occasionally apart from colds?" for CHS).20,21
All-cause mortality was recorded in NHANES-III based on the National Death Index,26 with a median follow-up of 7.6 years (interquartile range, 6.4-9.8). The CHS also recorded all-cause mortality, based on reviews of obituaries, medical records, death certificates, and a hospitalization database,21 with a median follow-up of 13.2 years (interquartile range, 9.0-13.6).
Statistical Analysis
Baseline characteristics of each study sample (NHANES-III and CHS) were first summarized as means accompanied by SDs or as counts accompanied by percentages.
Next, in each study sample, the association between LMS-defined respiratory impairment and death was evaluated using Cox regression models, adjusting for baseline clinical characteristics (age, height, sex, ethnicity, smoking history, BMI, number of chronic conditions, and health status). LMS-defined airflow limitation and restrictive pattern were treated as nominal categories, with the reference group including participants with normal spirometric function. Each Cox regression model's goodness-of-fit was assessed by model-fitting procedures and by the analysis of residuals. The proportional hazards assumption was tested by using interaction terms for the time-to-event outcome and each variable in the multivariable model; the terms were retained if P < 0.05 after adjusting for the multiplicity of comparisons. Higher-order effects were tested for the continuous covariates and included in the final model if they met the forward selection criterion of P < 0.20.27 Similarly, the association between LMS-defined respiratory impairment and the presence of respiratory symptoms was evaluated by calculating odds ratios (ORs) using logistic regression models.
Lastly, as a descriptive and exploratory analysis, the prevalence of respiratory impairment was calculated in each study sample, as defined by GOLD, ATS/ERS, and LMS criteria. For these analyses, the frequencies of discordant designations of respiratory impairment by GOLD and ATS/ERS criteria were also evaluated relative to LMS criteria.
SUDAAN version 10 and SAS version 9.2 software were used in the analyses, with a P < 0.05 (2-sided) denoting statistical significance.28,29
RESULTS
Table 1 shows the characteristics of participants in each study sample (NHANES-III and CHS). Overall, the 2 study samples were similar in age, BMI, and frequency of chronic conditions; CHS had more female representation and also had slightly lower rates of current smoking, fair-to-poor health status, and mortality.
Table 2 shows hazard ratios (HRs) for all-cause mortality in each study sample, based on LMS-defined respiratory impairment. In contrast to normal function, LMS-defined airflow limitation had an adjusted HR (95% confidence interval [CI]) for mortality of 1.64 (1.28-2.11) and 1.69 (1.48-1.92) in NHANES-III and CHS, respectively. Similarly, LMS-defined restrictive pattern had an adjusted HR for mortality of 1.98 (1.54-2.53) and 1.68 (1.44-1.95) in NHANES-III and CHS, respectively.
Table 3 shows ORs for having respiratory symptoms in each study sample, according to LMS-defined respiratory impairment. Relative to normal function, LMS-defined airflow limitation had an adjusted OR for respiratory symptoms of 2.71 (1.92-3.83) and 2.63 (2.11-3.27) in NHANES-III and CHS, respectively. Similarly, LMS-defined restrictive pattern had an elevated adjusted OR for respiratory symptoms of 1.55 (1.03-2.34) and 1.37 (1.07-1.75) in NHANES-III and CHS, respectively.
Table 4 compares the prevalence of respiratory impairment based on GOLD, ATS/ERS, and LMS-LLN5 criteria. GOLD yielded the highest frequency of airflow limitation at 37.7% (558/1481) and 38.4% (1370/3563) in the NHANES-III and CHS study populations, respectively, whereas the ATS/ERS yielded the second highest frequency of airflow limitation at 19.1% (283/1481) and 19.2% (683/3563), respectively. In contrast, the LMS-LLN5 yielded the lowest frequency of airflow limitation at 13.2% (196/1481) and 13.8% (492/3563), in NHANES-III and CHS, respectively. For restrictive pattern, similar frequencies of impairment were found across the various spirometric thresholds and study samples, ranging from 9.4% to 11.2%.
Table 5 shows the percentages of discordant designations of respiratory impairment in each study sample when using GOLD and ATS/ERS criteria, relative to LMS criteria. As can be seen, GOLD frequently overclassified respiratory impairment, with discordance of 58.4% (326/558) and 57.0% (781/1370) for airflow limitation and 34.6% (56/162) and 41.2% (165/401) for restrictive pattern in NHANES-III and CHS, respectively. Similarly, the ATS/ERS frequently overclassified respiratory impairment, with discordance of 27.6% (78/283) and 24.3% (166/683) for airflow limitation and 16.0% (24/150) and 23.3% (90/387) for restrictive pattern in NHANES-III and CHS, respectively. Potential underclassification of respiratory impairment was also found but was limited to restrictive pattern, with discordance of 25.4% (36/142) and 29.1% (97/333) when using GOLD criteria and 11.3% (16/142) and 10.8% (36/333) when using ATS/ERS criteria in NHANES-III and CHS, respectively.
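The discordance percentages above are simple proportions, and the arithmetic can be sketched as follows using three of the NHANES-III counts quoted in the text. The sketch assumes that each numerator counts participants whose designation differs from the LMS designation and that each denominator counts all participants given that designation by the criterion in question; the function name is hypothetical.

```python
# Illustrative sketch; the function name is hypothetical.

def discordance_percent(discordant: int, designated: int) -> float:
    """Return discordant / designated as a percentage, rounded to 1 decimal."""
    if designated <= 0:
        raise ValueError("designated must be a positive count")
    return round(100.0 * discordant / designated, 1)


# NHANES-III counts quoted in the text above
gold_airflow_over = discordance_percent(326, 558)      # GOLD, airflow limitation
ats_airflow_over = discordance_percent(78, 283)        # ATS/ERS, airflow limitation
gold_restrictive_under = discordance_percent(36, 142)  # GOLD, restrictive pattern

print(gold_airflow_over, ats_airflow_over, gold_restrictive_under)
```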
DISCUSSION
Using spirometric data on white participants aged 65 to 80 years from NHANES-III and CHS, we found that LMS-defined airflow limitation and restrictive pattern were associated with a statistically significant increased risk of death and likelihood of respiratory symptoms. These results validate the use of LMS-derived z scores as a basis for evaluating respiratory impairment in older persons. In addition, relative to LMS designations, our findings suggest that current spirometric criteria published by GOLD and the ATS/ERS may substantially misclassify both airflow limitation and restrictive pattern.
Evaluating respiratory impairment based on the LMS method has a strong mathematical and clinical rationale,8,9 by accounting for age-related changes in pulmonary function, including the increased variability and skewness in spirometric reference data.8,9 In the current context, z score-based diagnostic thresholds for respiratory impairment were also associated with important clinical outcomes. All-cause mortality is an objective and definitive clinical outcome that is resistant to miscoding and has been the primary end point in landmark studies of oxygen therapy.30 In addition, respiratory symptoms are the most distressing feature of chronic respiratory disease and can lead to disability and increased health care utilization.30,31 Although respiratory symptoms are somewhat subjective, our use of them recognizes their importance in clinical decisions, as is evident in guidelines published by GOLD, ATS/ERS, and the American College of Physicians.5,32,33
Our use of LMS-derived z scores also has precedent in clinical practice, including the use of z scores in bone mineral density testing and the common application of the LMS method to construct growth charts.8,9,18 Regarding spirometric measures, diagnostic interpretation is based on a single threshold set at the LLN, corresponding to an LMS-derived z score of −1.64. Using this threshold, a reduced FEV1/FVC establishes airflow limitation because the impairment indicates a greater decline in timed lung volume (FEV1) than untimed lung volume (FVC). Conversely, the combination of a normal FEV1/FVC but reduced FVC establishes restrictive pattern because the impairment indicates similar declines in both the FEV1 and FVC.
Our results also quantify the frequency of discordant designations of respiratory impairment in older persons by current spirometric criteria relative to the LMS approach. For example, relative to LMS criteria, GOLD and the ATS/ERS identify substantially more cases of airflow limitation and restrictive pattern; GOLD also fails to identify a substantial proportion of LMS-defined cases of restrictive pattern (Table 5). The potential misclassification of respiratory impairment by current spirometric criteria is arguably attributable to methodological limitations.8-13 In particular, GOLD thresholds do not account for age-related changes in pulmonary function, and the calculation of the ATS/ERS-LLN5 does not adequately account for age-related variability and skewness in spirometric reference data.8-13
In the present study, because self-reported asthma was an exclusion criterion and rates of smoking (former and current) were high (56.1% in NHANES-III and 55.9% in CHS), participants who had airflow limitation likely had chronic obstructive pulmonary disease (COPD). Consequently, our results may have clinical implications for the evaluation and management of COPD in older persons.5,32,33 Specifically, because LMS-defined airflow limitation may be less likely to misclassify "disease," future work should evaluate prospectively whether this new approach accurately predicts health care utilization related to COPD, including medication use and hospitalizations.
The LMS classification of restrictive respiratory impairment also has clinical implications. For example, the current standard of practice is to initially evaluate lung function based on spirometry.6 If a restrictive pattern is present (see Methods section), the results are then confirmed by a reduced total lung capacity (TLC), as measured by plethysmography or helium dilution.6 A dilemma arises, however, in that spirometric definitions of a restrictive pattern when based on GOLD or the ATS/ERS do not accurately predict a reduced TLC.6 For example, 1 study suggested that a reduced TLC can occur at an FEV1/FVC well below the GOLD-defined threshold of 0.70 (as low as 0.55), whereas another study has shown that only 58% of participants who have an ATS-defined spirometric restrictive pattern also had a reduced TLC.34,35 Future work should evaluate whether LMS-defined spirometric restrictive pattern predicts a reduced TLC more accurately, representing a potential new clinical and epidemiological strategy for the initial evaluation of restrictive respiratory diseases.
We recognize potential limitations to our study. First, spirometry in NHANES-III and CHS was not obtained specifically after a bronchodilator. Although persons with self-reported asthma were excluded from our study samples, the absence of information on "reversibility" may have led to misclassification of airflow limitation as COPD (ie, study participants may have underreported asthma).36 Yet, postbronchodilator values may have had a minimal effect on our results because study participants had high rates of smoking (conferring less reversible airway pathology) and because reversibility is neither a sufficient criterion to exclude COPD nor an independent predictor of mortality.37,38 Second, NHANES-III and CHS used spirometers that may have differed in accuracy.39 Nonetheless, differences in accuracy were likely small, as the spirometers met ATS accuracy requirements.24,25 Third, our results were generated for older white persons, using the only available data, but nonetheless limiting generalizability. Prior work has demonstrated racial differences in pulmonary function,40 and the variability and skewness in reference data are less pronounced in younger versus older adults.8 Lastly, the study populations were assembled in the late 1980s and early 1990s and followed through 2000 to 2002, raising the issue of "timeliness" of data; however, pulmonary function physiology is unlikely to have changed during the intervening period. In view of these issues, future work should confirm the clinical validity of LMS-defined respiratory impairment in other racial, ethnic, and (younger) age groups, as well as in more contemporary cohorts or when using postbronchodilator spirometry.
In conclusion, among older white persons, LMS-defined airflow limitation and restrictive pattern were significantly associated with mortality and respiratory symptoms. These results suggest that an approach based on LMS-derived spirometric z scores has the potential to provide an age-appropriate and clinically valid basis for evaluating respiratory impairment.
APPENDIX
The LMS z scores are calculated as $z = \dfrac{(y/M)^{L} - 1}{L \times S}$, where $y$ is the measured value, $M$ is the predicted median (Mu), $S$ is the coefficient of variation (Sigma), and $L$ is the skewness (Lambda); a z score of −1.64 corresponds to the fifth percentile of the distribution of z scores (LMS-LLN5).8,9 The LMS prediction equations were used to calculate values for the median, coefficient of variation, and skewness; cubic splines for age were obtained from tables based on 4 pooled reference samples, with ages ranging from 4 to 80 years.8,9
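The z score formula above can be sketched in a few lines, together with its inversion to obtain the measured value that sits exactly at the LMS-LLN5. This is a sketch under stated assumptions rather than the study's implementation: the function names are hypothetical, the example values of Mu, Sigma, and Lambda are invented for illustration and are not taken from the published reference tables, and the logarithmic form used when Lambda equals 0 is the usual limiting case of the LMS transformation, which the Appendix does not state.

```python
import math

# Illustrative sketch; names and example parameter values are hypothetical.

def lms_z(measured: float, mu: float, sigma: float, lam: float) -> float:
    """z = ((measured / Mu) ** Lambda - 1) / (Lambda * Sigma)."""
    if lam == 0:
        # Limiting case as Lambda approaches 0 (assumption, not in the Appendix)
        return math.log(measured / mu) / sigma
    return ((measured / mu) ** lam - 1.0) / (lam * sigma)


def lms_lln(mu: float, sigma: float, lam: float, z: float = -1.64) -> float:
    """Invert the z score formula to find the value at a given z (default LLN5)."""
    if lam == 0:
        return mu * math.exp(sigma * z)
    base = 1.0 + lam * sigma * z
    if base <= 0:
        raise ValueError("1 + Lambda * Sigma * z must be positive")
    return mu * base ** (1.0 / lam)


# Invented parameters for a single hypothetical participant (FVC in liters)
mu, sigma, lam = 3.0, 0.15, 1.2
lln = lms_lln(mu, sigma, lam)
print(round(lln, 3), round(lms_z(lln, mu, sigma, lam), 2))
```

In practice Mu, Sigma, and Lambda would be looked up from the published prediction equations and spline tables for the participant's age, height, and sex, after which a measured value below the computed LLN is equivalent to a z score below −1.64.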