What Are Your Values

Vol. 15 • Issue 14 • Page 8
Reference Points Define Pulmonary Function Testing. The Challenge Is Selecting the Right Ones.

Success in the pulmonary function lab depends on putting things in context.

“Interpretation is always by comparison,” Robert Crapo, MD, reminded pulmonologists at the 2002 meeting of the National Association for Medical Direction of Respiratory Care in Monterey, Calif., in February.

For example, look at this pulmonary lab patient report:

Date: Nov. 3, 2002

FVC = 5.19 liters

FEV1 = 3.60 liters

It lacks many vital clues. Is the patient a man or woman? Short or tall? Caucasian, Hispanic or another race? What age? With no reference data for comparison, these results sit lonely and useless in a vacuum. It’s a vacuum all PFT techs, like nature, should abhor.

PFTs are not useful without reference data, but choosing valid, suitable reference data is an exercise fraught with uncertainty. Because the subject inevitably involves ethnic and racial differences among humans, Crapo said he “gets a panic attack” whenever he thinks about presenting on it in public.

Small wonder. Trying to establish an exact science while coping with inexact, fluctuating percentages, politically sensitive racial and ethnic classifications, vague and shifting boundaries, and economic and geographical confounders is to open a Pandora’s Body Box of trouble.

“Normal” is a relative concept.


“Reference values for a given patient population may change from hospital to hospital, from region to region, from state to state,” observed Bonnie McQuaid, BS, RRT, RPFT. Values obtained at sea level in Lenexa, Kan., could diverge from values obtained 8,000 feet above sea level in Vail, Colo.

“There’s lots of pollen around here, and maybe not as much elsewhere, so we must double-check for that,” said McQuaid, a neonatal pediatric specialist at Fairfax Hospital in Falls Church, Va. “Reference values must suit the profile of your overall patient population.”

To test the validity of her reference values, McQuaid takes lung measurements of her co-workers and compares their numbers to her reference numbers.

Researchers have conducted thousands of PFTs, running the gamut of measures from FEV1 to diffusing capacity in an attempt to construct viable sets of reference values. In her lab, McQuaid uses the Morris-Polgar lung function reference values for pediatric patients and, for adults, Knudson’s values published in 1983 (ARRD 1983;127:725).

“Dr. Knudson took X number of patients, did pulmonary function tests of them, then took the normal range,” she explained. “He said if you are 5′ 5″, weigh 120 pounds and you’re female, this is what you should be. It’s based on height, weight, age, sex and ethnic identity. You plug those factors into the PFT machine and come up with what a patient should be doing for that profile.”


“The ideal way to do reference values is to do your own,” resumed Crapo, a professor of medicine at the University of Utah.

Create a database of values by testing large numbers of healthy subjects from the population you serve. Or compare a patient’s observed values against values obtained from people with diseases you suspect the patient has, such as asthma or COPD. Or match a patient’s results to his or her own previous values. If you test an individual several times, you establish baseline values that include a measure of the variability within that individual’s lung function. Intra-individual variability is small compared to inter-individual variability.

“You’re scraping off the major part of the variability and getting it smaller so that your signal can be more easily identified,” Crapo said.
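Comparing a patient against his or her own baseline can be sketched in a few lines of Python. This is only an illustration: the function names and the FEV1 history are hypothetical and do not come from Crapo's talk or any published method.

```python
from statistics import mean, stdev


def baseline_summary(previous_values):
    """Return (mean, SD, coefficient of variation in percent) of prior results."""
    if len(previous_values) < 2:
        raise ValueError("need at least two prior measurements")
    m = mean(previous_values)
    sd = stdev(previous_values)  # sample standard deviation
    return m, sd, 100.0 * sd / m


def deviation_from_baseline(new_value, previous_values):
    """Express a new result in units of the patient's own SD."""
    m, sd, _ = baseline_summary(previous_values)
    if sd == 0:
        raise ValueError("prior measurements show no variability")
    return (new_value - m) / sd


# Hypothetical FEV1 history in liters for one patient (illustrative only).
history = [3.60, 3.55, 3.65, 3.60]
m, sd, cv = baseline_summary(history)
print(f"baseline {m:.2f} L, SD {sd:.3f} L, CV {cv:.1f}%")
print(f"3.40 L is {deviation_from_baseline(3.40, history):.1f} SD from baseline")
```

Because the patient's own spread is small, a drop that would vanish inside a population range shows up as a large deviation from baseline.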

Remember, too, that reference data are not “normal” data. The word “normal” implies health. “Abnormal” implies disease. These two words may lead to incorrect inferences. Avoid them. Instead use the terms “inside” or “outside” the reference range.

For example, a 60-year-old male on the typical American “Big Mac” diet might have a cholesterol level of 250. That number would appear normal when compared to the values obtained from many other 60-year-old American males, but it is still not a healthy number, given what we know about cholesterol and heart disease.


Traditional reference values are average values from a representative sample of healthy ambulatory subjects.

Consider this report:

Date: Nov. 3, 2002

Name: John Jones

Gender: Male

Age: 45

Height: 6’3″

History: 25 pack years of smoking, no symptoms

FVC = 5.19 liters

FEV1 = 3.60 liters

FEV1/FVC = 0.694

While more informative than the prior example, it still doesn’t place the patient within a proper context. Compared to reference values for 45-year-old Caucasian men, his values are lower than average for his age and height. But what are the lower limits? Is he inside or outside the entire distribution of upper and lower levels? That’s the information you need.
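The arithmetic behind the report, and the question of where the result sits, can be sketched in Python. The ratio comes straight from the report. The reference limits passed to `position` are placeholders chosen for illustration, not values from any published equation.

```python
def fev1_fvc_ratio(fev1_liters, fvc_liters):
    """Return FEV1/FVC as a fraction."""
    if fvc_liters <= 0:
        raise ValueError("FVC must be positive")
    return fev1_liters / fvc_liters


def position(value, lower_limit, upper_limit):
    """Say where a value sits relative to a reference range supplied by the lab."""
    if lower_limit > upper_limit:
        raise ValueError("lower limit must not exceed upper limit")
    if value < lower_limit:
        return "below range"
    if value > upper_limit:
        return "above range"
    return "inside range"


ratio = fev1_fvc_ratio(3.60, 5.19)
print(f"FEV1/FVC = {ratio:.3f}")  # prints "FEV1/FVC = 0.694"

# Two placeholder lower limits (hypothetical) give two different answers.
print(position(ratio, 0.68, 0.85))  # prints "inside range"
print(position(ratio, 0.71, 0.85))  # prints "below range"
```

The same measured ratio lands on either side of the line depending on which limits are plugged in, which is why the choice of reference equation matters.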

Problem is, different doctors use different equations to determine upper and lower lung function limits. “It takes some serious consideration and selection to get this right and minimize your chances for error,” Crapo stressed.

Compounding matters, a measured FEV1 of, say, 3.8 L should not be assumed to be the true value. Measurement bias and error are common. Nor is the value constant; it can change from day to day, or even slightly within the same day.


The patient you are examining should resemble reference individuals in all respects other than those under investigation, i.e., the patient’s illness or symptoms. If you have validated your instruments and followed procedural standards, you have established reasonable, technical comparability between your tests and your reference values, Crapo said. Now comes the next hurdle: confronting biological variability among individuals.

According to one researcher, categories of race or ethnic group are rarely well defined in scientific papers and studies, and people are often arbitrarily allocated to these groups. Furthermore, the term “race” is used interchangeably with ethnicity, and differences are often given biological explanations when the variable may be socially or politically determined.2

“The term ‘race’ inherently implies biological variability, and there are problems with that,” Crapo said. “No race has a discrete package of genetic traits. There is more genetic variation within than among races. The genes associated with morphological features such as skin color are few and not associated with the genes for disease. Race may be more useful for social than biological explanations of variations in disease prevalence.”


Ethnicity is another problematic term with a complex, imprecise definition, encompassing shared origins, social background, culture and traditions. In one 2-year study, a third of the subject population changed its ethnic group in the second year, Crapo pointed out. What’s more, ethnic, racial and cultural groupings are largely confounded by covariates such as socioeconomic status and education.

Crapo studied spirometric values in healthy Hispanic Americans, collecting family background information from 259 healthy men and women ages 20-80 years to create a composite “Hispanic” study participant. The effort revealed an enormously complex genetic background: 42 percent Spanish European, 18 percent unknown, 15 percent North American Indian, 10 percent South American Indian, 8 percent Central American Indian, 3 percent French European and 4 percent other European.3

Crapo warned against ethnocentricity, the inherent tendency to view one’s own culture as the standard against which others are judged. Ethnocentricity can taint all aspects of PFT research projects, including their design, aims, methods, interpretations and results.

“You’re starting to get a flavor for this as a problem, I hope,” he told the group.


Creating an individual database of reference values is expensive and difficult to execute, so most labs use published reference data and transfer them to their labs.

Choose these reference values carefully, Crapo advised, and take care how you use them.

One early paper on interpretive strategies concluded it was acceptable to use a constant fraction (say, 88 percent) of the published predicted FEV1 for Caucasians as the predicted for African-Americans. The actual adjustment, or even whether an adjustment is warranted, generates debate.

“When we put in height, weight, age and so on, and then their race, the machine automatically adjusts for (race),” McQuaid said of her lab at Fairfax Hospital. “A ballpark FEV1 adjustment for race is 10 percent. Some say it should be 15 percent. Some say it doesn’t make any difference. There is a lot of controversy. We adjust for it automatically. I would hope most labs do.”

Researchers conducting the National Health and Nutrition Examination Survey (NHANES III) have further explored racial differences. They performed spirometry on a random sampling of the U.S. population ages 8 to 80 years from 1988 to 1994, with an over-sampling of African- and Mexican-Americans. FEV1 values obtained from 7,429 individuals showed that the differences in average values among these ethnic groups are not constant.4

In other words, this study suggests one cannot use a fixed fraction to adjust for ethnic group. “That clearly won’t work uniformly and should be abandoned,” Crapo said.
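A little arithmetic shows why a single fraction cannot fit. The Python sketch below computes the ratio between two groups' average FEV1 within each age band. The group averages are made up for illustration and are not NHANES III results; the function names are hypothetical as well.

```python
def ratios_by_stratum(reference_means, comparison_means):
    """Ratio of comparison-group to reference-group average FEV1, per stratum."""
    if reference_means.keys() != comparison_means.keys():
        raise ValueError("both groups need the same strata")
    return {k: comparison_means[k] / reference_means[k] for k in reference_means}


def worst_fit(ratios, fixed_fraction):
    """Largest absolute gap between any stratum ratio and the fixed fraction."""
    return max(abs(r - fixed_fraction) for r in ratios.values())


# Made-up average FEV1 values in liters by age band (illustrative only).
group_a = {"20-39": 4.00, "40-59": 3.50, "60-80": 2.80}
group_b = {"20-39": 3.60, "40-59": 3.01, "60-80": 2.52}

ratios = ratios_by_stratum(group_a, group_b)
for band, r in ratios.items():
    print(f"{band}: ratio {r:.2f}")
print(f"largest miss for a fixed 0.88: {worst_fit(ratios, 0.88):.2f}")
```

Whenever the per-band ratios differ, any one constant is too high for some bands and too low for others.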

As interracial couples become more commonplace, perhaps race will diminish as a factor in PFTs, McQuaid speculated.


A subsequent researcher analyzing the NHANES survey data found that anthropometric and socioeconomic factors explained some racial differences in FEV1. For example, among healthy non-smoking men and women, factors such as age and standing height, age and sitting height, poverty index and BMI accounted for about half the differences in average lung function observed between minorities and non-Hispanic whites.5

Of course, that leaves fully half of the differences unexplained.

“Ethnic differences in lung function exist,” Crapo said. “The reasons remain unclear. Socioeconomic, environmental and dietary factors play a large role. The contribution of genetic differences appears to be small. I think you should consider ethnic differences in lung function tests. Unfortunately, how exactly to go about that remains unclear.”

He had a few suggestions, however. First, allow subjects to categorize themselves ethnically, as was done in the NHANES survey. “You should not make any attempt at categorizing subjects by looking at their morphological features,” he stressed.

The above advice goes doubly when testing people of mixed race. Using the NHANES equations and allowing the individuals to classify themselves “will get you the closest fit but it still won’t be perfect,” he conceded.


As a general rule, one should feel confident about values that fall well inside or well outside the boundaries but should interpret borderline values with caution.

For example, researchers have traditionally drawn a statistical line at some point on the distribution of values of subjects free of disease. Those on one side of the line are judged normal; those on the other are abnormal. The line is usually drawn at the extreme lower limit of “normal,” that is, at about the fifth percentile (.05) of the distribution.

“It is inherently a flawed concept, but it is all we have,” Crapo acknowledged. “But you should be very cautious about what you say about people sitting near that boundary” because the boundary is arbitrary and has variability.
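Drawing the line and then treating nearby values with caution can be sketched in Python. The sketch reads the ".05" line as the fifth percentile of results from disease-free subjects, which is an assumption. The caution margin, the function names and the reference sample are hypothetical.

```python
from statistics import quantiles


def lower_limit_5th_percentile(healthy_values):
    """Fifth percentile of results from disease-free reference subjects."""
    if len(healthy_values) < 2:
        raise ValueError("need at least two reference values")
    # n=20 yields 19 cut points; the first one is the 5th percentile.
    return quantiles(healthy_values, n=20, method="inclusive")[0]


def classify(value, lower_limit, caution_margin):
    """Label a result, reserving 'borderline' for values near the boundary."""
    if caution_margin < 0:
        raise ValueError("caution margin must not be negative")
    if value < lower_limit - caution_margin:
        return "outside"
    if value > lower_limit + caution_margin:
        return "inside"
    return "borderline"


# Hypothetical reference sample: FEV1 from 3.0 L to 5.0 L in 0.1 L steps.
sample = [3.0 + 0.1 * i for i in range(21)]
lln = lower_limit_5th_percentile(sample)
for fev1 in (3.60, 3.15, 2.70):
    print(f"{fev1:.2f} L -> {classify(fev1, lln, caution_margin=0.2)}")
```

How wide to make the caution margin is a judgment for each lab; the sketch only makes the borderline zone explicit rather than forcing every value into one of two bins.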

Likewise, the traditional “normal” FEV1 range of plus-or-minus 20 percent was created based on measures of average males of average height. “It worked well, except the further away you get from the average height and age, the more it fails,” he said. “That arbitrary categorization scheme should not be used except when you are comfortable that patients fall well within or well outside it.”

Under some schemes, patients with FEV1/FVC ratios below 70 percent fall outside the normal range and are considered to have airway obstruction. But NHANES data demonstrate the older a patient gets, the more false positives you get if you hold to this arbitrary, fixed ratio. And it yields false negatives in younger people for the same reason.
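The disagreement between a fixed cutoff and an age-specific lower limit can be made concrete with a short Python sketch. The fixed 0.70 cutoff is the one discussed in the text. The age-specific limits in the example are hypothetical placeholders, not NHANES values, and the function name is illustrative.

```python
def compare_cutoffs(ratio, age_specific_lower_limit, fixed_cutoff=0.70):
    """Report whether a fixed cutoff and an age-specific limit agree."""
    below_fixed = ratio < fixed_cutoff
    below_lln = ratio < age_specific_lower_limit
    if below_fixed and not below_lln:
        return "fixed cutoff flags obstruction; age-specific limit does not"
    if below_lln and not below_fixed:
        return "age-specific limit flags obstruction; fixed cutoff does not"
    return "both agree"


# Hypothetical older patient: ratio 0.68, placeholder age-specific limit 0.65.
print(compare_cutoffs(0.68, 0.65))
# Hypothetical younger patient: ratio 0.72, placeholder age-specific limit 0.75.
print(compare_cutoffs(0.72, 0.75))
```

The first case corresponds to the false positives described for older patients, the second to the false negatives in younger ones.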


As if all this didn’t serve to confirm that PFTs require the wisdom of Solomon to interpret, Crapo offered some cautionary words about interpretative schemes.

The prevailing scheme for diagnosing small airways disease is this: a normal FEV1/FVC ratio plus a low FEF25-75% equals small airways disease.

Some background: researchers looking for the early markers of lung disease in COPD latched on to FEF25-75%, a tool that focuses on the middle 50 percent of a forced expiration. They found it could distinguish differences between the small airways of healthy non-smokers and asymptomatic smokers, Crapo explained. This difference was reinforced by other pathological data emerging at the time and by the correct belief that smoking causes small airways disease.

However, a significant overlap exists in the FEF25-75% scores of healthy non-smokers and asymptomatic smokers, such that “when you apply the test to an individual in your lab, it becomes almost impossible to properly classify that patient,” Crapo said.

The problem is compounded by the measure’s plus-or-minus 20 percent range. Rather than 80 percent of predicted being the lower limit of normal (LLN), experts now believe the real LLN for 40-year-old males is about 40 percent of predicted, because of the test’s large variability.

“If you approach that scheme looking at a normal ratio and a low 25-75 percent and use 80 percent of predicted, you’re probably going to be calling small airway disease falsely in 35 percent of healthy subjects,” he ventured.
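How much the choice of lower limit changes the verdict can be shown with a small Python sketch of the scheme. The 80 percent and 40 percent limits are the ones mentioned in the text. The observed and predicted flows are hypothetical, as are the function names.

```python
def percent_of_predicted(observed, predicted):
    """Return an observed value as a percentage of its predicted value."""
    if predicted <= 0:
        raise ValueError("predicted value must be positive")
    return 100.0 * observed / predicted


def suggests_small_airways_disease(ratio_in_range, fef_percent_predicted,
                                   lower_limit_percent):
    """Apply the scheme: FEV1/FVC inside its range plus a low FEF25-75%."""
    return ratio_in_range and fef_percent_predicted < lower_limit_percent


# Hypothetical subject: FEF25-75% of 2.4 L/s against a predicted 4.0 L/s.
pct = percent_of_predicted(2.4, 4.0)
print(f"FEF25-75% is {pct:.0f} percent of predicted")
print("80 percent limit:", suggests_small_airways_disease(True, pct, 80.0))
print("40 percent limit:", suggests_small_airways_disease(True, pct, 40.0))
```

A subject at 60 percent of predicted is flagged under the 80 percent limit but not under the 40 percent limit, so the label depends entirely on which limit the lab adopts.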

He summarized: Pay attention to the accuracy and precision of your measurements. Choose reference values wisely. Select the most up-to-date lower limits of normal (LLN). And don’t allow your interpretative scheme to lead you into mistakes.


1. ATS Statement: Lung function testing: selection of reference values and interpretative strategies. Am J Respir Crit Care Med. (1991;144:1202-1218).

2. Crowcroft N, McKenzie K, et al. Race, ethnicity, culture and science. BMJ. (1994;309:286-287).

3. Crapo R, Jensen R, et al. Normal spirometric values in healthy Hispanic Americans. Chest. (1990;98:1435).

4. Hankinson J, et al. The National Health and Nutrition Examination Survey. Am J Respir Crit Care Med. (1999;159:179).

5. Harik-Khan R, et al. The effect of anthropometric and socioeconomic factors on the racial difference in lung function. Am J Respir Crit Care Med. (2001;164:1647).

You can reach Michael Gibbons at mgibbons@merion.com.
