Third National Health and Nutrition Examination Survey
(NHANES III), 1988-94
Catalog Number 76700
NHANES III Individual Foods Data File from the Dietary Recall
December 1996
Table of Contents
Introduction
Guidelines for Data Users
Survey Description
Sample Design and Analysis Guidelines
Data Preparation and Processing Procedures
General References
NHANES III Individual Foods Data
   General Information
   Data File Index
   Data File Item Descriptions, Codes, Counts, and Notes
   References
   SAS Code to Merge Look-up Tables
   Food Preparation Codes Look-up Table
   Brand Names Look-up Table
   Code Book Look-up Table
   ID Codes Look-up Table
   Example Merged Record Tables
Introduction
The National Center for Health Statistics (NCHS) of the Centers for Disease
Control and Prevention (CDC) collects, analyzes, and disseminates data on
the health status of U.S. residents. The results of surveys, analyses, and
studies are made known through a number of data release mechanisms
including publications, mainframe computer data files, CD-ROMs (Search and
Retrieval Software, Statistical Export and Tabulation System (SETS)), and the
Internet (http://www.cdc.gov/nchswww/nchshome.htm).
The National Health and Nutrition Examination Survey (NHANES) is a periodic
survey conducted by NCHS. The third National Health and Nutrition
Examination Survey (NHANES III), conducted from 1988 through 1994, was the
seventh in a series of these surveys based on a complex, multi-stage sample
plan. It was designed to provide national estimates of the health and
nutritional status of the United States' civilian, noninstitutionalized
population aged two months and older.
Data from NHANES III are being released in five public release data files:
NHANES III Household Adult Data File (Catalog Number 77560)
NHANES III Household Youth Data File (Catalog Number 77550)
NHANES III Examination Data File (Catalog Number 76200)
NHANES III Laboratory Data File (Catalog Number 76300)
NHANES III Dietary Recall Data Files (Catalog Number 76700)
A table showing the location of the interview and examination components in
the five NHANES III public release data files follows.
Location of the interview and examination components in the five NHANES III
public release data files
Data File
Topic | HA | HY | EXAM | LAB | DIET |
-----------------------------------------+-----+-----+-------+-----+------+
Sample weights | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Age/race/sex | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Ethnic background | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Household composition | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Individual characteristics | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Health insurance | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Family background | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Occupation of family head | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Housing characteristics | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Family characteristics | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Orientation | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Health services | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Selected health conditions | X | X | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Diabetes questions | X | . | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
High blood pressure and | X | . | . | . | . |
cholesterol questions | | | | | |
-----------------------------------------+-----+-----+-------+-----+------+
Cardiovascular disease questions | X | . | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Musculoskeletal conditions | X | . | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Physical functioning questions | X | . | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Gallbladder disease questions | X | . | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Kidney conditions | X | . | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Respiratory and allergy questions | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Diet questions | X | . | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Food frequency | X | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Vision questions | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Hearing questions | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Dental care and status | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Tobacco | X | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Occupation | X | . | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Language usage | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Exercise | X | . | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Social support/residence | X | . | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Vitamin/mineral/medicine usage | X | X | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Blood pressure measurement | X | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Birth | . | X | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Infant feeding practices/diet | . | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Motor and social development | . | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Functional impairment | X | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
School attendance | . | X | . | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Cognitive function | . | X | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Alcohol and drug use | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Reproductive health | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Diagnostic interview schedule | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Activity | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Physician's examination | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Height and weight | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Body measurements | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Dental examination | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Allergy skin test | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Audiometry | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Tympanometry | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
WISC and WRAT | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Spirometry | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Bone densitometry | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Gallbladder ultrasonography | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Central nervous system | . | . | X | . | . |
function evaluation | | | | | |
-----------------------------------------+-----+-----+-------+-----+------+
Fundus photography | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Physical function evaluation | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Fasting questions | . | . | . | X | . |
-----------------------------------------+-----+-----+-------+-----+------+
Laboratory tests on blood and urine | . | . | . | X | . |
-----------------------------------------+-----+-----+-------+-----+------+
Total nutrient intakes | . | . | X | . | . |
-----------------------------------------+-----+-----+-------+-----+------+
Individual foods | . | . | . | . | X |
-----------------------------------------+-----+-----+-------+-----+------+
Combination foods | . | . | . | . | X |
-----------------------------------------+-----+-----+-------+-----+------+
Ingredients | . | . | . | . | X |
-----------------------------------------+-----+-----+-------+-----+------+
Data File Definitions
HA - Household Adult Data File
HY - Household Youth Data File
EXAM - Examination Data File
LAB - Laboratory Data File
DIET - Dietary Recall Data Files
This document contains the documentation for the NHANES III Individual
Foods Data File as well as a general overview of the survey and the use of
the data files. The general overview has five sections. The first,
"Guidelines for Data Users," contains important information about the use
of the data files. The second, "Survey Description," is a brief overview of
the survey plan and operation. The third, "Sample Design and Analysis
Guidelines," describes some technical aspects of the sampling plan and
discusses analytic issues particular to data from complex sample surveys.
The fourth, "Data Preparation and Processing Procedures," describes the
editing conventions and the codes used to represent the data. The fifth and
last, "General References," is a reference list for the survey overview
sections of the document.
Public Use Data Files for the third National Health and Nutrition
Examination Survey will also be available from the National Technical
Information Service (NTIS). A list of NCHS public use data tapes available
for purchase from NTIS may be obtained from the Data Dissemination Branch at
NCHS. Information about a bibliography (on disk) of journal articles
citing data from all the NHANES, and about the availability of NHANES III
data in CD-ROM/SETS software format, can be obtained from the Data
Dissemination Branch (301-436-8500) or by writing to:
Data Dissemination Branch
National Center for Health Statistics
Room 1018
6525 Belcrest Road
Hyattsville, Maryland 20782
NTIS can be contacted at:
NTIS - Computer Products Office
5285 Port Royal Road
Springfield, Virginia 22161
(703) 487-4807
Copies of all NHANES III questionnaires and data collection forms are
included in the Plan and Operation of the Third National Health and
Nutrition Examination Survey, 1988-94 (NCHS, 1994; U.S. DHHS, 1996). This
publication, along with detailed information on NHANES procedures,
interviewing, data collection, quality control techniques, survey design,
nonresponse, and sample weighting can be found on the NHANES III Reference
Manuals and Reports CD-ROM (U.S. DHHS, 1996). Information on how to order
this CD-ROM is available from the Data Dissemination Branch at NCHS at the
address and telephone number given above.
GUIDELINES FOR DATA USERS
Please refer to the following important information before analyzing data.
NHANES III Background Documents
o The Plan and Operation of the Third National Health and Nutrition
Examination Survey, 1988-94 (NCHS, 1994; U.S. DHHS, 1996) provides an
overview of the survey and includes copies of the survey forms.
o The sample design, nonresponse, and analytic guidelines documents on
the NHANES III Reference Manuals and Reports CD-ROM (U.S. DHHS, 1996)
discuss the reasons that sample weights and the complex survey design
should be taken into account when conducting any analysis.
o Instruction manuals, laboratory procedures, and other NHANES III
reference manuals on the NHANES III Reference Manuals and Reports
CD-ROM (U.S. DHHS, 1996) are also available for further information on
the details of the survey.
Analytic Data Set Preparation
o Most NHANES III survey design and demographic variables are found only
on the Adult and Youth Household Data Files. In preparing a data set
for analysis, other data files must be merged with either or both of
these files to obtain many important analytic variables.
o All of the NHANES III public use data files are linked with the common
survey participant identification number (SEQN). Merging information
from multiple NHANES III data files using this variable ensures that
the appropriate information for each survey participant is linked
correctly.
o NHANES III public use data files do not have the same number of
records on each file. The Household Questionnaire Files (divided into
two files, Adult and Youth) contain more records than the Examination
Data File because not everyone who was interviewed completed the
examination. The Laboratory Data File contains data only for persons
aged one year and older. The Individual Foods Data File based on the
dietary recall has multiple records for each person rather than the one
record per sample person contained in the other data files.
o For each data file, SAS program code with standard variable names and
labels is provided as separate text files on the CD-ROM that contains
the data files. This SAS program code can be used to create a SAS
data set from the data file.
o Modifications were made to items in the questionnaires, laboratory,
and examination components over the course of the survey; as a result,
data may not be available for certain variables for the full six years.
In addition, variables may differ by phase since some changes were
implemented between phases. Users are encouraged to read the Notes
sections of this document carefully for information about changes.
o Extremely high and low values have been verified whenever possible,
and numerous consistency checks have been performed. Nonetheless, users
should examine the range and frequency of values before analyzing
data.
o Some data were not ready for release at the time of this publication
due to continued processing of the data or analysis of laboratory
specimens. A listing of those data is available in the general
information section of each data file.
o Confidential and administrative data are not being released to the
public. Additionally, some variables have been recoded to help
protect the confidentiality of the survey participants. For example,
all age-related variables were recoded to 90+ years for persons who were
90 years of age and older.
o Some variable names may differ from those used in the Phase 1 NHANES
III Provisional Data Release and some variables included in the Phase 1
provisional release may not appear on these files.
o Although the data files have been edited carefully, errors may be
detected. Please notify NCHS staff (301-436-8500) of any errors in
the data file or the documentation.
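The merging guidance above can be sketched in Python. This is a minimal
illustration under stated assumptions, not NCHS-supplied code: only SEQN
comes from this document, while the other field names (AGE, SEX, FOOD,
GRAMS) and all values are hypothetical. It shows the one-to-many link
between a one-record-per-person file and the Individual Foods Data File.

```python
def merge_on_seqn(person_rows, food_rows):
    """Attach each person's household-file fields to every one of that
    person's food records, matching on SEQN.

    Returns the merged rows and the SEQN values of food records that had
    no matching person row, so they can be inspected rather than lost.
    """
    by_seqn = {row["SEQN"]: row for row in person_rows}
    merged, unmatched = [], []
    for food in food_rows:
        person = by_seqn.get(food["SEQN"])
        if person is None:
            unmatched.append(food["SEQN"])
            continue
        merged.append({**person, **food})
    return merged, unmatched


# Hypothetical records: one row per person, several food rows per person.
persons = [
    {"SEQN": 3, "AGE": 34, "SEX": 2},
    {"SEQN": 4, "AGE": 61, "SEX": 1},
]
foods = [
    {"SEQN": 3, "FOOD": "milk", "GRAMS": 244.0},
    {"SEQN": 3, "FOOD": "bread", "GRAMS": 52.0},
    {"SEQN": 4, "FOOD": "rice", "GRAMS": 158.0},
    {"SEQN": 999, "FOOD": "apple", "GRAMS": 138.0},
]

merged, unmatched = merge_on_seqn(persons, foods)
print(len(merged), unmatched)
```

Keeping the unmatched identifiers visible is a design choice of this
sketch; it makes it easy to confirm that every food record found its
demographic and design variables before analysis begins.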
Analytic Considerations
o NHANES III (1988-94) was designed so that the survey's first three
years, 1988-91, its last three years, 1991-94, and the entire six
years were national probability samples. Analysts are encouraged to use
all six years of survey results.
o Sample weights are available for analyzing NHANES III data. One of
the following three sample weights will be appropriate for nearly all
analyses: interviewed sample final weight (WTPFQX6), examined sample
final weight (WTPFEX6), and mobile examination center (MEC)- and
home-examined sample final weight (WTPFHX6). Choosing which of these
sample weights to use in any analysis depends on the variables being
used. A good rule of thumb is to use "the least common denominator"
approach. In this approach, the user checks the variables of
interest. The variable that was collected on the smallest number of
persons is the "least common denominator," and the sample weight that
applies to that variable is the appropriate one to use for that
analysis. For more detailed information, see the Analytic and Reporting
Guidelines for NHANES III (U.S. DHHS, 1996).
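The "least common denominator" rule can be sketched as follows. This is an
illustration only: the three weight names come from this document, and the
sample counts are the six-year totals reported in this document. Reading
WTPFEX6 as covering MEC-examined persons only is an assumption of this
sketch, and the variable names in the example are hypothetical.

```python
# Six-year sample counts reported in this document for each weight's sample.
WEIGHT_SAMPLE_SIZES = {
    "WTPFQX6": 33994,        # interviewed sample
    "WTPFHX6": 30818 + 493,  # MEC- and home-examined sample
    "WTPFEX6": 30818,        # examined sample (assumed here: MEC-examined)
}


def choose_weight(weight_by_variable):
    """Return the weight tied to the smallest sample among the variables.

    weight_by_variable maps each analysis variable to the sample weight
    that applies to it on its own.
    """
    candidates = set(weight_by_variable.values())
    return min(candidates, key=lambda name: WEIGHT_SAMPLE_SIZES[name])


# Hypothetical analysis mixing an interview item with a MEC-only item.
analysis = {
    "interview_item": "WTPFQX6",
    "mec_only_item": "WTPFEX6",
}
print(choose_weight(analysis))
```

The caller still has to know which sample each variable was collected
on; the function only encodes the final comparison.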
Referencing or Citing NHANES III Data
o In publications, please acknowledge NCHS as the original data source.
For instance, the reference for the NHANES III Laboratory Data File
is:
U.S. Department of Health and Human Services (DHHS). National Center
for Health Statistics. Third National Health and Nutrition
Examination Survey, 1988-1994, NHANES III Laboratory Data File (CD-ROM).
Public Use Data File Documentation Number 76200. Hyattsville, MD.:
Centers for Disease Control and Prevention, 1996. Available from
National Technical Information Service (NTIS), Springfield, VA.
Acrobat. PDF format; includes access software: Adobe Systems, Inc.
Acrobat Reader 2.1.
o Please place the acronym "NHANES III" in the titles or abstracts of
journal articles and other publications in order to facilitate the
retrieval of such materials in bibliographic searches.
SURVEY DESCRIPTION
The third National Health and Nutrition Examination Survey (NHANES III) was
the seventh in a series of large health examination surveys conducted in
the United States beginning in 1960. Three of these surveys, the National
Health Examination Surveys (NHES), were conducted in the 1960's (NCHS, 1965;
NCHS, 1967; NCHS, 1969). In 1970, an expanded nutrition component was added
to provide data with which to assess nutritional status and dietary
practices, and the name was changed to the National Health and Nutrition
Examination Survey (Miller, 1973; Engel, 1978; McDowell, 1981). A special
survey of Hispanic populations in the United States was conducted during
1982-1984 (NCHS, 1985).
The general structure of the NHANES III sample design was similar to that
of the previous NHANES. All of the surveys used complex, multi-stage,
stratified, clustered samples of civilian, noninstitutionalized
populations. NHANES III was the first NHANES without an upper age limit; the
survey covered persons aged two months and older. A home
examination option was employed for the first time in order to obtain
examination data for very young children and for elderly persons who were
unable to visit the mobile examination center (MEC). The home examination
included only a subset of the components used in the full MEC examination
since it would have been difficult to collect some types of data in a home
setting. A detailed description of design specifications and copies of the
data collection forms can be found in the Plan and Operation of the Third
National Health and Nutrition Examination Survey, 1988-1994 (NCHS, 1994; U.S.
DHHS, 1996).
NHANES III was conducted from October 1988 through October 1994 in two
phases, each of which comprised a national probability sample. The first
phase was conducted from October 18, 1988, through October 24, 1991, at 44
locations. The second phase was conducted from September 20, 1991, through
October 15, 1994, at 45 different locations. In NHANES III, 39,695 persons
were selected over the six years; of those, 33,994 (86%) were interviewed
in their homes. All interviewed persons were invited to the MEC for a
medical examination. Seventy-eight percent (30,818) of the selected persons
were examined in the MEC, and an additional 493 persons were given a special,
limited examination in their homes.
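As a check on the percentages quoted above, the rates can be recomputed
from the counts in the paragraph. The short Python sketch below uses only
those counts; the helper name and the combined "any examination" rate are
additions for illustration and are not figures stated in this document.

```python
SELECTED = 39695       # persons selected over the six years
INTERVIEWED = 33994    # interviewed in their homes
MEC_EXAMINED = 30818   # examined in the mobile examination center
HOME_EXAMINED = 493    # given the limited home examination


def percent(part, whole):
    """Percentage of whole represented by part, to one decimal place."""
    return round(100.0 * part / whole, 1)


interview_rate = percent(INTERVIEWED, SELECTED)
mec_rate = percent(MEC_EXAMINED, SELECTED)
any_exam_rate = percent(MEC_EXAMINED + HOME_EXAMINED, SELECTED)

print(interview_rate, mec_rate, any_exam_rate)
```

Rounded to whole percentages, the first two values agree with the 86% and
78% given in the text.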
Data collection began with a household interview. Several questionnaires
were administered in the household: Household Screener Questionnaire,
Family Questionnaire, Household Adult Questionnaire, and Household Youth
Questionnaire.
At the MEC, an examination was performed, and five automated questionnaires
or interviews were administered: MEC Adult Questionnaire, MEC Youth
Questionnaire, MEC Proxy Questionnaire, 24-Hour Dietary Recall, and Dietary
Food Frequency (ages 12-16 years). The health examination component
included a variety of tests and procedures. The examinee's age at the time
of the interview and other factors determined which procedures were
administered. Blood and urine specimens were obtained, and a number of tests
and measurements were performed including body measurements, spirometry,
fundus photography, x-rays, electrocardiography, allergy and glucose
tolerance tests, and ultrasonography. Measurements were taken of bone
density, hearing, and physical, cognitive, and central nervous system
functions. A physician performed a limited standardized medical examination
and a dentist performed a standardized dental examination. While some of the
blood and urine analyses were performed in the MEC laboratory, most analyses
were conducted elsewhere by contract laboratories.
A home examination was conducted for those sample persons aged 2-11 months
and aged 20 years or older who were unable to visit the mobile examination
center. The home examination consisted of an abbreviated version of the
tests and interviews performed in the MEC. Depending on the age of the
sample person, the components included body measurements, blood pressure,
spirometry, venipuncture, physical function evaluation, and a questionnaire
on infant feeding, selected health conditions, cognitive function, tobacco
use, and reproductive history.
SAMPLE DESIGN AND ANALYSIS GUIDELINES
Sample Design
The general structure of the NHANES III sample design is the same as that
of the previous NHANES. Each of these surveys used a stratified, multi-stage
probability design. The major design parameters of the two previous NHANES
and the special Hispanic HANES, as well as NHANES III, have been previously
summarized (Miller, 1973; McDowell, 1981; NCHS, 1985; NCHS, 1994). The
NHANES III sample was designed to be self-weighting within a primary
sampling unit (PSU) for subdomains (age, sex, and race-ethnic groups). While
the sample was fairly close to self-weighting nationally for each of these
subdomain groups, it was not representative of the total population, which
includes institutionalized and non-civilian persons, who were outside the
scope of the survey.
The NHANES III sample represented the total civilian, noninstitutionalized
population, two months of age or over, in the 50 states and the District of
Columbia of the United States. The first stage of the design consisted of
selecting a sample of 81 PSU's that were mostly individual counties. In a
few cases, adjacent counties were combined to keep PSU's above a minimum
population size. The PSU's were stratified and selected with probability
proportional to size (PPS). Thirteen large counties (strata) were chosen
with certainty (probability of one). For operational reasons, these 13
certainty PSU's were divided into 21 survey locations. After the 13
certainty strata were designated, the remaining PSU's in the United States
were grouped into 34 strata, and two PSU's were selected per stratum (68
survey locations). The selection was done with PPS and without
replacement. The NHANES III sample therefore consists of 81 PSU's or 89
locations.
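The PSU and location counts in the paragraph above follow from simple
arithmetic, sketched here in Python. All numbers are taken from the text;
the constant and function names are hypothetical.

```python
CERTAINTY_PSUS = 13        # large counties selected with probability one
CERTAINTY_LOCATIONS = 21   # survey locations carved from the certainty PSU's
NONCERTAINTY_STRATA = 34   # strata formed from the remaining PSU's
PSUS_PER_STRATUM = 2       # PSU's selected per noncertainty stratum


def design_counts():
    """Return (total PSU's, total survey locations) for the design."""
    noncertainty_psus = NONCERTAINTY_STRATA * PSUS_PER_STRATUM
    total_psus = CERTAINTY_PSUS + noncertainty_psus
    # Each noncertainty PSU is a single survey location.
    total_locations = CERTAINTY_LOCATIONS + noncertainty_psus
    return total_psus, total_locations


total_psus, total_locations = design_counts()
print(total_psus, total_locations)
```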
The 89 locations were randomly divided into two groups, one for each phase.
The first group consisted of 44 and the other of 45 locations. One set
of PSU's was allocated to the first three-year survey period (1988-91) and
the other set to the second three-year period (1991-94). Therefore,
unbiased estimates (from the point of view of sample selection) of health and
nutrition characteristics can be independently produced for both Phase 1
and Phase 2 as well as for both phases combined.
For most of the sample, the second stage of the design consisted of area
segments composed of city or suburban blocks, combinations of blocks, or
other area segments in places where block statistics were not produced in
the 1980 Census. In the first phase of NHANES III, the area segments were
used only for a sample of persons who lived in housing units built before
1980. For units built in 1980 and later, the second stage consisted of sets
of addresses selected from building permits issued in 1980 or later. These
are referred to as "new construction segments." In the second phase, 1990
Census data and maps were used to define the area segments. Because the
second phase followed within a few years of the 1990 Census, new construction
did not account for a significant part of the sample, and the entire sample
came from the area segments.
The third stage of sample selection consisted of households and certain
types of group quarters, such as dormitories. All households and eligible
group quarters in the sample segments were listed, and a subsample was
designated for screening to identify potential sample persons. The
subsampling rates enabled production of a national, approximately
equal-probability sample of households in most of the United States with
higher rates for the geographic strata with high Mexican-American
populations. Within each geographic stratum, there was a nearly
equal-probability sample of households across all 89 stands (survey
locations).
Persons within the sample of households or group quarters were the fourth
stage of sample selection. All eligible members within a household were
listed, and a subsample of individuals was selected based on sex, age, and
race or ethnicity. The definitions of the sex, age, race or ethnic
classes, subsampling rates, and designation of potential sample persons
within screened households were developed to provide approximately
self-weighting samples for each subdomain within geographic strata and at the
same time to maximize the average number of sample persons per sample
household. Experience in previous NHANES indicated that selecting more
sample persons per household increased the overall participation rate.
Although the exact sample sizes were not known until data collection was
completed, estimates were made. Below is a summary of the sample sizes for
the full six-year NHANES III at each stage of selection:
Number of PSU's                                 81
Number of stands (survey locations)             89
Number of segments                           2,144
Number of households screened               93,653
Number of households with sample persons    19,528
Number of designated sample persons         39,695
Number of interviewed sample persons        33,994
Number of MEC-examined sample persons       30,818
Number of home-examined sample persons         493
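A few ratios derived from this summary help show how the stages relate,
including the average number of sample persons per sample household that
the design sought to maximize. The Python sketch below uses only the counts
in the summary; the derived ratios and their names are illustrative and are
not figures stated in this document.

```python
STANDS = 89
SEGMENTS = 2144
HOUSEHOLDS_SCREENED = 93653
HOUSEHOLDS_WITH_SAMPLE_PERSONS = 19528
DESIGNATED_SAMPLE_PERSONS = 39695

# Average number of area segments per survey location.
segments_per_stand = round(SEGMENTS / STANDS, 1)

# Share of screened households that yielded at least one sample person.
screening_yield_pct = round(
    100.0 * HOUSEHOLDS_WITH_SAMPLE_PERSONS / HOUSEHOLDS_SCREENED, 1
)

# Average number of designated sample persons per sample household.
persons_per_household = round(
    DESIGNATED_SAMPLE_PERSONS / HOUSEHOLDS_WITH_SAMPLE_PERSONS, 2
)

print(segments_per_stand, screening_yield_pct, persons_per_household)
```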
More detailed information on the sample design and weighting and estimation
procedures for NHANES III can be found in the Plan and Operation of the
Third National Health and Nutrition Examination Survey, 1988-94 (NCHS, 1994;
U.S. DHHS, 1996) and in the Analytic and Reporting Guidelines: Third National
Health and Nutrition Examination Survey (NHANES III), 1988-94 (U.S. DHHS,
1996).
Analysis Guidelines
Because of the complex survey design used in NHANES III, traditional
methods of statistical analysis based on the assumption of a simple random
sample are not applicable. This issue and possible methods for analyzing
NHANES data have been described in detail elsewhere (NCHS, 1985; Yetley,
1987; Landis, 1982; Delgado, 1990). Recent analytic and
reporting guidelines that should be used for most NHANES III analyses and
publications are contained in Analytic and Reporting Guidelines (U.S. DHHS,
1996). These recommendations differ slightly from those used by analysts for
previous NHANES surveys. These suggested guidelines provide a framework to
users for producing estimates that conform to the analytic design of the
survey. All users are strongly urged to review these analytic and reporting
guidelines before beginning any analyses of NHANES III data.
It is important to remember that this set of statistical guidelines is not
absolute. When conducting analyses, the analyst needs to use his/her
subject matter knowledge (including methodological issues) as well as
information about the survey design. The more one deviates from the original
analytic categories defined in the sample design, the more important it is to
evaluate the results carefully and to interpret the findings cautiously.
In NHANES III, 89 survey locations were randomly divided into two sets or
phases, the first consisting of 44 and the other of 45 locations. One set
of PSU's was allocated to the first three-year survey period (1988-91) and
the other set to the second three-year period (1991-94). Therefore, unbiased
national estimates of health and nutrition characteristics can be
independently produced for each phase as well as for both phases combined.
Computation of national estimates from both phases combined (i.e., total
NHANES III) is the preferred option; individual phase estimates may be
highly variable. In addition, individual phase estimates are not
statistically independent. It is also difficult to evaluate whether
differences in individual phase estimates are real or due to methodological
differences. That is, differences may be due to changes in sampling methods
or data collection methodology over time. At this time, there is no valid
statistical test for examining differences between Phase 1 and Phase 2.
Therefore, although point estimates can be produced separately for each
phase, there is no way to determine whether those estimates are
significantly different from each other.
NHANES III is based on a complex, multi-stage probability sample design.
Several aspects of the NHANES design must be taken into account in data
analysis, including the sample weights and the complex survey design.
Appropriate sample weights are needed to estimate prevalence, means,
medians, and other statistics. Sample weights are used to produce correct
population estimates because each sample person does not have the same
probability of selection. The sample weights incorporate the differential
probabilities of selection and include adjustments for noncoverage and
nonresponse. A detailed discussion of nonresponse adjustments and issues
related to survey coverage has been published (U.S. DHHS, 1996). With the
large oversampling of young children, older persons, black persons, and
Mexican-Americans in NHANES III, it is essential that the sample weights be
used in all analyses. Otherwise, a misinterpretation of results is highly
likely. Other aspects of the design that must be taken into account in data
analyses are the strata and PSU pairings from the sample design. These
pairings should be used to estimate variances and test for statistical
significance. For weighted analyses, analysts can use special computer
software packages that use an appropriate method for estimating variances for
complex samples such as SUDAAN (Shah, 1995) and WesVarPC (Westat, 1996).
Although initial exploratory analyses may be performed on unweighted data
using standard statistical packages and assuming simple random sampling,
final analyses should be done on weighted data using appropriate sample
weights. A summary of the weighting methodology and the type of sample
weights developed for NHANES III is included in Weighting and Estimation
Methodology (U.S. DHHS, 1996).
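The effect of ignoring the sample weights can be illustrated with a small
sketch. The values and weights below are invented for illustration and are not
NHANES III data; the function name is hypothetical. The sketch produces a point
estimate only and does not address variance estimation, which requires the
design-based methods described in this section.

```python
def weighted_mean(values, weights):
    """Return the mean of values, with each value counted by its sample weight."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Invented example: the first three persons come from an oversampled group,
# so each represents fewer people in the population (smaller weight).
values = [10.0, 10.0, 10.0, 20.0]
weights = [1000.0, 1000.0, 1000.0, 9000.0]

unweighted = sum(values) / len(values)    # treats every record equally
weighted = weighted_mean(values, weights)  # reflects the population mix
```

In this invented case the unweighted mean is pulled toward the oversampled
group, which is the kind of misinterpretation the text warns about.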
The purpose of weighting the sample data is to permit analysts to produce
estimates of statistics that would have been obtained if the entire
sampling frame (the United States) had been surveyed. Sample weights can be
considered as measures of the number of persons the particular sample
observation represents. Weighting takes into account several features of
the survey: the specific probabilities of selection for the individual
domains that were oversampled as well as nonresponse and differences between
the sample and the total U.S. population. Differences between the sample and
the population may arise due to sampling variability, differential
undercoverage in the survey among demographic groups, and possibly other
types of response errors, such as differential response rates or
misclassification errors. Sample weighting in NHANES III was used to:
1. Compensate for differential probabilities of selection among subgroups
(i.e., age-sex-race-ethnicity subdomains where persons living in
different geographic strata were sampled at different rates);
2. Reduce biases arising from the fact that nonrespondents may be
different from those who participate;
3. Bring sample data up to the dimensions of the target population
totals;
4. Compensate, to the extent possible, for inadequacies in the sampling
frame (resulting from omissions of some housing units in the listing
of area segments, omissions of persons with no fixed address, etc.); and
5. Reduce variances in the estimation procedure by using auxiliary
information that is known with a high degree of accuracy.
In NHANES III, the sample weighting was carried out in three stages. The
first stage involved the computation of weights to compensate for unequal
probabilities of selection (objective 1, above). The second stage adjusted
for nonresponse (objective 2). The third stage used poststratification of
the sample weights to Census Bureau estimates of the U.S. population to
accomplish the third, fourth, and fifth objectives simultaneously. In
NHANES III, several types of sample weights (see the sample weights table
that follows) were computed for the interviewed and examined sample and are
included in the NHANES III data file. Also, sample weights were computed
separately for Phase 1 (1988-91), Phase 2 (1991-94), and total NHANES III
(1988-94) to facilitate analysis of items collected only in Phase 1, only
in Phase 2, and over six years of the survey. Three sets of pseudo strata
and PSU pairings are provided to use with SUDAAN in variance estimation.
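The three weighting stages described above can be sketched in simplified form.
This is an illustration only, not the NHANES III procedure: it assumes a single
nonresponse adjustment cell and a single poststratum, and the selection
probabilities, response pattern, and control total are invented. The actual
methodology is documented in Weighting and Estimation Methodology (U.S. DHHS,
1996).

```python
def base_weight(selection_probability):
    """Stage 1: inverse of the probability of selection."""
    return 1.0 / selection_probability


def nonresponse_adjust(weights, responded):
    """Stage 2: within one cell, shift nonrespondents' weight to respondents."""
    total = sum(weights)
    respondent_total = sum(w for w, r in zip(weights, responded) if r)
    factor = total / respondent_total
    return [w * factor for w, r in zip(weights, responded) if r]


def poststratify(weights, control_total):
    """Stage 3: scale weights so they sum to an external population total."""
    factor = control_total / sum(weights)
    return [w * factor for w in weights]


# Invented inputs for four sampled persons in one cell.
probabilities = [0.25, 0.25, 0.5, 0.5]
responded = [True, False, True, True]

stage1 = [base_weight(p) for p in probabilities]
stage2 = nonresponse_adjust(stage1, responded)
stage3 = poststratify(stage2, control_total=18.0)
```

After the third stage the respondents' weights sum to the assumed control
total, which corresponds to bringing the sample up to the dimensions of the
target population.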
Since NHANES III is based on a complex, multi-stage sample design,
appropriate sample weights should be used in analyses to produce national
estimates of prevalence and associated variances while accounting for
unequal probability of selection of sample persons. For example, the final
interview weight, WTPFQX6, should be used for analysis of the items or
questions from the family or household questionnaires, and the final MEC
examination weight, WTPFEX6, should be used for analysis of the
questionnaires and measurements administered in the MEC. Furthermore, for a
combined analysis of measurements from the MEC examinations and associated
medical history questions from the household interview, the final MEC
examination weight, WTPFEX6, should be used. We recommend using SUDAAN
(Shah, 1995) to estimate statistics of interest and the associated variance.
However, one can also use other published methods for variance estimation.
The application of SUDAAN and alternative methods, such as the average design
effect approach, balanced repeated replication (BRR) methods, or jackknife
methods for variance estimation, is discussed in Weighting and Estimation
Methodology (U.S. DHHS, 1996).
Appropriate Uses of the NHANES III Sample Weights
Final interview weight, WTPFQX6
    Use only in conjunction with the sample interviewed at home and
    with items collected during the household interview.
Final examination (MEC only) weight, WTPFEX6
    Use only in conjunction with the MEC-examined sample and with
    interview and examination items collected at the MEC.
Final MEC+home examination weight, WTPFHX6
    Use only in conjunction with the MEC+home-examined sample and
    with items collected at both the MEC and home.
Final allergy weight, WTPFALG6
    Use only in conjunction with the allergy subsample and with items
    collected as part of the allergy component of the exam.
Final CNS weight, WTPFCNS6
    Use only in conjunction with the CNS subsample and with items
    collected as part of the CNS component of the exam.
Final morning examination (MEC only) subsample weight, WTPFSD6
    Use only in conjunction with the MEC-examined persons assigned to
    the morning subsample and only with items collected in the MEC
    exam.
Final afternoon/evening examination (MEC only) subsample weight, WTPFMD6
    Use only in conjunction with the MEC-examined persons assigned to
    the afternoon/evening subsample and only with items collected in
    the MEC exam.
Final morning examination (MEC+home) subsample weight, WTPFHSD6
    Use only in conjunction with the MEC- and home-examined persons
    assigned to the morning subsample and with items collected during
    the MEC and home examinations.
Final afternoon/evening examination (MEC+home) weight, WTPFHMD6
    Use only in conjunction with the MEC- and home-examined persons
    assigned to the afternoon/evening subsample and with items
    collected during the MEC and home examinations.
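The table of sample weights above could be encoded as a simple lookup so that
an analysis script selects its weight variable explicitly. The weight variable
names come from the table; the dictionary keys and the function name are
hypothetical labels chosen for this sketch.

```python
# Keys are hypothetical labels for the samples described in the table above;
# values are the weight variable names given in that table.
WEIGHT_FOR_SAMPLE = {
    "household_interview": "WTPFQX6",
    "mec_exam": "WTPFEX6",
    "mec_plus_home_exam": "WTPFHX6",
    "allergy_subsample": "WTPFALG6",
    "cns_subsample": "WTPFCNS6",
    "mec_morning": "WTPFSD6",
    "mec_afternoon_evening": "WTPFMD6",
    "mec_plus_home_morning": "WTPFHSD6",
    "mec_plus_home_afternoon_evening": "WTPFHMD6",
}


def weight_variable(sample):
    """Return the weight variable name for a sample label, or raise ValueError."""
    try:
        return WEIGHT_FOR_SAMPLE[sample]
    except KeyError:
        raise ValueError("unknown sample label: " + repr(sample))


# The dietary recall was administered in the MEC, so the MEC examination
# weight applies to it.
dietary_weight = weight_variable("mec_exam")
```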
DATA PREPARATION AND PROCESSING PROCEDURES
Automated data collection procedures for the survey were introduced in
NHANES III. In the mobile examination centers, data for the interview and
examination components were recorded directly onto a computerized data
collection form. With the exception of a few independently automated
systems, the system was centrally integrated. This operation allowed for
ongoing monitoring of much of the data. Before the introduction of the
computer-assisted personal interview (CAPI), the household questionnaire
data were reviewed manually by field editors and interviewers. CAPI
(1992-1994 only) questionnaires featured built-in edits to prevent entering
inconsistencies and out-of-range responses. The multi-level data
collection and quality control systems are discussed in detail in the Plan
and Operation of the Third National Health and Nutrition Examination Survey,
1988-1994 (NCHS, 1994; U.S. DHHS, 1996). All interview, laboratory, and
examination data were sent to NCHS for final processing.
Guidelines were developed that provided standards for naming variables,
filling missing values and coding conventional responses, handling missing
records, and standardizing two-part quantity/unit questionnaire variables.
NCHS staff, assisted by contract staff, developed data editing
specifications that checked data sets for valid codes, ranges, and skip
pattern consistencies and examined the consistency of values between
interrelated variables. Comments, collected in both interviews and
examination components, were reviewed and recoded when possible. Responses
to "Other" and "Specify" were recoded either to existing code categories or
to new categories. The documentation for each data set includes notes for
those variables that have been recoded and standardized and for those
variables that differ significantly from what appears in the original data
collection instrument. While the data have undergone many quality control
and editing procedures, there still may be values that appear extreme or
illogical. Values that varied considerably from what was expected were
examined by analysts who checked for comments or other responses that might
help to clarify unusual values. Generally, values were retained unless they
could not possibly be true, in which case they were changed to "Blank but
applicable." Therefore, the user must review each data set for extreme or
inconsistent values and determine the status of each value for analysis.
Several editing conventions were used in the creation of final analytic
data sets:
1. Standardized variables were created to replace all two-part
quantity/unit questions using standard conversion factors.
Standardized variables have the same name as the variable of the
two-part question with an "S" suffix. For instance, MAPF18S (Months
received WIC benefits) in the MEC Adult Questionnaire was created from
the two-part response option to question F18, "How long did you receive
benefits from the WIC program?," using the conversion factor 12 months
per year.
2. Recoded variables were created by combining responses from two or more
like variables, or by collapsing responses to create a summary
variable for the purpose of confidentiality. Recoded variables have the
original variable name with an "R" suffix. For example, the place of birth
variable (HFA6X) in the Family Questionnaire was collapsed to a three-level
response category (U.S., Mexico, Other) and renamed HFA6XR.
Generally, only the recoded variable has been included in the data file.
3. Fill values, a series of one or more digits, were used to represent
certain specific conditions or responses. Below is a list of the fill
values that were employed. Some of the fill values pertain only to
questionnaire data, although 8-fill and blank-fill values are found in
all data sets. Other fill values, not included in this list, are used
to represent component-specific conditions.
6-fills = Varies/varied. (Questionnaires only)
7-fills = Fewer than the smallest number that could be reported within
the question structure (e.g., fewer than one cigarette per day).
(Questionnaires only)
8-fills = Blank but applicable/cannot be determined. This means that
a respondent was eligible to receive the question, test, or component
but did not because of refusal, lack of time, lack of staff, loss of
data, broken vial, language barrier, unreliability, or other similar
reasons.
9-fills = Don't know. This fill was used only when a respondent did
not know the response to a question and said, "I don't know."
(Questionnaires only)
Blank fills = Inapplicable. If a respondent was not eligible for a
questionnaire, test, or component because of age, gender, or specific
reason, the variable was blank-filled. In the questionnaire, if a
respondent was not asked a question because of a skip-pattern,
variables corresponding to the question were blank-filled. For
examination or laboratory components, if a person was excluded by a
defined protocol (e.g., screening exclusion questions) and these
criteria are included in the data set, then the corresponding
variables were blank-filled for that person. For home examinees,
variables for examination components and blood tests not performed as
part of the home examination protocol were blank-filled.
4. For variables describing discrete data, codes of zero (0) were used to
mean "none," "never," or the equivalent. Value labels for which "0"
is used include: "has not had," "never regularly," "still taking," or
"never stopped using." Unless otherwise labeled, for variables
containing continuous data, "zero" means "zero."
5. Where there are logical skip patterns in the flow of the questionnaire
or examination component, the skip was indicated by placing the
variable label of the skip destination in parentheses as part of the
value label of the response generating the skip. For example, in the
Physical Function Evaluation, the variable PFPWC (in wheelchair) has a
value label, "2 No (PFPSCOOT)" that means that the next item for
persons not in a wheelchair would be represented by the variable,
PFPSCOOT.
Variable Nomenclature
A unique name was assigned to every NHANES III variable using a standard
convention. By following this naming convention, the origin of each
variable is clear, and there is no chance of overlaying similar variables
across multiple components. Variable names range in length from three to eight
characters. The first two characters of each name represent the topic (e.g.,
analyte, questionnaire instrument, examination component) and are listed
below alphabetically by topic. For questionnaires administered in the
household, the remainder of the variable name following the first two
characters indicates the question section and number. For example, data
for the response to the Household Adult Questionnaire question B1 are
contained in the variable HAB1. For most laboratory and examination
variables, as well as some other variables, a "P" in the third position
refers to "primary" and the remainder of the variable name is a brief
description of the item. For instance, in the Laboratory Data File,
information on the length of time the person fasted before the first blood
draw is contained in the variable PHPFAST. The variable PHPFAST was derived
as follows: characters 1-2 (PH) refer to "phlebotomy," character 3 (P)
refers to "primary," and characters 4-7 (FAST) are an abbreviation for
"fasting."
CODE TOPIC
AT Alanine aminotransferase (from biochemistry profile)
AM Albumin (from biochemistry profile)
AP Alkaline phosphatase (from biochemistry profile)
AL Allergy skin test
AC Alpha carotene
AN Anisocytosis
AA Apolipoprotein (AI)
AB Apolipoprotein (B)
AS Aspartate aminotransferase (from biochemistry profile)
LA Atypical lymphocyte
AU Audiometry
BA Band
BO Basophil
BS Basophilic stippling
BC Beta carotene
BX Beta cryptoxanthin
BL Blast
BU Blood urea nitrogen (BUN) (from biochemistry profile)
BM Body measurements
BD Bone densitometry
C1 C-peptide (first venipuncture)
C2 C-peptide (second venipuncture)
CR C-reactive protein
UD Cadmium
CN Central nervous system function evaluation
CL Chloride (from biochemistry profile)
CO Cotinine
CE Creatinine (serum)(from biochemistry profile)
UR Creatinine (urine)
DM Demographic
DE Dental examination
MQ Diagnostic interview schedule
DR Dietary recall (total nutrient intakes)
EO Eosinophil
EP Erythrocyte protoporphyrin
FR Ferritin
FB Fibrinogen
RB Folate (RBC)
FO Folate (serum)
FH Follicle stimulating hormone (FSH)
FP Fundus photography
GG Gamma glutamyl transferase (GGT) (from biochemistry profile)
GU Gallbladder ultrasonography
GB Globulin (from biochemistry profile)
G1 Glucose (first venipuncture)
G2 Glucose (second venipuncture)
SG Glucose (from biochemistry profile)
GH Glycated hemoglobin
GR Granulocyte
C3 HCO3 (Bicarbonate)(from biochemistry profile)
HD HDL cholesterol
HP Helicobacter pylori antibody
HT Hematocrit
HG Hemoglobin
AH Hepatitis A antibody (HAV)
HB Hepatitis B core antibody (anti-HBc)
SS Hepatitis B surface antibody (anti-HBs)
SA Hepatitis B surface antigen (HBsAg)
HC Hepatitis C antibody (HCV)
DH Hepatitis D antibody (HDV)
H1 Herpes 1 antibody
H2 Herpes 2 antibody
HX Home examination (general)
HF Household family questionnaire
HA Household adult questionnaire
HQ Household questionnaire variables (composite)
HS Household screener questionnaire
HY Household youth questionnaire
HZ Hypochromia
I1 Insulin (first venipuncture)
I2 Insulin (second venipuncture)
UI Iodine (urine)
FE Iron
SF Iron (from biochemistry profile)
LD Lactate dehydrogenase (from biochemistry profile)
L1 Latex antibody
LC LDL cholesterol (calculated)
PB Lead
LP Lipoprotein (a)
LH Luteinizing hormone
LU Lutein/zeaxanthin
LY Lycopene
LM Lymphocyte
MR Macrocyte
MC Mean cell hemoglobin (MCH)
MH Mean cell hemoglobin concentration (MCHC)
MV Mean cell volume (MCV)
PV Mean platelet volume
MA MEC adult questionnaire
MX MEC examination (general)
FF Dietary food frequency (ages 12-16 years)
MP MEC proxy questionnaire
MY MEC youth questionnaire
ME Metamyelocyte
MI Microcyte
MO Monocyte
MN Mononuclear cell
ML Myelocyte
IC Normalized calcium (derived from ionized calcium)
OS Osmolality (from biochemistry profile)
PH Phlebotomy data collected in MEC (e.g., questions)
PS Phosphorus (from biochemistry profile)
PF Physical function evaluation
PE Physician's examination
PL Platelet
DW Platelet distribution width
PK Poikilocytosis
PO Polychromatophilia
SK Potassium (from biochemistry profile)
PR Promyelocyte
RC Red blood cell count (RBC)
RW Red cell distribution width (RDW)
RE Retinyl esters
RF Rheumatoid factor antibody
RU Rubella antibody
WT Sample weights
SE Selenium
SI Sickle cell
NA Sodium (from biochemistry profile)
SH Spherocyte
SP Spirometry
SD Survey design
TT Target cell
TE Tetanus
TB Total bilirubin (from biochemistry profile)
CA Total calcium
SC Total calcium (from biochemistry profile)
TC Total cholesterol
CH Total cholesterol (from biochemistry profile)
TI Total iron binding capacity (TIBC)
TP Total protein (from biochemistry profile)
TX Toxic granulation
TO Toxoplasmosis antibody
PX Transferrin saturation
TG Triglycerides
TR Triglycerides (from biochemistry profile)
TY Tympanometry
UA Uric acid (from biochemistry profile)
UB Urinary albumin
VU Vacuolated cells
VR Varicella antibody
VA Vitamin A
VB Vitamin B12
VC Vitamin C
VE Vitamin E
WC White blood cell count (WBC)
WW WISC/WRAT cognitive test
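The naming convention described under Variable Nomenclature can be sketched as
a small parser. This is an illustration under stated assumptions: only a few
topic codes from the table are included, the set of household questionnaire
prefixes is an assumption made for the sketch, and the rule that a "P" in the
third position means "primary" holds for most, not all, laboratory and
examination variables.

```python
# A small subset of the two-character topic codes from the table above.
TOPIC_CODES = {
    "HA": "Household adult questionnaire",
    "PH": "Phlebotomy data collected in MEC",
    "DR": "Dietary recall (total nutrient intakes)",
    "WT": "Sample weights",
}

# Assumption: names with these prefixes end in a question section and number.
HOUSEHOLD_PREFIXES = {"HA", "HF", "HS", "HY"}


def parse_variable_name(name):
    """Split a variable name into topic code, primary flag, and remainder."""
    if not 3 <= len(name) <= 8:
        raise ValueError("variable names have three to eight characters")
    code, rest = name[:2], name[2:]
    topic = TOPIC_CODES.get(code, "unknown topic code")
    if code in HOUSEHOLD_PREFIXES:
        # Remainder is the question section and number, e.g. "B1".
        return {"topic_code": code, "topic": topic, "primary": False, "detail": rest}
    primary = rest.startswith("P")
    detail = rest[1:] if primary else rest
    return {"topic_code": code, "topic": topic, "primary": primary, "detail": detail}


fasting = parse_variable_name("PHPFAST")
question = parse_variable_name("HAB1")
```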
GENERAL REFERENCES
Delgado JL, Johnson CL, Roy I, Trevino FM. Hispanic Health and Nutrition
Examination Survey: methodological considerations. Amer J Pub Health
80(suppl.):6-10. 1990.
Engel A, Murphy RS, Maurer K, Collins E. Plan and operation of the HANES I
Augmentation Survey of Adults 25-74 Years, United States, 1974-75.
National Center for Health Statistics. Vital Health Stat 1(14). 1978.
Freeman DH, Freeman JL, Brock DB, Koch GG. Strategies in the multivariate
analysis of data from complex surveys II: an application to the United
States National Health Interview Survey. Int Stat Rev 40(3):317-30. 1976.
Khare M, Mohadjer LK, Ezzati-Rice TM, Waksberg J. An evaluation of
nonresponse bias in NHANES III (1988-91). 1994 Proceedings of the Survey
Research Methods section of the American Statistical Association. 1994.
Landis JR, Lepkowski JM, Eklund SA, Stehouwer SA. A statistical
methodology for analyzing data from a complex survey, the first National
Health and Nutrition Examination Survey. National Center for Health
Statistics. Vital Health Stat 2(92). 1982.
McDowell A, Engel A, Massey JT, Maurer K. Plan and operation of the second
National Health and Nutrition Examination Survey, 1976-80. National Center
for Health Statistics. Vital Health Stat 1(15). 1981.
Miller HW. Plan and operation of the Health and Nutrition Examination
Survey, United States, 1971-1973. National Center for Health Statistics.
Vital Health Stat 1(10a) and (10b). 1973.
National Center for Health Statistics. Plan and initial program of the
Health Examination Survey. Vital Health Stat 1(4). 1965.
National Center for Health Statistics. Plan and operation of a health
examination survey of U.S. youths 12-17 years of age. Vital Health Stat
1(8). 1969.
National Center for Health Statistics. Plan and operation of the Hispanic
Health and Nutrition Examination Survey, 1982-84. Vital Health Stat 1(19).
1985.
National Center for Health Statistics. Plan and operation of the Third
National Health and Nutrition Examination Survey, 1988-94. Vital Health
Stat 1(32). 1994.
National Center for Health Statistics. Plan, operation, and response
results of a program of children's examinations. Vital Health Stat 1(5).
1967.
Shah BV, Barnwell BG, Bieler GS. SUDAAN User's Manual: Software for
Analysis of Correlated Data. Research Triangle Park, NC: Research Triangle
Institute. Release 6.04. 1995.
Skinner CJ. Aggregated analysis: standard errors and significance tests.
In: Skinner CJ, Holt D, Smith TMF, eds. Analysis of complex surveys. New
York: John Wiley and Sons, Inc. 1989.
U.S. Department of Health and Human Services (DHHS). National Center for
Health Statistics. NHANES III reference manuals and reports (CD-ROM).
Hyattsville, MD: Centers for Disease Control and Prevention, 1996.
Available from National Technical Information Service (NTIS), Springfield,
VA. Acrobat .PDF format; includes access software: Adobe Systems, Inc.
Acrobat Reader 2.1.
Westat, Inc. A User's Guide to WesVarPC. Rockville, MD. Westat, Inc.
1996.
Yetley E, Johnson C. Nutritional applications of the Health and Nutrition
Examination Surveys (HANES). Annu Rev Nutr 7:441-63. 1987.
NHANES III Dietary Interview Component
Dietary interviews were administered to all examinees by a trained dietary
interviewer in the mobile examination center (MEC). Respondents reported all
foods and beverages consumed except plain drinking water (i.e., not bottled) for
the previous 24-hour time period (midnight to midnight). An automated,
microcomputer-based dietary interview and coding system known as the NHANES III
Dietary Data Collection (DDC) System was used to collect all NHANES III dietary
recall data. The DDC system was developed for use in the survey by the
University of Minnesota's Nutrition Coordinating Center (NCC).
The dietary interviews were conducted in English and Spanish by
bilingual dietary interviewers in a private room to ensure
confidentiality. Proxy respondents were permitted for infants and children aged
two months through five years and for other respondents who were unable to
report on their own. Children aged six to 11 years were permitted to report
their own intake if the interviewer deemed it acceptable and appropriate, but
many interviews for respondents in this age category were completed by proxy
or with the child and a proxy. The dietary interviewers contacted other
information sources such as care providers and schools to obtain complete
dietary intake data for respondents.
The primary source of food composition data for NHANES III is the U.S.
Department of Agriculture (USDA) Survey Nutrient Database; two nutrient files
were provided by USDA for use in NHANES III (USDA 1993, 1995). Each USDA file
contained food composition values that were appropriate for the time period
during which the NHANES III data were collected. Additionally, food composition
data for a small number of herbs and spices were obtained from NCC (NCC, 1996).
The DDC system's foods database was designed specifically to handle time-related
changes in food descriptions, food amounts, and recipes; updated information was
applied retrospectively to data collected in the early part of NHANES III. As
noted above, two USDA food composition databases were used to assign nutrient
values to the NHANES III dietary recalls (USDA, 1993; USDA, 1995). In addition
to changes in the nutrient values of foods due to food product reformulations,
recipe changes, and so forth, the U.S. marketplace underwent tremendous growth
and change as new food product lines were introduced and new food components
were added to the food supply (e.g., fat substitutes and artificial
sweeteners). The impact of these and other changes may require additional
analysis.
Dietary recall interviews were edited by the interviewers to ensure that they
were as complete as possible. NCHS completed all final editing and
determinations regarding the completeness and reliability of the dietary
recalls. Analysts should note that the data reported are self-reported data.
Extreme values were verified.
Information pertaining to the use of nutritional supplements and
antacids was reported separately during the Household Adult and
Household Youth Interviews.
A number of quality-control monitoring techniques were employed during the
survey. The techniques for monitoring the Dietary Interview component included
observations of actual dietary interviews and reviews of audiotape interviews by
NCHS and contractor staff. In addition, the dietary interviewers worked in
two-person teams; there was one team in each MEC. The dietary interviewers
performed 10-percent cross-check reviews of their partners' work using printed
recall reports. Finally, newsletters, field memoranda, telephone calls, and
staff retraining sessions were other methods used to maintain quality control
during the survey. Refer to the NHANES III Dietary Interviewer's Training
Manual for the dietary interview protocol (U.S. DHHS, 1996b).
Analysts are encouraged to use six years of survey data in their
analyses. The reliability of estimates is improved when larger sample sizes are
used. For more detailed information, see the Analytic and Reporting Guidelines
for NHANES III (U.S. DHHS, 1996b). In addition, MEC final examination weights
(WTPFEX6) should be used when analyzing the total nutrient intake data and
related questionnaire data in this file. For more information on the use of
sample weights in NHANES III data analysis, refer to the NHANES III Analytic and
Reporting Guidelines (U.S. DHHS, 1996b).
NHANES III Total Nutrient Intakes and Foods Data Files
NCHS prepared four data sets that are based on the 24-hour dietary recall
interview. Total nutrient intakes were reported in the NHANES III Examination
Data File (Catalog 76200). The three foods files are found in Catalog 76700:
the NHANES III Individual Foods Data File from the Dietary Recall, the NHANES
III Combination Foods Data File from the Dietary Recall, and the NHANES III
(Variable) Ingredients Data File from the Dietary Recall. Documentation was
prepared for each of the foods data files. Data users are encouraged to review
all of the documentation prior to using the data files.
Look-up Tables for the NHANES III Foods Data Files
Textual descriptions for several NHANES III Foods Data File numeric code
variables are located in an Appendix section that accompanies the Foods Data
Files. The Appendix files are referred to as "look-up" tables throughout the
data file documentation for the Foods Data Files. Computer code is provided so
that data users can merge the foods data files with the information in the
Appendix/look-up tables.
INDIVIDUAL FOODS FILE
The NHANES III Individual Foods File (IFF) is comprised of records. Each IFF
record includes a meal number (DRPMN), a food number (DRPFN), and a component
number (DRPCN). The IFF was sorted by case, meal number, food number (within
meals), and component number (within foods). Meals are comprised of foods.
Foods are comprised of one or more components. Most components in the IFF are
foods. There are some ingredient-type components (salt, water, corn meal, etc.)
in the IFF. Components were either eaten alone or in combination with other
foods. The term "component foods" may be used for most of the components in the
IFF. Components may have ingredient records associated with them. Ingredient
information is reported separately in the NHANES III Variable Ingredients File.
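The meal, food, and component hierarchy described above can be sketched by
sorting records and nesting them. The variable names DRPMN, DRPFN, and DRPCN
come from the text; the case identifier name SEQN and the record values are
assumptions made for this sketch.

```python
def nest_records(records, case_key="SEQN"):
    """Sort IFF-style records and nest them as case -> meal -> food -> components."""
    ordered = sorted(
        records,
        key=lambda r: (r[case_key], r["DRPMN"], r["DRPFN"], r["DRPCN"]),
    )
    tree = {}
    for rec in ordered:
        meals = tree.setdefault(rec[case_key], {})
        foods = meals.setdefault(rec["DRPMN"], {})
        foods.setdefault(rec["DRPFN"], []).append(rec["DRPCN"])
    return tree


# Invented records for one case: meal 1 has a two-component food and a
# single-component food; meal 2 has one single-component food.
records = [
    {"SEQN": 7, "DRPMN": 2, "DRPFN": 1, "DRPCN": 1},
    {"SEQN": 7, "DRPMN": 1, "DRPFN": 1, "DRPCN": 2},
    {"SEQN": 7, "DRPMN": 1, "DRPFN": 2, "DRPCN": 1},
    {"SEQN": 7, "DRPMN": 1, "DRPFN": 1, "DRPCN": 1},
]
tree = nest_records(records)
```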
The IFF contains information on all component level foods and beverages reported
by examinees whose dietary recalls have a final dietary recall status code
(DRPSTAT) equal to 1, 2, or 5; the documentation for this file includes an
explanation of DRPSTAT values. Partial foods data are reported in the IFF for
examinees with incomplete recalls (DRPSTAT=2) and for nursing infants and
children (DRPSTAT=5). If total dietary intake information is required for data
analysis, only examinees with DRPSTAT=1 should be selected for analysis.
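The selection rule above can be sketched as a filter on the recall status code.
DRPSTAT and its values 1, 2, and 5 come from the text; the function name, the
case identifier SEQN, and the records are invented for illustration.

```python
def select_recalls(recalls, need_total_intake):
    """Keep recalls usable for the analysis, based on DRPSTAT.

    When total dietary intake is required, only DRPSTAT == 1 is kept.
    Otherwise every status reported in the IFF (1, 2, or 5) is kept.
    """
    allowed = {1} if need_total_intake else {1, 2, 5}
    return [r for r in recalls if r["DRPSTAT"] in allowed]


# Invented person-level records.
recalls = [
    {"SEQN": 1, "DRPSTAT": 1},
    {"SEQN": 2, "DRPSTAT": 2},  # incomplete recall
    {"SEQN": 3, "DRPSTAT": 5},  # nursing infant or child
]
for_totals = select_recalls(recalls, need_total_intake=True)
for_foods = select_recalls(recalls, need_total_intake=False)
```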
The IFF can be linked to the other NHANES III foods files by case, meal number
(DRPMN), food number (DRPFN), component number (DRPCN), and ingredient number
(DRPIN). Multi-component or combination foods have a combination foods flag
(DRPCFF) value equal to 1 in the IFF; combination foods are described in the
Combination Foods File. Some component foods in the IFF have variable
ingredients (DRPVIF=1); the Variable Ingredients File contains information about
these ingredients. The Appendix section of the IFF file documentation contains
a series of tables that illustrate the linkages between the NHANES III foods
files. Additionally, data users should refer to the documentation sections for
each of the NHANES III foods files to learn more about the content and uses of
these data.
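The linkage between the IFF and the Variable Ingredients File can be sketched
as a join on the shared keys. The key variables DRPMN, DRPFN, DRPCN, and DRPIN
and the flag DRPVIF come from the text; the case identifier name SEQN, the
function name, and the records are assumptions. The Appendix tables remain the
authoritative description of the linkages.

```python
def link_ingredients(iff_records, ingredient_records, case_key="SEQN"):
    """Pair each IFF component with its ingredient records, ordered by DRPIN."""
    by_component = {}
    for ing in ingredient_records:
        key = (ing[case_key], ing["DRPMN"], ing["DRPFN"], ing["DRPCN"])
        by_component.setdefault(key, []).append(ing)
    linked = []
    for rec in iff_records:
        key = (rec[case_key], rec["DRPMN"], rec["DRPFN"], rec["DRPCN"])
        matches = sorted(by_component.get(key, []), key=lambda i: i["DRPIN"])
        linked.append((rec, matches))
    return linked


# Invented records: the first component has variable ingredients (DRPVIF=1).
iff = [
    {"SEQN": 1, "DRPMN": 1, "DRPFN": 1, "DRPCN": 1, "DRPVIF": 1},
    {"SEQN": 1, "DRPMN": 1, "DRPFN": 2, "DRPCN": 1, "DRPVIF": 0},
]
ingredients = [
    {"SEQN": 1, "DRPMN": 1, "DRPFN": 1, "DRPCN": 1, "DRPIN": 2},
    {"SEQN": 1, "DRPMN": 1, "DRPFN": 1, "DRPCN": 1, "DRPIN": 1},
]
linked = link_ingredients(iff, ingredients)
```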
Coding Foods Reported During NHANES III
An underlying principle of NHANES III Dietary Data Collection (DDC) System
database maintenance was to use U.S. Department of Agriculture (USDA) Survey
Nutrient Data Base (SNDB) files as the primary data sources for food codes
(Codebook File), recipes (Recipe File), and nutrient data (Nutrient File).
Hispanic HANES was the first HANES to use SNDB data bases exclusively to code
and report dietary findings. Continued use of the SNDB data bases during NHANES
III served to maintain consistency with Hispanic HANES (HHANES) data for
nutrition monitoring purposes. Most of the foods and beverages reported during
NHANES III are coded using USDA SNDB food codes (hereafter referred to as "USDA
food codes"). A small number of non-USDA food codes are included in the NHANES
III data release files because there were no USDA food codes for spices and
certain recipe ingredients. All component food codes (DRPFCODE) reported in the
IFF have text descriptions in the look-up table called "Codebook".
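Attaching Codebook descriptions to IFF records amounts to a keyed lookup on
DRPFCODE. The sketch below assumes the look-up table has been read into a
dictionary; the food codes and descriptions are placeholders rather than real
USDA codes, and the DESCRIPTION field name is hypothetical.

```python
def attach_descriptions(iff_records, codebook):
    """Return copies of the records with a text description added per food code."""
    return [
        dict(rec, DESCRIPTION=codebook.get(rec["DRPFCODE"], ""))
        for rec in iff_records
    ]


# Placeholder codes and descriptions, for illustration only.
codebook = {101: "example food A", 102: "example food B"}
iff_records = [{"DRPFCODE": 102}, {"DRPFCODE": 101}, {"DRPFCODE": 999}]
described = attach_descriptions(iff_records, codebook)
```

A code with no match is given an empty description here so that unmatched
records are easy to find and check against the look-up table.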
Many of the component foods reported during NHANES III were coded using the
food code that USDA would use in its food consumption surveys. For some foods
reported during NHANES III, however, the coding method used was not based on the
USDA code for the following reasons:
1. The DDC system foods database was more specific than the USDA database
with respect to recipe ingredient specification. Ingredient information was
used to compute the nutrient content of recipe foods reported in NHANES III.
2. A brand name product was not in the USDA database. The DDC system
included more than 6,000 brand names in approximately 30 food categories; the
USDA database contains fewer brand name products. NCC assigned USDA food codes
to all brand-name foods in the DDC system. The coding decisions for brand-name
foods were based upon USDA, NCC, and manufacturer information.
3. The brand name was in the USDA database, but the University of
Minnesota Nutrition Coordinating Center (NCC) coded it differently from USDA.
For example, NCC and USDA used different criteria to code brand-name cookies,
salad dressings, and crackers; NCC nutrient criteria were used to assign USDA
food codes to commercial products in these food groups.
4. An NCC recipe was used instead of the USDA recipe.
5. The food was not in the USDA database.
Food Type Categories
All component foods in the IFF are divided into two food type categories as
denoted by the variable "DRPREC." The food category determines the method used
to code foods and assign nutrient values to foods.
The first food category type is "elemental" foods. Elemental foods include
milk, fresh fruits and vegetables, ready-to-eat breakfast cereal, sweeteners,
and fats and oils. Some mixture foods also are classified as elemental foods in
the DDC system foods database. USDA food codes were assigned to elemental
foods. The USDA food code for elemental foods (DRPFCODE) has a direct link to
the USDA SNDB Nutrient File that was used to assign nutrient values to all
elemental foods.
The second category of foods is "recipe" foods. Recipe foods are denoted
by DRPREC=1. The survey files for recipe foods contain ingredient records; the
ingredient records for each recipe food are linked together by a USDA food code.
The USDA food codes for recipe foods (DRPFCODE) are reported in the IFF.
The nutrient values for recipe foods were calculated using recipe
ingredient nutrient values found in the USDA SNDB Nutrient Files provided for
use in NHANES III. The USDA Nutrient Files for NHANES III are slightly
different from the standard public release USDA SNDB Nutrient Files because
special food codes for recipe ingredients (usually denoted by food codes that
begin with numbers "00") were added to the USDA file at NCHS's request. The
nutrient values for recipe ingredients were summed to produce the nutrient
values for all recipe foods in the IFF.
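
The summation described above can be sketched as follows. This is only an
illustration, not the NCC or NCHS processing code. The key "recipe_code" is a
hypothetical stand-in for the USDA food code that links the ingredient records
of one reported recipe food, and the ingredient names and nutrient numbers are
invented.

```python
from collections import defaultdict

# Hypothetical ingredient records. "recipe_code" stands in for the USDA food
# code (DRPFCODE) that links the ingredients of one reported recipe food.
# All codes, field names, and nutrient numbers are invented for illustration.
INGREDIENT_RECORDS = [
    {"recipe_code": "RECIPE_A", "ingredient": "macaroni", "kcal": 220.0, "fat_g": 1.0},
    {"recipe_code": "RECIPE_A", "ingredient": "cheddar cheese", "kcal": 110.0, "fat_g": 9.0},
    {"recipe_code": "RECIPE_A", "ingredient": "whole milk", "kcal": 45.0, "fat_g": 2.5},
    {"recipe_code": "RECIPE_B", "ingredient": "breaded chicken", "kcal": 250.0, "fat_g": 12.0},
    {"recipe_code": "RECIPE_B", "ingredient": "frying oil", "kcal": 90.0, "fat_g": 10.0},
]


def sum_recipe_nutrients(records, nutrient_fields=("kcal", "fat_g")):
    """Sum the ingredient nutrient values for each recipe food."""
    totals = defaultdict(lambda: {field: 0.0 for field in nutrient_fields})
    for record in records:
        for field in nutrient_fields:
            totals[record["recipe_code"]][field] += record[field]
    return dict(totals)


recipe_totals = sum_recipe_nutrients(INGREDIENT_RECORDS)
```

Here recipe_totals["RECIPE_A"] holds the summed energy and fat of the three
ingredient records that share that code.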
A USDA food code was assigned to recipe foods as a means of linking the
recipe ingredient records. The USDA food code that is used to report the recipe
food should be used only to provide a basic food description for the food and
was not used to assign nutrient values to recipe foods. Further, a USDA food
code may be used to code more than one type of recipe food; this was because the
DDC system included more food description options. All food codes (DRPFCODE)
used in NHANES III have text descriptions associated with them in a look-up
table called "Codebook".
To summarize, the distinctions between elemental foods and recipe foods
are: 1) recipe foods have ingredient records associated with them and elemental
foods do not; 2) recipe ingredient information was used to compute the nutrient
values of recipe foods reported during the survey.
Using Ingredient Information to Calculate Nutrient Values of Foods
Recipe foods have ingredient records associated with them in the
comprehensive DDC System output files. Many of the ingredients used to prepare
recipe foods were "variable" ingredients, meaning that respondents could specify
the types of ingredients that were used to prepare the foods they ate. The
variable ingredient flag (variable name: DRPVIF) denotes the recipe foods that
had variable ingredients. The ability to vary the types of ingredients that
were used to prepare recipe foods is important because the nutrient values for
recipe foods that have a particular food code (DRPFCODE) can have a range of
values rather than a single nutrient profile. To illustrate, take the example
of a homemade macaroni and cheese casserole. There were two variable ingredient
probes in the DDC system for this entry. One probe pertained to the type of
cheese used, and the second probe was for the type of milk used in the recipe.
Assume that the same basic recipe was used for this dish. If one respondent
used low-fat cheddar cheese and skim milk, and a second respondent used regular-
fat cheddar cheese and whole milk, the nutrient content of the two dishes would
differ because two major recipe ingredients had different nutrient values.
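
A brief sketch of this point: when two reports share one food code but used
different variable ingredients, an analyst sees a range of nutrient values for
that code. The code value "MAC_AND_CHEESE", the "respondent" field, and the fat
totals below are hypothetical and are not taken from the data file.

```python
# Hypothetical fat totals (grams) for reports of recipe foods. Two reports
# share one DRPFCODE but differ in the cheese and milk that were used.
REPORTS = [
    {"DRPFCODE": "MAC_AND_CHEESE", "respondent": 1, "fat_g": 6.5},   # low-fat cheese, skim milk
    {"DRPFCODE": "MAC_AND_CHEESE", "respondent": 2, "fat_g": 24.0},  # regular cheese, whole milk
    {"DRPFCODE": "OTHER_RECIPE", "respondent": 3, "fat_g": 10.0},
]


def nutrient_range_by_code(reports, field):
    """Return {food code: (lowest value, highest value)} for one nutrient."""
    ranges = {}
    for report in reports:
        code = report["DRPFCODE"]
        value = report[field]
        low, high = ranges.get(code, (value, value))
        ranges[code] = (min(low, value), max(high, value))
    return ranges


fat_ranges = nutrient_range_by_code(REPORTS, "fat_g")
```

A food code whose lowest and highest values differ is one for which a single
nutrient profile should not be assumed.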
A second example would be for a commercial food prepared at home. Many
commercial foods were defined as recipe foods in the DDC system so that specific
information about the ingredients used to prepare commercial foods could be
ascertained. One example of a commercial food product with variable ingredient
probes was commercial breaded chicken that was purchased in frozen form and
fried at home. The DDC system probes included the type of fat used to fry the
chicken and a probe for the addition of salt during food preparation. A second
example of a commercial product with variable ingredients was a brand-name meal
replacement beverage that was reconstituted with fluid milk; the type of milk
used to prepare the beverage was a variable ingredient. These examples
illustrate how preparation ingredients produce variations in the nutrient
content of the prepared foods.
Notes to Analysts
Ordinarily, respondents were not asked to report plain drinking water
during the dietary interview because a separate set of questions addressed plain
drinking water consumption. Plain drinking water was a component of certain
foods, however. This occurred when foods were diluted with extra water or when
modified recipes were entered by individual components that included plain
drinking water. In these instances, drinking water was included in the file as
a component record, and the nutrients contributed from the drinking water were
included in the IFF and the Dietary Recall Total Nutrient Intakes portion of the
Examination File.
Respondents also were not asked to quantify the amount of salt added during
food preparation or at the table. A separate set of questions was administered
to determine categories of salt use at the table. This information was reported
in the Dietary Recall Total Nutrient Intakes portion of the Examination File.
Salt appears in the IFF as a component record for some foods that were reported
as having modified recipes. If a food was entered by ingredient-type food
components that included salt, a component record for salt was included in the
IFF. (Note: Also refer to the documentation for the Combination Foods File.)
Food Descriptions
1. Brand-name foods
The DDC system foods database contains more than 6,000 brand-name foods.
DDC system brand-name products are grouped into more than 30 food categories and
include commercial frozen entrees, "fast food" restaurant menu items, ready-to-
eat breakfast cereals, candy, fats and margarine, and juice drink beverages.
The brand name foods in the IFF have a USDA food code and a numeric brand
product code (DRPCOMM); DRPCOMM is linked to a look-up table called "Brands".
2. Generic foods
Generic foods in the IFF have USDA food codes assigned to them; the USDA
food codes are linked to a food code description in the look-up table called
"Codebook". Many generic foods have expanded food descriptions in the IFF.
The food identification code (DRPFID) variable is linked to an expanded food
description; the look-up table "IDCODE" contains the text descriptions for
DRPFID.
Two examples are provided to illustrate the use of food identification
codes (DRPFID). The first example is trout. The DDC system probes for trout
included several types of trout -- rainbow, brown, speckled, and so forth. The
USDA Codebook does not distinguish among types of trout but uses the same food
code for all varieties of trout. The food identification codes (DRPFID) in the
IFF can be used to distinguish between different types of trout that were
reported in the Survey. In this example, if a respondent reported eating
rainbow trout, the DRPFID would be a specific code for rainbow trout.
A second example relates to cuts of meat such as beefsteak. The DDC system
probes included the cut of steak reported -- sirloin, round, tenderloin, and so
forth. In summary, the food identification codes often provide more specific,
descriptive information for foods that have the same USDA food code.
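
The three look-up links described in this section (DRPFCODE to "Codebook",
DRPCOMM to "Brands", and DRPFID to "IDCODE") can be sketched as simple
dictionary look-ups. The code values and descriptions below are invented
placeholders, and the use of None for a missing brand or identification code is
an assumption; the data file item descriptions give the actual coding.

```python
# Hypothetical look-up tables keyed by the linking variable. The keys and the
# text descriptions are placeholders, not real codes from the data release.
CODEBOOK = {"CODE_TROUT": "Trout", "CODE_CEREAL": "Ready-to-eat cereal"}
BRANDS = {501: "Example brand cereal"}
IDCODE = {9001: "Rainbow trout"}


def describe_food(record):
    """Attach text descriptions to one component food record."""
    return {
        "food": CODEBOOK.get(record["DRPFCODE"]),      # basic USDA description
        "brand": BRANDS.get(record.get("DRPCOMM")),    # None if not a brand-name food
        "detail": IDCODE.get(record.get("DRPFID")),    # expanded description, if any
    }


trout = describe_food({"DRPFCODE": "CODE_TROUT", "DRPCOMM": None, "DRPFID": 9001})
cereal = describe_food({"DRPFCODE": "CODE_CEREAL", "DRPCOMM": 501, "DRPFID": None})
```

In the trout record, the Codebook description is generic and the DRPFID
look-up supplies the specific variety.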
Food Amount Information
The DDC system's computer data entry screens usually displayed several
options for entering food amount data. In general, the DDC system's food amount
options corresponded to the food amount options listed in the USDA SNDB
Codebook. In addition to weight and volume options, many foods could be
quantified by means of "food specific units" (FSUs). For foods such as whole
chicken parts, pork chops, commercial sliced bread, sliced luncheon meats, and
so forth, the FSU was the preferred method for quantifying such foods because
their dimensions were difficult to estimate. All DDC system food amount
entries, including the food models, volume amounts, and FSUs, were converted
into gram weights automatically during final data processing and preparation.
All food amounts in the IFF were reported as grams of food eaten.
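
As a rough illustration of the conversion step, the sketch below multiplies a
reported quantity by a gram weight per unit. The table GRAMS_PER_UNIT, the unit
names, and the gram weights are hypothetical; the actual DDC system conversion
factors are not reproduced here.

```python
# Hypothetical gram weights per unit. The numbers are invented and are not
# the conversion factors used in the DDC system.
GRAMS_PER_UNIT = {
    ("sliced bread", "slice"): 25.0,  # a food specific unit (FSU)
    ("pork chop", "chop"): 80.0,      # a food specific unit (FSU)
    ("milk", "cup"): 244.0,           # a volume amount
}


def to_grams(food, quantity, unit):
    """Convert a reported amount to grams of food eaten."""
    if unit == "gram":
        return float(quantity)  # weight entries need no conversion
    return quantity * GRAMS_PER_UNIT[(food, unit)]


bread_grams = to_grams("sliced bread", 2, "slice")
milk_grams = to_grams("milk", 0.5, "cup")
```

Because the IFF reports every amount in grams, analysts do not need to repeat
this step; the sketch only shows the kind of calculation that was performed.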
Unusually large amounts of food were verified during the dietary interview.
The DDC system's data quality control features included a "maximum amount check
verification screen" for each food item. This screen appeared whenever large
food-specific amounts of food and beverages were entered during the interview.
Interviewers were required to verify that the amount of food or beverage
reported was correct.
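
A check of this kind might look like the sketch below. The thresholds in
MAX_GRAMS are invented, and the rule that foods without a listed threshold are
never flagged is an assumption made for the example, not a documented feature
of the DDC system.

```python
# Hypothetical food-specific maximum amounts, in grams.
MAX_GRAMS = {"milk": 1000.0, "sliced bread": 300.0}


def needs_verification(food, grams, limits=MAX_GRAMS):
    """Return True when an entered amount exceeds the food's maximum."""
    limit = limits.get(food)
    # Assumption: a food with no listed maximum is never flagged.
    return limit is not None and grams > limit


flag_large_milk = needs_verification("milk", 1500.0)
flag_small_milk = needs_verification("milk", 244.0)
```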
Default Selections for Foods and Food Amounts
1. Default Selections for Foods
The DDC system was designed to collect specific information about foods,
yet respondents' knowledge about the foods they ate varied. When respondents
were unable to provide specific information about the foods they ate, the
dietary interviewers used the DDC system's default selection options to complete
data entry for foods reported during the survey. The DDC system had default
selection options for the type of food, ingredients used to prepare foods, and
food preparation methods. DDC system default options were available for many
home-prepared and commercially prepared foods. When the origin of the food
(i.e., commercially prepared or homemade) was unknown, a system default option
"unknown as to whether commercially prepared or homemade" was selected by the
interviewer. Default selections also were available for food preparation
methods and the ingredients used to prepare foods. The DDC system's default
food selections have USDA food codes associated with them that are linked to the
USDA SNDB files described earlier.
2. Default Food Amounts
Some foods were not quantified at the time of the dietary interview for
a number of reasons.
Example #1: Food amounts were known but were reported using an amount
option that was not available to the interviewer at the time of the interview.
Therefore, an amount could not be entered using the DDC system. The interviewer
noted the amount description provided by the respondent. NCHS and USDA staff
completed the research required to quantify these foods. New food amount
options were added to the DDC system throughout the survey.
Example #2: The respondent was unable to quantify the amount of food
consumed, but the food was from a small list of foods for which the interviewers
were permitted to calculate a default amount. The dietary interviewer initially
"flagged" the food amount as having an unknown amount. All information provided
by the respondent that could be used to calculate a default amount was recorded
by the interviewer. During the interviewer's edit, amounts of certain foods,
including sandwich condiments, catsup and barbecue sauce on meat, coffee
creamer, butter and margarine added to bread, and milk added to beverages and
cereal, were calculated. NCHS reviewed the interviewers' calculations to verify
that they were performed correctly.
Example #3: The amount of food consumed was unknown, and no default amount
standard existed for the food. This problem was most common when the recall
involved infants and young children who attended day care or school on the day
of the recall. The interviewers were instructed to flag the food as having an
unknown amount. In the meantime, the dietary interviewers attempted to obtain
information from day care providers, schools, etc. If the amount could not be
entered, NCHS assigned a default food amount. The default food amounts usually
were based on a "not further specified amount" for a similar food in the USDA
SNDB Codebook. NCHS developed editing guidelines that were used to assign food
amounts to many types of foods.
Examples #2 and #3 describe situations in which food was consumed but the
respondents could not quantify the amount eaten. In both instances, the
amount consumed was entered initially into the DDC system as an "unknown
amount." A food amount was assigned later. The default amount flag (DRPCAUF)
in the IFF denotes the foods described in examples #2 and #3 that had default
amounts assigned; if DRPCAUF=1, a default amount was assigned to the component
food.
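
Analysts who wish to gauge how much of an analysis rests on default amounts
can count the component records with DRPCAUF=1. The sketch below assumes the
records have already been read into dictionaries; the sample records, the
"food" field, and the value 0 used for unflagged records are placeholders, and
the file's actual coding for unflagged records may differ.

```python
# Hypothetical component records. Only DRPCAUF=1 is documented above; the 0
# used for unflagged records is a placeholder.
COMPONENT_RECORDS = [
    {"food": "milk added to cereal", "DRPCAUF": 1},
    {"food": "sliced bread", "DRPCAUF": 0},
    {"food": "apple", "DRPCAUF": 0},
    {"food": "coffee", "DRPCAUF": 0},
]


def default_amount_summary(records):
    """Return (count, share) of records that had a default amount assigned."""
    flagged = [record for record in records if record.get("DRPCAUF") == 1]
    total = len(records)
    share = len(flagged) / total if total else 0.0
    return len(flagged), share


flagged_count, flagged_share = default_amount_summary(COMPONENT_RECORDS)
```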
Food Preparation Information
The IFF includes information on food preparation methods and ingredients
used to prepare foods. The interview probes for food preparation methods varied
according to the type of foods reported. For example, the probes for vegetables
usually began with the name of the vegetable and whether it was eaten raw or if