Evaluating the Measurement Error Structure of Self-Report Dietary Assessment Instruments

Recovery biomarkers have been used in validation studies to estimate error in data from self-report dietary assessment instruments (Learn More about Biomarkers). A recovery biomarker is a specific biologic product that is directly related to intake and not subject to homeostasis or substantial inter-individual differences in metabolism, thus allowing for an estimate of intake without systematic bias. Only a limited number of recovery biomarkers are currently known, including doubly labeled water (DLW) for energy intake, urinary nitrogen for protein intake, urinary potassium for potassium intake, and urinary sodium for sodium intake.

Two other kinds of biomarkers used in dietary assessment are concentration biomarkers, which are related to dietary intakes but subject to substantial inter-individual differences in metabolism, and predictive biomarkers, whose relationship with intake is not as strong as that of recovery biomarkers but is relatively stable, time-related, and sensitive to intake in a dose-response manner. Although predictive biomarkers may be affected by systematic error, after appropriate calibration based on feeding studies, they can be used to adjust for measurement error in a manner similar to recovery biomarkers [2-3]. In general, however, concentration biomarkers without a stable relationship with true intake cannot be used for this purpose.

Evaluating 24-hour Dietary Recalls and Food Frequency Questionnaires

Several large validation studies using recovery biomarkers have been conducted in recent years, including the National Cancer Institute’s Observing Protein and Energy Nutrition (OPEN) [4-6], the Women’s Health Initiative Nutrition Biomarkers Study (NBS) [7], the Nutrition and Physical Activity Assessment Study (NPAAS) [8], the United States Department of Agriculture’s Automated Multiple-Pass Method (AMPM) Validation Study [9], and the UCLA Energetics Study [10]. These studies have shed considerable light on the nature of measurement error in data collected using 24HRs, food records, and FFQs [11].

In OPEN, investigators assessed the measurement error structure of two non-consecutive 24HRs and an FFQ. Recovery biomarkers used to estimate true intake included DLW for energy and urinary nitrogen for protein. Statistical modeling was then used to assess the structure of measurement error in the 24HR and FFQ. The findings suggested that data from a 24HR have larger within-person random error but smaller systematic error than data collected using an FFQ. The error structure of the instruments relates to their characteristics (see the individual dietary assessment instruments in the Instrument Profiles section of the Primer). For example, the within-person random error in a 24HR is primarily driven by day-to-day variation in intake and other random errors that affect reporting from day to day, whereas the systematic error in an FFQ is primarily driven by inaccuracies associated with the cognitive challenge of recalling long-term intake as well as features of the instrument, such as the finite food list and the relative lack of detail about foods consumed.
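One common way to write down the error components described above is sketched below. It is a generic illustration under assumed notation, not necessarily the exact model fitted in OPEN; all symbols are introduced here for illustration only.

```latex
% Assumed notation: i indexes persons, j indexes repeat administrations.
% T_i    : true usual intake of person i
% Q_{ij} : self-reported intake (24HR or FFQ)
% M_{ij} : recovery biomarker measurement
\begin{align}
Q_{ij} &= \beta_0 + \beta_1 T_i + u_i + \varepsilon_{ij}, \\
M_{ij} &= T_i + e_{ij}.
\end{align}
```

In this sketch, the intercept and slope (\beta_0, \beta_1) and the person-specific bias u_i together represent systematic error, while \varepsilon_{ij} represents within-person random error. The biomarker error e_{ij} is assumed to have mean zero and to be independent of T_i and of the self-report errors, which is what allows the biomarker to stand in for true intake. Read in these terms, the OPEN findings correspond to a larger variance of \varepsilon_{ij} for a 24HR and larger systematic components for an FFQ.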

OPEN also showed that the correlations between intake of nutrients measured using multiple 24HRs and true intake are higher than those between intake measured using an FFQ and true intake, although after energy adjustment (e.g., using protein density instead of absolute protein), the correlations using an FFQ were comparable to those using multiple 24HRs (Learn More about Energy Adjustment). Further, attenuation factors for FFQ-reported absolute nutrients (not densities) are closer to zero than those for 24HRs. This indicates that estimated associations between a dietary exposure and a health outcome are attenuated to a greater extent when FFQ data are used (but again, not after energy adjustment) [11].
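The two quantities discussed above can be illustrated with a minimal Python sketch. It assumes the attenuation factor is taken as the slope from regressing true intake on reported intake, and it simulates purely random (classical) reporting error with no systematic component; the function names, sample size, and error variance are hypothetical choices for illustration. In a real validation study, the recovery biomarker would stand in for true intake.

```python
import random


def _mean(xs):
    return sum(xs) / len(xs)


def _cov(xs, ys):
    """Sample covariance; _cov(xs, xs) is the sample variance."""
    mx, my = _mean(xs), _mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def attenuation_factor(true_intake, reported):
    """Slope of the regression of true intake on reported intake."""
    return _cov(true_intake, reported) / _cov(reported, reported)


def correlation(true_intake, reported):
    """Pearson correlation between true and reported intake."""
    denom = (_cov(true_intake, true_intake) * _cov(reported, reported)) ** 0.5
    return _cov(true_intake, reported) / denom


def simulate(n, error_sd, n_repeats, seed=0):
    """Simulate standardized true intake and the mean of n_repeats reports.

    Assumption: each report equals true intake plus independent random
    error (no systematic error), which is a simplification.
    """
    rng = random.Random(seed)
    true_intake = [rng.gauss(0.0, 1.0) for _ in range(n)]
    reported = [
        t + _mean([rng.gauss(0.0, error_sd) for _ in range(n_repeats)])
        for t in true_intake
    ]
    return true_intake, reported


# One report per person versus the average of four reports per person.
single = simulate(20000, 1.0, 1)
averaged = simulate(20000, 1.0, 4)
lam_single = attenuation_factor(*single)
lam_avg = attenuation_factor(*averaged)
print(round(lam_single, 2), round(lam_avg, 2))
```

Under these assumptions the attenuation factor is 0.5 in expectation for a single report and 0.8 for the average of four, which illustrates why averaging multiple 24HRs moves the factor away from zero. Systematic error, which this sketch omits, is not reduced by averaging.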

Results from these recovery biomarker studies also show that correlation coefficients and attenuation factors for absolute intakes across nutrients vary. For both FFQ and 24HR data, absolute reported intakes of energy [11] and sodium [12] have correlation coefficients and attenuation factors at a level so low that calibration of reported intakes or adjustment of relative risks using regression calibration may not be possible. Absolute reported intakes of protein and potassium, however, have correlation coefficients and attenuation factors sufficient for use in calibrating intakes (Learn More about Calibration) and for regression calibration (Learn More about Regression Calibration) [6,12].
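As a rough illustration of the two uses mentioned above, the following Python sketch fits a linear calibration equation in a hypothetical biomarker substudy and then applies a simple attenuation correction to an observed log relative risk. The function names and all numbers are made up for illustration, and the single-covariate linear form is an assumption; real applications typically include additional covariates and account for uncertainty in the calibration equation.

```python
def fit_calibration(reported, biomarker):
    """Least-squares line: biomarker ~ intercept + slope * reported.

    When biomarker error is independent of reporting error, this slope
    also estimates the attenuation factor.
    """
    n = len(reported)
    mean_q = sum(reported) / n
    mean_m = sum(biomarker) / n
    sxx = sum((q - mean_q) ** 2 for q in reported)
    sxy = sum((q - mean_q) * (m - mean_m) for q, m in zip(reported, biomarker))
    slope = sxy / sxx
    intercept = mean_m - slope * mean_q
    return intercept, slope


def calibrate(reported_value, intercept, slope):
    """Predicted (calibrated) intake for a self-reported value."""
    return intercept + slope * reported_value


def deattenuate(observed_log_rr, attenuation):
    """Divide an observed log relative risk by the attenuation factor."""
    if attenuation <= 0:
        raise ValueError("attenuation factor must be positive")
    return observed_log_rr / attenuation


# Hypothetical substudy: self-reported and biomarker-based protein (g/day).
reported_protein = [60.0, 70.0, 80.0, 90.0]
biomarker_protein = [70.0, 74.0, 78.0, 82.0]

intercept, slope = fit_calibration(reported_protein, biomarker_protein)
calibrated = calibrate(100.0, intercept, slope)
corrected_log_rr = deattenuate(0.10, slope)
print(intercept, slope, calibrated, corrected_log_rr)
```

Because the correction divides by the attenuation factor, a factor near zero inflates both the corrected estimate and its uncertainty, which is one way to see why very low factors, as reported for absolute energy and sodium, can make regression calibration impractical.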

Although absolute energy intake is measured poorly in self-report dietary assessment instruments, recovery biomarker studies show that using it to create energy-adjusted variables for self-reported protein, sodium, and potassium intakes is very useful, improving both the correlation coefficients and the attenuation factors on which calibration and regression calibration depend. These studies also provide data about the mean percentage of misreporting for absolute intakes of energy, protein, sodium, and potassium for different dietary assessment instruments. Energy is seriously misreported on both 24HRs and FFQs (underreporting of 6% to 26% of energy for 24HRs and 24% to 33% of energy for FFQs). Results for absolute protein are more variable for both 24HRs and FFQs (24HRs range from 13% underreporting to 19% overreporting; FFQs range from 29% underreporting to 4% overreporting) [11]. With so few available recovery biomarkers, the extent to which results pertaining to them apply to other nutrients and dietary components is unknown.
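The arithmetic behind these two ideas can be sketched in a few lines of Python. The sketch assumes misreporting is summarized as the percent difference between the mean reported intake and the mean biomarker-based intake on the arithmetic scale (published analyses may work on the log scale instead), and it expresses protein density as the percent of energy from protein using the general factor of 4 kcal per gram. Function names and example values are hypothetical.

```python
KCAL_PER_GRAM_PROTEIN = 4.0  # general Atwater factor for protein


def percent_misreporting(reported, biomarker):
    """Percent difference of mean reported intake from mean biomarker intake.

    Negative values indicate underreporting; positive values indicate
    overreporting.
    """
    mean_reported = sum(reported) / len(reported)
    mean_biomarker = sum(biomarker) / len(biomarker)
    return 100.0 * (mean_reported - mean_biomarker) / mean_biomarker


def protein_density(protein_g, energy_kcal):
    """Percent of energy from protein (an energy-adjusted variable)."""
    return 100.0 * protein_g * KCAL_PER_GRAM_PROTEIN / energy_kcal


# Hypothetical energy data (kcal/day): self-report versus DLW.
reported_energy = [1800.0, 2000.0]
dlw_energy = [2400.0, 2600.0]
energy_bias = percent_misreporting(reported_energy, dlw_energy)

# If protein and energy were both underreported by 20% (an assumption
# made only for illustration), the density would be unchanged.
true_density = protein_density(75.0, 2000.0)
reported_density = protein_density(60.0, 1600.0)
print(energy_bias, true_density, reported_density)
```

The second part of the example shows one intuition for why energy adjustment can help: errors that affect protein and energy proportionally cancel in the ratio. Actual reporting errors need not be proportional, so the cancellation is only partial in practice.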

Evaluating Food Records

Most of the large biomarker-based validation studies conducted to date have focused on 24HR and FFQ data, except for NPAAS [8], which analyzed data from an unweighted 4-day food record. In this study, mean underreporting of absolute intake based on food records was 20% for energy and 4% for protein.

NPAAS also found that 24HRs and food records captured truth, as measured by biomarkers, for energy and protein better than did an FFQ. Based on the characteristics of the instruments (i.e., focused on detailed data for one or a small number of days), it is logical to assume that food records are similar to 24HRs in terms of containing substantial within-person random error.

However, the source of systematic error in food record data is likely different from that for 24HR data. Food records may have fewer missing foods and beverages than 24HRs because foods are reported as they are consumed. On the other hand, food records will likely contain greater systematic errors related to changes in diet resulting from the act of recording (i.e., reactivity) and errors due to the uneven quality of reporting across participants. To date, fewer studies have examined the measurement error properties of dietary data collected using food records than of data collected using 24HRs.

Evaluating Screeners

Brief instruments, such as screeners, do not assess total energy or protein intake, and thus it is not possible to examine their measurement error properties relative to recovery biomarkers. However, it is reasonable to assume that FFQ-type screeners are similar to complete FFQs with regard to measurement error structure, i.e., more systematic error than within-person random error.

Summary and Conclusions

Overall, the current evidence suggests that data collected using 24HRs, though not free of systematic error, provide less biased estimates of intake than do FFQ data. 24HRs are thus the preferred assessment tool for most purposes (see Choosing an Approach for Dietary Assessment). Further investigation using recovery biomarkers is needed to assess bias in data collected using food records. FFQs do a reasonable job of relating energy-adjusted nutrients, but not absolute intakes, to health outcomes. However, regression calibration techniques can help to reduce bias in estimated diet-health outcome relationships when FFQs are used as the primary dietary assessment instrument (Learn More about Regression Calibration), such as in ongoing cohort studies. Even in this case, though, further improvement in FFQs is needed to increase our ability to detect diet-health outcome relationships. FFQs are not useful for estimating population distributions of intake (see Choosing an Approach for Dietary Assessment).

Improved understanding of the measurement error in data collected using different instruments through biomarker-based validation studies has spurred efforts to increase the feasibility of collecting 24HR data in large studies. For example, the Automated Self-Administered 24-Hour Dietary Assessment Tool (ASA24) has been created to eliminate the need for an interviewer and coding of intakes. Statistical modeling methods have also advanced to support the use of 24HR data in studies aimed at surveillance and at estimating relationships between diet and health. Improved understanding of the characteristics and error structure of the instruments has also led to strategies to use them in combination to take advantage of the strengths and minimize the weaknesses of each (see Choosing an Approach for Dietary Assessment).

An ongoing project that is pooling data from several biomarker-validation studies will help to further advance our knowledge in this area and potentially lead to improved approaches for minimizing and accounting for measurement error in dietary intake data. Updates to this website will incorporate findings of this project and similar initiatives as the evidence base evolves.