Validation Using Unbiased Reference Instruments

Recovery biomarkers

Recovery biomarkers include urinary nitrogen, urinary potassium, urinary sodium, and doubly labeled water (DLW). They measure true usual intake with only within-person random error (i.e., without systematic error), and that error is independent of the error in self-report. Therefore, recovery biomarkers can be used as unbiased references in validation studies to explore the relationship between reported intake and true usual intake. As noted above, measurement error characteristics of interest, such as bias, correlation coefficients between self-reported and true usual intakes, and attenuation factors, can be calculated to quantify this relationship [1]. Recovery biomarkers have been used to assess error in 24HRs, food records, and FFQs [1-4].
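As a rough illustration of two of these quantities, the sketch below computes a mean bias and an attenuation factor from paired self-report and biomarker values. It assumes a simple setup that is not spelled out in the text: one self-report value and one biomarker value per person, bias defined as the difference in group means, and the attenuation factor estimated as the slope from regressing the biomarker on the self-report. The function names and the data are hypothetical.

```python
from statistics import mean


def covariance(x, y):
    """Sample covariance with an n - 1 denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def bias_and_attenuation(reported, biomarker):
    """Return (mean bias, attenuation factor) for paired measurements.

    Assumption: the biomarker is unbiased for true intake and its error is
    independent of the self-report, so the slope of biomarker on self-report
    estimates the slope of true intake on self-report.
    """
    mean_bias = mean(reported) - mean(biomarker)
    attenuation = covariance(reported, biomarker) / covariance(reported, reported)
    return mean_bias, attenuation


# Made-up energy intakes (kcal/day) for five hypothetical participants.
reported_kcal = [1800, 2000, 2200, 2400, 2600]
dlw_kcal = [2200, 2300, 2500, 2600, 2900]

mean_bias, attenuation = bias_and_attenuation(reported_kcal, dlw_kcal)
print(mean_bias, attenuation)
```

With these illustrative numbers the self-report underestimates the biomarker by 300 kcal/day on average, and the attenuation factor is 0.85. In practice, intakes are often log-transformed and covariates are included, which this sketch omits.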

For studies that aim to assess the validity of screeners, which assess usual intake of a limited number of nutrients or food groups, options for validation using recovery biomarkers are limited. Screeners do not assess total energy intake, and potassium and sodium are generally too widely distributed in the diet to be captured by the small number of questions on a short instrument. Although it may be possible to evaluate screener-estimated protein intake using urinary nitrogen, evaluation of screeners almost always relies on imperfect reference instruments.

Observation and feeding studies

For 24HRs, validation studies have been conducted using observation, usually in structured school or institutional settings, with one or more eating occasions observed by trained staff [5-6]. For 24HRs and food records, feeding studies in which all foods and beverages consumed over a defined period are unobtrusively weighed and recorded have also been used [7-8]. Analyses of such data include comparisons of mean intakes of energy, nutrients, and food groups based on true and self-reported data; correlations between true and reported intake values; examination of the proportions of items accurately reported and excluded; and comparisons between true and reported serving sizes. These types of studies are typically not used to evaluate FFQs because observation and feeding studies are often limited to one day or a small number of days.
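The item-level comparison can be sketched as a set comparison between what was observed and what was reported. This is a minimal illustration, assuming items have already been coded to a common food list so that exact name matching is meaningful; real studies need rules for near matches. The function name, the rate definitions, and the example meal are hypothetical.

```python
def compare_items(observed, reported):
    """Summarize agreement between observed and reported food items.

    Rates use the number of observed items as the denominator (an assumption
    made for this sketch).
    """
    observed, reported = set(observed), set(reported)
    matches = observed & reported          # observed and reported
    excluded = observed - reported         # observed but not reported
    not_observed = reported - observed     # reported but not observed
    return {
        "match_rate": len(matches) / len(observed),
        "exclusion_rate": len(excluded) / len(observed),
        "reported_not_observed": sorted(not_observed),
    }


# Made-up school lunch: four items observed, three items reported.
observed_items = ["milk", "apple", "pizza", "carrots"]
reported_items = ["milk", "pizza", "cookie"]

summary = compare_items(observed_items, reported_items)
print(summary)
```

Here half of the observed items were reported, half were excluded, and one reported item ("cookie") was never observed.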

Design and Analysis Considerations

The following are a few important design and analysis considerations for conducting validation studies with unbiased reference instruments, regardless of the self-report instrument to be evaluated:

  • In studies that make use of recovery biomarkers to assess true usual dietary intake, multiple administrations of the instruments are required to assess within-person random error (also known as within-person variability). Knowledge of the within-person random error is essential for estimating the correlation coefficient between self-reported and true usual intakes.
  • When using recovery biomarkers to assess usual intakes, include two administrations of the main instrument (even an FFQ), before and after administering the biomarker, if possible.
  • Because DLW represents average energy intake over a two-week period, a single administration on the entire sample and a repeat (separated by several months) on a subsample are adequate for capturing within-person random error. If using 24-hour urine collections to assess true protein, potassium, and/or sodium intakes, two or more 24-hour urines, separated by at least a few days or by months, are sufficient. When using recovery biomarkers to assess how well short-term instruments, such as 24HRs or food records, assess short-term (rather than usual) intake, collecting biomarker and dietary data for the same time period is useful [9].
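To show why repeat biomarker measurements matter, the sketch below estimates within-person variance and the correlation between self-report and true usual intake. It assumes a classical model that the text does not state explicitly: each biomarker value equals true usual intake plus independent random error, every participant has two biomarker measurements (rather than a repeat on a subsample), and there is no systematic shift between the two occasions. Under those assumptions the covariance between the two repeats estimates the between-person variance of true intake. All names and numbers are hypothetical.

```python
from math import sqrt
from statistics import mean


def covariance(x, y):
    """Sample covariance with an n - 1 denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def usual_intake_summary(self_report, biomarker_1, biomarker_2):
    """Return (within-person variance, between-person variance, correlation)."""
    n = len(self_report)
    # Half the mean squared difference between repeats estimates error variance.
    within_var = sum((a - b) ** 2 for a, b in zip(biomarker_1, biomarker_2)) / (2 * n)
    # Independent errors drop out of the covariance between repeats.
    between_var = covariance(biomarker_1, biomarker_2)
    biomarker_mean = [(a + b) / 2 for a, b in zip(biomarker_1, biomarker_2)]
    # Correlation with TRUE usual intake: the denominator uses between_var,
    # not the larger variance of a single error-prone biomarker value.
    corr = covariance(self_report, biomarker_mean) / sqrt(
        covariance(self_report, self_report) * between_var
    )
    return within_var, between_var, corr


# Made-up values for five hypothetical participants (arbitrary units).
self_report = [2, 1, 3, 5, 4]
urine_1 = [10, 12, 14, 16, 18]
urine_2 = [12, 10, 16, 14, 20]

within_var, between_var, corr = usual_intake_summary(self_report, urine_1, urine_2)
print(within_var, between_var, corr)
```

With a single biomarker measurement per person, the within-person and between-person components cannot be separated, so the correlation with true usual intake would be understated.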

In general, validation studies using unbiased reference measures show that, for absolute intakes, data collected using 24HRs and food records have less error than those collected using FFQs [1]. Findings using a recovery biomarker for protein density, based on protein intake from 24-hour urines and energy intake from DLW, show that energy adjustment improves validity substantially [1, 3-4].
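A biomarker-based protein density can be sketched as follows. The conversion constants are assumptions brought in for illustration and are not given in the text: protein is taken as 6.25 times nitrogen, about 81% of nitrogen intake is assumed to be recovered in urine, and protein is assumed to supply 4 kcal per gram. The function name and the example values are hypothetical.

```python
NITROGEN_TO_PROTEIN = 6.25   # assumed g protein per g nitrogen
URINARY_RECOVERY = 0.81      # assumed fraction of nitrogen intake excreted in urine
KCAL_PER_G_PROTEIN = 4.0     # assumed energy value of protein


def protein_density_pct(urinary_nitrogen_g, dlw_energy_kcal):
    """Percent of energy from protein, from urinary nitrogen and DLW energy."""
    protein_g = NITROGEN_TO_PROTEIN * urinary_nitrogen_g / URINARY_RECOVERY
    return 100.0 * KCAL_PER_G_PROTEIN * protein_g / dlw_energy_kcal


# Made-up participant: 12.96 g/day urinary nitrogen, 2500 kcal/day from DLW.
biomarker_density = protein_density_pct(12.96, 2500.0)

# The same person's hypothetical self-report understates both protein (80 g vs
# about 100 g) and energy (2000 vs 2500 kcal) by the same proportion.
reported_density = 100.0 * KCAL_PER_G_PROTEIN * 80.0 / 2000.0

print(biomarker_density, reported_density)
```

In this contrived case the reported protein and energy are each about 20% too low, yet the reported density matches the biomarker density at about 16% of energy. This is one way to see how errors that are shared between a nutrient and energy can partly cancel after energy adjustment.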