Validation is a critical concept when determining whether a dietary assessment instrument is suitable for a particular research question. This concept involves several aspects: evaluating the validity of an instrument, understanding characteristics of those who misreport on an instrument, and considering what type of measurement error is present in the instrument.

Evaluating Validity

The validity of the 24HR has been evaluated in several ways (see Key Concepts About Validation). The strongest class of validity studies is based on objective recovery biomarkers.

  • Salient features of studies with recovery biomarkers:
    • Recovery biomarkers are ideal for validation because the intake of the dietary component is reflected by the biomarker in a relatively constant and known manner. Recovery biomarkers thus provide unbiased estimates of true intake.
    • Known recovery biomarkers are doubly labeled water (DLW) for energy intake, urinary nitrogen for protein intake, urinary potassium for potassium intake, and urinary sodium for sodium intake.
    • Studies using recovery biomarkers have generally been conducted in small, highly selected groups because of the expense involved. No data are available for children.
  • Results of 24HR evaluation studies using recovery biomarkers:
    • 24HR estimates of energy intake in Western populations generally underestimate true intake by 3% to 34%. The three largest studies in adults using an interviewer-administered multiple-pass method showed 12% to 23% underreporting [1-4].
    • Little information is available as yet for web-based self-administered 24HRs. However, one study in adults showed 9% underreporting of energy relative to DLW [5].
    • For protein, underreporting tends to be in the range of 11% to 28% [6].
    • Underreporting for sodium is in the range of 8% to 20%.
    • For potassium, some studies have found accurate reporting, and others have found overreporting in the range of 10% to 40%.
    • Because of a lack of recovery biomarkers for nutrients other than energy, protein, potassium, and sodium, little is known about misreporting on other dietary components.
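
The percentages above express the gap between reported intake and the biomarker-based value as a share of the biomarker-based value. A minimal sketch of that arithmetic follows; the function name and the intake values are hypothetical, chosen for illustration only and not taken from any cited study. Published studies may instead use ratios of means or log-scale differences.

```python
def percent_misreporting(reported: float, biomarker: float) -> float:
    """Return the percent deviation of reported intake from the biomarker value.

    Negative values indicate underreporting, positive values overreporting.
    """
    if biomarker <= 0:
        raise ValueError("biomarker-based intake must be positive")
    return 100.0 * (reported - biomarker) / biomarker


# Hypothetical values for illustration only (not from any cited study):
# mean energy reported on 24HRs vs. mean energy expenditure from DLW, kcal/day.
reported_kcal = 2100.0
dlw_kcal = 2500.0

deviation = percent_misreporting(reported_kcal, dlw_kcal)
print(f"{deviation:.1f}%")  # -16.0%, i.e., 16% underreporting
```

The sign convention (negative for underreporting) is a choice made for this sketch. The text above reports underreporting as a positive percentage.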

A second class of validity studies relies on independent and unobtrusive observation of the eating behaviors being reported on the 24HR (Learn More about Observation and Feeding Studies).

  • Salient features and results of observation studies:
    • In an observation study, one or more trained staff unobtrusively observe individuals during a meal while noting the foods and portions consumed. The observer may have access to a planned menu, weighed portions given to participants, and/or plate waste.
    • Many such studies have been performed on children in school meal settings, and they generally show large discrepancies between observed consumption and the consumption reported by the child [7].
    • Observation studies in adults are few in number and limited to select populations. They suggest that reported energy intake is similar to or lower than observed energy intake [8-9].

The third class of studies, which includes most of the validation studies, examines 24HR performance relative to other self-report instruments, such as food records.

  • Salient features and results of relative validation studies:
    • Relative validation studies administer two or more self-report dietary instruments to the same population, and often for the same or overlapping time periods.
    • Relative validation is imperfect, as no self-report instrument represents true intake. Although individual relative validity studies may be useful, for example, to learn whether two different instruments produce comparable results, no overall judgment about 24HR validity can be made from this type of study.
    • Another weakness of relative validation is that errors in the two instruments are likely to be correlated, which typically results in an overstatement of their agreement.
  • Relative validation studies have found that for interviewer-administered recalls, telephone administration is comparable to face-to-face administration [10-11]. For web-based self-administered recalls, limited research thus far indicates minimal differences between interviewer- and self-administered recalls among adults [12-13]. Larger differences have been found for children [14-15].
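
The point about correlated errors can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis of real data: true intake and all error terms are drawn from normal distributions with made-up standard deviations, and `shared_sd` is a hypothetical parameter for an error component common to both instruments (for example, a person's tendency to underreport on any self-report tool).

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n=5000, shared_sd=0.0, seed=1):
    """Simulate true intake and two self-report instruments.

    shared_sd is the SD of an error component present in both instruments.
    Returns (agreement between instruments, correlation of A with truth).
    """
    rng = random.Random(seed)
    truth, inst_a, inst_b = [], [], []
    for _ in range(n):
        t = rng.gauss(2000, 300)          # assumed true intake, kcal/day
        shared = rng.gauss(0, shared_sd)  # person-specific error in both
        truth.append(t)
        inst_a.append(t + shared + rng.gauss(0, 300))  # instrument-specific error
        inst_b.append(t + shared + rng.gauss(0, 300))
    return pearson(inst_a, inst_b), pearson(inst_a, truth)


agree_indep, valid_indep = simulate(shared_sd=0.0)
agree_corr, valid_corr = simulate(shared_sd=300.0)
print(f"independent errors: agreement={agree_indep:.2f}, validity={valid_indep:.2f}")
print(f"correlated errors:  agreement={agree_corr:.2f}, validity={valid_corr:.2f}")
```

Under these assumed variances, adding the shared error raises the correlation between the two instruments (in expectation from 0.50 to about 0.67) while lowering each instrument's correlation with true intake. Agreement between instruments can therefore overstate how well either one captures true intake.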

Understanding Misreporting

Misreporting on dietary assessment instruments can include both overreporting and underreporting of intake. Knowledge of who is likely to misreport, and in which direction, is useful in interpreting the 24HR results (Learn More about Misreporting).

Many studies have examined misreporting in relation to a variety of respondent characteristics. Underreporting of energy is more common than overreporting in the United States, but this pattern does not hold in all countries. Studies using recovery biomarkers have reported that respondents with higher body mass index (BMI) and women consistently underreport energy [6]. Factors less consistently related to underreporting include social desirability traits, restrained eating, education, literacy, perceived health status, and race/ethnicity (Learn More about Reactivity and Learn More about Social Desirability).

Considering Measurement Error

Measurement error refers to the difference between the true value of a parameter, such as true energy intake, and the value obtained from a particular measure, for example, energy reported on a 24HR (see Key Concepts About Measurement Error). There are two types of measurement error: within-person random error and systematic error.

24HR data are affected by day-to-day variation, a source of within-person random error. Although day-to-day variation does not reflect error per se in reporting intake for a given day, it is considered to be part of within-person random error from the perspective of estimating usual dietary intake distributions using data for a small number of days (see Choosing an Approach for Dietary Assessment). This type of error can be corrected with statistical modeling. Systematic error cannot be corrected with statistical modeling, but is thought to affect 24HR data to a smaller extent than within-person random error.
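
To show why day-to-day variation matters when only a few recall days are available, the sketch below simulates two recalls per person and separates between-person from within-person variance using balanced one-way ANOVA estimators. It is a deliberately simplified illustration, not the modeling approach used in practice: it assumes normally distributed intakes, no systematic error, and made-up standard deviations. The function name `variance_components` is hypothetical.

```python
import random
import statistics


def variance_components(recalls):
    """Estimate between- and within-person variance from repeat recalls.

    recalls: list of per-person lists, each holding the same number (>= 2)
    of daily intakes. Uses the balanced one-way ANOVA estimators.
    Returns (between-person variance, within-person variance,
    variance of the person means).
    """
    k = len(recalls[0])
    person_means = [statistics.fmean(days) for days in recalls]
    # Pooled within-person variance: mean of each person's sample variance.
    within = statistics.fmean([statistics.variance(days) for days in recalls])
    var_of_means = statistics.variance(person_means)
    # Person means still carry within/k of day-to-day noise; remove it.
    between = max(var_of_means - within / k, 0.0)
    return between, within, var_of_means


rng = random.Random(7)
recalls = []
for _ in range(2000):
    usual = rng.gauss(2000, 300)  # assumed usual intake, kcal/day
    # Two recall days: usual intake plus assumed day-to-day variation.
    recalls.append([usual + rng.gauss(0, 500) for _ in range(2)])

between, within, var_of_means = variance_components(recalls)
print(f"SD of 2-day means:          {var_of_means ** 0.5:.0f}")
print(f"estimated usual-intake SD:  {between ** 0.5:.0f}")
print(f"estimated within-person SD: {within ** 0.5:.0f}")
```

In this simulation the distribution of 2-day means is wider than the distribution of usual intake, because each mean still contains half of the within-person variance. Subtracting that component recovers a spread close to the assumed usual-intake SD of 300. A constant bias added to every recall would pass through this calculation unchanged, consistent with the statement that systematic error is not removed by this kind of modeling.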