Validation
Validation is a critical concept when determining whether a dietary assessment instrument is suitable for a particular research question. This concept involves several aspects: evaluating the validity of an instrument, understanding characteristics of those who misreport on an instrument, and considering what type of measurement error is present in the instrument.
Evaluating Validity
The validity of FFQs has been evaluated in several ways (see Key Concepts About Validation). The strongest class of validity studies is based on objective recovery biomarkers.
- Salient features of studies with recovery biomarkers:
  - Recovery biomarkers are ideal for validation because the intake of the dietary component is reflected by the biomarker in a relatively constant and known manner. Recovery biomarkers thus provide unbiased estimates of true intake (Learn More about Biomarkers).
  - Known recovery biomarkers are doubly labeled water (DLW) for energy intake, urinary nitrogen for protein intake, urinary potassium for potassium intake, and urinary sodium for sodium intake.
  - Studies using recovery biomarkers have generally been conducted in small, highly selective groups because of the expense involved. No data are available for children.
- Results of FFQ evaluation studies using DLW and 24-hour urine collections:
  - FFQs underestimate true intakes of energy and protein. Underestimates range widely, from 11% to 35% lower for energy and up to 30% lower for protein, depending on the particular study sample [1-5].
  - Two larger studies included FFQs: the Observing Protein and Energy Nutrition (OPEN) Study, which included 484 men and women, and the Nutrition and Physical Activity Assessment Study (NPAAS), which included 450 postmenopausal women from the Women's Health Initiative. In these studies, FFQs underestimated true energy intake by about 35% and 28%, and protein intake by about 31% and 9%, respectively [1, 3]. However, after energy adjustment, misestimation of protein was non-significant in OPEN, and protein was overestimated in NPAAS.
  - Sodium tends to be underreported by 10% to 20%. For potassium, underreporting is uncommon; most studies indicate overreporting in the range of 8% to 33% [6-9].
  - Because recovery biomarkers do not exist for nutrients other than energy, protein, potassium, and sodium, little is known about misreporting of other dietary components.
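The arithmetic behind these comparisons can be sketched in a few lines. The example below is a rough, group-level illustration only: it assumes that the percentages quoted above can be treated as simple ratios of reported to true intake, which is not how the cited studies analyzed their data. The function names and the 1,600/2,000 kcal participant are hypothetical.

```python
def percent_misreport(reported: float, reference: float) -> float:
    """Percent difference of a reported intake from a biomarker-based reference.

    Negative values indicate underreporting, positive values overreporting.
    """
    return 100.0 * (reported - reference) / reference


def density_bias_percent(energy_under_pct: float, nutrient_under_pct: float) -> float:
    """Approximate percent bias in a nutrient density (nutrient / energy).

    Assumption for illustration: group-level underestimates act as simple
    multiplicative factors on the true nutrient and energy intakes.
    """
    reported_nutrient = 1.0 - nutrient_under_pct / 100.0
    reported_energy = 1.0 - energy_under_pct / 100.0
    return 100.0 * (reported_nutrient / reported_energy - 1.0)


# Hypothetical participant: FFQ reports 1,600 kcal, DLW indicates 2,000 kcal.
example = percent_misreport(1600, 2000)

# Percentages quoted in the text for energy and protein underestimation.
open_density_bias = density_bias_percent(35, 31)
npaas_density_bias = density_bias_percent(28, 9)

print(f"Hypothetical participant: {example:.1f}%")
print(f"OPEN protein density bias:  {open_density_bias:+.1f}%")
print(f"NPAAS protein density bias: {npaas_density_bias:+.1f}%")
```

Under these simplifying assumptions, similar underreporting of protein and energy largely cancels in the ratio, while much smaller underreporting of protein than of energy turns into an apparent overestimate of protein density. This is consistent in direction with the energy-adjusted findings described for OPEN and NPAAS, but it is not a reproduction of either analysis.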
A second class of validity studies, which includes most of the validation studies, examines FFQ performance relative to other self-report instruments, such as 24HR and food records.
- Salient features of comparative studies:
  - Relative validation studies administer two or more self-report dietary instruments to the same population.
  - An FFQ is often compared to data from short-term instruments, such as 24HR and food records. Ideally, the short-term instrument is administered during the time period that the FFQ asks about.
  - Relative validation is imperfect, as no self-report instrument is totally accurate. Individual relative validity studies may be useful, for example, to learn whether two different instruments produce comparable results, but no overall judgment about FFQ validity can be made from this type of study.
  - Another weakness of relative validation is that errors in the two instruments are likely to be correlated, which typically results in overstatement of their agreement.
  - Many different FFQs have been developed. These instruments vary both by food items and by whether and how portion size questions are incorporated. However, few studies have compared the performance of different FFQs in the same population, or the performance of the same FFQ administered in different ways in the same population [10-11].
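The effect of correlated errors on apparent agreement can be illustrated with a small simulation. This is a minimal sketch under assumed conditions, not a model of any real instrument: true intake is standardized, each instrument carries a person-specific error and an independent random error, and all variances (1.0, 0.5, 0.5) are hypothetical choices. The only difference between the two scenarios is whether the person-specific error is shared by both instruments.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_agreement(shared_error: bool, n: int = 5000, seed: int = 1) -> float:
    """Correlation between a simulated FFQ and a simulated short-term instrument.

    Each report = true intake + person-specific error + independent random error.
    If shared_error is True, both instruments carry the SAME person-specific
    error (correlated errors); otherwise each draws its own.
    """
    rng = random.Random(seed)
    sd = math.sqrt(0.5)  # hypothetical error standard deviation
    ffq, recall = [], []
    for _ in range(n):
        truth = rng.gauss(0.0, 1.0)
        person_error_ffq = rng.gauss(0.0, sd)
        person_error_recall = person_error_ffq if shared_error else rng.gauss(0.0, sd)
        ffq.append(truth + person_error_ffq + rng.gauss(0.0, sd))
        recall.append(truth + person_error_recall + rng.gauss(0.0, sd))
    return pearson(ffq, recall)


corr_independent = simulate_agreement(shared_error=False)  # theory: 1.0 / 2.0 = 0.50
corr_shared = simulate_agreement(shared_error=True)        # theory: 1.5 / 2.0 = 0.75

print(f"Independent errors: r = {corr_independent:.2f}")
print(f"Correlated errors:  r = {corr_shared:.2f}")
```

Both simulated instruments are equally inaccurate in the two scenarios; only the sharing of errors changes. The higher correlation in the shared-error case therefore reflects agreement in error rather than better measurement of true intake.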
Understanding Misreporting
Misreporting on dietary assessment instruments can occur through either overreporting or underreporting of intakes. Knowledge of who is likely to misreport, and in which direction, is useful in interpreting FFQ results (Learn More about Misreporting).
Many studies have examined misreporting in relation to a variety of respondent characteristics. Underreporting of energy is more common than overreporting in the United States, but this pattern does not hold in all countries. Overall, the groups found to be particularly prone to energy underreporting on FFQs are those with higher body mass index [2, 12] (Learn More about Reactivity and Learn More about Social Desirability).
Considering Measurement Error
Measurement error refers to the difference between the true value of a parameter, such as true energy intake, and the value obtained from a particular measure, for example, energy reported on an FFQ (see Key Concepts About Measurement Error). There are two types of measurement error:
- Random error, which is an unpredictable type of error that contributes to variability, and
- Systematic error, which results in measurements that consistently depart from the true value in the same direction. Systematic error also is known as bias.
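The distinction between the two error types can be made concrete with a simple additive model, in which each report equals the true value plus a fixed bias plus random noise. The sketch below assumes this model purely for illustration; the intake, bias, and standard deviation values are hypothetical and are not taken from any study.

```python
import random
import statistics


def simulate_reports(true_intake, bias, random_sd, n, seed=0):
    """Simulate n reports as: true intake + systematic bias + random error."""
    rng = random.Random(seed)
    return [true_intake + bias + rng.gauss(0.0, random_sd) for _ in range(n)]


# Hypothetical values chosen only for illustration.
TRUE_KCAL = 2000.0   # true energy intake
BIAS_KCAL = -300.0   # systematic error: consistent underreporting
RANDOM_SD = 200.0    # random error: unpredictable scatter around the biased value

reports = simulate_reports(TRUE_KCAL, BIAS_KCAL, RANDOM_SD, n=10000)

# Averaging many reports cancels most of the random error, but not the bias.
mean_error = statistics.fmean(reports) - TRUE_KCAL
spread = statistics.stdev(reports)

print(f"Mean error (estimates the bias): {mean_error:.0f} kcal")
print(f"Spread of reports (random error): {spread:.0f} kcal")
```

In this model, collecting more reports shrinks the influence of random error on the average, while the systematic component remains in full. This is why random error mainly adds variability, whereas systematic error shifts results in one direction.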
Although day-to-day variation is non-existent in an FFQ because it asks about intake over a long time period, FFQ reports over months or years have some within-person random error. Studies have shown high but less than perfect correlations in successive FFQs [13].
The major type of measurement error for an FFQ is systematic error, which arises from limitations of the instrument itself. These include incomplete or inappropriate food lists and the difficulty respondents face in performing cognitively complex memory and averaging tasks.