Data Processing & Data Analysis
Data Processing Requirements
For frequency-type screeners producing quantitative estimates:
- A nutrient composition database is required to translate foods and beverages reported on the screener into nutrient intakes (Learn More about Food Composition Databases for Food Frequency Questionnaires and Screeners). Depending on the scope of the screener, a database such as the Food Patterns Equivalents Database (FPED) is also required to translate the reported foods and beverages into amounts of guidance-based food groups (e.g., dark-green vegetables, whole grains, and added sugars). For more information, read a factsheet on FPED products and associated data files or its application to dietary analysis.
- Software specific to each screener is required to derive nutrient and food group intakes from the respondent-reported information (Learn More about Software for Food Frequency Questionnaires and Screeners).
- Daily nutrient and/or Food Pattern equivalent estimates should be examined to identify outliers (Learn More about Outliers). Although outliers may indicate coding or processing errors, which should be identified and fixed if possible, they more likely indicate reporting errors, which may be addressed in the analysis.
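The processing steps for frequency-type screeners can be illustrated with a minimal sketch. It assumes a simplified scoring rule (times per day multiplied by portions per occasion and by nutrient content per portion) and uses Tukey fences to flag candidate outliers. The table `FIBER_PER_PORTION`, its values, and both function names are hypothetical; actual screeners rely on their own software and composition databases.

```python
import statistics

# Hypothetical nutrient content table (grams of fiber per standard portion).
# Real values would come from the screener's food composition database.
FIBER_PER_PORTION = {"fruit": 3.0, "vegetables": 4.0, "whole_grain_bread": 2.0}


def daily_nutrient(responses, per_portion):
    """Sum frequency x portions x nutrient-per-portion over all reported items.

    `responses` maps an item name to (times per day, portions per occasion).
    """
    return sum(
        freq * portions * per_portion[item]
        for item, (freq, portions) in responses.items()
    )


def flag_outliers(values, k=1.5):
    """Return the indices of values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# One respondent: fruit twice a day (1 portion), vegetables once a day (half portion).
respondent = {"fruit": (2.0, 1.0), "vegetables": (1.0, 0.5)}
fiber = daily_nutrient(respondent, FIBER_PER_PORTION)

# Daily estimates for a small group; flagged records are candidates for review.
group_estimates = [10, 12, 11, 13, 12, 11, 95]
suspect = flag_outliers(group_estimates)
```

A flagged value is only a prompt for review: it should first be checked for coding or processing errors, and otherwise treated as a possible reporting error to be handled in the analysis.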
For behavioral-type instruments:
- Scoring or indexing procedures must be developed and applied.
Data Analysis Considerations
For more details on the following issues, which bear on whether a screener is suitable for answering a particular research question, see Choosing an Approach for Dietary Assessment.
General Considerations
- For frequency-type screeners, strategies for handling missing information in the frequency and, if applicable, portion size questions should be applied consistently.
- Missing frequency information can be handled with several possible imputation methods, including assumption of zero intake, single imputation with the population's mode or median value, and model-based imputation [11].
- Missing portion size information is generally handled by imputation of standard mean or median values specific to the screener.
- Strategies for handling missing information have been developed for the NCI All-Day Fruit and Vegetable Screener.
- For behavioral-type instruments, rules for handling missing information must be developed and applied consistently.
- For instruments resulting in quantitative dietary intake estimates:
- Dietary intake estimates should be examined for skewed distributions (truncated at zero and skewed toward high intakes). Transformation of dietary data may be required for statistical testing and modeling. Log transformation is often used, but other transformations that may better approximate normal distributions should be considered.
- Depending on the study design, nuisance effects, such as the season in which the screener is administered (Learn More about Season Effect) and mode of administration, can be taken into consideration through modeling.
- If the screener has been administered more than once, within-person random error can be corrected by statistical modeling.
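Two of the general considerations above, single imputation of missing frequency responses and transformation of skewed intake estimates, can be sketched as follows. This is an illustration under stated assumptions rather than a recommended procedure: the function names are hypothetical, missing values are represented as `None`, and the `offset` added before taking logs is an assumed convenience for zero intakes whose value would need to be chosen and justified for a given study.

```python
import math
import statistics


def impute_median(values):
    """Replace missing (None) entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


def log_transform(values, offset=0.0):
    """Natural-log transform of intake estimates.

    `offset` is an assumed constant added so that zero intakes are defined;
    with offset=0.0 any zero or negative value raises ValueError.
    """
    return [math.log(v + offset) for v in values]


# Reported times per day for one screener item; one respondent skipped it.
frequencies = [1.0, None, 3.0, 2.0]
completed = impute_median(frequencies)

# Skewed daily intake estimates, including a zero, moved to the log scale.
intakes = [0.0, 0.5, 1.0, 8.0]
logged = log_transform(intakes, offset=1.0)
```

Whichever rule is chosen, applying it identically to every respondent is what keeps the resulting estimates comparable.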
Guidance for Specific Research Objectives
- If your research objective is to estimate the mean intakes of a group, and you have conducted an internal calibration sub-study using a less biased instrument, statistical adjustment can be performed to reduce bias in data from the screener. This is done by applying calibration equations arising from the calibration sub-study to the screener estimates (Learn More about Calibration). Alternatively, data from an external source (called an external calibration study) can be used. If you have not conducted a calibration study, scoring algorithms developed in another population may be used (Learn More about Scoring Algorithms for Screeners).
- If your research objective is to estimate the usual dietary intake distributions for a group (for example, for the purpose of examining percentiles or estimating the proportion above or below some threshold), the use of a screener alone is not recommended (Learn More about Usual Dietary Intake). Distributions estimated from a screener (and an FFQ) are narrower than true distributions. Thus, prevalence estimates in the tails of the distribution are biased. However, procedures have been developed that may correct for this bias by using information from an internal calibration sub-study in which 24-hour recalls (24HRs) are administered. Alternatively, data from an external source (called an external calibration study) can be used. More research is needed to test these new methods.
- If your research objective is to analyze the association between diet as an independent variable and another variable (e.g., diet at baseline and onset of cancer), and you have conducted an internal calibration sub-study, resulting regression calibration equations can be applied to the screener estimates and used in the analyses (Learn More about Regression Calibration). This may lead to greater precision in the estimates of the associations. Alternatively, data from an external calibration study can be used.
- If your research objective is to analyze the association of an independent variable (e.g., socioeconomic status) and diet as the dependent variable, variables known to affect quality of report (e.g., body mass index) should be included as covariates in analyses.
- If your research objective is to analyze changes in diet as a result of an intervention (e.g., to evaluate the effectiveness of an educational program to encourage fruit and vegetable intake), analysis of objective data alone (e.g., biomarkers) may yield results with the least bias.
- If you have not conducted a calibration sub-study, scoring algorithms developed in another population may be used.
- If you have conducted an internal calibration sub-study using less biased measures such as 24-hour recalls, food records, or recovery biomarkers, statistical techniques can be used to improve screener estimates. Alternatively, data from an external calibration study can be used.
- If less biased data are available from an internal calibration sub-study, calibration equations should be estimated for each treatment group and, if relevant, each time period and applied to the screener estimates. This calibration would yield less bias in the means. However, differential response bias still may be problematic. If social desirability questions also have been collected, the resulting score may be useful to at least partially control for differential response bias (Learn More about Social Desirability).
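The idea of applying a calibration equation from a sub-study to screener estimates can be sketched in its simplest form. The example below assumes a single-predictor linear equation (reference intake = intercept + slope x screener intake) fitted by ordinary least squares; this is a simplification, since calibration equations used in practice typically include covariates and, for intervention studies, are estimated separately by treatment group and time period. The function names and all data values are hypothetical.

```python
import statistics


def fit_calibration(screener, reference):
    """Least-squares fit of reference = intercept + slope * screener.

    `screener` and `reference` are paired values from sub-study participants
    who completed both the screener and the less biased instrument.
    """
    mean_x = statistics.fmean(screener)
    mean_y = statistics.fmean(reference)
    sxx = sum((x - mean_x) ** 2 for x in screener)
    if sxx == 0:
        raise ValueError("screener values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(screener, reference))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def apply_calibration(screener, intercept, slope):
    """Apply the fitted equation to screener estimates from the full sample."""
    return [intercept + slope * x for x in screener]


# Hypothetical sub-study data (servings per day).
sub_screener = [1.0, 2.0, 3.0, 4.0]
sub_reference = [1.5, 2.0, 2.5, 3.0]
intercept, slope = fit_calibration(sub_screener, sub_reference)

# Calibrated estimates for main-study participants who completed only the screener.
calibrated = apply_calibration([0.5, 2.5, 5.0], intercept, slope)
```

In an intervention study, the same two steps would be repeated within each treatment group (and time period, if relevant), which reduces bias in the group means but does not by itself remove differential response bias.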