National Environmental Public Health Tracking Network
Downscaler Ozone Metadata — Census Tract Level


Publication Date


01/11/2017


Background


The Downscaler ozone dataset provides the output from a Bayesian space-time
downscaling fusion model called Downscaler (DS) that combines ozone monitoring data
from the US EPA Air Quality System (AQS) repository of ambient air quality data (e.g.,
National Air Monitoring Stations/State and Local Air Monitoring Stations (NAMS/SLAMS))
and simulated ozone data from the deterministic prediction model, Models-3/Community
Multiscale Air Quality (CMAQ). The files contain estimates of the mean
prediction and associated standard error for each of the 2010 US Census Tracts within
the contiguous US for each day of the modeling year.


The data are intended for use by professionals comparing air quality and health
outcomes through techniques such as case crossover analysis. Other uses may be
developed at a later time. The standard errors of the predictions should be taken into
account when using the results.
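
One simple way to take the standard errors into account is to report an approximate interval around each prediction. The sketch below assumes the predictive distribution is roughly normal, which this metadata does not state; the function name and the example numbers are illustrative only.

```python
from statistics import NormalDist


def approx_interval(pred_ppb, std_err_ppb, level=0.95):
    """Return (lower, upper) for an approximate normal-theory interval.

    Assumption (not from the metadata): the prediction error is roughly
    normal, so the interval is pred +/- z * standard error.
    """
    if std_err_ppb < 0:
        raise ValueError("standard error must be non-negative")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    # Concentrations cannot be negative, so clip the lower bound at zero.
    lower = max(0.0, pred_ppb - z * std_err_ppb)
    upper = pred_ppb + z * std_err_ppb
    return lower, upper


# Invented example values: a 40 ppb prediction with a 5 ppb standard error.
lower, upper = approx_interval(40.0, 5.0)
print(f"{lower:.1f} to {upper:.1f} ppb")
```

A wide interval signals a tract-day where the prediction is weakly supported, for example far from any monitor, and such records may deserve less weight in a health analysis.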


Data Values


The dataset includes nine variables:


STATEFIPS: State FIPS code

COUNTYFIPS: County FIPS code

CTFIPS: Census tract FIPS code

LATITUDE: Latitude of census tract centroid (degrees)

LONGITUDE: Longitude of census tract centroid (degrees)

YEAR: Year of prediction

DATE: Date (day-month-year) of prediction

DS_O3_PRED: Mean estimated 8-hour average ozone concentration in parts per billion
(ppb) within 3 meters of the surface of the earth

DS_O3_STDD: Standard error of the estimated ozone concentration
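
As a minimal sketch of how the nine variables might be read, the example below parses a comma-separated extract into typed records. The delimiter, the sample row, and the exact formats of CTFIPS and DATE are assumptions made for illustration; they are not confirmed by this metadata. FIPS codes are kept as strings so that leading zeros survive.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class OzoneRecord:
    statefips: str      # kept as text to preserve leading zeros
    countyfips: str
    ctfips: str
    latitude: float     # degrees, tract centroid
    longitude: float    # degrees, tract centroid
    year: int
    date: str           # left unparsed: exact date format is not documented here
    ds_o3_pred: float   # ppb
    ds_o3_stdd: float   # standard error of the prediction


def parse_rows(text):
    """Parse CSV text whose header uses the variable names listed above."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        OzoneRecord(
            statefips=row["STATEFIPS"],
            countyfips=row["COUNTYFIPS"],
            ctfips=row["CTFIPS"],
            latitude=float(row["LATITUDE"]),
            longitude=float(row["LONGITUDE"]),
            year=int(row["YEAR"]),
            date=row["DATE"],
            ds_o3_pred=float(row["DS_O3_PRED"]),
            ds_o3_stdd=float(row["DS_O3_STDD"]),
        )
        for row in reader
    ]


# Invented sample row, for illustration only.
SAMPLE = (
    "STATEFIPS,COUNTYFIPS,CTFIPS,LATITUDE,LONGITUDE,YEAR,DATE,DS_O3_PRED,DS_O3_STDD\n"
    "01,001,01001020100,32.47,-86.49,2014,01JAN2014,35.2,4.1\n"
)
records = parse_rows(SAMPLE)
print(records[0].statefips, records[0].ds_o3_pred)
```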


Geographic Scale & Scope


All census tracts in the contiguous United States


Time Period


January 1, 2001 to December 31, 2014

Raw Data Processing


The air quality monitoring data from the NAMS/SLAMS network were downloaded from
the Air Quality System (AQS) database. Only Federal Reference Method (FRM) samplers
were included in the dataset. Data from all Pollutant Occurrence Codes (POC) were used.
The data were downloaded covering January 1, 2001 through December 31, 2014. The
CMAQ data were created from version 4.7.1 of the model using Carbon Bond
Mechanism-05 (CB-05). The CMAQ data are daily maximum 8-hour ozone concentrations
calculated on a 12 km x 12 km grid for the continental United States. The CMAQ emissions
data are based on 2008 NEI version 2, with specific updates including data from regional
planning organizations and year-specific data for some larger point sources, including
continuous emissions monitoring data for NOx and SO2 sources. The onroad mobile source
emissions were generated using MOVES 2010B, except for California, for which data
provided by the California Air Resources Board were interpolated to each year. In
addition, the meteorological data used are from the Weather Research and Forecasting
Model (WRF) version 3.4 at 12 km simulation. The WRF simulation included the physics
options of the Pleim-Xiu land surface model (LSM), Asymmetric Convective Model
version 2 planetary boundary layer (PBL) scheme, Morrison double-moment
microphysics, Kain-Fritsch cumulus parameterization scheme, and the RRTMG longwave
and shortwave radiation (LWR/SWR) scheme. The CMAQ model results were developed
in November 2013. The DS combines the actual monitoring data and the estimated
ozone concentration surface (CMAQ) to predict ozone through space and time. It
attempts to find an optimal linear relationship between CMAQ output and measurement
data to predict new "measurements" at each spatial point in the area of interest. Fitted
parameters are based on sampling from distributions (built into the code by the
developers) rather than an objective function minimum, which allows calculation of a
standard error associated with each prediction. It differs from other fusion efforts by
not assuming the existence of a true air pollution process driving both the monitoring
data and CMAQ output. Instead, downscaling relates air data and model output using a
linear regression with bias coefficients (additive and multiplicative) that can vary in space
and time. This approach to modeling provides a new answer to the “change-of-support”
problem, in which we would like to predict air pollution at a certain spatial resolution but
must reconcile the difference between point monitoring data and areal average CMAQ
concentrations. Model parameters are fit only to paired CMAQ and air monitoring data;
thus, CMAQ grid cells that do not contain monitoring sites are not used in model fitting.
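
The core idea, a linear relationship between paired monitor readings and CMAQ grid-cell values with an additive and a multiplicative bias term, can be illustrated with ordinary least squares. This is a deliberate simplification and not the Downscaler itself, which is Bayesian, lets the coefficients vary in space and time, and samples from distributions rather than minimizing an objective function. All numbers and names below are invented for illustration.

```python
def fit_bias(cmaq, monitor):
    """Least-squares fit of monitor = additive + multiplicative * cmaq.

    Stand-in for the Downscaler's regression: constant coefficients,
    no spatial or temporal variation, no standard errors.
    """
    n = len(cmaq)
    if n != len(monitor) or n < 2:
        raise ValueError("need at least two paired values")
    mean_x = sum(cmaq) / n
    mean_y = sum(monitor) / n
    sxx = sum((x - mean_x) ** 2 for x in cmaq)
    if sxx == 0:
        raise ValueError("CMAQ values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cmaq, monitor))
    multiplicative = sxy / sxx
    additive = mean_y - multiplicative * mean_x
    return additive, multiplicative


# Only grid cells that contain a monitor contribute a pair (invented values, ppb).
paired_cmaq = [30.0, 40.0, 50.0, 60.0]
paired_monitor = [33.0, 41.0, 49.0, 57.0]

additive, multiplicative = fit_bias(paired_cmaq, paired_monitor)

# Apply the fitted relationship to a CMAQ value from a cell with no monitor.
unmonitored_cmaq = 45.0
predicted = additive + multiplicative * unmonitored_cmaq
print(f"additive={additive:.2f}, multiplicative={multiplicative:.2f}, "
      f"prediction={predicted:.2f} ppb")
```

The unmonitored cell plays no part in the fit; it only receives a prediction, which mirrors the statement that cells without monitoring sites are not used in model fitting.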


Additional processing of the data was conducted to standardize variable names across all
years of data and to expand FIPS variable into separate statefips, countyfips, and ctfips
variables.
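
The FIPS expansion can be sketched as follows, assuming the original FIPS variable was the standard 11-digit census tract identifier (2-digit state, 3-digit county, 6-digit tract). The metadata does not say whether the resulting columns hold these partial codes or longer identifiers, so the function below is a hypothetical illustration rather than the processing actually applied.

```python
def split_tract_geoid(geoid):
    """Split an 11-digit tract identifier into (state, county, tract) strings.

    Values read as integers lose leading zeros, so the input is
    left-padded back to 11 digits before slicing.
    """
    text = str(geoid).zfill(11)
    if len(text) != 11 or not text.isdigit():
        raise ValueError(f"not an 11-digit tract identifier: {geoid!r}")
    return text[:2], text[2:5], text[5:]


# Invented example: an identifier that lost its leading zero when read as a number.
state, county, tract = split_tract_geoid(1001020100)
print(state, county, tract)
```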


Additional Information


Berrocal, V., Gelfand, A. E. and Holland, D. M. (2011). Space-time fusion under error in
computer model output: an application to modeling air quality.


Berrocal, V., Gelfand, A. E. and Holland, D. M. (2010). A bivariate space-time downscaler
under space and time misalignment. The Annals of Applied Statistics 4, 1942-1975.


Berrocal, V., Gelfand, A. E. and Holland, D. M. (2010). A spatio-temporal downscaler for
output from numerical models. Journal of Agricultural, Biological, and Environmental
Statistics 15, 176-197. The downscaler described in this reference is used to provide daily
predictive PM2.5 (daily average) and O3 (daily 8-hr maximum) surfaces for 2010.