Overview What is a “Source”? PM and Precursor Emissions Source Apportionment Overview Spatial/Temporal Analyses Cluster/Factor Analyses Positive Matrix Factorization UNMIX Chemical Mass Balance Model Handling Secondary PM Trajectory Approaches Method and Tool Availability Uncertainties and Limitations in Source-Receptor Analysis Discerning Local vs. Non-local Influences Discerning Among Source Categories Discerning Among Source Regions Discerning Among Specific Source Influences Network Design Issues References Appendix AIRS Codes for PM2.5 Secondary Aerosol Source Profiles PM2.5 Species Relationships Quantifying the Contribution of Important Sources to PM Concentrations PM Data Analysis Workbook: Source Apportionment
Overview • Why do we need to understand the sources of PM? When an area experiences high concentrations of PM, particularly when the concentrations are in exceedance of the standard, research and analysis is needed to investigate the possible sources of PM and PM precursors leading to the high concentrations. The analysis and research required spans all aspects of the regulatory community: • Monitoring staff should know whether or not their sampling and analysis set up is adequate to identify the PM and precursor species that are critical for identifying potential sources in their area. • Analysts should be able to identify potential sources and meteorological conditions to assist policy makers and modelers in developing control strategies. • Modelers should know how well current emission inventories and dispersion models represent the ambient conditions so that they can model future control scenarios and the effect on PM concentrations. • Policy makers should know what sources are the principal contributors to PM so that appropriate controls on PM and precursor emissions can be developed and implemented. • In a previous chapter of the workbook, data analyses exploring the spatial and temporal characteristics of PM data were discussed. In this chapter, we first discuss “what is a source?”. Important emissions sources are then described as well as source attribution methods and tools and their uncertainties. Examples are then provided of how to discern among source categories, source regions, and specific source influences. PM Data Analysis Workbook: Source Apportionment
What is a “Source”?: Primary versus Secondary • Primary PM is composed of material in the same chemical form as when it was emitted into the atmosphere, including windblown dust, sea salt, road dust, mechanically generated particles, and combustion-generated particles such as fly ash and soot. Primary PM also includes particles formed from the condensation of high-temperature vapors formed during combustion (e.g., As, Se, Zn). Concentrations of primary PM are a function of emission rate, transport and dispersion, and removal rate. • Secondary particles are formed from condensable vapors generated by chemical reactions of gas-phase precursors. Secondary processes can result in either the formation of new particles or the addition of PM to preexisting particles. For example, sulfate in PM is mostly formed by atmospheric oxidation of SO2. Also, oxides of nitrogen react in the atmosphere to form nitric acid vapor, which in turn may react with NH3 to form particulate ammonium nitrate. A portion of the organic aerosol is also due to secondary processes. Secondary formation is a function of many factors, including concentrations of precursors, concentrations of other gaseous reactive species (e.g., ozone, hydroxyl radical), atmospheric conditions, and cloud or fog droplet interactions. It is considerably more difficult to relate ambient concentrations of secondary species to sources of precursor emissions than it is to identify the sources of primary particles.
What is a “Source”?: Local vs. Transport • A key question analysts face is “how do I tell the difference between locally generated PM and PM transported into the area?” Policy makers need to understand how much of the PM problem is under their jurisdiction to control. • Techniques for assessing the difference between local and transported PM include: • Spatial and temporal analyses (e.g., Are high concentrations observed on a regional basis or only at a few “hot spots”?). • Assessing the age of an air mass accompanied with trajectory analysis. • The use of “tracers-of-opportunity” and species ratios accompanied with trajectory analysis (e.g., using potassium to identify forest fire impact). • The use of satellite information to corroborate transport (e.g., Saharan dust storm impact on U.S. sites). • Model the dependence of PM on ozone to determine a component of PM that is photochemically produced. PM Data Analysis Workbook: Source Apportionment
What is a “Source”?: Other Issues • Regional issues. Analysts need to be able to assess how much an urban area is contributing to a PM exceedance problem compared to the regional background. • As a first approximation of local versus regional contributions to an urban area’s PM, assess the differences between the concentrations of average urban and nearby rural monitoring data. This assumes that the PM at rural sites is not “contaminated” by urban emissions and that the same regional sources have the same impact on rural monitors as on urban monitors (see Schichtel, 1999a). • Model the PM dependence on wind speed and wind direction to classify a site as being dominated by local or regional source contributions (Schichtel, 1999a). • Researchers are investigating the development of “regional background” profiles to assist in apportioning PM. These profiles could be used in an attempt to quantify the regional contribution to PM concentrations. • Samplers were strongly influenced by sources less than 10 km away and even minor sources close to the sampler could overwhelm any regional component in a 24-hr integrated sample (VanCuren, 1998). However, individual emitters can have a zone of influence less than 1 km (e.g., Chow et al., 1999). PM Data Analysis Workbook: Source Apportionment
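The urban-minus-rural first approximation described above can be sketched in a few lines. This is only an illustration: the function name and the concentration values are hypothetical, and the result is meaningful only under the stated assumptions (rural sites not influenced by urban emissions, and the same regional impact at both site types).

```python
from statistics import mean

def local_increment(urban, rural):
    """Return (regional, local) estimates in the units of the inputs.

    regional: mean concentration at nearby rural sites
    local:    mean urban concentration minus the regional estimate
    """
    regional = mean(rural)
    local = mean(urban) - regional
    return regional, local

# Hypothetical 24-hr PM2.5 averages in ug/m3 (illustrative values only).
urban_sites = [18.0, 20.0, 22.0]
rural_sites = [11.0, 13.0]

regional, local = local_increment(urban_sites, rural_sites)
print(f"regional ~ {regional:.1f} ug/m3, local increment ~ {local:.1f} ug/m3")
```

A negative local increment would suggest that one of the two assumptions does not hold for the sites chosen.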
PM and PM Precursor Emissions • Knowledge of emissions is required for performing source apportionment and assessing control measures. • The majority of the PM2.5 mass over the United States is of secondary origin, formed within the atmosphere through gas-particle conversion of precursor gases such as sulfur oxides, nitrogen oxides, and organics. • Precursor emissions that are well defined include sulfur and nitrogen oxides (SO2, NOx), while the emissions of other species such as organics, soil, and soot are poorly defined (Schichtel, 1999b).
SO2 Annual Emissions • Figure: North American SO2 emission rates (Schichtel, 1999b). • The highest SO2 emission rates occur over the Ohio River Valley, the eastern seaboard, and urban locations such as Atlanta and St. Louis. There are few major SO2 sources in the west.
Eastern U.S. NOx Emission Rates • Area source NOx emissions are highest near cities. • Point source emissions are highest over the Industrial Midwest. Schichtel (1999b)
Major Sources of Organic Carbon Emissions • Sources listed from most abundant to least abundant for the Los Angeles urban area for 1982 (adapted from Cass, 1997): meat-cooking operations; paved-road dust; fireplaces; noncatalyst gasoline vehicles; diesel vehicles; surface coating; forest fires; cigarettes; catalyst-equipped gasoline vehicles; organic chemical processes; brake lining; roofing tar pots; tire wear; misc. industrial point sources; natural gas combustion; misc. petroleum industry processes; primary metallurgical processes; railroad (diesel oil); residual oil stationary sources; refinery gas combustion.
Emissions Issues • PM2.5 precursor emissions patterns vary across the U.S.; thus, PM2.5 speciation and concentrations also vary. • Of importance to source apportionment is guidance on the following: • validating emission profiles and inventories • improving emission profiles and inventories • estimating emission profiles and inventories • identifying unusual events • Many of these topics are covered in the Introduction and in the Emission Inventory Evaluation sections of the workbook.
Source Apportionment: Overview(1 of 3) • Relating source emissions to their quantitative impact on ambient air pollution is referred to as source apportionment. In principle source apportionment can be performed in complementary ways. The traditional approach is dispersion modeling, in which a pollutant emission rate and meteorological information are input to a mathematical model that disperses (and may also chemically transform) the emitted pollutant, generating a prediction of the resulting pollutant concentration at a point in space and time. The inputs may be measured quantities but they need not be, in which case the modeling is a "what if" exercise which explores the consequences of different emission rates and meteorological variables. The alternative is receptor modeling, which may be defined as "a specified mathematical procedure for identifying and quantifying the sources of ambient air contaminants at a receptor primarily on the basis of concentration measurements at that receptor." The concentration measurements referred to are those of particular chemical or physical properties that are characteristic of particular source emissions. In contrast to dispersion modeling, receptor modeling is diagnostic, not prognostic - it describes the past rather than the future. In further contrast to dispersion modeling, receptor modeling has everything to do with measurements and cannot be performed without them. While source apportionment in principle embraces both modeling approaches, in common usage it is often taken as synonymous with receptor modeling. This overview is concerned only with this restricted meaning of source apportionment. • Two milestones in the development of receptor modeling are worth noting. First, the Friedlander (1973) article is generally recognized as the genesis of the chemical mass balance (CMB) receptor model, referred to at the time as chemical element balance. 
CMB has a special status in the receptor modeling toolbox as the only model up to the present that has been officially approved (i.e., supported and distributed) by EPA. CMB is described elsewhere in this document. • A second milestone was the 1982 Mathematical and Empirical Receptor Models Workshop, now known as "Quail Roost II", and the series of articles that resulted from it. The workshop was important for multiple reasons: (a) it introduced the concept of sophisticated synthetic (simulated) data sets as test beds for comparing the performance of alternative receptor modeling approaches, where the "truth" is known a priori by the constructors of the data sets; (b) it brought together many of the U.S. receptor modeling practitioners in a "blind" intercomparison of their various methods when applied to common data sets, including both synthetic and real sets; and (c) it resulted in a Glossary of receptor modeling terms that provided a common language for this emerging field. The degree of success that was achieved in (b) was instrumental in bringing EPA the realization of the potential importance of receptor modeling as a complement to traditional dispersion modeling for source apportionment. PM Data Analysis Workbook: Source Apportionment
Source Apportionment: Overview (2 of 3) • Receptor model types may be classified as single-sample or multivariate. In the first type the modeling analysis is performed independently on each available sample. The simplest example of this is the “tracer element” method, in which a particular property (e.g., a chemical species) is known to be uniquely associated with a specific source, so that the total ambient mass impact of the source may be estimated by dividing the measured ambient concentration of the property by the property's known abundance in the source's emissions. The method is not often available because of the difficulties of finding unique tracers or knowing their abundances. However, even if the property is not uniquely associated with a source of interest, if its abundance in that source is known, then the method can always be used to provide an upper limit for the source's impact. A novel example of this method is the use of the radiocarbon (14C) content of an ambient sample to estimate the fraction of carbon in the sample that is biogenic (non-fossil-fuel related). • The best-known example of single-sample receptor modeling is chemical mass balance. CMB removes the need for unique tracers of sources but still requires the abundances of the chemical components of each source (source profiles) to be known. • Multivariate receptor models require the input of data from multiple samples and extract the source apportionment information from all of the sample data simultaneously. The reward for the extra complexity of these models is that they purport to estimate not only the source contributions but the source compositions (profiles) as well. The simplest example of a multivariate method is “tracer element/multiple linear regression.” This method requires tracers that are uniquely associated with the sources of interest, but it does not require their abundances to be known.
• Additional multivariate receptor models include (a) absolute principal component analysis, (b) specific rotation factor analysis, (c) target transformation factor analysis, (d) three-mode factor analysis, (e) source profiles by unique ratios (SPUR), (f) receptor model applied to patterns in space (RMAPS), (g) UNMIX, and (h) positive matrix factorization (PMF). Most of these models are based on factor analysis, or the closely related principal component analysis. In recent years the development and investigation of the last three has been supported by EPA. In comparison with CMB far less is understood about the behavior and validity of these multivariate models. Criticisms have been directed at specific models, in addition to the general criticism of any factor analysis-based model that does not employ additional constraints to limit the solution space. PM Data Analysis Workbook: Source Apportionment
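The single-sample “tracer element” calculation described in this overview is simple enough to sketch directly: the source's mass impact is the ambient tracer concentration divided by the tracer's abundance in the source's emissions. The function name and the numbers below are hypothetical; when the tracer is not unique to the source, the same quotient should be read as an upper limit rather than an estimate.

```python
def tracer_source_impact(ambient_tracer, abundance):
    """Source mass impact (ug/m3) = ambient tracer concentration / tracer abundance.

    ambient_tracer: ambient concentration of the tracer species, ug/m3
    abundance:      mass fraction of the tracer in the source's emissions,
                    0 < abundance <= 1 (assumed known from a source profile)
    """
    if not 0.0 < abundance <= 1.0:
        raise ValueError("abundance must be a mass fraction in (0, 1]")
    return ambient_tracer / abundance

# Hypothetical values: 0.05 ug/m3 of tracer that makes up 1% of emitted mass.
impact = tracer_source_impact(0.05, 0.01)
print(f"estimated source impact: {impact:.1f} ug/m3")
```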
Source Apportionment: Overview (3 of 3) • One of the challenges that receptor modeling will have to confront in the PM2.5 arena is the treatment of secondary mass - products that result from atmospheric transformation processes between source and receptor, such as sulfate and nitrate. While this has always been a problem for receptor modeling, it is more severe for PM2.5 than for PM10 because the secondary contribution to PM2.5 is a larger fraction than for PM10. CMB deals with this in a limited way that isolates the total mass of a secondary component (e.g., sulfate) but cannot apportion it to individual sources. Progress in this area will require a hybrid receptor model approach, i.e., the use of selected emissions rate, meteorological, and chemical transformation information with an otherwise conventional receptor model. Receptor modeling was invented for the very reason of avoiding the need for such frequently uncertain information, but it seems inevitable that source apportionment of secondaries will require an extension of classical receptor modeling. By combining elements of both receptor and dispersion models, the intent is to minimize the weaknesses of the separate approaches and maximize their combined strengths. • The development of receptor modeling over the past two decades has been strongly influenced by the extensive use of inorganic species, particularly atomic elements measured by x-ray fluorescence. In the receptor modeling of PM2.5 there is likely to be a new emphasis on organic species because of the relatively greater contribution of combustion sources and carbon to PM2.5 than to PM10. This will not be an easy transition because of formidable difficulties in organic aerosol sampling (both positive and negative artifacts can occur) and analysis (the presence of a bewildering number of organic species, frequently at low concentrations). 
The Northern Front Range Air Quality Study recently performed in the Denver area and the continuing characterization of organic aerosol in southern California provide some indications of the promise of this new direction. Lewis, 1999 PM Data Analysis Workbook: Source Apportionment
Source Apportionment Methods and Tools • Source apportionment methods are used to resolve the composition of PM into components related to emission sources. Several methods are available. • It is useful to apply more than one method and look for consensus among results. • Methods and tools discussed in this section include the following: • Spatial and temporal characteristics of data • Cluster, factor, and other multivariate statistical techniques • Positive matrix factorization (PMF) • UNMIX • Source-receptor models: chemical mass balance (CMB) model • Trajectory approaches
Assessing Spatial and Temporal Characteristics of PM • Simple analyses of the spatial and temporal characteristics of PM2.5 can be used to obtain information regarding the data. • These investigative analyses can include the use of time series plots of PM mass and species concentrations, scatter plots, individual sample “fingerprints”, box-whisker plots, and summary statistics. • These investigations can help the analyst identify important species, species relationships, time periods of interest, and likely sources.
Examples Using Spatial and Temporal Data (1 of 3) • Potassium nitrate (KNO3) is a major component of all fireworks. • This figure shows all available PM2.5 K+ data from all North American sites, averaged to produce a continental average for each day during 1988-1997. • Fourth of July celebration fireworks are clearly observed in the potassium time series. • Fireworks displays on local holidays/events could have a similar effect on data. Poirot (1998). Regional averaging and count of sample numbers were conducted in Voyager, using variations of the Voyager script on p. 6 of the Voyager Workbook Kvoy.wkb. Additional averaging and plotting were conducted in Microsoft Excel.
Examples Using Spatial and Temporal Data (2 of 3) • A simple material balance on the annual average chemical composition can be useful (shown here: Los Angeles area PM2.5). • EC concentrations were highest in Central LA, consistent with fresh motor vehicle emissions and traffic density. • OC concentrations typically accounted for the largest portion of the PM2.5 at most sites. More emphasis on OC measurements may be warranted. • Nitrate and ammonium concentrations were highest at the downwind site (Rubidoux), consistent with NH3 emission sources and secondary nitrate formation. Made using Excel; adapted from Cass, 1997. Sites are arranged from west to east (the general direction of transport in the Los Angeles basin).
Examples Using Spatial and Temporal Data (3 of 3) • Simple analyses of wind direction and PM species concentrations can be used to begin an assessment of likely sources. Temporal resolution finer than 24 hr may be necessary for this analysis. • PM10 zinc concentrations at Crows Landing with respect to wind direction are shown. High concentrations in the northwest sector are consistent with the refuse incinerator located 1 km to the NNW of the monitoring site. • Figure: zinc concentration distribution with respect to wind direction (%). Adapted from wind roses reported by Chow et al., 1996. Radar plot prepared in Excel. Wind direction is the direction from which the wind is blowing. Data are from Crows Landing, CA, during the 1990 summer intensive study.
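A wind-direction analysis of this kind amounts to binning species concentrations by the sector the wind blows from. The sketch below is a simplified stand-in, not the procedure used for the Crows Landing figure: it assumes eight 45-degree sectors, paired (wind direction, concentration) samples, and made-up values, and it reports a sector mean rather than a percentage distribution.

```python
from collections import defaultdict

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def sector_of(wind_dir_deg):
    """Map a wind direction in degrees (direction the wind blows FROM)
    to one of eight 45-degree sectors centered on N, NE, E, ..."""
    return SECTORS[int(((wind_dir_deg % 360) + 22.5) // 45) % 8]

def mean_by_sector(samples):
    """samples: iterable of (wind_dir_deg, concentration) pairs.
    Returns {sector: mean concentration} for sectors that have data."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for wind_dir, conc in samples:
        sector = sector_of(wind_dir)
        sums[sector] += conc
        counts[sector] += 1
    return {sector: sums[sector] / counts[sector] for sector in sums}

# Hypothetical short-duration zinc samples (degrees, ug/m3).
samples = [(320, 0.30), (330, 0.50), (90, 0.10), (180, 0.05)]
for sector, avg in sorted(mean_by_sector(samples).items()):
    print(f"{sector:>2}: {avg:.2f} ug/m3")
```

A sector whose mean stands well above the others points toward a source in that direction, which can then be checked against known emitters.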
Multivariate Analyses • Multivariate analyses are statistical procedures used to infer the mix of PM sources impacting a receptor location (see the following tables for species/source links). • Procedures including cluster, factor/principal component, regression, and other multivariate techniques are usually available in statistical software packages. • Literature review shows many refinements and options to these analyses. • A drawback to these analyses is that the analyst must infer how certain statistical species groupings relate to emissions sources. • A nice feature of these analyses is the ability to summarize a multivariate data set using a few components. PM Data Analysis Workbook: Source Apportionment
Key PM Species and Sources (tables 1 to 3 of 3) • Adapted from U.S. EPA (1998a, 1998b); Watson and Chow (1998); Chow (1995)
Cluster and Factor Analyses • Cluster analysis is a multivariate procedure for grouping data by similarity between observations (i.e., observations with similar chemical compound concentrations are grouped). • This is typically done using a Euclidean distance between each pair of observations (squared differences between individual concentrations summed across all species). • Factor analysis is a procedure for grouping data by similarity between variables (i.e., variables that are highly correlated are grouped). • This is typically done using the correlation between each pair of variables. • Correlation measures are often used because they are not influenced by differences in scale between objects. This is important because PM species concentrations can vary over several orders of magnitude. • Factors indicate the best associations among variables while regression lines indicate the best predictions. • The factor model expresses the variation within, and the relations among, observed variables as partly common variation among factors and partly specific variation among random errors. PM Data Analysis Workbook: Source Apportionment
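The two similarity measures named above can be written out explicitly. This is a minimal sketch with hypothetical data: the distance compares two observations (samples) across species, and the correlation compares two variables (species) across samples. The square root in the distance is a common convention and does not change the ordering of pairs.

```python
import math
from statistics import mean

def euclidean_distance(obs_a, obs_b):
    """Distance between two observations: squared differences in each
    species concentration, summed across species, then square-rooted."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(obs_a, obs_b)))

def pearson_r(x, y):
    """Pearson correlation between two variables measured on the same samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: sulfate in ug/m3 and selenium in ng/m3 for four samples.
sulfate = [2.0, 4.0, 6.0, 8.0]
selenium = [0.5, 1.1, 1.4, 2.0]

# Correlation is unaffected by rescaling one variable (e.g., ng/m3 to ug/m3).
r_original = pearson_r(sulfate, selenium)
r_rescaled = pearson_r(sulfate, [s / 1000.0 for s in selenium])
print(f"r = {r_original:.3f}, after rescaling r = {r_rescaled:.3f}")
```

The distance, by contrast, is dominated by whichever species has the largest numerical values, which is why concentrations are often standardized before clustering.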
Cluster/Factor Analysis Example • Example PM2.5 cluster and factor analyses are to be developed for the workbook; see the following: • Wongphatarakul et al., 1998, for an example PM cluster analysis • Huang et al., 1999, for an example of conventional factor analysis applied to PM data
Positive Matrix Factorization • Positive matrix factorization (PMF) was developed by Dr. P. Paatero (Dept. of Physics, University of Helsinki). PMF can be used to determine source profiles based on the ambient data. Features include the following: • PMF uses weighted least squares fits for data that are normally distributed and maximum likelihood estimates for data that are log-normally distributed. • PMF weights data points by their analytical uncertainties. • PMF constrains factor loadings and factor scores to nonnegative values and thereby reduces the ambiguity caused by rotating factors. This is one of the major differences between PMF and principal component analysis (PCA). • PMF expresses factor loadings in mass units, which allows factors to be used directly as source signatures. • PMF provides uncertainties for factor loadings and factor scores, which makes the loadings and scores easier to use in quantitative procedures such as chemical mass balance.
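The uncertainty-weighted least squares fit can be illustrated by the quantity it minimizes. The sketch below evaluates the weighted sum of squared residuals, commonly written Q = sum over i,j of ((x_ij - sum_k g_ik f_kj) / sigma_ij)^2, for a given pair of nonnegative factor matrices. It does not reproduce the PMF solver itself; the matrix names X, G, F, and sigma and all values are illustrative assumptions.

```python
def is_nonnegative(matrix):
    """True if every element of a nested-list matrix is >= 0."""
    return all(value >= 0 for row in matrix for value in row)

def pmf_objective(X, G, F, sigma):
    """Weighted sum of squared residuals for the factorization X ~ G F.

    X:     samples x species concentrations
    G:     samples x factors contributions (scores)
    F:     factors x species profiles (loadings)
    sigma: samples x species analytical uncertainties (all > 0)
    """
    if not (is_nonnegative(G) and is_nonnegative(F)):
        raise ValueError("G and F must be nonnegative")
    q = 0.0
    for i, row in enumerate(X):
        for j, x_ij in enumerate(row):
            fitted = sum(G[i][k] * F[k][j] for k in range(len(F)))
            q += ((x_ij - fitted) / sigma[i][j]) ** 2
    return q

# Hypothetical two-factor example with two samples and two species.
G = [[1.0, 0.0], [0.0, 2.0]]
F = [[0.5, 0.5], [0.25, 0.75]]
X = [[0.7, 0.5], [0.5, 1.5]]       # X[0][0] is 0.2 above the exact product
sigma = [[0.1, 0.1], [0.1, 0.1]]
print(f"Q = {pmf_objective(X, G, F, sigma):.2f}")
```

Because each residual is divided by its uncertainty, a poorly measured value contributes less to Q than a well-measured one with the same absolute misfit.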
PMF Analysis Example (1 of 2) • Polissar et al. (1998) used PMF to investigate the fine particle composition data from seven National Park Service locations in Alaska for the period 1986-1995. The sites are the Northwest Alaska Areas National Park (NWAA), Bering Land Bridge National Preserve (BELA), Gates of the Arctic National Park (GAAR), Denali National Park (DENA), Yukon Charley National Preserve (YUCH), Wrangell St. Elias National Park (WRST), and Katmai National Park (KATM). • PMF uses the estimates of the error in the data to provide optimum data point scaling and permits a better treatment of missing and below detection limit values. • Up to eight source components were obtained for the data sets: PM Data Analysis Workbook: Source Apportionment
PMF Analysis Example (2 of 2) • The highest average PM2.5 concentration at the Bering Land Bridge site (BELA) may be due to the strong influence of aerosol emissions from local pollution sources in nearby Nome plus PM transported into the region. • Note the large seasonal difference in the forest fire factor at Gates of the Arctic (GAAR). Polissar et al., 1998. Stacked bar plots prepared using a spreadsheet program.
UNMIX • UNMIX is a multivariate receptor modeling package that inputs observations of particulate composition and seeks to find the number, composition, and contributions of the contributing sources or source types. UNMIX also produces estimates of the uncertainties in the source compositions. UNMIX uses a generalization of the self-modeling curve resolution method developed in the chemometrics community (Henry, 1997). • Data Requirements: UNMIX inputs data in tabular format as flat ASCII files. Each column represents one species and each row is one sample or observation. It is very helpful to have a measure of total mass included in the data. It is generally best to analyze data from one site at a time. Basically, the more data the better, in terms of both species and observations. The upper limit on the amount of data is determined by the size of the computer. Based on experience, the practical lower limit on the number of observations is 50 to 100. • System Requirements: UNMIX is currently implemented as a MATLAB program (see the website mathworks.com for more information). UNMIX has a graphical user interface so the user need not be familiar with MATLAB itself. PM Data Analysis Workbook: Source Apportionment
UNMIX Analysis Example • UNMIX was applied to PM2.5 data collected at Underhill, VT, during 1988-1995. • Six “sources” were identified using mass (MF), particle absorption (BABS), arsenic (As), calcium (Ca), iron (Fe), nickel (Ni), selenium (Se), silicon (Si), total sulfur (S), and non-soil potassium (KNON). • The “sources” were further investigated by performing back trajectories and investigating time series. • The smelter (“smelt”) source, oil combustion, and winter coal combustion source trajectories are consistent with known emission patterns. Values represent the % of the element accounted for by the source. Poirot (1999)
Comparing UNMIX and PMF Results • PM2.5 data from Underhill, VT, for 1988-1995 were analyzed using both UNMIX and PMF. • Results for arsenic sources (i.e., smelter), nickel sources (i.e., oil combustion), and soil sources compared well between the two source apportionment methods. • Consensus among results gives the analyst more confidence that the results are meaningful. Poirot, 1999
Overview of the Chemical Mass Balance Model (1 of 3) • Miller et al. (1972) first proposed the mass balance model for apportioning ambient aerosol mass to its sources via its chemical constituency. The basic concept of CMB is that composition patterns of emissions from various classes of sources are different enough that their contributions can be identified by measuring concentrations of many species collected at the receptor site (Gordon, 1988). Thus, CMB consists of an effective-variance-weighted least squares solution to a set of linear equations that expresses each receptor species concentration as a linear sum of the products of the source species profile and source contributions. The source species profile (i.e., fractional amount of the chemical species in the emissions from each source type) and the receptor concentrations are the basic input to the CMB model. The output consists of the amount contributed by each source type to each chemical species and to the total receptor concentration. Input data uncertainties (analytical) are used both to weight the importance of input data values in the solution and to calculate the uncertainties of the source contributions. • The basic formulation of the CMB may be expressed as:
$$c_i = \sum_{j=1}^{p} a_{ij}\, s_j, \qquad i = 1, \ldots, n$$
where $c_i$ is the concentration of constituent or property $i$, $a_{ij}$ is the fractional concentration of constituent or property $i$ in the emissions from source $j$ as perceived at the receptor, $s_j$ is the total mass contribution of source $j$ to the receptor, $p$ is the total number of sources contributing, and $n$ is the total number of constituents or properties. • CMB makes several assumptions: • Compositions of source emissions are constant over the period of ambient and source sampling. • Chemical species do not react with each other (i.e., they add linearly). For many assessments, secondary formation of particles is important. 
While CMB is not formulated to explicitly treat secondary transformation, a surrogate procedure is available to give some information on at least the extent of secondary materials in the ambient data (discussed later in the chapter). • All sources with a potential for significantly contributing to the receptor have been identified and have had their emissions characterized. • The source compositions are linearly independent of each other. • The number of sources or source categories is less than the number of chemical species. • Measurement uncertainties are random, uncorrelated, and normally distributed. PM Data Analysis Workbook: Source Apportionment
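The effective-variance-weighted least squares solution can be sketched as a short iteration. This is an illustrative pure-Python version under stated assumptions, not the CMB software: each species i is weighted by 1/V_i with V_i = sigma_c[i]^2 + sum_j (sigma_a[i][j] * s[j])^2, the weights are recomputed from the latest source contributions, and the function names, iteration count, profiles, and concentrations are all hypothetical.

```python
def solve(M, b):
    """Solve the square system M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def cmb_effective_variance(c, sigma_c, a, sigma_a, iterations=20):
    """Estimate source contributions s from c[i] = sum_j a[i][j] * s[j].

    c, sigma_c: ambient concentrations and their uncertainties (n species)
    a, sigma_a: source profiles and their uncertainties (n species x p sources)
    """
    n, p = len(c), len(a[0])
    s = [0.0] * p  # first pass therefore weights by ambient uncertainty only
    for _ in range(iterations):
        v = [sigma_c[i] ** 2 + sum((sigma_a[i][j] * s[j]) ** 2 for j in range(p))
             for i in range(n)]
        # Weighted normal equations: (A^T V^-1 A) s = A^T V^-1 c
        lhs = [[sum(a[i][j] * a[i][k] / v[i] for i in range(n)) for k in range(p)]
               for j in range(p)]
        rhs = [sum(a[i][j] * c[i] / v[i] for i in range(n)) for j in range(p)]
        s = solve(lhs, rhs)
    return s

# Hypothetical example: 3 species, 2 sources, true contributions 10 and 5 ug/m3.
profiles = [[0.30, 0.02], [0.05, 0.40], [0.10, 0.10]]
profile_unc = [[0.1 * f for f in row] for row in profiles]
ambient = [3.1, 2.5, 1.5]
ambient_unc = [0.1, 0.1, 0.1]
print(cmb_effective_variance(ambient, ambient_unc, profiles, profile_unc))
```

The example needs more species than sources and linearly independent profiles, which mirrors two of the assumptions listed above; nearly collinear profiles make the normal equations ill-conditioned.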
Overview of the Chemical Mass Balance Model (2 of 3) • In fact, these assumptions pose a limitation of the model because source compositions are not constant (they vary with changes in process inputs, loads, and cycles); components do react with each other and systems are not linear; one rarely knows exactly how many sources are contributing to a receptor; there are many more sources than components that can be practically measured; many sources have very similar compositions; measurement errors are not necessarily random, uncorrelated, or normally distributed; and very few sources have their own unique tracer components (Watson, 1984). While the implicit assumptions are fairly restrictive and will never be totally obeyed in actual practice, CMB can tolerate deviations from these assumptions with some penalty in uncertainty. Several studies have been published that document CMB's tolerance to such deviations. • The limitations of receptor models may be offset by their advantages. They are relatively simple compared to source-oriented models of comparable accuracy and precision. In addition, because an analytical method of determining the effects of systematic errors on the mass balance equations has been developed, the precision required of measurements to provide a target precision for the model output can be estimated. • CMB has been used in a great number of actual air pollution studies, some of which are described here. Size-fractionated samples were analyzed during the summer of 1982 in Philadelphia (Dzubay, 1988). With the promulgation in 1987 of a National Ambient Air Quality Standard for suspended particles nominally ≤ 10 µm in diameter (PM10), receptor modeling (notably CMB) was employed in support of state implementation plan (SIP) development. CMB was applied to the apportionment of PM10 from the West Orem Steel Plant during episodes in the winter of 1987/88 in Utah Valley (Cooper et al., 1989). 
CMB was also applied to the apportionment of fine and coarse particles in Windsor, Ontario from January through November 1991 (Conner et al., 1993). CMB was applied to the chemically speciated diurnal particulate matter samples acquired in California's South Coast Air Basin during the summer and fall of 1987 as part of the Southern California Air Quality Study (Watson et al., 1994). • CMB7 (for DOS) and its User's Manual were developed under contract with Desert Research Institute (DRI) and released in 1990. These products have been uploaded to EPA's modeling website (www.epa.gov/scram001). As explained in a README file, a source profile library (SPECIATE) is accessible elsewhere (www.epa.gov/ttn/chief) and is applicable (with some reformatting described in the README) for input to CMB7. PM Data Analysis Workbook: Source Apportionment
Overview of the Chemical Mass Balance Model (3 of 3) • DRI began development on CMB8 (for WINDOWS) and its documentation in late 1995. This documentation included an updated User's Manual and Protocol for Applying and Validating the CMB Model. Key features of CMB8 include: correcting a bug associated with using the AUTOFIT feature; enabling the Britt and Luecke exact least squares solution; providing additional source profile and fitting species combinations (arrays); optimizing fitting (individual and AUTOFIT) so that sources with contribution estimates that are either negative or lower than their standard errors while appearing in an uncertainty/similarity cluster are eliminated; automatically eliminating species with missing values; and expanding options in the configuration and Output files. DRI released a preliminary version in late 1998 (model and documentation available via anonymous ftp: eafs.sage.dri.edu) and tested in early 1999. EPA discovered problems with CMB8's execution and errors in its documentation. In an effort to solve these problems, and also to enhance the model to make it more robust and user friendly, EPA is letting another contract to complete the work. A final product is anticipated for spring 2000. • An example of how CMB8 may be applied to PM2.5 is included in Section 5 of the (draft) CMB8 Applications and Validation Protocol for PM2.5 and VOC for the Northern Front Range Air Quality Study (Watson et al., 1998). Other examples will be cited as more experience is gained and as new results become available in the literature. Coulter, 1999 PM Data Analysis Workbook: Source Apportionment
Chemical Mass Balance Modeling • The purpose of CMB receptor modeling is to apportion ambient PM (or any categorical pollutant) to emission sources. The source apportionment of ambient PM provides independent evaluation of the relative contributions of sources to ambient levels of PM. • The CMB model expresses each measured chemical species concentration as a linear sum of products of source profile species and source contributions, and then solves a set of linear equations. • Model input includes: • Source profile species (fractional amount of species in the PM emissions from each source type). • Receptor (ambient) concentrations. • Realistic uncertainties for source and receptor values. Input uncertainty is used to weigh the relative importance of input data to model solutions and to estimate uncertainty of the source contributions. • Model output includes: • Contributions from each source type to the total ambient PM and individual species and the uncertainty. PM Data Analysis Workbook: Source Apportionment
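The mass balance described above can be sketched in a few lines. This is a minimal illustration, not the CMB7 or CMB8 code: it assumes the effective-variance weighted least squares approach, in which each species is weighted by its measurement variance plus the profile variances scaled by the current source contributions. The function name, the four-species and two-source layout, and all numbers are hypothetical.

```python
import numpy as np

def cmb_effective_variance(F, C, sigma_C, sigma_F, n_iter=20, tol=1e-8):
    """Solve C_i = sum_j F_ij * S_j for the source contributions S_j.

    F        (n_species, n_sources) source profile abundances
    C        (n_species,) ambient concentrations
    sigma_C  (n_species,) ambient uncertainties (must be > 0)
    sigma_F  (n_species, n_sources) profile uncertainties
    Returns the contributions S and their standard errors.
    """
    F, C = np.asarray(F, float), np.asarray(C, float)
    sigma_C, sigma_F = np.asarray(sigma_C, float), np.asarray(sigma_F, float)
    S = np.zeros(F.shape[1])
    for _ in range(n_iter):
        # Effective variance of each species given the current contributions.
        var_eff = sigma_C**2 + (sigma_F**2) @ (S**2)
        W = 1.0 / var_eff
        cov = np.linalg.inv(F.T @ (W[:, None] * F))
        S_new = cov @ (F.T @ (W * C))
        converged = np.max(np.abs(S_new - S)) <= tol * max(1.0, np.max(np.abs(S_new)))
        S = S_new
        if converged:
            break
    return S, np.sqrt(np.diag(cov))

# Hypothetical profiles: rows are species, columns are two source types.
F = np.array([[0.30, 0.02],
              [0.05, 0.40],
              [0.10, 0.10],
              [0.01, 0.20]])
S_true = np.array([10.0, 5.0])   # contributions used to build the test data
C = F @ S_true                   # noise-free "ambient" concentrations
S, se = cmb_effective_variance(F, C, sigma_C=0.05 * C, sigma_F=0.10 * F)
```

Because the synthetic ambient data are noise free, the solver recovers the contributions used to build them. With real data the input uncertainties control both the weighting and the reported standard errors.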
CMB Model Assumptions • Composition of source emissions do not change during travel from the point of emission (where the source profile is defined) to the point of receptor site measurements (minor contributors are frequently omitted). • Chemical species do not react with each other (i.e., they add linearly) (little known about this). • All sources which may significantly contribute to the receptor have been identified and their emissions characterized (minor contributors may be omitted). • Number of sources or source categories is less than the number of chemical species (the larger the difference, the better). • Source profiles are linearly independent (degree of independence depends on the variability of the source profile). It may be necessary to combine chemically similar source categories or add additional fitting species to the model. • Measurement uncertainties are random, uncorrelated, and normally distributed (effects unknown). PM Data Analysis Workbook: Source Apportionment
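Two of these assumptions can be screened before running a fit. The sketch below checks that there are fewer sources than species and flags pairs of profiles that point in nearly the same direction. The cosine-similarity test is a simple stand-in chosen for illustration and is not the uncertainty/similarity diagnostic used by CMB; the profile values, source names, and the 0.95 threshold are all hypothetical.

```python
import numpy as np
from itertools import combinations

def screen_profiles(F, names, threshold=0.95):
    """Flag nearly collinear profile pairs by cosine similarity.

    F is (n_species, n_sources); names labels the columns.
    """
    F = np.asarray(F, float)
    n_species, n_sources = F.shape
    fewer_sources = n_sources < n_species
    unit = F / np.linalg.norm(F, axis=0)   # normalize each profile column
    flagged = []
    for a, b in combinations(range(n_sources), 2):
        if unit[:, a] @ unit[:, b] > threshold:
            flagged.append((names[a], names[b]))
    return fewer_sources, flagged

names = ["vehicle", "road_dust", "ag_dust"]
F = np.array([[0.30, 0.02, 0.025],
              [0.05, 0.40, 0.38],
              [0.10, 0.10, 0.11],
              [0.01, 0.20, 0.21]])
fewer_sources, flagged = screen_profiles(F, names)
```

In this made-up case the two dust profiles are flagged, which is the situation where combining similar source categories or adding fitting species would be considered.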
CMB Application Protocol (1 of 2) • Assess model applicability (e.g., data from well-characterized methods, large number of species, major sources identified, source profiles available, and reasonable uncertainties attached). • Select source profiles for potential contributors (e.g., area, natural, and point sources plus other sources identified in preliminary analyses). • Select sources for inclusion in the CMB solution (e.g., upwind point, seasonal emitters, non-collinear profiles). • Determine initial source contribution estimates (SCE) (e.g., use variety of source profiles and fitting species combinations, determine effects on results of alternate source profiles). May need to combine similar source types due to collinearity. • Examine model outputs and performance measures. Do spatial and temporal results make sense considering meteorology and source emission patterns?
CMB Application Protocol (2 of 2) • Check how the removal and addition of some species affects results. The source profiles need to be the most precise for the most influential species. • Identify deviations from model assumptions (e.g., source compositions constant, all sources included, source profiles independent, etc.). • Identify and correct model input errors (e.g., increase uncertainty of profiles, provide different composites, identify and characterize missing sources, stratify samples by meteorology). • Verify consistency and stability of SCE (substitute different profiles for same source type, add or drop species from fit, examine source contributions to individual species). • Evaluate results of CMB with respect to other source assessment methods (e.g., compare SCEs among nearby sites, compare source contribution variations over time with expected emissions and meteorology, apply other receptor methods and compare results, apply dispersion models and compare results).
Example CMB Performance Goals (1 of 2) • Standard error is the square root of the variance of the SCE. • Chi-square (χ²) is used to consider the uncertainty of the calculated species concentrations (weighted sum of squares of the differences between calculated and measured fitting species concentrations). Values < 1.0 indicate a very good fit. • The percent mass is the percent ratio of the sum of model-calculated SCEs to the measured mass concentration. This is used to track the percent explained mass; a value near 100 percent can be misleading because poor fits can force a high percent mass. • The t-statistic is the ratio of the SCE to its standard error. The standard error of the SCE is an indicator of the precision in the model estimates. Values < 2.0 identify model estimates that are not significantly different from 0. • R² is used to measure the variance in the ambient species concentrations that is explained by the calculated species concentrations via linear regression. The closer the value is to 1.0, the better the SCEs explain the measured concentrations. Note that these performance goals are subjective and based on early experience with TSP and PM10 models. Goals may change as more PM2.5 CMB applications are performed.
Example CMB Performance Goals (2 of 2) • Degrees of freedom (df) is the number of species in the fit minus the number of sources in the fit. Some researchers recommend df >> 5. • The ratio of the calculated species mass (C) to measured species mass (M) is used to identify species that are over- or under-accounted for by the model. A ratio >1.0 means that more mass for a given species was accounted for by the model than was measured in the ambient sample. • The ratio of the residuals to the uncertainty is the signed difference between C and M divided by the uncertainty of the difference. It is used to identify species that are over- and under-accounted for by the model. • The normalized modified pseudo-inverse matrix (MPIN), a diagnostic output of CMB7, indicates the degree of influence each species concentration has on the contribution and standard error of the corresponding source category. MPIN is normalized such that it takes on values from -1.0 to 1.0. Species with MPIN absolute values of 0.5 to 1.0 are associated with influential species. • U/S clusters: Maximum source uncertainty and minimum source projection (Henry, 1992) are used to assess clusters of sources which the model cannot easily distinguish between and that are likely to be interfering with the model's ability to provide a good set of SCEs. Results should not contain uncertainty clusters.
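Several of the performance measures listed on these two slides reduce to short formulas. The sketch below computes them for a single sample under stated assumptions: chi-square is divided by the degrees of freedom (the reduced form, which is what makes a threshold near 1.0 meaningful), R² is computed from the same weighted residuals rather than from an explicit regression, and `var_eff` is taken as the variance of the difference between calculated and measured values. The function name and all numbers are hypothetical, and MPIN and U/S clusters are not covered.

```python
import numpy as np

def cmb_diagnostics(F, meas, S, se, var_eff, measured_mass):
    """Fit diagnostics for one sample.

    F (n_species, n_sources) profiles, meas measured species concentrations,
    S source contribution estimates, se their standard errors,
    var_eff variance of (calculated - measured) for each species.
    """
    F, meas, S = np.asarray(F, float), np.asarray(meas, float), np.asarray(S, float)
    calc = F @ S                      # model-calculated species concentrations
    resid = calc - meas               # signed difference, C minus M
    df = F.shape[0] - F.shape[1]      # species in fit minus sources in fit
    weighted_ss = np.sum(resid**2 / var_eff)
    return {
        "df": df,
        "chi_square": weighted_ss / df,
        "r_square": 1.0 - weighted_ss / np.sum(meas**2 / var_eff),
        "percent_mass": 100.0 * S.sum() / measured_mass,
        "t_stat": S / np.asarray(se, float),
        "calc_to_meas": calc / meas,
        "resid_to_unc": resid / np.sqrt(var_eff),
    }

# Hypothetical inputs: four fitting species, two sources.
F = np.array([[0.30, 0.02],
              [0.05, 0.40],
              [0.10, 0.10],
              [0.01, 0.20]])
meas = np.array([3.1, 2.0, 1.5, 1.375])
diag = cmb_diagnostics(F, meas, S=[10.0, 5.0], se=[1.0, 0.5],
                       var_eff=(0.1 * meas) ** 2, measured_mass=20.0)
```

In this made-up sample the second species is over-accounted for (C/M above 1.0) and the fourth is under-accounted for, and the reduced chi-square lands well above 1.0, so the fit would not meet the example goals.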
Source Profiles • For source-receptor modeling, use profiles that are representative of the study area during the period when ambient data were collected. • Include ubiquitous sources such as gasoline and diesel exhaust, secondary components (sulfate, nitrate, ammonium), sea salt (if coastal site), vegetative burning (e.g., forest fires, residential fireplaces), and crustal material. • Include point sources identified in the emission inventory. • Try available source profiles in sensitivity tests to determine the best ones for use (minimize collinearity). Accurate source profiles are the key to successful modeling. PM Data Analysis Workbook: Source Apportionment
CMB Example Analyses (1 of 2) Seattle area, Washington • Source apportionment example for the Seattle, Washington area for 1996-1997 using PM2.5 data. • Note seasonal differences in monthly average PM2.5 source contribution percentages of burning (fall/winter) and secondary sulfate (summer). • Differences in winter and summer burning contributions are also noticeable using pie charts for the data set. From Maykut et al. (1998) Figures made using Excel. PM Data Analysis Workbook: Source Apportionment
CMB Example Analyses (2 of 2) • When additional organic carbon species and source profiles are available, CMB can be used to provide substantial breakdown of the organic carbon component. • The dominant sources of PM2.5 organic aerosol were diesel exhaust, gasoline exhaust, wood smoke, and meat cooking. Chart generated in Excel; adapted from Cass, 1997. Other organics include paved road dust, cigarette smoke, vegetative detritus, tire wear debris, and secondary organics. See also Scheff et al., 1984. PM Data Analysis Workbook: Source Apportionment
Checking Source Apportionment Results • Three Kraft paper mills in the Washington state study area are located to the south, north, and northwest of the monitoring site. • Agreement between the wind directions associated with specific profiles and the actual locations of the sources adds credibility to the source apportionment results. From Maykut et al. (1998) Radar Plot: nanograms of PM attributed to the source by wind direction. PM Data Analysis Workbook: Source Apportionment
Uncertainties and Limitations in Source-Receptor Analyses • Many emitters have similar species composition profiles. The practical implication of this limitation is that one may not be able to discern between the dust emitted by agricultural practices and dust emitted by mobile sources on unpaved roads. One approach is to add species to the fit to reduce collinearity. For example, specific VOC species could be used with PM profiles to differentiate natural soil dust from motor-vehicle-related dust. • Species composition profiles change between source and receptor. Most source-receptor models cannot currently account for changes due to photochemistry. Since nitrates, sulfates, and some organic carbon compounds are primarily of secondary origin, current methods cannot tie these compounds to their primary emission sources. This is discussed further in the following pages. • Receptor models cannot predict the consequences of emissions reductions. One cannot estimate source profiles resulting from changes in emissions and predict ambient concentrations using receptor models. However, source-receptor models can check if control plans achieve their desired reductions.
Handling the Apportionment of Secondary PM (1 of 2) • For many assessments, secondary formation of particles is important. While CMB is not formulated to explicitly treat secondary transformation, a surrogate procedure is available to give some information on at least the extent of secondary materials in the ambient data. The surrogate procedure was developed for CMB7 for treating secondary PM10 particles. • One of the key assumptions made by CMB7 is that chemical species do not react with each other, i.e., that “compositions for the source categories are obtainable which represent the source profile as it is perceived at the receptor” for the chemical species of interest (e.g., U.S. EPA, 1998b). Thus, CMB7 assumes no changes to the aerosol during transport and ideally apportions the primary material that has not changed between source and receptor. However, certain species, e.g., sulfur (S), that dominate polluted airsheds have both primary and secondary sources. • In such airsheds (e.g., many designated “Serious PM10 Areas” by EPA), secondary aerosols may contribute significantly to the ambient loading seen at receptors. These secondary materials are often in the form of reactive species such as NH4+, SO4=, NO3-, and organic carbon (OC). If sources of such materials are not explicitly treated, CMB7 will tend to underaccount for total particle mass (% MASS value). As stated in the CMB Protocol, “if a compound which is secondarily formed or is normally associated with regional scale pollution (such as sulfate) is included as a fitting species, a ‘single constituent source type’... must also be included in the fit….” Use of the single constituent source profile for secondary particles was initially suggested by Watson (1979). With this technique, the secondary species are “apportioned to chemical compounds rather than directly to sources.” • Setting up secondary “source” profiles.
A table in the appendix to this section illustrates an example of the way the technique was used in an actual application for California's South Coast Air Basin (Watson et al., 1994). Secondary source profiles consisting of “pure” ammonium sulfate (AMSUL), ammonium bisulfate (AMBSUL), ammonium nitrate (AMNIT), and organic carbon (OC) were used to apportion the remaining NH4+, SO4=, NO3-, and OC that would not be apportioned to the primary particle profiles. For some secondary species thought to be significant (e.g., note the OC column), a source profile was created which includes only that component, in which the percentage composition in the profile is set to 100%.
Handling the Apportionment of Secondary PM (2 of 2) • Secondary “source” profiles (continued). For other secondary species, only some chemical components may have been measured. For instance, elemental S and/or sulfate ion (SO4=) may be measured rather than ammonium sulfate, (NH4)2SO4. In such a case, the respective species abundances in the (NH4)2SO4 would equal the mass % of each species in (NH4)2SO4. Thus, in the AMSUL profile the abundance of S in pure (NH4)2SO4 is listed as 24.3% and the abundance of SO4= is listed as 72.7%. Examples are also given for other secondary species and their chemical components. In all cases, the uncertainty was arbitrarily set to 10%. In the CMB7 calculations, the portion of a measured secondary species not accounted for by other source types becomes assigned to its corresponding single constituent source type, as represented by profiles such as those described here. • These examples are described as profiles for secondary species. However, the secondary profile may not represent secondary aerosol exclusively. For example, Watson et al. (1994) indicate that the OC profile in the appendix table may account for contributions from fugitive sources not included in the CMB7 calculation (e.g., cooking, plant parts, or tire wear) in addition to secondary sources. In such a case, the technique may be considered as a means to get an upper estimate of the amount of aerosol attributable to secondary formation. • One of the advantages of using the single constituent source profile technique is that it can account for that part of the ambient mass that is not accounted for by the primary sources included in the CMB7 calculations. However, this technique cannot yield any information on the specific source types contributing to the species in the single constituent profiles. Furthermore, the ambient mass may still be underestimated in some cases. For example, Conner et al. 
(1993) reported that fine particle mass may have been underaccounted for in their CMB7 calculations because of the likelihood of some amount of water associated with hygroscopic (or deliquescent) sulfates. The amount of mass due to this water depends on the form of the sulfate and relative humidity factors. Coulter, 1999 PM Data Analysis Workbook: Source Apportionment
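The abundances quoted for the AMSUL profile follow directly from formula masses, as the short calculation below shows. It is a sketch using rounded standard atomic masses. The AMBSUL and AMNIT values are computed the same way for illustration only and are not taken from the appendix table, and the dictionary layout is hypothetical.

```python
# Rounded standard atomic masses (g/mol).
ATOMIC_MASS = {"H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

def formula_mass(composition):
    """Mass of a formula unit given as {element: atom count}."""
    return sum(ATOMIC_MASS[el] * n for el, n in composition.items())

SPECIES = {
    "NH4": {"N": 1, "H": 4},
    "SO4": {"S": 1, "O": 4},
    "NO3": {"N": 1, "O": 3},
    "S": {"S": 1},
}

# Each "pure" compound: its elemental formula and how many units of each
# measured species one formula unit contains.
COMPOUNDS = {
    "AMSUL": ({"N": 2, "H": 8, "S": 1, "O": 4}, {"NH4": 2, "SO4": 1, "S": 1}),
    "AMBSUL": ({"N": 1, "H": 5, "S": 1, "O": 4}, {"NH4": 1, "SO4": 1, "S": 1}),
    "AMNIT": ({"N": 2, "H": 4, "O": 3}, {"NH4": 1, "NO3": 1}),
}

def profile_percent(name):
    """Mass percent of each measured species in the pure compound."""
    formula, content = COMPOUNDS[name]
    total = formula_mass(formula)
    return {sp: 100.0 * n * formula_mass(SPECIES[sp]) / total
            for sp, n in content.items()}

amsul = profile_percent("AMSUL")
```

Rounded to one decimal place, `amsul["S"]` and `amsul["SO4"]` reproduce the 24.3% and 72.7% abundances given for the AMSUL profile. Note that S and SO4= describe the same sulfur, so only one of them would normally be used as a fitting species.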
Handling the Apportionment of Secondary PM • CMB does not explicitly treat secondary transformation and, thus, will tend to underaccount for total particle mass. • A surrogate procedure, using single constituent source profiles, can be used to give information on secondary materials in the ambient data. • Secondary species are “apportioned to chemical compounds rather than directly to sources.” Secondary source profiles consisting of “pure” organic carbon and ammonium sulfate, bisulfate, and nitrate can be used to apportion the remaining NH4+, SO4=, NO3-, and OC that would not be apportioned to the primary particle profiles. • The secondary profile may not represent secondary aerosol exclusively. Contributions from primary fugitive sources (e.g., OC from cooking, plant parts, or tire wear) not included in the CMB calculation may be included in the results. In such a case, the technique provides an upper estimate of the amount of aerosol attributable to secondary formation. • This single constituent source profile technique cannot yield any information on the specific source types contributing to the species in the single constituent profiles. PM Data Analysis Workbook: Source Apportionment
Trajectory Approaches (1 of 2) • Detailed air mass history calculations using the CAPITA Monte Carlo model are being employed to investigate long-term, synoptic-scale meteorological conditions associated with ambient air quality and deposition measurements at various locations. • By combining multi-year sets of regional-scale meteorological data and local ambient pollution monitoring data, a long-term or “climatological” description of an airshed can be presented in probabilistic terms. • One result of these analyses is an estimate of predominant source regions for periods of high concentrations (or deposition) of specific air pollutants at a site. For example, the analyses address the question: During episodes with high concentrations of sulfate, which areas were likely upwind? Poirot et al., 1998 PM Data Analysis Workbook: Source Apportionment
Trajectory Approaches (2 of 2) • Trajectory cluster analysis is a method to categorize a large set of trajectories into groups of similar trajectories. • Goals of these analyses are to minimize differences among trajectories in a cluster and maximize differences among clusters. • The analyses result in distinct clusters representing different synoptic regimes. • The analyses are useful for estimating pollutant source regions, interpreting forecast trajectory errors, identifying similar meteorological scenarios for case studies, and comparing data on a cluster-by-cluster basis (the example was written for precipitation data but could be applied to other pollutant data). Rolph et al., 1999
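The clustering goal stated above (small differences within a cluster, large differences between clusters) can be illustrated with a plain k-means pass over trajectory coordinates. This is a stand-in technique chosen for brevity and is not claimed to be the method of Rolph et al. The function name and the synthetic trajectories are hypothetical, and distances are computed directly on coordinate offsets, which ignores map projection effects.

```python
import numpy as np

def cluster_trajectories(traj, k, n_iter=50):
    """Group trajectories of shape (n_traj, n_steps, 2) into k clusters."""
    X = traj.reshape(len(traj), -1)          # one row per trajectory
    # Deterministic farthest-point initialization of the cluster centers.
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                            # assignments stopped changing
        labels = new_labels
        for j in range(k):
            if np.any(labels == j):          # leave empty clusters unchanged
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers.reshape(k, *traj.shape[1:])

# Synthetic back trajectories: five arriving from the northwest and five
# from the southeast, as (east offset, north offset) at nine time steps.
rng = np.random.default_rng(0)
t = np.arange(9.0)
nw = np.stack([-1.0 * t, 0.5 * t], axis=1)
se = np.stack([1.0 * t, -0.5 * t], axis=1)
traj = np.concatenate([nw + rng.normal(0, 0.2, (5, 9, 2)),
                       se + rng.normal(0, 0.2, (5, 9, 2))])
labels, mean_paths = cluster_trajectories(traj, k=2)
```

Once each sampling day carries a cluster label, pollutant concentrations can be compared cluster by cluster, which is the use described in the last bullet.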
Example Air Mass History Analysis Upwind probabilities for high aerosol arsenic at three Champlain Basin sites • Upwind probability plots for high arsenic concentrations have a strong NW orientation at all three sites, pointing directly toward a smelter region. • The locations of several large smelters are also identified in the plots, with the smelter identified as a green dot appearing to be the most likely contributor (the yellow dot is the receptor location). • High arsenic levels appear to be excellent tracers for influence in the Lake Champlain Basin from the smelter region. Poirot et al. (1998). Shaded areas show 20%, 40%, and 60% of upwind probability on highest concentration day.