Artigo Acesso aberto Revisado por pares

Global-scale P wave tomography optimized for prediction of teleseismic and regional travel times for Middle East events: 1. Data set development

2011; American Geophysical Union; Volume: 116; Issue: B4 Linguagem: Inglês

10.1029/2010jb007967

ISSN

2156-2202

Autores

Stephen C. Myers, G. Jóhannesson, N. A. Simmons,

Tópico(s)

Seismology and Earthquake Studies

Resumo

Journal of Geophysical Research: Solid EarthVolume 116, Issue B4 SeismologyFree Access Global-scale P wave tomography optimized for prediction of teleseismic and regional travel times for Middle East events: 1. Data set development S. C. Myers, S. C. Myers Atmospheric, Earth and Energy Division, Lawrence Livermore National Laboratory, Livermore, California, USASearch for more papers by this authorG. Johannesson, G. Johannesson [email protected] Systems and Decision Sciences, Lawrence Livermore National Laboratory, Livermore, California, USASearch for more papers by this authorN. A. Simmons, N. A. Simmons Atmospheric, Earth and Energy Division, Lawrence Livermore National Laboratory, Livermore, California, USASearch for more papers by this author S. C. Myers, S. C. Myers Atmospheric, Earth and Energy Division, Lawrence Livermore National Laboratory, Livermore, California, USASearch for more papers by this authorG. Johannesson, G. Johannesson [email protected] Systems and Decision Sciences, Lawrence Livermore National Laboratory, Livermore, California, USASearch for more papers by this authorN. A. Simmons, N. A. Simmons Atmospheric, Earth and Energy Division, Lawrence Livermore National Laboratory, Livermore, California, USASearch for more papers by this author First published: 13 April 2011 https://doi.org/10.1029/2010JB007967Citations: 19AboutSectionsPDF ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a full-text version of this article with your friends and colleagues. Learn more.Copy URL Abstract [1] We extend the Bayesloc seismic multiple-event location algorithm for application to global arrival time data sets. Bayesloc is a formulation of the joint probability distribution spanning multiple-event location parameters, including hypocenters, travel time corrections, pick precision, and phase labels. Stochastic priors may be used to constrain any of the Bayesloc parameters. Markov Chain Monte Carlo sampling is used to draw samples from the joint probability distribution, and the posteriori samples are summarized to infer conventional location parameters such as the hypocenter. The first application of the broad area Bayesloc algorithm is to a data set consisting of all well-recorded events in the Middle East and the most well-recorded events with 5° spatial sampling globally. This sampling strategy is designed to provide the ray coverage needed to determine lithospheric-scale P wave velocity structure in the Middle East using the complementary ray geometry provided by regional (subhorizontal) and teleseismic (subvertical) raypaths and to determine a consistent, albeit lower-resolution, image of global mantle structure. The data set consists of 5401 events and 878,535 P, Pn, pP, sP, and PcP arrivals recorded at 4606 stations. Relocated epicenters are an average of 16 km from bulletin locations. The data set included events that are known to an accuracy of 1 km (a.k.a. GT1) based on nonseismic information. The average distance between GT1 epicenters and our relocated epicenters is 5.6 km. For arrivals labeled P, Pn, and PcP, ∼92%, ∼90%, and 96% are properly labeled with probability >0.9, respectively. Incorrect phase labels are found to be erroneous at rates of 0.6%, 0.2%, 1.6%, and 2.5% for P, Pn, PcP, and depth phases (pP and sP), respectively. Labels found to be incorrect, but not erroneous, were reassigned to another phase label. P and Pn residual standard deviation with respect to ak135 travel times are dramatically reduced from 3.45 s to 1.01 s. The differences between travel time residuals for nearly reciprocal raypaths are significantly reduced from the input event locations, suggesting that Bayesloc relocation improves data set consistency. The reciprocity tests suggest that the dominant contribution to travel time residuals calculated from information provided in global bulletins is location and picks errors, not travel time prediction errors due to 3-D structure. Modeling the whole multiple-event system results in accurate locations and an internally consistent data set that is ideal for tomography and other travel time calibration studies. Simmons et al. (2011) (companion paper) use the Bayesloc-processed data set to develop a 3-D tomographic image, which further reduces residual standard deviation to 0.50 s. Key Points Multiple seismic event location procedures provide accurate locations A stochastic location approach is essential for developing a tomographic data set Bayesian statistics identify seismic travel time outliers 1. Introduction [2] Production of accurate global and regional seismic bulletins for use in Earth model development and empirical travel time calibration remains a painstaking and costly endeavor. The International Seismic Centre (ISC) produces the most comprehensive bulletin of seismic event information, containing reconciled arrival time measurements (picks) and event hypocenters based on data contributions from the National Earthquake Information Center (NEIC) and numerous regional and local network operators. As such, the ISC bulletin is the summary of an impressive international effort to analyze a massive stream of global seismic data. Despite the wealth of information contained within the ISC bulletin, direct use of the ISC bulletin (or constituent bulletins) in travel time tomography has been discounted. A number of investigations find that tomographic signal, the component of a travel time residual attributed to deviation from a reference Earth model, is obscured by errant picks and substantial hypocenter errors [e.g., Grand, 1990; Gudmundsson et al., 1990; Creager and Boyed, 1992; Röhm et al., 1999; Husen et al., 2009]. [3] Recognizing both the wealth of information in the ISC bulletin, as well as data quality issues, Engdahl et al. [1998] developed and applied methods to identify and remove pick outliers and to correct phase-naming mistakes. Engdahl et al. [1998] also made use of core phases (e.g., PKP) and secondary phases (pP, sP, pwP) to relocate and improve hypocenter accuracy. The resulting bulletin (EHB hereafter) remains the most utilized data set for the development of global P wave tomography models [e.g., van der Hilst et al., 1997; Kennett et al., 1998; Bijwaard et al., 1998; Boschi and Dziewonski, 1999; Kárason and van der Hilst, 2000; Antolik et al., 2003; Montelli et al., 2004]. Li et al. [2008] make use of teleseismic and regional EHB data to develop, arguably, the most comprehensive image of mantle P wave structure to date. The EHB bulletin continues to evolve as detailed analysis of individual events and multiple-event analysis of event clusters (e.g., aftershock sequences) are completed [e.g., Bondár et al., 2008; Bondár and McLaughlin, 2009]. However, inaccuracies and inconsistencies are inevitable for bulletins that are based on single-event locations (e.g., EHB), because additions and improvements are generally made one event at a time. [4] Other approaches use waveform cross correlation to make relative arrival time measurements with precision on the order of the seismogram sampling rate. Recorded signals can be cross correlated with synthetic seismograms [e.g., Grand, 1994, 2002; Simmons et al., 2006, 2007], or many waveforms for a single event can be cross correlated to obtain relative arrival times [e.g., Masters et al., 2000; Bolton and Masters, 2001; Ritsema and van Heijst, 2002; Houser et al., 2008]. Despite producing exceptionally precise measurements, application of correlation methods is currently limited to long-period, teleseismic signals. Unlike teleseismic waveforms, widespread waveform similarity for recorded regional waveforms is uncommon, which severely limits the applicability of empirical waveform correlation. Matching synthetic and empirical regional waveforms is similarly challenging, because region-specific models, or fully 3-D models, are needed in order to adequately match observed regional waveforms. As a result, waveform methods require further development before a self-consistent regional and teleseismic database can be produced. [5] In this study we continue the effort to improve the accuracy and consistency of bulletin data by adapting the Bayesloc method [Myers et al., 2007, 2009] of multiple-event location for application to global seismicity. Bayesloc is a formulation of the joint probability function that spans hypocenters, travel time corrections, pick precision, and phase labels. Previous versions of Bayesloc were tailored for application to event clusters (e.g., aftershock sequences), with travel time correction and pick precision formulations that were designed for robustness. The updated Bayesloc algorithm simultaneously relocates events, assesses pick precision, and probabilistically assesses phase labels using a more generalized, datum-specific travel time correction to produce an accurate and consistent global/regional bulletin. Importantly, Bayesloc phase labels are probabilistic, and at no point is any one phase label chosen. Possible labels include all phases under consideration and the possibility that the label is erroneous. Bayesloc allows prior constraints on any aspect of the multiple-event system, enabling directly utilization of previous work that statistically characterizes the accuracy of event hypocenters and picks [e.g., Bondár et al., 2004; Bondár and McLaughlin, 2009]. The use of prior information helps to mitigate regional location bias and improve outlier identification. The first application is to a data set of all well-recorded events in the Middle East and an even sampling of the best recorded events sampling the globe. 2. Method Multievent Relocation [6] Multiple-event methods consist of simultaneously inverting arrival times for many events to determine both event locations and a set of travel time corrections. Travel time corrections typically take the form of a time term for each station/phase [e.g., Douglas, 1967], which restricts applicability to instances where travel time prediction errors at a station and for a given phase are approximately equal for all events (e.g., event clusters). Multiple-event methods are known for generating relative locations with precision of 1 km or less, but loss of location accuracy is inherent to the unconstrained station/phase formulation, resulting in consistent location bias [Douglas, 1967; Jordan and Sverdrup, 1981; Pavlis and Booker, 1983]. Multiple-event methods based on differences of travel time residual (a.k.a. double difference) have become common practice. Double difference involves first computing differences between observed and predicted travel times, then differencing the permutation of residuals. Alternatively, double differences can be computed directly using waveform cross correlation [Got et al., 1994]. Wolfe [2002] shows that double-difference methods are mathematically equivalent to other multiple-event methods, except the station phase term is implicit. Waldhauser and Ellsworth [2000] advance the double difference method by allowing the residual difference to vary slowly with event location, which enables application to larger geographic areas. Menke and Schaff [2004] show that double difference methods can resolve absolute location if network coverage is outstanding. [7] The Bayesloc method [Myers et al., 2007] parameterizes the travel time correction as an adjustment to the travel time curve for each phase, with the addition of station terms that are drawn from a zero-mean prior. Correcting the travel time curve mitigates gross travel time prediction errors, and zero-mean station phase terms maintain resolution of absolute location while accounting for path-specific travel time variations. This formulation allows resolution of absolute location and is robust to poor network configuration, both of which are important for broad application of the multiple-event method. [8] In addition to precise locations, multiple-event relocation can be used to identify and remove outlier data [e.g., Engdahl and Bergman, 2001]. In single-event algorithms, the event location can move significantly to accommodate an outlier datum, because the tolerance for overall arrival time misfit is relatively large. In multiple-event locators, the use of a self-consistent set of travel time correction parameters significantly reduces the tolerance for arrival time misfit. Reduced tolerance for data misfit reduces the distance that an event location can move to better fit an outlier datum, which results in a larger residual for outlier data and more confident outlier identification. Bayesloc Modified for a Global Data Set [9] Bayesloc is a statistical model of the multiple-event system that includes (1) event locations (x), (2) event origins times (o), (3) travel times (T) (including Earth model-based travel times (F) and plus corrections (τ)), (4) arrival time measurement (pick) precisions (σ), and (5) phase labels (W). [10] All of these parameters are estimated using the arrival times (a) and input phase labels (w) for all events under consideration. The overarching statistical model is Equation (1) decomposes the inversion of arrival times and phase label data to determine the components of the multiple-event system (left-hand side of equation (1)), into a collection of "forward" problems and prior constraints (right-hand side of equation (1)). Specifically, the first term (right-hand side) computes the probability of observing the collection of arrivals given a set of hypocenters, travel times, phase labels, and pick precisions. The second term computes the probability of all travel times, given a model-based prediction (event location is implicit) and a collection of correction parameters. The third, fourth, fifth, and sixth terms are prior constraints on hypocenters, arrival time measurement precision parameters, travel time correction parameters, and input phase labels, respectively. The denominator is the probability over all arrival data, which serves as normalization. We note that equation (1) is a general statistical formulation and each of the terms may be specified in a number of ways. For example, any Earth model can be used to compute the uncorrected travel time (F(x)), and the formulation of travel time corrections (τ) can take many forms. Specification of each term in equation (1) is provided by Myers et al. [2007, 2009], and all but the travel time correction (τ) remain the same in the application of Bayesloc to global data. [11] Myers et al. [2007, 2009] implement simple travel time corrections that include a phase-specific correction to the slope and intercept of the travel time curve, which robustly correct for gross prediction errors, and a collection of station terms that are drawn from a zero-mean distribution. This formulation is suitable for event clusters, where a station terms provide path-specific corrections from an event cluster to each station. However, a single station term does not provide a path specific correction if events are widely distributed. As such, we recast the datum-specific, Bayesloc travel time correction as where δ is the travel time correction, T is the corrected travel time, and F is the Earth model-based travel time. β is an adjustment to the slope of the travel time curve and αw is a static shift of the travel time curve, as in previous versions of Bayesloc. The double bars in the β term indicate event-station distance and the variables x and s represent event and station positions, respectively. Additional α terms are static corrections for each station (j), event (i), phase (W), or combination thereof. The collection of all travel time corrections constitutes the travel time corrections (τ) in equation (1). In effect, station and event terms (αi, αj) are small corrections with respect to the corrected travel time curve (Fw+ αw, + bw∣∣xi−sj∣∣) and the station phase and event phase terms (αiw, αjw) can further refine the travel time correction. Realizations for each category of α term (αi, αj, αiw, αjw) are drawn from a zero-mean Gaussian distribution with unknown variance. The variance of each α term is estimated throughout Bayesloc sampling. Decomposition and sampling of terms in this way allows robust determination of station and event corrections, with resolution of event phase and station phase corrections if sufficient data are available. The α terms tend toward zero unless data suggest otherwise. [12] Bayesloc uses the Markov Chain Monte Carlo (MCMC) method to sample the joint probability of the multiple-event system [see Gelman et al., 2004]. Sampling the probability function is accomplished by starting with an initial configuration of the system, then randomly proposing a new configuration using specifically designed proposal distributions [see Myers et al., 2007, 2009]. The probability of the current and proposed multiple-event configuration is computed using the forward calculations on the right-hand side of equation (1). A proposed configuration is always accepted as the new "state" of the system if the probability is greater than the current state. If the probability of the proposed state is lower than the current state, then the new state is accepted at a rate specified by the ratio of the probability for the proposed state and the current state. Clearly, configurations of event locations and travel time corrections that reduce arrival time residuals are higher-probability configurations and are, therefore, preferred by the MCMC algorithm. Likewise, fitting residuals with representative arrival time measurement precision (1/variance) parameters maximizes probability by including the most data within the high-probability portion of the distribution. [13] The process of accepting/rejecting proposed configurations is continued until adequate sampling of the joint probability density is achieved (typically 10,000 to 20,000 iterations). Graphical examination of the MCMC samples can be used to assess the nonparametric probability density or an analytical form (e.g., Gaussian) may be used to summarize the MCMC samples. For example, the mean or mode of latitude and longitude samples for given event may be used as an epicenter estimate, and an epicenter uncertainty ellipse can be computed from the latitude and longitude covariance. Myers et al. [2007, 2009] provide pseudo code specifying the MCMC sampling procedure, as well as a graphical description of the procedure. 3. Data Set Event Selection and Data Sources [14] For the first application of Bayesloc to a broad area data set we have gathered a list of 5401 events throughout the Middle East and well-recorded events that provide global epicenter coverage. Over 4000 Middle East events and associated picks are drawn from the Lawrence Livermore National Laboratory (LLNL) database [see Ruppert et al., 2005], which is a compilation of global and regional bulletins as well as ∼20,000 Pn arrival time measurements for Middle East events at regional stations made by LLNL staff. Outside of the Middle East the best recorded events in the EHB bulletin [Engdahl et al., 1998] are selected to achieve nominal global epicenter spacing of ∼5°. Sampling of the global data set is accomplished by selecting the event with the most teleseismic P phase arrivals and removing all other events within ∼5° arc distance of the selected event. The process is repeated until the event list is exhausted. In order to preserve the depth sampling afforded by the EHB bulletin, geographic event sampling is conducted in depth bins with lower bounds of 35 km, 75 km, 150 km, 300 km, 450 km, and 700 km. All together, 878,535 P, Pn, pP, sP, and PcP arrivals recorded at 4606 stations comprise the data set (Figure 1). Table 1 lists the number of arrivals for each phase. Figure 1Open in figure viewerPowerPoint Event epicenters and station locations. All well-recorded events in the Middle East are retained and global event sampling is approximately 5°. Global sampling is performed independently in event depth intervals down to 700 km (see text). The resulting data set provides horizontal and vertical ray coverage through the Middle East, which is used by Simmons et al. [2011] in a tomographic study. Table 1. Number of Picks for Each Event and Summary of Posteriori Assessment of Phase Labels Phase Number of Picks Estimated Standard Deviation Phase Label Retained With Probability >0.9 (%) Input Label Is Most Probable (%) Most Probably Erroneous (%) P 817,552 0.74 s 92 96 0.6 Pn 42,327 0.90 s 90 98 0.2 pP 10,524 1.60 s 90 95 2.1 sP 4,992 2.22 s 92 97 2.6 PcP 3,140 1.83 s 96 98 1.6 [15] The event sampling strategy is designed to provide the ray coverage needed to determine lithospheric-scale structure in the Middle East using the complementary ray geometry provided by regional (subhorizontal) and teleseismic (subvertical) raypaths, and to determine a consistent, albeit lower-resolution, image of global mantle structure. Bayesloc Relocation [16] The Bayesloc joint posteriori distribution for the Middle East/Global data set is determined using four Markov Chains. The results presented here are averages of the last 12,000 of 15,000 MCMC samples for each chain. The first 3000 samples ("burn in") were used to find the neighborhood of the mode of the posteriori distribution and to adapt MCMC sampling. As such, the first 3000 samples are not necessarily representative of the joint probability density. Chain mixing, using the parameter configuration of another chain as a proposed configuration, was used to test for local minima. In order to better sample event locations, the starting position for each chain was randomly perturbed from the location of the station having the earliest arrival time pick for that event. Starting depths were set to 15 km, except for depths greater than 70 km as listed in the EHB bulletin, in which case the EHB depth was used as the starting depth. In addition to using the EHB depths as the starting positions, we place a prior constraint on event depth of ±10 km (standard deviation) for EHB locations with depths greater than 70 km (Table 2) and we fix event depths greater than 400 km. Our justification is that EHB depth phase data were scrutinized for consistency and the effect of slow wave propagation through the water column was accounted for when the surface bounce point was determined to be in the ocean. Table 2. Prior Constraints (standard deviation)a Travel Time Curves Event Depth Correct Phase Label Input Shift Slope P 10−6 s 5 s/degree 0.9 Pn 5 s 5 s/degree 0.9 pP 10−6 s 5 s/degree 0.9 sP 10−6 s 5 s/degree 0.9 PcP 10−6 s 5 s/degree 0.9 EHB events with depth >70 km 10 km a Priors not listed are uninformative (broad). [17] The ak135 model [Kennett et al., 1995] was used for all Earth model travel time predictions (F in equation (2)). For this data set travel time corrections include change of slope of each model-based travel time curve. Myers et al. [2007] demonstrate that Bayesloc robustly determines the slope of the travel time curve, so we use noninformative priors on the slope correction (β in equation (2)). However, resolution of the shift of the travel time curve (αw in equation (2)) requires prior constraints on either the shift itself or prior constraints on some event origin times. Because the ak135 model was developed and validated using the limited set of nuclear explosions with known origin times and using locally recorded events, for which the origin time is well constrained [Kennett et al., 1995], the absolute travel time for teleseismic phases cannot be improved in this study. Therefore, we place tight constraints on the shift of teleseismic travel time curves for the P, pP, sP, and PcP phases. Unlike teleseismic phases, ak135 travel time errors for the regional Pn phase vary geographically, causing biases that can exceed several seconds in some regions. Therefore, we allow Bayesloc to model the regional Pn travel time bias in the Middle East by placing a loose prior on the shift of the travel time curve of ±5 s (standard deviation). [18] In addition to travel time curve corrections, the Bayesloc travel time correction model includes event, station, event phase, and station phase parameters. Prior constraints on the standard deviation of these parameters are uninformative, but MCMC samples are drawn from a distribution with a mean of zero, which imparts a loose prior on the posteriori samples. Prior constraints on the measurement precision are also uninformative, so data weighting is entirely determined by adapting precision parameters to fit data distributions during the MCMC sampling. Last, there is a 90% chance that the input phase label will be selected for testing. In the remaining 10% of the tests there is an equal chance that one of the other phases under consideration or the "erroneous" label will be selected for testing. 4. Results Epicenter Shifts [19] Figure 2 shows epicenter shifts for all events. Epicenter shifts for event depths greater and less than 100 km are displayed separately. Epicenters shift by 16 km on average relative to the input bulletin locations. The shift in position from the starting (single event) epicenter to the Bayesloc epicenter is not random, and the consistency and magnitude of epicenter shifts for events in and near subduction zones are particularly strong. While the direction of epicenter shift for deep and shallow events is consistent near subduction zones, the magnitude of the shift is larger for shallow events (Figure 2). The direction of epicenter shifts in subduction zones tends to be toward the subduction trench, which is consistent with the expected shift due to a slab of fast oceanic lithosphere dipping under the overriding plate [Creager and Boyed, 1992]. Outside of subduction zones, epicenters tend to shift northward in the Pacific basin, eastward in Africa, and northward in the Middle East. Little shift is observed throughout northern Asia, because most of the events are nuclear explosions for which the known location is listed in the EHB bulletin. Figure 2Open in figure viewerPowerPoint Epicenter relocation vectors. The tail of each vector is at the input epicenter. Vector length is scaled by the magnitude of the epicenter shift (see inset scales), and vector orientation is in the direction of epicenter shift. (a) Global epicenter shifts, depths ≤ 100 km. (b) Global epicenter shifts, depths > 100 km. (c) Middle East epicenter shifts, depths ≤ 100 km. (d) Middle East epicenter shifts, depths > 100 km. [20] Eight of the events in the global data set are listed in the IASPEI Reference Event List [Bondár and McLaughlin, 2009] with location accuracy of 1 km or better. The mean and median differences between reference epicenters and Bayesloc epicenters are 5.6 km and 4.7 km, respectively. This result suggests that the Bayesloc locations are substantially more accurate than the bulletin locations, given that the average epicenter shift is far larger than the error with respect to known locations. [21] Previous studies to improve global, single-event location accuracy focus on improving travel time prediction using a 3-D Earth model. Antolik et al. [2001] report on the location accuracy that can be achieved using a global mantle model and teleseismic P wave arrival times from the ISC. The mantle model is paired with a number of crustal models and the smallest median epicenter error achieved is 7.83 km. Yang et al. [2004] conduct similar location tests with a mixed regional (Pn) and teleseismic (P) data set. Arrival times are carefully culled in this case, and a median epicenter error of 5.7 km is achieved. The results for the Antolik et al. [2001] and Yang et al. [2004] studies reported here are for explosion source with location accuracy of 1 km or better, making the test data sets similar to ours. Location Example [22] Figure 3 shows the location of the 28 May 1998 Pakistan nuclear explosion and Bayesloc location predictions. The event was well recorded, but station sampling is not geographically uniform. Residual travel times at European stations are early (negative residual) with respect to the known location (http://isis-online.org/isis-reports/detail/satellite-images-of-may-28-1998-test-site/12#images) and ak135 [Kennett et al., 1995] travel times. The predominance of European stations and the negative residuals at those stations results in a northward epicenter bias when the ak135 model is used for travel time prediction (mislocation of 10.1 km). Because the prediction errors are not random, the resulting epicenter error ellipse does not cover the true location. Bayesloc travel time corrections mitigate travel time prediction bias, resulting in an epicenter error of 4.5 km. Modeling all components of the location system, including pick and model error, results in a reduction of the epicenter error ellipse area from 207 km2 to 70 km2. More importantly, the Bayesloc error ellipse covers the known location because the marginal probability of the event location integrates over the joint probability of all other multiple-event parameters. Figure 3Open in figure viewerPowerPoint Single-event and Bayesloc global relocation of the 28 May 1998 Pakistan nuclear test. (a) Network distribution used to relocate the event. The star is the event location. Triangles are stations with fill color based ak135 residuals when the event is at the satellite imagery l

Referência(s)