Source Localization vs. Source Imaging
The localization approach to MEG/EEG source estimation considers that brain activity at any time instant is generated by a relatively small number (a handful, at most) of brain regions. Each source is therefore represented by an elementary model, such as an ECD, that captures the local distribution of neural currents. Ultimately, each elementary source is back-projected onto, or constrained to, the subject's brain volume or an MRI anatomical template for further interpretation. In a nutshell, localization models are essentially compact, both in the number of generators involved and in their spatial extent (from point-like to small cortical surface patches).
The alternative imaging approaches to MEG/EEG source modeling were originally inspired by the wealth of research in image restoration and reconstruction in other domains (early digital imaging, geophysics, and other biomedical imaging techniques). The resulting imaging source models do not yield small sets of local elementary models but rather the distribution of 'all' neural currents. This results in stacks of images where brain currents are estimated wherever elementary current sources have previously been positioned. This is typically achieved using a dense grid of current dipoles covering the entire brain volume or limited to the cortical gray matter surface. These dipoles are fixed in location and, generally, in orientation, and are homologous to pixels in a digital image. The imaging procedure estimates the amplitudes of all these elementary currents at once. Hence, contrary to the localization model, there is no intrinsic sense of distinct, active source regions per se. Explicit identification of activity arising from discrete brain regions usually necessitates complementary analysis, such as empirical or inference-driven amplitude thresholding, to discard elementary sources whose contribution is not significant according to the statistical appraisal. In that respect, MEG/EEG source images are very similar in essence to the activation maps obtained in fMRI, though with the added benefit of time resolution.
Inverse modeling: the localization (a) vs. imaging (b) approaches. Source modeling through localization consists in decomposing the MEG/EEG generators into a handful of elementary source contributions, the simplest source model in this situation being the equivalent current dipole (ECD). This is illustrated here from experimental data testing the somatotopic organization of the primary cortical representations of hand fingers. The parameters of the single ECD have been adjusted on the [20, 40] ms time window following stimulus onset. The ECD was found to localize along the contralateral central sulcus, as revealed by the 3D rendering obtained after the source location has been registered to the individual anatomy. In the imaging approach, the source model is spatially distributed using a large number of ECDs. Here, a surface model of MEG/EEG generators was constrained to the individual brain surface extracted from T1-weighted MR images. Elementary source amplitudes are interpolated onto the cortex, which yields an image-like distribution of the amplitudes of cortical currents.
Dipole Fitting: The Localization Approach
Early quantitative source localization research in electro- and magnetocardiography promoted the equivalent current dipole as a generic model of massive electrophysiological activity. Before efficient estimation techniques and software were available, electrophysiologists would empirically solve the MEG/EEG forward and inverse problems to characterize the neural generators responsible for experimental effects detected at the scalp sensors.
This approach is exemplified in (Wood, Cohen, Cuffin, Yarita, & Allison, 1985), where terms such as ‘waveform morphology’ and ‘shape of scalp topography’ are used to discuss the respective sources of MEG and EEG signals. This empirical approach to localization has considerably benefited from the constant increase in the number of sensors of MEG and EEG systems.
Indeed, surface interpolation techniques for sensor data have gained considerable popularity in MEG and EEG research (Perrin, Pernier, Bertrand, Giard, & Echallier, 1987): investigators can now routinely access surface representations of their data on an approximation of the scalp surface – a disc, a sphere – or on the actual head surface extracted from the subject's MRI. (Wood et al., 1985) – like many others – used the distance between the minimum and maximum of the dipolar-looking magnetic field topography to infer the putative depth of a dipolar source model of the data.
Computational approaches to source localization attempt to mimic the talent of electrophysiologists, though with the added benefit of quantification. We have seen that the current dipole model has been adopted as the canonical equivalent generator of the electrical activity of a brain region considered as a functional entity. Localizing a current dipole in the head implies that 6 unknown parameters be estimated from the data (summarized in compact form after the list below):
- 3 for location per se,
- 2 for orientation and
- 1 for amplitude.
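In compact form – with notation introduced here for illustration only – the instantaneous measurements predicted by a single dipole can be written as

$$\mathbf{m} = \mathbf{g}(\mathbf{r})\,\mathbf{q} + \mathbf{n}, \qquad \mathbf{q} = a\,\mathbf{u},\quad \|\mathbf{u}\| = 1,$$

where $\mathbf{m}$ gathers the instantaneous sensor measurements, $\mathbf{g}(\mathbf{r})$ is the gain (lead-field) matrix of a dipole at location $\mathbf{r}$ (3 parameters), $\mathbf{u}$ its orientation (2 angular parameters), $a$ its amplitude (1 parameter), and $\mathbf{n}$ collects the nuisance terms.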
Therefore, characterizing the source model by a restricted number of parameters was considered as a possible solution to the ill-posed inverse problem and has been attractive to many MEG/EEG scientists. Without additional prior information besides the experimental data, the number of unknowns in the source estimation problem needs to be smaller than that of the instantaneous observations for the inverse problem to be well-posed, in terms of uniqueness of a solution. Therefore, recent high-density systems with about 300 sensors would theoretically allow the unambiguous identification of 50 dipolar sources; a number that would probably satisfy the modeling of brain activity in many neuroscience questions.
It appears, however, that most research studies using MEG/EEG source localization adopt a more conservative profile, using far fewer dipole sources (typically <5). The reasons for this are both technical and specific to MEG/EEG brain signals, as we shall now discuss.
Numerical approaches to the estimation of unknown source parameters are generally based on the widely-used least-squares (LS) technique, which attempts to find the set of parameter values that minimizes the (square of the) difference between observations and the predictions of the model (Fig. 9). Biosignals such as MEG/EEG traces are naturally contaminated by nuisance components (e.g., environmental noise and physiological artifacts), which should not be explained by the model of brain activity. These components, however, contribute uncertainty to the estimation of the source model parameters. As a toy example, let us consider noise components that are independent and identically distributed over all 300 sensors. One would theoretically need to adjust as many additional free parameters in the inverse model as there are noise components to fully account for all possible experimental (noisy) observations. We would then end up handling a problem with 300 additional unknowns, adding to the original 300 source parameters, with only 300 data measures available.
Hence, to avoid confusing contributions from nuisances with signals of true interest, the MEG/EEG scientist needs to determine the respective parts of interest (the signal) versus perturbation (noise) in the experimental data. The preprocessing steps reviewed in the earlier sections of this chapter are therefore essential to identify, attenuate, or reject some of the nuisances in the data prior to proceeding to inverse modeling.
Once the data has been preprocessed, the basic LS approach to source estimation aims at minimizing the deviation of the model predictions from the data: that is, the part of the observations that is left unexplained by the source model.
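With the notation sketched above, the LS criterion for a model of $p$ dipoles may be written, for instance, as

$$\hat{\Theta} = \arg\min_{\Theta}\; \Big\| \mathbf{m} - \sum_{i=1}^{p} \mathbf{g}(\mathbf{r}_i)\,\mathbf{q}_i \Big\|^2,$$

where $\Theta$ collects all dipole locations and moments.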
Let us suppose, for the sake of further demonstration, that the data is ideally free of any noisy disturbance, and that we are still willing to fit 50 dipoles to 300 data points. This is, in theory, an ideal case where there are as many unknowns as there are instantaneous data measures. However, the unknowns in the model do not all share the same type of dependency on the data. In the case of a dipole model, doubling the amplitude of the dipole doubles the amplitude of the sensor data. Dipole source amplitudes are therefore said to be linear parameters of the model. The data, however, do not depend linearly on dipole location: the amplitude of the sensor data is altered non-linearly by changes in the depth and position of the elementary dipole source. Source orientation is a somewhat hybrid type of parameter. It is considered that small, local displacements of brain activity can be efficiently modeled by a rotating dipole source at some fixed location. Though source orientation is a non-linear parameter in theory, replacing a free-rotating dipole by a triplet of orthogonal dipoles with fixed orientations is a way to express any arbitrary source orientation as a set of 3 – hence linear – amplitude parameters. Non-linear parameters are more difficult to estimate in practice than linear unknowns. The optimal set of source parameters defined from the LS criterion exists and is theoretically unique when sources are constrained to be dipolar (see, e.g., (Badia, 2004)). However, in practice, non-linear optimization may be trapped by suboptimal values of the source parameters corresponding to a so-called local minimum of the LS objective. Therefore, the practice of multiple dipole fitting is very sensitive to the initial conditions, i.e., the values assigned to the unknown parameters to initiate the search, and to the number of sources in the model, which increases the risk of the optimization procedure being trapped in local, suboptimal LS minima.
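For illustration only, the following minimal sketch shows how single-dipole LS fitting combines a non-linear search over location with a closed-form (linear) solve for the dipole moment. It assumes a simplified free-space magnetic forward model (primary current only, no volume currents or realistic head geometry) and hypothetical sensor arrays; it is not a substitute for an actual forward model.

```python
# Minimal sketch of single-dipole least-squares fitting (illustrative only).
import numpy as np
from scipy.optimize import minimize

MU0 = 4e-7 * np.pi

def dipole_gain(r_dip, sensors, normals):
    """3-column gain matrix: field along each sensor normal per unit dipole moment."""
    d = sensors - r_dip                        # (n_sensors, 3) dipole-to-sensor vectors
    dist = np.linalg.norm(d, axis=1, keepdims=True)
    G = np.zeros((len(sensors), 3))
    for k in range(3):                         # one column per moment component
        q = np.zeros(3)
        q[k] = 1.0
        # Free-space field of a current dipole (primary current only):
        B = MU0 / (4 * np.pi) * np.cross(q, d) / dist**3
        G[:, k] = np.sum(B * normals, axis=1)  # project onto sensor orientations
    return G

def misfit(r_dip, m, sensors, normals):
    """LS misfit after solving the linear (moment) parameters in closed form."""
    G = dipole_gain(r_dip, sensors, normals)
    q, *_ = np.linalg.lstsq(G, m, rcond=None)  # best moment for this location
    return np.sum((m - G @ q) ** 2)

def fit_single_dipole(m, sensors, normals, n_restarts=10, seed=0):
    """Non-linear search over location, with restarts to mitigate local minima."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_restarts):
        r0 = rng.uniform(-0.05, 0.05, size=3)  # random start inside a ~10 cm cube (meters)
        res = minimize(misfit, r0, args=(m, sensors, normals), method="Nelder-Mead")
        if best is None or res.fun < best.fun:
            best = res
    G = dipole_gain(best.x, sensors, normals)
    q, *_ = np.linalg.lstsq(G, m, rcond=None)
    return best.x, q                           # estimated location (m) and moment (A·m)
```

Here, `sensors` and `normals` would be (n_sensors, 3) arrays of magnetometer positions and orientations, and `m` the instantaneous measurement vector; the random restarts illustrate one common mitigation of the local-minimum issue discussed above.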
In summary, even though localizing a number of elementary dipoles matching the number of instantaneous observations is theoretically well-posed, we are facing two issues that will drive us to reconsider the source-fitting problem in practice:
- The risk of overfitting the data: meaning that the inverse model may account for the noise components in the observations, and
- Non-linear searches that tend to be trapped in local minima of the LS objective.
A general rule of thumb when the data is noisy and the optimization involves non-linear dependencies is to keep the complexity of the estimation as low as possible. Taming complexity starts with reducing the number of unknowns so that the estimation problem becomes overdetermined. In experimental sciences, overdeterminacy is not as critical as underdeterminacy. From a pragmatic standpoint, supplementary sensors provide additional information and allow the selection of subsets of channels, which may be less contaminated by noise and artifacts.
The early MEG/EEG literature abounds with studies reporting single dipole source models. The somatotopy of primary somatosensory regions (Okada, Tanenbaum, Williamson, & Kaufman, 1984; Meunier, Lehéricy, Garnero, & Vidailhet, 2003), and primary tonotopic auditory (Zimmerman, Reite, & Zimmerman, 1981) and visual (Lehmann, Darcey, & Skrandies, 1982) responses are examples of studies where the single dipole model contributed to a better temporal characterization of primary brain responses.
Later components of evoked fields and potentials usually necessitate more elementary sources to be fitted. However, this may be detrimental to the numerical stability and significance of the inverse model. The spatio-temporal dipole model was therefore developed to localize the sources of scalp waveforms assumed to be generated by multiple, overlapping brain activations (Scherg & Cramon, 1985). This spatio-temporal model and its associated optimization expect that an elementary source is active for a certain duration – with amplitude modulations – while remaining at the same location with the same orientation. This is a typical example of the introduction of prior information into the MEG/EEG source estimation problem, a notion that will be further developed in the imaging techniques discussed below.
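In matrix form – again with illustrative notation – the spatio-temporal model may be summarized as

$$\mathbf{M} \approx \mathbf{G}(\mathbf{r}_1,\dots,\mathbf{r}_p)\,\mathbf{S},$$

where $\mathbf{M}$ is the (sensors × time samples) data matrix over the analysis window, $\mathbf{G}$ is the gain matrix of $p$ dipoles with fixed locations and orientations, and $\mathbf{S}$ is the ($p$ × time samples) matrix of source amplitude time series. Because locations and orientations are shared across the entire window, the number of free non-linear parameters is greatly reduced relative to fitting a new dipole at every time sample.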
The number of dipoles to be adjusted is also a model parameter that needs to be estimated. However, its estimation leads to a difficult, and usually impractical, optimization problem (Waldorp, Huizenga, Nehorai, Grasman, & Molenaar, 2005). Therefore, the number of elementary sources in the model is often assessed qualitatively by expert users, which may question the reproducibility of such user-dependent analyses. Hence, special care should be taken in evaluating the stability and robustness of the estimated source models. With all that in mind, source localization techniques have proven to be effective, even in complex experimental paradigms (see, e.g., (Helenius, Parviainen, Paetau, & Salmelin, 2009)).
Signal classification and spatial filtering techniques are efficient alternative approaches in that respect. They have gained considerable momentum in the MEG/EEG community in recent years and are discussed in the following subsection.
Scanning Techniques: Spatial Filters, Beamformers and Signal Classifiers
The inherent difficulties of source localization with multiple generators and noisy data have led signal processing researchers to develop alternative approaches, most notably in the glorious field of radar and sonar in the 1970s. Rather than attempting to identify discrete sets of sources by adjusting their non-linear location parameters, scanning techniques have emerged, which proceed by systematically sifting through the brain space to evaluate how a predetermined elementary source model would fit the data at every voxel of the brain volume. For this local model evaluation to be specific to the brain location being scanned, contributions from possible sources located elsewhere in the brain volume need to be blocked. Hence, these techniques are known as spatial filters and beamformers (the metaphor being a virtual beam directed at, and 'listening' exclusively to, some brain region).
These techniques have triggered tremendous interest and applications in array signal processing and have percolated into the MEG/EEG community on several occasions (e.g., (Spencer, Leahy, Mosher, & Lewis, 1992) and, more recently, (Hillebrand, Singh, Holliday, Furlong, & Barnes, 2005)). At each point of the brain grid, a narrow-band spatial filter is formed that evaluates the contribution to the data of an elementary source model – such as a single current dipole or a triplet of current dipoles – while contributions from other brain regions are ideally muted, or at least attenuated. (Van Veen & Buckley, 1988) is a technical introduction to beamformers and excellent further reading.
It is sometimes claimed that beamformers do not solve an inverse problem: this is a bit overstated. Indeed, spatial filters do require a source and a forward model, both of which are confronted with the observations. Beamformers scan the entire expected source space and systematically test the predictions of the source and forward models against the observations. These predictions compose a distributed score map, which should not be misinterpreted as a current density map. More technically – though no details are given here – the forward model needs to be inverted by the beamformer as well; the procedure simply proceeds iteratively, sifting through each source grid point and estimating the output of the corresponding spatial filter. Hence beamformers and spatial filters are truly avatars of inverse modeling.
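For illustration, the linearly-constrained minimum-variance (LCMV) beamformer described in (Van Veen & Buckley, 1988) – notation adapted here, fixed-orientation case – forms, at each scanned location $\mathbf{r}$, the spatial filter

$$\mathbf{w}(\mathbf{r}) = \frac{\mathbf{C}^{-1}\mathbf{g}(\mathbf{r})}{\mathbf{g}(\mathbf{r})^{\top}\mathbf{C}^{-1}\mathbf{g}(\mathbf{r})}, \qquad \hat{s}(\mathbf{r},t) = \mathbf{w}(\mathbf{r})^{\top}\mathbf{m}(t),$$

where $\mathbf{C}$ is the data covariance matrix estimated from sample statistics and $\mathbf{g}(\mathbf{r})$ is the gain vector of the elementary source model at $\mathbf{r}$. The filter output power, $\mathbf{w}(\mathbf{r})^{\top}\mathbf{C}\,\mathbf{w}(\mathbf{r})$, typically provides the score reported at that location.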
Beamforming is therefore a convenient method to translate the source localization problem into a signal detection problem. As with every method that tackles a complex estimation problem, the technique has drawbacks:
- Beamformers depend on the covariance statistics of the noise in the data. The latter may be estimated from the data through sample statistics. However, the number of independent data samples necessary for a robust – and numerically stable – estimation of covariance statistics is proportional to the square of the number of data channels, i.e., of sensors. Hence beamformers ideally require long, stationary episodes of data, such as sweeps of ongoing, unaveraged data, and experimental conditions where behavioral stationarity ensures some form of statistical stationarity in the data (e.g., ongoing movements). (Cheyne, Bakhtazad, & Gaetz, 2006) have suggested that event-related brain responses can be well captured by beamformers using sample statistics estimated across single trials.
- They are more sensitive to errors in the head model. The filter outputs are typically equivalent to local estimates of SNR. However, SNR is not homogeneously distributed across the brain volume: MEG/EEG signals from activity in deeper brain regions, or from gyral generators in MEG, have weaker SNR than the rest of the brain. The consequence is side-lobe leakage from nearby interfering sources, which impedes filter selectivity and therefore the specificity of source detection (Wax & Anu, 1996);
- Beamformers may be fooled by simultaneous activations occurring in brain regions outside the filter pass-band that are highly correlated with the source signals within the pass-band. Such external sources are interpreted as interference by the beamformer, which then blocks the signals of interest because they share the same sample statistics as the interference.
Signal processing researchers had long identified these issues and consequently developed multiple signal classification (MUSIC) as an alternative technique (Schmidt, 1986). MUSIC assumes that signal and noise components in the data are uncorrelated. Strong theoretical results in information theory show that these components live in separate, high-dimensional data subspaces, which can be identified using, e.g., a PCA of the data time series (Golub, 1996). (J. C. Mosher, Baillet, & Leahy, 1999) is an extensive review of signal classification approaches to MEG and EEG source localization.
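The following minimal sketch – with hypothetical variable names and a user-supplied gain function, such as the `dipole_gain` illustrated earlier – shows the subspace-correlation scan at the core of MUSIC-type approaches:

```python
# Minimal sketch of a MUSIC-style scan (illustrative only).
# M: (n_sensors, n_times) data matrix; gain(r): (n_sensors, 3) dipole gain at r.
import numpy as np

def music_scan(M, grid, gain, n_signal):
    """Subspace-correlation score for each candidate source location."""
    # Estimate the signal subspace from an SVD (equivalently, a PCA) of the data matrix.
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    Us = U[:, :n_signal]                       # leading left singular vectors
    scores = np.zeros(len(grid))
    for i, r in enumerate(grid):
        G = gain(r)                            # (n_sensors, 3) local dipole gain
        Qg, _ = np.linalg.qr(G)                # orthonormal basis of the gain columns
        # Largest cosine of the principal angles between span(G) and the signal subspace:
        scores[i] = np.linalg.svd(Us.T @ Qg, compute_uv=False)[0]
    return scores                              # values close to 1 flag likely sources
```

Locations where the score approaches 1 indicate candidate sources whose gain space is well aligned with the estimated signal subspace; the choice of `n_signal` is precisely the delicate definition of the signal subspace discussed next.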
However, the practical application of MUSIC and its variations remains limited by their sensitivity to the accurate definition of the respective signal and noise subspaces. These techniques may be fooled by background brain activity whose signals share similar properties with the event-related responses of interest. An interesting side application of the powerful discrimination ability of MUSIC-like approaches has nevertheless been developed for epilepsy spike sorting (Ossadtchi et al., 2004).
In summary, spatial filters, beamformers and signal classification approaches bring us closer to a distributed representation of the brain's electrical activity. As a caveat, the results generated by these techniques are not an estimation of the current density everywhere in the brain. They represent a score map of a source model – generally a current dipole – evaluated at the points of a predefined spatial lattice, which sometimes leads to misinterpretations. The localization issue now becomes a signal detection problem within the score map (J. Mosher, Baillet, & Leahy, 2003). The imaging approaches we are about to introduce push this detection problem further by estimating the brain current density globally.
Distributed Source Imaging
Source imaging approaches have developed in parallel to the other techniques discussed above. Imaging source models consist of distributions of elementary sources, generally with fixed locations and orientations, whose amplitudes are estimated at once. MEG/EEG source images represent estimates of the global neural current intensity maps, distributed within the entire brain volume or constrained to the cortical surface.
Source image supports consist of either a 3D lattice of voxels or the nodes of a triangulation of the cortical surface. The latter may be based on a template or, preferably, obtained from the subject's individual MRI and confined to a mask of the grey matter. Multiple academic software packages perform the necessary segmentation and tessellation processes from high-contrast T1-weighted MR image volumes.
The cortical surface, tessellated at two resolutions, using: (top row) 10,034 vertices (20,026 triangles with 10 mm2 average surface area) and (bottom row) 79,124 vertices (158,456 triangles with 1.3 mm2 average surface area).
As discussed elsewhere in these pages, the cortically-constrained image model derives from the assumption that MEG/EEG data originates essentially from large cortical assemblies of pyramidal cells, with currents generated from post-synaptic potentials flowing orthogonally to the local cortical surface. This orientation constraint can either be strict (Dale & Sereno, 1993) or relaxed by authorizing some controlled deviation from the surface normal (Lin, Belliveau, Dale, & Hamalainen, 2006).
In both cases, reasonable spatial sampling of the image space requires several thousand elementary sources (typically ~10,000). Consequently, though the imaging inverse problem consists in estimating only linear parameters, it is dramatically underdetermined.
Just as in source localization – where, e.g., restricting the number of sources is a prior that remedies ill-posedness – imaging models need to be complemented by a priori information. This is properly formulated with the mathematics of regularization, as we shall now briefly review.
Adding priors to the imaging model can be adequately formalized in the context of Bayesian inference, where solutions to inverse modeling satisfy both the fit to observations – given some probabilistic model of the nuisances – and additional priors. From a parameter estimation perspective, the maximum of the a posteriori probability distribution of source intensity, given the observations, can be considered the 'best possible model'. This maximum a posteriori (MAP) estimate has been extremely successful in the digital image restoration and reconstruction communities; (Geman & Geman, 1984) is a masterpiece reference of the genre. The MAP is obtained in Bayesian statistics by optimizing the combination of the likelihood of the noisy data – i.e., the predictive power of a given source model – with the a priori probability of that source model.
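Under Gaussian assumptions for the nuisance term, and with notation consistent with the previous equations, the MAP estimate typically reduces to a regularized least-squares problem of the form

$$\hat{\mathbf{J}} = \arg\max_{\mathbf{J}}\, p(\mathbf{J}\,|\,\mathbf{M}) \;=\; \arg\min_{\mathbf{J}}\, \Big\{ \|\mathbf{M}-\mathbf{G}\mathbf{J}\|^{2} + \lambda\, f(\mathbf{J}) \Big\},$$

where $\mathbf{G}$ is the gain matrix of the distributed elementary sources, $\mathbf{J}$ their amplitudes, $f(\cdot)$ expresses the prior (up to constants, the negative log of the prior probability) and $\lambda$ balances data fit against prior.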
We do not want to detail the mathematics of Bayesian inference any further here as this would reach outside the objectives of these pages. Specific recommended further reading includes (Demoment, 1989), for a Bayesian discussion on regularization and (Baillet, Mosher, & Leahy, 2001), for an introduction to MEG/EEG imaging methods, also in the Bayesian framework.
From a practical standpoint, the priors on the source image models may take multiple forms: promote current distributions with high spatial and temporal smoothness, penalize models with currents of unrealistic, non-physiologically plausible amplitudes, favor agreement with fMRI activation maps, or prefer source image models made of piecewise homogeneous active regions, etc. An appealing benefit of well-chosen priors is that they may ensure the uniqueness of the optimal solution to the imaging inverse problem, despite its original underdeterminacy.
Because there is a plethora of relevant priors for MEG/EEG imaging models, it is important to understand that the associated source estimation methods usually share the same technical background. Also, the selection of image priors can be as arbitrary and subjective an issue as the selection of dipoles in the source localization techniques reviewed previously. Comprehensive solutions for this model selection issue are now emerging and will be briefly reviewed further below.
The free parameters of the imaging model are the amplitudes of the elementary source currents distributed on the brain’s geometry. The non-linear parameters (e.g., the elementary source locations) now become fixed priors as provided by anatomical information. The model estimation procedure and the very existence of a unique solution strongly depend on the mathematical nature of the image prior.
A widely-used prior in the field of image reconstruction considers that the expected source amplitudes should be as small as possible on average. This is the well-described minimum-norm (MN) model. Technically speaking, we are referring to the L2-norm; the objective cost function ruling the model estimation is quadratic in the source amplitudes, with a unique analytical solution (Tarantola, 2004). The computational simplicity and uniqueness of the MN model made it very attractive for MEG/EEG early on (Wang et al., 1992).
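In the L2 case, and with the notation above, the regularized MN solution has the well-known closed form

$$\hat{\mathbf{J}}_{MN} = \mathbf{G}^{\top}\big(\mathbf{G}\mathbf{G}^{\top} + \lambda\,\mathbf{I}\big)^{-1}\mathbf{M},$$

which is linear in the data and unique for a given regularization parameter $\lambda$.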
The basic MN estimate is problematic, though, as it tends to favor the most superficial brain regions (e.g., the gyral crowns) and to underestimate contributions from deeper source areas (such as sulcal fundi) (Fuchs, Wagner, Köhler, & Wischmann, 1999).
As a remedy, a slight alteration of the basic MN estimator consists in weighting each elementary source amplitude by the inverse of the norm of its contribution to the sensors. Such depth weighting yields a weighted MN (WMN) estimate, which, like the basic MN, still benefits from uniqueness and linearity in the observations (Lin, Witzel, et al., 2006).
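As a numerical illustration – assuming a precomputed gain matrix `G` (sensors × sources) and data matrix `M`, with variable names chosen here for the sketch – the MN and depth-weighted MN estimates can be computed as follows:

```python
# Minimum-norm and depth-weighted minimum-norm estimates (illustrative sketch).
import numpy as np

def minimum_norm(G, M, lam=1e-2):
    """Classic L2 minimum-norm estimate: J = G.T (G G.T + lam*I)^-1 M."""
    n_sensors = G.shape[0]
    gram = G @ G.T
    # Scale the regularization to the gain matrix (a common heuristic choice).
    K = gram + lam * np.trace(gram) / n_sensors * np.eye(n_sensors)
    return G.T @ np.linalg.solve(K, M)

def weighted_minimum_norm(G, M, lam=1e-2):
    """Depth-weighted MN: compensate each source by the norm of its gain column."""
    w = 1.0 / np.linalg.norm(G, axis=0)   # weaker (deeper) columns get larger weights
    Jw = minimum_norm(G * w, M, lam)      # solve the MN problem on the rescaled gains
    return w[:, None] * Jw                # map back to the original source amplitudes
```

Both estimates are linear in the data, consistent with the uniqueness property noted above; the column-norm weighting is one simple implementation of the depth compensation, and other weighting schemes exist.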
Despite their robustness to noise and their computational simplicity, it is relevant to question the neurophysiological validity of MN priors. Indeed – though reasonably intuitive – there is no evidence that neural currents systematically match a principle of minimal energy. Some authors have speculated that a more physiologically relevant prior would be to minimize the norm of spatial derivatives (e.g., the surface or volume gradient, or Laplacian) of the current map (see the LORETA method in (Pascual-Marqui, Michel, & Lehmann, 1994)). As a general rule of thumb, however, all MN-based source imaging approaches overestimate the smoothness of the spatial distribution of neural currents. Quantitative and qualitative empirical evidence nevertheless demonstrates spatial discrimination within a reasonable range at the sub-lobar brain scale (Darvas, Pantazis, Kucukaltun-Yildirim, & Leahy, 2004; Sergent et al., 2005).
Most of the recent literature on regularized imaging models for MEG/EEG consists of efforts to improve the spatial resolution of MN-based models (see (Baillet, Mosher, & Leahy, 2001) for a review) or to reduce the degree of arbitrariness involved in selecting a generic source model a priori (Mattout, Phillips, Penny, Rugg, & Friston, 2006; Stephan, Penny, Daunizeau, Moran, & Friston, 2009). This results in notable improvements in theoretical performance, though with higher computational demands and practical optimization issues.
As a general principle, we are facing the dilemma of knowing that all priors about the source images are, to some extent, unwarranted – hence that the inverse model is approximate – while hoping it is just not too approximate. This discussion is recurrent in the general context of estimation theory and model selection, as we shall discuss in the next section.
Distributed source imaging of the [120, 300] ms time interval following the presentation of the target face object in the visual RSVP oddball paradigm described before. The images show a slightly smoothed version of one participant's cortical surface. Colors encode the contrast of MEG source amplitudes between responses to target versus control faces. Visual responses are detected by 120 ms and rapidly propagate anteriorly. From 250 ms onwards, strong anterior mesial responses are detected in the cingulate cortex. The latter are the main contributors to the brain response to target detection.