896 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 53, NO. 3, MARCH 2005 Performance Analysis of Spatial Filtering of RF Interference in Radio Astronomy Sebastiaan van der Tol and Alle-Jan van der Veen, Fellow, IEEE Abstract—Radio astronomical observations are increasingly contaminated by man-made RF interference (RFI). If these signals are continuously present, then they cannot be removed by the usual techniques of detection and blanking. We have previously proposed a spatial filtering technique, where the impact of the interferer is projected out from the estimated covariance data. Assuming that the spatial signature of the interferer is time-varying, several such estimates can be combined to recover the missing dimensions. In this paper, we give a detailed performance analysis of this algorithm. It is shown that the spatial filter introduces a small increase in variance of the estimates (because of the loss in information) and that the algorithm is unbiased in case the true spatial signatures of the interferers are known but that there may be a bias in case the signatures are estimated from the same data. Some of the bias may be removed, and moreover, the bias only affects the auto-correlations, whereas the astronomical information is mostly in the cross-correlations. Index Terms—Algorithm performance, eigenvalues, interference cancellation, radio astronomy, RF interference, spatial filtering. I. INTRODUCTION R ADIO astronomical observations are increasingly contaminated by man-made RF interference. In bands below 2 GHz, we find TV and radio signals, mobile communication (GSM), radar, satellite communication (Iridium), and localization beacons (GPS, Glonass), etc. Although some bands are specifically reserved for astronomy, the stopband filters of some communication systems are not always adequate. Moreover, scientifically relevant observations are not limited to these bands. Hence, there is a growing need for interference cancellation techniques. The output of a radio telescope is usually in the form of correlations: the auto-correlation (power) of a single telescope dish, split into frequency bins and integrated over periods of 10–30 seconds or more, and/or the cross-correlations of several dishes. The astronomer uses several hours of such correlation observations to synthesize images and to create frequency-domain spectra at specific sky locations (in particular for the study of spectral emission and absorption lines). Most interference cancellation today is done at the post-correlation level by rejecting Manuscript received August 18, 2003; revised April 2, 2004. This work was supported in part by the NOEMI project under Contract STW DEL77-476 and by VICI-SPCOM under Contract STW DTC.5893. Parts of this paper were published in . The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Constantinos B. Papadias. The authors are with the Department of Electrical Engineering, Delft University of Technology, 2628 CD Delft, The Netherlands (e-mail: [email protected] tudelft.nl; [email protected]). Digital Object Identifier 10.1109/TSP.2004.842177 suspect correlation products or by specialized imaging algorithms, but these techniques have their limits and interference rejection at shorter time scales is needed. Depending on the interference and the type of instrument, several kinds of radio frequency interference (RFI) mitigation techniques are applicable. Overviews can be found in  and , e.g., intermittent interference such as radar pulses can be detected using short-term Fourier transforms and the contaminated time-frequency cells omitted during long-term integration . However, many communication signals are continuous in time. For a single-dish telescope, there are not many other options1 than to consider an extension by a reference antenna that picks up only the interference, so that adaptive cancellation techniques based on output power minimization can be implemented –. With an array of telescope dishes (an interferometer), spatial filtering techniques are applicable as well. The desired incorrelation matrices, instrument outputs in this case are tegrated to, e.g., 10 s (in practice, this can range from a fraction of a second up to several minutes, and only the nonredundant entries are computed and retained). Based on short-term correlation matrices (integration to e.g., 10 ms) and narrow subband processing, the array signature vector of an interferer can be estimated and subsequently projected out. The resulting longterm averages of these matrices are mostly interference-free, but a correction is needed to take into account that some dimensions were underrepresented. This algorithm was introduced in —we describe it in more detail in Section II. The goal of the present paper is to give a detailed performance analysis of this algorithm. It has to be demonstrated that the information the astronomer is looking for is unaffected by the spatial filter—a delicate affair because this information is at least 15 dB below the noise level and typically much more. In Section III, we derive the bias and standard deviation of the final covariance estimate. We then analyze the effect of the interferer stationarity on the performance of the filter (Section IV). The bias is studied in more detail in Section V, and the theoretical performance equations are verified with simulations. We also show that much of the bias can be corrected. Conclusions are in Section VI. Notation: An overbar denotes complex conjugate, superscript denotes matrix transpose, and denotes complex conjugate transpose. vec denotes the stacking of the columns of a matrix in a vector, the Kronecker product, the column-wise Kronecker product (Khatri–Rao product), and the entrywise multiplication of two matrices of equal size. is the identity 1An exception is a technique based on higher order statistics . 1053-587X/$20.00 © 2005 IEEE VAN DER TOL AND VAN DER VEEN: PERFORMANCE ANALYSIS OF SPATIAL FILTERING OF RF INTERFERENCE IN RADIO ASTRONOMY matrix, and is a vector with all ones. The covariance of an estimated matrix is defined as cov where vec vec . II. DATA MODEL AND ALGORITHM SUMMARY 897 is varying over periods longer than the short-term integration time. Note that even interferers fixed on earth will appear to move as the earth rotates and the telescopes track in the opposite direction. This effect is worked out in Section IV-B. This was the model considered in . The model is easily extended to multiple narrowband interfering sources, in which case, we obtain A5) A. Received Data Model Assume we have a telescope array with elements. We conreceived at the antennas in a sider the signals sufficiently narrow subband. For the interference-free case, the is modeled in complex baseband form array output vector as where is the vector of telescope is the received sky signal possibly due to signals at time , many astronomical sources, assumed on the time scale of 10 s to be a stationary Gaussian vector with covariance matrix (the astronomical “visibilities”), and is the noise vector with independent identically distributed Gaussian entries (this implies accurate calibration). and covariance matrix The astronomer is interested in the nonredundant off-diagonal . entries of If an interferer is present the output vector is modeled as where is the interferer signal with spatial signature vector , which is assumed stationary only over short time intervals. Without loss of generality, we can absorb the unknown ampliinto and, thus, set the power of to 1. tude of We make the following additional assumptions on this model: A1) The noise variance is known from calibration, e.g., from observations of nearby uncontamined frequencies. , so that . This is reasonA2) able as even the strongest sky sources are about 15 dB under the noise floor. This condition will be made more precise in (6). A3) The processing bandwidth is sufficiently narrow, meaning that the maximal propagation delay of a signal across the telescope array is small compared to the inverse bandwidth so that this delay can be represented by a phase shift of the signal. For a maximal baseline , the condition on the maximal bandwidth , where is the speed of light, is e.g., for a maximal baseline of 3000 m, 32 kHz. Without this assumption, it does not make . If sense to define an interferer signature vector the assumption is not satisfied, as for many existing telescopes, a form of subband processing has to be implemented. is stationary over short A4) The interferer signature processing times (say 10 ms). where and has columns corresponding to is a vector with entries. interferers, B. Covariance Model Suppose that we have obtained observations , where is the sampling period. We assume that is stationary at least over intervals of , and construct short-term covariance estimates where is the number of samples per short-term average. The interference filtering algorithm in this paper is based on apto remove the interference, folplying operations to each lowed by further averaging over resulting matrices to obtain a long-term average. as deterministic, the exConsidering the pected value of each is denoted by . According to the has model assumptions, (1) is the interference-free covariance matrix, where and is assumed to be known from calibration. The objective is to estimate the interference-free covariance . C. Spatial Filtering Using Projections In , a spatial filtering algorithm based on projections was introduced. In summary, it is as follows. Momentarily suppose that an orthogonal basis of the is subspace spanned by interferer spatial signatures span known. We can then form a spatial projection matrix (2) . When this spatial filter is applied which is such that to the data covariance matrix, all the energy due to the interferer will be nulled. Indeed, let then 898 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 53, NO. 3, MARCH 2005 When we subsequently average the modified covariance ma, we obtain a long-term estimate trices (3) is an estimate of , but it is biased due to the projection. To correct for this, we first write the two-sided multiplication as a single-sided multiplication employing the matrix identity vec . This gives vec vec vec (4) where If the interference was completely removed, then vec vec vec (5) where A detection metric for the presence of interferers, and their number can be derived from statistical principles (e.g., the classical sequential hypothesis tests –, the AIC , or Rissanen’s MDL test ; see [14, ch.7.8] for additional references). A test which is simpler to implement is that which puts a threshold on the eigenvalues (7) (this equation is based on an asymptotic formula for the largest white Gaussian noise matrix by singular value of a Edelman ). The number of eigenvalues larger than indicates the number of interferers. The eigenvalue threshold test is used here because it is easier to relate it to and to the interference-to-noise ratio (INR), which gives a useful insight. The test does not have a known false-alarm rate, but for the threshold in (7), it is about 2% over a wide range of and . The algorithm relies on the invertibility of , which is constructed from projection matrices. Each projection matrix is invertible only if the spatial is rank deficient. Hence, signature vectors that are projected out are sufficiently varying. , usually, three different In , it was noted that for is full projections are already sufficient to guarantee that rank. III. SIGNAL COVARIANCE ESTIMATE ERROR ANALYSIS In view of this, we can apply a correction the corrected estimate unvec to to obtain vec If the interference was completely projected out, then is an unbiased estimate of the covariance matrix without interference. This was the algorithm introduced in . is not known, and it has to be estiIn practice, span mated. As usual in array processing, this is done by computing an eigenvalue decomposition of the sample covariance matrix and letting in (2) contain the dominant eigenvectors. The , underlying assumption is that, without interference, so that eigenvalues significantly larger than indicate the presence of interferers, and at the same time, the corresponding eigenvectors span the subspace to be projected out. For this to is work, it is essential that the astronomical contribution small compared with the noise [viz. assumption A2)]; otherwise, it would disturb the eigenvalues and the eigenvector directions. This assumption can be made more precise by requiring are much smaller than the standard devithat the entries of ation of the entries of due to finite sample noise, or The result of the algorithm is , which is an estimate of the true covariance matrix . As a measure of accuracy, we will determine the covariance of in three cases: i) interference free, ii) the spatial signatures are known, and iii) the spatial signatures are estimated from the data. The covariance of the short-term averages is defined by cov vec vec where Assuming Gaussian sources (sky signals, interference and noise), the covariance of the short-term averages is given by the standard result (e.g., [16, eq. (A4)]) (8) cov A. Case I: Interference-Free In the interference-free case, without filtering, (6) For a known signal-to-noise ratio (SNR), this translates into a limit on the short-term integration length , e.g., the strongest dB, for which we obtain astronomical source has SNR . Usually, the value of is limited to a lower value by the (non-)stationarity of the interferer. Then, from (8), the covariance of cov is given by is given by VAN DER TOL AND VAN DER VEEN: PERFORMANCE ANALYSIS OF SPATIAL FILTERING OF RF INTERFERENCE IN RADIO ASTRONOMY where . Because (9) cov Hence, all entries of are disturbed by uncorrelated noise with . This interference-free result gives a refervariance ence performance for the estimation of . 899 depends on the variability of , which are the In turn, is sufficiently varying spatial signatures of the interferer. If and the number of antennas is sufficiently large, then for large , , and the performance is similar as in the interferencefree case [see also (16) later in the paper]. In practice, a moderate is sufficient. Even for stationary located interferers, will change because of earth rotation. A more detailed discussion on the conditioning of in that case is found in Section IV-B. B. Case II: Interference With Known Spatial Signatures Suppose the subspace spanned by the spatial signatures of the interferers is deterministic and perfectly known. In that case, the interference will be completely removed, and the algorithms are unbiased by design. According to the algorithm in Section II-C, the estimate of is given by vec vec vec vec (10) where , , and is the . (Note that projection onto the orthogonal complement of and are Hermitian.) Thus, the covariance of the long-term estimate is cov cov where vec cov and uncorrelated for hence . The estimation errors , and vec cov Multiplication by so that ferers vec (11) and are ; vec C. Case III: Interference With Unknown Deterministic Spatial Signatures are unknown, and their In practice, the spatial signatures column span will be estimated from eigendecompositions of the . Because of the estimation error, the projection is not perfect, and there will be a residual that might affect the performance. In previous work , we have presented a first-order perturbation analysis, which said that the performance is the same as in (12): It is in first order unchanged, even if the projections are estimated. Obviously, this cannot be entirely true. A more detailed second-order analysis presented in this subsection reveals that, with projections estimated from an eigenvalue decomposition, a bias is introduced on the main diagonal of , which in certain cases can be significant compared with the standard deviation. We also show that for reasonably large , the covariance of the estimate is as derived before in (12), i.e., not affected by the interference subspace estimation. In this section, we assume for simplicity that the noise power . The main results of the secondhas been normalized to order perturbation analysis are summarized in the following theorem, and proofs are in the Appendices. be a Taylor series Theorem 1: Let expansion of . Assume that . Let be the th eigen(sorted in descending order) and the number of value of dimensions that are projected out in the th interval. Assume that the largest eigenvalues are distinct. Then projects out the contribution of the intervec cov vec . If the eigenvalues of The bias on is given by stationary (independent of ), the bias is (13) are (14) Thus For sufficiently large cov cov (12) cov Compared to (9), this indicates that determines the relative performance of the spatial filtering algorithm of Section II-C. vec Proof: See Appendix B. vec 900 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 53, NO. 3, MARCH 2005 Clearly, (13) and (14) indicate that there is a bias term. Under the assumption that the eigenvalues are stationary and the sky signal powers are much below the noise power, (14) also shows that only the diagonal entries of are biased, even if the interferer is not uniformly distributed over space (in fact, even if is almost stationary). This is acceptable for most astronomy applications (e.g., imaging, spectral analysis). The bias is studied in more detail in Section V. The variance of the estimates is the same as we obtained before in (12). The effect of the projections is described entirely . The properties of are analyzed in the next section. by It follows that and subsequently IV. EXPECTED VALUE OF From the covariance expression (12), it is seen that determines the penalty on the estimation performance due to spacontains the factors by tial filtering. The main diagonal of which the variance of is multiplied compared with the interference-free case. only depends on the projected directions, i.e., on the collection of , the spatial signatures of the interferer over time. We consider a single interferer and two extreme are independent random vectors, with normally cases: I) the are the spatial signatures of an distributed entries, and II) the interferer fixed to earth. A. Case I: Interferer With Normally Distributed Spatial Signatures If a ground-based interferer without direct line-of-sight is moving, then due to fast fading effects, it is reasonable to model the corresponding spatial signature vectors as a collection of i.i.d. random vectors with directions uniformly distributed over the -dimensional unit sphere.2 From . For , converges this model, we can determine to ; hence, converges to . and i.i.d. for different , let Thus, assuming All other cases of mary are 0 due to symmetry. In sum- vec vec is uniformly distributed over the unit sphere. Note that vec (15) The inverse is given by vec vec (16) converges to . The variance of the For large , entries of is determined by the main diagonal of . In particular var var Then, vec (17) e.g., with antennas, the variance of the diagonal entries of is multiplied by approximately 1.29, and the variance of the off-diagonal entries is multiplied by 1.31, or an increase of about 1.1 dB. Alternatively, we can say that if one dimension is always projected out, then 31% more samples are needed to compensate for the loss of information. B. Case II: Stationary Interferers To evaluate this expression, we need to establish the secondthe -th entry and fourth-order moments of . Denote by is zero, except when . of . Due to symmetry, Let ; then, has a beta distribution with parameters [17, p. 487] for which var 2In other cases, this model may represent a workable simplification which allows for a meaningful analysis. to be invertible, the spatial signatures need to be For sufficiently variable. For stationary interferers (not moving, no multipath), the only source of variability is the geometric delay compensation. This is a delay placed between each telescope and the correlator to correct for the different path lengths of the astronomical signal to each of the telescopes: After the correction, signals arriving from the look direction will arrive in-phase deand will be processed coherently. The geometric delay pends on the locations of the telescopes and the direction of the observed field in the sky. Due to the daily rotation of the earth, the stars are moving along the sky, and hence, the geometric delay is time varying. VAN DER TOL AND VAN DER VEEN: PERFORMANCE ANALYSIS OF SPATIAL FILTERING OF RF INTERFERENCE IN RADIO ASTRONOMY 901 For practical reasons, the compensating delay is usually inserted only after frequency downconversion. In this case, the downconversion of an RF signal with carrier frequency and delay introduces a phase shift . For coherent processing of the sky signals, it is therefore necessary to apply an additional time-varying phase correction to each telescope signal after downconversion, which is called fringe stopping , . This correction will make interferers fixed on earth appear to have a telescope-dependent time-varying phase-shift. For a uniform linear East–West array of telescopes, the ge, where is ometrical delay is the longest baseline length in wavelengths, is the declination for a source at the celestial pole), of the sky source ( is the “hour angle” (azimuth) of the source and rad/s is the earth rotation speed based ( on a siderial day) . For time scales of a few seconds, the required phase correction can be approximated by a linear-phase , where is called the “fringe freprogression quency”  The effect of the time-varying fringe correction on the spatial signature of an interferer can thus be modeled as .. . (18) is the spatial signature vector of the interwhere ferer without fringe correction. deFor a stationary interferer and sufficiently large , pends mostly on the number of antennas and the total fringe over the longest baseline and over the complete inrotation tegration period of duration (19) Simulations [see Fig. 1(a) and (b)] show how the asympdiag depends on and . In totic value of Fig. 1(a), the initial interferer signature vector is . It is seen that a reasonably good performance is already obtained once the total fringe rotation is larger than about two cycles (other simulations indicated that this result is independent of the number of samples per fringe cycle). The asymptotic values actually correspond to the same levels as predicted in (16) for the uniformly distributed case. In Fig. 1(b), is selected randomly, and confidence intervals for diag are shown for telescopes. can be translated into a division of the This condition on sky in an “observable” and an “unobservable” area (sky locations that cause insufficient fringe rotations to tolerate the projection of stationary interferers). Let be an angle such that . In view of (19), a condition for sufficient fringe rotation corresponds to a certain minimum value of . Translating this to values for and , it can be shown that this corresponds geometrically to a band from the East horizon over the celestial pole to the West horizon, as is shown in Fig. 1(c), C Fig. 1. (a) max(diag( )) as function of fringe rotation . (b) Confidence intervals for a random initial signature vector (p = 8). (c) Unobservable area of the sky due to a fixed interferer. which depicts the sky as seen above the observatory in a coordinate system relative to the observatory; the vertical direction corresponds to zenith, and the latitude of the observatory was assumed to be 52 . The black ring is the unobservable area. be the smallest value of that is acceptable. The Let of the unobservable band is given by width (20) 902 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 53, NO. 3, MARCH 2005 Fig. 2. Observation of 3C48 in the presence of an interfering signal from a GPS satellite. (a) Eigenvalues of ^ for varying short-term integration length. (b) Eigenvalues of for varying long-term integration length. C R e.g., if rad, s, cm (corresponding m, to an observation frequency of about 1 GHz), and then the width of this band is 16 . C. Experimental Results The preceding results can be illustrated further using two measurement sets taken at the Westerbork Sythesis Radio Telescope (WSRT) in The Netherlands. The measurements consist of correlation products of telescopes with a maximal baseline of m, split into subbands of 40 kHz and subsequently short-term integrated over samples, thus corresponding to a short term integration time of ms. We will consider only a single subband channel. The observed source was 3C48 at declination . The data was offline calibrated to have noise power . The first data set was measured at MHz and contained interference from a GPS satellite. Fig. 2 shows the results. In Fig. 2(a), the eigenvalues of the short-term data covari- are shown for increasing levels of integration ance matrix . For an integration length shorter than 0.5 s, the eigenvalues indicate a single interferer plus white noise. After 0.5 s, the interfer starts to occupy more than one spatial dimension because its motion changes the direction of the instantaneous spatial signature vector . Since this clearly shows that the spatial signature is constant over the short-term integration interval of 10 ms, we subsequently do our spatial filtering at that level and vary the longterm integration length. The question now is whether the resulting is invertible. Fig. 2(b) shows the eigenvalues of for various levels of long-term integration (the short-term integration length is fixed at 10 ms). For integration lengths shorter than 0.5 s, has eigenvalues that are close to zero. After 0.5 s, these eigenvalues start to grow, and after about 3 s, is well conditioned. At the same time, the value of diag drops to about 1.5. The GPS satellite is in the far field and moves at a constant speed. Its spatial signatures can also be modeled by (18), i.e., by a linear phase progression, although the “fringe frequency” now does not correspond to the earth rotation. Indeed, we have verified that the data satisfies this model well and have estimated the corresponding frequency. The dotted vertical line condition is met, and it is seen shows where the that it coincides with the integration length above which is well conditioned. The second data set, shown in Fig. 3, was measured at MHz and contained interference from a stationary interferer, possibly an AM amateur broadcast. The stationarity of the source and the lower observing frequency result in a low fringe frequency. In this case as well, increasing the long-term integration period causes the smallest eigenvalues of to rise and diag to drop but at a later time than in the previous case. In this case, it is seen that at (dotted vertical line), diag has not dropped, as low as in the previous case. The reason is that one antenna was receiving most of the energy, and such an imbalance was not considered in the analysis of the previous section.3 Still, at s, diag has dropped to 2.8 and to 2.1 at s. Integration times of this length are not uncommon for the antenna configuration used for this experiment. Indeed, in radio astronomy, a condition that is used to estimate the longest permissible integration time is  where is the antenna diameter (here 25 m) and the longest baseline length (here 1296 m); as before, rad/s. As a result, s. For s and MHz, (20) gives an unobservable band with width 10 . 3Moreover, this measurement was done 2 hr after the fringe frequency reached its maximum value so that better results can still be achieved. VAN DER TOL AND VAN DER VEEN: PERFORMANCE ANALYSIS OF SPATIAL FILTERING OF RF INTERFERENCE IN RADIO ASTRONOMY 903 and smallest for large INR: In The bias is independent of . For , this is signifthat case, it converges to icant compared with the standard deviation, which is of order . This is even more so for small values of the INR since the bias will be larger. B. Bias for Low INR For low INR (say zero INR), the result in (22) is obviously not valid. The reason is that the analysis assumed that the eigenvectors are unique, whereas for zero INRs, all eigenvalues of are the same and cannot be distinguised—and the corresponding eigenvectors are no longer unique. Hence, the analysis has to be modified for this case. If the interference is not detectable but, nonetheless, a projection is made, it will be in the direction of the largest instanta. The direction of that vector is random, neous eigenvector of but systematically, the component with the largest energy is removed, and this sorting effect will lead to a bias. is obtained as An estimate of this bias for the case , follows. Assume very low INR and SNR; hence, . With projections computed from the largest eigenvector, we have For large (23) for vectors with an arbitrary direction since (see Section IV-A), and the eigenvalues are statistically independent from the eigenvectors. With the asymptotic value of from (15), we can derive Fig. 3. Observation of 3C48 in the presence of a stationary interferer. (a) Eigenvalues of ^ for varying short-term integration length. (b) Eigenvalues of ^ for varying long-term integration length. C R vec vec (24) Thus, for sufficiently large vec vec vec V. BIAS EXPRESSIONS AND CORRECTIONS vec A. Bias for Low SNR The final result of Section III was an expression for the bias of the spatial filtering algorithm (14), viz (21) which is valid for stationary eigenvalues but otherwise independent of the interferer statistics (a normalized noise power is assumed). To gain some insight in this expression for the bias, we specialize to the case of a single interferer and very small . All eigenvalues are 1, exSNR. In that case, . If the interference power is constant cept for over time, we can define the average interference-to-noise ratio so that INR, and per antenna as INR INR (22) vec The expectation of the largest eigenvalue is approximately 4 (25) Inserting this and taking only the dominant terms, we obtain that the bias for undetectable low INR is approximately (26) This is the bias caused by doing projections when they are not needed. It is of order but only affects the main diagonal of . 4It is known that this expression is an overestimation. 904 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 53, NO. 3, MARCH 2005 9 Fig. 4. (a) Bias and (b) standard deviation of the on-diagonal entries of ^ versus input INR. (c) Bias and (d) standard deviation of the off-diagonal entries. From simulations, it is seen that the combined bias for any INR can be estimated rather well as the minimum (in absolute value) of the two estimates (22) and (26). In the above, we set for all . In practice, a different algorithm is used: The number of interferers is detected from the eigenvalues, and a projection is only done if the interference is seen. This avoids most of the bias in (26). However, some bias remains due to false alarm, as will be seen in the simulations. C. Simulations To verify the results in this section, we performed a simulainterferer of varying INR. There is one astrotion with dB and telescopes. The nomical source SNR spatial signature of the interferer in each interval is a random vector with i.i.d. complex normal distributed entries, all with the same average power (INR). The statistics are computed over 4000 Monte Carlo runs. In Fig. 4, we show the bias and standard deviation of three algorithms: i) a direct covariance estimate with no spatial filter, ii) the covariance estimate using the spatial filter with projec, and iii) tions computed from the eigenvectors, with fixed detected from the eigenvalues using a the spatial filter with threshold as in (7). A vertical “threshold” line indicates the INR that is barely detected, i.e., where (assuming large ), or INR INR Fig. 4(a) is the bias of the on-diagonal entries of . The theoretical curve is the minimum (in absolute value) of (22) and (26). It matches very well with the experimental bias of the spatial filter algorithm without detection. Note that the bias is much larger ( 10 to 20 dB) than the contribution of the astronomical signal ( 25 dB) and is also larger than the standard deviation ( 20 dB). The bias of the spatial filter with detection looks a bit erratic because it changes signs two times . For very small INR, it is equal to the value in (26) times the false alarm rate of the detector: approximately 0.031 in this simulation; hence, it is 15 dB lower than the filter without detection. For large INR, the interference is always detected, and it is the same as (22). In between, the bias is caused by the fact that the interference is present but not detected, i.e., it follows the “no filter” line. Fig. 4(c) shows the bias of the off-diagonal entries of . Theoretically, it is zero, but the eigenfilter without detection has a very small nonzero bias for small INR: It is caused by the nonzero SNR, which gives a small preferred direction to the VAN DER TOL AND VAN DER VEEN: PERFORMANCE ANALYSIS OF SPATIAL FILTERING OF RF INTERFERENCE IN RADIO ASTRONOMY 905 to the remaining (averaged) interference. Whether the bias is above the standard deviation (hence visible) depends on ; the bias is independent of , but the standard deviation scales as . Hence, for long integration lengths, there is always an INR window where the interference remains and will be visible after integration. The bias is only present on the main diagonal of . For many astronomical applications, this is acceptable, e.g., astronomical imaging does not use the auto-correlations. For spectral line observations, there is usually a single source of interest in the covariance data, for which the power can be recovered from the off-diagonal entries using factor analysis techniques (a form of self-calibration); see, e.g., . Thus, we expect that the bias on the auto correlations will not be a major impediment, even if it is not removed. VI. CONCLUSIONS Spatial filtering can give a significant reduction in unwanted interference. Its effectiveness depends on the strength of the interferer: Stronger interference is more easily detected and can be better estimated and removed. The quality (covariance) of the resulting estimate is given by (12) and is almost as good as the unperturbed estimate in case the projection direction is sufficiently nonstationary. There is a small increase in variance due to the loss of information in the dimensions that were projected out. , a detecAssuming the noise power is normalized to tion threshold can be defined as 9 Fig. 5. (a) Bias. (b) Standard deviation of the on-diagonal entries of ^ versus input INR after bias correction. eigenvectors. Thus, the spatial filter tends to remove a small fraction of the astronomical signal. The filter with detection did not show a bias because it would be 15 dB lower. In all cases, the variance of the covariance estimate is as predicted (17). D. Bias Removal Since the preceding expressions give quite an accurate idea about the bias, it is natural to try to remove the bias using them. A complication is that the bias is dependent on the INR, on the detected number of interferers in each interval, and on the threshold and the corresponding false alarm rates (these are also dependent on the number of interferers). We have derived some experimental techniques that were able to reduce the bias to below 35 dB for low INRs or INRs above the threshold [see (35) and (36) in Appendix C] . These results are not completely general and depend on knowledge of the false alarm rates of the detectors for various numbers of interferers. Fig. 5 shows the results after this bias correction. Compared to Fig. 4(a), it is seen that most of the bias is removed, except in a region around the detection limit: Interference below this limit is not removed; hence, the bias line follows that of the unfiltered curve, but above the limit, the interference is quickly filtered out completely. Hence, the remaining bias directly corresponds INR where is the number of antennas and the short-term integration length. Interference with INR above this threshold is usually detected and filtered out. Below the threshold, the INR is usually too weak to be seen and will not be filtered out, leading to a bias on the diagonal entries of the covariance estimate. The most difficult interference has INR at this threshold. In addition, for higher INR, spatial filtering can give some bias in the on-diagonal entries of the covariance estimate. This is because the projection directions and sample covariance matrices are estimated from the same data. Some of the bias can be removed, but there are two remaining effects that are hard to compensate for: • In case of very weak/no interference, the false-alarm rate of the detector (a few percent) will lead to projections in random directions. • Interference just below the detection threshold will not be reliably detected and will remain present in the covariance estimate. For randomly oriented interference, the effect will only be visible on the main diagonal of the final covariance estimate. The use of reference antennas can improve the detection of weak interference and the estimation of its projection vectors, can handle interference with stationary projection directions, and is expected to generate smaller bias. Hence, this is an important direction for further research. Algorithms for this have been studied in , , and in our recent work. 906 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 53, NO. 3, MARCH 2005 APPENDIX A PRELIMINARY RESULTS In the proof of Theorem 1 in Appendix B, we use a few results on eigenvalue perturbation theory. They are listed in this section. The results are known—see, e.g., , –—but often derived and written in indexed form (i.e., in terms of individual matrix entries). We will write them in an integrated form using Kronecker products. be the true covariance matrix To set some notation, let be the corresponding finite sample for the th interval and estimate. All our estimates, eigenvalues, eigenvectors etc., are of and can be written in terms of the true functions and a Taylor series expansion on the error function value , or , where , and etc. We will only consider the first-order perturbation second-order perturbation . be an eigenvalue decomposition of , Let where is unitary, and is diagonal with entries sorted in descending order,5 and define Lemma 2: For the unique eigenvectors (i.e., those corresponding to distinct eigenvalues) The following result is a direct result of the preceding two lemmas and appears, e.g., in  and  where it is written using summations. Lemma 4: For the unique eigenvectors of the covariance matrix of Gaussian sources Proof: Using Lemma 2 vec vec Using Lemma 3 If , then , else . APPENDIX B PROOF OF THEOREM 1 Proof: See, e.g.,  and [25, eq. (C.2)]. Lemma 3: For Gaussian sources vec vec vec vec Proof: For the first line, see, e.g., [16, eq. (A4)]. The denote second line is derived from the first as follows: Let the th column of an identity matrix; then, a specific entry of is vec vec Apply the expectation operator and use the first line to obtain In this section, for simplicity of notation, we assume that the noise power has been scaled to unity . We derive the perturbation on in the case that interferers are present in the th period. More precisely, the algorithm computes eigendecompositions and applies the projection , where is the set of dominant eigenvectors (spanning the “signal subspace”). Then, with second-order accuracy Let of be the Taylor series expansion ; then, we can identify . Subsequently vec vec 5The phase ambiguity on the columns of manner; see . U is resolved in some default VAN DER TOL AND VAN DER VEEN: PERFORMANCE ANALYSIS OF SPATIAL FILTERING OF RF INTERFERENCE IN RADIO ASTRONOMY Define ; then, we can identify By expansion of 907 , it can be shown that . Similarly (28) To work out vec vec , we first show that vec . Indeed vec vec vec and so that Note that has some entries on the main diagonal that are does not always positive. Since these do not average to 0, . scale with Continuing, we have vec vec vec vec vec Since likewise for the third term, only the first term remains. It follows that vec vec vec vec vec vec vec vec vec vec , and vec Inserting in (27) and using (28) gives vec vec vec Since vec vec remain, in first order, with As with , in addition, does not scale with because are always positive. Subsequently, we obcertain entries of tain vec vec , the first two terms cancel, and we vec (29) The second-order perturbation on is evaluated as vec vec vec vec vec vec vec vec vec vec vec vec vec vec vec vec vec vec vec (27) vec 908 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 53, NO. 3, MARCH 2005 If is large, then the second and third terms can be neglected, and we obtain vec vec vec Altogether, we obtain vec vec vec which is the result stated in the theorem. If the eigenvalues are stationary (independent of ), this can be further simplified as vec vec vec vec (32) (30) . Using Each of the two terms inside the summation is (the “sky” contribution is ignored, and the “noise” eigenvalues are 1), the first term is worked out as since, by definition of vec vec vec vec vec vec vec vec is with expected value 0, and with nonzero expected value. Thus is vec The bias is given by is given by vec cov vec . The covariance of the estimate of vec In the evaluation of cov , for large , we can ignore , and are known: the covariance is equal to the case where the cov The expected value of this term is worked out as (the tedious but straightforward derivation uses Lemmas 2–4) vec vec The second term in (30) is worked out as vec vec vec Using a similar derivation as before, this is worked out as vec vec (31) vec vec (note that we scaled ). This performance depends only on the total number of samples. A more detailed analysis shows (the covariance will that this result is even valid for small , which can be ignored for be larger by terms of ). APPENDIX C BIAS REMOVAL EQUATIONS . We try to remove the bias by adding correction terms to The complication here is that, even if we know quite precisely the bias as function of INR, in practice, the INR is unknown. Hence, we need to make corrections that are independent of to correct the the INR. The idea is to use the eigenvalues rather than the final . corresponding For higher INRs, we can set, based on (13) (33) VAN DER TOL AND VAN DER VEEN: PERFORMANCE ANALYSIS OF SPATIAL FILTERING OF RF INTERFERENCE IN RADIO ASTRONOMY The added term will increase by a term Bias Removal, Spatial Filtering With Detection and the premultiplication by will give the desired result. Here, is the number of interferers in interval or the number of detected interferers in case of eigenvalue filtering with detection. This correction removes most of the bias for higher INRs. For very low INRs, we need a different correction. In that case, we can set (34) Using (23), we obtain . Preby [cf. (24)] gives the required multiplication of vec . result: For growing INRs, this term would grow linearly and be too large. Hence, even if we always project out one dimension, a correct bias removal still requires detection of the presence of interference. If it is detected, we correct using (33); else (34). A further refinement is needed to take into account the false cases where the bias alarm rate. For very low INR, there are was removed according to (33) instead of (34). The following correction algorithms take this into account as well. Bias Removal, Spatial Filtering of One Dimension Consider the spatial filtering algorithm, where always, dimension is projected out (independent of the actual INR). To correct for the bias, assume that a detector is available, which, . Set in the th interval, detects and has false-alarm rate if if 909 (35) then this removes the bias for low and high INR up to second order terms. Indeed, for high INR, the interference is always detected, and the correction is as in (33). For low INR, we have in number of cases and in number of cases. The expected value of is then since for arbitrary 1-D projections. This gives us back the situation in (34). For , we can take the expression in (25). Consider the spatial filtering algorithm where the number of and the corresponding eigeninterferers is first detected vectors projected out. Only if an interferer is detected, a correction using (33) is needed. If the interference is not detected, then no correction is needed, except to correct for the bias due to false alarm. be the false alarm rate of detecting interLet be the projection ferers when there are only , and let matrix that projects out the first eigenvectors. Without further motivation, we mention the following algorithm: if if (36) interferer The algorithm tries to take into account that for and high INR (hence always detected), it might happen that a , with probsecond eigenvalue is larger than the threshold ( ). ability REFERENCES  S. van der Tol, A. J. van der Veen, and A. J. Boonstra, “Mitigation of continuous interference in radio astronomy using spatial filtering,” in Proc. URSI General Assembly, Maastricht, The Netherlands, Aug. 2002.  A. Leshem, A.-J. van der Veen, and A.-J. Boonstra, “Multichannel interference mitigation techniques in radio astronomy,” Astrophys. J. Supplement Series, vol. 131, pp. 355–373, Nov. 2000.  P. A. Fridman and W. A. Baan, “RFI mitigation methods in radio astronomy,” Astron. Astrophys., vol. 378, pp. 327–344, 2001.  P. A. Fridman, “RFI excision using a higher order statistics analysis of the power spectrum,” Astron. Astrophys., vol. 368, pp. 369–376, 2001.  C. Barnbaum and R. F. Bradley, “A new approach to interference excision in radio astronomy: real time adaptive filtering,” Astron. J., vol. 115, pp. 2598–2614, 1998.  S. W. Ellingson, J. D. Bunton, and J. F. Bell, “Removal of the GLONASS C/A signal from OH spectral line observations using a parametric modeling technique,” Astrophys. J. Supplement, vol. 135, no. 1, pp. 87–93, 2001.  L. Li and B. D. Jeffs, “Analysis of adaptive array algorithm performance for satellite interference cancellation in radio astronomy,” in Proc. URSI General Assembly, Aug. 2002.  J. Raza, A. J. Boonstra, and A. J. van der Veen, “Spatial filtering of RF interference in radio astronomy,” IEEE Signal Process. Lett., vol. 9, no. 2, pp. 64–67, Feb. 2002.  M. S. Bartlett, “Tests of significance in factor analysis,” British J. Psych. (Statist. Sect.), vol. 3, pp. 77–85, 1950.  , “A note on the multiplying factors for various approximations,” J. R. Statist. Soc., ser. B, vol. 16, pp. 296–298, 1954.  T. W. Anderson, “Asymptotic theory for principal component analysis,” Ann. Math. Stat., vol. 34, pp. 122–148, 1963.  H. Akaike, “Information theory and an extension of the Maximum Likelihood principle,” in Proc. 2nd Int. Symp. Inf. Theory, 1973, pp. 267–281.  M. Wax and I. Ziskind, “Detection of the number of coherent signals by the MDL principle,” IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 7, pp. 1190–1196, Aug. 1989.  H. L. van Trees, Optimum Array Processing (Part IV of Detection, Estimation, and Modulation Theory). New York: Wiley-Interscience, 2002. 910  A. Edelman, “The distribution and moments of the smallest eigenvalue of a random matrix of Wishart type,” Lin. Alg. Appl., vol. 159, pp. 55–80, 1991.  M. Kaveh and A. J. Barabell, “The statistical performance of the MUSIC and the minimum-norm algorithms in resolving plane waves in noise,” IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-34, pp. 331–341, Apr. 1986 [Corrections in ASSP-34 (6) 1986].  K. V. Mardia, J. T. Kent, and J. M. Bibby, Multivariate Analysis. New York: Academic, 1979.  R. A. Perley, F. Schwab, and A. H. Bridle, Eds., Synthesis Imaging in Radio Astronomy (A Collection of Lectures From the Third NRAO Synthesis Imaging Summer School). San Francisco, CA: Astronomical Soc. Pacific, 1989, vol. 6.  A. R. Thompson, J. M. Moran, and G. W. Swenson, Interferometry and Synthesis in Radioastronomy, Second ed. New York: Wiley, 2001.  A. J. Boonstra and A. J. van der Veen, “Gain calibration methods for radio telescope arrays,” IEEE Trans. Signal Process., vol. 51, pp. 25–38, Jan. 2003.  B. D. Jeffs, K. Warnick, and L. Li, “Improved interference cancellation in synthesis array radio astronomy using auxiliary antennas,” in Proc. IEEE ICASSP, Hong Kong, Apr. 2003.  B. D. Jeffs, L. Li, and K. Warnick, “Auxiliary antenna assisted interference cancellation for radio astronomy arrays,” IEEE Trans. Signal Processing, vol. 53, pp. 439–451, Feb. 2005.  J. H. Wilkinson, The Algebraic Eigenvalue Problem. Oxford, U.K.: Clarendon, 1965.  T. W. Anderson, An Introduction to Multivariate Analysis. New York: Wiley, 1984.  B. Ottersten, M. Viberg, and T. Kailath, “Analysis of subspace fitting and ML techniques for parameter estimation from sensor array data,” IEEE Trans. Signal Process., vol. 40, pp. 590–600, Mar. 1992.  N. Yuen and B. Friedlander, “Asymptotic performance analysis of ESPRIT, higher-order ESPRIT, and virtual ESPRIT algorithms,” IEEE Trans. Signal Process., vol. 44, pp. 2537–2550, Oct. 1996.  B. Friedlander and A. J. Weiss, “On the second-order statistics of the eigenvectors of sample covariance matrices,” IEEE Trans. Signal Processing, vol. 46, pp. 3136–3139, Nov. 1998. IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 53, NO. 3, MARCH 2005 Sebastiaan van der Tol was born in The Netherlands in 1977. He received the M.Sc. degree in electrical engineering from Delft University of Technology, Delft, The Netherlands, in 2004, where, since January 2004, he has been a research assistant and has been pursuing the Ph.D. degree in electrical engineering. His current research interests include array signal processing and interference mitigation techniques for large phased-array radio telescopes. Alle-Jan van der Veen (S’87–M’94–SM’02–F’05) was born in The Netherlands in 1966. He graduated (cum laude) from Department of Electrical Engineering, Delft University of Technology, Delft, The Netherlands, in 1988, from which he received the Ph.D. degree (cum laude) in 1993. Throughout 1994, he was a postdoctoral scholar at Stanford University, Stanford, CA. At present, he is a Full Professor in the Signal Processing Group of DIMES, Delft University of Technology. His research interests are in the general area of system theory applied to signal processing and, in particular, to algebraic methods for array signal processing. Dr. van der Veen received the 1994 and the 1997 IEEE SPS Young Author paper award and was an Associate Editor for IEEE TRANSACTIONS ON SIGNAL PROCESSING from 1998 to 2001. He was chairman of the IEEE SPS SPCOM Technical Committee from 2002 to 2004 is and Editor-in-Chief of the IEEE SIGNAL PROCESSING LETTERS.