Next: Bibliography Up: S.Bhatnagar: Thesis Previous: Antenna/baseline naming convention   Contents

Computation of antenna based complex gains

The normalized cross-correlation function (the correlator output), measured by an interferometer using two antennas labeled by $ i$ and $ j$, in the limit $ I \ll T^s_i/\eta_i$, can be written as:

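\begin{displaymath}\begin{split}\rho_{ij}^{Obs}=\rho^{Obs}(u_{ij},v_{ij},w_{ij})=&\sqrt{{\eta_i\eta_j}\over{T^s_i T^s_j}}\, e^{-\iota(\phi_i-\phi_j)}\int\limits_{-\infty}^{+\infty}\int\limits_{-\infty}^{+\infty} I(l,m)\, e^{-2\pi\iota(u_{ij}l+v_{ij}m+w_{ij}\sqrt{1-l^2-m^2})} \\
 &{dl dm\over\sqrt{(1-l^2-m^2)}} + \epsilon_{ij}\end{split}\end{displaymath} (15.1)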

where $ I(l,m)$ is the sky surface brightness, $ \eta_i$ is the sensitivity and $ T^s_i$ is the system temperature of the antenna $ i$ in units of Kelvin/Jy and Kelvin respectively, $ \epsilon_{ij}$ is the additive noise on the baseline $ i$-$ j$, and $ \phi_i$ is the antenna based phase of the signal. The rest of the symbols have the usual meaning.

In practice, however, the antenna based amplitude ( $ \sqrt{{\eta_i}/{T^s_i}}$) and phase ($ \phi_i$) are potentially time varying quantities. The variations could be due to changes in the ionosphere, temperature variations, ground pick up, antenna blockage, noise pick up by various electronic components, background temperature, etc. Treating the quantities under the square root in the above equation as the antenna dependent amplitude gains, these can be written as complex gains $ g_i=a_ie^{-\iota \phi_i}$ where $ a_i=\sqrt{\eta_i/T^s_i}$. For an unresolved source at the phase tracking center, variations in this amplitude will be indistinguishable from variations in the ratio of $ \eta$ and $ T^s$.

In terms of $ g_i$s, we can write Equation D.1 as

$\displaystyle \rho_{ij}^{Obs} = g_i g^\star_j \rho^\circ_{ij} + \epsilon_{ij}$ (15.2)

where

$\displaystyle \rho_{ij}^\circ=\int\limits_{-\infty}^{+\infty}\int\limits_{-\infty}^{+\infty} I(l,m)\, e^{-2\pi\iota(u_{ij}l+v_{ij}m+w_{ij}\sqrt{1-l^2-m^2})} {dl dm\over\sqrt{(1-l^2-m^2)}}$ (15.3)

The use of the term ``antenna based gains'' for the $ g_i$s causes confusion for many and needs some clarification. The $ g_i$s are called complex ``gains'' since they multiply the complex quantity $ \rho^\circ_{ij}$. For an unresolved source, $ \left\vert g_i\right\vert$ represents the fraction of correlated signal and $ \arg(g_i)$ represents the phase of the correlated part of the signal from the antenna with respect to the phase reference (usually the reference antenna). It is in this sense that they are referred to as ``antenna based'' gains. However, as defined here, they include $ T^s$, which in turn includes the sky background temperature. They are therefore a function of direction in the sky. Here we assume that the angular scale over which the $ g_i$s vary is larger than the antenna primary beam (the isoplanatic case).

For an unresolved source at the phase tracking center, all terms in the exponent of $ \rho_{ij}^\circ$ are exactly zero. $ \rho_{ij}^\circ$ in this case would be proportional to the flux density of the source.

Assuming that the antenna dependent complex gains are independent, with a Gaussian probability density function (this implies that the real and imaginary parts are independent Gaussian random processes), one can estimate the $ g_i$s by minimizing, with respect to the $ g_i$s, the function $ S$ given by

$\displaystyle S = \sum_{{i,j} \atop {i \ne j}}{\left\vert\rho_{ij}^{Obs} - g_i g_j^\star \rho_{ij}^\circ\right\vert}^2 w_{ij}$ (15.4)

where $ w_{ij}=1/\sigma^2_{ij}$, $ \sigma^2_{ij}$ being the variance of the measurement of $ \rho^{Obs}_{ij}$.

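Dividing the residual in the above equation by $ \rho_{ij}^\circ$ (the source model, which is presumed to be known; it is trivially known for an unresolved source), absorbing the resulting factor $ \left\vert\rho_{ij}^\circ\right\vert^2$ into the weights $ w_{ij}$, and writing $ \rho_{ij}^{Obs}/\rho_{ij}^\circ = X_{ij}$, we get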

$\displaystyle S = \sum_{{i,j} \atop {i \ne j}}{\left\vert X_{ij} - g_i g_j^\star\right\vert}^2 w_{ij}$ (15.5)

If $ \rho_{ij}^\circ$ represents the structure of the source accurately, $ X_{ij}$ will have no source dependent terms and is purely a product of the two antenna dependent complex gains.
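
As a minimal sketch of this normalization, the Python fragment below simulates noise-free correlations for a point source and forms $ X_{ij}$. The gain values, the flux density and the function name `normalize` are illustrative assumptions and are not taken from the text.

```python
import cmath

def normalize(rho_obs, rho_model):
    """Return X_ij = rho_obs_ij / rho_model_ij for i != j (diagonal set to 0)."""
    n = len(rho_obs)
    return [[rho_obs[i][j] / rho_model[i][j] if i != j else 0j
             for j in range(n)] for i in range(n)]

# Hypothetical antenna gains g_i = a_i * exp(-i * phi_i), phases in radians.
gains = [cmath.rect(a, -phi) for a, phi in [(1.0, 0.0), (0.9, 0.3), (1.1, -0.4)]]
flux = 2.5  # assumed flux density; the model is the same on every baseline

n = len(gains)
rho_model = [[flux] * n for _ in range(n)]
# Noise-free observed correlations: g_i * conj(g_j) * model.
rho_obs = [[gains[i] * gains[j].conjugate() * flux for j in range(n)]
           for i in range(n)]

X = normalize(rho_obs, rho_model)
# With an accurate model and no noise, X_ij is just g_i * conj(g_j).
```

In this idealized case the flux density cancels completely, which is the sense in which $ X_{ij}$ carries no source dependent terms.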

Expanding Equation D.5, we get

$\displaystyle S=\sum_{{i,j} \atop {i \ne j}}\left[ \left\vert X_{ij}\right\vert^2 - g_i^\star g_j X_{ij} - g_i g_j^\star X_{ij}^\star + g_i g_i^\star g_j g_j^\star\right] w_{ij}$ (15.6)

Evaluating $ {\partial S \over \partial g_i^\star}$ and equating it to zero, we get

$\displaystyle {\partial S \over \partial g_i^\star} =  \sum_{j \atop {j \ne i}}\left[-g_j X_{ij} w_{ij} +g_i g_j g_j^\star w_{ij}\right] =  0$ (15.7)

or

$\displaystyle g_i = {\sum\limits_{j \atop {j \ne i}} X_{ij} g_j w_{ij} \over \sum\limits_{j \atop {j \ne i}} \left\vert g_j\right\vert^2 w_{ij}}$ (15.8)

This can also be derived by equating the partial derivatives of $ S$ with respect to real and imaginary parts of $ g_i$ as shown in Section D.3.

Since the antenna dependent complex gains also appear on the right-hand side of Equation D.8, it has to be solved iteratively starting with some initial guess for $ g_j$s or initializing them all to 1.

Equation D.8 can be written in the iterative form as:

$\displaystyle g_i^n = g_i^{n-1} + \alpha\left[{\sum\limits_{j \atop {j \ne i}} X_{ij} g_j^{n-1} w_{ij} \over \sum\limits_{j \atop {j \ne i}} \left\vert g_j^{n-1}\right\vert^2 w_{ij}} - g_i^{n-1}\right]$ (15.9)

where $ n$ is the iteration number and $ 0<\alpha<1$. Convergence would be defined by the constraint

$\displaystyle \left\vert S_n-S_{n-1}\right\vert < \delta$ (15.10)

(the change in $ S$ from one iteration to another) where $ \delta$ is the tolerance limit.
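
The iteration can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `solve_gains`, the nested-list data layout, and the default values of `alpha`, `tol` and `max_iter` are all hypothetical choices. The update is written as a relaxation, so that each step moves $ g_i$ a fraction $ \alpha$ of the way toward the weighted-average estimate.

```python
def solve_gains(X, w, alpha=0.5, tol=1e-12, max_iter=500):
    """Iteratively estimate antenna gains g_i from X_ij ~ g_i * conj(g_j).

    X (complex) and w (real) are N x N nested lists; diagonal entries are
    ignored. Returns (gains, iterations_used).
    """
    n = len(X)
    g = [1 + 0j] * n  # initial guess: all gains set to 1

    def chi2(gains):
        # S = sum over i != j of |X_ij - g_i conj(g_j)|^2 w_ij
        return sum(abs(X[i][j] - gains[i] * gains[j].conjugate()) ** 2 * w[i][j]
                   for i in range(n) for j in range(n) if i != j)

    s_prev = chi2(g)
    for iteration in range(1, max_iter + 1):
        new_g = []
        for i in range(n):
            num = sum(X[i][j] * g[j] * w[i][j] for j in range(n) if j != i)
            den = sum(abs(g[j]) ** 2 * w[i][j] for j in range(n) if j != i)
            estimate = num / den  # weighted average of g_i from the previous gains
            new_g.append(g[i] + alpha * (estimate - g[i]))
        g = new_g
        s = chi2(g)
        if abs(s - s_prev) < tol:  # change in S below the tolerance
            return g, iteration
        s_prev = s
    return g, max_iter
```

Because only products $ g_i g_j^\star$ are constrained, the solution is determined up to a common phase; in practice the phases would be referred to a reference antenna after the solver returns.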


Interpretation of the equation

Equation D.8 can be understood intuitively as follows.

$ X_{ij}$ is a product of two complex numbers, namely $ g_i$ and $ g_j^\star$, which we wish to determine. $ X_{ij}$ is itself derived from the measured quantity $ \rho^{Obs}_{ij}$. Numerically speaking, each term in the summation in the numerator of Equation D.8 involves $ g_i$ (via $ X_{ij}$), and multiplying $ X_{ij}$ by $ g_j w_{ij}$ gives $ g_i$ an effective weight of $ \left\vert g_j\right\vert^2 w_{ij}$. Since the denominator is the sum of these effective weights, the right-hand side of Equation D.8 can be interpreted as the weighted average of $ g_i$ over all correlations with antenna $ i$.

In the very first iteration, when $ g_j=(1,0)$, the normalization would be incorrect since the numeric value of $ g_j$ as it appears inside $ X_{ij}$ would be different from that used in the denominator of Equation D.8. However, as the estimates of the $ g_j$s improve with iterations, the equation progressively approaches a true weighted average. The speed of convergence depends upon the value of $ \alpha$, and convergence is defined by the constraint in Equation D.10. In the ideal case when the true values of all the $ g_i$s are known, the right-hand side of Equation D.8 reduces to $ g_i$.

Estimating $ g_i$ for an antenna by averaging over the measurements from all baselines in which it participates (for an unresolved source) makes sense since, for an N element array, $ g_i$ would be present in N-1 measurements (all the $ \left. X_{ij}\right\vert _{j=1,N; j \ne i}$) and the best estimate of $ g_i$ would be the weighted average of all these measurements. The proper weight for $ g_i$, buried in each of the products $ X_{ij}$, can be found heuristically as follows. $ g_i$, estimated from the measurements of a given baseline, must be weighted by the signal-to-noise ratio on that baseline. This is $ w_{ij}$ in the above equations. It must also be weighted by the amplitude gain of the other antenna making the baseline, namely $ \left\vert g_j\right\vert$, to account for variations in antenna sensitivities and $ T^s$. The total weight for $ g_i$ would then be $ \left\vert g_j\right\vert^2 w_{ij}$, the sum of which appears in the denominator of Equation D.8. Knowing that ideally $ X_{ij} = g_i g_j^\star$, each of the $ \left. X_{ij}\right\vert _{j=1,N; j \ne i}$ must be multiplied by $ g_j w_{ij}$ (to apply the above mentioned weights to $ g_i$) before being summed over all values of $ j$ and normalized by the sum of weights to form the weighted average of $ g_i$. One thus arrives at Equation D.8 using these heuristic arguments.
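
The weighted-average reading can be checked numerically. The sketch below uses hypothetical gain values and noise-free data for a unit-flux point source; the function name `weighted_average_gain` is an assumption made for this illustration.

```python
def weighted_average_gain(i, X, g, w):
    """Weighted average of g_i over all baselines that include antenna i."""
    others = [j for j in range(len(g)) if j != i]
    # Each X_ij is multiplied by g_j * w_ij, i.e. g_i gets weight |g_j|^2 w_ij.
    num = sum(X[i][j] * g[j] * w[i][j] for j in others)
    den = sum(abs(g[j]) ** 2 * w[i][j] for j in others)  # sum of the weights
    return num / den

# Hypothetical true gains and noise-free data: X_ij = g_i * conj(g_j).
true_g = [1.0 + 0.0j, 0.86 - 0.27j, 1.01 + 0.43j, 0.78 - 0.16j]
n = len(true_g)
X = [[true_g[i] * true_g[j].conjugate() for j in range(n)] for i in range(n)]
w = [[1.0 if i != j else 0.0 for j in range(n)] for i in range(n)]

# With the true gains inserted, the weighted average returns g_i itself.
exact = [weighted_average_gain(i, X, true_g, w) for i in range(n)]

# With the first-iteration guess g_j = (1, 0), the normalization is off.
first_guess = [weighted_average_gain(i, X, [1 + 0j] * n, w) for i in range(n)]
```

Comparing `exact` and `first_guess` with `true_g` shows the two regimes described above: the exact weighted average once the gains are known, and the mismatched normalization of the first iteration.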


Estimation of the system temperature ($ T^s$)

For an unresolved source of known brightness $ I$, in the limit $ T^a \ll T^s$ (where $ T^a=\eta I$ is the antenna temperature due to the source), $ \rho_{ij}^\circ=I$ and Equation D.1 can be written as

$\displaystyle \rho_{ij}^{Obs} = I g_ig_j^\star \approx  I\sqrt{{\eta_i\eta_j} \over {T^s_i T^s_j}}$ (15.11)

where $ \eta_i=A_e/{2k_b}$, $ A_e$ is the effective area of the dish, $ k_b$ is Boltzmann's constant and

$\displaystyle \left\vert g_i\right\vert = \sqrt{\eta_i \over T^s_i}$ (15.12)

Hence, knowing $ \eta_i$, $ T^s_i$ can be estimated from the amplitude of the antenna dependent complex gains.
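
A minimal sketch of this inversion is given below. The sensitivity and system temperature are hypothetical numbers chosen only to show the round trip, and `system_temperature` is an illustrative name.

```python
import cmath
import math

def system_temperature(g, eta):
    """T^s in Kelvin from a complex gain g and a sensitivity eta in K/Jy.

    Inverts |g| = sqrt(eta / T^s); the phase of g plays no role.
    """
    return eta / abs(g) ** 2

eta = 0.32           # hypothetical sensitivity, K/Jy
t_sys_true = 100.0   # hypothetical system temperature, K

# Simulated gain with an arbitrary phase of 0.7 radians.
g = cmath.rect(math.sqrt(eta / t_sys_true), -0.7)
t_sys = system_temperature(g, eta)  # recovers t_sys_true up to rounding
```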

All contributions to $ \rho_{ij}^{Obs}$ which cannot be factored into antenna dependent gains will result in a reduction of $ \left\vert g\right\vert$. With $ \eta$ remaining constant, this will be indistinguishable from an increase in the effective system temperature. Since the majority of the later processing of interferometry data for mapping (primary calibration, bandpass calibration, SelfCal, etc.) is done by treating the visibility as a product of two antenna based numbers, it is this effective system temperature which will determine the noise in the final map (though, as a final step in the mapping process, baseline based calibration can possibly improve the noise in the map).

In the normal case of no significant baseline based terms ( $ \epsilon_{ij}$) in $ X_{ij}$, the system temperature as measured by the above method will be equivalent to any other determination of $ T^s_i$.

$ T^s$ can also be determined by recording interferometric data for a strong point source with and without an independent noise source of known temperature at each antenna. In this case

$\displaystyle T^s_i = T^n_i\left({\left\vert g_i^{ON}\right\vert^2 \over \left\vert g_i^{OFF}\right\vert^2 - \left\vert g_i^{ON}\right\vert^2}\right)$ (15.13)

where $ g_i^{ON}$ and $ g_i^{OFF}$ are the antenna dependent gains with and without the noise source of temperature $ T^n$. Note that $ \eta_i$ does not enter this equation. Also, $ T^n$ should be such that $ \sqrt{T^a/(T^n+T^s)} \ge 0.1$ to ensure that the correlated signal is measured with sufficient signal-to-noise ratio. For example, for P-band, a calibrator with P-band flux density $ >5$ Jy must be used.
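
The noise-source method can be sketched as follows, assuming that the gain amplitudes follow $ \left\vert g\right\vert^2=\eta/T$ with $ T=T^s$ when the noise source is off and $ T=T^s+T^n$ when it is on. The numbers and the function name `tsys_from_noise_source` are hypothetical.

```python
import math

def tsys_from_noise_source(g_on, g_off, t_noise):
    """System temperature from gains measured with the noise source on and off."""
    on2 = abs(g_on) ** 2
    off2 = abs(g_off) ** 2
    return t_noise * on2 / (off2 - on2)

# Hypothetical values used only to simulate the two gain measurements.
eta = 0.3        # sensitivity in K/Jy (cancels out of the result)
t_sys = 100.0    # system temperature in K
t_noise = 50.0   # temperature of the injected noise source in K

g_off = math.sqrt(eta / t_sys)
g_on = math.sqrt(eta / (t_sys + t_noise))

estimate = tsys_from_noise_source(g_on, g_off, t_noise)  # close to 100 K
```

Changing `eta` in this sketch leaves `estimate` unchanged, which illustrates why the sensitivity does not enter the equation.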


Derivation of $ g_i$ using real and imaginary parts

The $ g_i$s are complex quantities. One can therefore write $ S$ in terms of $ g_i^R$ and $ g_i^I$, the real and imaginary parts of $ g_i$, and minimize with respect to $ g_i^R$ and $ g_i^I$ separately. It is shown here that the complex arithmetic achieves exactly this and that the results are the same as those given by complex calculus. The superscripts $ R$ and $ I$ in the following are used to represent the real and imaginary parts of complex quantities.

Expanding Equation D.5, ignoring the $ w_{ij}$s and writing it in terms of real and imaginary parts, we get

\begin{displaymath}\begin{split}\sum\limits_{{i,j} \atop {i \ne j}}\left\vert X_{ij} - g_i g_j^\star\right\vert^2 &=\sum\limits_{{i,j} \atop {i \ne j}}\left\vert \left(X_{ij}^R+\iota X_{ij}^I\right) - \left(g_i^R+\iota g_i^I\right)\left(g_j^R-\iota g_j^I\right)\right\vert^2 \\ &=\sum\limits_{{i,j} \atop {i \ne j}} S_0 S_0^\star \end{split}\end{displaymath} (15.14)

where

$\displaystyle S_0=\left[X_{ij}^R- g_i^R g_j^R- g_i^I g_j^I\right] + \iota \left[X_{ij}^I+ g_i^R g_j^I- g_i^I g_j^R\right]$ (15.15)

Taking the partial derivative of $ S$ with respect to $ g_i^R$ and reintroducing $ w_{ij}$, we get

\begin{displaymath}\begin{split}{\partial S \over \partial g_i^R}=&\sum\limits_{j \atop {j \ne i}}\left[-2g_j^R\left(X_{ij}^R- g_i^R g_j^R- g_i^I g_j^I\right) + 2g_j^I\left(X_{ij}^I+ g_i^R g_j^I- g_i^I g_j^R\right)\right]w_{ij}\\ =&-2\sum\limits_{j \atop {j \ne i}}\left[X_{ij}^R g_j^R - X_{ij}^I g_j^I - g_i^R {g_j^I}^2 - g_i^R {g_j^R}^2\right]w_{ij} \end{split}\end{displaymath} (15.16)

Therefore, since $ Re(X_{ij}g_j)=X_{ij}^R g_j^R - X_{ij}^I g_j^I$,

$\displaystyle {\partial S \over \partial g_i^R}= -2\sum\limits_{j \atop {j \ne i}}\left[Re(X_{ij}g_j) - \left\vert g_j\right\vert^2 g_i^R\right]w_{ij}$ (15.17)

Equating $ \partial S \over \partial g_i^R$ to zero, we get

$\displaystyle g_i^R= {\sum\limits_{j \atop {j \ne i}}Re(X_{ij}g_j w_{ij}) \over {\sum\limits_{j \atop {j \ne i}}\left\vert g_j \right\vert^2 w_{ij}}}$ (15.18)

Similarly

$\displaystyle {\partial S \over \partial g_i^I}=-2\sum\limits_{j \atop {j \ne i}}\left[Im(X_{ij}g_j) - \left\vert g_j\right\vert^2 g_i^I\right] w_{ij}$ (15.19)

Therefore the imaginary-part counterpart of Equation D.18 is

$\displaystyle g_i^I= {\sum\limits_{j \atop {j \ne i}}Im(X_{ij}g_j w_{ij}) \over {\sum\limits_{j \atop {j \ne i}}\left\vert g_j \right\vert^2 w_{ij}}}$ (15.20)

Writing $ g_i=g_i^R+ \iota g_i^I$ and substituting for $ g_i^R$ and $ g_i^I$ from Equations D.18 and D.20 respectively, we get

$\displaystyle g_i = {\sum\limits_{j \atop {j \ne i}}X_{ij}g_j w_{ij} \over {\sum\limits_{j \atop {j \ne i}}\left\vert g_j \right\vert^2 w_{ij}}}$ (15.21)

This is the same as Equation D.8, which was arrived at by evaluating a complex derivative of Equation D.5 as $ \partial S/\partial g_i^\star$, treating $ g_i$ and $ g_i^\star$ as independent variables. Evaluating $ {\partial S \over \partial g_i}=0$ would give the complex conjugate of Equation D.21. Hence, $ \partial S/\partial g_i$ gives no information that is not already present in $ \partial S/\partial g_i^\star$.
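
The equivalence can also be checked numerically. The sketch below compares the analytic derivatives with respect to the real and imaginary parts of $ g_i$ against central finite differences, and then confirms that the gain given by Equation D.8 makes both derivatives vanish. As in the derivation above, only the terms of $ S$ in which antenna $ i$ is the first index are considered. All numerical values and function names are hypothetical.

```python
def row_chi2(i, gi, X, g, w):
    """Terms of S with antenna i as first index, with g_i replaced by gi."""
    return sum(abs(X[i][j] - gi * g[j].conjugate()) ** 2 * w[i][j]
               for j in range(len(g)) if j != i)

def analytic_gradient(i, X, g, w):
    """(dS/dRe(g_i), dS/dIm(g_i)) from the real and imaginary derivation."""
    d_re = d_im = 0.0
    for j in range(len(g)):
        if j == i:
            continue
        xg = X[i][j] * g[j]
        d_re += -2.0 * (xg.real - abs(g[j]) ** 2 * g[i].real) * w[i][j]
        d_im += -2.0 * (xg.imag - abs(g[j]) ** 2 * g[i].imag) * w[i][j]
    return d_re, d_im

def numeric_gradient(i, X, g, w, h=1e-6):
    """Central finite differences in the real and imaginary parts of g_i."""
    d_re = (row_chi2(i, g[i] + h, X, g, w)
            - row_chi2(i, g[i] - h, X, g, w)) / (2 * h)
    d_im = (row_chi2(i, g[i] + 1j * h, X, g, w)
            - row_chi2(i, g[i] - 1j * h, X, g, w)) / (2 * h)
    return d_re, d_im

# Arbitrary test data (not a converged solution).
X = [[0j, 0.9 - 0.2j, 1.1 + 0.4j],
     [0.9 + 0.2j, 0j, 0.8 + 0.5j],
     [1.1 - 0.4j, 0.8 - 0.5j, 0j]]
w = [[0.0, 1.0, 2.0], [1.0, 0.0, 0.5], [2.0, 0.5, 0.0]]
g = [1.0 + 0.1j, 0.9 - 0.3j, 1.2 + 0.2j]

analytic = analytic_gradient(0, X, g, w)
numeric = numeric_gradient(0, X, g, w)

# Gain for antenna 0 from the weighted-average formula (Equation D.8).
num = sum(X[0][j] * g[j] * w[0][j] for j in (1, 2))
den = sum(abs(g[j]) ** 2 * w[0][j] for j in (1, 2))
g_stationary = [num / den] + g[1:]
gradient_at_solution = analytic_gradient(0, X, g_stationary, w)
```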


Sanjay Bhatnagar 2005-07-07