This appendix describes the design of the program xtract and the macro language used to extract visibility and auxiliary data from the GMRT LTA database. Section A.1 is intended for "plain" users of the program. The xtract macro language parser is also available as a stand-alone library, which can be used in other applications. The internal design of the library is described in section B.3 and is targeted at more enterprising users, who will find it useful for extending this program. Section B.4 describes the application programmer's interface (API) of the library. Section B.5 describes the mechanism to extend the list of data/parameters which can be extracted.
The macro language encapsulates the fact that whatever numbers we need to extract from the visibility database are either antenna based, interferometer based (or equivalently, baseline based) and/or a function of time and/or frequency within the given observing band. One may want to extract data for a list of antennas, baselines, and frequency channels, with a selection applied in time.
The macro language syntax is a hybrid of the implicit loops of the write statement of FORTRAN (see the manual for FORTRAN) and the format string used by the output functions of C (see documentation on printf in the manual for C). Three operators are defined in the language, namely base, chan and ant, which loop over a list of baselines, frequency channels and antennas respectively. The implicit loops run over the body of the operators. The body is a semicolon (';') separated list of elements enclosed in a pair of curly braces ('{' and '}'). Each operator must be followed by a body, and an element of the body can itself be another operator-body pair. Hence, nested loops are possible.
In the xtract program, the list of values for the operators is supplied as a list of comma (',') separated values for the keywords baselines, channels, and antenna respectively. An operator for time range selection is also required, but not explicitly defined. All macros are internally the body of this time range operator. The time range selection can be specified via the timestamps keyword.
Various elements which the syntax recognizes are listed in Table B.1. Some of these are independent of any operator, while others need to be part of the body of one or more operators. The elements and the operators required by these elements are also listed in the table.
Elements | Operators Required | Description |
ua,va,wa | ant (antenna based) | (u,v,w) co-ordinates of the antennas |
u,v,w | base (baseline based) | (u,v,w) co-ordinates of the baseline |
re,im | base,chan | Real and imaginary parts of the visibility |
a,p | base,chan | Amplitude and phase of the visibility |
cno | chan | Frequency channel number |
ha,ist,lst | none | Hour Angle, IST time stamp and LST in hrs. |
az,el | none | Antenna azimuth and elevation angles in degrees |
delay,phs0,dphs | ant | The delay applied to the antennas at the delay cards, the fringe rotation phase, and the phase ramp applied at the output of the FFT |
ua,va,wa are the co-ordinates of the antenna in the (u,v,w) co-ordinate system, in units of the wavelength at the center of the observing band, and u = ua_1 - ua_2, where the subscripts refer to the two antennas of a baseline.
The macro to produce a table of rows with the Hour Angle value in the first column, followed by two columns for the real and imaginary parts of the visibility at a single frequency, for all selected baselines, would be
fmt=base{ha;chan{re;im};\\n}
The special element \\n ends each row of the table with a newline character.
The (u,v,w) values for each baseline can be added to each row of the table by the following macro
fmt=base{ha;u;v;w;chan{re;im};\\n}
However, note that the following macro is in error:
fmt=ha;u;v;w;base{chan{re;im};\\n}
This is because the elements u,v,w are a function of the baseline and they do not appear as part of the body of the base operator. The macro parser will generate an error message pointing out the possible error in this macro.
The elements can also be qualified by a C-styled printf format field. Hence, for example, if the ha value needs to be written with a field length of 8 characters and a precision of 3 digits, the fmt string would become
fmt=base{ha%8.3f;u;v;w;chan{re;im};\\n}
The format for the numbers can be of type 'f','g','G','e', or 'E' (see the documentation on the printf function of the C language for more details).
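The effect of such a format field can be checked directly against the C library. The following is a minimal sketch; format_element is a hypothetical helper written for this illustration and is not part of xtract.

```c
#include <stdio.h>

/* Hypothetical helper (not part of xtract): write one value into `out`
 * using a printf-style field such as "%8.3f". Returns the number of
 * characters that snprintf reports for the formatted value. */
int format_element(char *out, size_t outlen, const char *fmt, float value)
{
    /* A float argument is promoted to double, which is what the
     * f, g, G, e and E conversions expect. */
    return snprintf(out, outlen, fmt, value);
}
```

With the field "%8.3f", the value 1.5 is written as three spaces followed by 1.500, filling the 8-character field.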
Various output formats can be generated by changing the order of loops and elements in this syntax. Here are some examples. Each of these will generate a table. The values in the various columns will be as given in the explanation.
fmt=base{ha;u;v;w;chan{re;im};\\n}
Column 1 will be the Hour Angle. Columns 2,3, and 4 will have the
u,v,w values followed by columns for real and
imaginary values for the N values that the chan operator can take.
There will be one such row in the table for each value of the base
operator.
fmt=ha;lst;\\n;base{u;v;w;chan{re;im};chan{a;p};\\n}
This format will generate a table with rows of unequal lengths.
Row 1 will have only the ha and lst values.
Row 2 will have u,v,w in the first 3 columns followed by real, imaginary, amplitude and phase for all channels listed in the chan operator. There will be one such row for every value of the base operator.
fmt=ha;base{u;v;w};\\n;base{chan{re;im}};\\n
This macro will generate a set of two rows of unequal lengths per input data record.
First row will have the ha value, followed by the u,v,w values for each selected baseline.
Second row will have the real and imaginary values of the visibilities for all channels of the chan operator and for all values of the base operator.
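The way the implicit loops expand can be pictured as ordinary nested loops. The sketch below mimics what the macro base{ha;chan{re;im};\\n} produces, under several assumptions made only for illustration: the data sit in plain arrays, values are separated by a single space, and the function name write_rows is hypothetical. It is not the xtract implementation.

```c
#include <stdio.h>

/* Illustration of base{ha;chan{re;im};\n}: one row per baseline, holding
 * ha followed by a (re, im) pair for each channel. re and im are assumed
 * to be nbase x nchan arrays stored row by row. Returns the number of
 * characters written, or -1 if `out` is too small. */
int write_rows(char *out, size_t outlen, float ha,
               const float *re, const float *im, int nbase, int nchan)
{
    size_t used = 0;
    for (int b = 0; b < nbase; b++) {                 /* base{ ... }   */
        int n = snprintf(out + used, outlen - used, "%g ", ha);
        if (n < 0 || (size_t)n >= outlen - used) return -1;
        used += (size_t)n;
        for (int c = 0; c < nchan; c++) {             /* chan{re;im}   */
            n = snprintf(out + used, outlen - used, "%g %g ",
                         re[b * nchan + c], im[b * nchan + c]);
            if (n < 0 || (size_t)n >= outlen - used) return -1;
            used += (size_t)n;
        }
        n = snprintf(out + used, outlen - used, "\n");  /* \n element  */
        if (n < 0 || (size_t)n >= outlen - used) return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Moving an element in or out of a loop body changes the row layout in the same way as moving the corresponding snprintf call in or out of the matching for loop.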
The macro language is used by the application program xtract to extract data from the GMRT visibility database. The most common use of xtract is to extract data in the form of an ASCII table for display and/or further processing (e.g., to compute the antenna pointing errors). The output of xtract can be supplied to another program in two ways.
By default, xtract writes the output on the standard output. Hence, if xtract is started as
xtract | MyProg
the output of xtract will be piped to the standard input of the program named MyProg. The other, probably more convenient, method of piping data is to set the out keyword to '|MyProg'.
The output will be written in ASCII format, preceded by a simple header. Apart from other fields, the header contains information about the number of rows and columns and the labels for each of the columns. This header always ends with the string "#End", after which the data is written. A line beginning with '#' is also written per LTA-scan. It is hoped that users will utilize these facilities to generate more filters to process and display data externally.
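A filter such as MyProg has to skip the header and the per-scan lines before it can read the table. The sketch below shows one way to do this. It assumes that the "#End" string starts a line of its own and that lines fit in the local buffers; the function names skip_header and read_row are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read lines until one that begins with "#End". Returns the number of
 * header lines consumed (including the "#End" line), or -1 if the
 * stream ends before "#End" is seen. */
int skip_header(FILE *in)
{
    char line[1024];
    int n = 0;
    while (fgets(line, sizeof line, in) != NULL) {
        n++;
        if (strncmp(line, "#End", 4) == 0)
            return n;
    }
    return -1;
}

/* Read the next data row into row[], skipping the per-scan lines that
 * begin with '#'. Returns the number of values stored (at most maxcols),
 * or -1 at end of input. */
int read_row(FILE *in, float *row, int maxcols)
{
    char line[4096];
    while (fgets(line, sizeof line, in) != NULL) {
        if (line[0] == '#')
            continue;                    /* per-scan marker, not data */
        int n = 0;
        char *p = line, *end;
        while (n < maxcols) {
            float v = strtof(p, &end);
            if (end == p)                /* no more numbers on this line */
                break;
            row[n++] = v;
            p = end;
        }
        return n;
    }
    return -1;
}
```

A filter would call skip_header once on its standard input and then call read_row in a loop until it returns -1.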
If the output file name begins with a '*', the file name is constructed after stripping the initial '*' character and the data is written in binary format (floating point numbers of size determined by the operator sizeof(float) of C or C++). The data itself is preceded by the ASCII header mentioned above. Hence, out=*tst.bin will produce a file tst.bin, which will contain the output in binary format and out=*|MyProg will pipe the binary data to MyProg.
For convenience of usage, a filter has been incorporated on the output stream of xtract which will supply the data directly to the QDP line plotting package. This filter can be invoked by setting out=>QDP. The output, in this case, will be displayed as a stack of line plots using QDP.
A more general and usable graphical interface to the multiplot features of the freely available line plotting program Gnuplot has been developed by Kudale & Bhatnagar (NCRA Tech. Rep., in preparation). The data can be supplied to this software using the piping mechanism described above. A graphical user interface then allows the user to select the available baselines/antennas and plot them interactively in a flexible manner.
The xtract macros are first interpreted and then compiled in the memory. This compiled code is then executed for every input data record. The details about the compilation and execution of the format string are given below.
The process of compilation of the format string involves two steps.
First, all the loops represented by the operators in the macro are exploded into a linked list (also called the symbol table), with each node of the list corresponding to a valid element of the language. Each element is represented in the memory by a structure of the following type:
typedef struct StructSymbType {
  char Name[NAMELEN];
  char Fmt[FMTLEN];
  int abc[3];
  unsigned int Type;
  float (*func)(char *,float **,int);
  float *fargv[NARGV];
  int fargc;
  float *ptr;
  struct StructSymbType *next;
} SymbType;
All recognized elements (symbols) are tabulated in the memory in a temporary table, which is a list of structures of the following type:
typedef struct TT {
  char *Name;
  unsigned int Class,Type;
} TypeTable;
This table is hard-coded in the file table.h and is used only to validate the symbols in the macro. Once a symbol is validated, its Class and Type information from this table is transferred to the actual symbol table, and the temporary table is destroyed.
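The validation step amounts to a name lookup in this table. The following is a minimal sketch of such a lookup. The numeric codes, the table contents, the use of a NULL name as the end-of-table marker, and the function name lookup_symbol are all assumptions made for illustration; the actual contents of table.h are not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/* Same shape as the TypeTable structure described in the text. */
typedef struct TT { char *Name; unsigned int Class, Type; } TypeTable;

/* Hypothetical numeric codes; the real values are defined by the library. */
enum { DEMO_PTYPE = 1, DEMO_FTYPE = 2 };
enum { DEMO_ANT = 1, DEMO_BASE = 2, DEMO_CHAN = 4 };

/* A few entries taken from the element table: re and im are read from
 * the data buffer, a and p are computed by a function. */
TypeTable demo_table[] = {
    { "re", DEMO_BASE | DEMO_CHAN, DEMO_PTYPE },
    { "im", DEMO_BASE | DEMO_CHAN, DEMO_PTYPE },
    { "a",  DEMO_BASE | DEMO_CHAN, DEMO_FTYPE },
    { "p",  DEMO_BASE | DEMO_CHAN, DEMO_FTYPE },
    { NULL, 0, 0 }                 /* assumed end-of-table marker */
};

/* Return the table entry whose Name matches `name`, or NULL if the
 * symbol is not a valid element. */
const TypeTable *lookup_symbol(const TypeTable *table, const char *name)
{
    for (; table->Name != NULL; table++)
        if (strcmp(table->Name, name) == 0)
            return table;
    return NULL;
}
```

A symbol for which the lookup returns NULL would be reported as an unknown element; otherwise the Class and Type of the entry are copied to the new node of the symbol table.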
Apart from the name of the element and the C styled format string, the nodes of the symbol table also have information about the mechanism to get the numeric value associated with the element. This information is in the field Type of the structure above. Valid types for the elements are listed in Table B.2.
The abc field of the element structure shown earlier holds the values of the three operators (ant, base, and chan) applicable to the element.
Before an element is added to the symbol table, a check is made to ensure that all the required operators (listed in Table B.1) are active. To generate this information about the required operators, elements are further categorized into one of the classes listed in Table B.3.
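One simple way to express this check is with one bit per operator. The sketch below is only an illustration of the idea; the flag names and the function operators_satisfied are hypothetical and do not correspond to identifiers in the library.

```c
/* Hypothetical bit flags, one per operator. */
enum { OP_ANT = 1, OP_BASE = 2, OP_CHAN = 4 };

/* `required` is the set of operators an element needs (its class) and
 * `active` is the set of operators whose body encloses the element.
 * Returns 1 if every required operator is active, 0 otherwise. */
int operators_satisfied(unsigned int required, unsigned int active)
{
    return (required & ~active) == 0;
}
```

For the erroneous macro fmt=ha;u;v;w;base{chan{re;im};\\n}, the element u requires the base operator but is met while no operator is active, so the check fails. The element re inside base{chan{...}} passes, since both base and chan are active.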
The second step in the process of compilation is to fill in the information about the mechanism to get the numeric value of each element in the list. The Type of the element and, if required, the values in the abc array are used for filling in this information.
For elements of type PTYPE, the ptr field is made to point to the location in the memory where the required value is to be found. Elements of this type refer to particular values in a buffer in the memory, and the offsets into that buffer are computed using the abc array. The buffer is generally the one into which data records from the LTA-file are read. Examples of this kind of element are the real/imaginary values of the visibility, etc.
For elements of type FTYPE, the func field is filled with a pointer to a function which will be called when the value of the element is required. If the computation of the value requires some data, the pointers to this data are put in the field fargv and the total number of such pointers is put in the field fargc. These will be passed as arguments to the function when the value of the element is required. The first argument passed to the function will be the name of the element. Examples of this kind of element are the amplitude/phase of the visibility, etc.
For elements of type CHARType, nothing needs to be done. The name of such elements is the character that is to be copied to the output during execution.
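The difference between the pointer-based and the function-based mechanisms can be sketched as follows. Everything here is an assumption made for illustration: DemoNode is a reduced stand-in for SymbType, the visibility buffer is assumed to hold (re, im) pairs with the channel index running fastest within a baseline, and the function-based example returns the squared amplitude (chosen so that the sketch does not need the math library). Only the callback signature is taken from the structure described in the text.

```c
#include <stddef.h>

#define DEMO_NARGV 4

/* Reduced, hypothetical stand-in for SymbType. */
typedef struct DemoNode {
    float (*func)(char *, float **, int);
    float *fargv[DEMO_NARGV];
    int fargc;
    float *ptr;
} DemoNode;

/* Assumed buffer layout: address of the real part for (base, chan);
 * the imaginary part follows it. */
float *vis_re(float *vis, int nchan, int base, int chan)
{
    return vis + 2 * (base * nchan + chan);
}

/* Function-based value: squared amplitude from pointers to re and im. */
float demo_amp2(char *name, float **argv, int argc)
{
    (void)name;                       /* the element name is not needed here */
    if (argc < 2) return 0.0f;
    return (*argv[0]) * (*argv[0]) + (*argv[1]) * (*argv[1]);
}

/* Pointer-based element: remember where the value lives in the buffer. */
void fill_ptype_re(DemoNode *n, float *vis, int nchan, int base, int chan)
{
    n->func = NULL;
    n->fargc = 0;
    n->ptr = vis_re(vis, nchan, base, chan);
}

/* Function-based element: remember the function and its data pointers. */
void fill_ftype_amp2(DemoNode *n, float *vis, int nchan, int base, int chan)
{
    float *re = vis_re(vis, nchan, base, chan);
    n->ptr = NULL;
    n->func = demo_amp2;
    n->fargv[0] = re;
    n->fargv[1] = re + 1;
    n->fargc = 2;
}
```

Because only addresses are stored at compile time, the same node yields new values each time a fresh data record is read into the buffer.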
The process of "execution" of the compiled list of elements is rather simple. The program steps through the entire list of elements and checks the type of each element on the list. If the type is PTYPE, the value of the memory location to which ptr points is copied to the output stream using the format in the Fmt field of the element. If the type is FTYPE, the function specified by func is called with Name, fargv, and fargc as the arguments. The value returned by this function is then copied to the output stream using the format in the Fmt field of the element. If the type is CHARType, the first character of the Name field is copied to the output stream.
Following is an example of a simple routine used for execution of the compiled macros:
/* $Id: xtract.tex,v 1.9 2000/02/18 03:58:24 sanjay Exp sanjay $ */
#include <stdio.h>
#include <fmt.h>
/* Step through the compiled list and store one value per numeric
   element in buf. Returns the number of values stored. */
int ExecuteDef(FILE *fd,SymbType *P,float *buf,int len)
{
  SymbType *i;
  int N=0;
  for (i=P;i;i=i->next)
    {
      switch(i->Type)
        {
        case PType: {buf[N++]= *i->ptr;break;}   /* value read through ptr */
        case FType: {buf[N++]=i->func(i->Name,i->fargv,i->fargc);break;}
        case CHARType: return N;                 /* stop at a character element */
        default: fprintf(stderr,"###Error: Unknown type in ExecuteDef\n");
        }
    }
  return N;
}
If the output is required in the binary format, one can write an equivalent Execute routine, which will ignore the Fmt field and CHARType elements and output the values in the binary format.
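A text-writing counterpart of ExecuteDef, following the execution steps described earlier, might look like the sketch below. It is not the library routine: DemoSymb is a reduced stand-in for SymbType, the type codes are hypothetical, and the name execute_ascii is made up for this example.

```c
#include <stdio.h>

/* Hypothetical type codes; the real ones are defined by the library. */
enum { DEMO_PTYPE, DEMO_FTYPE, DEMO_CHARTYPE };

/* Reduced stand-in for SymbType with the fields used during execution. */
typedef struct DemoSymb {
    char Name[16];
    char Fmt[16];
    unsigned int Type;
    float (*func)(char *, float **, int);
    float *fargv[4];
    int fargc;
    float *ptr;
    struct DemoSymb *next;
} DemoSymb;

/* Walk the compiled list and write text to fd. Returns the number of
 * numeric values written, or -1 if a node has an unknown type. */
int execute_ascii(FILE *fd, DemoSymb *P)
{
    int n = 0;
    for (DemoSymb *i = P; i != NULL; i = i->next) {
        switch (i->Type) {
        case DEMO_PTYPE:            /* value read through the pointer */
            fprintf(fd, i->Fmt, *i->ptr);
            n++;
            break;
        case DEMO_FTYPE:            /* value computed by the function */
            fprintf(fd, i->Fmt, i->func(i->Name, i->fargv, i->fargc));
            n++;
            break;
        case DEMO_CHARTYPE:         /* copy the first character of Name */
            fputc(i->Name[0], fd);
            break;
        default:
            return -1;
        }
    }
    return n;
}
```

The Fmt field is passed to fprintf unchanged, so it must hold a floating point conversion such as %8.3f for the numeric node types.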
The process of compilation and execution of the xtract macro described above is done via a stand-alone library. This section describes the Application Programming Interface (API) of this library.
The C/C++ interface of this library is defined in fmt.h, which must be included in the code. The application must be linked with libjump.a, in addition to all other GMRT Off-line libraries (liboff.a, libregex.a, libkum.a).
The xtract macros are interpreted via the following function call:
int interpret(char *fmtString, struct fftmac *fm, Parameters *Params, SymbType *Inst)
The first argument is the macro as a null-terminated string. The second argument is the fftmac (Bhatnagar 1997a) structure, which holds the various mappings for the LTA database (e.g., sampler to the MAC mapping, etc.). This structure must be filled using services provided by the GMRT Offline Library (getFFTMac method). The third argument is a pointer to the structure of the type Parameters. This structure holds the various parameters which the library uses while executing the macro. Various fields of this structure are described in Section B.4.3. The values of some of the fields of this structure are defined by the user, while others are to be extracted from the LTA database. The fourth argument is a pointer to a structure of type SymbType. This is the table of elements mentioned earlier and must be initialized to NULL before being passed to this routine. A return value of less than EOF indicates an error.
Compilation of the macro string is done via a call to:
int Compile(SymbType *Inst,struct fftmac *fm, struct AntCoord *Tab, Parameters *Params)
The first argument is the symbol table returned by a call to interpret. It now points to the head of a linked list of nodes of type SymbType, terminated by a NULL pointer. The second argument is the fftmac structure. The third argument is the table of antenna co-ordinates. This can be retrieved from the LTA database via the services provided by the GMRT Offline Library (getFFTMac method). The fourth argument is a pointer to the Parameters structure. A return value of less than EOF indicates an error.
If the interpretation and compilation of the macro was successful, the compiled macro can be executed via calls to a user-supplied function of signature
int Exec(FILE *fd, SymbType *Inst, float *Buf, int ProgSize)
fd refers to the output file already opened for writing. Inst is the symbol table returned by interpret. In case the output data is not to be written to any file, the user can write versions of this routine which will fill the data in the buffer Buf. ProgSize is the value returned by Compile.
The data field of the Parameters structure (see section B.4.3) must be made to point to the buffer in which the LTA-data buffers are read. To generate a regular stream of output, corresponding to each input data record, this function must be called every time a new LTA-data record is read.
A few types of Exec functions are provided in the library:
One writes the output data to the fd file descriptor. It does not use the Buf pointer.
One writes the output data to the buffer pointed to by Buf. This buffer must be big enough to hold one floating point number per node of the symbol table (return value of Compile). This does not use the file descriptor.
One supplies output data to the QDP program via a pipe opened via the libpipe.a library. This uses the Buf pointer but does not use fd.
The Parameters structure is of the following type:
typedef struct StructParamType{
  int Norm;
  int *BList, *SList, *AList, *CList;
  int NBase, NScans, NAnt, NChan;
  int dBNBase, dBNChan, dBNAnt;
  int dBStartChan;
  int TimeOff, ParOff,DataOff;
  float Lambda;
  float sd,cd, TUnits;
  char *data;
} Parameters;
The various fields and their uses are as follows:
Norm: This must be set to 1 if the visibility data is to be normalized by the geometric mean of the self correlations. Otherwise this must be set to 0.
BList, SList, AList, CList: These are pointers to the user selected lists of the baseline, scan, antenna and channel numbers respectively. The list of channel numbers must be 0-relative and not the absolute channel index of the data base (which could start with any number between 0 and the maximum number of channels).
Typically, the user selects the baselines and antennas via the baseline/antenna names. These are supplied as strings by the user. Two functions, MkBaselines and MkAntNo, are provided to convert these strings to a list of bit fields in which the bits corresponding to the selected baselines are set to 1. Another routine, toIntList, is provided to convert the bit fields to a list of integers representing the selected baselines. These functions are available in the library liboff.a and are described in Appendix A of the GMRT Offline Library document.
NBase, NScans, NAnt, NChan: These are the lengths of BList, SList, AList and CList.
dBNBase, dBNChan, dBNAnt: These are the numbers of baselines, channels and antennas in the data base.
dBStartChan: This is the number of the first frequency channel in the data base.
TimeOff, ParOff, DataOff: These are the offsets within the LTA-data buffer to locate the time stamp, the auxiliary parameters, and the visibility data itself. These offsets can be extracted from the global header of the database.
sd, cd: These are the values of sin(δ) and cos(δ), where δ is the declination of the pointing center of the telescope. These are used for the calculation of the (u,v,w) co-ordinates during execution.
TUnits: This is the multiplication factor used to convert the time stamp in the data to seconds of time. This is also extracted from the global header.
Lambda: This is the wavelength of the observing frequency in meters.
data: This is the pointer to the beginning of the LTA-data buffer.
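The selection lists such as BList are plain arrays of integers, while the user selection is first held as bit fields. The conversion between the two can be sketched as below. This is written in the spirit of toIntList, whose actual signature is not given in this appendix, so the function name bits_to_list, its arguments, and the assumption that bit i of the field stands for item number i are all hypothetical.

```c
#include <limits.h>

/* Convert a bit field to a list of the indices whose bits are set.
 * `bits` holds nbits flags packed into unsigned ints, lowest bit first.
 * At most maxlen indices are stored in list[]. Returns the number of
 * indices stored. */
int bits_to_list(const unsigned int *bits, int nbits, int *list, int maxlen)
{
    const int word = (int)(sizeof(unsigned int) * CHAR_BIT);
    int n = 0;
    for (int i = 0; i < nbits && n < maxlen; i++)
        if (bits[i / word] & (1u << (i % word)))
            list[n++] = i;
    return n;
}
```

The returned count is what would go into the matching length field (for example NBase for BList), and the filled array into the list pointer itself.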
To add a new element to the xtract macro language, one needs to define the values of Name, Class, and Type of the new symbol in the table of valid elements. This is done by adding an entry to the table in the file table.h (make sure the last element of this list is left unaltered).
One also needs to add a piece of C-code, which will fill the required fields of the structure SymbType (depending upon the Type of the element - the ptr field for PTYPE elements and the func, fargv, and fargc fields for FTYPE elements). It is the responsibility of the programmer to make sure that this code is correct in terms of getting the numeric value of the elements. Also, the programmer must make sure that this code is compatible with the Type of the element. Failing to do so will either generate wrong values or crash the program at the time of execution. This code is to be added in the function Compile in the file Compile.c. The application will need to be rebuilt for the new symbol to be recognized in the fmt syntax.