Appendix B
xtract: Usage and programming with the library

This appendix describes the design of the program xtract and the macro language used to extract visibility and auxiliary data from the GMRT LTA database. Section B.1 is intended for “plain” users of the program. The xtract macro language parser is also available as a stand-alone library, which can be used in other applications. The internal design of the library is described in Section B.3 and is aimed at more enterprising users, who will find it useful when extending this program. Section B.4 describes the application programming interface (API) of the library. Section B.5 describes the mechanism for extending the list of data/parameters which can be extracted.

B.1 The macro language

The macro language encapsulates the fact that whatever numbers we need to extract from the visibility database are either antenna based, interferometer based (or equivalently, baseline based) and/or a function of time and/or frequency within the given observing band. One may want to extract data for a list of antennas, baselines and frequency channels, with a selection applied in time.

The macro language syntax is a hybrid of the implicit loops of the write statement of FORTRAN (see the manual for FORTRAN) and the format string used by the output functions of C (see documentation on printf in the manual for C). Three operators are defined in the language, namely base, chan and ant, which loop over a list of baselines, frequency channels and antennas respectively. The implicit loops run over the body of the operators. The body is a semicolon (’;’) separated list of elements enclosed in a pair of curly braces (’{’ and ’}’). Each operator must be followed by a body, and the elements of the body can themselves be operator-body pairs. Hence, nested loops are possible.

In the xtract program, the list of values for the operators is supplied as a list of comma (’,’) separated values for the keywords baselines, channels, and antenna respectively. An operator for time range selection is also required, but not explicitly defined. All macros are internally the body of this time range operator. The time range selection can be specified via the timestamps keyword.

Various elements which the syntax recognizes are listed in Table B.1. Some of these are independent of any operator, while others need to be part of the body of one or more operators. The elements and the operators required by these elements are also listed in the table.



Table B.1: List of elements which can be part of the body of operators in xtract macros

Elements          Operators Required   Description
----------------  -------------------  -----------------------------------------------
ua,va,wa          ant                  (antenna based) (u,v,w) co-ordinates of the antennas
u,v,w             base                 (baseline based) uij (= uai - uaj) for the baseline ij
re,im             base,chan            Real and Imaginary parts of the visibility
a,p               base,chan            Amplitude and phase of the visibility
cno               chan                 Frequency channel number
ha,ist,lst        none                 Hour Angle, IST time stamp and LST in hrs.
az,el             none                 Antenna Azimuth and Elevation angles in degrees
delay,phs0,dphs   ant                  The delay in μs applied to the antennas at the delay cards,
                                       the fringe rotation phase, and the phase ramp applied at
                                       the output of FFT




ua,va,wa are the co-ordinates of the antenna in the (u,v,w) co-ordinate system in units of the wavelength of the center of the observing band and u=ua1-ua2, where the subscripts refer to the two antennas of a baseline.

The macro to produce a table of rows with Hour Angle (HA) value in the first column followed by two columns for the real and imaginary parts of the visibility at a single frequency for all selected baselines, would be

                fmt=base{ha;chan{re;im};\\n}

The special element \\n represents the actual character that will appear at the given position in the output (which is the NEWLINE character here). The only other special element that this syntax currently allows is \\t (TAB).

The (u,v,w) values for each baseline can be added to each row of the table by the following macro

                fmt=base{ha;u;v;w;chan{re;im};\\n}

However, note that the following macro is in error

                fmt=ha;u;v;w;base{chan{re;im};\\n}

This is because the elements u,v,w are a function of the baseline and they do not appear as part of the body of the base operator. The macro parser will generate an error message pointing out the possible error in this macro.

The elements can also be qualified by a C-styled printf format field. Hence, for example, if the value HA needs to be written with field length of 8 characters and precision of 3 digits, the fmt string would become

                fmt=base{ha%8.3f;u;v;w;chan{re;im};\\n}

The format for the numbers can be of type ’f’,’g’,’G’,’e’, or ’E’ (see the documentation on printf function of C language for more details).
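As a small illustration of what such a qualifier does, the sketch below hands the format text straight to the C library. The helper name format_value is hypothetical and is not part of xtract; it only shows the effect of a field such as %8.3f on a single value.

```c
#include <stdio.h>

/* Hypothetical helper (not part of xtract): render one value using the
   printf-style field attached to an element, e.g. "%8.3f" for ha%8.3f.
   Returns what snprintf returns: the number of characters the field
   needs, excluding the terminating NUL. */
int format_value(char *out, size_t outlen, const char *fmt, float value)
{
  /* A float is promoted to double when passed to a printf-style function. */
  return snprintf(out, outlen, fmt, (double)value);
}
```

With fmt set to "%8.3f" and a value of 1.5, out holds the eight characters "   1.500": three digits of precision, padded on the left to a field length of 8.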

Various output formats can be generated by changing the order of loops and elements in this syntax. Here are some examples. Each of these will generate a table. The values in the various columns will be as given in the explanation.

B.2 Output filters

The macro language is used by the application program xtract to extract data from the GMRT visibility database. The most common use of xtract is to extract data in the form of an ASCII table for display and/or further processing (e.g., to compute the antenna pointing errors). The output of xtract can be supplied to another program in two ways.

By default, xtract writes the output on the standard output. Hence if xtract is started as

             xtract | MyProg

the output of xtract will be piped to the standard input of the program named MyProg. The other, probably more convenient, method of piping data is to set the out keyword to ’|MyProg’.

The output will be written in ASCII format, preceded by a simple header. Apart from other fields, the header contains information about the number of rows and columns and the labels for each of the columns. This header always ends with a string “#End”, after which the data is written. A line beginning with ’#’ is also written per LTA-scan. It is hoped that users will utilize these facilities to generate more filters to process and display data externally.
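A minimal sketch of such an external filter is given below. It assumes that the “#End” marker starts a line of its own, that columns are separated by white space, and that no line is longer than the local buffer; none of these details is guaranteed by the description above. The function names skip_header and sum_column are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Consume the header, up to and including the line starting with "#End".
   Returns 0 on success, -1 if the input ends before the marker is seen. */
int skip_header(FILE *in)
{
  char line[4096];

  while (fgets(line, sizeof line, in))
    if (strncmp(line, "#End", 4) == 0) return 0;
  return -1;
}

/* Add up column col (counted from 0) over all data rows that follow.
   Lines beginning with '#' (the per-scan lines) are ignored, as are rows
   with too few columns. Returns the number of rows that were used. */
long sum_column(FILE *in, int col, double *sum)
{
  char line[4096];
  long rows = 0;

  *sum = 0.0;
  while (fgets(line, sizeof line, in)) {
    char *p = line, *end;
    double v = 0.0;
    int c;

    if (line[0] == '#') continue;
    for (c = 0; c <= col; c++) {
      v = strtod(p, &end);
      if (end == p) break;      /* no further number on this line */
      p = end;
    }
    if (c > col) { *sum += v; rows++; }
  }
  return rows;
}
```

A program built around these two calls, reading from stdin, could take the place of MyProg in the pipe shown earlier.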

If the output file name begins with a ’*’, the file name is constructed after stripping the initial ’*’ character and the data is written in binary format (floating point numbers of size determined by the operator sizeof(float) of C or C++). The data itself is preceded by the ASCII header mentioned above. Hence, out=*tst.bin will produce a file tst.bin, which will contain the output in binary format and out=*|MyProg will pipe the binary data to MyProg.

For convenience, a filter has been incorporated on the output stream of xtract which supplies the data directly to the QDP line plotting package. This filter can be invoked by setting out=>QDP. The output in this case will be displayed as a stack of line plots using QDP.

A more general and usable graphical interface to the multiplot features of the freely available line plotting program Gnuplot has been developed (Kudale & Bhatnagar, NCRA Tech. Rep., in preparation). The data can be supplied to this software using the piping mechanism described above. A graphical user interface then allows the user to select the available baselines/antennas and plot them interactively in a flexible manner.

B.3 Internal design

The xtract macros are first interpreted and then compiled in memory. This compiled code is then executed for every input data record. The details of the compilation and execution of the format string are given below.

B.3.1 Macro compilation

The process of compilation of the format string involves two steps.

First, all the loops represented by the operators in the macro are exploded into a linked list (also called the symbol table), with each node of the list corresponding to a valid element of the language. Each element is represented in the memory by a structure of the following type:

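                 typedef struct StructSymbType {
                   char Name[NAMELEN];                  /* name of the element */
                   char Fmt[FMTLEN];                    /* C-styled output format */
                   int abc[3];                          /* values of the ant, base and chan operators */
                   unsigned int Type;                   /* how the value is obtained (Table B.2) */
                   float (*func)(char *,float **,int);  /* FTYPE: function returning the value */
                   float *fargv[NARGV];                 /* FTYPE: data pointers passed to func */
                   int fargc;                           /* FTYPE: number of pointers in fargv */
                   float *ptr;                          /* PTYPE: location of the value */
                   struct StructSymbType *next;         /* next node of the linked list */
                 } SymbType;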

All recognized elements (symbols) are tabulated in the memory in a temporary table, which is a list of structure of the following type:

                 typedef struct TT {  
                   char *Name;  
                   unsigned int Class,Type;  
                 } TypeTable;

This table is hard-coded in the file table.h and is used only to validate the symbols in the macro. Once a symbol is validated, its Class and Type information is transferred from this table to the actual symbol table, and the temporary table is then destroyed.
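The validation step can be pictured as a simple search through such a table. The sketch below is only an illustration: the numeric codes for classes and types, the handful of entries, the NULL-named sentinel and the function name lookup_symbol are all assumptions, and the real table.h may be organized differently. The classes and types assigned to the entries follow Tables B.1 to B.3 and the examples given in this section.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical numeric codes; the real ones are defined by the library. */
enum { IV, AV, BV, CV, BCV, ABCV };   /* classes, as in Table B.3 */
enum { CHARType, FType, PType };      /* types, as in Table B.2   */

typedef struct TT {
  char *Name;
  unsigned int Class,Type;
} TypeTable;

/* Illustrative subset only, closed by a sentinel entry with a NULL name. */
static TypeTable Table[] = {
  {"ha",  IV,  FType},
  {"ist", IV,  PType},
  {"re",  BCV, PType},
  {"im",  BCV, PType},
  {"a",   BCV, FType},
  {"p",   BCV, FType},
  {NULL,  0,   0}
};

/* Return the table entry for name, or NULL if the symbol is not known. */
const TypeTable *lookup_symbol(const char *name)
{
  const TypeTable *t;

  for (t = Table; t->Name != NULL; t++)
    if (strcmp(t->Name, name) == 0) return t;
  return NULL;
}
```

A NULL return corresponds to an unrecognized symbol in the macro; otherwise the Class and Type of the entry are what would be copied into the new node of the symbol table.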

Apart from the name of the element and the C styled format string, the nodes of the symbol table also have information about the mechanism to get the numeric value associated with the element. This information is in the field Type of the structure above. Valid types for the elements are listed in Table B.2.


Table B.2: Table of element types in xtract macro language

Type      Meaning
--------  ----------------------------------------------------------------
CHARType  Represents a character to output
FTYPE     Function type: the value will be returned by a call to the
          function func
PTYPE     Pointer type: the value will be in the buffer at the location
          pointed to by ptr



The abc field of the element structure shown earlier holds the values of the three operators (ant, base, and chan) applicable to the element.

Before an element is added to the symbol table, a check is made to ensure that all the required operators (listed in Table B.1) are active. To generate this information about the required operators, elements are further categorized into one of the classes listed in Table B.3.


Table B.3: Table of valid classes of the elements of xtract macro language

Class  Operators Required
-----  ------------------
IV     None
AV     ant
BV     base
CV     chan
BCV    base,chan
ABCV   ant,base,chan
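One way to picture the check on the required operators is as a comparison of bit masks. The sketch below is an assumption about how this could be done, not a description of the library: the bit values, the numeric class codes and the function names required_ops and operators_ok are all hypothetical.

```c
/* Hypothetical encoding: one bit per operator whose body is currently open. */
enum { OP_ANT = 1, OP_BASE = 2, OP_CHAN = 4 };

/* Class codes in the order of Table B.3; the numeric values are assumed. */
enum { IV, AV, BV, CV, BCV, ABCV };

/* Operators that must enclose an element of the given class. */
unsigned int required_ops(int cls)
{
  switch (cls) {
    case AV:   return OP_ANT;
    case BV:   return OP_BASE;
    case CV:   return OP_CHAN;
    case BCV:  return OP_BASE | OP_CHAN;
    case ABCV: return OP_ANT | OP_BASE | OP_CHAN;
    default:   return 0;              /* IV: no operator is needed */
  }
}

/* Non-zero if every operator required by cls is in the active set. */
int operators_ok(int cls, unsigned int active)
{
  unsigned int need = required_ops(cls);

  return (active & need) == need;
}
```

For the erroneous macro fmt=ha;u;v;w;base{chan{re;im};\\n} of Section B.1, the element u (class BV) is met while no operator is active, so operators_ok(BV, 0) returns 0; inside the body of base, operators_ok(BV, OP_BASE) returns 1.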



Once the element is validated for the required active operators, a new node is created in the symbol table and filled with the Name, Type and Class of the element. By this time, the loops (represented by the list of values associated with the various operators) have already been exploded (i.e., a node has been created in the symbol table for each value of the operator). Information about the values of the operators is transferred to the symbol table for every value of the active operators, and the values of the required operators are put in the abc array (passive operators are assigned a value of -1). If no error has occurred by this point, the syntax is known to be correct and all the elements in the macro have been recognized.
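The explosion of the loops can be illustrated for a macro of the form base{chan{re;im}}. The sketch below uses a reduced, hypothetical node type (Node) rather than SymbType, assumes that the abc array is ordered as ant, base, chan, and treats baselines as plain integers; the function names are likewise made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reduced, hypothetical node: only the fields needed to show the expansion. */
typedef struct Node {
  char Name[16];
  int abc[3];            /* assumed order: ant, base, chan; -1 = not active */
  struct Node *next;
} Node;

/* Release a list built by explode_base_chan(). */
void free_list(Node *head)
{
  while (head) { Node *n = head->next; free(head); head = n; }
}

/* Expand base{chan{e0;e1;...}} into one node per (baseline, channel, element).
   Returns the head of the list; NULL if the list is empty or memory runs out. */
Node *explode_base_chan(const int *bl, int nb, const int *ch, int nc,
                        const char **elems, int ne)
{
  Node *head = NULL, *tail = NULL, *n;
  int b, c, e;

  for (b = 0; b < nb; b++)
    for (c = 0; c < nc; c++)
      for (e = 0; e < ne; e++) {
        n = malloc(sizeof *n);
        if (!n) { free_list(head); return NULL; }
        snprintf(n->Name, sizeof n->Name, "%s", elems[e]);
        n->abc[0] = -1;        /* ant is not active in this macro */
        n->abc[1] = bl[b];
        n->abc[2] = ch[c];
        n->next = NULL;
        if (tail) tail->next = n; else head = n;
        tail = n;
      }
  return head;
}
```

For two baselines, two channels and the elements re and im, the list has eight nodes, in the order in which the values appear in one output row.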

The second step in the process of compilation is to fill in the information about the mechanism to get the numeric value of each element in the list. The Type of the element and, if required, the values in the abc array are used for filling in this information.

For elements of type PTYPE, the ptr field is made to point to the location in memory where the required value is to be found. Elements of this type refer to particular values in a buffer in memory and need offsets into that buffer, which can be computed using the abc array. The buffer is generally the one into which data records from the LTA-file are read. Examples of elements of this kind are IST, real/imaginary values of the visibility, etc.

For elements of type FTYPE, the func field is filled with a pointer to a function which will be called when the value of the element is required. If the computation of the value requires some data, the pointers to this data are put in the field fargv and the total number of such pointers is put in the field fargc. These will be passed as arguments to the function when the value of the element is required. The first argument passed to the function will be the name of the element. Examples of elements of this kind are HA, amplitude/phase of the visibility, etc.
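A sketch of what an FTYPE element might look like is given below, using the hour angle as the example. The relation HA = LST - RA is standard, but everything else is an assumption made for illustration: that fargv[0] and fargv[1] point to the LST and the right ascension, both in hours in the range 0 to 24, the array sizes and the type code, and the names HourAngle and bind_hour_angle. The function actually used by xtract may obtain the hour angle differently.

```c
#include <string.h>

#define NAMELEN 16   /* hypothetical sizes; the real ones come with fmt.h */
#define FMTLEN  16
#define NARGV    4
#define FType    1   /* hypothetical numeric code for FTYPE */

typedef struct StructSymbType {
  char Name[NAMELEN];
  char Fmt[FMTLEN];
  int abc[3];
  unsigned int Type;
  float (*func)(char *,float **,int);
  float *fargv[NARGV];
  int fargc;
  float *ptr;
  struct StructSymbType *next;
} SymbType;

/* FTYPE function: hour angle in hours, folded into the range [-12,12).
   Assumes argv[0] -> LST and argv[1] -> RA, both in hours within [0,24). */
float HourAngle(char *name, float **argv, int argc)
{
  float ha;

  (void)name;                       /* the name is not needed here */
  if (argc < 2) return 0.0f;
  ha = *argv[0] - *argv[1];
  if (ha >= 12.0f) ha -= 24.0f;
  else if (ha < -12.0f) ha += 24.0f;
  return ha;
}

/* Fill the FTYPE fields of a node, as the second compilation step would. */
void bind_hour_angle(SymbType *node, float *lst, float *ra)
{
  strcpy(node->Name, "ha");
  node->Type = FType;
  node->func = HourAngle;
  node->fargv[0] = lst;
  node->fargv[1] = ra;
  node->fargc = 2;
}
```

Because the node stores pointers rather than values, the same node gives a fresh result each time a new record is read into the locations that fargv points to.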

For elements of type CHARType, nothing needs to be done. The name of such elements is the character that is to be copied to the output during execution.

B.3.2 Macro execution

The process of “execution” of the compiled list of elements is rather simple. The program steps through the entire list of elements and checks the type of each element on the list. If the type is PTYPE, the value at the memory location to which ptr points is copied to the output stream using the format in the Fmt field of the element. If the type is FTYPE, the function specified by func is called with Name, fargv, and fargc as the arguments. The value returned by this function is then copied to the output stream using the format in the Fmt field of the element. If the type is CHARType, the first character of the Name field is copied to the output stream.

Following is an example of a simple routine used for execution of the compiled macros:

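 #include <stdio.h>
 #include <fmt.h>

 /* Evaluate the compiled list P and store the values in buf.
    Returns the number of values stored. The arguments fd and
    len are not used by this version. */
 int ExecuteDef(FILE *fd,SymbType *P,float *buf,int len)
 {
   SymbType *i;
   int N=0;

   for (i=P;i;i=i->next)
     {
      switch(i->Type)
        {
         case PType:     /* value is read from the location ptr */
          {buf[N++]= *i->ptr;break;}
         case FType:     /* value is returned by the function func */
          {buf[N++]=i->func(i->Name,i->fargv,i->fargc);break;}
         case CHARType:  /* a character element ends the evaluation */
          return N;
         default:
           fprintf(stderr,"###Error: Unknown type in ExecuteDef\n");
        }
     }
   return N;
 }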
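The routine above collects the values in a buffer. The formatted output described at the start of this section could be sketched as follows. This is an illustration only: it assumes that the Fmt field holds a complete printf conversion for a floating point value, the array sizes and numeric type codes are hypothetical, and ExecuteAscii is a made-up name, not one of the library routines.

```c
#include <stdio.h>

#define NAMELEN 16   /* hypothetical sizes; the real ones come with fmt.h */
#define FMTLEN  16
#define NARGV    4
enum { CHARType, FType, PType };   /* hypothetical numeric codes */

typedef struct StructSymbType {
  char Name[NAMELEN];
  char Fmt[FMTLEN];
  int abc[3];
  unsigned int Type;
  float (*func)(char *,float **,int);
  float *fargv[NARGV];
  int fargc;
  float *ptr;
  struct StructSymbType *next;
} SymbType;

/* Write one output record for the compiled list P to fd.
   Returns the number of numeric values written. */
int ExecuteAscii(FILE *fd, SymbType *P)
{
  SymbType *i;
  int N = 0;

  for (i = P; i; i = i->next)
    switch (i->Type) {
      case PType:      /* copy the value found at ptr */
        fprintf(fd, i->Fmt, (double)*i->ptr);
        N++;
        break;
      case FType:      /* copy the value returned by func */
        fprintf(fd, i->Fmt, (double)i->func(i->Name, i->fargv, i->fargc));
        N++;
        break;
      case CHARType:   /* copy the character itself, e.g. NEWLINE or TAB */
        fputc(i->Name[0], fd);
        break;
      default:
        fprintf(stderr, "###Error: Unknown type in ExecuteAscii\n");
    }
  return N;
}
```

For a list holding a PTYPE node with Fmt set to "%6.2f" that points to the value 1.5, followed by a CHARType node for NEWLINE, the routine writes "  1.50" and a newline.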

If the output is required in binary format, one can write an equivalent Execute routine, which will ignore the Fmt field and the CHARType elements and output the values in binary format.
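Such a binary variant might look like the sketch below. As before, the array sizes and numeric type codes are hypothetical, and ExecuteBin is a made-up name; the routine simply writes each value as a raw float and skips everything else.

```c
#include <stdio.h>

#define NAMELEN 16   /* hypothetical sizes; the real ones come with fmt.h */
#define FMTLEN  16
#define NARGV    4
enum { CHARType, FType, PType };   /* hypothetical numeric codes */

typedef struct StructSymbType {
  char Name[NAMELEN];
  char Fmt[FMTLEN];
  int abc[3];
  unsigned int Type;
  float (*func)(char *,float **,int);
  float *fargv[NARGV];
  int fargc;
  float *ptr;
  struct StructSymbType *next;
} SymbType;

/* Write the values of the compiled list P to fd as raw floats.
   The Fmt field and CHARType elements are ignored.
   Returns the number of values written, or -1 on a write error. */
int ExecuteBin(FILE *fd, SymbType *P)
{
  SymbType *i;
  float v;
  int N = 0;

  for (i = P; i; i = i->next) {
    if (i->Type == PType)      v = *i->ptr;
    else if (i->Type == FType) v = i->func(i->Name, i->fargv, i->fargc);
    else continue;             /* CHARType or unknown: nothing to write */
    if (fwrite(&v, sizeof(float), 1, fd) != 1) return -1;
    N++;
  }
  return N;
}
```

The floats are written in the byte order of the machine running the program, so a reader on a different architecture would need to take that into account.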

B.4 Programming with the xtract library

The process of compilation and execution of the xtract macro described above is done via a stand-alone library. This section describes the Application Programming Interface (API) of this library.

The C/C++ interface of this library is defined in fmt.h, which must be included in the code. The program must be linked with libjump.a, in addition to all the other GMRT Off-line libraries (liboff.a, libregex.a, libkum.a).

B.4.1 Interpretation and compilation of the format string

The xtract macros are interpreted via the following function call:

      int interpret(char *fmtString, struct fftmac *fm,  
                    Parameters *Params, SymbType *Inst)

The first argument is the macro as a NULL terminated string. The second argument is the fftmac structure (Bhatnagar 1997a), which holds the various mappings for the LTA database (e.g., sampler to the MAC mapping, etc.). This structure must be filled using services provided by the GMRT Offline Library (getFFTMac method). The third argument is a pointer to a structure of the type Parameters. This structure holds the various parameters which the library uses while executing the macro. The fields of this structure are described in Section B.4.3. The values of some of the fields are defined by the user, while others are to be extracted from the LTA database. The fourth argument is a pointer to a structure of type SymbType. This is the table of elements mentioned earlier and must be initialized to NULL before being passed to this routine. A return value of less than EOF (-1) indicates a syntax error in the macro.

Compilation of the macro string is done via a call to:

      int Compile(SymbType *Inst,struct fftmac *fm,  
                  struct AntCoord *Tab, Parameters *Params)

The first argument is the symbol table returned by a call to interpret. It now points to the head of a linked list of nodes of type SymbType. The list is terminated by NULL. The second argument is the fftmac structure. The third argument is the table of antenna co-ordinates. This can be retrieved from the LTA database via the services provided by the GMRT Offline Library (getFFTMac method). The fourth argument is a pointer to the Parameters structure. A return value of less than EOF (-1) indicates an error in the compilation of the macro. On successful compilation, it returns the size of the compiled symbol table in units of the size of the structure SymbType.

B.4.2 Execution of the compiled macro

If the interpretation and compilation of the macro was successful, the compiled macro can be executed via calls to a user-supplied function with the signature

      int Exec(FILE *fd, SymbType *Inst, float *Buf, int ProgSize)

fd refers to the output file, already opened for writing. Inst is the symbol table returned by interpret. In case the output data is not to be written to any file, the user can write versions of this routine which will fill the data in the buffer Buf. ProgSize is the value returned by Compile.

The data field of the Parameters structure (see section B.4.3) must be made to point to the buffer in which the LTA-data buffers are read. To generate a regular stream of output, corresponding to each input data record, this function must be called every time a new LTA-data record is read.

A few types of Exec functions are provided in the library. These include:

To obtain any other functionality, programmers need to write their own version of this function. The recommended route for writing a new function is to modify the Execute or ExecuteDef function.

B.4.3 The Parameters structure

The Parameters structure is of the following type:

            typedef struct StructParamType{  
               int Norm;  
               int *BList, *SList, *AList, *CList;  
               int  NBase,  NScans, NAnt,   NChan;  
               int dBNBase, dBNChan, dBNAnt;  
               int dBStartChan;  
               int TimeOff, ParOff,DataOff;  
 
               float Lambda;  
               float sd,cd, TUnits;  
               char *data;  
            } Parameters;

The various fields and their uses are as follows:

B.5 Adding new elements to the syntax

To add a new element to the xtract macro language, one needs to define the values of Name, Class, and Type of the new symbol in the table of valid elements. This is done by adding an entry to the table in the file table.h (make sure the last element of this list is left unaltered).

One also needs to add a piece of C-code, which will fill the required fields of the structure SymbType (depending upon the Type of the element – the ptr field for PTYPE elements and the func, fargv, and fargc fields for FTYPE elements). It is the responsibility of the programmer to make sure that this code is correct in terms of getting the numeric value of the elements. Also, the programmer must make sure that this code is compatible with the Type of the element. Failing to do so will either generate wrong values or crash the program at the time of execution. This code is to be added in the function Compile in the file Compile.c. The application will need to be rebuilt for the new symbol to be recognized in the fmt syntax.