Date: Tue, 06 Jul 2004 08:55:26 -0400
From: Ed Fomalont <efomalon@nrao.edu>
To: Joe McMullin <jmcmulli@nrao.edu>
Cc: smyers@nrao.edu
Subject: Reduction of AW602

General comments comparing AIPS and AIPS++ processing of AW602

   Below are comments concerning the reduction of AW602 in AIPS and
AIPS++.  I have attached the aips script and the aips++ script for
the reductions.  I hope this is useful.

   Cheers,  Ed


1.  Reading in the data:

    AIPS and AIPS++ both read in the data in about the same time.
However, AIPS is much more flexible in tying together the different
spectral windows and polarizations, although this took a little time to
figure out.  I made two AIPS data bases, one for each of the two
basic frequency settings, each containing RR and LL polarizations,
and the reductions then went smoothly.  The 16 different spectral
ID's in AIPS++ are confusing.  It is unclear why the RR and LL
polarizations are not put in the same spectral ID.  I think the
vlafiller should contain parameters which allow decreasing the number
of spectral ID's.  This will save a lot of confusion later in the
reductions.  All that is needed is a parameter which states that
frequencies within xxxx kHz should be in the same spectral_id.  For
the AIPS run, xxxx=1000 (cparm(7) in fillm).
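
A minimal sketch of the tolerance rule suggested above, written in Python
rather than Glish.  The function name, the grouping strategy (join the
first group whose reference frequency is close enough) and the example
frequencies are all assumptions made for illustration; this is not how
fillm or vlafiller is actually implemented.

```python
def group_spectral_ids(freqs_khz, tol_khz=1000.0):
    """Assign a 1-based spectral ID to each frequency.

    A frequency joins the first existing group whose reference
    frequency lies within tol_khz; otherwise it starts a new group.
    (Hypothetical helper, not part of AIPS or AIPS++.)
    """
    refs = []  # reference frequency of each group found so far
    ids = []   # spectral ID assigned to each input frequency
    for f in freqs_khz:
        for i, ref in enumerate(refs):
            if abs(f - ref) <= tol_khz:
                ids.append(i + 1)
                break
        else:
            # No existing group is close enough: open a new one.
            refs.append(f)
            ids.append(len(refs))
    return ids


# Made-up sky frequencies in kHz, drifting slightly between scans.
freqs = [22_235_000.0, 22_235_120.0, 22_285_000.0, 22_235_300.0, 22_285_250.0]
spectral_ids = group_spectral_ids(freqs)             # tolerance of 1000 kHz
strict_ids = group_spectral_ids(freqs, tol_khz=100.0)
print(spectral_ids, strict_ids)
```

With the 1000 kHz tolerance the five made-up frequencies collapse into two
spectral IDs; with a 100 kHz tolerance each one gets its own ID, which is
the proliferation complained about above.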

    There were differences between the AIPS and AIPS++ data sets.  The major
difference was that antenna 24 was noise-like in the AIPS++ data base,
but looks okay in the AIPS data base.  There were other minor
differences as well.


2.  Setjy or running imager for the first time.

    Running setjy, or going into imager for the first time, is a
definite downer to anyone running the system.  The time spent
adding the three additional columns to the ms and initializing the
model data, corrected data, and natural weights takes more run time than
the AIPS time to calibrate and make all of the images (well, at least the
dirty images).  Although this step has been sped up a lot over the
years, it is still unacceptably long.

    I have never gotten a satisfactory answer to this: Why can't these
additional columns and their initialization be made when the data is
written into the ms by vlafiller?  How much would this increase the
vlafiller run time?  These columns are needed in almost all cases.

3.  Calibration and Editing:

    AIPS is still faster than AIPS++ for calibration and editing.  For
example, the filtering of all points greater than 2 Jy took
significantly longer in AIPS++, and the other flagging functions were
a bit slower in AIPS++.  The use of flagging tables in AIPS clearly is
very efficient compared with flags in the data base.  Determining the
gain calibration and the bandpass calibration is about a factor of
three faster in AIPS.  This difference is somewhat contrary to the
work and effort that George put in to speed up calibration; I'm not
sure why there is a difference.
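
For reference, the 2 Jy filter mentioned above amounts to the following
operation on each visibility.  This is only a sketch of the logic in plain
Python; the function name and the sample visibilities are invented, and it
says nothing about how CLIPM, flagger or autoflag actually store or apply
their flags.

```python
def clip_flags(visibilities, clip_jy=2.0):
    """Return True for every visibility whose amplitude exceeds clip_jy."""
    return [abs(v) > clip_jy for v in visibilities]


# Made-up complex visibilities in Jy.
vis = [complex(0.8, 0.1), complex(3.0, 4.0), complex(-1.2, 1.0), complex(0.0, 2.5)]
flags = clip_flags(vis)
print(flags)
```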

    The need to use spwmap made life complicated, and I finally
figured out what to do.  Defects have been supplied.  What I want to
do to apply the bandpass is the following:

     sp_id 13 --> sp_id 1,5,9,13
     sp_id 14 --> sp_id 2,6,10,14
     sp_id 15 --> sp_id 3,7,11,15
     sp_id 16 --> sp_id 4,8,12,16

After an hour of fooling around, I found that
    spwmap = [13,14,15,16,13,14,15,16,13,14,15,16,13,14,15,16]
is what I wanted.

To apply the gain solutions of calibrator 0336, I needed
    spwmap = [1,2,1,2,5,6,5,6,1,2,5,6,1,2,5,6]

Without detailed explanations in the URM, this will take a long
time to figure out.  Again, I think you are making life much
more difficult for the user by not letting vlafiller combine
sp_ids as needed for smooth calibration.
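
One way to check a spwmap before using it is to invert it and list which
data sp_ids receive each set of solutions.  The sketch below is Python, not
Glish, and it assumes the reading implied by the two maps quoted above:
element i (counting from 1) names the sp_id whose solutions are applied to
data sp_id i.  The helper name is made up for this illustration.

```python
from collections import defaultdict


def invert_spwmap(spwmap):
    """Return {solution sp_id: [data sp_ids that receive it]}.

    Assumes a 1-based spwmap in which element i is the sp_id whose
    solutions are applied to data sp_id i.
    """
    groups = defaultdict(list)
    for data_spw, cal_spw in enumerate(spwmap, start=1):
        groups[cal_spw].append(data_spw)
    return dict(groups)


bandpass_map = [13, 14, 15, 16] * 4
gain_map = [1, 2, 1, 2, 5, 6, 5, 6, 1, 2, 5, 6, 1, 2, 5, 6]

bandpass_groups = invert_spwmap(bandpass_map)
gain_groups = invert_spwmap(gain_map)

for cal_spw, data_spws in sorted(bandpass_groups.items()):
    print(f"sp_id {cal_spw} --> sp_id {data_spws}")
```

Printing the inverted map in the same "sp_id a --> sp_id b,c,d" form as the
list above makes it easy to compare the map against the intent.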

4.  Fluxscale

    I could not use calibrater.fluxscale.  I think this tool needs
spwmap.

5.  Calibrated Data

    I removed outliers before calibration, but found that the
calibrated data had large, unexpected, outliers for the calibrators
and the sources.  Most of the data should be less than 5 Jy, but there
were spikes as large as 400 Jy.  On inspection, they were associated
with antenna 14 near the beginning and end of two scans.  This
suggests that the end points of the calibration, occasionally, are
not applied correctly.  I filtered these outliers.  Otherwise, the
calibration looks okay.

6.  Making Images

    I am having trouble making dirty images of the appropriate
spectral channels of the targets.  I am probably doing something
wrong in the script, although I used Debra's cookbook in setting
up the channel descriptions.


Below is a quick and dirty comparison of the run times for similar
AIPS and AIPS++ tasks.  It is not always easy to make an exact
one-to-one correspondence.


TIME OF AIPS AND AIPS++ SCRIPTS for AW602 (all times in seconds):
                                         AIPS        AIPS++
                                       REAL CPU       REAL   comments

FILLM:  Filling of data                 68   27
FILLM:  Second day of data              90   21
UVCOP:  First frequency copy            27   21
UVCOP:  Second frequency copy           28   21              aips++ filler is
                                                             faster, but has
                                                             too many sp_id
INDXR:  First frequency copy             7    7
INDXR:  Second frequency copy            7    7        175
                                                             too much time
SETJY:  Flux for 0548+498                1    1        842   making new cols.

CLIPM:  Clip fluxes over 2 Jy           38   35        220   filter takes time

UVFLG:  Flag selected bad data           1    1         30

CALIB:  Amp & Phase cal of 3 sources    30   14         89
CLCAL:  Apply calibrations               3    2
BPASS:  Determine bandpass               9    4         25
CALIB:  Calib with bandpass             28   18         88
                                                             could not do this
GETJY:  Get flux of 0316 and 0336        1    1              in aips++

CLCAL:  Apply calibrations               2    1         14

        extra flagging of calib data                   178

IMAGR:  Image and clean 0336 (200 comp) 70   65
IMAGR:  Image and clean 0336 (200 comp) 71   65        210   Made both freqs.

IMAGR:  Dirty image of 10 sources                            Not done yet
        channels 4 to 57 by 3
        18 images x 10 sources         
        512x512, 0.5" pixels           240  177
                                      ---------
TOTAL TIME                             721  498      >2000   guessing for last
                                                             imager
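
The slowdown per step can be read off the table with a few lines of
Python.  The pairing of AIPS tasks with AIPS++ timings below is an
assumption (the table itself notes the correspondence is not always
one-to-one); only rows that list a real time for both packages are used,
and the two 0336 IMAGR runs are summed to match the single AIPS++ figure.

```python
# (step, AIPS real seconds, AIPS++ real seconds), copied from the table.
timings = [
    ("SETJY", 1, 842),
    ("CLIPM", 38, 220),
    ("UVFLG", 1, 30),
    ("CALIB", 30, 89),
    ("BPASS", 9, 25),
    ("CALIB with bandpass", 28, 88),
    ("CLCAL (second)", 2, 14),
    ("IMAGR 0336, both freqs", 70 + 71, 210),
]

# Ratio of AIPS++ real time to AIPS real time for each paired step.
ratios = {step: aipspp / aips for step, aips, aipspp in timings}

for step, ratio in ratios.items():
    print(f"{step:24s} AIPS++/AIPS = {ratio:6.1f}")
```

The gain and bandpass rows come out close to 3, in line with the "factor
of three" quoted in section 3, while the SETJY row is dominated by the
scratch-column creation discussed in section 2.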

==============================================================================

See scripts:

aips++   :  2004-07-06-AW602.g

AIPS     :  2004-07-06-AW602.001

==============================================================================

Date: Wed, 7 Jul 2004 16:09:02 -0600 (MDT)
From: George Moellenbrock <gmoellen@aoc.nrao.edu>
To: Ed Fomalont <efomalon@nrao.edu>, Joe McMullin <jmcmulli@zia.aoc.nrao.edu>,
     Steve Myers <smyers@zia.aoc.nrao.edu>,
     Kumar Golap <kgolap@zia.aoc.nrao.edu>
Subject: Re: Reduction of AW602 (fwd)

Hi Ed,

Thanks for your detailed work and comments.  Have yourself a drink.
(Sounds like you might need one.)

> 
> 1.  Reading in the data:
> 
>     AIPS and AIPS++ both read in the data in about the same time.
> However, the AIPS flexibility in tying the different spectral windows
> and polarization is much better, although this took a little time to

Yes, we expect to provide for combining spws as aips does, despite the fact
that some data will effectively be mislabelled in the MS as a result. On
principle, we'd rather not do this, but the practical advantages probably
trump our conservatism.  It is worthwhile pointing out that observations
can be made in such a way as to avoid this problem, i.e., Doppler track in
the direction of the target even when observing the calibrators (Bryan B.
tells me this is possible).  Apparently, people find this unintuitive, and
end up complicating their observations (and the archive) substantially
when they manually set the topo freqs on the calibrators to match the
latest (corresponding) topo frequency on the target.  Yancy showed me a
dataset where the calibrator spw kept changing during the observation.  
It should just be cast in the LSR w.r.t. the target!

The issue of the RR and LL appearing in different spws is due simply to
ignorance of this mode by the vlafiller.  We'll sort this out.

>     There were differences between the AIPS and AIPS++ data sets.  The major
> difference was that antenna 24 was noise-like in the AIPS++ data base,
> but looks okay in the AIPS data base.  There were other minor
> differences as well.

Hmmmm.  Submit a defect for this, if you haven't already.

> 2.  Setjy or running imager for the first time.
> 
>     Running setjy or the first time a user goes into imager is a
> definite downer to anyone running the system.  

Yes, we are working hard on this.  Forget lack of answers in the past, and
please be patient a little longer.  This is a key area of remaining work
in performance, and it is especially a problem for channelized data where
the scratch columns are bigger.  Rest assured, we are not ignoring it.


> 3.  Calibration and Editing:
> 
>     AIPS is still faster than AIPS++ for calibration and editing.  

There are still some outstanding issues that are especially important
for channelized data in calibrater.

> For
> example, the filtering of all points greater than 2 Jy took
> significantly longer in AIPS++, and the other flagging functions were
> a bit slower in AIPS++.  

As pointed out many times, the flagger tool is in glish.  The autoflag
tool is much faster---we need better examples of how to use it.  It can be
tricky to run, but things like quack, flagac, and clipping are available
here.  I haven't yet used it enough myself to be able to provide you any
useful hints, but it is pretty clever.

> The use of flagging tables in AIPS clearly is
> very efficient compared with flags in the data base. 

I am not sure this is so clear.  We have some decisions to make about
how to proceed with storing flagging info.  We just haven't had a chance
to work on this in any detail since responding to Bill's concerns last
summer.


> Determining the
> gain calibration and the bandpass calibration is about a factor of
> three faster in AIPS.  This difference is somewhat contrary to the
> work and effort that George put in to speed up calibration; I'm not
> sure why there is a difference.
> 

No, not contrary;  our (relatively new) VLA spec line benchmark suffers
similarly.  For a point source model, we are hit by unnecessary I/O of the
model data column, as well as some bookkeeping issues when applying
calibration to data.  Stay tuned for significant improvements here.  The
MS SOURCE table provides for storing a source model (e.g., a point source
flux density, etc.), and I will soon add a feature using this (it is one
of the next things on my list, after I get done with the CDR).  This
should improve performance significantly, and is also related to avoiding
the MODEL_DATA scratch column cost at the beginning!
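
To illustrate why a stored model column is unnecessary in the trivial
case: for an unpolarized point source at the phase center, the model
visibility is simply the flux density (real, zero phase) on every
baseline, so it can be generated on the fly.  The Python sketch below uses
made-up function names and numbers and is not the calibrater code.

```python
def point_source_model(flux_jy, n_vis):
    """Model visibilities for a point source at the phase center."""
    return [complex(flux_jy, 0.0)] * n_vis


def ratio_to_model(data, flux_jy):
    """Divide observed visibilities by the point-source model.

    The model is a single constant, so no per-visibility model column
    has to be written to or read from disk.
    """
    return [v / flux_jy for v in data]


# Made-up observed visibilities (Jy) and a made-up 2 Jy calibrator.
observed = [complex(2.0, 2.0), complex(4.0, 0.0)]
normalized = ratio_to_model(observed, 2.0)
print(normalized)
```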

>     The need to use spwmap made life complicated, 

Sounds like this really annoyed you.  Sorry I didn't explain this better.  
Stay tuned for improvements in the doc.  Please note: spwmap is not just a
(perhaps clumsy) means of accommodating bunches of closely-related spws
(which is better handled by interpreting the dataset liberally in the
filler).  Spwmap will be used, e.g., to transfer calibration from one band
to another, e.g., 15 GHz solutions to 22 GHz data, with the appropriate
rate multiplier automatically included.  (Don't try this yet!  The
multiplier hasn't been added yet.)  Also, transfer of calibration from one
dataset to another (e.g., D solutions) will take advantage of spwmap when
different spw numbers require it.  The way we are using it at the moment
will not be the best use of it in the long run!

> 4.  Fluxscale
> 
>     I could not use calibrater.fluxscale.  I think this tool needs
> spwmap.
> 

As I pointed out in replying to your spwmap defect, this has been
added (a week or two ago).  

> 5.  Calibrated Data
> 
>     I removed outliers before calibration, but found that the
> calibrated data had large, unexpected, outliers for the calibrators
> and the sources.  Most of the data should be less than 5 Jy, but there
> were spikes as large as 400 Jy.  On inspection, they were associated
> with antenna 14 near the beginning and end of two scans.  This
> suggests that the end points of the calibration, occasionally, are
> not applied correctly.  I filtered these outliers.  Otherwise, the
> calibration looks okay.
> 

I am working on things relevant to assigning calibration solutions to 
data--I'll keep this issue in mind.  


Finally, one other thing:  in the latest stable (#615), imager seems to
have suffered a setback in performance compared to the previous stable
(#549), at least for sp line data.  Kumar is working on this.  Stay tuned.

Cheers,
George

==============================================================================

Date: Thu, 08 Jul 2004 10:17:32 -0400
From: Ed Fomalont <efomalon@nrao.edu>
To: George Moellenbrock <gmoellen@zia.aoc.nrao.edu>
Cc: Joe McMullin <jmcmulli@zia.aoc.nrao.edu>,
     Steve Myers <smyers@zia.aoc.nrao.edu>,
     Kumar Golap <kgolap@zia.aoc.nrao.edu>
Subject: Re: Reduction of AW602 (fwd)

Hi George,

    Thanks for your comments on my comments.  I'll look more
carefully at the differences in the data after loading
with aips and aips++.

    Cheers,  Ed

==============================================================================

Date: Fri, 9 Jul 2004 11:23:19 -0600 (MDT)
From: George Moellenbrock <gmoellen@aoc.nrao.edu>
To: Joe McMullin <jmcmulli@zia.aoc.nrao.edu>,
     Kumar Golap <kgolap@zia.aoc.nrao.edu>,
     Steve Myers <smyers@zia.aoc.nrao.edu>
Subject: Re: Reduction of AW602 (fwd)


Joe, Kumar, Steve,

I've already replied to Ed on his usage comments.  Here is a reply to his
performance numbers, which are, in fact, consistent with what I
presented for the G192 benchmark. (abridged - STM)

-George

General:  AW602 is quite similar to the G192 benchmark case in terms
of calibration.  It is a mosaic, so the imaging requirements are quite
different.

> 
> TIME OF AIPS SCRIPT for AW602:
>                                     TIME in SECONDS  AIPS++ timing and comments
>                                        REAL CPU       REAL   comments
> 
> FILLM:  Filling of data                 68   27
> FILLM:  Second day of data              90   21
> UVCOP:  First frequency copy            27   21
> UVCOP:  Second frequency copy           28   21              aips++ filler is
>                                                              faster, but has
>                                                              too many sp_id

we will enable a freq tolerance mode which will provide for
recognizing the many spws as one when practically appropriate.
copying out the data for different freqs will not be necessary
in aips++.  also, the polarizations (RR,LL) will be combined
(this is a mode not described in the vla archive format doc, and
so isn't yet supported in vlafiller).


> INDXR:  First frequency copy             7    7
> INDXR:  Second frequency copy            7    7        175

Is "175" the filler time for aips++?  the spw grouping shouldn't 

>                                                              too much time
> SETJY:  Flux for 0548+498                1    1        842   making new cols.
> 

this is the scratch cols.  known problem described in perf talk at CDR.


> CLIPM:  Clip fluxes over 2 Jy           38   35        220   filter takes time
> 

this is because flagger tool is in glish.  the benchmark reported
at the CDR is using the autoflag tool, which matches aips performance
in that case. 


> UVFLG:  Flag selected bad data           1    1         30
> 

again, due to flagger in glish.  not sure what was done here, but
autoflag may already support it.  otherwise, it will be addressed
when flagging issues get attention later this (?) year.


> CALIB:  Amp & Phase cal of 3 sources    30   14         89
> CLCAL:  Apply calibrations               3    2

"Apply calibrations" here means CLCAL, not actual application to data.

> BPASS:  Determine bandpass               9    4         25
> CALIB:  Calib with bandpass             28   18         88
>                                                              could not do this
> GETJY:  Get flux of 0316 and 0336        1    1              in aips++
> 

this is, in fact, available in a more recent version.  this is not
covered in the benchmark, but it is a trivial step, as in aips.

> CLCAL:  Apply calibrations               2    1         14
> 

these calibration steps are essentially consistent with the benchmark
reported in the CDR.  as described in the CDR perf talk, the solve steps
will benefit greatly from handling the trivial model case, and the apply
has some identified optimizations that haven't actually been implemented
yet.  also, recent investigations indicate that the cal apply improves
substantially if calibration is applied to different fields separately.
since this is a mosaic (more fields) this is a bigger problem here.  the
script can be written to do better, and (better), the code (MS
Iterator) can be tuned to resolve this as well.  finally, the fact that
the polarizations are stored separately is probably causing a small
performance hit.  

>         extra flagging of calib data                   178
> 

again, autoflag would be better.  there are some things to check in
calibrater that may make this step unnecessary.

> IMAGR:  Image and clean 0336 (200 comp) 70   65
> IMAGR:  Image and clean 0336 (200 comp) 71   65        210   Made for both freqs.
> 

these are continuum images.  in aips++, the algorithm is mfs.  this
example is actually better than the case reported in the CDR perf talk.

> IMAGR:  Dirty image of 10 sources                            Not done yet
>         channels 4 to 57 by 3
>         18 images x 10 sources         
>         512x512, 0.5" pixels           240  177
>                                       ---------
> TOTAL TIME                             721  498      >2000   guessing for last
>                                                              imager

kumar will be helping ed get the mosaicing working on the aips++ side. it
will be interesting to compare linear mosaicing results with the joint
deconvolution in terms of image quality (and necessity or not) and balance
this against the performance question.

==============================================================================