From akemball@aoc.nrao.edu Wed Aug 15 17:49:45 2001
Date: Wed, 15 Aug 2001 17:27:48 -0600 (MDT)
From: Athol Kemball <akemball@aoc.nrao.edu>
To: fowen@zia.aoc.NRAO.EDU, efomalon@cv3.cv.nrao.edu
Cc: aips2-naug@zia.aoc.NRAO.EDU
Subject: Re: AIPS - AIPS++ Imaging test


> From efomalon@cv3.cv.nrao.edu Fri Aug 10 08:38 MDT 2001
> Date: Fri, 10 Aug 2001 10:38:44 -0400
> 
> Hello all,
> 
>     Here is a note that I just submitted to the aips++ group.
> On the whole it demonstrates that the aips++ cleaning algorithms work
> well.  But they are about a factor of three slower than those in
> aips for this limited testing range.

Ed,

We've looked at this over the past few days and don't see such
large differences using the same dataset. There are several
possible factors at work which we should examine before
proceeding:

1) Memory configuration: AIPS++ has a memory configuration
   control (search aipsrcdata on the AIPS++ web page) which
   is set in the .aipsrc file. It is 64 MB by default but
   should be set to the true available memory (786 MB on this
   system). In .aipsrc, set:

     system.resources.memory:   786

   This controls the gridding cache size in imager (see also
   imager.setoptions() in the user ref man). A conservative
   gridding cache is chosen by default, so the available
   system memory should be specified correctly; this can
   make a considerable difference in gridding performance.

2) Scratch file placement: Like AIPS, AIPS++ uses temporary
   scratch files; these shouldn't be on NFS-mounted
   disks. In .aipsrc, set:

     user.directories.work:  .

   It's worth checking this is not set to a remote disk.
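
As a rough illustration of checking both settings, here is a minimal
Python sketch. It is not part of AIPS++: the function names
parse_aipsrc and check_settings are hypothetical, and the parser
assumes plain "keyword: value" lines with "#" comment lines, keeping
the first occurrence of each keyword. It does not handle wildcard
keywords or detect NFS mounts.

```python
def parse_aipsrc(text):
    """Parse 'keyword: value' lines from .aipsrc-style text.

    Assumption: '#' starts a comment line and the first occurrence
    of a keyword wins. Wildcard keywords are not handled.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a keyword: value line
        settings.setdefault(key.strip(), value.strip())
    return settings


def check_settings(settings, physical_mb):
    """Return a list of warnings about the two settings discussed above."""
    warnings = []
    mem = settings.get("system.resources.memory")
    if mem is None:
        warnings.append("system.resources.memory not set; 64 MB default applies")
    else:
        try:
            if int(mem) < physical_mb:
                warnings.append(
                    f"system.resources.memory is {mem} MB, "
                    f"but {physical_mb} MB is available"
                )
        except ValueError:
            warnings.append(f"system.resources.memory is not an integer: {mem!r}")
    if "user.directories.work" not in settings:
        warnings.append("user.directories.work not set; check scratch file location")
    return warnings
```

To use it, read the contents of your own .aipsrc into a string, pass
it to parse_aipsrc(), and call check_settings() with the machine's
memory in MB (786 here). Whether the work directory is on a remote
disk still has to be checked by hand.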

In a direct Clark CLEAN on these data using the same
imaging parameters we see nearly identical system and
user CPU times (AIPS++ is a factor of 1.5 slower in
wall-clock time; there could be several reasons for the
latter). The AIPS++ wfclark algorithm does a more
aggressive model reconciliation than the clark algorithm.
This can be finely controlled by imager.setmfoptions(). In
the case here, with nfacets=1, this control can be
varied to make wfclark behave more like the clark
algorithm, giving a closer comparison with AIPS.

We have used benchmark tests to profile AIPS++ against
AIPS and MIRIAD several times in the past but we have
decided to formalize this as a benchmark suite over
a range of imaging parameters, dataset sizes and
deconvolution algorithms, and to monitor these results
on our web page. We are also using profiling tools this
week to check whether any code has drifted off in 
performance (which can happen as code is modified for 
other reasons over time). This will take a little
time but we will summarize those results separately.
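
To make the distinction between wall-clock, user and system times
concrete, here is a minimal Python sketch of the kind of timing
wrapper a benchmark harness might use. It is an illustration only,
not the AIPS++ benchmark code; time_call and slowdown are
hypothetical names, and the function being timed is whatever
callable you supply.

```python
import os
import time


def time_call(func, *args, **kwargs):
    """Run func once and return (result, timings).

    timings holds wall-clock, user CPU and system CPU seconds,
    including CPU time of any child processes that were waited on.
    """
    cpu0 = os.times()
    wall0 = time.perf_counter()
    result = func(*args, **kwargs)
    wall1 = time.perf_counter()
    cpu1 = os.times()
    timings = {
        "wall": wall1 - wall0,
        "user": (cpu1.user - cpu0.user)
        + (cpu1.children_user - cpu0.children_user),
        "system": (cpu1.system - cpu0.system)
        + (cpu1.children_system - cpu0.children_system),
    }
    return result, timings


def slowdown(timings_a, timings_b, key="wall"):
    """Ratio of package A to package B for one timing field."""
    return timings_a[key] / timings_b[key]
```

Comparing the "wall" ratio with the "user" and "system" ratios is
one way to separate time spent computing from time spent waiting on
I/O or memory.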

Thanks again for running the tests.

Athol
     

> 
> 
>           A COMPARISON OF AIPS++ and AIPS IMAGING SOFTWARE:
> 
>                        August 10, 2001
> 
>                          Ed Fomalont
> 
>     This memo gives the results of a simple comparison between the
> imaging and cleaning methods which are available in aips and aips++.
> A simple data set is used and the comparison concentrates on two
> aspects: the execution time (a little on disk space) and the type of
> cleaning method for which aips++ has several options.
> 
>     The tests were made on a relatively empty linux machine with 786M
> of memory.  The execution times are given in seconds.  They are about
> 1.3 times longer than the cpu times unless otherwise noted.
> 
> SUMMARY OF COMPARISON:
> 
> 1.  At the level of a dynamic range of about 50, all aips++ and aips
> deconvolution methods produce clean images which are in good agreement.
> The field emission is relatively simple, so that these experiments are
> not a test of the ultimate accuracy of the various algorithms.
> 
> 2.  The typical execution times for aips++ deconvolutions range from
> 3 to 7 times slower than the comparable aips deconvolutions.  It is
> unclear how this will scale with larger data sets.
> 
> 3.  The deconvolution options in aips++ are numerous, somewhat
> confusing and overlapping, and need a bit more explanation.
> 
> ---------------------------------------------------------------------------
> 
> THE DATA SET
> 
>    The data set contains 2 hours of data (over a span of 8 hours) at 5
> GHz from the VLA with two IF's and 10-sec data sampling.  It contains
> 180,000 data points and is a typical relatively small wide-band
> continuum data set from the VLA.  It is 1/32nd of the complete data
> set spread over eight days at four mosaic pointings.  The data were
> calibrated and edited in aips, and then transferred to aips++ using
> fits data sets.  All weights were set to 1.0 before transferral.
> Some problem with weight transfer is being looked into.
> 
> 
> READING DATA FROM DISK:
> 
>     System        real time           size of all files
> 
>      aips++          55                   19.9 mb 
>      aips            32                   10.8 mb  (compressed)
>                                           20.5 mb (uncompressed)
> 
>     The timing difference is not very significant.  The file sizes
> reflect the general tendency that a u-v data base in aips++ is about
> twice the size of a comparable compressed data base in aips, but about
> the same size as an aips uncompressed u-v data set.
> 
> 
> MAKING DIRTY IMAGES AND BEAMS:
>    
>     The image size was 2048x2048 with 1" cell size.  This covers the
> primary beam for the VLA 5 GHz, C-configuration data.  Natural
> weighting was used to make the comparison as similar as possible.  The
> execution times for aips++ and aips are:
> 
>     System        real time(s)     Tasks
> 
>      aips++          42            making 3 cols (only once)
>                      66              make beam
>                      66              make map
>                     132            Total time (excluding 3 col formation)
> 
>      aips            22            Total time for map and beam
> 
>     The execution in aips++ is a factor of six longer than in aips,
> even without the overhead of making the three columns on initial use
> of imager.
> 
> 
> CLEANING:
> 
>     The radio emission is relatively simple and confined to the inner
> quarter of the radio field.  The dynamic range in the image is 50, so
> that all deconvolution processes should work reasonably well.  The
> field will be cleaned with 2000 iterations with no boxes.
> 
>     Two deconvolution procedures are available in aips: Hogbom clean
> using the dirty image and beam (APCLN), which does not go back to the
> u-v data, and Clark clean in IMAGR, which combines image
> subtraction and u-v subtraction during the cleaning process.
> 
>     The aips++ counterparts, I think, are as follows:
> 
>   Task                Parameter         Execution time (sec) 
> 
>   aips task           APCLN              72 
>   aips++ algorithm    hogbom            356
>   aips++ algorithm    deconvolver       >1000 (something wrong with the iterator)
> 
>   aips task           IMAGR             205 
>   aips++ algorithm    wfclark           750
>   aips++ algorithm    clark+restore     380
> 
> 
> COMMENTS on AIPS++ Deconvolution methods:
> 
> 1.  The Hogbom cleans take much too long.  Part of the problem is that
> the algorithm for choosing when to stop finding peaks on the image and
> subtracting the beam patterns needs some improvement.
> 
> 2.  What's the difference between the clark and wfclark algorithms if
> only one field is cleaned?  I am guessing but I think the `clark'
> algorithm does the u-v subtraction by making a simple fft of the clean
> components and in this way may be little different from the hogbom
> clean.  I believe that `wfclark' algorithm does the subtraction of the
> clean components more accurately.  This is why the whole image is
> cleaned with wfclark and only the inner quarter area with clark,
> before the restore step.  IMAGR does clean all of the image except at
> the very boundaries.  In any case, more discussion of these
> deconvolution algorithms is needed, along with recommendations for their use.
> 
> 3.  The aips++ deconvolvers work satisfactorily.  If they can be
> sped up by a factor of about three in execution speed, they would have
> about the same efficiency as the comparable processors in aips - at
> least for these moderately small data sets.
>