Comparison of AIPS++ and AIPS Imaging/Deconvolution Speeds

Ed Fomalont
January 10, 2002

SUMMARY

I have compared the timing of several imaging and deconvolution tasks in AIPS++ and AIPS using a calibrated and edited data set with 627,000 points from 6 hours of continuum observing with the VLA. On average, the cpu times for running imaging/deconvolution tasks with AIPS++ are about a factor of 5 longer than those for AIPS. Even for the relatively simple task of making a dirty beam and a dirty image, AIPS clearly outperforms AIPS++: 12 s versus 70 s (see details below), using a dual Xeon 1.7 GHz cpu.

The AIPS and AIPS++ images agree extremely well. The field contains over 100 sources, mostly unresolved, with a peak flux density of 40 mJy and an rms noise of 40 uJy, giving a dynamic range of 1000:1. The quality of the AIPS++ image is marginally better than that of the AIPS image.

The comparison times are summarized as follows:

COMPARATIVE EXECUTION SPEEDS

TASK                          CPU TIME (sec)    Comments
                              AIPS++    AIPS
Reading in U-V data              470       4    AIPS++: 200 Mb data volume on disk
Writing U-V data                  90       3    AIPS: 40 Mb data volume on disk
                                                      73 Mb uncompressed
Making dirty map and beam         70      12    2kx2k image
APCLN (cleaning images)           47      19    2kx2k, 4000 iterations
IMAGR (Wide field clean)         466      70    2kx2k, 4 major cycles, 2000 components
IMAGR (9 facet clean)           2767     835    2kx2k, 4 major cycles, 2000 components
IMAGR (25 facet clean)         10000    1445    2kx2k, 4 major cycles, 4000 components
Selfcalibration of data set      346     168

1. The Data Set:

The input data set consists of 6 hours of VLA data at 1.4 GHz, taken in the CnB configuration, in a standard UV-FITS file of size 75 Mb. The data base contains 627,000 u-v points with 2 IFs (1.465 and 1.385 GHz), each with four Stokes parameters. The data set was calibrated, edited and passed through one iteration of phase self-calibration in AIPS in order to obtain a data set that was ready for deep imaging and cleaning.
The data after these calibrations are stored in:

    efomalon@thuban.aoc.nrao.edu:/DATA/THUBAN_1/FITS/efomalon/AXAFL_BEST
    efomalon@bonobo.cv.nrao.edu:/DATA/BONOBO_1/aips++/AXAFL_BEST

The AIPS++ scripts and AIPS run files used in these comparisons are available in this same directory.

2. The Comparisons:

The timing results presented here are the cpu times using bonobo in Charlottesville (192.33.115.174) with a 1.7 GHz processor and 1 Gb of memory. The system.resources.memory was set to 800 Mb. Other computers with different resources have been used for similar tests over the last month, and the ratio of AIPS++/AIPS execution times was nearly identical across a range of about a factor of 10 in computing power. None of the machines had less than 256 Mb of memory. The computers were otherwise virtually idle during these executions, although the cpu time does not vary much with moderate machine loads.

One curious time interval, 35 s, seems to be present in many AIPS++ reduction steps for this data base. My uninformed guess is that this period may be the time needed to read through the data set in order to calculate each of the many residual images as the deconvolution and the faceting progress. In making each facet during the wide-field imaging, for instance, AIPS++ took 35 s whereas AIPS took only about 4 s.

3. Detailed comparison of execution speeds:

The following tables compare the execution times (cpu times) on the computer bonobo in Charlottesville (192.33.115.174) with a 1.7 GHz processor and 1 Gb of memory. The visibility data, scripts and run files are in the directory indicated above.

A. Reading/Writing from/to a UVFITS file

SCRIPT        SYSTEM    EXECUTION TIME    DATA BASE SIZE
fitstoms.g    AIPS++        470 s         200 Mb
F2MS.*        AIPS            4 s         40 Mb (73 Mb uncomp)
mstofits.g    AIPS++         90 s
MS2F.*        AIPS            3 s

COMMENTS: All UVFITS data bases were in the working directory of AIPS or AIPS++.
The execution time for AIPS++ is slower than I remember from the past, and the difference in reading and writing times for the FITS format is surprisingly large.

B. Generate a dirty image and beam. 2048x2048 image.

SCRIPT            SYSTEM    EXECUTION TIME and FUNCTION
dirtymapbeam.g    AIPS++        7 s     Set data
                              (36 s)    Initialize three columns
                              (35 s)    Image weighting
                              (38 s)    UVrange
                              (18 s)    filter
                               35 s     Make beam
                               38 s     Make image
                              -------
                               70 s     TOTAL (no weighting changes)
DIMAGE.*          AIPS         12 s     Complex image (map and beam)
                                1 s     Adding several weighting options
                              -------
                               13 s     TOTAL

COMMENTS: AIPS makes a map and beam (one complex image) in 12 s. If you specify uvranges, filters or tapers, the execution time increases by 1 s. AIPS++ takes about 35 s to make a map, 35 s to make a beam and 7 s to specify the data. If you further weight or taper the data, up to 100 s can be added to the execution time.

This task would be a good place to begin investigating the timing differences. Making dirty images from the residual data is a basic part of nearly all of the deconvolution methods. Hence, improvements in the efficiency of making the dirty images would carry over to most of the imaging and deconvolution tasks.

C. APCLN (Clark clean of dirty images)

APCLN.g    AIPS++    47 s    4000 iterations, 8 major cycles
APCLN.*    AIPS      19 s    4000 iterations, 8 major cycles

Comments: Using the AIPS++ script, I made an APCLN-like execution which cleans an image from the dirty map and beam without going back to the u-v data. This type of cleaning is relatively accurate in the inner quarter of the field and is much quicker for large data bases. In this task the AIPS++/AIPS comparison is better than usual, with a ratio of the deconvolution times of 2.5. This type of deconvolution does not use the u-v data and hence may not be subject to the possible inefficiencies associated with u-v data access in AIPS++.

D. Wide Field Clark Clean: 2048x2048 for 2000 clean iterations.
Clean most of field

SCRIPT      SYSTEM    EXECUTION TIME
imagr.g     AIPS++     350 s    Cleaning 2000 iter., 2 major cycles
                       486 s    Cleaning 2000 iter., 5 major cycles
DCLARK.*    AIPS        35 s    Cleaning 2000 iter., 2 major cycles
                        70 s    Cleaning 2000 iter., 5 major cycles

COMMENTS: This is the basic high-fidelity imaging task, IMAGR, which is the one most commonly used in AIPS. Most of the image area can be cleaned using this method since the clean components are subtracted directly from the data. One has to be careful to make the AIPS++ and AIPS tests equivalent, since each program has somewhat different controls. I cleaned down to the same residual level and tweaked parameters so that the same number of major cycles was used. The images from AIPS++ and AIPS were in good agreement, with the AIPS++ image marginally cleaner.

E. Wide Field Clark Clean with 3x3 facets cleaned to same depth, ~2000 iterations

SCRIPT          SYSTEM    EXECUTION TIME
clarkclean.g    AIPS++        7 s    Set data
                           2760 s    Cleaning facets, 3 major cycles
                           2767 s    TOTAL
FACETS9.*       AIPS        460 s    Cleaning facets, 3 major cycles
                             55 s    Glue together
                            835 s    TOTAL

Comments: The quality of the two images is nearly the same. The execution time depends on the number of major cycles, which is difficult to control in both systems. Most of the execution time is spent making the residual image for each facet after each major cycle. For AIPS++ each facet takes about 35 s to make; for AIPS each facet takes about 4 s. This is where the difference in timing comes from.

F. Wide Field Clark Clean with 5x5 facets cleaned to same depth

SCRIPT          SYSTEM    EXECUTION TIME
clarkclean.g    AIPS++        7 s    Set data
                           4860 s    Cleaning facets, 2 major cycles
                          19348 s    Cleaning facets, 10 major cycles
                          10000 s    TOTAL (estimated for 4 major cycles)
FACETS9.*       AIPS       1390 s    Cleaning facets, 4 major cycles
                             55 s    Glue together
                           1445 s    TOTAL

Comments: The same comments apply here for the 25 facet clean as for the 9 facet clean.
By the way, the recommended number of facets for full field cleaning is 81, or 9x9. The AIPS++/AIPS execution ratio is worse for the 5x5 facet cleaning than for the 3x3 facet cleaning. This may be because calculating a facet is relatively less efficient in AIPS++ than the other parts of the algorithm.

G. Selfcalibration of Field

AIPS++     36 s    FFT to get model
          272 s    for solutions
           38 s    to correct
          346 s    TOTAL
AIPS      167 s    for solutions
            1 s    to correct
          168 s    TOTAL

Comments: The speeds of the selfcalibration steps are not too different. Most of the time is spent obtaining the solutions, for which AIPS++ is about 60% slower than AIPS (272 s versus 167 s). The difference in the time to correct the data is not surprising: AIPS makes a new CL table, whereas AIPS++ writes the calibrated data into the measurement set.
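
As a cross-check on the "factor of 5" quoted in the summary, the AIPS++/AIPS ratios can be recomputed from the COMPARATIVE EXECUTION SPEEDS table. The short Python sketch below is only an illustration and is not one of the scripts used for the timings. The function names are made up for this note, and it assumes that the two FITS read/write rows count as I/O rather than imaging, while selfcalibration is kept in the average.

```python
from statistics import mean

# CPU times in seconds, copied from the COMPARATIVE EXECUTION SPEEDS table.
# Each entry is (task, AIPS++ time, AIPS time).
TIMINGS = [
    ("Reading in U-V data", 470, 4),
    ("Writing U-V data", 90, 3),
    ("Making dirty map and beam", 70, 12),
    ("APCLN (cleaning images)", 47, 19),
    ("IMAGR (Wide field clean)", 466, 70),
    ("IMAGR (9 facet clean)", 2767, 835),
    ("IMAGR (25 facet clean)", 10000, 1445),
    ("Selfcalibration of data set", 346, 168),
]

# Assumption: the FITS read/write rows are I/O, not imaging/deconvolution,
# so they are left out of the average.
IO_TASKS = {"Reading in U-V data", "Writing U-V data"}


def speed_ratios(timings):
    """Return {task: AIPS++ cpu time / AIPS cpu time}."""
    return {task: aipspp / aips for task, aipspp, aips in timings}


def mean_imaging_ratio(timings, excluded=IO_TASKS):
    """Arithmetic mean of the ratios for the tasks not listed in `excluded`."""
    ratios = speed_ratios(timings)
    return mean(r for task, r in ratios.items() if task not in excluded)


if __name__ == "__main__":
    for task, ratio in speed_ratios(TIMINGS).items():
        print(f"{task:<30s} {ratio:6.1f}")
    print(f"{'Mean (imaging tasks only)':<30s} {mean_imaging_ratio(TIMINGS):6.1f}")
```

With these assumptions the individual imaging ratios range from about 2 (selfcalibration) to about 7 (25 facet clean), and their arithmetic mean is about 4.5. The FITS reading and writing ratios (about 118 and 30) are far larger and would dominate the average if they were included.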