AUTOCLEAN_V1.PY Version 1.0 Original 2009-07-31 Amy Kimball first "release" Version 1.0 Updated 2009-08-03 STM updated README The autoclean function iteratively cleans an image in CASA using an auto-boxing algorithm similar to that outlined in AIPS Memo #115. It first creates a dirty image, from which it chooses initial clean boxes. It then cleans using CASA's CLEAN task, chooses more boxes if necessary, and continues in this fashion until it reaches a stopping point, determined by the user's input parameters. Autoclean is currently written as a python function. To use autoclean in CASA, put the autoclean_v1.0.py into your python path. Then: CASA : import autoclean_v1 CASA : autoclean_v1.autoclean( ) This program is under development. We welcome your comments: akimball@nrao.edu HOW NEW CLEAN REGIONS ARE CHOSEN (and some important input parameters) 1. Select potential clean regions by finding "islands": contiguous sets of pixels above some threshold value (similar to the ISLAND procedure developed for AIPS). The threshold is equal to 'island_rms' * rms, where rms is the current rms in the residual image (outside of the current clean regions), and 'island_rms' is an input parameter with a default value of 4. 2. Narrow down potential clean regions further by making selections based on the peak pixel in each island. Islands are rejected if their peak pixel is less than the maximum of ('peak_rms' * rms; 'gain_threshold' * max_residual), where rms is the rms in the residual image outside of the current clean regions, max_residual is the current maximum residual 'peak_rms' is an input parameter with a default value of 6, and 'gain_threshold' is an input parameter with a default value of 0.1. Islands are also rejected if they are one pixel wide in either direction, unless the peak is more than 2.5 times the larger of the two cutoffs given earlier in this paragraph. 3. The brightest 'Npeak' of the remaining islands (based on their peak pixel) are retained, where 'Npeak' is an input parameter with default value of 3. 4. A new clean region is added for each island's peak which is not already in the clean mask. 5. Shape: Clean regions can be circles, boxes, or either, depending on the input parameter 'shape' (0 for circles, 1 for boxes, 2 for both; default value is 2). If input parameter 'shape'=2, the shape of each region is determined by the island's sizes in the x and y direction. If x and y sizes are within one pixel, the new region is a circle, otherwise it is a rectangle. 6. Size: For circular clean regions, the diameter of the circle is equal to the larger of the x and y sizes of the island and is centered on the center of the island. For rectangular clean regions, the box is placed over the island, using corners returned by the "island" procedure (see point 1 above). The size of a circular or rectangular clean region is extended outward by a number of pixels equal to 'boxstretch', an input parameter with a default value of 1. 'boxstretch' must be between -1 and 5, inclusive. WHEN DOES AUTOCLEAN STOP CLEANING? Autoclean will continue cleaning and searching for new clean boxes until one of the following conditions is met: 1. The current maximum residual is less than input parameter 'clean_threshold', with a default value of 0. Parameter 'clean_threshold' gets passed to the clean task as its own input parameter 'threshold'. 2. The current maximum residual is less than 'Nrms' * rms, where rms is the rms outside of the current clean regions, and 'Nrms' is an input parameter with a default value of 6. For small images (256 X 256), a smaller value of 'Nrms'~5 is more appropriate. A value of ~6 is appropriate for 2048 X 2048. 3. The number of clean/autobox iterations reaches 'maxcycle', an input parameter with a default value of 20. SUMMARY OF AUTOCLEAN'S INPUT PARAMETERS outputfile: redirect standard output to this file, if given. (default='') maxcycle: maximum number of clean/autobox iterations before stopping (default=20) clean_threshold: autoclean stops if the current maximum residual is less than this value. Doubles as clean's input parameter 'threshold'. (default=0) Nrms: autoclean stops if the current maximum residual is less than 'Nrms' * rms, where rms is determined outside of the current clean regions. (default=6) island_rms: threshold for identifying "islands", i.e. potential clean regions, made of contiguous pixels above threshold value of 'island_rms'*rms. This parameter does not affect where clean regions are placed, but it determines their size. (default=4) peak_rms: potential clean regions are rejected if their peak is less than 'peak_rms'*rms. (default=6) gain_threshold: potential clean regions are rejected if their peak is less than 'gain_threshold' * current maximum residual. Helps to keep from putting clean regions around sidelobes. (default=0.1) Npeak: maximum number of new clean regions added in each iteration. (default=3) shape: determines shape of clean regions. 0 = all circles. 1 = all rectangles. 2 = circle or rectangle, depending on intrinsic shape of the "island". (default=2) boxstretch: number of pixels to increase size of clean region beyond the size of the given "island". (restricted to -1 --> 5, default=1) AUTOCLEAN ALSO NEEDS CLEAN'S INPUT PARAMETERS Currently, Autoclean is a python function, not a CASA task. Thus any CLEAN task parameters (other than the default values) must be passed to autoclean as input parameters. Autoclean_v1.0 supports the following CLEAN input parameters: vis, imagename, field, spw, selectdata, niter, gain, psfmode, imagermode, cyclefactor, cyclespeedup, imsize, cell, phasecenter, restfreq, weighting, robust, npixels. Other CLEAN task parameters which will remain at the default values (for now): mode='mfs', multiscale=[], interactive=False, mask=[], stokes='I', uvtaper=False, modelimage='', restoringbeam=[''], pbcor=False, minpb=0.1, async=False The threshold parameter is set with autoclean input parameter clean_threshold (see above). EXAMPLE USING A VLA SNAPSHOT IMAGE AT 3.6 CM The autoclean code can be tested easily using some example data from the CASA Training page (casa.nrao.edu/casatraining.html). Data link: http://casa.nrao.edu/Data/VLA/B1608+656/AM484_B950818.xp1 Demo link: http://casa.nrao.edu/Doc/Scripts/b1608_demo.py (interactive) or http://casa.nrao.edu/Doc/Scripts/b1608_demo_regression.py (non-interactive) After running the demo, you can run autoclean on some of the output data (after one stage of self-calibration). Or you can grab the data already calibrated by this script as a tarball: http://www.aoc.nrao.edu/~smyers/naug/Data/AutoClean/b1608_data.tgz For autoclean, copy/paste the following lines in CASA: import autoclean_v1 # if you didn't do this already vis = 'b1608.demo.1608+656.split.ms' imagename = 'b1608.demo.autoclean' outputfile = 'b1608.demo.autoclean.out' autoclean_v1.autoclean( vis=vis, imagename=imagename, island_rms=4, peak_rms=6, Npeak=3, gain_threshold=0.3, clean_threshold=0, Nrms=5, niter=50, maxcycle=10, shape=2, psfmode='hogbom', imagermode='csclean', cell=[0.05,0.05], imsize=[256,256], weighting='briggs', robust=0.5, boxstretch=3, outputfile=outputfile ) You can then view the chosen clean regions by looking at the 'b1608.demo.autoclean.mask' in CASA's viewer(). Note that this is snapshot data with some weak components and some residual calibration errors, so the autoboxing is quite sensitive to your choice of niter (per cycle), Nrms, and other parameters. You might try varying these for effect. EXAMPLE USING A MULTIFREQUENCY VLA IMAGE OF IC2233 Data link: /home/rishi2/sanjay/CASARegressions/TmpTests/IC2233/ic2233_regression_data/ic2233.lband.ms Also available as a tarball from: http://www.aoc.nrao.edu/~smyers/naug/Data/AutoClean/ic2233_data.tgz This is a 2048x2048 86-channel image and takes about 45min (on my NRAO machine) to run with the following input parameters: vis = 'ic2233.lband.ms' imagename = 'imIC2233.autoclean' imsize = 2048; spw = '0' cell = "4arcsec"; maxcycle = 20 Nrms = 6 Npeak = 3 island_rms = 4 peak_rms = 6 gain_threshold = 0.4 boxstretch = 3 weighting = 'natural' imagermode = 'csclean' robust = 0.5 psfmode = 'hogbom' autoclean_v1.autoclean( vis=vis, imagename=imagename, imsize=imsize, cell=cell, spw=spw, maxcycle=maxcycle, Nrms=Nrms, Npeak=Npeak, island_rms=island_rms, peak_rms=peak_rms, gain_threshold=gain_threshold, boxstretch=boxstretch, weighting=weighting, imagermode=imagermode, robust=robust, psfmode=psfmode ) Note that if you take the above parameters plus the final mask generated by the autoclean ('imIC2233.autoclean.mask') and do a single clean (e.g. with niter=100000 and threshold='0.00014Jy') it will take about 9min (vs. 45min), so the autoboxing is about 5 times slower than just cleaning with the final boxes in this example.