AUTOLEAN_V2.PY The autoclean function iteratively cleans an image in CASA using an auto-boxing algorithm similar to that outlined in AIPS Memo #115. For spectral line data cubes, autoclean cleans/boxes one channel at a time, searching efficiently for new clean regions in each channel. Autoclean starts by making a dirty image from which it chooses initial clean boxes. It then cleans using CASA's CLEAN task, chooses more boxes if necessary, and continues in this fashion until it reaches a stopping point, determined by the user's input parameters. For spectral line data, this involves creating a set of temporary images which are then concatenated into a single final image. Autoclean writes a "saved" file with input parameters for CLEAN, such that the image can be CLEANed from the start using the final autoclean mask. Note: it is possible for the user to input an initial mask (file, region, or boxes) which will be applied to all channels. Autoclean is currently written as a python function. To use autoclean in CASA, put the autoclean_v2.py into your python path. Then: CASA : import autoclean_v2 CASA : autoclean_v2.autoclean( ) This program is under development. We welcome your comments: akimball@nrao.edu HOW NEW CLEAN REGIONS ARE CHOSEN (in each channel) and some important input parameters 1. Select potential clean regions by finding "islands": contiguous sets of pixels above some threshold value (similar to the ISLAND procedure developed for AIPS). The threshold is equal to 'island_rms' * rms, where rms is the current rms in the residual image (outside of the current clean regions), and 'island_rms' is an input parameter with a default value of 4. 2. Narrow down potential clean regions further by making selections based on the peak pixel in each island. Islands are rejected if their peak pixel is less than the maximum of ('peak_rms' * rms; 'gain_threshold' * max_residual), where rms is the rms in the residual image outside of the current clean regions, max_residual is the current maximum residual 'peak_rms' is an input parameter with a default value of 6, and 'gain_threshold' is an input parameter with a default value of 0.1. Islands are also rejected if they are one pixel wide in either direction, unless the peak is more than 2.5 times the larger of the two cutoffs given earlier in this paragraph. 3. The brightest 'Npeak' of the remaining islands (based on their peak pixel) are retained, where 'Npeak' is an input parameter with default value of 3. 4. A new clean region is added for each island's peak which is not already in the clean mask. 5. Shape: Clean regions can be circles, boxes, or either, depending on the input parameter 'shape' (0 for circles, 1 for boxes, 2 for both; default value is 2). If input parameter 'shape'=2, the shape of each region is determined by the island's sizes in the x and y direction. If x and y sizes are within one pixel, the new region is a circle, otherwise it is a rectangle. 6. Size: For circular clean regions, the diameter of the circle is equal to the larger of the x and y sizes of the island and is centered on the center of the island. For rectangular clean regions, the box is placed over the island, using corners returned by the "island" procedure (see point 1 above). The size of a circular or rectangular clean region is extended outward by a number of pixels equal to 'boxstretch', an input parameter with a default value of 1. 'boxstretch' must be between -1 and 5, inclusive. WHEN DOES AUTOCLEAN STOP CLEANING A CHANNEL? Autoclean will continue cleaning and searching for new clean boxes until one of the following conditions is met (before moving on to the next channel): 1. The current maximum residual is less than input parameter 'clean_threshold', with a default value of 0. Parameter 'clean_threshold' gets passed to the clean task as its own input parameter 'threshold'. 2. The current maximum residual is less than 'Nrms' * rms, where rms is the rms outside of the current clean regions, and 'Nrms' is an input parameter with a default value of 6. For small images (256 X 256), a smaller value of 'Nrms'~5 is more appropriate. A value of ~6 is appropriate for 2048 X 2048. 3. The number of clean/autoboxing iterations reaches 'maxcycle', an input parameter with a default value of 20. SUMMARY OF AUTOCLEAN'S INPUT PARAMETERS outputfile: redirect standard output to this file, if given. (default='') maxcycle: maximum number of clean/autobox iterations before stopping (default=20) clean_threshold: autoclean stops if the current maximum residual is less than this value. Doubles as clean's input parameter 'threshold'. Can be strings, e.g. '1.2mJy' or a single float in Jy (default=0) Nrms: autoclean stops if the current maximum residual is less than 'Nrms' * rms, where rms is determined outside of the current clean regions. (default=6) island_rms: threshold for identifying "islands", i.e. potential clean regions, made of contiguous pixels above threshold value of 'island_rms'*rms. This parameter does not affect where clean regions are placed, but it determines their size. (default=4) peak_rms: potential clean regions are rejected if their peak is less than 'peak_rms'*rms. (default=6) gain_threshold: potential clean regions are rejected if their peak is less than 'gain_threshold' * current maximum residual. Helps to keep from putting clean regions around sidelobes. (default=0.1) Npeak: maximum number of new clean regions added in each iteration. (default=3) shape: determines shape of clean regions. 0 = all circles. 1 = all rectangles. 2 = circle or rectangle, depending on intrinsic shape of the "island". (default=2) boxstretch: number of pixels to increase size of clean region beyond the size of the given "island". (restricted to -1 --> 5, default=1) AUTOCLEAN ALSO NEEDS CLEAN'S INPUT PARAMETERS Currently, Autoclean is a python function, not a CASA task. Thus any CLEAN task parameters (other than the default values) must be passed to autoclean as input parameters. Autoclean_v2 supports the following CLEAN input parameters: vis, imagename, field, spw, selectdata, mode, nchan, start, width, interpolation, niter, gain, psfmode, imagermode, cyclefactor, cyclespeedup, imsize, cell, phasecenter, restfreq, weighting, robust, npixels. Other CLEAN task parameters which will remain at the default values (for now): multiscale=[], interactive=False, mask=[], stokes='I',uvtaper=False, modelimage='', restoringbeam=[''], pbcor=False, minpb=0.1, async=False CLEAN's threshold parameter is set with autoclean input parameter clean_threshold (see above). -------------------------------------------------------------- EXAMPLE USING A VLA SNAPSHOT IMAGE AT 3.6 CM The autoclean code can be tested easily using some example data from the CASA Training page (casa.nrao.edu/casatraining.html). Data link: http://casa.nrao.edu/Data/VLA/B1608+656/AM484_B950818.xp1 Demo link: http://casa.nrao.edu/Doc/Scripts/b1608_demo.py (interactive) or http://casa.nrao.edu/Doc/Scripts/b1608_demo_regression.py (non-interactive) After running the demo, you can run autoclean on some of the output data (after one stage of self-calibration). Or you can grab the data already calibrated by this script as a tarball: http://www.aoc.nrao.edu/~smyers/naug/Data/AutoClean/b1608_data.tgz Copy/paste the following lines in CASA: vis = 'b1608.demo.1608+656.split.ms' imagename = 'b1608.demo.autoclean' outputfile = 'b1608.demo.autoclean.out' autoclean_v2.autoclean( vis=vis, imagename=imagename, island_rms=4, peak_rms=6, Npeak=3, gain_threshold=0.3, clean_threshold=0, Nrms=5, niter=50, maxcycle=10, shape=2, psfmode='hogbom', imagermode='csclean', cell=[0.05,0.05], imsize=[256,256], weighting='briggs', robust=0.5, boxstretch=3 ) You can then view the chosen clean regions by looking at the 'b1608.demo.autoclean.mask' in CASA's viewer(). To reclean from the beginning using autoclean's mask: default(clean) imagename = 'b1608.demo.autoclean' execfile(imagename+'.final.saved') clean() -------------------------------------------------------------- EXAMPLE USING A MULTIFREQUENCY VLA IMAGE OF IC2233 Data link: /home/rishi2/sanjay/CASARegressions/TmpTests/IC2233/ic2233_regression_data/ic2233.lband.ms Also available as a tarball from: http://www.aoc.nrao.edu/~smyers/naug/Data/AutoClean/ic2233_data.tgz This is a 2048x2048 86-channel image and takes about 45min (on my NRAO machine) to run with the following input parameters: vis = 'ic2233.lband.ms' imagename = 'imIC2233.autoclean' imsize = 2048; spw = '0' cell = "4arcsec"; maxcycle = 20 Nrms = 6 Npeak = 3 island_rms = 4 peak_rms = 6 gain_threshold = 0.4 boxstretch = 3 weighting = 'natural' imagermode = 'csclean' robust = 0.5 psfmode = 'hogbom' autoclean_v2.autoclean( vis=vis, imagename=imagename, imsize=imsize, cell=cell, spw=spw, maxcycle=maxcycle, Nrms=Nrms, Npeak=Npeak, island_rms=island_rms, peak_rms=peak_rms, gain_threshold=gain_threshold, boxstretch=boxstretch, weighting=weighting, imagermode=imagermode, robust=robust, psfmode=psfmode ) Note that if you take the above parameters plus the final mask generated by the autoclean ('imIC2233.autoclean.mask') and do a single clean (e.g. with niter=100000 and threshold='0.00014Jy') it will take about 9min (vs. 45min), so the autoboxing is about 5 times slower than just cleaning with the final boxes in this example. -------------------------------------------------------------- EXAMPLES USING A VLA IMAGE OF NGC5921 Demo link: http://casa.nrao.edu/Doc/Scripts/ngc5921_demo.py After running the demo, you can run autoclean on the resulting MS 'ngc5921.demo.src.split.ms.contsub'. This calibrated/contsub MS is also available as a tarball from: http://www.aoc.nrao.edu/~smyers/naug/Data/AutoClean/ngc5921_data.tgz For example, the following parameters will autoclean in channel mode: vis = 'ngc5921.demo.src.split.ms.contsub' imagename = 'ngc5921.autoclean.channel' outputfile = imagename+'.out' autoclean_v2.autoclean( vis=vis, imagename=imagename, island_rms=4, peak_rms=6, Npeak=1, gain_threshold=0.3, clean_threshold=0, Nrms=5, niter=50, maxcycle=10, shape=2, boxstretch=3, psfmode='hogbom', imagermode='csclean', cell=[15,15], imsize=[256,256], weighting='briggs', robust=0.5, mode='channel', start=5, nchan=46) The "saved" file will be imagename+'.final.saved'. To clean with the final mask: default(clean) imagename = 'ngc5921.autoclean.channel' execfile(imagename+'.final.saved') clean() # (*See note below at bottom of file) -------------------------------------------------------------- Suggested parameters to run in 'velocity' mode: vis = 'ngc5921.demo.src.split.ms.contsub' imagename = 'ngc5921.autoclean.channel' outputfile = imagename+'.out' autoclean_v2.autoclean( vis=vis, imagename=imagename, island_rms=4, peak_rms=6, Npeak=1, gain_threshold=0.3, clean_threshold=0, Nrms=5, niter=50, maxcycle=10, shape=2, boxstretch=3, psfmode='hogbom', imagermode='csclean', cell=[15,15], imsize=[256,256], weighting='briggs', robust=0.5, mode='velocity', start='1506km/s', width='5km/s', nchan=20) -------------------------------------------------------------- Suggested parameters to run in 'frequency' mode: vis = 'ngc5921.demo.src.split.ms.contsub' imagename = 'ngc5921.autoclean.channel' outputfile = imagename+'.out' autoclean_v2.autoclean( vis=vis, imagename=imagename, island_rms=4, peak_rms=6, Npeak=1, gain_threshold=0.3, clean_threshold=0, Nrms=5, niter=50, maxcycle=10, shape=2, boxstretch=3, psfmode='hogbom', imagermode='csclean', cell=[15,15], imsize=[256,256], weighting='briggs', robust=0.5, mode='frequency', start='1.4128GHz', width='30kHz', nchan=35) -------------------------------------------------------------- * Note: When running CLEAN on a data cube using autoclean's output mask file and with the ".final.saved" parameters, some channels can diverge resulting in poor results. Reducing the cleaning depth, such as setting "cyclefactor = 2" (default value is 1.5) can help alleviate this problem. --------------------------------------------------------------