SD Prototype Tasks
------------------
Updated 2007-03-06 STM

This file is available online at:

   http://www.aoc.nrao.edu/~smyers/naug/tasks/readme.txt

After sitting down with the ASAP manual and the various tool-based
regression scripts, I decided to make a go at writing my own tasks for
Single Dish calibration, plotting, statistics and line-fitting using
ASAP. This ended up taking a day to get the first trial task running
(mostly figuring out Python scripting tricks) and a number of days
refining the tasks. This is a good proof-of-concept for scientists
contributing to CASA task-writing using Python, and can serve as
prototypes for more detailed tasks if desired.

Note that these tasks are not fool-proof, and still have some buggy
modes (particularly the sdfit task when data is in multiple IFs). For
those cases, you can dig into the toolkit using these tasks or the
regression scripts as a springboard.

You can find the latest versions of my tasks in the NAUG archive at:

   http://www.aoc.nrao.edu/~smyers/naug/tasks/

In addition, the most-recent versions checked into the CASA system
reside in AIPSPATH/code/xmlcasa/scripts/, e.g. for the stable version:

   /home/casa/code/xmlcasa/scripts/

These are made available when the user does

   asap_init

-------------------------------------------------------------------------
Overview of the SDtasks
-------------------------------------------------------------------------

File          Task       Purpose
____          ____       _______

sdcal.py      sdcal    - select/calibrate/average/smooth/baseline-fit SD data
sdfit.py      sdfit    - line fitting
sdlist.py     sdlist   - print a summary of a dataset
sdplot.py     sdplot   - plotting of spectra
sdstat.py     sdstat   - region stats

NOTE: All the sdtasks work from a file on disk rather than from a
scantable in memory. Inside the tasks we invoke a call to sd.scantable
to read in the data.

The task sdcal is the workhorse for the calibration, selection,
averaging, baseline fitting, smoothing, and writing of datasets.
It is the only SDtask that can write out a dataset. Its operation is
controlled by three main "mode" parameters:

   calmode - selects the type of calibration, if any, to be applied
   kernel  - selects the smoothing
   blmode  - selects baseline fitting

There are also parameters controlling the selection and averaging, such
as scanlist, iflist, field, scanaverage, timeaverage, polaverage. Note
that sdcal can be run with calmode='none' to allow re-selection or
writing out of data that is already calibrated.

There is a "wiring diagram" of the dataflow and control inputs for
sdcal. You can find this at:

   http://www.aoc.nrao.edu/~smyers/naug/tasks/wiring_diagram_sdcal.txt

This might help you chart your course through the calibration.

Input formats currently supported:
   ASAP (scantables), MS (CASA measurement set), RPFITS and SDFITS
   (flavors of FITS).

Output formats currently supported:
   ASAP (scantables), MS (CASA measurement set), ASCII (text file),
   SDFITS (a flavor of SD FITS).

You can get a brief summary of the data in a file using the sdlist task.

Plotting of spectra is handled in the sdplot task. It also offers some
selection, averaging and smoothing options in case you are working from
a dataset that has not been split or averaged. Note that there is some
rudimentary plotting capability in the sdcal and sdfit tasks, controlled
through the plotlevel parameter, to aid in assessing the performance of
these tasks.

Basic statistics on spectral regions are available in the sdstat task.
Results are passed back in a Python dictionary return variable, xstat.

Basic Gaussian line-fitting is handled by the sdfit task. It can deal
with the simpler cases, and offers some automation, but more complicated
fitting is best accomplished through the toolkit (sd.fitter).
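
As a rough illustration of how the three sdcal "mode" parameters
described above act as independent switches, here is a minimal sketch in
plain Python. It does not call ASAP or the SDtasks. The helper name
active_stages is hypothetical, and so is the assumption that the value
'none' turns a stage off for kernel and blmode; this readme only confirms
that behavior for calmode. The scan and IF numbers are made up.

```python
# Hypothetical helper, NOT part of the SDtasks: it only inspects a
# dictionary of sdcal-style inputs and reports which stages are active.

def active_stages(params):
    """Return the sdcal stages switched on by the three mode parameters.

    Assumption: the value 'none' disables a stage. The readme confirms
    this only for calmode; it is assumed here for kernel and blmode.
    """
    stages = []
    if params.get('calmode', 'none') != 'none':
        stages.append('calibrate')
    if params.get('kernel', 'none') != 'none':
        stages.append('smooth')
    if params.get('blmode', 'none') != 'none':
        stages.append('baseline')
    return stages


# Re-selection only: all three modes off, so sdcal would just select
# the requested scans and IFs and write them out.
reselect = {'calmode': 'none', 'kernel': 'none', 'blmode': 'none',
            'scanlist': [20, 21], 'iflist': [0]}
print(active_stages(reselect))
```

With all three modes set to 'none' the list of stages is empty, which
corresponds to the re-selection use of sdcal mentioned above.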

-------------------------------------------------------------------------
A Guided Tour of the SDtasks
-------------------------------------------------------------------------

A "usecase", or detailed walk-through of one of the test cases, is
provided as

   run_sdusecase_orions.py

This serves as a tutorial for the use of the SDtasks and is heavily
annotated. It can also be run as a test script.

-------------------------------------------------------------------------

Some regression scripts are also available:

   run_sdregress_orions.py
   run_sdregress_irc.py
   run_sdregress_fls3hi.py
   run_sdregress_mopra.py

These are based upon the datasets being used for ALMA Test 5. You may
have to change the preamble of each script to point to the data on your
system.

-------------------------------------------------------------------------
Loading your own versions of the SDtasks
-------------------------------------------------------------------------

NOTE: These are currently loaded into daily CASA, so you don't need to
do anything (STM 2007/02/28).

I provide these if you are running a version without them, or for
hacking.
In that case, put the versions you want into sdtasks.py (cut out the
old ones and paste yours in), then instead of doing asap_init in CASA,
do

   from sdtasks import *

You can pick up the sdtasks.py file, which contains all the individual
tasks:

sdtasks.py  - contains the prototype tasks that are meant to be loaded
              from your current directory using
                 from sdtasks import *
              and uses "sdinp" to get inputs, not "inp".
              All the above SD tasks are contained in sdtasks.py.

If you want to load the sd tasks individually, pick up

sdinp.py      sdinp, sdsave - inp and saveinputs for the sd tasks.
              This is included in sdtasks.py.

There is also a local version of sdplot that uses a line catalog
jpl.tbl in the local directory instead of jpl_asap.tbl in
$AIPSPATH/data/catalogs/lines/

sdplot_local.py   --> looks for jpl.tbl in local directory
jpl.tgz           --> tarball of jpl.tbl line catalog

-------------------------------------------------------------------------
Known Issues, Problems, Deficiencies and Features
-------------------------------------------------------------------------

There are a number of issues with ASAP and the SDtasks that are known
and are under repair. Some of these are non-obvious "features" of the
way ASAP or sd is implemented. These currently include:

sd.plotter

   Currently you can get hardcopy only after making a viewed plot.
   Ideally, ASAP should allow you to choose the device for plotting
   when you set up the plotter.

   Multi-panel plotting is poor. Currently you can only add things
   (like lines, text, etc.) to the first panel. Also,
   sd.plotter.set_range() sets the same range for multiple panels,
   while we would like it to be able to set the range for each
   independently, including the default ranges.

   The appearance of the plots needs to be made a lot better. In
   principle matplotlib can make "publication quality" figures, but in
   practice you have to do a lot of work to make it do that, and our
   plots are not good.

   The sd.plotter object remembers things throughout the session and
   thus can easily get confused. For example, you have to reset the
   range with sd.plotter.set_range() if you have ever set it manually.
   This is not always the expected behavior but is a consequence of
   having sd.plotter be its own object that you feed data and commands
   to.

   Eventually we would like the capability to interactively set things
   using the plots, like select frequency ranges, identify lines, start
   fitting.

sd.selector

   The selector object only allows one selection of each type. It
   would be nice to be able to make a union of selections (without
   resorting to query) for set_name. Note that the others, like scans
   and IFs, work off lists, which is fine. set_name should be made to
   work off lists of names as well.

sd.scantable

   There is no useful inline help on the scantable constructor when
   you do "help sd.scantable", nor in "help sd".

   The inline help for scantable.summary claims that there is a verbose
   parameter, but there is not. The scantable.verbosesummary asaprc
   parameter (e.g. in sd.rcParams) does nothing.

   GBT data has incorrect fluxunit ('Jy', should be 'K'), freqframe
   ('LSRK', is really 'TOPO') and reference frequency (set to that of
   the first IF only).

   You cannot usually set the rest frequencies for GBT data, mostly
   because scantable.set_restfreqs iterates over IFs, and if you have
   missing IFs (for example, you selected 0 and 2) it will crash
   dealing with the missing ones. You can do it at the beginning, when
   you have consecutive IFs, but only for consecutive ones starting
   from 0. It should work on the IFs that are actually in the
   scantable, and in that order.
   THIS IS THE MOST SERIOUS BUG RIGHT NOW.

   Need to add to scantable.stats: maxord, minord - the ordinate
   (channel, vel, freq) of the max/min.

general

   We should have CASA use the ASAP scantables rather than requiring
   conversion to MS.

   There should be a "sdhelp" equivalent of toolhelp and tasklist for
   the sd tools and tasks.

   The current output of ASAP is verbose, and is controlled by setting
   sd.rcParams['verbose']=False (or True). At the least we should make
   some of the output less cryptic.

   Leading and trailing whitespace should be stripped from string
   parameters.

sdtasks general

   The sdtasks work off of files saved onto disk in one of the
   scantable supported formats. It might be useful to be able to work
   off of scantables in memory (passing the objects) but this would
   require changes to the tasking system. Note that this behavior is
   consistent throughout the casapy tasks.

sdcal

   Can crash if timeaverage=True and/or polaverage=True and you give a
   list of scans that contain a combination of IFs. We need to make
   the tools smarter about this, but in the meantime you should
   restrict your scanlist and iflist to scans with the same set of IFs.

sdfit

   Handles multiple IFs poorly (a general problem currently in the
   package).

   No way to input guesses.

-------------------------------------------------------------------------
Final Words
-------------------------------------------------------------------------

Comments on these tasks are welcome! I hope you find them useful.

Steve Myers
CASA Project Scientist