Date: Tue, 06 Jul 2004 08:55:26 -0400
From: Ed Fomalont
To: Joe McMullin
Cc: smyers@nrao.edu
Subject: Reduction of AW602

General comments between AIPS and AIPS++ processing of AW602

Below are comments concerning the reduction of AW602 in AIPS and
AIPS++. I have attached the aips script and the aips++ script for the
reductions. I hope this is useful.

Cheers, Ed

1. Reading in the data:

AIPS and AIPS++ both read in the data in about the same time.
However, the AIPS flexibility in tying the different spectral windows
and polarization is much better, although this took a little time to
figure out. I made two AIPS data bases, one for each of the two basic
frequency settings and each containing RR and LL polarizations, and
the reductions then went smoothly.

The 16 different spectral ID's in AIPS++ are confusing. It is unclear
why the RR and LL polarizations are not put in the same spectral ID.
I think the vlafiller should contain parameters which allow decreasing
the number of spectral ID's. This will save a lot of confusion later
in the reductions. All that is needed is a parameter which states
that frequencies within xxxx kHz should be in the same spectral_id.
For the AIPS run, xxxx=1000 (cparm(7) in fillm).

There were differences in the AIPS and AIPS++ data sets. The major
difference was that antenna 24 was noise-like in the AIPS++ data base,
but looks okay in the AIPS data base. There were other minor
differences as well.

2. Setjy or running imager for the first time.

Running setjy or the first time a user goes into imager is a
definite downer to anyone running the system. The time spent in
adding the three additional columns to the ms, initializing the model
and corrected data and the natural weights takes more run time than
the AIPS time to calibrate and make all of the images (well, at least
the dirty images). Although this step has been sped up a lot over the
years, it is still unacceptably long.

I have never gotten a satisfactory answer to this: Why can't these
additional columns and their initialization be made when the data is
written into the ms by vlafiller? How much would this increase the
vlafiller run time? These columns are needed in almost all cases.

3. Calibration and Editing:

AIPS is still faster than AIPS++ for calibration and editing. For
example, the filtering of all points greater than 2 Jy took
significantly longer in AIPS++, and the other flagging functions were
a bit slower in AIPS++. The use of flagging tables in AIPS clearly is
very efficient compared with flags in the data base. Determining the
gain calibration and the bandpass calibration is about a factor of
three faster in AIPS. This difference is somewhat contrary to the
work and effort that George put in to speed up calibration; I'm not
sure why there is a difference.

The need to use spwmap made life complicated, but I finally figured
out what to do. Defects have been submitted. What I want to do to
apply the bandpass is the following:

    sp_id 13 --> sp_id 1,5,9,13
    sp_id 14 --> sp_id 2,6,10,14
    sp_id 15 --> sp_id 3,7,11,15
    sp_id 16 --> sp_id 4,8,12,16

After an hour of fooling around, I found that

    spwmap = [13,14,15,16,13,14,15,16,13,14,15,16,13,14,15,16]

is what I wanted. To apply the gain solutions of calibrator 0336, I
needed

    spwmap = [1,2,1,2,5,6,5,6,1,2,5,6,1,2,5,6]

Without detailed explanations in the URM, this will take a long time
to figure out. Again, I think you are making life much more difficult
for the user by not letting vlafiller combine sp_ids as needed for
smooth calibration.

4. Fluxscale

I could not use calibrater.fluxscale. I think this tool needs
spwmap.

5. Calibrated Data

I removed outliers before calibration, but found that the
calibrated data had large, unexpected outliers for the calibrators
and the sources. Most of the data should be less than 5 Jy, but there
were spikes as large as 400 Jy.
On inspection, they were associated with antenna 14 near the beginning
and end of two scans. This suggests that the end points of the
calibration, occasionally, are not applied correctly. I filtered
these outliers. Otherwise, the calibration looks okay.

6. Making Images

I am having trouble making dirty images of the appropriate spectral
channels of the targets. I am probably doing something wrong in the
script, although I used Debra's cookbook in setting up the channel
descriptions.

Below is a quick and dirty comparison of the run times for similar
AIPS and AIPS++ tasks. It is not always easy making an exact
one-to-one correspondence.

TIME OF AIPS SCRIPT for AW602:
                                        TIME in SECONDS   AIPS++ timing and comments
                                          REAL    CPU       REAL   comments

FILLM: Filling of data                      68     27
FILLM: Second day of data                   90     21
UVCOP: First frequency copy                 27     21
UVCOP: Second frequency copy                28     21              aips++ filler is faster,
                                                                   but has too many sp_id
INDXR: First frequency copy                  7      7
INDXR: Second frequency copy                 7      7        175
                                                                   too much time
SETJY: Flux for 0548+498                     1      1        842   making new cols.
CLIPM: Clip fluxes over 2 Jy                38     35        220   filter takes time
UVFLG: Flag selected bad data                1      1         30
CALIB: Amp & Phase cal of 3 sources         30     14         89
CLCAL: Apply calibrations                    3      2
BPASS: Determine bandpass                    9      4         25
CALIB: Calib with bandpass                  28     18         88
                                                                   could not do this
GETJY: Get flux of 0316 and 0336             1      1              in aips++
CLCAL: Apply calibrations                    2      1         14
extra flagging of calib data                                 178
IMAGR: Image and clean 0336 (200 comp)      70     65
IMAGR: Image and clean 0336 (200 comp)      71     65        210   Made both freqs.
IMAGR: Dirty image of 10 sources                                   Not done yet
         channels 4 to 57 by 3
         18 images x 10 sources
         512x512, 0.5" pixels              240    177
                                            ---------
TOTAL TIME                                 721    498      >2000   guessing for last imager

==============================================================================
See scripts:
 aips++ : 2004-07-06-AW602.g
 AIPS   : 2004-07-06-AW602.001
==============================================================================

Date: Wed, 7 Jul 2004 16:09:02 -0600 (MDT)
From: George Moellenbrock
To: Ed Fomalont, Joe McMullin, Steve Myers, Kumar Golap
Subject: Re: Reduction of AW602 (fwd)

Hi Ed,

Thanks for your detailed work and comments. Have yourself a drink.
(Sounds like you might need one.)

>
> 1. Reading in the data:
>
> AIPS and AIPS++ both read in the data in about the same time.
> However, the AIPS flexibility in tying the different spectral windows
> and polarization is much better, although this took a little time to

Yes, we expect to provide for combining spws as does aips, despite the
fact that some data will effectively be mislabelled in the MS as a
result. On principle, we'd rather not do this, but the practical
advantages probably trump our conservatism.

It is worthwhile pointing out that observations can be made in such a
way as to avoid this problem, i.e., doppler track in the direction of
the target even when observing the calibrators (Bryan B. tells me this
is possible). Apparently, people find this unintuitive, and end up
complicating their observations (and the archive) substantially when
they manually set the topo freqs on the calibrators to match the
latest (corresponding) topo frequency on the target. Yancy showed me
a dataset where the calibrator spw kept changing during the
observation. It should just be cast in the LSR w.r.t. the target!

The issue of the RR and LL appearing in different spws is due simply
to ignorance of this mode by the vlafiller. We'll sort this out.

> There were differences in the AIPS and AIPS++ data sets.
> The major difference was that antenna 24 was noise-like in the
> AIPS++ data base, but looks okay in the AIPS data base. There were
> other minor differences as well.

Hmmmm. Submit a defect for this, if you haven't already.

> 2. Setjy or running imager for the first time.
>
> Running setjy or the first time a user goes into imager is a
> definite downer to anyone running the system.

Yes, we are working hard on this. Forget lack of answers in the past,
and please be patient a little longer. This is a key area of remaining
work in performance, and it is especially a problem for channelized
data where the scratch columns are bigger. Rest assured, we are not
ignoring it.

> 3. Calibration and Editing:
>
> AIPS is still faster than AIPS++ for calibration and editing.

There are still some outstanding issues that are especially important
for channelized data in calibrater.

> For
> example, the filtering of all points greater than 2 Jy took
> significantly longer in AIPS++, and the other flagging functions were
> a bit slower in AIPS++.

As pointed out many times, the flagger tool is in glish. The autoflag
tool is much faster; we need better examples of how to use it. It can
be tricky to run, but things like quack, flagac, and clipping are
available here. I haven't yet used it enough myself to be able to
provide you any useful hints, but it is pretty clever.

> The use of flagging tables in AIPS clearly is
> very efficient compared with flags in the data base.

I am not sure this is so clear. We have some decisions to make about
how to proceed with storing flagging info. We just haven't had a
chance to work on this in any detail since responding to Bill's
concerns last summer.

> Determining the
> gain calibration and the bandpass calibration is about a factor of
> three faster in AIPS. This difference is somewhat contrary to the
> work and effort that George put in to speed up calibration; I'm not
> sure why there is a difference.
>

No, not contrary; our (relatively new) VLA spec line benchmark suffers
similarly. For a point source model, we are hit by unnecessary I/O of
the model data column, as well as some bookkeeping issues when
applying calibration to data. Stay tuned for significant improvements
here.

The MS SOURCE table provides for storing a source model (e.g., a point
source flux density, etc.), and I will soon add a feature using this
(it is one of the next things on my list, after I get done with the
CDR). This should improve performance significantly, and is also
related to avoiding the MODEL_DATA scratch column cost at the
beginning!

> The need to use spwmap made life complicated,

Sounds like this really annoyed you. Sorry I didn't explain this
better. Stay tuned for improvements in the doc.

Please note: spwmap is not just a (perhaps clumsy) means of
accommodating bunches of closely-related spws (which is better handled
by interpreting the dataset liberally in the filler). Spwmap will be
used, e.g., to transfer calibration from one band to another, e.g.,
15 GHz solutions to 22 GHz data, with the appropriate rate multiplier
automatically included. (Don't try this yet! The multiplier hasn't
been added yet.) Also, transfer of calibration from one dataset to
another (e.g., D solutions) will take advantage of spwmap when
different spw numbers require it. The way we are using it at the
moment will not be the best use of it in the long run!

> 4. Fluxscale
>
> I could not use calibrater.fluxscale. I think this tool needs
> spwmap.
>

As I pointed out in replying to your spwmap defect, this has been
added (a week or two ago).

> 5. Calibrated Data
>
> I removed outliers before calibration, but found that the
> calibrated data had large, unexpected outliers for the calibrators
> and the sources. Most of the data should be less than 5 Jy, but there
> were spikes as large as 400 Jy. On inspection, they were associated
> with antenna 14 near the beginning and end of two scans.
> This suggests that the end points of the calibration, occasionally,
> are not applied correctly. I filtered these outliers. Otherwise, the
> calibration looks okay.
>

I am working on things relevant to assigning calibration solutions to
data; I'll keep this issue in mind.

Finally, one other thing: in the latest stable (#615), imager seems to
have suffered a setback in performance compared to the previous stable
(#549), at least for sp line data. Kumar is working on this. Stay
tuned.

Cheers,
George

==============================================================================

Date: Thu, 08 Jul 2004 10:17:32 -0400
From: Ed Fomalont
To: George Moellenbrock
Cc: Joe McMullin, Steve Myers, Kumar Golap
Subject: Re: Reduction of AW602 (fwd)

Hi George,

Thanks for your comments on my comments. I'll look more carefully at
the differences in the data after loading with aips and aips++.

Cheers, Ed

==============================================================================

Date: Fri, 9 Jul 2004 11:23:19 -0600 (MDT)
From: George Moellenbrock
To: Joe McMullin, Kumar Golap, Steve Myers
Subject: Re: Reduction of AW602 (fwd)

Joe, Kumar, Steve,

I've already replied to Ed on his usage comments. Here is a reply to
his performance numbers, which are, in fact, consistent with what I
presented for the G192 benchmark.

(abridged - STM)

-George

General:

AW602 is quite similar to the G192 benchmark case in terms of
calibration. It is a mosaic, so the imaging requirements are quite
different.

>
> TIME OF AIPS SCRIPT for AW602:
>                                         TIME in SECONDS   AIPS++ timing and comments
>                                           REAL    CPU       REAL   comments
>
> FILLM: Filling of data                      68     27
> FILLM: Second day of data                   90     21
> UVCOP: First frequency copy                 27     21
> UVCOP: Second frequency copy                28     21              aips++ filler is faster,
>                                                                    but has too many sp_id

we will enable a freq tolerance mode which will provide for
recognizing the many spws as one when practically appropriate.
copying out the data for different freqs will not be necessary in
aips++. also, the polarizations (RR,LL) will be combined (this is a
mode not described in the vla archive format doc, and so isn't yet
supported in vlafiller).

> INDXR: First frequency copy                  7      7
> INDXR: Second frequency copy                 7      7        175

Is "175" the filler time for aips++? the spw grouping shouldn't

>                                                                    too much time
> SETJY: Flux for 0548+498                     1      1        842   making new cols.
>

this is the scratch cols. known problem described in perf talk at CDR.

> CLIPM: Clip fluxes over 2 Jy                38     35        220   filter takes time
>

this is because flagger tool is in glish. the benchmark reported at
the CDR is using the autoflag tool, which matches aips performance in
that case.

> UVFLG: Flag selected bad data                1      1         30
>

again, due to flagger in glish. not sure what was done here, but
autoflag may already support it. otherwise, it will be addressed when
flagging issues get attention later this (?) year.

> CALIB: Amp & Phase cal of 3 sources         30     14         89
> CLCAL: Apply calibrations                    3      2

"Apply calibrations" here means CLCAL, not actual application to data.

> BPASS: Determine bandpass                    9      4         25
> CALIB: Calib with bandpass                  28     18         88
>                                                                    could not do this
> GETJY: Get flux of 0316 and 0336             1      1              in aips++
>

this is, in fact, available in a more recent version. this is not
covered in the benchmark, but it is a trivial step, as in aips.

> CLCAL: Apply calibrations                    2      1         14
>

these calibration steps are essentially consistent with the benchmark
reported in the CDR. as described in the CDR perf talk, the solve
steps will benefit greatly from handling the trivial model case, and
the apply has some identified optimizations that haven't actually been
implemented yet. also, recent investigations indicate that the cal
apply improves substantially if calibration is applied to different
fields separately. since this is a mosaic (more fields) this is a
bigger problem here.
the script can be written to do better, and (better), the code (MS
Iterator) can be tuned to resolve this as well. finally, the fact that
the polarizations are stored separately is probably causing a small
performance hit.

> extra flagging of calib data                                 178
>

again, autoflag would be better. there are some things to check in
calibrater that may make this step unnecessary.

> IMAGR: Image and clean 0336 (200 comp)      70     65
> IMAGR: Image and clean 0336 (200 comp)      71     65        210   Made for both freqs.
>

these are continuum images. in aips++, the algorithm is mfs. this
example is actually better than the case reported in the CDR perf
talk.

> IMAGR: Dirty image of 10 sources                                   Not done yet
>          channels 4 to 57 by 3
>          18 images x 10 sources
>          512x512, 0.5" pixels              240    177
>                                             ---------
> TOTAL TIME                                 721    498      >2000   guessing for last imager

kumar will be helping ed get the mosaicing working on the aips++ side.
it will be interesting to compare linear mosaicing results with the
joint deconvolution in terms of image quality (and necessity or not)
and balance this against the performance question.

==============================================================================
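
The two spwmap lists in Ed's section 3 are easiest to read as a lookup
table: entry i (counting from 1) names the spw whose solutions are
applied to data spw i. That reading is inferred from the lists quoted
in this thread, not taken from the URM. Below is a minimal Python
sketch under that assumption; build_spwmap is a hypothetical helper
written for illustration and is not an AIPS++ or calibrater function.

```python
def build_spwmap(n_spw, groups):
    """Build a spwmap list of length n_spw from solution-to-data groups.

    groups maps a solution spw id to the data spw ids that should
    receive its solutions (all ids 1-based, as in the thread).
    Assumption: entry i-1 of the result is the solution spw used for
    data spw i.
    """
    spwmap = [None] * n_spw
    for solution_spw, data_spws in groups.items():
        for data_spw in data_spws:
            if not 1 <= data_spw <= n_spw:
                raise ValueError(f"data spw {data_spw} outside 1..{n_spw}")
            if spwmap[data_spw - 1] is not None:
                raise ValueError(f"data spw {data_spw} assigned twice")
            spwmap[data_spw - 1] = solution_spw
    # Every data spw needs a source of solutions.
    missing = [i + 1 for i, s in enumerate(spwmap) if s is None]
    if missing:
        raise ValueError(f"no solution spw for data spws {missing}")
    return spwmap


# Bandpass: solutions from spws 13..16 applied to every fourth spw.
# Spw 16 has to feed 4, 8, 12, 16 for the quoted list to come out.
bandpass_map = build_spwmap(16, {
    13: [1, 5, 9, 13],
    14: [2, 6, 10, 14],
    15: [3, 7, 11, 15],
    16: [4, 8, 12, 16],
})

# Gain solutions of calibrator 0336: groups read back from the quoted list.
gain_map = build_spwmap(16, {
    1: [1, 3, 9, 13],
    2: [2, 4, 10, 14],
    5: [5, 7, 11, 15],
    6: [6, 8, 12, 16],
})

print(bandpass_map)
print(gain_map)
```

Writing the groups down first and generating the list from them makes
a double or missing assignment fail loudly instead of silently
producing a wrong map.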