NRAO AIPS++ Users Group Meeting

Date: 2003-8-27 (Wednesday)
Time: 1300 MDT
Video Hub: CV-conf
Rooms: SOC317/CV311/GB241/TUCN505

AIPS++ Threat Level is: Yellow = Elevated
[We have some breathing space, but the pressure is on.]

Minutes:

1. NAUG News (Steve)

   o NAUG meetings - given that GB has spun-off, I propose to
     de-schedule the GB and TUC rooms and use SOC as the hub.
     *If there are objections let me know!

   o Requirements task areas (corresponding to the sections in the
     ALMA and EVLA offline req docs) are:
     - 1.0 General [Myers]
     - 2.0 Interface & Documentation [Shepherd, Van Moorsel]
     - 3.0 Data Handling [Myers, Butler]
     - 4.0 Calibration & Editing [Hibbard, Butler]
     - 5.0 Imaging [Fomalont, Butler]
     - 6.0 Analysis [Hibbard, Shepherd]
     - 7.0 Visualization [Brisken]
     - 8.0 Simulation & Special Modes [various]
     These were deemed acceptable, though it was noted that 4.0
     Cal/Edit and 5.0 Imaging will see the most action, so additional
     testers will be assigned to these for September. We will try to
     set up a system where the NAUG members are notified of stuff in
     each area to be tested (sort of like defect assignment within
     AIPS++, but more informal).
     *Action item - Joe and I will set up the testing assignment
      procedure

   o The highest priority is for pre-testing of ALMA deliverables for
     Release 1.0 (Oct03).
     *Action item - Joe and I will prepare the testing targets for
      September

2. AIPS++/ISD Status Report (Joe, Steve)

   o Project Office - see latest targets and info at:
     http://projectoffice.aips2.nrao.edu/
     *New items:
      - there is now a column on the release for "NAUG Testing"
        showing status = -/Ready/Scheduled/Passed.
      - the stable build page has been updated and shows the resolved
        defects
      - there are links to ALMA Pipeline documents

   o Personnel - A split ALMA/AIPS++ programmer position has been
     filled. An AIPS++ architect position (Athol replacement) will
     probably be re-advertized.

   o Benchmarking - the target is to be within a factor of 2 of
     equivalent packages (AIPS & GILDAS) by R1.1. Sanjay and the team
     have been making good progress. Stay tuned.
     *FYI - Interested NAUG members should check out Sanjay's
      documents on the performance, which are linked under the first
      two items on the Current Release Status table on the Project
      Office page:
      Imager: http://aips2.nrao.edu/projectoffice/imager.ps
      Calibrater: http://aips2.nrao.edu/projectoffice/calibrater.pdf
     *See also George's discussion item #5 below, and Sanjay's imaging
      presentation next meeting.

   o The proposal for changing the way code is checked in and built
     (from RCS to CVS) was discussed. This has some (good)
     implications for testing.
     *Action item - Joe is preparing a change proposal, this will be
      forwarded to the NAUG when ready.

   o Discussions are continuing with the former consortium partner
     ATNF regarding mutual development of AIPS++.
     *High-level issues to be resolved, stay tuned.

   o The new stable snapshot is v1.9 Build 047, see
     http://aips2.nrao.edu/docs/reference/updates.html
     (Note - we have uprev'd from 1.8 to 1.9.)
     *SS1 is available for testing
     There is also a new stable v1.9 b075 as of Tuesday 26 Aug.
     *Only minor enhancements from SS1

3. ALMA (Kumar, Steve, Joe)

   o There was an ANASAC meeting in Chicago this past Sunday and
     Monday. I should have gotten a report from Brian G., but by all
     accounts it went well (particularly the reception to the
     performance breakdown and improvements).
     *Action item - get summary from Brian

   o Debra has assembled test datasets for the ALMA testing which
     could also form the basis for NAUG testing (it would be good to
     pre-test using the ALMA datasets!). No discussion on this topic,
     but here are the sets and links:

     *NGC7538 - a project that Miller has allowed me to work with.
      There are many fields, not overlapping, looking at different
      sources. Src D has the good continuum emission and expected
      weak NH3 line emission. I've done a real rough data reduction
      on the source D field but no continuum subtraction, careful
      flux cal, or analysis. I think I'll choose this data as an ALMA
      test but it is also applicable for VLA/EVLA testing.
      Data, preliminary script available in
      /home/sola3/dss3/alma/tsts/offline/tst1.jan04/NGC7538

     *G192 - science observations of NH3, continuum subtraction
      required. My summer student and I have been working on this.
      It is almost done so it might not be appropriate for new
      testing although if someone else wants to reduce it along a
      different path, I'm happy to provide the data locally. I can
      also provide the script.
      Data and complete script available:
      /home/sola2/dss2/g192/vla.nh3/reduce

     *G192 - quick snapshot of water maser observations.
      Self-calibration needed (no flux cal or bandpass obs
      available). I've done a quick reduction in AIPS++, no self-cal
      done yet. The data and rough script are available at
      /home/sola/dss/current.projects/g192/vla.water.maser

     *Other - see also the benchmarking datasets at:
      http://www.aoc.nrao.edu/~dshepher/alma/benchmarks/

4. EVLA (George, Steve, Joe)

   o Bryan Butler has taken on the EVLA Software Project Scientist
     job. He will start 1 October. Bryan reported that he will get
     together with Debra, Steve & Joe to see how EVLA and ALMA
     development can proceed. He also has the EVLA e2e stuff to deal
     with.

   o Data rates, test datasets, and use cases - Frazer, Sanjay, Doug
     and Rick have been discussing these issues for the EVLA,
     particularly the processing projections for EVLA Phase I and II
     data rates for standard and hard projects. Frazer stated that
     more work needs to be done to refine the computing estimates
     (beyond those in Memo 24 and the proposal), such as more
     detailed use cases. A set of realistic fiducial experiments
     (beyond those in the proposal?) might help, a sort of "Design
     Reference Mission" to use NASA and ALMA words...
     *Some relevant docs:
      - EVLA Memo 24 "Computing for EVLA Calibration and Imaging"
        (Cornwell)
        http://www.aoc.nrao.edu/evla/geninfo/memoseries/evlamemo24.pdf
      - There is a table in the EVLA Phase II proposal outlining 4
        "experiments", you might hunt this down.

5. Main Event - Calibrater performance (George Moellenbrock)

   o George presented a summary of the current status of performance
     improvements to calibrater.
     *See
      http://www.aoc.nrao.edu/~smyers/aips++/notes/2003-08-27-moellen.txt
      which is inserted here:

--------------------------------------------
Calibrater Performance Improvements Thus Far
--------------------------------------------
George Moellenbrock
2003 Aug 27

A lot of improvements have been made in calibrater performance since
2003 July and more are coming. Throughout this process of improvement,
performance has been monitored using a 2 hour VLA simulation (27
antennas, 1 spectral window with 1 channel, 10 second integrations,
full polarization = 252720 (RR,RL,LR,LL) visibilities). The simulated
observation is of a point source with only ordinary gain errors (which
change abruptly every 30 minutes, a la a source change, but which are
constant during the intervals), and noise (snr=8 per visibility).
Recently, channelized datasets (8, 64, 128) with the *same* net
sensitivity have been added to the work. This has revealed some new
and interesting components of the performance picture. The calibration
trials involve obtaining a series of solutions (1,6,11,23,45,90,360,
or 720) by appropriately sub-dividing the dataset (using appropriate
solution interval settings).

This dataset is unrealistic in some ways, but it is a solid basis upon
which to explore performance improvements and comparisons with other
packages. More-realistic systematic errors (including other effects)
and lower snr will be attempted in the near future.

The improvements are (see plot 1):

  1. minimized slot counting (bookkeeping of solutions)
  2. Improved convergence criterion
  3. Predict from previous solution.
  4. Remove log messaging and an extra chi2 calculation
  5. more conservative convergence criterion (handles real data
     better) (this was a step backwards in performance)
  6. patch memory leak
  7. avoid unnecessary gain matrix inversion
  8. initial calibration store optimization

A fair amount of the work (1,7,8) has involved recognizing different
operational contexts in the calibrater tool, e.g., solving vs.
applying calibration, such that processing methods they would
otherwise have in common are specialized appropriately to the
different contexts. A few bugs have been fixed (4,6), and the
convergence criterion and solution prediction have been massaged
(2,3,5).

Plot 2 shows the comparison of aips++ and aips for 1 and 8 channel
datasets. At the low slot count end (left), aips++ is ~3X slower for
1 channel (3 sec vs. 1 sec), and ~2X slower for 8 channels (6 sec vs
2 sec).

As plot 3 shows, the slope of the aips++ curve is dominated by the
calibration table write step. This is due to row-wise (not
column-wise) I/O, and we believe we know how to fix this.

Plot 4 shows that with increasing channel number, aips++ becomes more
competitive with aips. This is most likely due to the tiled disk I/O
used in aips++. However, the slope also increases, although the solve
process itself (and the write step) is not a function of number of
channels (the channel data are averaged before the solve). Work is
continuing on performance improvements....

Plot 5 shows that introducing the trivial model assumption to the I/O
step reduces the execution time dramatically. This is the first
optimization (other than the log messages excision) which
substantially reduces the y-intercept in the performance curves.
Previously, aips++ has been reading the model from the MODEL_DATA
column (on disk), which is unnecessary when the model is trivial (a
point source). This issue is complicated somewhat when a priori
calibrations are considered. Some thought is currently going into how
to do the appropriate logic to make sure that a priori calibration,
data normalization, and frequency and time averaging are done in the
optimal order while maintaining the proper algebraic order of the
calibration terms (some don't commute conveniently with others, and
with the normalization and averaging steps). The trivial model
assumption simplifies this logic in most cases.

Plot 6 shows that (for 128 channel data), the solve is dominated by
I/O (data-only in this plot, not model), pre-solve time and frequency
averaging (which will reduce by a factor of ~2 when the trivial model
assumption is introduced to them), and by unnecessary in-memory data
copying. This last item in the solve step is responsible for the
increasing slope(nchan) of the aips++ curves. The fundamental solve
components themselves are very small.

Summary

  1. For larger numbers of solutions, the current stable calibrater
     is dramatically faster than that of ~2 months ago.

  2. Performance related enhancements include a significant measure
     of operational context- and mode-dependent specializations, as
     well as fixes for some genuine errors in the earlier code. So
     far, the fundamental generality of the calibrater's solver has
     not been compromised, so these performance improvements are
     realized in other solve contexts (e.g., B, D).

  3. Several outstanding performance issues are currently being
     worked, including: optimizing the cal table write, a full
     implementation of the trivial model assumption, and cleaning out
     unnecessary in-memory data copies in the solve. These should be
     done in the next few weeks.

  4. I will be discussing many of these issues with Eric, including
     how the aips curves are so flat, and how aips does the
     normalization/averaging prior to the solution.
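
As a quick check on the dataset size quoted in the note (27 antennas,
2 hours, 10 second integrations), the visibility count follows from
the number of baselines times the number of integrations. The sketch
below is illustrative only: the function name and its arguments are
hypothetical, and it assumes that only cross-correlation baselines are
counted (no autocorrelations) and that each (RR,RL,LR,LL) set counts
as one visibility, which is inferred from the quoted total rather than
stated in the note.

```python
def visibility_count(n_antennas: int, duration_s: int, integration_s: int) -> int:
    """Cross-correlation visibilities in an observation (hypothetical helper)."""
    # Assumption: one baseline per unordered antenna pair, no autocorrelations.
    n_baselines = n_antennas * (n_antennas - 1) // 2
    # Assumption: the observation divides evenly into integrations.
    n_integrations = duration_s // integration_s
    return n_baselines * n_integrations


# 27 antennas, 2 hours, 10 second integrations, as in the simulation.
print(visibility_count(27, 2 * 3600, 10))
```

For the simulation this gives 351 baselines x 720 integrations =
252720, which matches the figure in the note. The 720 integrations
also correspond to the largest solution count tried (one solution per
integration).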

Plots

  Fig 1 - sim_27ant_2h_10s_1ch_allcals
    http://www.aoc.nrao.edu/~smyers/aips++/plots/gm.20030827.fig1.pdf
  Fig 2 - comp1and8
    http://www.aoc.nrao.edu/~smyers/aips++/plots/gm.20030827.fig2.pdf
  Fig 3 - calsolvetimecomp
    http://www.aoc.nrao.edu/~smyers/aips++/plots/gm.20030827.fig3.pdf
  Fig 4 - comp8and64and128
    http://www.aoc.nrao.edu/~smyers/aips++/plots/gm.20030827.fig4.pdf
  Fig 5 - comp8and64and128_triv
    http://www.aoc.nrao.edu/~smyers/aips++/plots/gm.20030827.fig5.pdf
  Fig 6 - solvecomp2
    http://www.aoc.nrao.edu/~smyers/aips++/plots/gm.20030827.fig6.pdf

--------------------------------------------

6. AIPS++ Developments (Joe, Kumar, George, David, Sanjay)

   o Specific development items (new):

     *imager - all Stokes using polarized primary beam has been
      implemented. See EVLA Memo 62:
      http://www.aoc.nrao.edu/evla/geninfo/memoseries/evlamemo62.pdf

   o Specific development items (carried over from minutes of last
     meeting):

     *Viewer - Dave King has added more MS editing functionality,
      such as statistical editing (RMS, deviation vs. mean), zoom
      fixes, and prototype image blinking. See the viewer docs for
      viewerdisplaypanel option records:
      http://aips2.nrao.edu/docs/user/Display/node182.html
      e.g. under "MS and Visibility selection" for RMS and running
      mean difference. Blinking demo was completed, and a usable
      version is forthcoming shortly.

     *uv-plane continuum subtraction (ms.uvlsf) - George has checked
      in this new function. See the daily URM entry for ms.uvlsf,
      currently
      http://aips2.nrao.edu/daily/docs/user/General/node355.html

     *image-plane continuum subtraction (image.continuumsub) - George
      has checked this in, see the daily URM entry for
      image.continuumsub, currently
      http://aips2.nrao.edu/daily/docs/user/General/node44.html

     *calibrater scan-based time gridding (calibrater.setsolve) -
      George has added this capability to calibrater, along with some
      critical performance improvements. From the URM:

      "The solution interval t, if > 0.0, specifies the duration of
      data used for each calibration solution. In general, the
      solution intervals are measured from the beginning of data
      segments for each field and spectral window. If t is large
      enough, a single solution may encompass data from more than one
      scan (as long as the field and spectral window are the same).
      The solution interval represents a coherence time, not an
      integration time w.r.t. gaps in the time series; in effect,
      such gaps are ignored, and the latest time in the solution is
      never more than t seconds after the earliest time. If t = 0.0,
      one solution per scan will be performed and delivered,
      regardless of the (variable) scan durations."

      See the daily URM entry for calibrater.setsolve, currently
      http://aips2.nrao.edu/daily/docs/user/SynthesisRef/node24.html

   Comments on any of these are welcome.

7. Upcoming NAUG meetings:

   o Next meeting (Sep 10)
     *Steve will be traveling back from GB, so Joe will run this
      meeting
     *Main event - Imager performance (Sanjay, Kumar)
      This will include the item deferred from today:
      FFTs in AIPS++ - Sanjay will lead a discussion of an issue
      regarding FFTs for imaging. In particular "With current
      changes, the AIPS++ CS-Clean is less than a factor of 2 slower
      than AIPS. The difference now is largely coming from our use of
      full polarization formulation of the emission which requires
      use of Complex->Complex FFTs at places where a specialization
      can use Real->Complex FFTs. Whether we want to make this
      specialization now is a scientific discussion and needs to be
      discussed more seriously."

   o Following meeting (Sep 24)
     *Main event - Visualization (Brisken)

Other upcoming meetings and deadlines:

   o 2003 Sep 5-6 ASAC (ALMA) Meeting (Hamilton, Ontario, Canada)
   o 2003 Sep 8-9 EVLA Advisory Committee Meeting (SOC)
   o 2003 Oct 1 ALMA Release 1.0
   o 2004 Apr 1 ALMA Release 1.1
   o 2004 Jun 1 ALMA CDR2 = Judgement Day!

The agendas for past NAUG meetings are archived at:
   http://www.aoc.nrao.edu/~smyers/aips++/agenda/

The minutes for past NAUG meetings are archived at:
   http://www.aoc.nrao.edu/~smyers/aips++/minutes/