EVLA CASA Test: 18-22dec06
==========================

I. Introduction

The second EVLA CASA test was held 18-22 December 2006. This test was...

* the first full test of CASA (as compared to aips++), the primary
  difference being the switch from glish to python;
* the first test of the revised user interface, with the introduction
  of tasks and in-line help;
* the first ``wide, not deep'' test, with the testers encouraged to
  try using CASA to reduce their own data, rather than working on a
  few pre-tested data sets or closely following pre-tested scripts.

The test itself was focused on the user interface (including
documentation) and basic calibration (including data weights).
Self-calibration was not tested; imaging and (rudimentary) flagging
were available, but were used primarily to check the calibration. Two
additional tests, specifically for the EVLA, were planned for this
week as well: one to check whether SDM data could be read into CASA
and written out to UVFITS files usable by AIPS, and another to test
pointing self-calibration. More details are given in the charge to
the testers, available at
http://projectoffice.aips2.nrao.edu/EVLA2006.12/EVLA2006.12.html .

A. The Process

There were eight primary testers, all NRAO employees. Of these, six
are based in Socorro (Walter Brisken, Mark Claussen, Gustaaf van
Moorsel, Frazer Owen, Michael Rupen, and David Whysong), and two are
based in Charlottesville (Ed Fomalont and Crystal Brogan), although
EF was visiting the AOC during this test. The primary CASA pundits on
call for the week, also all at the AOC, were Kumar Golap, Joe
McMullin, and George Moellenbrock.

The testers generally worked for several hours each day, sending bug
reports and enquiries throughout, and summarizing their experiences
in daily written reports. We met with the CASA group for 30-60
minutes each morning to discuss progress and outstanding issues.
There were also occasional smaller sessions on specific topics (e.g.,
smoothing vs. interpolation) that came up during testing. Finally,
the testers and the CASA group (as well as Brian Glendenning) met
together on Friday afternoon for a general discussion of the results.

II. Results

The detailed daily logs of the testers are available at the Web site
above. The same Web page also includes a link to JM's spreadsheet
summarizing the bugs reported and progress on their resolution. Here
we give a broader overview of the results.

A. Overview

Both testers and programmers found this test a very useful one. While
there were stumbling blocks and annoyances, all of the testers left
with the feeling that CASA is at last on the road to becoming a
usable data reduction package for ordinary astronomers. There is a
great deal to do, but the user interface in particular has been
improved tremendously. On the CASA side, the wide array of data sets
(and users!) thrown at the package proved very effective in
discovering defects, in providing useful suggestions, and in
beginning interesting dialogs between users and programmers. The
challenge now is to ensure that this sort of vibrant interaction and
progress continues in the months ahead.

B. Interface and Ease-of-Use

One of the main aims of this test was to provide a first check of the
CASA team's response to the suggestions that came out of the User
Interface Charette in June. In addition to a number of smaller
innovations, the main items tested here were (1) the implementation
of the first tasks, and (2) the new in-line help.

The testers uniformly found the user interface vastly improved, as is
evident in the number of relatively small suggestions that would
previously have been far "below the noise" ("Oooh, this is really
cool! Could you maybe do this too???"). The task interface and the
in-line help were both widely praised; even the tools were felt to be
easier to use, thanks both to the available in-line help and to the
on-going rationalization of the various methods and inputs.
More basically, there was overall improvement in both stability and
responsiveness compared to the glish (aips++) interface. While much
remains to be done, these first steps are clearly very positive ones.

One possibly surprising result in this area was that many testers
actually did read the cookbook, suggesting that effort spent on that
document would be richly rewarded. On the other hand, at least one
more experienced tester was able to avoid the cookbook altogether,
because he found the in-line help and inputs sufficient. This seems
an excellent situation, with neophytes able to learn usefully from
the cookbook, and pundits able to stick with in-line documentation
rather than having to refer to printed or Web-based documents.

C. Calibration

The other main test area was basic interferometric calibration,
including (for the first time) the calibration of data weights. Here
we have a more mixed review. On the positive side, the basic
calibration structure seemed reasonable and worked fairly well for a
wide variety of data sets. Detailed comparisons by EF also found an
excellent correspondence between the data and weights for one or two
data sets calibrated independently in AIPS and in CASA.

However, there were a few problems as well. Flagging was not part of
this test, but testing calibration on unflagged data was problematic
at best, so many testers either battled with the existing CASA
autoflagger (with much cursing), or flagged in AIPS before reading
the data into CASA. More seriously, plotting calibration results
(plotcal) was painful in various ways, making it difficult to
evaluate those results. We also found several problems with
polarization calibration, uv-range restrictions, and calibrator
models, many of which were resolved during the testing week. The
testers would like some method of tracking the calibration --- what
has been applied and how. The documentation was seen as too technical
and occasionally confusing (especially "accumulate").
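One way around the difficulty of evaluating solutions purely by eye is to cut on solution quality programmatically. The following is only a minimal sketch in pure numpy: the array layout and the function name are hypothetical stand-ins (real CASA solutions live in calibration tables, and a cutoff like this would more naturally be a solver parameter than a post-hoc filter).

```python
import numpy as np

def filter_solutions(gains, snr, snr_cut=3.0):
    """Discard calibration solutions whose SNR falls below snr_cut.

    gains : complex array of gain solutions (hypothetical layout:
            one entry per antenna/solution interval)
    snr   : real array of the same shape, the per-solution SNR
    Returns (good_gains, mask), where mask is True for kept solutions.
    """
    gains = np.asarray(gains)
    snr = np.asarray(snr)
    mask = snr >= snr_cut  # keep only solutions above the cutoff
    return gains[mask], mask

# Toy example: four solutions, the last one hopelessly noisy.
g = np.array([1.0 + 0.1j, 0.9 - 0.05j, 1.1 + 0.0j, 0.2 + 0.8j])
s = np.array([25.0, 18.0, 12.0, 1.5])
good, mask = filter_solutions(g, s, snr_cut=3.0)
print(len(good))  # 3 solutions survive the cut
```

The same mask could then drive flagging of the data calibrated by the rejected solutions, which also gives a crude record of what was (not) applied.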
We would also like to carry error bars or signal-to-noise ratios
along with the calibration solutions, at least to enable the user to
set an SNR cutoff for acceptable solutions. Finally, some of the CASA
code assumes that the observations are in dual polarization, and some
bugs associated with single-polarization data should be fixed. In
sum, this area clearly needs more work, as well as more thorough
testing, but the basic approach seems reasonably attractive.

D. Missing capabilities

A number of areas were not officially part of this test, but were
felt to be absolutely essential to "real world" use of the package,
and thus deserving of fairly immediate attention. These included:

* Support of remote sites: the one tester in Charlottesville (CB) had
  a very difficult time of it. Partly this was just chance, as her
  particular (albeit fairly simple) data set manifested a series of
  unfortunate interactions with several important tasks. But overall
  it was clear that we need some better way to support rapid bug
  fixes for remote sites. Obviously this will become more and more
  important as the use of CASA spreads beyond NRAO.
* Basic flagging.
* Plotting, both of the data themselves and of the calibration
  results. There are innumerable annoyances, and occasional actual
  bugs, in both areas.
* CLEAN boxes. Now that data weights are calibrated, this is seen as
  the major impediment to high-quality imaging in CASA.
* Tool-level in-line help and the URM. Both are spotty at best at the
  moment.

E. EVLA-specific test: converting an SDM/MS for AIPS

We had hoped to do a full round-trip test, reading a complicated SDM
into CASA and writing it out as UVFITS files readable by AIPS. In the
event, the only available "certified" ASDM data set was rather
boring, and the only test (done by EF) was a very detailed check of a
complicated PdB MS with a variety of spectral properties.
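The kind of detailed cross-checking done here can be sketched in pure numpy. The arrays below are hypothetical stand-ins for table columns (e.g., antenna positions) read from the CASA MS and from the AIPS-loaded UVFITS file; a real check would pull the columns from the actual tables.

```python
import numpy as np

def tables_agree(tab_a, tab_b, atol=1e-6):
    """Return True if two table columns agree element-by-element.

    tab_a, tab_b : arrays of, e.g., antenna positions (metres) from
                   two independent reductions (hypothetical inputs)
    atol         : absolute tolerance; 1e-6 m is far below 1 mm
    """
    tab_a = np.asarray(tab_a, dtype=float)
    tab_b = np.asarray(tab_b, dtype=float)
    return tab_a.shape == tab_b.shape and bool(np.allclose(tab_a, tab_b, atol=atol))

# Toy example: identical positions agree; a 1 cm shift does not.
pos = np.array([[1601.4, -2505.9, 14.2],
                [1720.0, -2388.7, 15.1]])
print(tables_agree(pos, pos))         # True
print(tables_agree(pos, pos + 0.01))  # False
```

Automating such comparisons would make it cheap to repeat the round-trip test as the SDM evolves.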
CASA correctly wrote multiple UVFITS data sets, each of which was
successfully read by AIPS (FITLD). Both the data and the important
tables (e.g., the antenna positions) seemed to agree perfectly
between CASA and AIPS. The main negative is that reading/writing data
seems to be a factor of 10 slower in CASA than in AIPS for uv-data
with either multiple sources or multiple spectral windows (or both).
In sum, the results were very heartening, but the speed issue must be
addressed, and more extensive tests should be done as the SDM
develops further.

F. EVLA-specific test: pointing self-calibration

The pointing self-calibration code proved impossible to port over to
CASA in time for this test.

III. Summary

Overall this was quite a useful and generally successful test. Much
work remains to be done, but the current release of CASA is an
enormous improvement over what we had even six months ago, a tribute
to the impressive efforts put forth by the CASA group this year. The
highest priority in the near term (six to 12 months) should be on
general use issues (ease-of-use, basic capabilities, documentation,
interface), to continue this excellent progress. Currently CASA is by
no means ready for general use, or for most scientific processing.
The challenge now is to meet a very aggressive 2007 schedule, with a
first public release in September, while continuing to address
algorithmic and technical development (e.g., speed issues) in the
context of a new emphasis on user support.

================================================================================