Pipeline User Test Plan
Second Draft
August 29, 2003
C. Wilson, L. Davis

*********************************************************************

General Considerations:

We envisage four types of User Tests for the Pipeline:

(1) testing the heuristics for the science pipeline in an off-line or stand-alone mode (should not require an integrated release)

This kind of test requires close coordination with the offline system. In principle, the coordination should be such that new functionality in the offline engines is officially tested first in the offline subsystem environment, bugs in the offline engines are fixed, and the officially fixed engines are then tested in the pipeline scripting environment. The user heuristics tests should concentrate on the appropriateness and correctness of the heuristics and not on the correctness of the engines. Given the pipeline's dependence on the offline schedule, the release cycle, and limited personnel, a separation in time is difficult to implement in the schedule. However, we think it can be achieved unofficially by making use of in-house NRAO testing.

(2) testing the heuristics for the science pipeline in the pipeline environment (requires the appropriate integrated release)

In principle, this kind of test should assume the pipeline heuristics scripts are correct and concentrate on testing how they execute in the pipeline environment, e.g. are the connections to the other subsystems working correctly, are things happening fast enough, etc.

(3) testing the Quick Look Pipeline user interfaces (for both the Operator/Staff Astronomer and the PI) that apply to Calibration and Array Monitoring and Data Processing Quick Look operations (requires the appropriate integrated release)

The Operator/Staff Astronomer may have a different view of the user interfaces than the PI, i.e. expert user vs occasional user. We need to understand the relationship of the PI to the interactive Quick Look pipeline better.
(4) testing the simpler heuristics required for the Data Processing capabilities of the Quick Look Pipeline in some kind of an off-line mode, perhaps using some kind of simulator mode (should be done BEFORE any integrated releases that contain real Quick Look Data Processing capabilities).

It may be worth thinking about whether the heuristics modes developed for ALMA Early Science would be easily adapted to function as Quick Look heuristics.

Notes:

(1) The AIPS++ offline system (with the exception of the simulator) does not follow the ALMA release schedule. It has its own internal 2-month release schedule, which is coordinated with the ALMA release schedule. This means that the internal pipeline testing will occur against more frequent offline releases, so hopefully the offline engine bugs will get fixed at more frequent intervals than the ALMA release cycle. We should also test the pipeline using the same version of AIPS++ (in the scientific capabilities sense) as the offline testers.

(2) Full-up tests of the integrated Pipeline require simulated data (calibration, meta, and raw). We will soon have both the AIPS++ and Gypsy simulators available, which should cover most of the issues of generating reasonable test data. We assume the SSR will define reasonable data sets for processing. One possibility would be to define an "ideal" science data set and use the simulators to apply a range of observing conditions to that same data set. This would test the heuristics in an interesting way. Basic metadata / data formats will be defined early next year, and data collections sometime after that? At that point it should be possible to start simulating actual data ...
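Note (2) suggests applying a range of observing conditions to one "ideal" data set. A minimal Python sketch of how such a set of simulation runs might be enumerated follows. The condition names and values (phase_noise_deg, tsys_scale, pointing_error_arcsec) are hypothetical placeholders chosen for illustration; they are not taken from this plan or from either simulator, and the sketch does not call any simulator.

```python
from itertools import product

# Hypothetical observing-condition axes and values (assumptions for
# illustration only; the plan does not specify any of these).
CONDITIONS = {
    "phase_noise_deg": [5.0, 20.0, 45.0],
    "tsys_scale": [1.0, 1.5],
    "pointing_error_arcsec": [0.0, 2.0],
}


def condition_grid(conditions):
    """Return one dict per combination of condition values."""
    names = list(conditions)
    value_lists = [conditions[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


# Every entry describes one simulated variant of the same ideal data set.
grid = condition_grid(CONDITIONS)
```

Each entry of grid would correspond to one simulated version of the same underlying data set, so that differences in the heuristics output can be attributed to the observing conditions alone.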
*********************************************************************
Pipeline User Test Plan
*********************************************************************

Pipeline data processing, user test 1: (Occurs before R1.1) ** OPTIONAL **

Test 1 (stand-alone): Jan-Feb 2004, single field, no single dish, 256 channels or less, integration time 10 sec or less, 5-27 antennas, spectral line (without continuum subtraction), no self-calibration.

Testing Focus:
- calibration and/or imaging
- provide early testing experience to guide later, more critical tests

Lower priority:
- automatic data editing
- pointing, Tsys, weather etc. to identify bad data
- choice of best deconvolution algorithm and/or region
- quality assessment
- robustness of heuristics script to variations in organization of data set

Requires:
- raw data if calibration being tested, calibrated data if only imaging is being tested.
- partial heuristics script for single field imaging (Early Science Case)
- some components of offline package required by heuristics script are available
- one tester beyond the heuristics developers (the Subsystem Scientist)

*********************************************************************
COMMENTS ON TEST 1:

Whether this test is carried out at all depends on whether tools are available in the Pipeline Prototype Infrastructure for the heuristics team to use to develop a partial or complete heuristics script.

*********************************************************************

Pipeline data processing, user test 2: (Occurs before R2)

Test 2 (stand-alone): July 2004, single field, no single dish, 256 channels or less, integration time 10 sec or less, 5-27 antennas, spectral line (with and without continuum subtraction), no self-calibration.

Testing Focus:
- automatic data editing
- calibration
- imaging

Lower priority:
- pointing, Tsys, weather etc.
  to identify bad data
- choice of best deconvolution algorithm and/or region
- quality assessment
- robustness of heuristics script to variations in organization of data set

Requires:
- raw data (real and/or simulated)
- heuristics script for single field imaging (Early Science)
- required components of offline package available to be called by heuristics script
- meta-data (lower priority?)
- results from Tel Cal processing of calibration observations (lower priority?)
- two testers (Subsystem Scientist + one other (Offline Subsystem Scientist?))

*********************************************************************
COMMENTS ON TEST 2:

We may need to think carefully about how best to do a stand-alone user test of the heuristics script in a sensible way, so that a lot of extra effort is not required to prepare two versions of the same script (one for the stand-alone test, a second for the integrated test).

*********************************************************************

Pipeline data processing, user test 3: (Occurs after IR2, before R2.1)

Test 3a (integrated): Jan-Feb 2005, running previously tested heuristics case (single field, no single dish, 256 channels or less, integration time 10 sec or less, 5-27 antennas, spectral line (with and without continuum subtraction), no self-calibration) in the integrated pipeline environment

Testing Focus:
- automatic data editing
- calibration
- imaging

Lower priority:
- pointing, Tsys, weather etc.
  to identify bad data
- choice of best deconvolution algorithm and/or region
- quality assessment
- robustness of heuristics script to variations in organization of data set

Requires:
- simulated raw data
- tested heuristics script for single field imaging (see Test 2)
- components of offline package used by heuristics script (see Test 2)
- simulated meta-data
- simulated results from Tel Cal processing of calibration observations
- Subsystem Scientist plus at least two outside testers

Test 3b (integrated): Jan-Feb 2005, first Quick Look pipeline tests, (single field, no single dish, 256 channels or less, integration time 10 sec or less, 5-27 antennas, spectral line) in the integrated pipeline environment

Testing Focus:
- Calibration Monitoring by Operator/Staff Astronomer, at least pointing and focus results

Lower priority:
- Calibration monitoring of phase calibration, amplitude calibration, bandpass calibration
- Array Monitoring by Operator/Staff Astronomer
- Calibration or Array Monitoring for mosaic mode
- Calibration or Array Monitoring by PI
- Quick Look Data Processing (any mode)

Requires:
- simulated results from Tel Cal processing of calibration observations
- Quick Look Calibration Monitoring structure at least partly in place
- need outside testers who can act like operators or staff astronomers

Test 3c (stand-alone): Jan-Feb 2005, single field, no single dish PLUS small multi-field mosaic imaging, no single dish, 256 channels or less, integration time 10 sec or less, 5-27 antennas, spectral line (with and without continuum subtraction), no self-calibration.

Testing Focus:
- automatic data editing
- calibration
- imaging
- robustness of heuristics script to variations in organization of data set

Lower priority:
- pointing, Tsys, weather etc.
  to identify bad data
- choice of best deconvolution algorithm and/or region
- quality assessment

Requires:
- raw data
- meta-data
- heuristics scripts for single field imaging and mosaic imaging (Early Science)
- required components of offline package available to be called by heuristics script
- results from Tel Cal processing of calibration observations (lower priority?)
- Subsystem Scientist + at least two outside users as testers

*********************************************************************
COMMENTS ON TEST 3:

There are three components to this testing, so it will be a larger test effort:
(1) integrated tests of the Science Pipeline (focus on a single heuristics mode)
(2) first integrated tests of the Quick Look Pipeline (limited capabilities)
(3) stand-alone tests of one or more additional heuristics modes (i.e. Early Science Mosaics, possibly also ALMA Single Field, ALMA Mosaics; still no single-dish data added)

*********************************************************************

Pipeline data processing, user test 4: (Occurs before R3)

Test 4a (integrated): July 2005, running previously tested heuristics cases (single field, PLUS small multi-field mosaic imaging, no single dish, 256 channels or less, integration time 10 sec or less, 5-27 antennas, spectral line (with and without continuum subtraction), no self-calibration) in the integrated pipeline environment

Testing Focus:
- automatic data editing
- calibration
- imaging

Lower priority:
- pointing, Tsys, weather etc.
  to identify bad data
- choice of best deconvolution algorithm and/or region
- quality assessment
- robustness of heuristics script to variations in organization of data set

Requires:
- simulated raw data
- tested heuristics scripts for single field and mosaic imaging (see Test 2, 3c)
- components of offline package used by heuristics script (see Test 2, 3c)
- simulated meta-data
- simulated results from Tel Cal processing of calibration observations
- Subsystem Scientist plus at least two outside testers

Test 4b (integrated): July 2005, Quick Look pipeline tests, (single field, no single dish, 256 channels or less, integration time 10 sec or less, 5-27 antennas, spectral line) in the integrated pipeline environment

Testing Focus:
- Calibration Monitoring by Operator/Staff Astronomer, all calibration modes
- Array Monitoring by Operator/Staff Astronomer (single field mode)

Lower priority:
- Calibration or Array Monitoring by PI
- Calibration or Array Monitoring for mosaic mode
- Quick Look Data Processing (any mode)

Requires:
- simulated results from Tel Cal processing of calibration observations
- Quick Look Calibration Monitoring structure at least partly in place
- need outside testers who can act like operators or staff astronomers

Test 4c (stand-alone): July 2005, ADD single dish data to previous modes; single field, with and without single dish, small multi-field mosaic imaging, with and without single dish, 256 channels or less, integration time 10 sec or less, 5-27 antennas, spectral line (with and without continuum subtraction), no self-calibration.

Testing Focus:
- automatic data editing
- calibration
- imaging
- robustness of heuristics script to variations in organization of data set

Lower priority:
- pointing, Tsys, weather etc.
  to identify bad data
- choice of best deconvolution algorithm and/or region
- quality assessment

Requires:
- raw data
- meta-data
- heuristics scripts for single field imaging and mosaic imaging including single dish data (Early Science case)
- required components of offline package available to be called by heuristics script
- results from Tel Cal processing of calibration observations (lower priority?)
- Subsystem Scientist + at least two outside users as testers

Test 4d (stand-alone): July 2005, single field, no single dish, 4096 (TBC) channels or less, integration time 10 sec or less, 64 antennas, spectral line (with and without continuum subtraction), include self-calibration.

Testing Focus:
- automatic data editing
- calibration
- imaging
- choice of best deconvolution algorithm and/or region
- robustness of heuristics script to variations in organization of data set

Lower priority:
- pointing, Tsys, weather etc. to identify bad data
- quality assessment

Requires:
- simulated raw data
- simulated meta-data
- results from Tel Cal processing of calibration observations
- heuristics scripts for single field imaging for full ALMA case
- required components of offline package available to be called by heuristics script
- Subsystem Scientist + at least two outside users as testers

*********************************************************************
COMMENTS ON TEST 4:

There are four components to this testing, so it will be a large test effort:
(1) integrated tests of the Science Pipeline (focus on first two heuristics modes)
(2) integrated tests of the Quick Look Pipeline (expand capabilities of Calibration Monitor; add Array Monitor)
(3) stand-alone tests of ALL additional heuristics modes required for Early Science
(4) stand-alone test of first full-up ALMA mode (single field, no single dish)

*********************************************************************

Tests 5-7:

Standalone tests continue before each subsystem release following the same pattern set up
here, i.e. every 6 months. It would be best if the integrated tests could also follow this 6-month pattern, but if integrated releases happen only once per year, then integrated testing may have to be on a yearly schedule too.

The structure for Tests 5-7 is similar to that of Test 4, except that stand-alone tests of Quick Look Heuristics replace the stand-alone tests of the Early Science ALMA Modes:
(a) integrated testing of heuristics mode from the previous stand-alone test
(b) integrated testing of the Quick Look Pipeline adding additional capabilities (first integrated tests of Quick Look Heuristics in Test 6)
(c) stand-alone testing of Quick Look Heuristics
(d) stand-alone testing of additional full ALMA heuristics modes (mosaic + single dish in Test 5; snapshot + OTF mosaic in Test 6; deep imaging + polarization in Test 7)
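
The (a)-(d) pattern for Tests 5-7 could be written down as a small lookup table, which may help when checking that each stand-alone mode is picked up by the following integrated test. The Python sketch below is illustrative only: the dictionary and function names are hypothetical, and the Test 4 entry lists only the full ALMA mode from Test 4d (the Early Science modes of Test 4c are left out for brevity).

```python
# Full ALMA heuristics modes tested stand-alone in each test, as listed
# in the plan. The Test 4 entry covers Test 4d only (a simplification).
STANDALONE_FULL_ALMA_MODES = {
    4: "single field, no single dish",
    5: "mosaic + single dish",
    6: "snapshot + OTF mosaic",
    7: "deep imaging + polarization",
}


def components_for_test(test_number):
    """Return the (a)-(d) components for Tests 5-7 as a dict."""
    if test_number not in (5, 6, 7):
        raise ValueError("pattern is only described for Tests 5-7")
    return {
        # (a) integrated test of the mode from the previous stand-alone test
        "a_integrated_heuristics": STANDALONE_FULL_ALMA_MODES[test_number - 1],
        # (b) Quick Look Heuristics enter the integrated tests from Test 6 on
        "b_integrated_quick_look_heuristics": test_number >= 6,
        # (c) stand-alone Quick Look Heuristics tests occur in every test
        "c_standalone_quick_look_heuristics": True,
        # (d) new full ALMA mode tested stand-alone
        "d_standalone_full_alma": STANDALONE_FULL_ALMA_MODES[test_number],
    }


plan_for_test_6 = components_for_test(6)
```

Here plan_for_test_6 pairs the integrated test of the Test 5 mode with the stand-alone test of the new Test 6 mode, mirroring the one-test lag between stand-alone and integrated testing described above.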