Critical Review: ALMA Memo 293 - ALMA Software Requirements Preliminary Report

Memo Version: March 8, 2000
Review Version: April 25, 2000
Reviewer: Steven T. Myers, NRAO

-------------------------------------------------------------------------------

Summary
-------

The document outlines the "science-driven software requirements of the ALMA project", and is intended as a broad review of the considerations that will drive the software design. As such it is necessarily vague in places, as it will be followed by a more detailed document. The memo seems reasonably complete, and I do not think there are any glaring omissions. On the other hand, much of the discussion is at the technical equivalent of "mom and apple-pie" (perhaps "mere et crepes" for the European half), emphasizing general notions that are fairly obvious and that everyone can agree on. However, it is in the details that the success of this project will be made or not. Some hard choices will have to be made, and soon, and this document skirts the issue(s): as it stands, it does not indicate what these choices are likely to be or what the implications of the alternatives are (even in just a general sense).

There appear to be three critical issues to be decided before embarking on the coding: the adoption of dynamic scheduling as the predominant default observing mode (this has been hinted at, but the document comes up short of actually saying it), with interactive observing reserved for those very few cases that require it; the scope of the integration of the proposal, scheduling, observing, monitor and control, pipeline analysis, post-processing, and archiving software, and the extent to which these will be developed in parallel using the same base code (and whether a whole new package needs to be developed, and the extent to which it can be based upon existing designs); and, given the above, what the look and feel of the GUI and script-level interfaces to the package will be.

One thing that should come out of this document, in addition to the obvious overall picture of the software system requirements, is an idea of what things need to be investigated next, who will do so, and what has been decided already.

Some specific comments by section:

Part 1 - Introduction
---------------------

no comments

Part 2 - Real Time Software
---------------------------

The four modes described (technical, manual, interactive, and dynamic) seem to encompass the desired range of operation. However, the implications of having the primary operational mode be dynamic are not fully explored or even outlined. For example, it is unlikely that the exact composition and placement of antennas in the array will be known at schedule time, which will affect the way sub-arrays are allocated.

In section 2.1 on sub-arrays it is unclear which are the "higher-level modes"; I presume that the technical mode has the highest priority.

In section 2.2, and throughout the document, the model presented for the control language is a standard keyword-value language (which essentially still emulates the old FORTH style of programming) similar to those used on most (if not all) existing radio and millimeter arrays as well as optical telescopes. However, I feel that it is highly likely that this type of language will prove inadequate for what we will be asking of it, especially given the proposed dynamic scheduling operation. In particular, the need to interface with the analysis pipeline as well as with the hardware control suggests that it be modeled on (or at least be highly compatible with) the analysis software. Although it is not specifically stated, it is a reasonable assumption that the system software will be constructed using an Object Oriented Programming language, either directly in C++ or in a higher level OOP language (eg. glish).
Since the requirements on the functionality of the control language will be rather severe, it is likely that the OO nature of the base language will carry through to the user level, and it is important to recognize this from the beginning. I will probably harp on this issue too much, but I think it is more important to have a clear grammar and flexible software system than it is to present a deceptively simplistic control language to the putative "user". It will of course become clear during the actual software development whether a simple keyword style language will be sufficient or whether a more complex but powerful C++ or Java style language grammar is required. Given the sophistication of most observers, it is counter-productive to "dumb" them down when specifying the design requirements.

The discussion of sub-arrays implies that specific antennas are to be selected for allocation to the subarrays, yet it is unlikely that the exact array configuration, or even which antennas are available, will be known at the time of observer scheduling. Subarrays will thus have to be described in a manner appropriate for dynamic scheduling. This was discussed in a short memo (dated 3 March 2000) sent to the alma-ssr group, and can be found archived at http://www.aoc.nrao.edu/~smyers/alma/subarrays2.txt along with some discussion of language choices (eg. keyword script style vs. OOP style). I assert that the "subarray" is the fundamental molecule of the observation (with the antennas and correlator setups forming the atoms) and as such will have to be described in a deterministic yet flexible manner. The main distinction between types of subarrays (from the user point of view as well as from the operational point of view) is whether they inherit the same (first) LO setup as another subarray or need a separate first LO signal - the latter are the ones likely to be restricted to four in total. What will happen if some subarrays are already in use for VLBI or other observations and the user needs one or more?
This will have to be factored into the dynamic scheduling...

In section 2.3 the GUI is described. Although GUIs are the "de facto standard" for many things, including current array scheduling (and analysis) packages, they can prove very cumbersome when dealing with complicated and/or large observing programs. There are many of us who carry out large surveys for whom GUIs are nearly useless - thus it will be crucial that all GUI functionality be available through some list-directed input (to allow scheduling based on user-input lists) with some automation, or through separate comprehensive script-level scheduling tools. It should not be necessary for such a user to develop their own scheduling program (as I have had to do for my CLASS survey for the VLA). This should be straightforward to implement if the software is designed with the large survey user in mind as well as the "novice" single-source observer for whom simple point-and-click GUI scheduling is appropriate.

The "look and feel" of the GUI will be critical in determining its utility and popularity. For example, most current GUIs that I have seen for telescope control and scheduling are basically control panel style. However, for constructing (possibly complex) schedules and assembling objects to build a pipeline analysis macro, something more along the lines of IDL and similar visualization packages might be preferable. Being able to manipulate the atomic objects graphically and assemble them molecularly into macro-objects would simplify and enhance the user's experience in constructing schedules and analysis templates. Furthermore, it would allow easy "cloning" of macro-objects to facilitate the normally tedious process of assembling large snapshot survey schedules, for example. Computer input, such as automated selection of calibrators based on user-supplied criteria and choice of calibration cycle time based on weather diagnostic input data, could be objects also.
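
As a rough illustration of the cloning idea above, here is a minimal Python sketch. Everything in it is hypothetical: the class names (`Scan`, `ScheduleBlock`), their fields, the `clone_for_targets` helper, and the source names and durations are assumptions made for illustration, not part of Memo 293 or of any existing ALMA software. It shows one template macro-object being cloned from a plain list of targets, which is the kind of list-directed input a large survey would need.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Scan:
    """Atomic object: one pointing on one source (hypothetical fields)."""
    source: str
    duration_s: float
    intent: str = "target"  # e.g. "target" or "gain_cal"


@dataclass(frozen=True)
class ScheduleBlock:
    """Macro-object: an ordered group of scans treated as a unit."""
    name: str
    scans: Tuple[Scan, ...]

    def total_time_s(self) -> float:
        return sum(scan.duration_s for scan in self.scans)


def clone_for_targets(template: ScheduleBlock,
                      targets: List[Tuple[str, str]]) -> List[ScheduleBlock]:
    """Clone a template block once per (target, calibrator) pair.

    Scans with intent "target" get the new target name; every other
    scan gets the new calibrator name. Durations are kept as-is.
    """
    blocks = []
    for target, calibrator in targets:
        scans = tuple(
            replace(scan, source=target if scan.intent == "target" else calibrator)
            for scan in template.scans
        )
        blocks.append(ScheduleBlock(name=f"snapshot_{target}", scans=scans))
    return blocks


# Template: calibrator, target, calibrator (durations are made-up numbers).
template = ScheduleBlock(
    name="snapshot_template",
    scans=(
        Scan("CAL", 60.0, "gain_cal"),
        Scan("SRC", 30.0, "target"),
        Scan("CAL", 60.0, "gain_cal"),
    ),
)

# List-directed input: in practice this would be read from a survey file.
survey = [("J0001+01", "C0001"), ("J0002+02", "C0002")]
blocks = clone_for_targets(template, survey)
for block in blocks:
    print(block.name, [s.source for s in block.scans], block.total_time_s())
```

The template is immutable, so each clone is an independent object; a GUI that manipulates such objects graphically and a script that builds them from a list would both be driving the same underlying structures.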
I think that fundamentally considering the whole system software package, from the proposal to processing to archiving stage, as instances of a set of base objects will greatly aid the design process.

In Section 2.4, there is no text under Pulsar Observations (place it as an example of "Other Modes" if no further description is desired). Are there other Other Modes? Planetary Radar modes (with ultra-high spectral resolution)? Phased Array mode?

Under "Total Power - On The Fly Mapping" there is the statement "Every so many scans a reference auto correlation on blank sky is needed", which is not true - one of the features of OTF (in order for it to be efficient) is that it covers the ON and OFF parts of the source during the scanning. Otherwise it is just a rasterized position-switched map. The critical thing for OTF mapping is controlled slewing while observing.

Should "Phase Calibration" be distinct from the amplitude part of "Gain Calibration"? I would call the amplitude and phase calibration derived from cross-correlations on sources "Gain Calibration", and "Flux Calibration" would be just an instance of this. Note that for weak calibrators you would do phase-only Gain Calibration, and the secondary calibrators' amplitudes can be linked to the primary flux calibrator amplitudes. I would tend to call what you call "Amplitude Calibration" from sky/hot/cold auto-corrs "Temperature Calibration" or something similar. Will there be "Noise Calibration" using a noise tube?

On the observing tool reaction to astronomer inputs (2.4.2), what level of "fuzziness" will be assumed for parameters (especially when it comes to dynamic scheduling time)? How are these uniformly specified (eg. as a "Tolerance" on specs like Synth_Beam_Size or Largest_Structure_Scale, which determine the allowed configurations)? Note that Beam_Size should be something like Synth_Beam_Size to be unambiguous. Also, I assume the intended info and warnings will be more specific than the example!
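
One way to make the "fuzziness" question concrete is to attach an explicit tolerance to each requested parameter and test candidate configurations against it. The Python sketch below is only an illustration: the `ToleranceSpec` class, the fractional-tolerance convention, the `allowed_configurations` helper, and the configuration names and numbers are all invented for this example. Only the parameter names Synth_Beam_Size and Largest_Structure_Scale come from the text above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ToleranceSpec:
    """A requested value plus an allowed fractional deviation (assumed convention)."""
    name: str
    target: float
    frac_tolerance: float  # 0.25 means "within 25% of target"

    def accepts(self, value: float) -> bool:
        return abs(value - self.target) <= self.frac_tolerance * self.target


def allowed_configurations(specs: List[ToleranceSpec],
                           configs: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the names of configurations that satisfy every spec."""
    return [
        cfg_name
        for cfg_name, properties in configs.items()
        if all(spec.accepts(properties[spec.name]) for spec in specs)
    ]


# Hypothetical array configurations with made-up properties (arcsec).
configs = {
    "compact": {"Synth_Beam_Size": 3.0, "Largest_Structure_Scale": 40.0},
    "medium": {"Synth_Beam_Size": 1.1, "Largest_Structure_Scale": 15.0},
    "extended": {"Synth_Beam_Size": 0.3, "Largest_Structure_Scale": 4.0},
}

# Hypothetical proposer request: 1 arcsec beam within 25%,
# 12 arcsec largest structure within 50%.
specs = [
    ToleranceSpec("Synth_Beam_Size", target=1.0, frac_tolerance=0.25),
    ToleranceSpec("Largest_Structure_Scale", target=12.0, frac_tolerance=0.5),
]

print(allowed_configurations(specs, configs))
```

With these made-up numbers only the "medium" configuration passes both checks. A symmetric fractional tolerance is just the simplest choice; one-sided limits or per-parameter weights would fit the same structure.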
Some thought needs to be put into a vocabulary and set of scientific specification parameters that will describe the desired observations. Can a preliminary set be included here?

Section 3 - Proposal Submission and Handling
--------------------------------------------

In Section 3.1, the first requirement is that the proposal be electronically submitted, yet later it says that "We don't think this needs to be computer parsable, if all the technically relevant data are available elsewhere. People should be free to use their favorite text processing system to prepare the text and to include the figures." This seems incompatible with the first requirement, and will unnecessarily complicate the proposal review and archiving process. I *strongly* urge the adoption of some uniform and computer parsable scheme for proposal submission and archiving, either TeX based (most readable) or HTML (most archivable). I would even more strongly urge disallowing formatted text formats like MS-Word or Word Perfect (I used to get these from students, and they are incompatible across versions and hell to deal with). Since the proposals are to be archived with the schedule and data, it is even more important to enforce some uniformity.

On the specification (by the proposer) of sensitivity goals and breakpoints, some work will be needed to come up with a proper language (with fuzzy specs or tolerances) and relative weights to allow scoring of dynamic scheduling points to determine priority, and to make the necessary tradeoffs between some of the goals for a given set of in-hand data to determine breakpoints. As stated earlier (and as will be stated again later), this will need some careful and innovative thought. It may be that a small set of parameters and tolerances (eg. Map_rms on various scales, Smallest_uv_gap, Signal_to_noise_fidelity, etc.) will suffice.

Are the things in Section 3.2 Advanced Features intended as suggestions or specs?
Instead of linking to some (necessarily observatory maintained) database of a hodge-podge of existing line surveys by object, would it be sufficient to have a "standard" generic line-list of transitions, perhaps with a guide (e.g. a link in the list) to which sources or classes of sources are likely to show these? I think it would be beyond the means or even the mandate of ALMA to maintain a research database on individual objects. A single line-list is probably sufficient, and knowing about the lines in a specific target would become part of the scientific background work required of the user on the proposal.

Section 4 - Dynamic Scheduling
------------------------------

If we are to seriously implement dynamic scheduling, which I believe is the intent, then it must be done as the *primary* method of observing. Beyond the technical shakeout stage, a user must have a strong valid reason why their project must be done interactively (or manually) instead. There needs to be a mechanism for the review and approval or dismissal of such requests. This should hold true for all peer reviewed scientific proposals, even for internal "expert" observers. No exceptions. An observer cannot have the option to observe interactively just because they prefer to do so, enjoy doing so, or cannot be bothered to carefully set up a dynamically schedulable program.

The statement "A table driven observation is more flexible." is unclear. More flexible than what? What is meant by a "table" (I presume an ascii table, as it says a bit later)? I do not see this at all, and it should be either left out or demonstrated. I would think that this would be much less flexible, as it requires some rigid standard table scheme in order for it to be parsable. Some more standard method of specifying states (eg. GUI and/or state variable changing functions) would seem better.

It says that the "expert dynamic scheduling program will probably become the default mode of observing", which seems a little weak.
I would think that the decision to make it the primary mode (no "probably") is one of the (few) concrete things that this review should generate. It continues "unless an expert observer wants to take interactive control of the array", which as I stated before is insufficient - the "expert" observer needs a valid reason and permission.

The section 4.3 "Guidelines" seems somewhat contradictory and vague. Again, since ALMA is indeed a valuable resource, dynamic scheduling must IMHO be the primary mode of operation. But not only should we avoid wasting time when the weather is bad, we need to avoid wasting good time just so an in-house "expert" can play. ALMA will be a cutting-edge mature facility servicing the entire international community, and cannot be run as a seat-of-the-pants operation, even if it would be more fun for the people involved. "Leaving the human observers feeling in control" should not be a primary concern of the system, but giving the scheduler and the observers real control is important. This section as it stands seems like a vestigial appendage. Perhaps it could be tightened, prioritized, and put at the beginning.

I assume there will be controversy on the role of interactive observing. I hold the view that with proper planning, and given the proposed set of powerful tools, monitoring, and pipeline diagnostics, the need for interactive use of the array (after the test stage of course) can be kept to a bare minimum. This is the case with any space-based observations, and though this may take some of the "fun" out of the observing it is, I think, the right thing to do given the nature of ALMA. This cannot be seen as a $600M toy.

Section 5 - Operator Interface
------------------------------

It says "Location: The Operator can either be located at the site or at the OSF in San Pedro". Duh. Where will the "operator" be and what are the issues for the two cases?
If the operator is in San Pedro, then they cannot oversee operations on-site, which has some impact on what duties they can assume. I could also see use in having an array operator at the site and at the same time a data manager at San Pedro monitoring the data and overseeing the pipeline (as well as providing a double-check on the site operator's safety) - this might be the best split of duties, with the site operator more electronics oriented and the OSF data manager more computer oriented. A video link between the two would ease the isolation of the site. I don't think we should underestimate the hazards for those on-site, and some connection to the OSF would be good.

In 5.2 Operator Responsibilities, the second item is to oversee the site work schedules and safety (which cannot be done if the operator is in San Pedro!). In any event, this seems to distract from the main duty of the operator, which is to operate the array. Although the operations of ALMA are outside the scope of this document, I would strongly urge this committee to decide that a DEDICATED array operator be assigned during a given shift. Note that site safety would require a second person (at least) with the operator, and the co-operator could be assigned charge of safety concerns (including array shut-down due to weather, and evacuation, possibly overriding the primary operator). Note also that this second item is in contradiction to the skills listed at the top for the operator (eg. computing and electronics doesn't equal health and safety management training etc.). I would delete item 2, as site operations will need to be managed by a site manager, and add the recommendation that a dedicated operator be designated who has no managerial duties to distract them.

Section 5.3 seems somewhat slight, but I'm sure a list of interface tasks will be filled out by members of the group currently dealing with arrays.

Section 6 - Data Pipeline
-------------------------

Nothing too startling here, although integration of the analysis pipeline will require coordination with the pipeline programmers (or does it fall under the SS group?). What decisions have to be made, and when, on the language specs for the system and analysis software? I think it would be a mistake to assume that some pipeline developed in isolation will be integratable into the ALMA system. Some tough choices will need to be made, and soon (who will do this?), as to whether an existing package is adopted.

In the lists of section 6.2, the main goal of the pipeline from the point of view of the system software is to provide diagnostics usable by the online system to drive the dynamic scheduling. Whether it produces a homogeneous or scientifically useful data base is secondary (though likely to follow, assuming a sane pipeline is adopted). In the item beginning "The pipeline must interact ..." the phrase "various high levels" should read "various high-level".

"The pipeline must operate through a readable and comprehensive data reduction script" is irrelevant given its (rather complex) task. It is desirable that it be readable (and to do the job it must be "comprehensive" I guess, or does the author mean "comprehensible"?), but it may well (actually, probably) have to be written directly in the lower-level (C++,glish,java,???) language to have direct control over the objects. I find most programs to be as readable as observing scripts in any event, once you understand the language, and I think most astronomers would agree. This whole paragraph seems a bit silly given that there is no real vision of the detailed implementation. The end bits about what it must do are sensible, though redundant with the next section (see below).

The last item "The pipeline should be run either at or near the telescope" doesn't seem relevant.
Perhaps what is meant is that "The pipeline must have the capability of being run at the site, San Pedro operations center, remotely, and in real-time or at a later date." It will have to do all of the above.

Section 6.3 outlining functionalities seems somewhat redundant with parts of 6.2, as mentioned previously. Section 6.3.4 is different, though vague. The primary goal of the pipeline is to feed back relevant diagnostics of the data quality vis-a-vis the proposal goals. It seems unlikely that we will want to "allow several algorithms to compete"; rather, we will perhaps need a suite of diagnostic algorithms (both imaging and visibility based) to tell different things about the data. The agreement or lack thereof between the different methods will provide information on a combination of data quality and image properties. The "complexity" of the image may well not be known beforehand, and thus should be one of the diagnostics output by the pipeline. Zero spacings (fictitious in any event) need not and should not be included, as this can be dealt with in the diagnosis, but short spacings provided by ALMA total power data or mosaicing could be. I see no good way to uniformly incorporate non-ALMA short spacing data, though maybe provisions could be made for their inclusion in the archive, and thus availability to the pipeline, as user-provided images. I worry about basing diagnostics on data not under our control.

In Section 6.4 on users the phrase "go beyond the informations" should be "go beyond the information". Also, "Fully or partially interactive observations should be justified during the proposal stage, with permission granted by the proposal review committee, before interrupting the normal dynamic schedule flow." might be more to the point. However, realtime monitoring of observations might be accommodated if the bandwidth to the site is high enough. Do we wish to encourage or discourage this?

Section 7 - Archiving
---------------------

Some typos in the first paragraph: "inquires" should be "inquiries" and "spectra line" -> "spectral line".

It is unclear that we will be able to afford to store both corrected and uncorrected data in all cases (especially the high data rate modes). Although it is desirable to keep the original data in cases of "irreversible on-line corrections" it should not be necessary after the initial testing and debugging stages. The goal in the end should be, however, to have a robust phase-correction system where the corrections can be applied safely in most cases (and thus if only a small number of observations are rendered unusable due to faulty irreversible modifications they can be dealt with through reobservation). I think it would be a poor use of resources to build in a factor of two storage and data rate redundancy (and this would be at the fastest speed before accumulation). On the other hand, it may be that at the slower data rates (longer integrations) where there is some excess capacity that there is some benefit in recording both uncorrected and corrected data streams. However, I would guess that in normal operation where correction is important the uncorrected data is useless to begin with, and in marginal cases the correction will not be making a difference anyway. Note that during testing it would be advantageous to be able to record both as a check on the correction algorithm.

The long paragraph on object oriented post-processing might well be applied to the system software also (as I advocate). One of the innovations of the ALMA system will be the integration of all facets of the proposal, scheduling, observing, pipeline, post-processing and archiving processes into a (hopefully) seamless whole, and reusable and multi-purpose (as well as single-purpose) objects are an elegant solution. This paragraph could appear at an earlier place to stress this.

We will likely need some redundancy in the Chilean archives (are site and San Pedro sufficient?) as well as regular updating of the main European, American, and Japanese archives.

Appendix A - Script Language
----------------------------

No comments here (I read this through earlier when it was proposed). It seems reasonable, but until a detailed system design is undertaken it is not clear whether this retro-styled script language is appropriate (my guess is "no").

Upshot
------

There are several "actions" (decisions) that should be taken as a result of this document before progress can be made. These seem to be:

1. Once shakedown of ALMA has been carried out and standard observations commenced, should Dynamic Scheduling be the primary (enforced) mode of operation, with interruption for interactive user observations allowed only for cases of documented clear and present scientific need? Or should some interactive observing by "expert" observers be allowed in a more lenient fashion? I would argue for the former over the latter. If dynamic scheduling will be the overwhelming default, then this will drive the integrated system software to handle this fairly early on (eg. right away). Adoption of a policy here will probably need approval or at least endorsement by the ASAC and ACC.

2. Should the proposing, scheduling, observing, monitoring, control, pipeline analysis, post-processing and archiving be considered as parts of a (mostly uniform) whole, based on a particular language? How much compartmentalization and non-standardization can be allowed while still having the desired functionality? Which existing packages and platforms (if any) could fulfill this? My feeling is that we will need some decisions on standards, and fairly soon (this year). My overwhelming impression after reading this document is that the only way a working system will be developed is by basing all parts on a common software base.
On the other hand, is the best we can expect a general agreement between the different programming groups on a set of compatibility specifications?

3. Given the above, what are the tradeoffs in language (eg. human friendly versus programming versus utility) specs? What style of GUIs should be adopted (eg. control panel style versus IDL style)? The "look and feel" of the user interfaces will be critical to allowing the wide community to use ALMA without unnecessary hardship. On the other hand, we as "expert" astronomers (and certainly this document as it stands) tend to "dumb down" the average user and require over-simplified specs at the cost of having a fully flexible and powerful system. Some balance must be struck.

4. One need for the system is a vocabulary to describe the scientific goals of a proposed observing program in a quantitative way that can be easily translated into technical requirements (eg. sensitivities). Does this require a "language" or a simple set of parameters? Should this be assigned to a working group?

Are there other critical things to be decided asap? Should working groups be assigned to specific things now?

-------------------------------------------------------------------------------