Collected Comments on Pipeline/Offline Requirements Document V2.0 (and some before 21Jun01 on v1.x). This document is online at http://www.aoc.nrao.edu/~smyers/alma/offline-req/olr-v2.0-comments.txt ------------------------------------------------------------------------------- 1-May-2001: V1.2 ------------------------------------------------------------------------------- From bclark@aoc.nrao.edu Fri Jul 13 10:26:20 2001 Date: Thu, 14 Jun 2001 12:48:49 -0600 (MDT) From: Barry Clark To: bclark@cv3.cv.nrao.edu, bglenden@cv3.cv.nrao.edu, gueth@iram.fr, morita@nro.nao.ac.jp, momose@mito.ipc.ibaraki.ac.jp, lucas@iram.fr, schilke@mpifr-bonn.mpg.de, tatematsu@nro.nao.ac.jp, smyers@cv3.cv.nrao.edu Cc: bclark@zia.aoc.NRAO.EDU Subject: Re: f.y.i. - an integrated version of the Pipeline/Offline doc The Schilke & Guth document seems to have a very different mental image about how things work than I have. My mental image is that on the completion of a scan, something examines whether it was a calibration scan, and if so, invokes one or more of a number of scripts (aka pipelines) which reduce the observation, insert the results into a calibration archive, and optionally alert the sequencer of their existence. The imaging or quick-look pipelines are invoked at more stately intervals, and both go through a (probably identical) stage of extracting data from the calibration archive and constructing the detailed gain tables to use to make a first image. Schilke & Guth seem to want to do the gain table construction on a scan-by-scan basis. This is not an unreasonable way of doing things (though one may want to go back and remake the whole lot after the last polarization and flux calibration observations), but I do think that making the gain tables should be clearly separated from the reduction of the calibrator observations, to make management of the priorities reasonably clean - calibrator observations *must* be reduced immediately, whereas if gain tables don't get made for a while, it's no big deal. If we go this route, we need yet another name for a pipeline, to separate them. The calibration scripts (aka pipelines) I listed in my E-Mail some months ago were: calibrateTsys (loadswitched data) calibrateSidebandRatio calibrateFlux calibrateBandpass calibratePhase calibratePointing calibrateFocus To which I would add something to accumulate polarization calibration information; the complete reduction is not possible until all observations are in. ------------------------------------------------------------------------------- From gueth@iram.fr Fri Jul 13 10:23:05 2001 Date: Sun, 17 Jun 2001 15:10:50 +0200 From: Frederic Gueth To: K. Tatematsu Cc: Steven T. Myers , Barry Clark , Brian Glendenning , Koh-Ichiro MORITA , momose@mito.ipc.ibaraki.ac.jp, Robert Lucas , schilke@mpifr-bonn.mpg.de, tatematsu@nro.nao.ac.jp Subject: Re: another draft v1.4 "K. Tatematsu" wrote: > > Dear all, > > Thanks for your temporary summary work, Steve. > Some more input... > > Cheers, > Ken > > At 21:54 01/06/14 -0600, Steven T. Myers wrote: > >Section 2: Pipeline Data Processing Requirements > > > > 2.2 Single-Dish data > > -------------------- > > > > 2.2-R1 The Calibration Pipeline shall reduce the atmospheric > > calibration, and pass the results to the dynamic Scheduler. > > > > 2.2-R2 For all observations of an astronomical source, the Calibration > > Pipeline shall apply the atmospheric calibration to the data. 
> > > > 2.2-R3 The Calibration Pipeline shall reduce and pass the results to > > the Sequencer: > > 2.2-R4 For the pointing and focus measuremets, the fitting results > should be automatically stored in the telescope > parameter file if the fitting error is less than > the user/observatory specified value. If the error > is not less than the specified value, > the pipline will send a message to the alarm system. This is typically what we meant by writting that the quick-look pipeline shall be able to detect any bad data and give an alarm if necessary. Whether it's a job for the calibration or the quick-look pipeline is another question. In the current document, the idea is that the calibration pipeline is reducing the data, and the quick-look pipeline is taking the results to do plots, images, and alarms. But the alarms shall be detected and signaled as fast as possible. Frederic. ------------------------------------------------------------------------------- From gueth@iram.fr Fri Jul 13 10:24:29 2001 Date: Sun, 17 Jun 2001 15:09:16 +0200 From: Frederic Gueth To: Barry Clark Cc: bclark@cv3.cv.nrao.edu, bglenden@cv3.cv.nrao.edu, morita@nro.nao.ac.jp, momose@mito.ipc.ibaraki.ac.jp, lucas@iram.fr, schilke@mpifr-bonn.mpg.de, tatematsu@nro.nao.ac.jp, smyers@cv3.cv.nrao.edu, bclark@zia.aoc.NRAO.EDU Subject: Re: f.y.i. - an integrated version of the Pipeline/Offline doc Barry Clark wrote: > > The Schilke & Guth document seems to have a very different mental image > about how things work than I have. > > My mental image is that on the completion of a scan, something examines > whether it was a calibration scan, and if so, invokes one or more of a > number of scripts (aka pipelines) which reduce the observation, insert > the results into a calibration archive, and optionally alert the sequencer > of their existence. The imaging or quick-look pipelines are invoked at > more stately intervals, and both go through a (probably identical) stage > of extracting data from the calibration archive and constructing the > detailed gain tables to use to make a first image. Schilke & Guth > seem to want to do the gain table construction on a scan-by-scan basis. > This is not an unreasonable way of doing things (though one may want to > go back and remake the whole lot after the last polarization and flux > calibration observations), but I do think that making the gain tables > should be clearly separated from the reduction of the calibrator observations, > to make management of the priorities reasonably clean - calibrator observations > *must* be reduced immediately, whereas if gain tables don't get made for > a while, it's no big deal. If we go this route, we need yet another name > for a pipeline, to separate them. > > The calibration scripts (aka pipelines) I listed in my E-Mail some months > ago were: > calibrateTsys (loadswitched data) > calibrateSidebandRatio > calibrateFlux > calibrateBandpass > calibratePhase > calibratePointing > calibrateFocus > To which I would add something to accumulate polarization calibration > information; the complete reduction is not possible until all observations > are in. I think that there are three kinds of calibrations that could be handled by a "calibration pipeline": - The instrumental calibration: pointing, focus, delay, baseline, etc. What is required here is a fast feedback to the control software. 
- The calibrations that do not require a time interpolation, as the atmospheric or bandpass calibration: each time such a scan is observed, something has to be derived and then stored, to be applied to all the following observations, until a new calibration of that kind is observed. - The calibrations that require a time interpolation, ie the phase and amplitude calibration: a calibration curve has to be fitted using all available calibrations and then applied to all the source observations that were observed in between. The two first categories can easily be handled by a calibration pipeline. As for the third category, it is not yet clear to me which pipeline should do the job. In the document I sent a few days ago, all three pipelines are doing something in this area, and I agree it is not clear enough. I think that the science pipeline should do a clean job and derive the calibration curves using all data. But the calibration and quick-look pipelines should also do a similar calibration, to get an estimate of the phase rms and to produce quick images. Frederic. >>>SMyers: I will add the 3 categories to the beginning of the pipeline section.<<< ------------------------------------------------------------------------------- 21-June-2001: V2.0 ------------------------------------------------------------------------------- From bclark@aoc.nrao.edu Thu Jun 28 14:21:59 2001 Date: Thu, 21 Jun 2001 21:33:55 -0600 (MDT) From: Barry Clark Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements One case where AIPS and AIPS++ are sadly deficient is in polarization calibration. We need support for all the possibilities Steve brings up in his use cases, and, as far as I know, they aren't there. We need to specifically call them out to make sure they get there. ------------------------------------------------------------------------------- From tcornwel@cv3.cv.nrao.edu Thu Jun 28 14:22:17 2001 Date: Fri, 22 Jun 2001 08:23:09 -0600 From: Tim Cornwell Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Cc: Tim Cornwell , Athol Kemball Subject: RE: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements Barry Clark wrote: > > One case where AIPS and AIPS++ are sadly deficient is in polarization > calibration. We need support for all the possibilities Steve brings up > in his use cases, and, as far as I know, they aren't there. We need to > specifically call them out to make sure they get there. To put it mildly, I'm surprised at this statement. The polarization capabilities in AIPS++ are quite different from those in AIPS, and as far as I can see, support the possibilities outlined in Steve's uses cases. - The formulation used is that of Hamaker-Bregman-Sault. The development of this formalism by AIPS++ is outlined in a series of AIPS++ notes, particularly 182 onwards: http://aips2.nrao.edu/daily/docs/notes/notes/notes.html - A basic description of the overall calibration system is available at the ADASS VI proceedings. http://www.cv.nrao.edu/adass/adassVI/cornwellt.html Another more recent reference is the chapter on synthesis calibration in the AIPS++ document "Getting Results in AIPS+" http://aips2.nrao.edu/daily/docs/gettingresults/gettingresults/gettingresults.html - Polarization processing is not a special case but is designed in from the beginning in data structures and algorithms. 
The data format is defined in: http://aips2.nrao.edu/daily/docs/notes/229/229.html

- The Jones matrix is the fundamental calibration term. The AIPS++ calibrater tool solves for and applies Jones matrices. All appropriate calibration terms are stored in Jones or Mueller matrix form. Interpolations are of these forms. The Jones matrices may be parametrized and solutions derived with respect to these parameters. For the format of calibration tables see: http://aips2.nrao.edu/daily/docs/notes/240/240.html

- The formalism is independent of polarization type (R,L,X,Y,elliptical) and can work with circular, linear, or any mix (though we have not tried this latter possibility).

- Fully correct (non-linear) D-term solutions (time-variable or fixed) are of course available.

- Models of the sky can be either via polarized components or via polarized images. (Parenthetically, the image plane polarization analysis procedures in AIPS++ are excellent: see the Image Analysis chapter in Getting Results.)

- The formalism and implementation explicitly allow for correction of polarized primary beams: for example, the mosaicing software, as implemented now, can correct for the R-L beam squint of the VLA antennas.

- Solution for (polarized) primary beams is accommodated in the formalism and implementation, but we have not yet pursued the difficult algorithmic problem of solving for a parametrized primary beam.

- Complex polarization dirty images (i.e. the XX, XY, YX, YY images at WSRT) can be made.

If members of this committee wish to know more about the capabilities of AIPS++, I'd recommend browsing Getting Results. I'm also willing to answer questions, of course.

Tim Cornwell
AIPS++ Project Manager

-------------------------------------------------------------------------------

From Wim.Brouw@atnf.csiro.au Thu Jun 28 14:23:48 2001
Date: Sun, 24 Jun 2001 12:37:55 +1000
From: Wim Brouw
Reply-To: alma-sw-ssr@cv3.cv.nrao.edu
To: alma-sw-ssr@cv3.cv.nrao.edu
Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements

I had a quick look at the Pipeline document. I hope that comments from outside the SSR are allowed. The comments are not exhaustive, but just on some points that caught my eye while reading.

At Thu, 21 Jun 2001 17:09:33 -0600 (MDT) "Steven T. Myers" wrote:
...
> We distinguish three different pipelines, the Calibration, the Quick-Look,
> and the Science Pipeline. The Calibration pipeline is intended for
> processing of array calibration data, usually on short turnaround
> time-scales, with feedback to the online system and into the archive.
> The Quick-Look pipeline has the job of providing quasi-realtime
> (~minutes) or short turnaround-time data-quality assessment for
> feedback to the online system and to the observers, and possibly
> output to the archive. The Science pipeline is the primary data path
> from the array to the archive and to the observer, usually operating
> on longer timescales to produce results after breakpoints and after
> completion of projects. [We should put a reference here to other
> documents describing this]

Essential, especially since here the pipelines are described as more or less independent entities; but later on e.g. the Science pipeline uses the calibration pipeline to derive, among others, bandpass calibrations (should they not be part of the ALMA derived calibration data?).

...
> 1.0-R2 All corrections applied shall be recorded so that any step can be
>        reversed and redone if needed.

Recording of corrections is not sufficient: not all corrections can e.g. be applied commutatively. Hence in addition to the corrections, the model used (and its history) should also be recorded. Would it not be much better to use the normally used scheme of never changing the input data, but apply corrections 'on-the-fly' in one form or another (with maybe some intermediate dataset if and when necessary)? This would also solve the big problem of how to cater for the interrelation between, say, data flags and corrections (not) applied.

>>>SMyers: This is an implementation issue. The requirement should be the general availability of "undo" or "redo" without reloading of data or undue loss of previously done steps. This req should be rephrased to reflect this. <<<

...
> 2.0-R1 The Calibration Pipeline shall be activated after each scan has
>        been observed.
>
> 2.0-R2 The Calibration Pipeline may also be re-invoked at any time with
>        updated parameters or improved data. The results should not
>        immediately overwrite old results so comparison is possible
>        before adopting the new calibration. There will need to
>        be a method for validation and acceptance of calibration
>        updates.

Why is there the logical difference between R1 and R2? Or should "The results..." be a separate R3? Also, why are parameters necessary for R2, but not for R1?

>>>SMyers: R2 was my addition. The Calibration pipeline will necessarily be activated when a calibration scan is observed. It may also be activated by staff later on (perhaps with an updated calibration script or tool) to improve the results from a previous run. Some mechanism must be there to have some validation and acceptance of the new results. R2 was perhaps too detailed. <<<

> 2.1 Interferometric data
> ------------------------
>
> 2.1-R1 The Calibration Pipeline shall reduce, and store the following
>        results for further use:
>
>        R1.1 the receiver sideband ratio calibration
>        R1.2 the atmospheric calibration
>
>        The results of the atmospheric calibration shall be passed
>        to or made available for access by the Dynamic Scheduler
>        (in real-time mode).

Is atmospheric calibration WVR? If so, is deriving the atmospheric corrections from WVR not something much more tightly coupled to the WVR and related hardware? I could easily imagine also that the timescale for deriving this data is different from that of the actual output data time scale. That means that the correction cannot be undone, and the only way is to record both corrected and uncorrected data (not a choice, since later in the science pipeline you say: "use corrected data...").

> 2.1-R2 For all observations of an astronomical source, the Calibration
>        Pipeline shall:
>
>        R2.1 apply the atmospheric calibration to the data
>        R2.2 store the phase corrected from the atmospheric effect, if
>             required

Why is only phase mentioned? Is it better to talk about data (meaning complex data) throughout when appropriate?

...
> 2.1-R3 For all observations of a calibrator source, the
>        Calibration Pipeline shall:
>
>        R3.1 compute the phase rms on the scan timescale
>        R3.2 compute the antenna efficiencies, using the averaged
>             amplitudes
>        R3.3 do the previous operations both with and without the
>             atmospheric phase correction, and deduce from the
>             comparison whether the atmospheric phase correction
>             improves the results or not

I read this as being on a per-baseline basis (since 'observations' are outputs of calibrators).
Would it not be much better to do this on a per-telescope basis: can find any wrong atmospheric correction; can try closure phases (and get a really good idea about errors); can use more elaborate models for the field under scrutiny.

>        R3.4 derive amplitude and phase time-dependent variations by
>             fitting smoothed curves (e.g. polynomials, splines)
>             using all observations of calibrators since the beginning
>             of the session

Are you averaging over data or over amplitude/phase separately; and why?

...
> 2.1-R4 The Calibration Pipeline shall reduce the following
>        observations:
>
>        R4.1 pointing scans (results to be passed to the Sequencer)
>        R4.2 focus measurements (results to be passed to the Sequencer)
>        R4.3 delay calibration (results to be passed to the Sequencer)
>        R4.4 bandpass calibration
>        R4.5 baseline calibration
>        R4.6 holography measurement

Very dangerous to leave this not open-ended (there is e.g. not a single receiver calibration in the list, which could easily have to be done, especially at higher frequencies; or polarization -- which always has to be done). Deciding now already on which goes to the sequencer and which not is a dangerous business for a project of >10 years. Would it not be much better to design a general calibration data interface that can be used for now and later? (More than one of course, depending on the type of correction, but fewer than the total number of corrections.) Also, no mention at all is made of any re-use of data for various calibration schemes (expensive photons), or, even more important in my opinion, of coupled (or iterative) solutions (how to get polarization leakage without gain calibration simultaneously; maybe focus and pointing are correlated; can bandpass be looked at without looking at gain/phase and delay errors?). Look at packages like Newstar, Miriad and the resulting AIPS++ one, which can cater for all this (basically by starting from an appropriate telescope/platform model -- the measurement equation).

...
> 2.2-R5 The calibration pipeline shall derive the
>        half-power beam size, the main-beam
>        efficiency, and the Moon (fss) efficiency from the calibration
>        scans towards planets and the Moon, and store
>        the successful results in the telescope parameter file.
>
>        Another derived parameter is the total forward efficiency
>        obtained from skydip measurements.

and another... and another...: too detailed (cannot be complete at this stage)

> 3.0 Quick Look pipeline
> -----------------------
>
> 3.0-R1 The Quick Look pipeline shall be activated after the Calibration
>        Pipeline has been completed.
>
> 3.0-R2 A Monitoring Tool shall be available, plotting and archiving in a
>        log file various results of the Calibration Pipeline:
>
>        R2.1 the results of the last pointing or focus scan
>        R2.2 the phase rms computed over the last scan and computed over the
>             current session
>        R2.3 the corresponding seeing
>        R2.4 the atmospheric opacity
>        ...
>
>        This tool shall include a variety of options, to control the plot
>        parameters, to plot the variation of these results with time, to
>        allow the operator to monitor one antenna or baseline in
>        particular, etc.

It seems a bad idea to build your own tool. There are many commercial packages available (to mind comes the HP industrial monitoring package) that can do all you want here. Other telescopes must also have packages that are diverse enough to enable this.

>>>SMyers: That is not TBD here. All that matters is that such a tool is available. <<<

...
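As an illustration of the smoothed-curve fit in 2.1-R3.4 (and of the averaging question above), here is a minimal Python sketch of one possible choice: fit amplitude and unwrapped phase separately, since a spline through wrapped phases would chase the 2*pi jumps, while vector-averaging the complex gains would bias the amplitudes low whenever the phase scatter is large. The function and variable names are illustrative only and are not part of the draft.

    import numpy as np
    from scipy.interpolate import UnivariateSpline

    def fit_gain_curve(t_cal, gain_cal, t_src, smooth=None):
        """Fit smooth curves to the calibrator gains (one antenna or baseline,
        one band) and evaluate them at the source integration times.
        t_cal must be increasing; amplitude and phase are fitted separately,
        with the phase unwrapped first."""
        amp = np.abs(gain_cal)
        phase = np.unwrap(np.angle(gain_cal))
        amp_curve = UnivariateSpline(t_cal, amp, s=smooth)
        phase_curve = UnivariateSpline(t_cal, phase, s=smooth)
        # interpolated complex gain to apply at each source integration time
        return amp_curve(t_src) * np.exp(1j * phase_curve(t_src))

Whether the fit is done per baseline or per antenna (as suggested above) only changes where gain_cal comes from; the interpolation step itself is the same.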
> 3.0-R3 A Monitoring Tool shall be available to plot the current properties > of the array, such as: > > R3.1 the current instantaneous uv coverage This ms? s? scan? observation? > R3.2 the corresponding weight distribution ? per UV cell? how define UV cell? > R3.3 the corresponding dirty beam > R3.4 the previous quantities, integrated since the beginning of the > session > R3.5 the thermal noise rms reached since the beginning of the session For what? Baseline? antenna? set of baselines? > ... > > 3.0-R4 Single-Dish data: the current spectra observed on the astronomical > target shall be corrected from the emission at a reference position > or frequency (depending on the observing mode), and displayed with > various options: > > R4.1 time integration > R4.2 antenna summation > R4.3 baseline fit, excluding a pre-defined window, or a window > defined by the Operator or AoD Why not auto windowing? especially with robust fitting techniques this should be possible in 99% of cases > R4.4 spectra on a pseudo-grid corresponding to position on a raster > (a "stamp" or "profile" plot) > > 3.0-R5 Interferometric data: the visibilities observed on a target source > shall be calibrated, using the results of the Calibration Pipeline: > > R5.1 apply the current bandpass calibration How, if that is calculated in science pipeine later? > R5.2 apply the current amplitude and phase correction I would make this 'corrections' (and 'complex gain corrections') > R5.3 apply the flux conversion factor based on standard antenna > efficiencies > > 3.0-R6 Interferometric data: the current spectra observed on the > astronomical target shall be displayed (amplitude and phase) with > various options: > > R6.1 time integration Over complex data or ampl/phase separately (and why) > R6.2 choice of the baseline(s) with 2000 baselines need them probably ordered per baseline length and binned > to be able to say something sensible; probably even a kind of percentile > coloring or so to highlight problem points > R6.3 baselines summation over baseline or time (how averaged) > R6.4 intensity (amp or phase) as function of baseline and time > (for a frequency), or time and frequency ( for a baseline ) > > 3.0-R7 Interferometric data: the Quick Look Pipeline shall compute the > Fourier Transform of the visibilities, using the fastest algorithm, > and display the resulting image. Alternatively, the actual Fourier > Transform of each new visibility point can be computed and added to > the current image. This shall be done for: > The last is better option in general on-line. > R7.1 the continuum data > R7.2 the line-averaged spectra, over a pre-defined velocity range, > or possibly a velocity range defined by the Operator/AoD Again, I would use auto line detection windows to average (even if this means > say a delay of half a scan or so) > > 4.0 Science Pipeline > -------------------- > > 4.0-R1 The Science Pipeline shall be activated after completion of a > session. > > 4.0-R2 The Science Pipeline shall find in the Archive all data observed I though scoence pipeline produced archive? >>>SMyers: ALMA archives raw (and/or online-corrected) data to the pipeline. The Science pipeline both inputs from and outputs to the archive. <<< > during the session. It shall use the atmospheric-calibrated data > (amplitude and phase). 
What if the observer selected raw data? (This option is mentioned in other documents; I do not agree with it, but it is stated.)

> 4.1 Interferometric data
> ------------------------
>
> 4.1-R1 The Science Pipeline shall use the calibrator to derive:

This only looks at a single type of observing mode by mentioning 'the calibrator shall ...'. There are other observing modes that could be used (a simple one is some monitoring observation that uses the ALMA-determined calibration parameters straight off; also, somewhere else the calibration object is described as being able to determine if and when a calibration shall be done). I think R1 is not correct here.

>        R1.1 the bandpass calibration
>        R1.2 the best phase and amplitude solution
>
> 4.1-R2 The Science Pipeline shall calibrate the source observations by
>        applying:

... by applying either the best set of corrections available, or a user-selectable set (and drop the next 3)

>        R2.1 the bandpass calibration
>        R2.2 the phase calibration
>        R2.3 the amplitude calibration

...
> 4.1-R6 Special cases shall be supported, including:
>
>        R6.1 mosaic observations
>        R6.2 on-the-fly mosaics
>        R6.3 self calibration projects
>        R6.4 combination of single-dish + ALMA data (+ACA)
>
>        Comment: Careful cross calibration of the flux scales between
>        ALMA interferometric data and single dish data ( and ACA )
>        is required for high fidelity imaging. This will require
>        careful coordination with the calibration pipeline, especially
>        as ACA observations may be taken at very different times than
>        the main array data.

It is more than just cross-calibration to get to high dynamic range. I would think that coupled self-calibration could be required (or ...).

> 4.1-R7 Subtraction of continuum level from spectral data is
>        required. This can be done in both Fourier and image
>        domain. In the case of uv-plane subtraction, flexible
>        setting of the frequency channel ranges for the calculation
>        of the continuum level should be available.

Why flexible in the UV domain (or why only in the UV domain)?

...
> 5.0 Interface with the Archive --- TO BE DETAILED
> ------------------------------
>
> 5.0-R1 The images produced by the Science Pipeline shall be archived,
>        together with the
>
>        R1.1 the script that was used to produce the image

A script is not sufficient; you have to know the version of the components used in the script (a script is only the 'glue' between 'objects' (or so)).

>        R1.2 the log file of the software

I suppose this is the output log of this run, not the revision log?

>>>SMyers: Yes, that was my intention here. Should rephrase.<<<

>
> 5.0-R2 cf 7.0-R3 general SSR document
>
> 5.0-R3 Also to be archived:
>
>        R3.1 data quality control:
>
>             R3.1.1 estimate of the noise
>             R3.1.2 seeing
>             R3.1.3 image fidelity based on model?
>
>        R3.2 observation quality control:
>             R3.2.1 baseline quality
>             R3.2.2 calibration quality
>
>        R3.3 telescope state: (possibly in monitor file, but accessible)
>             R3.3.1 telescope pointing
>             R3.3.2 subreflector focus
>             R3.3.3 monitor point (e.g. temperatures) data
>
> Appendix: Barry Clark's list of input parameters needed for each procedure
> ---------------------------------------------------------------------------
>
> Where should we really put these? I guess an appendix to this section is
> fine.

I do not know the purpose of this list. If it is an indication of what kind of parameter data is at least needed for certain operations, I can understand that.
I would in general prefer to see a more encapsulated description, with a number of parameter objects (e.g. a DeconvolutionParameterObject, a CalibrationParameterObject, ...). In that way:
- easily re-used in various 'scripts'
- easily moved as messages between methods (and updated)
- easy to cater for coupling between parameters, and calculation of many based on one input
- non-varying interface possible

>>>SMyers: Question still stands - should we keep this appendix here? My inclination is to leave it out or move to an appendix at the end of the entire document.<<<

...
> 1.0 General Requirements and Interaction with other ALMA elements
> 1.1 Goals of the Offline Package
> 1.1-R1 An ALMA Offline Data Reduction Package (or "the package")
>        is primarily intended to enable end-users of ALMA (e.g.
>        observers or archive users) to produce scientifically
>        viable results that involve ALMA data products. The secondary
>        use is to enable ALMA staff to assess the state of the
>        array and derive calibration parameters for the system.
> 1.1-R2 The package should be able to function (be installed) at
>        the user's home institution, in addition to operating at

The 'in addition ...' seems superfluous (it also gives the wrong impression of priority to end users).

>>>SMyers: 'and' is better<<<

>        ALMA regional centers (both locally and remotely). It should
>        be portable to a reasonable number of supported platforms,
>        including laptops without network connections.
> 1.1-R3 The performance of the package should be quantifiable and
>        commensurate with the data processing requirements of
>        ALMA output at a given time. This should be benchmarked
>        (e.g. "AIPSmarks") and reproduce accurately results for
>        a fiducial set of reduction tasks.
> 1.1-R4 The offline data reduction package should not suck.

suck???

> 1.2 Relation to the Pipeline
> 1.2-R1 All modules available in the pipeline must be available also
>        as an offline analysis option. Note that not all offline
>        analysis tools will be in the pipeline package.

Should be stronger. As time progresses, most of the offline stuff has to be done in the pipeline as well (at least if some degree of on-line reduction and assessment is to be done): the requirements for front-line science, after creaming off the first results, will be high sensitivity and dynamic range.

>>>SMyers: Without suggested text I have no idea of what Wim is looking for.<<<

>>>SMyers: Upon further reflection, I don't think it will be fruitful for us to over-specify at this time what will go into the archive via the Science Pipeline. It is likely that the be-all and end-all of images won't be within our capability, and it may be better to allow "Archival" programs by users (like HST) to do real science on the archive and not do that ourselves. I am inclined to leave this as it stands.<<<

> 1.2-R2 One of the important differences between pipeline and
>        offline reduction path is that offline one should have
>        extensive capabilities to merge and compare data with different
>        resolution, coordinate system, data grid, and so on.

It seems to me wrong to exclude that from the outset for the science pipeline ("which has available all archived data ..."), e.g. single dish data.

>>>SMyers: It is not the intention here to restrict the pipeline. Rephrase as 'may include extensive' etc., with R2 starting with the "Note that..." from R1.<<<

...
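To make the parameter-object suggestion above concrete, here is a minimal sketch (hypothetical names and formulas, not part of the draft) of a parameter bundle whose coupled members are derived from a single set of inputs and which can be passed between scripts unchanged:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class DeconvolutionParameters:
        """One bundle of deconvolution inputs that scripts can pass around,
        log, or archive alongside the resulting image."""
        niter: int = 1000
        gain: float = 0.1
        threshold_mjy: float = 0.5
        cellsize_arcsec: Optional[float] = None   # coupled: derived below
        imsize: Optional[int] = None              # unless set explicitly

        def fill_defaults(self, max_baseline_m, freq_ghz, primary_beam_arcsec):
            """Derive the unset, coupled parameters from a few observation values."""
            if self.cellsize_arcsec is None:
                lam_m = 0.2998 / freq_ghz                     # wavelength in metres
                beam_arcsec = lam_m / max_baseline_m * 206265.0
                self.cellsize_arcsec = beam_arcsec / 3.0      # ~3 pixels per beam
            if self.imsize is None:
                self.imsize = int(2.0 * primary_beam_arcsec / self.cellsize_arcsec)
            return self

    # e.g. pars = DeconvolutionParameters(niter=5000).fill_defaults(3000.0, 230.0, 27.0)

Such an object gives the non-varying interface asked for above: new parameters can be added with defaults without breaking existing scripts.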
> 2.4 Interface programming, parameter passing and feedback > 2.4-R1 Must have basic programming facilities such as: > > R1.1 variable assigment and evaluation > R1.2 conditional statements > R1.3 control loops > R1.4 string manipulation > R1.5 user-defined functions and procedures > R1.6 standard mathematical operations > Important aspects missed I think (why go in so much detail): vector handling; complex numbers; DO standard calls, persistent objects, ... >>>SMyers: Im not sure what the latter two are and I doubt users care. I think we need examples here to show the intent. However, if we cannot define something like a minimal set, only a representative list, is this still useful?<<< > 2.4-R2 Commands executed should be logged, with provision to > re-execute the session. > 2.4-R3 Input parameter checking upon parsing with reporting of > incorrect, suspicious or dangerous choices should be > done before execution where possible. > 2.4-R4 Parameters should be passable between applications in as > transparent a manner as possible. However, global parameters > should not be the default, unless chosen specifically by the > user-programmer. A long-term system will have a large multitude of parameters (including things like the distance to the SUN etc). If these parameters are what is called 'global' here, I disagree. Many parameters will not change for years on end, and parameters will be added constantly. For the average user these parameters are of no interest (if he/she would know what they are anyway). They are basically 'hidden' parameters in the sense of the next paragraph. These parameters should have a global value (default) value if you do not want to drive any observer crazy. >>>SMyers: No, thats not what I meant. I meant that if you define some variable such as, heaven forbid, APARM(1), it wont persist across functions by default! Specific "parameters" like your system-wide ones would be specifically designated as global (with some protection?). I guess I mean "variables". <<< ... > 3.2 Data import and export > 3.2-R1 The FITS/UVFITS data format and/or other commonly supported > standards must be supported for both input and output > without loss of functionality or information, though > need not be the native format for both the package and archive. By the time ALMA is operational, XDF, FITSML, etc will, I think, be the standard data exchange types, and preferable to non-described formats like UVFITS. You should be able to read 'old' formats (why not mention HDL etc as well? , but preferably produce new formats. >>>SMyers: I have no idea what these are. I guess I had no real idea that UVFITS was anything other than a flavor of FITS. I will mention only FITS. Someone else who knows more about this should craft the requirement. What flavor is IDI for example? I will add something about the project specifying which formats will be supported.<<< > 3.2-R2 Access to the archive must be supported, including for data > from the currently active observing session. Security and > integrity of the archive must be ensured during these > operations. > 3.2-R3 Disk and offline data storage (eg. DAT, DDS, DLT) must be > supported. Again dangerous to mention types (no CDrom and DVDs are mebntioned!). Why not drop this one R3, and make it: - Internet and offline import/export media for the major systems used in the partner countries m ust be supported. >>>SMyers: Again examples are useful, and I doubt that CDrom or DVD will be of sufficient storage volume to be useful. 
The project must decide on supported media.<<<

> 3.2-R4 The ability to ignore flagged data on export should be
>        included.

??? Ignore the flags, or skip the data (a bad idea I think)?

>>>SMyers: I mean "drop flagged data" instead of propagating flags. In general not a good idea but often useful when careful.<<<

> 3.3 I/O speed and efficiency
> 3.3-R1 I/O of data must not be a bottleneck for processing, especially
>        for pipeline use. This is especially true if the native format
>        of the package is not used and filling/conversion is necessary.
>        The definition of what constitutes a "bottleneck" and what
>        I/O throughput rate is acceptable must be defined at each stage
>        of ALMA operations (eg. interim science, full stand-alone ALMA,
>        ALMA + ACA) and in each mode (eg. quick-look pipeline, offline
>        use). For offline use, the intention is that users not be
>        faced with I/O operations that are way out of line with the
>        fastest equivalent times that could reasonably be achieved
>        with software development.

What is meant here?

>>>SMyers: We need to ensure that the speed of the package is up to some standard. I don't know any other way to word it. Maybe we should establish benchmarks for the package...<<<

> 3.4 ALMA interferometer data
> 3.4-R1 Correlation products accumulated at multiple bit depths
>        (16-bit, 32-bit) must be supported transparently
> 3.4-R2 On-line gain correction data must be carried along with
>        data
> 3.4-R3 Calibration tables and editing information must be associated
>        with the data and preserved on output
> 3.5 ALMA single dish and phased-array data

+ unphased array(?) - how are you going to phase?

>>>SMyers: this is done online (eg. for VLBI)<<<

Should it not be better to state:
- Data taken in any of the available ALMA hardware modes should be supported in the most appropriate manner (e.g., and give one example).
- Data from non-ALMA telescopes (single dish or array) should be usable if provided in a standard data exchange format.

>>>SMyers: I think it is better to delineate the modes we know about explicitly where possible. However a blanket requirement should be added as 3.1-R1. The second regarding foreign data is dealt with as 3.7-R1.<<<

And, in my view even more important, since we want to make sure that ALMA will be used to advance astronomy in general, and not as a tool for some mm-wave astronomy specialists:
- ALMA shall produce its observed data and processed data (images, spectra) in a world-wide accepted standard data exchange format, which can be accessed by general display and viewing packages (e.g. IRAF, AIPS++, ...).

>>>SMyers: this sounds more like a requirement on ALMA (the pipeline?) than on the offline package. I had thought about adding 3.1-R2 The package shall produce its observed data and processed data (images, spectra) in a world-wide accepted standard data exchange format, which can be accessed by general display and viewing packages (e.g. IRAF, AIPS++, ...). but have decided 3.2-R1 is sufficient for the offline package.<<<

...
> 4.0 Calibration and Editing
...
> 4.1-R3 Data display and editing should be effected through generic
>        tools applicable to both single-dish and interferometer modes.
>        These should, as far as possible, present similar interfaces
>        to the user and have the same look-and-feel.

There are generic tools available in packages and tools, which could easily be re-used.
In my view this re-use aspect (with similar look-and-feel across the whole astronomical observational spectrum) is in the longer term much more important than a similar look and feel within a very small user community. If you want Difmap style, use Difmap; etc.

>>>SMyers: That sounds nice, but how does that apply to the offline package? And is this really important? Note that the only sure way to ensure uniformity across the spectrum is to have a single supplier or consortium, and I don't think the users would benefit from Microsoft-ish dominance either. <<<

...
> 4.2-R6 Determination of, correction for, and examination of closure
>        errors should be straightforward to carry out.

This belongs in the calibration pipeline.

...
> 5.1-R4 Astrometric accuracy must be preserved over phase-calibration
>        distances of a few degrees.

Statements like these should be quantified exactly or not have any numbers.

>>>SMyers: Nonsense. The intention is clear (or at least can be made clear) and would not benefit from saying "10 degrees". Perhaps "at least 5 degrees"? though that doesn't do it right either. The need is to have it deal with all reasonable switching distances.<<<

> 5.1-R5 Images made on different equinoxes (e.g. B1950 and J2000)
>        or different coordinate (RA,DEC and l,b) systems or
>        different projections (tangent, sinusoidal, ...) can be merged
>        and compared appropriately.
> 5.1-R6 Data cubes using different velocity definitions (optical or radio
>        definition for Doppler velocity) must be merged appropriately.

I think there is a mix-up between the labelling and the contents of axes. Correlators will produce data as a function of frequency only; why change that? The way data is labelled does not change the data. Why would you (for a single map) make a new datacube with equal spacing in some other coordinate?

>>>SMyers: actually the correlator will do so as a function of lag (at least for the baseline correlator). I'm not sure what is meant here (need a line pundit) but I think it has to do with translation of (possibly foreign) data.<<<

> 5.2 Interferometer imaging
> 5.2-R1 High-fidelity imaging of the entire primary beam in all
>        Stokes parameters is the primary goal - therefore,
>        incorporation of the polarized primary beam response of the
>        array is required.
> 5.2-R2 Imaging must deal seamlessly with mosaiced data, with proper
>        gridding in the uv-plane and compensation for primary beam
>        effects and pointing in such a manner as to mitigate the
>        effects of non-coplanar baselines and sky curvature. A
>        variety of options for gridding and beam correction should
>        be available at user request.
> 5.2-R3 There must be seamless integration of data from multiple
>        epochs and configurations
> 5.2-R4 There must be the ability to include short-spacing data
>        taken in single-dish mode (both ALMA and non-ALMA data)
> 5.2-R5 Subtraction of continuum level from spectral data is required.
>        This can be done in both the Fourier and image domain.
>        In the case of uv-plane subtraction, flexible setting of the
>        frequency channel ranges for the calculation of the continuum
>        level (graphically as well as CLI) should be available.

There must be the possibility to create 3D images for rotating objects (see e.g. Miriad).

>>>SMyers: I assume you mean like a planet? Add as 5.2-R6.<<<

...
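Regarding 5.1-R6 above and the question of labelling versus contents: the radio and optical Doppler conventions give different velocity spacings from the same frequency grid. A minimal illustration (the rest frequency is that of CO(1-0); the function names are illustrative only):

    C_KMS = 299792.458   # speed of light, km/s

    def v_radio(freq, rest_freq):
        """Radio convention: v = c * (f0 - f) / f0  (uniform steps in frequency)."""
        return C_KMS * (rest_freq - freq) / rest_freq

    def v_optical(freq, rest_freq):
        """Optical convention: v = c * (f0 - f) / f  (uniform steps in wavelength)."""
        return C_KMS * (rest_freq - freq) / freq

    # a channel 100 MHz below the CO(1-0) rest frequency of 115.2712 GHz
    f, f0 = 115.1712, 115.2712
    print(v_radio(f, f0), v_optical(f, f0))   # ~260.1 vs ~260.3 km/s

A regularly gridded frequency axis is regularly spaced in radio velocity but not in optical velocity, so merging cubes made with different definitions (presumably what 5.1-R6 is after) involves regridding the data, not just relabelling the axis.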
------------------------------------------------------------------------------- From twillis@drao.nrc.ca Thu Jun 28 14:36:05 2001 Date: Sat, 23 Jun 2001 21:32:28 -0700 (PDT) From: Tony Willis Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) > > 1.0-R2 All corrections applied shall be recorded so that any step can be > > reversed and redone if needed. > > > Recording of correctioposn not sufficient: not all corrections can e.g. be > applied commutative. Hence in addition to corrections also model used (and > history thereoff) should be recorded. Would it not be much better to use the > normally used scheme of never changing the input data, but apply corrections > 'on-the-fly' in one form or another (with maybe some intermediate dataset if > and when necessary). This would also solve the big problem on how to cater > for the interrelation between say data flags and corrections (not) applied. > I agree completely with Wim's comments here. Radio astronomy data reduction is not the same thing as word processing. Having to handle reversals and undo's etc could easily double the size of your system. If you don't like your result, just rerun the job with modifications to those input parameters that you think will lead to improvement. >>>SMyers: This is an implementation issue. It will probably be done (at least if aips++ or simiar is used) by successive application of tables. However, it would be bad if you had to save a copy of the entire dataset at a given state to get back to that state, that is if the tables arent sufficient. It this we need to require... Try to reword req 1.0-R2 to that effect.<<< > > 2.1-R5 User-understandable and non-destructive error handling at > > all levels is highly desirable. > > 2.1-R6 Multiple levels of "undo" should be supported for all tasks. Ditto here. >>>SMyers: it is the interpretation of the implementation of "undo" that is the problem. Perhaps "undo" conjures too specific a picture, reword...<<< Tony -- Tony Willis Internet : Tony.Willis@hia.nrc.ca Snailnet : Dominion Radio Astrophysical Observatory P.O. Box 248, Penticton, BC, Canada V2A 6K3 BC Tel net: (250) 493-2277 Faxnet : (250) 493-7767 voicemailnet: (250) 490-4343 Localnet : ext 343 ------------------------------------------------------------------------------- From tcornwel@cv3.cv.nrao.edu Thu Jun 28 14:36:53 2001 Date: Tue, 26 Jun 2001 11:09:33 -0600 From: Tim Cornwell Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: RE: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements I have some comments on this draft based on my relevant experience in a number of roles: - Someone involved in research on various types of calibration and imaging including mosaicing - As AIPS++ Project Manager since 1995 - As someone responsible for NRAO's end-to-end processing needs. 0. A point concerning scope. AIPS++ is about 150 FTE-years. AIPS is probably about the same. The ESO Data flow system is about 300 FTE-years (I believe). I would guess from some communications that for the items described in this requirements document, the ALMA computing division has between 40 and 60 FTE-years (depending on how one counts various things). I would counsel that you spend that effort wisely. I think the current draft overspends by a large factor. >>>SMyers: Not our problem, except in assigning priorities (many of the over-specified areas should be better in the next draft). 
If the package designers think that specific parts designated high-priority will be too expensive, then they should back that up with numbers and propose the relaxation of specific requirements to the project.<<< 1. A general comment is that data reduction splits into strategy and tactics. The tactics come from the basic physics but the strategy comes from experience. I think the document is mostly fine on tactics but is a little too specific about some strategies. The items on the calibration pipeline seem to me to fit in this category. For example, 2.1-R3 is a strategy that may or may not work in all situations. >>>SMyers: True, but its not clear we can write a pipeline document without giving at least some specific (example) operations. To be discussed in Berkeley. By the way, it think this general issue of specific vs. general is important to get sorted out early on (I would have preferred to do this before writing even this much).<<< 2. It's hard to know how to process data for a ground-breaking telescope like ALMA. I think one should be modest in setting forth too-definitive statements of how the processing should proceed. In this context, I think the tool-based approach using in AIPS++ is vital, and I would advocate including a statement aimed at this point. >>>SMyers: "Tool-based approach" is meaningless (except maybe to everyone but me). I think the requirements cover the building blocks of the aips++ approach, but if there is a specific statement you advocate inclusion of, we would consider it.<<< 3. I haven't followed your discussions in detail so I'm not at all sure what General Consideration B means. In what way is there a fundamental distinction? I could not see how this consideration affected the rest of the document. It's also a very dangerous point since in many operations, one obviously wants no distinction. >>>SMyers: Thats the point of B. There was a proposal to break the requirements down more by single-dish vs. interferometer than it is. This is an explanation of why. Note that most of these "general considerations" will be removed in the final doc. I just wanted somewhere to put some discussion of why the doc looks the way it does.<<< 4. There are some prescriptive implementation details that should be removed (e.g. 3.0-R7 "using the fastest algorithm", also the Appendix of Barry Clark's input parameters). >>>SMyers: At least in this stage of the document it is useful to have examples of implementations. We will have to delineate prescriptive and descriptive items if we wish to keep them in. As for 3.0-R7, Im not sure what the authors wanted here.<<< 5. I am surprised that the document has relatively few requirements that are operational in nature. For example: >>>SMyers: good idea. Some operational issues are located elsewhere, make a new OL-1.3 for this.<<< - Be installation-flexible: can be installed on non-specialized hardware by end user >>>SMyers: OL-1.3-R1 <<< - Processing script must be re-executable with only a small number of changes >>>SMyers: I dont know what this means. A script, once executable, should always be executable. Unless you mean under later versions of the software. I'll pretend this is what you mean as OL-1.3-R2 <<< - Process standard recurring observations and analyze according to standard recipes >>>SMyers: Do we really need to say that? 
At some level those are things we are specifying as "examples" of the operations in this document and for which we get criticized as being too naive!<<< - Provide real-time feedback via standard compact displays and plots >>>SMyers: add to GUI as OL-2.2-R1<<< - Be operable automatically or manually >>>SMyers: already under interface as OL-2.1-R1<<< - Allow preemption, termination, resubmission, etc. >>>SMyers: with proper error handling & recovery, OL-1.3-R3<<< 6. I found some of the discussion hard to understand. An example is 3.3-R1: Everything but the first sentence is unnecessary and detracts from the simplicity of the requirement. >>>SMyers: 3.3-R1 contains too much discussion, granted. We need to settle on a simple requirement text. Throw me a bone here...<<< 7. A major point that applies to all my remaining comments is that it's easy to write simple sounding requirements that either double, triple, etc the software costs or prevent any estimation at all. Wim and Tony pointed out that adding undo is one example. I'd also add a substantial number of others: "1-R1: The pipelines shall be able to process all data coming from the array." For all arrays that I know of, one can think of observations that "break the bank" of available computing. This must be true of ALMA as well. Do you really want to limit the array in this way or specify the pipeline so aggressively? "4.2-R1: The data taken on the astronomical source shall be reduced, depending on the observing mode. All possible modes shall be supported: R1.1 etc" I think only the enumerated modes should be supported. Only known things can be guaranteed to be supported. >>>SMyers: True. Reword. But we cannot enumerate all modes in this document - the project will need to maintain a list and part of the negotiation for the package will be to agree to a mechanism to deal with this evolving list.<<< "1.1-R4 The offline data reduction package should not suck" Harder than you would think. I think the software costs for this are unknown. >>>SMyers: The costs will likely suck also.<<< "2.3-R4 All functionality of the {CLI,GUI} must be supported in {GUI,CLI} mode" (Note some numbering problems here). This is much, much harder than you would think, and is a waste of resources, especially in the era when UIs are evolving so quickly. >>>SMyers: In principle, there should be underlying functionality that is accessible, with varying degrees of simplicity (eg. point-and-click to delete a point vs. specify a visibiltity in the CLI). The relation between the GUI and glish in aips++ is an example. I think this is important to maintain as much as possible (and has implications for pipelining).<<< "3.2-R1 The FITS/UVFITS data format....without loss of functionality or information" UVFITS will lose information. A dump of a data format to FITS binary tables is probably what is needed. >>>SMyers: UVFITS is gone.<<< "4.1-R1 The package must be able to reliably handle all of the proposed and future ALMA calibration modes" This (future modes) is of course impossible to guarantee, and bad practice to specify(!) >>>SMyers: reword.<<< "4.2-R9 Determination of polarization...." Why would you want linearized solutions, except to save time? If so, say that it's allowed to save time. >>>SMyers: if its allowed, its allowed. 
By the way, this is a "standard recipe" such as you wanted us to write in as a requirement earlier.<<<

"4.4-R3 The complex polarization response of the telescope beams must be calibratable (though this is mainly an imaging step)"

Some research is needed here: how would one model the response? This has considerable impact on the processing, especially if the responses differ significantly from antenna to antenna. This also goes to requirement 5.2-R1.

>>>SMyers: indeed. I don't know a better way to word these more speculative requirements (originally there was going to be a special flag for these). Suggestions welcome.<<<

"7.1-R4 The output of the display should be possible in many different formats..."

No, I think you choose one and let other software (e.g. ImageMagick) do any conversion. That's what we do in AIPS++: we write xpm and recommend that people use a converter, of which there are plenty.

>>>SMyers: bad idea. That's why we should specify this here.<<<

"8.1-R3 The speed of the simulator must be commensurate....."

While one may require this, it may not be doable. Simulation is hard and can be very computationally expensive. In some cases, the simulator may have to run in the pipeline (using parallel code).

>>>SMyers: As stated, there may have to be different "simulators" depending on the problem complexity and timescale - there will have to be at least a quick simulator for the obstool for example.<<<

That's it.

Tim

-------------------------------------------------------------------------------

From lucas@iram.fr Thu Jun 28 14:37:20 2001
Date: Tue, 26 Jun 2001 20:12:01 +0200
From: Robert Lucas
Reply-To: alma-sw-ssr@cv3.cv.nrao.edu
To: alma-sw-ssr@cv3.cv.nrao.edu
Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements

Some more comments:

> 4.2 Single dish data
> --------------------
>
> 4.2-R1 The data taken on the astronomical source shall be reduced,
>        depending on the observing mode. All possible modes
>        shall be supported:

All supported observing modes shall be supported for data reduction...

>>>SMyers: reworded, see above.<<<

...
> 1.2 Relation to the Pipeline
> 1.2-R1 All modules available in the pipeline must be available also
>        as an offline analysis option. Note that not all offline
>        analysis tools will be in the pipeline package.

That's very important. For instance, atmospheric models are progressing, and many calibration devices will be available, some of which will be used by the pipeline; one may wish to reprocess the atmospheric calibration with improved or re-evaluated atmospheric data. We have done this several times at Plateau de Bure. So the atmospheric calibration procedures (like a full atmospheric model) should be available in the off-line package.

>>>SMyers: agreed.<<<

...
> 2.3 Command Line Interface (CLI)
> 2.3-R1 The CLI must be usable remotely over low-speed modem lines
>        or network connections, with ASCII terminal emulation.
> 2.3-R2 The interface must have the facility to read in command files
>        for batch processing of a sequence of CLI commands.
> 2.3-R3 The CLI should have command-line recall and editing
> 2.3-R4 All functionality of the GUI must also be available in CLI
>        mode.

Do we not require something like a minimum degree of user-friendliness? Please do not forget the biochemist! I think that, based on the dataset and its status, the user should be presented with a list of possible operations to be done on the data, with comments on the results that they are supposed to give.
>>>SMyers: Is this above what is required in 2.5 Documentation and Help?<<< ... > 3.2 Data import and export > 3.2-R1 The FITS/UVFITS data format and/or other commonly supported > standards must be supported for both input and output > without loss of functionality or information, though > need not be the native format for both the package and archive. UVFITS is I guess deprecated. to not put it at the same level as general FITS. >>>SMyers: gone. Mea culpa.<<< ... > 3.5 ALMA single dish and phased-array data > 3.5-R1 Data taken with nodding secondary must be supported, as > a function of nodding phase I've seen secondaries vibrating, nutating and wobbling but not yet nodding. (note: one of the senses is: to incline or sway from the vertical as though ready to fall....) >>>SMyers: Nutation - Etymology: Latin nutation-, nutatio, from nutare to nod, rock.<<< to be continued ... -- Robert LUCAS, Institut de Radioastronomie Millimetrique 300 rue de la Piscine, F-38406 St Martin d'Heres Cedex (FRANCE) Tel +33 (0)4 76 82 49 42 Fax +33 (0)4 76 51 59 38 E-mail: mailto:lucas@iram.fr http://iram.fr/~lucas/ ------------------------------------------------------------------------------- From smyers@cv3.cv.nrao.edu Thu Jun 28 14:40:36 2001 Date: Tue, 26 Jun 2001 12:11:01 -0600 (MDT) From: Steven T. Myers Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: ALMA Science Software Working Group Subject: [alma-sw-ssr] comments on v2.0 Pipeline and Offline draft Its great to see the comments. Keep them coming! It would be a great help if commenters suggest new or replacement requirement text, rather than just comments. Some of the overly wordy "requirements" that some of you have commented on are the result of getting a long comment and not knowing what to do with it, and so I more or less included the comment as the requirement. If it isnt obvious from the comment what to do (such as "remove this requirement" or "delete the word XXXX" or "replace XXXX with YYYY") then include your revision of the requirement or the text of the new requirement. Thanks, -Steve ------------------------------------------------------------------------------- From twillis@drao.nrc.ca Thu Jun 28 14:41:10 2001 Date: Tue, 26 Jun 2001 11:52:49 -0700 (PDT) From: Tony Willis Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) > 4.1-R6 Special cases shall be supported, including: > > R6.1 mosaic observations > R6.2 on-the-fly mosaics > R6.3 self calibration projects > R6.4 combination of single-dish + ALMA data (+ACA) > Why are these called Special cases? I would have thought they should be Standard cases. >>>SMyers: "designated modes"?<<< Tony ------------------------------------------------------------------------------- From twillis@drao.nrc.ca Thu Jun 28 14:41:37 2001 Date: Tue, 26 Jun 2001 12:19:22 -0700 (PDT) From: Tony Willis Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: RE: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) Tim wrote: > "8.1-R3 The speed of the simulator must be commensurate.....". > > While one may require this, it may not be doable. Simulation is hard and > can be very computationally expensive. In some cases, the simulator may have to run > in the pipeline (using parallel code). > > That's it. 
> > Tim > >From the document: >> 8.0 Simulation >> 8.1 General simulation requirements >> 8.1-R1 There must be simulation capability for interferometer and >> single dish observation with ALMA in all modes, for planning >> (with the ObserveTool) and comparison of data with models >> (for editing and correction). These should include error >> generation for thermal noise, pointing, primary beam, >> atmosphere, antenna surface errors, etc. >> 8.1-R2 The output of the simulator must be compatible with the >> rest of the offline package, and with the ALMA pipeline. >> It should be available in all ALMA data format(s). >> 8.1-R3 The speed of the simulator must be commensurate with the >> desired feedback time. For instance, if used with the >> real-time-system to assess quality the simulator must >> respond in minutes, if used for proposer feedback for >> ObsTool application it should feedback also on minute >> timescales for most simple experiments, while for complicated >> engineering simulations it may be allowed to take >> correspondingly longer. >> 8.1-R4 The simulator should be available early in the software >> production cycle in order to use it to test other components >> of the package. One of the goals of the Canadian proposal is to indeed build a simulator capable of emulating the data rate from the actual ALMA telescope. Initially (2001 - 2002 etc) - yes, it would have to run on some massively parallel architecture, but luckily, radio interferometers have lots of 'embarassingly parallel' components. So I believe it can be done, and should be done early in the software cycle as proposed above. But no, it will not run on someone's laptop as proposed for ALMA software in 2.1-R7. >>>SMyers: Some aspects of simulation will have to run quickly by users (eg. in the obstool). A monolithic simulator is a bad idea, IMO.<<< I believe a simulator is critical to the success of ALMA. The VLA was luckily saved from being an expensive boondoggle by the development (after the fact) of self-calibration by Tim and others. A 'real telescope' simulator will allow the investigation and solution of ALMA imaging problem long before the telescope is turned on. (I will give aips++ a plug by saying that it has excellent tools for easily developing massively parallel applications although the larger astronomical community has little awareness of the potential use of aips++ in this area.) Tony ------------------------------------------------------------------------------- From twillis@drao.nrc.ca Thu Jun 28 14:42:16 2001 Date: Tue, 26 Jun 2001 16:59:03 -0700 (PDT) From: Tony Willis Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) Some more thoughts: > 5.0 Interface with the Archive --- TO BE DETAILED > ------------------------------ > > 5.0-R1 The images produced by the Science Pipeline shall be archived, > together with the > > R1.1 the script that was used to produce the image > R1.2 the log file of the software > > 5.0-R2 cf 7.0-R3 general SSR document > > 5.0-R3 Also to be archived: > > R3.1 data quality control: > > R3.1.1 estimate of the noise > R3.1.2 seeing > R3.1.3 image fidelity based on model? > > R3.2 observation quality control: > R3.2.1 baseline quality > R3.2.2 calibration quality > > R3.3 telescope state: (possibly in monitor file, but accessible) > R3.3.1 telescope pointing > R3.3.2 subreflector focus > R3.3.3 monitor point (e.g. temperatures) data Do the raw observed data end up in the archive? 
I assume so. Or is that requirement given in another document? > 1.0 General Requirements and Interaction with other ALMA elements > 1.1 Goals of the Offline Package > 1.1-R1 An ALMA Offline Data Reduction Package (or "the package") > is primarily intended to enable end-users of ALMA (e.g. > observers or archive users) to produce scientifically > viable results that involve ALMA data products. The secondary > use is to enable ALMA staff to assess the state of the > array and derive calibration parameters for the system. Surely this secondary use is more a real-time or near real-time requirement? > 1.1-R2 The package should be able to function (be installed) at > the users home institution, in addition to operating at > ALMA regional centers (both locally and remotely). It should > be portable to a reasonable number of supported platforms, > including laptops without network connections. The 3 -> 60 Mb / sec data rate from ALMA is comparable to the data rate assumed by Tim Cornwell for the EVLA (EVLA memo 24). He calculates that you will still need at least a $20,000 to $100,000 (2000 dollars) computer system in 2009 to handle that amount of data. So you will need a very expensive laptop. I do not think that having a requirement that the ALMA offline system run on laptops is realistic. > 2.1-R6 Multiple levels of "undo" should be supported for all tasks. > 2.1-R7 The interface and package should function without a network > connection (e.g. a laptop on an airplane). Ditto here. Tony ------------------------------------------------------------------------------- From smyers@cv3.cv.nrao.edu Thu Jun 28 14:42:38 2001 Date: Tue, 26 Jun 2001 18:48:20 -0600 (MDT) From: Steven T. Myers Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) On Tue, 26 Jun 2001, Tony Willis wrote: > Some more thoughts: > > > 5.0 Interface with the Archive --- TO BE DETAILED > > ------------------------------ > Do the raw observed data end up in the archive? I assume so. Or is that > requirement given in another document? Thats in the first Requirements document (what should we refer to it as in this document by the way?). The default is that raw data and WVR corrected data are both archived. > > 1.0 General Requirements and Interaction with other ALMA elements > > 1.1 Goals of the Offline Package > > 1.1-R1 An ALMA Offline Data Reduction Package (or "the package") > > is primarily intended to enable end-users of ALMA (e.g. > > observers or archive users) to produce scientifically > > viable results that involve ALMA data products. The secondary > > use is to enable ALMA staff to assess the state of the > > array and derive calibration parameters for the system. > > Surely this secondary use is more a real-time or near real-time requirement? Hard to say. I think the staff (and members of this group!) will be using the package manually to look at test data the day after observation or even later, for example. > > > 1.1-R2 The package should be able to function (be installed) at > > the users home institution, in addition to operating at > > ALMA regional centers (both locally and remotely). It should > > be portable to a reasonable number of supported platforms, > > including laptops without network connections. > > The 3 -> 60 Mb / sec data rate from ALMA is comparable to the data > rate assumed by Tim Cornwell for the EVLA (EVLA memo 24). 
He calculates > that you will still need at least a $20,000 to $100,000 (2000 dollars) > computer system in 2009 to handle that amount of data. So you will need > a very expensive laptop. I do not think that having a requirement that > the ALMA offline system run on laptops is realistic. The users should be able to reduce their data wherever they are. I don't see how the peak and sustained data rates enter into this --- thats for the Pipeline primarily, and for the input into the Science archive (Im assuming thats the Pipeline also). A user should be able to reduce a 12-hour dataset (some spectral line mode) on a desktop or laptop system in 2007. > > > 2.1-R6 Multiple levels of "undo" should be supported for all tasks. > > 2.1-R7 The interface and package should function without a network > > connection (e.g. a laptop on an airplane). > > Ditto here. > and ditto here too. Unless my assumption that the Offline requirements are primarily for end users is way off... -Steve ------------------------------------------------------------------------------- From tcornwel@cv3.cv.nrao.edu Thu Jun 28 14:43:09 2001 Date: Tue, 26 Jun 2001 20:49:16 -0600 From: Tim Cornwell Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: RE: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) Tony Willis wrote: > The 3 -> 60 Mb / sec data rate from ALMA is comparable to the data > rate assumed by Tim Cornwell for the EVLA (EVLA memo 24). He calculates > that you will still need at least a $20,000 to $100,000 (2000 dollars) > computer system in 2009 to handle that amount of data. So you will need > a very expensive laptop. I do not think that having a requirement that > the ALMA offline system run on laptops is realistic. What I calculated was the average rate for a few cases. There is a spectrum of possible observational scenarios, and undoubtedly some of scenarios will be reducible on a laptop/PDA/wristwatch and it will make sense to do so. So I do think that the requirement is realistic. I just cannot see that there would be any doubt that any package could run on a laptop. It's already trivial for even the biggest packages so why worry? >>>SMyers: I worry about all things software :-( <<< Tim ------------------------------------------------------------------------------- From lucas@iram.fr Thu Jun 28 14:43:34 2001 Date: Wed, 27 Jun 2001 10:24:12 +0200 From: Robert Lucas Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) Tim Cornwell wrote: > > Tony Willis wrote: > > > The 3 -> 60 Mb / sec data rate from ALMA is comparable to the data > > rate assumed by Tim Cornwell for the EVLA (EVLA memo 24). He calculates > > that you will still need at least a $20,000 to $100,000 (2000 dollars) > > computer system in 2009 to handle that amount of data. So you will need > > a very expensive laptop. I do not think that having a requirement that > > the ALMA offline system run on laptops is realistic. > > What I calculated was the average rate for a few cases. There is a spectrum > of possible observational scenarios, and undoubtedly some of scenarios will be > reducible on a laptop/PDA/wristwatch and it will make sense to do so. So > I do think that the requirement is realistic. I just cannot see that > there would be any doubt that any package could run on a laptop. It's > already trivial for even the biggest packages so why worry? 
> > Tim I think that it's still reasonable to require that if data reduction of a good fraction of projects is feasible off-line with the cpu and memory available on a laptop, then it should not be restricted by other issues (expensive(>0?) licences, complicated installation procedures ...). >>>SMyers: Make part of Tim's suggested operational issues OL-1.3<<< -- Robert LUCAS, Institut de Radioastronomie Millimetrique 300 rue de la Piscine, F-38406 St Martin d'Heres Cedex (FRANCE) Tel +33 (0)4 76 82 49 42 Fax +33 (0)4 76 51 59 38 E-mail: mailto:lucas@iram.fr http://iram.fr/~lucas/ ------------------------------------------------------------------------------- From lucas@iram.fr Thu Jun 28 14:43:59 2001 Date: Wed, 27 Jun 2001 10:27:41 +0200 From: Robert Lucas Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements(fwd) "Steven T. Myers" wrote: > > Do the raw observed data end up in the archive? I assume so. Or is that > > requirement given in another document? > > Thats in the first Requirements document (what should we refer to it as > in this document by the way?). The default is that raw data and WVR > corrected data are both archived. ALMA-SW_MEMO 11 at http://www.alma.nrao.edu/development/computing/docs/joint/0011/ssranduc.pdf Robert -- Robert LUCAS, Institut de Radioastronomie Millimetrique 300 rue de la Piscine, F-38406 St Martin d'Heres Cedex (FRANCE) Tel +33 (0)4 76 82 49 42 Fax +33 (0)4 76 51 59 38 E-mail: mailto:lucas@iram.fr http://iram.fr/~lucas/ ------------------------------------------------------------------------------- From guillote@iram.fr Thu Jun 28 14:44:19 2001 Date: Wed, 27 Jun 2001 10:31:31 +0200 From: Stephane Guilloteau Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) > >I think that it's still reasonable to require that if data reduction of >a good fraction of projects is feasible off-line with the cpu and memory >available on a laptop, then it should not be restricted by other issues >(expensive(>0?) licences, complicated installation procedures ...). > One of the key difference between the laptop and a "normal" computer is the screen size. This requirement has more implication on the user interface than on the data reduction engines. Stephane ------------------------------------------------------------------------------- From lucas@iram.fr Thu Jun 28 14:44:36 2001 Date: Wed, 27 Jun 2001 13:00:12 +0200 From: Robert Lucas Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements Hi: - Tim, please remember this is a draft far away from the final thing; we are just starting to discuss this in the whole SSR group! I tried to reply to some of Tim's comments in order to clarify a few points. Tim Cornwell wrote: > 0. A point concerning scope. AIPS++ is about 150 FTE-years. AIPS is > probably about the same. The ESO Data flow system is about 300 FTE-years > (I believe). I would guess from some communications that for the items > described in this requirements document, the ALMA computing division > has between 40 and 60 FTE-years (depending on how one counts various > things). I would counsel that you spend that effort wisely. I think the > current draft overspends by a large factor. 
Remember that these requirements will be used as input to a re-use analysis, and that the ALMA FTE's should be used only to help fill in the remaining gaps, and build a pipeline. > 1. A general comment is that data reduction splits into strategy > and tactics. The tactics come from the basic physics but the > strategy comes from experience. I think the document is mostly > fine on tactics but is a little too specific about some strategies. > The items on the calibration pipeline seem to me to fit in this > category. For example, 2.1-R3 is a strategy that may or may not > work in all situations. The main motivation here is to feed back the results to the dynamic scheduling and data acquisition processes. We believe e.g. that if these tests do not work, the data cannot be calibrated, and we switch to another less demanding activity. This is a first guess strategy based on experience with existing mm-wave arrays. > 2. It's hard to know how to process data for a ground-breaking > telescope like ALMA. I think one should be modest in setting > forth too-definitive statements of how the processing should > proceed. In this context, I think the tool-based approach using > in AIPS++ is vital, and I would advocate including a statement > aimed at this point. Clearly the wise attitude is not to spend all forces in the first version, but keep a good part of them for when we have real high frequency data to play with! That's in the planning of the sw group I think. > 3. I haven't followed your discussions in detail so I'm not at all > sure what General Consideration B means. In what way is there a > fundamental distinction? I could not see how this consideration > affected the rest of the document. It's also a very dangerous point > since in many operations, one obviously wants no distinction. The difference is in the on source data acquisition naturally (total power or interferometry) but the calibration may use data taken in either single-dish or interferometry, which may require to share calibration data between single-dish and interferometry software. > 4. There are some prescriptive implementation details that should > be removed (e.g. 3.0-R7 "using the fastest algorithm", also > the Appendix of Barry Clark's input parameters). For the quick look speed matters of course (as the name says). > 5. I am surprised that the document has relatively few requirements > that are operational in nature. For example: > > - Be installation-flexible: can be installed on non-specialized > hardware by end user > - Processing script must be re-executable with only a small > number of changes > - Process standard recurring observations and analyze according > to standard recipes > - Provide real-time feedback via standard compact displays > and plots > - Be operable automatically or manually > - Allow preemption, termination, resubmission, etc. Good comment, but please remember this is only a draft on which we are working! > 6. I found some of the discussion hard to understand. An example > is 3.3-R1: Everything but the first sentence is unnecessary and > detracts from the simplicity of the requirement. It might be wise to separate the actual requirements from our motivations in writing them (which is useful for a live document). > 7. A major point that applies to all my remaining comments > is that it's easy to write simple sounding requirements that > either double, triple, etc the software costs or prevent any > estimation at all. Wim and Tony pointed out that adding undo > is one example. 
I'd also add a substantial number of others: > > "1-R1: The pipelines shall be able to process all data coming from > the array." > For all arrays that I know of, one can think of observations that > "break the bank" of available computing. This must be true of > ALMA as well. Do you really want to limit the array in this way > or specify the pipeline so aggressively? Of course it depends how you define `process'. This is sort of restricted by 1.0-R7 in our general requirements document. > "4.2-R1: The data taken on the astronomical source shall be reduced, > depending on the observing mode. All possible modes shall be > supported: > R1.1 etc" > > I think only the enumerated modes should be supported. Only known > things can be guaranteed to be supported. I agree, rephrasing needed. > "1.1-R4 The offline data reduction package should not suck" > > Harder than you would think. I think the software costs for this > are unknown. I thought the author had inserted that one as an simple e-mail generator. >>>SMyers: it worked!<<< Robert -- Robert LUCAS, Institut de Radioastronomie Millimetrique 300 rue de la Piscine, F-38406 St Martin d'Heres Cedex (FRANCE) Tel +33 (0)4 76 82 49 42 Fax +33 (0)4 76 51 59 38 E-mail: mailto:lucas@iram.fr http://iram.fr/~lucas/ ------------------------------------------------------------------------------- From momose@mito.ipc.ibaraki.ac.jp Thu Jun 28 14:44:59 2001 Date: Wed, 27 Jun 2001 20:46:38 +0900 From: Munetake MOMOSE Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements Hi Folks, Several comments just before the telecon... Sincerely, M. Momose ----------------- 1. Calibration with Pipelines I think it is still unclear the difference in "calibration procedure" between Calibration/Quick-look Pipeline and Science Pipeline. Personally, I agree with the following Frediric's comments (June 17th): >I think that there are three kinds of calibrations that could >be handled by a "calibration pipeline": >- The instrumental calibration: pointing, focus, delay, baseline, > etc. What is required here is a fast feedback to the control > software. >- The calibrations that do not require a time interpolation, as > the atmospheric or bandpass calibration: each time such a > scan is observed, something has to be derived and then stored, > to be applied to all the following observations, until a new > calibration of that kind is observed. >- The calibrations that require a time interpolation, ie the > phase and amplitude calibration: a calibration curve has to be > fitted using all available calibrations and then applied to all > the source observations that were observed in between. > > The two first categories can easily be handled by a calibration > pipeline. As for the third category, it is not yet clear to me which > pipeline should do the job. In the document I sent a few days ago, all > three pipelines are doing something in this area, and I agree it is not > clear enough. I think that the science pipeline should do a clean > job and derive the calibration curves using all data. But the > calibration and quick-look pipelines should also do a similar > calibration, to get an estimate of the phase rms and to produce > quick images. 
If the above is the case, it may be sufficient to support only a few simple modes in the Calibration & Quick-look Pipelines for just quick calibration / real-time monitoring (e.g., baseline-based solutions with a linearly interpolated calibration curve), while the archival data generated by the Science Pipeline are reduced in some optimum mode selected among various options. 2. about Simulator (Section 3-8) My opinion is that a simulator that generates a probable resultant map for some model brightness distribution will be quite beneficial to the end users. However, one that simulates the whole thing (complete instrumental behavior as well as environmental conditions) will be so complicated that most observers cannot handle it, though it might be useful in checking technical issues. We should therefore discuss the optimum specs of the simulator for end users. (Discussion about a simulator for system check or maintenance is beyond the scope of this group, I guess.) >>>SMyers: I have delineated levels of simulation in the latest (12-Jul) version.<<< 3. Offline Visualization (Section 3-7) I agree that the Offline package should be able to deal with more than two image files of some standard format to make composite / multi-layered maps. But importing purely-graphic files (such as JPEG or postscript format: see 7.2-R1) to produce composite maps should NOT be required of the offline package, because these files do not have any header information such as reference positions, observing frequencies, and so on. This is a general feature of graphics software (e.g., Photoshop, Canvas ...), but not of an astronomical reduction package. I therefore propose to revise 7.2-R1 as follows: User should be able to produce overlays of different data sets of standard formats. It should be possible to place these data sets in layers which can be switched on and off separately. The different images should be editable, and it should be possible to declare certain colors transparent. It must be possible to shift, rotate and scale the images at will. >>>SMyers: Good text, I have used this. It is a great help when replacement text is provided!<<< ------------------------------------------------------------------------------- From twillis@drao.nrc.ca Thu Jun 28 14:45:22 2001 Date: Wed, 27 Jun 2001 06:01:04 -0700 (PDT) From: Tony Willis Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) > > > > The 3 -> 60 Mb / sec data rate from ALMA is comparable to the data > > rate assumed by Tim Cornwell for the EVLA (EVLA memo 24). He calculates > > that you will still need at least a $20,000 to $100,000 (2000 dollars) > > computer system in 2009 to handle that amount of data. So you will need > > a very expensive laptop. I do not think that having a requirement that > > the ALMA offline system run on laptops is realistic. > > The users should be able to reduce their data wherever they are. I don't > see how the peak and sustained data rates enter into this --- thats for > the Pipeline primarily, and for the input into the Science archive (Im > assuming thats the Pipeline also). A user should be able to reduce a > 12-hour dataset (some spectral line mode) on a desktop or laptop system > in 2007.
> Perhaps I misrepresented myself here - I agree with Tim's comments that the software package should run on a lap top - indeed I run the ACSIS software system on my laptop - its great for software development (and for system testing with a tiny 128 channel spectral line system from a single receiver!). So you might be able to reduce a small snapshot on a 2007 laptop. However a 12-hour dataset at even the rather modest data rate of 4 Mb per second sums to 173 Gb after 12 hours. I suspect that if you are attempting to process this amount of data in a laptop while waiting for your airplane at an airport you may need quite a large collection of batteries! Anyway why be explicit about computing devices at this stage? For all we know, in 2007 - 2009 we may be using some kind of wireless screens with instant connect to some supercomputer. Tony ------------------------------------------------------------------------------- From guillote@iram.fr Thu Jun 28 14:45:40 2001 Date: Wed, 27 Jun 2001 15:15:41 +0200 From: Stephane Guilloteau Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) Excerpt from Tony Willis > Anyway why be explicit about >computing devices at this stage? For all we know, in 2007 - 2009 we >may be using some kind of wireless screens with instant connect to >some supercomputer. > That comes exactly back to my previous message: the key-point about laptop is the screen size, and the requirement may rather be written e.g. "Should be able to (conveniently) run the data processing user interface from a laptop" >>>SMyers: added to 2.1-R7<<< Stephane ------------------------------------------------------------------------------- From schilke@mpifr-bonn.mpg.de Thu Jun 28 14:46:00 2001 Date: Wed, 27 Jun 2001 15:43:07 +0200 From: Peter Schilke Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements I am triggered to this mail by Momose-san's rejection of the requirement of possibility of importing "foreign" formats such as jpeg or such, because it is difficult. I don't want to harp on this particular issue in itself too much, but I think at present we shouldn't restrict ourselves too much by considerations of feasibility, we should define what we think we need. Reality checks will come in with assigning priorities and, ultimately, by considering the resources available. To stick with this example, I find it annoying that I have to jump constantly between packages to annotate or make overlays with jpeg, so there is this requirement. It will get the priority "desirable" which translates to "won't happen in your lifetime" in most cases - unless it can be done cheaply - and it might be, since we are talking about reusing existing software. If it's not in the requirements, it won't ever happen because nobody would know we want it. A similar argument could be (and has been made) regarding the "must run on laptop" requirement. So I'd be in favor of not exercising a priori censorship too excessively - of course it shouldn't get to the point where the important issues get lost in desiderata. >>>SMyers: I am in favor of outputting standard formats (jpeg, gif, ps) directly, but am less in favor of importing these. Actually FITS is probably our best format for import, and is relatively easy to convert other stuff to this format, of course with loss of header info. 
I am tempted, as in 3.7-R2 (12-Jul) to restrict import to standard formats like FITS.<<< Peter ------------------------------------------------------------------------------- From gueth@iram.fr Thu Jun 28 14:46:26 2001 Date: Wed, 27 Jun 2001 15:28:49 +0200 From: Frederic Gueth Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements Munetake MOMOSE wrote: > > Hi Folks, > > Several comments just before the telecon... > > Sincerely, > M. Momose > > ----------------- > > 1. Calibration with Pipelines > > I think it is still unclear the difference in "calibration procedure" > between Calibration/Quick-look Pipeline and Science Pipeline. Personally, > I agree with the following Frediric's comments (June 17th): > > >I think that there are three kinds of calibrations that could > >be handled by a "calibration pipeline": > >- The instrumental calibration: pointing, focus, delay, baseline, > > etc. What is required here is a fast feedback to the control > > software. > >- The calibrations that do not require a time interpolation, as > > the atmospheric or bandpass calibration: each time such a > > scan is observed, something has to be derived and then stored, > > to be applied to all the following observations, until a new > > calibration of that kind is observed. > >- The calibrations that require a time interpolation, ie the > > phase and amplitude calibration: a calibration curve has to be > > fitted using all available calibrations and then applied to all > > the source observations that were observed in between. > > > > The two first categories can easily be handled by a calibration > > pipeline. As for the third category, it is not yet clear to me which > > pipeline should do the job. In the document I sent a few days ago, all > > three pipelines are doing something in this area, and I agree it is not > > clear enough. I think that the science pipeline should do a clean > > job and derive the calibration curves using all data. But the > > calibration and quick-look pipelines should also do a similar > > calibration, to get an estimate of the phase rms and to produce > > quick images. I agree with Munetake (and with my previous email...), that the precise definition of the calibration pipeline is unclear. "Calibration" is a quite general concept which includes operations of a very different nature. I would like to mention two related problems: 1) The "telescope calibration" (pointing, etc) put constrains of a different nature than the other pipeline elements: until the results are available, ALMA is blocked and cannot observe! This implies an extremely fast answer, which is not necessarily the case for the other calibrations (the computation time can be longer, providing it is not a bottle-neck in the data flow). Thus, instrument calibration should have the highest priority (which is not clearly stated in the present document), but it can even call for a separate "telescope calibration" pipeline. 2) Some calibrations can be computed immediately after the corresponding scan has been observed (eg bandpass). The result can be stored and used to calibrate following observations. In that sense, it can easily be handled by the calibration pipeline described in the current document. BUT some calibrations can only be derived at the very end of the observations: this is typically the time-dependance phase and amplitude curves. 
So this is a job for the pipeline running at the end of the session, namely the science pipeline. Maybe we should distinguish between the 'calibration' and the 'imaging' part of the science pipeline? My point is not to split the 'pipeline' in an increasing number of entities but rather to identify some well-defined parts with clear inputs and outputs. Frederic. >>>SMyers: Incorporated into header for PL-2.0 <<< ------------------------------------------------------------------------------- From tcornwel@cv3.cv.nrao.edu Thu Jun 28 14:46:43 2001 Date: Wed, 27 Jun 2001 08:09:56 -0600 From: Tim Cornwell Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: RE: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) > > That comes exactly back to my previous message: the key-point about > laptop > is the screen size, and the requirement may rather be written e.g. > "Should be able to (conveniently) run the data processing user interface > from a laptop" My laptop, Dell Inspiron, has the best screen of any of my computers (1600x1400) and it'll only get better. I think there must be some other concern here about user interfaces that should be expressed directly. Regards, Tim ------------------------------------------------------------------------------- From tcornwel@cv3.cv.nrao.edu Thu Jun 28 14:47:05 2001 Date: Wed, 27 Jun 2001 08:38:23 -0600 From: Tim Cornwell Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: RE: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements > Hi: > > - Tim, please remember this is a draft far away from the final thing; we > are just starting to discuss this in the whole SSR group! OK. I apologise for perhaps being too strident. It seems like I've been discussing subjects like these for years :) Tim ------------------------------------------------------------------------------- From twillis@drao.nrc.ca Thu Jun 28 14:47:30 2001 Date: Wed, 27 Jun 2001 07:49:46 -0700 (PDT) From: Tony Willis Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) > My point is not to split the 'pipeline' in an increasing number of > entities > but rather to identify some well-defined parts with clear inputs and > outputs. > > Frederic. > There are ways to have multiple pipelines coexisting that are quite easy to implement. In fact, by taking this approach the overall complexity of pipeline "logic" might be reduced quite a bit. Tony ------------------------------------------------------------------------------- From twillis@drao.nrc.ca Thu Jun 28 14:47:57 2001 Date: Wed, 27 Jun 2001 07:52:54 -0700 (PDT) From: Tony Willis Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: [alma-sw-ssr] overall comment re pipelines document To me it reads to a certain extent as a great big wish list. I guess the next step would be to start setting the requests in a priority order. 
>>>SMyers: Indeed, this will be the primary task at the Berkeley meeting.<<< Tony ------------------------------------------------------------------------------- From guillote@iram.fr Thu Jun 28 14:48:24 2001 Date: Wed, 27 Jun 2001 17:19:35 +0200 From: Stephane Guilloteau Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) -----Original Message----- From: Tim Cornwell To: alma-sw-ssr@nrao.edu Date: Wednesday, June 27, 2001 4:10 PM Subject: RE: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements (fwd) >> >> That comes exactly back to my previous message: the key-point about >> laptop >> is the screen size, and the requirement may rather be written e.g. >> "Should be able to (conveniently) run the data processing user interface >> from a laptop" > >My laptop, Dell Inspiron, has the best screen of any of my computers (1600x1400) >and it'll only get better. I think there must be some other concern here about >user interfaces that should be expressed directly. > >Regards, > >Tim > Sorry, it's my eyes which don't follow the 1600x1400 over a 14-15 inch screen. And I doubt they'll ever become better... Stephane ------------------------------------------------------------------------------- From tcornwel@cv3.cv.nrao.edu Thu Jun 28 14:48:49 2001 Date: Wed, 27 Jun 2001 10:36:28 -0600 From: Tim Cornwell Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: RE: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements Robert wrote: > > I tried to reply to some of Tim's comments in order to clarify a few > points. > > > Tim Cornwell wrote: > > 0. A point concerning scope. AIPS++ is about 150 FTE-years. AIPS is > > probably about the same. The ESO Data flow system is about 300 FTE-years > > (I believe). I would guess from some communications that for the items > > described in this requirements document, the ALMA computing division > > has between 40 and 60 FTE-years (depending on how one counts various > > things). I would counsel that you spend that effort wisely. I think the > > current draft overspends by a large factor. > > Remember that these requirements will be used as input to a re-use > analysis, and that the ALMA FTE's should be used only to help fill in > the remaining gaps, and build a pipeline. I took that into account. Even if you build on top of another package, the current draft is too expensive. I know that the process of requirements/reuse analysis/costing will reveal this but I think a reality check now is possible and useful. >>>SMyers: It is ironic that most of the requirements objected to are taken nearly verbatim from the aips++ requirements memo from 1992. I think this demonstrates what impact the unprioritized wish-list of that document had. Was a requirements/reuse analysis/costing analysis done for the 1992 requirements to justify the choices aips++ made?<<< > > > 1. A general comment is that data reduction splits into strategy > > and tactics. The tactics come from the basic physics but the > > strategy comes from experience. I think the document is mostly > > fine on tactics but is a little too specific about some strategies. > > The items on the calibration pipeline seem to me to fit in this > > category. For example, 2.1-R3 is a strategy that may or may not > > work in all situations. > > The main motivation here is to feed back the results to the dynamic > scheduling and data acquisition processes. We believe e.g. 
that if these > tests do not work, the data cannot be calibrated, and we switch to > another less demanding activity. This is a first guess strategy based on > experience with existing mm-wave arrays. That's not the only point. The design of C++ library + high level scripting allows one to put detailed and well-known tactics into the library (via e.g. a measurement model for a telescope), and defer strategies for implementation in the scripting language. There are packages and systems that don't have this property and therefore it is worth specifying. > > > 4. There are some prescriptive implementation details that should > > be removed (e.g. 3.0-R7 "using the fastest algorithm", also > > the Appendix of Barry Clark's input parameters). > > For the quick look speed matters of course (as the name says). The fastest algorithm may require excessive disk space or have low precision or only powers of two or whatever.... The point is that calling out "fastest" as being the most important factor is not necessary. Tim ------------------------------------------------------------------------------- From bclark@aoc.nrao.edu Thu Jun 28 14:49:13 2001 Date: Wed, 27 Jun 2001 11:12:28 -0600 (MDT) From: Barry Clark Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements > > The subject of the Section 2 on Pipeline Requirements is referred to as the > "Pipeline". This may be implemented as disparate > tools or programs, or as separate packages provided by different groups, > or as a single package, as long as it fulfills the requirements. > > The subject of Section 3 on Offline Data Reduction Requirements is referred to > as the "Package" or "Offline Package". This may be implemented as disparate > tools or programs, or as separate packages provided by different groups but > integrated into a single suite, or a single package. > Perhaps instead: The "Package" or "Offline Package" is a set of tools or programs, believed adequate for ALMA reductions, and used by ALMA staff for reductions upon which the behavior of the system will be judged. It may consist of packages provided by different groups, with transitions provided to integrate them into a single suite. The requirements will state that the Package will be available for installation on the observer's own computer systems. The requirements on the Package are set forth in Section 3. A "Pipeline" is a set of operations, implemented by the underlying Package, which takes a concise description of the way these operations are to be performed and accesses ALMA data, either from the ALMA archive or from local files, and produces a desired data product. (For purposes of software requirements, the alternate definitions as a machine or set of machines, or as the supervisory process that invokes these operations are less useful.) There are several Pipelines essential to the efficient operation of ALMA. >>>SMyers: I have resisted the assumption that the pipeline is built using the offline Package. Although likely (eg. aips++) I think this is unduly restrictive.<<< \bullet The Calibration Pipeline operates in quasi real time, looks at only calibrator observations, and produces one or more of the following data products (depending on the type of observation and type of calibrator), and places the results in a location where they can be accessed both by the real-time system and by other reduction proceedures: 1) an antenna pointing offset for all antennas. 2). 
Tsys for all antennas as a function of time. 3.) Sideband ratios for all antennas. 4.) Antenna based flux calibration (TSYSJY) from a flux calibrator. 5.) Antenna based bandpass calibration. 6.) Antenna based polarization leakage terms (with the usual indeterminate offsets from a single observation). 7.) Antenna based IF phase differences (from a strongly polarized calibrator). 8.) Antenna based phase calibration (with noise and atmospheric rms). \bullet The Science Pipeline will process most science data. It's data product is an image cube. This product will in many cases be adequate to achieve the observer's science goals. It may access ALMA data from several observing sessions and even from observations not the observer's own. It is intended to produce the best image possible without the intervention of an expert observer. The Science Pipeline will include a data calibration phase, that may, in fact, run somewhat asynchronously with the image making phase; this should not be confused with the Calibration Pipeline above. \bullet The Quick Look Pipeline will process data from only one observing session, and will comprise a subset of the operations of the Science Pipeline. It will be sufficiently limited in its processing to produce results in a time short compared to the length of a typical observing session. It's data products (images) will usually be available while the session is still in progress. The requirements for these pipelines are set forth in Section 2. ------------------------------------------------------------------------------- From gueth@iram.fr Thu Jun 28 14:49:55 2001 Date: Thu, 28 Jun 2001 13:40:43 +0200 From: Frederic Gueth Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: [alma-sw-ssr] Definition pipelines Barry Clark wrote: > > > > > The subject of the Section 2 on Pipeline Requirements is referred to as the > > "Pipeline". This may be implemented as disparate > > tools or programs, or as separate packages provided by different groups, > > or as a single package, as long as it fulfills the requirements. > > > > The subject of Section 3 on Offline Data Reduction Requirements is referred to > > as the "Package" or "Offline Package". This may be implemented as disparate > > tools or programs, or as separate packages provided by different groups but > > integrated into a single suite, or a single package. > > > > Perhaps instead: > > The "Package" or "Offline Package" is a set of tools or programs, believed > adequate for ALMA reductions, and used by ALMA staff for reductions upon > which the behavior of the system will be judged. It may consist of packages > provided by different groups, with transitions provided to integrate them > into a single suite. The requirements will state that the Package will be > available for installation on the observer's own computer systems. The > requirements on the Package are set forth in Section 3. > > A "Pipeline" is a set of operations, implemented by the underlying Package, > which takes a concise description of the way these operations are to be > performed and accesses ALMA data, either from the ALMA archive or from > local files, and produces a desired data product. (For purposes of software > requirements, the alternate definitions as a machine or set of machines, > or as the supervisory process that invokes these operations are less useful.) > There are several Pipelines essential to the efficient operation of ALMA. 
> > \bullet The Calibration Pipeline operates in quasi real time, looks at only > calibrator observations, and produces one or more of the following data In the current draft, the calibration pipeline does not ignore the observations of the astronomical source: it applies (in the sense: store the appropriate quantity in the relevant header) the atmospheric calibration to all incoming observations. > products (depending on the type of observation and type of calibrator), and > places the results in a location where they can be accessed both by the > real-time system and by other reduction proceedures: 1) an antenna pointing > offset for all antennas. 2). Tsys for all antennas as a function of time. > 3.) Sideband ratios for all antennas. 4.) Antenna based flux calibration > (TSYSJY) from a flux calibrator. 5.) Antenna based bandpass calibration. > 6.) Antenna based polarization leakage terms (with the usual indeterminate > offsets from a single observation). 7.) Antenna based IF phase differences > (from a strongly polarized calibrator). 8.) Antenna based phase calibration > (with noise and atmospheric rms). The list should be left open in such a general description. For instance, the focus offset or the antenna positions derived from a baseline measurement are missing. > > \bullet The Science Pipeline will process most science data. It's data product > is an image cube. This product will in many cases be adequate to achieve the > observer's science goals. It may access ALMA data from several observing > sessions and even from observations not the observer's own. It is intended > to produce the best image possible without the intervention of an expert > observer. The Science Pipeline will include a data calibration phase, that > may, in fact, run somewhat asynchronously with the image making phase; this > should not be confused with the Calibration Pipeline above. To avoid confusion, I would suggest to change the "calibration pipeline" name to "real-time calibration pipeline". It can also be run off-line, but it *has* to be available in real-time. > > \bullet The Quick Look Pipeline will process data from only one observing > session, and will comprise a subset of the operations of the Science Pipeline. > It will be sufficiently limited in its processing to produce results in a > time short compared to the length of a typical observing session. It's > data products (images) will usually be available while the session is still > in progress. > The functions of each pipeline is summarized in the following list, in which "blocks" of operations are identified: Real-time calibration pipeline ------------------------------ - Data acquisition part - store in all incoming observation the current calibration paramaters (Tsys, bandpass, ...) - Telescope calibration - reduce array calibrations (pointing, focus, delay, baselines,...) - results are made available to the Sequencer - Astronomical calibration - reduce astronomical calibrations (atmopheric calibration, phase rms, flux scale, bandpass, ...) 
- results are made available to the Dynamic Scheduler Quick-look pipeline ------------------- - Monitoring tools - display the current properties of the array and/or observation - need results of real-time calibration pipeline - Calibration pipeline - from temperature-calibrated visibilities to uv tables (simplified calibration) - need results of real-time calibration pipeline - Imaging pipeline - from uv tables to images (simplified version) - need results of previous calibration pipeline - Display tools - display current observations, to allow the operator/AoD easy checks of the data quality - need results of previous calibration and/or imaging pipelines Science pipeline ---------------- - Calibration pipeline - from temperature-calibrated visibilities to uv tables - Imaging pipeline - from uv tables to deconvolved images Frederic. ------------------------------------------------------------------------------- From jschwarz@eso.org Thu Jun 28 14:50:24 2001 Date: Thu, 28 Jun 2001 14:57:30 +0200 From: Joseph Schwarz Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Definition pipelines Frederic Gueth wrote: > Real-time calibration pipeline > ------------------------------ > > - Data acquisition part - store in all incoming observation the current > calibration paramaters (Tsys, bandpass, ...) > > - Telescope calibration - reduce array calibrations (pointing, focus, > delay, baselines,...) > - results are made available to the Sequencer > What is likely to be the limiting factor, time to acquire the calibration data, or time to reduce it? Presumably baseline calibrations won't change during the execution of a Scheduling Block (unless the observing process is supposed to compensate for earthquakes in real time). Delay calibrations (according to the Use Cases, section 4.8.5 of the main requirements doc) are performed "at least once per receiver tuning" or "at least once per observing session" or (Lucas & Muders, private communication) "after reconnections of cables/fibres and after antenna moves". So while it's true that ALMA can't observe without these results, which certainly need to be known by the observing process (Sequencer?), there might be more time to produce them than the phrase "real-time" implies. As for pointing and focus, the Use Cases specify a "Pointing Session", which I understand results in a pointing model, but also a "Pointing Calibration", whose purpose is to update the parameters of that pointing model. The Pointing Session is an array- (or observatory-) level calibration which is done "after moving one or more antennas and/or at regular time intervals (weekly ?)", while the Pointing Calibration gets done fairly often. When we were generating the Use Cases, I had understood that there was no hard requirement on how quickly the results from the "Pointing Calibration" were needed: that an observing procedure could continue to execute even if the updates to the pointing and focus parameters weren't available for some time. How long this "some time" could be was never specified. It would be helpful for the analysis if this could be made a little clearer. > > - Astronomical calibration - reduce astronomical calibrations > (atmopheric calibration, phase rms, > flux scale, bandpass, ...) 
> - results are made available to the Dynamic > Scheduler > From prior discussions and from the Use Cases, I had understood that phase rms results would be made available to the observing process (not just to the Scheduler), so that an executing Scheduling Block could adjust cycle and dwell times on target and phase calibrator based on the results. Similarly, an SB might want to terminate once a certain noise level had been reached. Might not the time constraints be tighter than those on the telescope calibrations? ------------------------------------------------------------------------------- From lucas@iram.fr Thu Jun 28 14:50:47 2001 Date: Thu, 28 Jun 2001 15:48:29 +0200 From: Robert Lucas Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Definition pipelines Joseph Schwarz wrote: > > Frederic Gueth wrote: > > > Real-time calibration pipeline > > ------------------------------ > > > > - Data acquisition part - store in all incoming observation the > > current > > calibration paramaters (Tsys, bandpass, ...) > > > > - Telescope calibration - reduce array calibrations (pointing, > > focus, delay, > > baselines,...) > > - results are made available to the Sequencer > > > > What is likely to be the limiting factor, time to acquire the calibration > data, or time to reduce it? Presumably baseline calibrations won't change > during the execution of a Scheduling Block (unless the observing process is > supposed to compensate for earthquakes in real time). Delay calibrations > (according to the Use Cases, section 4.8.5 of the main requirements doc) are > performed "at least once per receiver tuning" or "at least once per observing > session" or (Lucas & Muders, private communication) "after reconnections of > cables/fibres and after antenna moves". So while it's true that ALMA can't > observe without these results, which certainly need to be known by the > observing process (Sequencer?), there might be more time to produce them than > the phrase "real-time" implies. The time to reduce the delay calibration is small compared to the time to acquire the data. But the feedback is real time, that is you have to apply them right away, particularly with the delay calibration (if you would proceed and apply the new delay offsets after some time, you would get a data set that is non-homogeneous). > As for pointing and focus, the Use Cases specify a "Pointing Session", which > I understand results in a pointing model, but also a "Pointing Calibration", > whose purpose is to update the parameters of that pointing model. The > Pointing Session is an array- (or observatory-) level calibration which is > done "after moving one or more antennas and/or at regular time intervals > (weekly ?)", while the Pointing Calibration gets done fairly often. When we > were generating the Use Cases, I had understood that there was no hard > requirement on how quickly the results from the "Pointing Calibration" were > needed: that an observing procedure could continue to execute even if the > updates to the pointing and focus parameters weren't available for some > time. How long this "some time" could be was never specified. It would be > helpful for the analysis if this could be made a little clearer. In the pointing calibration Use Case the pointing offsets are applied in a loop, the way Steve has written it (BC steps 2-5, remember that this had to be included in that specific ObservePointingCalibration sequence diagram). So at the end the offsets are already applied! 
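(For illustration only, a minimal sketch of this "offsets applied in a loop" pattern -- Python, with hypothetical sequencer and pipeline interfaces; it is not taken from the Use Case text:)

    def pointing_calibration_loop(sequencer, pipeline, max_error_arcsec=2.0):
        """Observe a pointing calibration, wait for the pipeline fit, and
        apply the offsets before science observing resumes."""
        for attempt in range(3):                      # bounded number of retries
            scan = sequencer.observe_pointing_scan()  # e.g. a five-point or cross scan
            fit = pipeline.reduce_pointing(scan)      # offsets plus formal errors
            if fit.error_arcsec <= max_error_arcsec:
                sequencer.apply_pointing_offsets(fit.az_offset, fit.el_offset)
                return fit                            # offsets already applied on exit
        return None   # caller decides whether to retry or continue with old offsets

The point is simply that the observing sequence blocks until the fit is applied or given up on, which is why the reduction has to be fast.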
Focus is the same though we never wrote the relevant Use Case. > From prior discussions and from the Use Cases, I had understood that phase > rms results would be made available to the observing process (not just to > the Scheduler), so that an executing Scheduling Block could adjust cycle and > dwell times on target and phase calibrator based on the results. Similarly, > an SB might want to terminate once a certain noise level had been > reached. Might not the time constraints be tighter than those on the > telescope calibrations? You're right. The time constraint is, however, not tighter, since the results are used to modify loop parameters; therefore a delay of the order of one or a few loop cycles is tolerable. Regards Robert ------------------------------------------------------------------------------- From gueth@iram.fr Thu Jun 28 14:51:10 2001 Date: Thu, 28 Jun 2001 15:19:48 +0200 From: Frederic Gueth Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Definition pipelines Joseph Schwarz wrote: > > Frederic Gueth wrote: > > > Real-time calibration pipeline ... ... > the updates to the pointing and focus parameters weren't available for some > time. How long this "some time" could be was never specified. It would be > helpful for the analysis if this could be made a little clearer. > I was referring to the pointing calibration done very regularly -- each hour or so. Of course, ALMA can continue to observe without having the result of this pointing measurement, but then you have every chance of pointing at a slightly wrong position. So a much wiser approach, used with all existing antennas or interferometers, is to wait for the results of the pointing calibration before continuing the observations. The same is true for focus measurements. The SSR document (req. 6.1-R1) mentions a max delay of 0.5 sec to have the calibration results passed to the observing system. > > - Astronomical calibration - reduce astronomical calibrations ... ... > From prior discussions and from the Use Cases, I had understood that phase > rms results would be made available to the observing process (not just to > the Scheduler), so that an executing Scheduling Block could adjust cycle and > dwell times on target and phase calibrator based on the results. Similarly, > an SB might want to terminate once a certain noise level had been > reached. Might not the time constraints be tighter than those on the > telescope calibrations? Yes, the results of the astronomical calibrations have to be made available to the observing process. But I think that the reduction time constraints for the telescope calibration are tighter. For the phase rms, it's a matter of deciding what has to be observed; if the pointing or focus are wrong, the data are affected by errors that you cannot correct (bad pointing, bad focus). Frederic. ------------------------------------------------------------------------------- From jschwarz@eso.org Thu Jun 28 14:51:26 2001 Date: Thu, 28 Jun 2001 15:47:52 +0200 From: Joseph Schwarz Reply-To: alma-sw-ssr@cv3.cv.nrao.edu To: alma-sw-ssr@cv3.cv.nrao.edu Subject: Re: [alma-sw-ssr] Definition pipelines > In the pointing calibration Use Case the pointing offsets are applied in > a loop, the way Steve has written it (BC steps 2-5, remember that this > had to be included in that specific ObservePointingCalibration sequence > diagram). So at the end the offsets are already applied! Focus is the > same though we never wrote the relevant Use Case.
> BC step 3 says "While observations continue, the pipeline separately reduces the [pointing calibration] data sets..."

How long can "observations continue" without getting the results?

-------------------------------------------------------------------------------
From bglenden@cv3.cv.nrao.edu Wed Jul 4 08:11:03 2001
Date: Tue, 3 Jul 2001 15:01:54 -0600
From: Brian Glendenning
Reply-To: alma-sw-ssr@cv3.cv.nrao.edu
To: alma-sw-ssr@cv3.cv.nrao.edu
Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements

Some late comments.

> A. Two fundamentally new aspects of ALMA are the integrated archive and
>    the pipeline, therefore the impact of requirements on these two areas
>    should be considered. In particular the Pipeline will be the most
>    critical aspect of ALMA given that we envision both an effective
>    dynamically scheduled observatory with prompt user feedback mechanism
>    and a scientifically viable archive.

I think this statement is a bit overdone. I think a decent dynamic scheduler
would be possible just by taking into account environmental factors.
Similarly, I think even a raw data archive would be viable, although of
course we want to be able to attract non-traditional observers.

>>>SMyers: This seems to backtrack on the grand vision we had. I don't think
we want to back off to just a raw data archive now! The scheduler is not part
of this document, but I would like to have provisions in case it wants
feedback from the data in the archive (or the pipeline).<<<

> 1.0-R1 The Pipelines shall be able to process all data coming from the
>        array. It must not constitute a bottleneck in the data flow,
>        meaning that several occurrences of the same pipeline shall be
>        able to run in parallel if necessary.

I would add something like: Some projects will require unusually high data
rates or processing requirements. These will require processing outside of
the ALMA system and will be flagged appropriately so they are not processed
by the ALMA pipeline.

>>>SMyers: good. added as PL-1.0-R1.1 <<<

> 1.0-R2 All corrections applied shall be recorded so that any step can be
>        reversed and redone if needed.

I agree with previous comments that this is harder to say than do.

> 2.0-R1 The Calibration Pipeline shall be activated after each scan has
>        been observed.

Mightn't you want to do this more often for some observations, e.g. for an
OVRO-style pointing scan where you want to do a calculation on each point of
the triangle (if I remember correctly)? Similarly, you might want to do
something after each raster line during holography. Maybe "observation"
rather than scan?

>>>SMyers: It is likely that these sub-scan calculations will be handled by
the online system as part of the procedures themselves (like on the VLA). It
would be nice to say that the smallest entity that activates the pipeline is
the scan (or whatever).<<<

> 2.0-R2 The Calibration Pipeline may also be re-invoked at any time with
>        updated parameters or improved data. The results should not
>        immediately overwrite old results so comparison is possible
>        before adopting the new calibration. There will need to
>        be a method for validation and acceptance of calibration
>        updates.

In general, do we want to keep old calibrations "forever" and merely "mark"
the current set?

>>>SMyers: probably, though this is an implementation issue (e.g. flags for
validity or deprecation).<<<

> R2.1 apply the atmospheric calibration to the data

Does this mean WVR? If so it is probably applied by the online system before
the calibration pipeline.
>>>SMyers: good question. Where do we see WVR being applied? Are there
post-WVR atmospheric corrections?<<<

> R3.1 compute the phase rms on the scan timescale

scan -> observation?

>>>SMyers: we should iron out the nomenclature here.<<<

> 2.2-R4 For the pointing and focus measurements, the fitting results
>        should be automatically stored in the telescope
>        parameter file if the fitting error is less than
>        the user/

It would seem dangerous to allow a user-specified threshold to determine what
is accepted as the current system values for things like pointing and focus.
(Do users even want to know about these things?)

>>>SMyers: replaced "user/observatory" with "system".<<<

> 4.0-R1 The Science Pipeline shall be activated after completion of a
>        session.

I don't think this is right. It activates after a breakpoint if the user has
requested feedback, after all observations for a source have completed, or
when the program completes. We don't want to have to needlessly repeat the
nonlinear parts.

>>>SMyers: have replaced session with "breakpoint", with a breakpoint assumed
at the end of the session.<<<

> 4.1-R3 The Science Pipeline shall check and correct the flux scale by
>        using observations of sources of known fluxes. Any effect due to
>        the source being resolved shall be taken into account.

It seems like the second part of this is really an offline requirement.

>>>SMyers: Many of the best calibrators are resolved (planets, HII regions,
even 3C48/3C286 on the VLA!) and system-maintained models can be (are) used
to deal with these cases.<<<

> 4.1-R4 The Science Pipeline shall compute images for each frequency
>        channel, as well as for the continuum emission:

Does the user have an option to not image, e.g., "edge" channels (to keep
within data rate parameters, for example)?

>>>SMyers: add "(non-blanked, possibly user-specified)".<<<

> 4.1-R5 The images shall be deconvolved using the most appropriate
>        algorithm. In case of a complex image, it should be possible to
>        have several algorithms running in parallel, the best
>        (according to criteria TBD) image being eventually selected.

This will lead to an inhomogeneous archive, and determining "best" by some
automated procedure may not be easy. We have to decide if we're producing a
"reference" image or trying to produce "the best" image.

>>>SMyers: we had this (inconclusive) discussion at the last Berkeley
meeting.<<<

> ? maybe Total power from detectors

If in fact it is not saved with the correlation data, do we normally throw it
away, considering it only a debugging tool?

>>>SMyers: I routinely discard this from VLA data in filling. But we should
make no assumptions here.<<<

> Should these have some prefix to indicate that they are for Offline, like
> "O-1.0" etc.?

Yes (or embed the section number).

>>>SMyers: I have adopted PL-xxx and OL-xxx to be easier...<<<

> 1.1-R3 The performance of the package should be quantifiable and
>        commensurate with the data processing requirements of
>        ALMA output at a given time. This should be benchmarked
>        (e.g. "AIPSmarks") and reproduce accurately results for
>        a fiducial set of reduction tasks.

We could be more explicit here, i.e. take a few fiducial problems and say
that the performance should be greater than some value. I think it's also
important to say that the package must be able to cope with data sizes much
larger than main memory (however it chooses to do it).

>>>SMyers: Should we craft these numbers at the meeting or leave them TBD? Is
there an official mechanism for deciding these TBD quantifications?
These are likely to have significant impact on the packages (e.g. Tim's
costing) and I'd hate to have to make them up on the fly!<<<

> 2.1-R3 Multitasking for all interfaces should be available where
>        appropriate.

A bit vague; maybe: It must be possible to run one or more long-running
calculations in the "background". While background tasks are running, normal
interactive activities must still be possible.

>>>SMyers: add OL-2.1-R3.1 <<<

This brings up the subject of locking: The package must support locking data
files so that there is no possibility of one process corrupting a file that
is also being written to by another process. The model should be: "one
writer, multiple readers."

>>>SMyers: add as OL-3.1-R14 <<<

> 2.1-R6 Multiple levels of "undo" should be supported for all tasks.

Again, hard. Some operations can be undone readily, others can't (e.g., if
you want to be able to undo a deconvolution you probably have to keep a copy
of the original image!).

>>>SMyers: see previous discussions and current (12-Jul-01) text.<<<

> 2.3-R4 All functionality of the CLI must also be available in GUI
>        mode.

Not realistic IMO (unless a CLI type-in window counts!).

>>>SMyers: most substantive operations should be scriptable (which is a
CLI).<<<

> 2.3-R5 A graphical data-flow oriented (IDL style) tool assembler
>        would be desirable, perhaps as an advanced GUI for later
>        development.

These are cute in principle, but they don't seem to be used much in practice.

>>>SMyers: they are expensive in practice. Almost certainly so for us :-( <<<

> 2.3-R3 The CLI should have command-line recall and editing

Name completion? Minimum match?

>>>SMyers: add<<<

> 2.3-R4 All functionality of the GUI must also be available in CLI
>        mode.

This direction I believe!

> 2.4-R1 Must have basic programming facilities such as:

IMO, in a scientific command language whole-array arithmetic/processing is at
least desirable.

>>>SMyers: added as per Wim's comments.<<<

> 2.5-R2 There should be a variety of help levels and documentation
>        [...]
>        These should be in printer-friendly formats.

Does this mean no native HTML?

>>>SMyers: add as desirable.<<<

> 3.1-R8 Comprehensive and understandable processing history information
>        for the data must be maintained and be exportable

What does exportable mean? Just that it's written into COMMENT cards in a
FITS file, or something more complicated?

>>>SMyers: I would prefer the option of something more readable, but a FITS
table would be adequate.<<<

> 3.1-R10 When sorting or indexing is desirable for performance
>         enhancement, this should be carried out in a manner
>         transparent to the user.

I personally prefer to manually "purge" rather than having semi-intelligent
garbage collecting turn on at some random point (usually just when I want to
do something else).

>>>SMyers: I was thinking more along the lines of time-baseline vs.
baseline-time indexing for speed-up of gridding etc.<<<

> 3.3-R1 I/O of data must not be a bottleneck for processing, especially
>        for pipeline use. This is especially true if the native format
>        of the package is not used and filling/conversion is necessary.

I think this is really a pipeline requirement. (Of course there are low-FLOPS
operations in the offline package where I/O will be the bottleneck!) Again,
rather than subjective statements like this, I think some objective
tests/times would be better.

>>>SMyers: Some text to that effect added 12-Jul-01.<<<

> 3.6 Images and other Data Products

Not having to transpose cubes is nice.

>>>SMyers: how to word this as a req?
I'm not sure what you mean here.<<<

> 3.7-R2 Imaging data in standard formats from astronomical instruments
>        at different wavelengths should be importable, with the
>        ability to combine these with ALMA data where appropriate.
>        This should be through a set of widely used formats.

Be more explicit about what you mean by combine. I assume you mean that for
each pixel output = f(input1, input2, ...), where f consists of the usual
mathematical and logical functions, taking into account blanking.

>>>SMyers: like AIPS COMB? add "(coadd)".<<<

Blanking support should also be in the requirements: To prevent bad pixel
values from propagating through calculations, blanking must be supported.
Usually, any calculation that produces a pixel from a set of input pixels, at
least one of which is blanked, will result in a blanked output pixel. It is
desirable that blanks not be destructive (the original pixel value is
retained), and that it be possible to turn on and off different blanking
("mask") levels.

>>>SMyers: add as OL-3.6-R3 with implications also in OL-5.1-R7 and
OL-6.5-R3.<<<

> 4.1-R1 The package must be able to reliably handle all of the proposed
>        and future ALMA calibration modes, including but not limited
>        to temperature controlled loads, semi-transparent vanes,
>        apex calibration systems, WVR data, noise injection,
>        fast-switching calibration transfer, planetary observations.

Several or all of these are more requirements for the online
system/calibration pipeline.

>>>SMyers: but have importance here also.<<<

> 4.1-R7 Data editing, calibration, and display of calibration

Besides interactive editing, what about automatic editing?

>>>SMyers: add as 4.1-R9 <<<

> 4.2-R4 Redundancy (e.g. same or crossing baselines) should be used

Do we have enough redundant baselines to make this relevant?

>>>SMyers: there should be a number of u,v crossings at least.<<<

> 4.4-R1 Individual data points must be associated with pointing
>        center information, and one must have the ability to
>        deal with complex scanning strategies.

What does this mean? Just that it can be gridded, or something else?

>>>SMyers: just that it can deal with weird spiral patterns or other scanning
patterns.<<<

> 6.5-R2 The ability to collapse or integrate over sub-dimensions
>        of data cubes in order to form "moments" is required.

Add: Interactive and automatic facilities (windowing, S/N-based blanking,
...) to avoid degrading the S/N in the moment calculations must be provided.

>>>SMyers: add as OL-6.5-R3.1 generalized beyond moments. <<<

> 7.2-R3 Both contour plots with variously colored and styled lines
>        and false color maps should be possible; it should also be
>        possible to produce RGB overlays (i.e. one layer gets
>        assigned intensity scales of red, another one of green,
>        and one of blue).

While useful, Hue/Intensity/Saturation is probably the more interesting color
"space" to be able to do this in.

>>>SMyers: add.<<<

Cheers, Brian

-------------------------------------------------------------------------------
From bglenden@cv3.cv.nrao.edu Wed Jul 4 08:11:24 2001
Date: Tue, 3 Jul 2001 15:18:36 -0600
From: Brian Glendenning
Reply-To: alma-sw-ssr@cv3.cv.nrao.edu
To: alma-sw-ssr@cv3.cv.nrao.edu
Subject: Re: [alma-sw-ssr] Draft v2.0 of Pipeline and Offline Requirements

> > 1.0-R2 All corrections applied shall be recorded so that any step can
> >        be reversed and redone if needed.

> I agree with previous comments that this is harder to say than do.

Oops - I got a sign wrong. This is *easier* to say than do.
Cheers, Brian

-------------------------------------------------------------------------------
From bclark@aoc.nrao.edu Thu Jul 12 13:58:19 2001
Date: Mon, 9 Jul 2001 12:06:04 -0600 (MDT)
From: Barry Clark
Reply-To: alma-sw-ssr@cv3.cv.nrao.edu
To: alma-sw-ssr@cv3.cv.nrao.edu
Subject: Re: [alma-sw-ssr] Definition pipelines

Frederic Gueth wrote:
>
> In the current draft, the calibration pipeline does not ignore the
> observations of the astronomical source: it applies (in the sense:
> store the appropriate quantity in the relevant header) the atmospheric
> calibration to all incoming observations.
>

It is a matter of definition, but I would regard the processing of the
unknown sources as part of the Calibration phase of the Science and/or Quick
Look pipelines. (This is in aid of setting machine priorities - this has to
be done on the Quick Look timescale, not on the same timescale as the other
items mentioned.)

> > products (depending on the type of observation and type of calibrator),
> > and places the results in a location where they can be accessed both by
> > the real-time system and by other reduction procedures: 1) an antenna
> > pointing offset for all antennas. 2) Tsys for all antennas as a function
> > of time. 3) Sideband ratios for all antennas. 4) Antenna based flux
> > calibration (TSYSJY) from a flux calibrator. 5) Antenna based bandpass
> > calibration. 6) Antenna based polarization leakage terms (with the usual
> > indeterminate offsets from a single observation). 7) Antenna based IF
> > phase differences (from a strongly polarized calibrator). 8) Antenna
> > based phase calibration (with noise and atmospheric rms).
>
> The list should be left open in such a general description. For instance,
> the focus offset or the antenna positions derived from a baseline
> measurement are missing.
>

I don't think these two functions are properly part of a pipeline. They will
need a little human supervision, because of the very serious consequences if
they are a bit wrong. But we need to specify that the Package will have
tools to do these, which I do not think we have there.
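For illustration, a minimal sketch (with hypothetical names, not an actual
Package interface) of the core calculation inside such an antenna-position
tool: a position error db shows up as a residual phase of roughly
(2*pi/lambda) db.s_hat toward a calibrator in direction s_hat, so phases
measured toward calibrators in several directions give a small linear
least-squares problem. A real tool would add weighting, data editing and the
human inspection mentioned above.

    import numpy as np

    def solve_position_offset(directions, phases, wavelength):
        """Least-squares antenna position offset from residual phases.

        directions : (N, 3) unit vectors toward N calibrators
        phases     : (N,) residual phases for one antenna, in radians
        wavelength : observing wavelength, in the desired length unit
        """
        directions = np.asarray(directions, dtype=float)
        phases = np.asarray(phases, dtype=float)
        # One row per calibrator: direction cosines plus a constant column
        # for the per-antenna instrumental phase offset.
        design = np.hstack([directions, np.ones((len(phases), 1))])
        coeffs, *_ = np.linalg.lstsq(design, phases, rcond=None)
        # phases ~ (2*pi/wavelength) * (dx, dy, dz) . s_hat + const
        return coeffs[:3] * wavelength / (2.0 * np.pi)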