CASA CalTable Refactor
**********************

2008Jul22 - (orig) for discussion (gmoellen)
2011Apr29 - minor changes, still for discussion (gmoellen)
2011Jun06 - corrections (gmoellen)
2011Jul03 - corrections, added intro (gmoellen)

Introduction:

The current CASA CalTable is poorly designed, and its current
implementation is clumsy in a variety of ways, including an excess of
unused (often redundant) Table columns, non-optimal I/O performance,
and lack of portability.  This document describes a revised and
simplified CalTable definition that retains the best of the current
design (mainly the columns we currently use) and generalizes it in
several ways to improve efficiency and portability.  Most notably,
the former (complex) GAIN column will become a more general (real)
PARAM column capable of storing both real and complex calibration
parameters (from which complex Jones or Mueller matrix elements are
calculated).  The CAL_DESC_ID of the current CalTable will be dropped
in favor of direct indexing with SPECTRAL_WINDOW_ID, and meta-info
subtables will be adopted directly from the MS to support full
calibration identification and portability.  As a result, the
handling of CalTables in the calibration code will be considerably
streamlined and easier to maintain.
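As a minimal illustration of storing complex parameters in a real PARAM
array, the sketch below packs complex gains as pairs of real rows and
recovers them again.  The function names and the (real, imaginary)
row-pair ordering along the parameter axis are assumptions made for this
example only; this document does not fix the packing order.

```python
# Hypothetical helpers, not part of CASA.  Assumes each complex value is
# stored as two consecutive real rows (real part, then imaginary part)
# along the parameter axis.

def pack_complex(gains):
    """Pack gains[npol][nchan] (complex) into param[2*npol][nchan] (float)."""
    param = []
    for per_chan in gains:
        param.append([g.real for g in per_chan])
        param.append([g.imag for g in per_chan])
    return param


def unpack_complex(param):
    """Inverse of pack_complex: rebuild complex values from row pairs."""
    if len(param) % 2 != 0:
        raise ValueError("expected an even number of real parameters")
    gains = []
    for i in range(0, len(param), 2):
        re_row, im_row = param[i], param[i + 1]
        gains.append([complex(r, m) for r, m in zip(re_row, im_row)])
    return gains


# Dual-polarization gain for one antenna over three channels.
gains = [
    [1.0 + 0.5j, 0.9 + 0.4j, 1.1 - 0.2j],  # polarization 0
    [0.8 - 0.1j, 1.0 + 0.0j, 0.7 + 0.3j],  # polarization 1
]
param = pack_complex(gains)
assert len(param) == 4 and len(param[0]) == 3  # 4 real params, 3 channels
assert unpack_complex(param) == gains
```

With this layout a dual-polarization gain occupies four real parameters
per channel, and the calibration code would rebuild the complex values
on read.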

The definition:

MAIN table
==========

Table Keywords:
--------------
  Type      S  Calibration type (VisCal enum)
  MSName    S  Parent MS
  PolBasis  S  Polarization Basis ('lin' or 'circ')

Columns:
-------
 Keys:
  TIME                D  Solution midpoint
  FIELD_ID            I  Refers to FIELD subtable
  SPECTRAL_WINDOW_ID  I  Refers to SPECTRAL_WINDOW subtable
  ANTENNA1            I  Refers to ANTENNA subtable
  ANTENNA2            I  Refers to ANTENNA subtable[a]

 Non-keys:
  INTERVAL            D  Solution interval
  SCAN_NUMBER         I  MS Scan number

 Parameters[b]:
  PARAM     F[NPAR,NCHAN][c]  The cal solution params (type-dependent)
  PARAMERR  F[NPAR,NCHAN]     Cal Solution Error[d]
  FLAG      B[NPAR,NCHAN]     Cal Solution flag (per param)[d]
  SNR       F[NPAR,NCHAN]     Solution SNR[d]
  WEIGHT    F[NPAR,NCHAN]     Solution WEIGHT[d]

Subtables:
=========
  ANTENNA          Duplicates MS/ANTENNA
  FIELD            Duplicates MS/FIELD
  SPECTRAL_WINDOW  Duplicates MS/SPECTRAL_WINDOW + some type-dependent
                   adjustments, e.g., channel decimation, etc.
  HISTORY          History info; cf MS/HISTORY

[a] For baseline-based solutions; stores refant for antenna-based
    solutions.

[b] Parameter columns are variable-shape ArrayColumns, to handle
    changing nchan, etc.  These columns should have keywords that
    describe the parameter axis for 3rd-party information (details of
    these keywords TBD).  Note that complex parameters (e.g., complex
    gain solutions) will be packed in real arrays in the CalTable, and
    realized as complex numbers in the calibration code.

[c] NCHAN and NPAR (# parameters) are defined and managed by the
    particular calibration type.  E.g., an ordinary (dual-polarization)
    gain solution (GJones) will have NPAR=4, to store two complex gain
    values.

[d] PARAMERR, FLAG, SNR, and WEIGHT are as determined by the solving
    code.  FLAG may be updated by flagging applications (e.g., in
    plotting), and WEIGHT by averaging code.  For non-solved CalTables
    (e.g., gencal), these parameters will be set to nominal values.

Considerations:

1. Nominally, CalTable subtable indexing will be independent of the MS
   indexing.
   (At first, they will be identical, until full assignment
   capabilities are deployed.)

2. Selection will be via "CalSelection" (cf MSSelection), wherein
   user-supplied (or MS-supplied) meta-data is matched in the
   subtables to yield the indices that should be selected in the main
   table.

3. Append/concat?  Append/concat for CalTables derived from the same
   MS is straightforward, and will be the only append/concat mode
   supported.

4. Additional ID columns will be added in future, as needed, e.g.,
   FEED_ID, OBSERVATION_ID, ARRAY_ID.

5. Extensions for specialized types (e.g., GSPLINE, BPOLY, etc.): only
   add type-specific columns when absolutely necessary.

6. Sorting.  Physical (disk) CalTables will be sorted by
   SPECTRAL_WINDOW, TIME, and ANTENNA1(/ANTENNA2) (slowest to
   fastest).  This sort order will tend to satisfy parallelization and
   general solving concerns vis-a-vis the likely subdivision (by SPW)
   and traversal (in TIME order) of MSs, and is also generally
   convenient for interpolation.  It may be desirable to store
   different SPWs as different TableColumn hypercubes, even when this
   is not required by varying shape, in order to better isolate them
   for parallelization.

7. I/O.  CalTables _are_ CASA Tables (like the MS), and so CalTable
   instances will be specialized Table instances.  Usually, I/O will
   be achieved by copying a disk Table to a MemoryTable (and
   vice-versa).  Per-column I/O will be supported as an option via
   putColumn-like operations (e.g., to support revision of the FLAG
   column by a flagging application, or revision of the PARAM column
   by adjustment applications like fluxscale, etc.).  Iterated I/O
   will be addressed in the future, to support the (as yet unseen)
   'too-big-for-memory' CalTable, a problem that is likely to be
   largely ameliorated by SPW-subdivision for parallelization.
   Merging of CalTables (including meta-info reconciliation) at the
   I/O step must be supported.

8. Deployment: VisCals (the fundamental class describing an instance
   of a particular calibration type) will own the (MemoryTable)
   CalTable object, and share it with (improved) interpolation and
   interface (cf the current CalSet) classes.  The interface class
   (CalTableInterface?) will provide means of accessing individual
   solutions (those rows in the CalTable corresponding to a single
   solution from one spw), to permit recording of solutions as they
   are obtained by solving, or short sequences of them for
   interpolation purposes.  The interface layer will also be
   responsible for translating (implicitly) the MS indices to CalTable
   indices according to directives provided by the user (e.g., spwmap,
   plus enhancements supporting similar mappings on the antenna and
   field axes).

9. Solution flagging convention (CAS-956): A flagged solution that
   occurs in a CalTable indicates a datum that was selected for the
   calibration solve, but for which the solve failed due to flagged
   data or poor convergence/SNR.  Solutions which do not occur in the
   CalTable indicate data that were explicitly absent or selected
   against when solving.  Only explicitly flagged solutions will flag
   data upon application; absent solutions (e.g., a subset of antennas
   that were absent for some solution intervals) will be spanned by
   interpolation, subject to a user-specified (TBD) interpolation
   timescale.  (NB: CURRENTLY, data in spws with absent calibration
   are NOT flagged, but for otherwise present spws, missing subsets of
   antennas have explicit flagged solutions in the cal table, so ARE
   flagged.  The present refactoring effort will simplify
   flag-by-calibration behavior and make it more intuitive.)

10. Plotting.  CalTable plotting will be supported in the plotms
    application, which currently supports plotting MS data and
    information.  The close alignment of the CalTable with the MS (in
    particular, the subtables) will enable considerable code reuse in
    plotms for CalTable support.

11. Alignment with ALMA CalDM?  We may need an export mechanism to
    satisfy any requirement to deposit CASA-sourced calibration in the
    ALMA archive.  As needed, we may also wish to import
    TELCAL-sourced calibration as CASA CalTables.  These will probably
    be straightforward.  (Need to study this a bit more carefully.)
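
The solution flagging convention (CAS-956) described in the
considerations can be illustrated with a short sketch.  Everything named
here (the Solution record, application_action, max_gap, and the returned
strings) is hypothetical and chosen for illustration.  The sketch assumes
that a solution "covers" a datum when the datum time lies within half an
INTERVAL of the solution TIME, and it makes no claim about what happens
beyond the interpolation timescale, which this document leaves TBD.

```python
from dataclasses import dataclass


@dataclass
class Solution:
    time: float      # solution midpoint (TIME column)
    interval: float  # solution interval (INTERVAL column)
    flagged: bool    # True if the solve failed (FLAG column)


def application_action(solutions, t, max_gap):
    """Decide what to do with a datum at time t for one antenna and spw.

    solutions : rows present in the CalTable (absent solutions have no row)
    max_gap   : assumed user-specified interpolation timescale
    """
    # A row that covers the datum decides directly: an explicitly flagged
    # solution flags the data, an unflagged one calibrates it.
    for s in solutions:
        if abs(t - s.time) <= s.interval / 2:
            return "flag" if s.flagged else "calibrate"

    # No row covers the datum: the solution is absent, so span the gap by
    # interpolating from unflagged solutions within the timescale.
    good = [s for s in solutions if not s.flagged]
    if any(abs(t - s.time) <= max_gap for s in good):
        return "interpolate"

    # Beyond the timescale the behavior is not specified in this document.
    return "unspecified"


rows = [
    Solution(time=0.0, interval=10.0, flagged=False),
    Solution(time=10.0, interval=10.0, flagged=True),
    Solution(time=40.0, interval=10.0, flagged=False),
]
print(application_action(rows, 1.0, max_gap=20.0))
print(application_action(rows, 11.0, max_gap=20.0))
print(application_action(rows, 25.0, max_gap=20.0))
```

The point of the sketch is the asymmetry between the two cases: a
flagged row is a positive statement that the solve failed, whereas a
missing row carries no such statement and is bridged from its neighbors.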