Date: Mon, 19 Feb 2007 15:03:16 -0700
From: Joseph P. McMullin
Subject: ms data selection

Current state, from Sanjay:

General Syntax:
===============

White-spaces are eaten (i.e., they are considered a silent part of the syntax - they are the garnishing!).

All lists of basic selection specification-units are comma-separated and can be of any length.

All integers can be of any length (in terms of characters) and are composed of the characters 0-9. Floating point numbers can be in the standard format (DIGIT.DIGIT, DIGIT., .DIGIT) or in the mantissa-exponent format. In places where only integers make sense (e.g. IDs), if a floating point number is given, only the integer part is used.

A range of numbers (integers or real numbers) can be given in the format N0~N1. An integer range is expanded into a list of integers from N0 (inclusive) to N1 (inclusive). A range of real numbers selects all values present for the appropriate parameter in the Measurement Set between N0 and N1 (including the boundaries).

Wherever appropriate, units can optionally be specified. The specified units are used to convert the values they apply to into the units used in the Measurement Set (which is the MKS system, I think). For ranges, the units are specified only once (at the end) and apply to both range boundaries.

String matching can be done in three ways. Any component of a comma-separated list that cannot be parsed as a number, a number range, or a number/range followed by an appropriate unit is treated as a literal string. These strings can include any character except the following: ',' ';' '"' '/' NEWLINE (since these are also part of the MSSelection syntax). Literal strings are for exact matches. Strings enclosed in a pair of double quotes ('"') are treated as patterns (patterns are simpler than regular expressions for string matching). Patterns are internally converted to equivalent regular expressions before matching.
Strings enclosed between a pair of slashes ('/') are treated as literal regular expressions.

It is highly discouraged for strings in the MS that are used for selection to include the reserved characters mentioned above. However, if one DOES choose to include the reserved characters as part of names etc., those names must be given within quotes. This leaves the double-quote character ('"') as the only printable character that cannot be part of a name. If a need is felt to include that as well, an escape mechanism can be added later (but I would prefer to enforce that *at least* the double-quote character not be part of any name!).

Generic syntax for time selection:
=================================

T0, T1 and dT in the following can be specified as YY/MM/DD/HH:MM:SS.FF. Fields (i.e., YY, MM, DD, HH, MM, SS and FF), starting from left to right, can be omitted, and they will be replaced by context-sensitive defaults as explained below.

Time selection:
==============

Syntax:
======

1. time='T0~T1'; Select all time stamps from T0 to T1. Fields missing in T0 are replaced by the fields in the time stamp of the first valid row in the MS. Fields missing in T1 are replaced by the corresponding fields of T0 (after its defaults are set).

2. time='T0'; Select all time stamps that are within an integration time of T0. The integration time is determined from the first valid row (more rigorously, an average integration time should be computed). Defaults for the missing fields of T0 are set as in (1).

3. time='T0+dT'; Select all time stamps starting from T0 and ending with time stamp T0+dT. Defaults of T0 are set as usual. Defaults for dT are set from the time corresponding to MJD=0, i.e. dT is a specification of a length of time from the nominal "start of time" (and I don't mean the z corresponding to the Big Bang or something! AIPS++/CASA has to be sensitive to the sentiments of Creationists, Steady-state folks, etc. as well.
And in that sense the 'J' in MJD also has its baggage - but I am leaving it at this).

4. time='>T0'; Select all times greater than T0.

   time='<[...]

[...] ">ID", all field IDs greater than ID are selected. Similarly for ">UVDIST", all rows with uv-distance greater than the given uv-distance (converted to the appropriate units) are selected. When specified in the format ">ID", it will select all SPWs with ID greater than the specified value.
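
As a rough illustration of how the general syntax applies to an ID list, here is a small Python sketch. It is not the MSSelection parser. The function name expand_id_selection and the available_ids argument are made up for this example, and ">ID" is assumed to mean strictly greater than ID. It covers only comma-separated items, N0~N1 integer ranges, the "integer part only" rule for floating point numbers, and the ">ID" form.

```python
import re


def _to_int(text):
    """Return the integer part of a number given as text (e.g. '7.9' -> 7)."""
    try:
        return int(text)          # plain integers, of any length
    except ValueError:
        return int(float(text))   # floats: only the integer part is used


def expand_id_selection(expr, available_ids):
    """Expand an ID selection string into a sorted list of integer IDs.

    Hypothetical helper for illustration only.
    available_ids is the set of IDs present in the data; it is needed
    only to resolve the open-ended '>ID' form.
    """
    # White space is not significant, so drop it before splitting.
    compact = re.sub(r"\s+", "", expr)
    selected = set()
    for item in compact.split(","):
        if not item:
            continue
        if item.startswith(">"):
            # '>ID': every available ID greater than ID (assumed strict).
            bound = _to_int(item[1:])
            selected.update(i for i in available_ids if i > bound)
        elif "~" in item:
            # 'N0~N1': integer range, inclusive at both ends.
            n0, n1 = (_to_int(part) for part in item.split("~", 1))
            selected.update(range(n0, n1 + 1))
        else:
            selected.add(_to_int(item))
    return sorted(selected)


if __name__ == "__main__":
    print(expand_id_selection("0, 2~4, 7.9", range(10)))
    print(expand_id_selection("1~3, >8", range(10)))
```

Note that this sketch expands ranges and single values without checking them against available_ids; a real selection would also have to decide what to do with IDs that are not in the data.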