1 General Syntax

An MSSelection expression consists of a comma-separated list of specifications. Specifications are typically strings or numbers, and the two can be mixed in a single list. Elements of the list that can be converted to integers are treated as integer index specifications. Elements that do not parse as numbers are treated as strings. Where appropriate, strings are matched against names. Depending on its content, a string is used as a regular expression or as a pattern. Where appropriate, physical quantities (numbers with appropriate units) can also be used.

A blank selection expression is interpreted as "no selection to be applied to the MS". Hence a blank expression effectively implies "select all".
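The list handling described above can be sketched in Python. This is only an illustration of the stated rules, not the MSSelection parser; the function name `split_selection` and the sample names are hypothetical.

```python
def split_selection(expr):
    """Split a comma-separated selection expression into indices and names.

    Returns None for a blank expression (interpreted as "select all"),
    otherwise a tuple (indices, names).
    """
    if expr.strip() == "":
        return None  # blank expression: no selection is applied
    indices, names = [], []
    for token in expr.split(","):
        token = token.strip()
        try:
            # Elements convertible to integers are index specifications.
            indices.append(int(token))
        except ValueError:
            # Everything else is kept as a string for name matching.
            names.append(token)
    return indices, names


# Hypothetical usage: two indices mixed with one name.
print(split_selection("0, 3, 3C286"))
```

Ranges, units and pattern strings are deliberately left out of this sketch; they are covered by the rules in the following subsections.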

1.1 Number Format

Integers can be of any length (in terms of characters) and are composed of the characters 0-9. Where appropriate, negative values can be given using the '-' character. Floating point numbers can be in the standard format or in the mantissa-exponent format (e.g. 10.56e-1). If a floating point number is given where only integers are expected (e.g. indexes), the floating point value is truncated to an integer.
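A minimal Python sketch of these number rules follows. The regular expressions and the name `parse_index` are assumptions made for illustration, and truncation is assumed to drop the fractional part (as Python's `int()` does); the actual parser may differ in detail.

```python
import re

# Assumed token shapes: optional '-' sign, digits 0-9, optional exponent.
INTEGER = re.compile(r"-?[0-9]+")
FLOAT = re.compile(r"-?([0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)([eE][-+]?[0-9]+)?")


def parse_index(token):
    """Parse a numeric token where an integer index is expected."""
    if INTEGER.fullmatch(token):
        return int(token)
    if FLOAT.fullmatch(token):
        # int() discards the fractional part of the float.
        return int(float(token))
    raise ValueError(f"not a number: {token!r}")


print(parse_index("42"))        # plain integer
print(parse_index("10.56e-1"))  # mantissa-exponent form, value 1.056
```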

1.2 Range Specification

A range of numbers (integers or real numbers) can be given in the format N0~N1. Integer ranges are expanded into a list of integers from N0 to N1 (both inclusive). A range of real numbers selects all values between N0 and N1 (including the boundaries). E.g.

Integer ranges:

Floating point ranges:
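The two range semantics can be sketched in Python as follows. The helper names `expand_int_range` and `in_real_range` are hypothetical, and the sample range strings are made up for illustration only.

```python
def expand_int_range(spec):
    """Expand an integer range 'N0~N1' into the list N0, ..., N1 (inclusive)."""
    start, stop = (int(part) for part in spec.split("~"))
    return list(range(start, stop + 1))


def in_real_range(value, spec):
    """Return True if value lies within the real range 'N0~N1', boundaries included."""
    low, high = (float(part) for part in spec.split("~"))
    return low <= value <= high


print(expand_int_range("3~6"))         # [3, 4, 5, 6]
print(in_real_range(2.0, "1.5~2.25"))  # True
```

An integer range produces an explicit list of indices, whereas a real range acts as a closed interval test on the stored values.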

1.3 Units

Wherever appropriate, units can optionally be specified. Values with units are converted to the units used in the Measurement Set (which uses the MKS system). For ranges, the unit is specified only once (at the end) and applies to both range boundaries. E.g.
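As a rough illustration of applying one trailing unit to both boundaries of a range, consider the Python sketch below. The conversion table `TO_SI`, the accepted unit spellings and the function name are assumptions for this example, not the units actually recognized by MSSelection.

```python
import re

# Hypothetical conversion factors to MKS base units (Hz and m).
TO_SI = {"Hz": 1.0, "kHz": 1e3, "MHz": 1e6, "GHz": 1e9, "m": 1.0, "km": 1e3}

# Two numbers separated by '~', followed by a single unit name.
RANGE_WITH_UNIT = re.compile(
    r"\s*([-+0-9.eE]+)\s*~\s*([-+0-9.eE]+)\s*([A-Za-z]+)\s*"
)


def parse_range_with_unit(spec):
    """Convert 'N0~N1unit' into a (low, high) pair in MKS units."""
    match = RANGE_WITH_UNIT.fullmatch(spec)
    if match is None:
        raise ValueError(f"not a range with a unit: {spec!r}")
    low, high, unit = match.groups()
    factor = TO_SI[unit]  # the single unit applies to both boundaries
    return float(low) * factor, float(high) * factor


print(parse_range_with_unit("100~200MHz"))
```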

1.4 Strings

String matching can be done in three ways. Any component of a comma-separated list that cannot be parsed as a number, a number range or a physical quantity is treated as a regular expression or a literal string. If the string contains any of the '*', '{', '}' or '?' characters, it is treated as a pattern (a simplified form of regular expression, used in most UNIX commands for string matching). Otherwise it is treated as a literal string and used for exact matching. As a result, in most cases the user does not need to supply any special delimiters for literal strings, regular expressions or patterns. However, if the string must be matched exclusively as a regular expression, it can be enclosed within a pair of '/' delimiters. A string enclosed within double quotes ('"') is used exclusively for pattern matching. Patterns are internally converted to equivalent regular expressions before matching. For details of regular expressions and patterns, read elsewhere (e.g. use the command "info regex").

Strings can include any character except the following:

   ','   ';' '"'  '/'  ':' and NEWLINE

(these are reserved characters in the MSSelection expression syntax). Strings that do not contain any of the characters used to construct regular expressions or patterns are used for exact matches. Although names containing the above reserved characters are strongly discouraged in the database, if one DOES choose to include reserved characters as part of names etc., those names can only be matched against quoted strings (since regular expressions and patterns are a superset of literal strings, i.e. a literal string is also a valid regular expression). This leaves '"', '*', '?', '{' and '}' as the printable characters that cannot be part of a name (i.e., a name containing any of these characters can never be matched in an MSSelection expression). If a need is felt to include these as well, an escape mechanism can be included later (but I would prefer to enforce that at least these characters not be part of any name!). Following are some examples of strings/regular expressions/patterns:
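The classification rules of this section can be sketched in Python. This is a sketch under stated assumptions: the names `classify` and `matches` and the sample tokens are hypothetical, regular expressions are assumed to match the whole name, and patterns are evaluated with Python's `fnmatch`, which does not implement '{' '}' alternation.

```python
import fnmatch
import re

PATTERN_CHARS = set("*?{}")


def classify(token):
    """Return (kind, body), where kind is 'regex', 'pattern' or 'literal'."""
    if len(token) >= 2 and token[0] == "/" and token[-1] == "/":
        return "regex", token[1:-1]    # /.../ forces a regular expression
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return "pattern", token[1:-1]  # "..." forces pattern matching
    if PATTERN_CHARS & set(token):
        return "pattern", token        # contains one of * ? { }
    return "literal", token            # exact match


def matches(token, name):
    """Match a name against a selection token (assumed semantics)."""
    kind, body = classify(token)
    if kind == "regex":
        return re.fullmatch(body, name) is not None
    if kind == "pattern":
        # fnmatch handles * and ? only; braces are taken literally here.
        return fnmatch.fnmatchcase(name, body)
    return name == body


for token in ["3C286", "3C*", '"3C 2?6"', "/3C2[0-9]+/"]:
    print(token, classify(token)[0])
```

Checking the delimiters before the pattern characters means that a '/'-delimited string is always handled as a regular expression, even when it contains '*' or '?'.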

1.5 White Spaces (blanks in expressions)

In most cases, blanks are insignificant (i.e., inserting blanks anywhere in the expression has no effect), except in Field Selection Expressions (see Section 4), where blanks are allowed as part of field names. Blanks around the delimiting characters (',', ';', '&' etc.) are ignored. For field names, blanks after the first valid name character and before the last valid name character are included as part of the name. Hence