The Scientific Software Support (SSS) team in Socorro has created an object model for an astronomical source. (Model details may be found here.) One important aspect of this model is that it may be expressed as text in an XML format. XML is the preferred format for expressing the model as text and for creating the model from text. This model will be used in EVLA software and perhaps elsewhere. One of the objects in the model is the SourceCatalog. The Green Bank Telescope (GBT) team also has a source catalog construct and a text format that can be used to create the catalog. The remainder of this document pertains to progress being made by SSS staff in creating source catalogs by parsing text files in the GBT format.
It is evident from the GBT text format that there are some small model mismatches between GBT and EVLA. We believe we will be able to modify the EVLA Source Model to accommodate some of these differences.
The GBT Parser is designed to survive as many parsing errors as possible. The goal is to report all the errors in a file in one pass. The parser will also populate a catalog to the best of its ability, no matter how many errors were found.
There is one exception to this rule, and it pertains to the new syntax that we needed to introduce. The new syntax demands that the following text be the first active (i.e., non-commented) line in the file:
catalogType=GBT
(spaces are permitted on either side of the "=" sign). We envision creating more parsers, so we need a signal at the beginning of the file that will help the software determine the proper parser. This is the only additional syntax introduced.
We introduced new optional syntax that is designed to help us pick the correct parser for a given text file format. For GBT, that syntax is:
catalogType=GBT
If the "catalogType=xxx" line is present, it must be the first active line (i.e., the first line that is neither a comment nor blank) in the file.
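The parser-selection rule can be sketched as follows. This is a minimal illustration, not the SSS implementation: the function names are hypothetical, and it assumes that comment lines begin with "#" and that the keyword is matched case-sensitively.

```python
import re

# Matches "catalogType=GBT", allowing spaces on either side of "=".
CATALOG_TYPE_RE = re.compile(r"^catalogType\s*=\s*(\S+)$")


def active_lines(text, comment_prefix="#"):
    """Yield stripped lines that are neither blank nor comments.

    The "#" comment prefix is an assumption made for this sketch.
    """
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith(comment_prefix):
            yield line


def detect_catalog_type(text, default=None):
    """Return the catalog type named on the first active line, else default.

    A catalogType line that appears later in the file is deliberately
    not honored, because the rule requires it to be the first active line.
    """
    first = next(active_lines(text), None)
    if first is None:
        return default
    match = CATALOG_TYPE_RE.match(first)
    return match.group(1) if match else default
```

A caller could use the returned value as a key into a table of available parsers and fall back to a default parser when the result is None.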
GBT supports four formats: SPHERICAL, CONIC, EPHEMERIS, & NNTLE. The GBT Parser aims to support all of these formats. At this point, all the formats are recognized and have stubbed parsers, but only the SPHERICAL format has a functional parser. The SSS Source Model supports the data brought by the SPHERICAL, CONIC, & EPHEMERIS formats; the NNTLE format might be transformable into something the SSS Source Model recognizes, such as orbital elements. More study is needed here.
Keywords Common to All Formats
This is the status of the GBT Parser's support for those keywords that are of the form key=value and that are not specific to any particular format:
Fully supported. All four of the valid formats are recognized by the catalog parser and cause the correct source parser to be invoked. It looks like the GBT requirements restrict the format line to the top of the data file. The SSS's GBT Parser will allow this line to appear anywhere in the file and will cause the catalog parser to change its source parser. If an invalid format value is found, the parser will report the error and will use the most recently used format. If no format was specified anywhere in the file, the SPHERICAL format is used as a default.
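The format-switching behavior described above, together with the report-and-continue approach to errors, can be sketched as follows. This is an illustration under stated assumptions rather than the parser's actual code: the function name is hypothetical, the keyword is assumed to be spelled FORMAT, and matching is assumed to be case-insensitive.

```python
VALID_FORMATS = ("SPHERICAL", "CONIC", "EPHEMERIS", "NNTLE")


def track_formats(lines, default="SPHERICAL"):
    """Pair each non-format, non-blank line with the format in effect.

    Returns (tagged_lines, errors). An invalid format value is recorded
    as an error and the most recently used format stays in effect, so
    parsing carries on to the end of the input.
    """
    current = default
    tagged = []
    errors = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        key, sep, value = line.partition("=")
        if sep and key.strip().upper() == "FORMAT":
            candidate = value.strip().upper()
            if candidate in VALID_FORMATS:
                current = candidate  # later sources use the new format
            else:
                errors.append(
                    f"line {number}: invalid format {value.strip()!r}; "
                    f"keeping {current}"
                )
            continue
        if line:
            tagged.append((number, current))
    return tagged, errors
```

Collecting errors in a list instead of raising on the first one is what lets a single pass report every problem in the file.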
Fully supported. The initial stage of parsing merely holds the text after the equals sign in the HEAD = ... line. The SPHERICAL source parser will interpret the headings (see the Spherical Format section, below).
Fully supported. The value is used as the source's name.
Partially supported. This keyword has a set of valid values. Here are the valid values and the effect they have on the building of an SSS Source Catalog by the GBT Parser:
Any values other than those above are reported and ignored. This means that the parser's coordinate mode is left in its current state. If no COORDMODE was ever specified, J2000 is used as the default.
Not supported (yet?).
Fully supported. The value of VELDEF comes into play only if velocity values are later specified in the source data. The value found here sets both the velocityFrame and velocityConvention properties of SourceVelocity. Any values other than the legal values are reported and ignored. This means that the parser's velocity frame and convention properties are left in their current states. If no VELDEF was ever specified, OPTICAL is used as the default convention and BARYCENTRIC is used as the default frame. (To Do: ask GBT for appropriate defaults.)
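The VELDEF handling can be sketched as follows. The class and function names are hypothetical, and the table of legal values is a placeholder invented for illustration only; the real legal values come from the GBT specification and are not reproduced here. Only the defaults (OPTICAL and BARYCENTRIC) and the report-and-ignore behavior are taken from the description above.

```python
from dataclasses import dataclass

# Placeholder table for illustration only; these are NOT taken from the
# GBT specification. Each entry maps a VELDEF value to
# (velocityConvention, velocityFrame).
EXAMPLE_LEGAL_VALUES = {
    "EXAMPLE-A": ("OPTICAL", "BARYCENTRIC"),
    "EXAMPLE-B": ("RADIO", "TOPOCENTRIC"),
}


@dataclass
class VelocityState:
    """Velocity settings in effect; the defaults are those named in the text."""
    convention: str = "OPTICAL"
    frame: str = "BARYCENTRIC"


def apply_veldef(state, value, legal_values, errors):
    """Update state from a VELDEF value, or report it and leave state alone."""
    key = value.strip().upper()
    if key not in legal_values:
        errors.append(f"invalid VELDEF value {value.strip()!r}; ignored")
        return state
    state.convention, state.frame = legal_values[key]
    return state
```

Passing the table of legal values in as an argument keeps the sketch independent of the actual list, which can be substituted once it is confirmed.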
When this format is detected, the Source uses a PolynomialPosition object to hold its position information.
Column Keyword Support
Fully supported.
Fully supported. The units are always taken to be km/s.
Not supported. Our SourceVelocity object does have a frequency range over which it is valid, but the setting of frequencies will be part of the SSS Resource Model.
Fully supported.
Partially supported. As mentioned above, SSS may alter its Source Model to accommodate AZ/EL specification. For now, these values are parsed as if they were RA & DEC.
Partially supported. As mentioned above, SSS may alter its Source Model to accommodate GLON/GLAT specification. For now, these values are parsed as if they were RA & DEC.
Any column not recognized as one defined by GBT's specifications will be treated as a user-defined column. The parser will make note of these columns so that clients may inspect them if they wish. The user-defined values will be saved for each source.
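The treatment of user-defined columns can be sketched as follows. The function names are hypothetical, the set of recognized headings is a partial, illustrative one (NAME in particular is assumed), and whitespace-separated columns are also an assumption of this sketch.

```python
# Partial, illustrative set of recognized headings; the full list is
# defined by GBT's specifications.
KNOWN_COLUMNS = frozenset({"NAME", "RA", "DEC", "AZ", "EL", "GLON", "GLAT"})


def split_headings(head_text):
    """Return (all_headings, user_defined_headings) from a heading line."""
    headings = head_text.split()
    user_defined = [h for h in headings if h.upper() not in KNOWN_COLUMNS]
    return headings, user_defined


def read_row(headings, user_defined, row_text):
    """Split one source row into recognized values and user-defined values."""
    values = dict(zip(headings, row_text.split()))
    known = {k: v for k, v in values.items() if k not in user_defined}
    extra = {k: v for k, v in values.items() if k in user_defined}
    return known, extra
```

Keeping the user-defined values in a separate mapping per source lets clients inspect them without the parser having to understand them.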
Column Keyword Validation
Some of the column headings imply information about the values of other column headings and/or about the values of other keywords. This is a summary of the cross-validation done for some of the keywords. (Any keyword not listed below has no cross-validation with other keywords.)
The parser will report violations of any of the above rules but will do its best to continue to read the data and create a catalog.