Chapter 2. Output format MIME structure

The overall format of a binary output data file is given by the MIME[1] specification [RFC2045], [RFC2046], [RFC2387]. In the context of the MIME standard, the binary output data file consists of one multipart MIME message, comprising one or more multipart sub-parts. In other words, a data file is a nested (or tree-structured) multipart MIME message.

One MIME message, or file, comprises data for a set of integrations from one subarray, where the data for each integration are themselves contained in a multipart message that is one sub-part of the top-level MIME message. At the top level, the file has a MIME type of multipart/mixed, which is the standard MIME type for multipart messages wherein the sub-parts are ordered[2]. In turn, the data for one integration are formatted as a MIME message of type multipart/related, which indicates that the sub-parts within the message are related and may have references to other sub-parts within the same message. The message for an integration has one sub-part for the data header, and several additional sub-parts for the binary data tables.
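The nesting described above can be made concrete with a short sketch that parses a tiny, well-formed message and prints its tree of content types. This is only an illustration using Python's standard `email` package: the `SAMPLE` bytes, the `outline` helper, and the placeholder payloads are assumptions made for the example, not part of this specification.

```python
import email

# Abridged, hypothetical sample: one integration plus the trailing index part.
SAMPLE = (
    b"MIME-Version: 1.0\r\n"
    b'Content-type: multipart/mixed; boundary="integration_boundary"\r\n'
    b"\r\n"
    b"--integration_boundary\r\n"
    b'Content-type: multipart/related; boundary="abcd"\r\n'
    b"Content-id: <uid://X1/1/0/0>\r\n"
    b"\r\n"
    b"--abcd\r\n"
    b"Content-type: text/xml; charset=iso-8859-1\r\n"
    b"Content-id: <hdr//X1/1/0/0>\r\n"
    b"\r\n"
    b"<sdmDataHeader/>\r\n"
    b"--abcd\r\n"
    b"Content-type: application/octet-stream\r\n"
    b"Content-id: <actualTimes//X1/1/0/0>\r\n"
    b"\r\n"
    b"PLACEHOLDER\r\n"
    b"--abcd--\r\n"
    b"\r\n"
    b"--integration_boundary\r\n"
    b"\r\n"
    b"uid://X1/1/0/0 0\r\n"
    b"--integration_boundary--\r\n"
)


def outline(msg, depth=0):
    """Return one indented line per MIME part, depth-first."""
    lines = ["  " * depth + msg.get_content_type()]
    if msg.is_multipart():
        for part in msg.get_payload():
            lines.extend(outline(part, depth + 1))
    return lines


tree = outline(email.message_from_bytes(SAMPLE))
print("\n".join(tree))
```

The top-level multipart/mixed message holds one multipart/related part per integration, followed by a final part that has no headers of its own and therefore defaults to text/plain.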

Note: Unlike ALMA, this specification allows the aggregation of multiple integrations into a single data file.

Which data are aggregated into a single file is ultimately an operational or policy issue. It is likely that all of the integrations in one sub-scan will be aggregated into a single file. However, to limit file size, it may be necessary to limit the number of integrations in one file, in which case the integrations in one sub-scan will span several files. This specification requires neither of these policies, nor any other policy in this regard.

An example of a partial binary output data file is provided below to illustrate the high-level structure. Note that the example is not normative; except as noted, any discrepancies in the example with respect to the MIME standard are to be resolved in favor of the MIME standard.

MIME-Version: 1.0 (1)
Content-type: multipart/mixed; boundary="integration_boundary"; (2)
              type="text/plain"
Content-description: correlator spectral data set
(3)
--integration_boundary (4)
Content-type: multipart/related;
              boundary="abcd";
              start=<hdr//X1/1/0/0>
Content-description: correlator spectral data
Content-id: <uid://X1/1/0/0> (5)

--abcd (6)
Content-type: text/xml; charset=iso-8859-1
Content-transfer-encoding: 8bit
Content-id: <hdr//X1/1/0/0> (7)
<sdmDataHeader>
...
<actualTimes href="cid:actualTimes//X1/1/0/0" .../> (8)
...
</sdmDataHeader>

--abcd (9)
Content-type: application/octet-stream (10)
Content-id: <actualTimes//X1/1/0/0>
[BINARY DATA]
--abcd
... (11)
--abcd--

--integration_boundary (12)
Content-type: multipart/related;
              boundary="ABCD";
              start=<hdr//X1/1/0/1>
Content-description: correlator spectral data
Content-id: <uid://X1/1/0/1>

--ABCD
Content-type: text/xml; charset=iso-8859-1
Content-transfer-encoding: 8bit
Content-id: <hdr//X1/1/0/1>
<sdmDataHeader>
<actualTimes href="cid:actualTimes//X1/1/0/1" .../>
...
</sdmDataHeader>

--ABCD
Content-type: application/octet-stream
Content-id: <actualTimes//X1/1/0/1>
[BINARY DATA]
--ABCD
...
--ABCD--

--integration_boundary
uid://X1/1/0/0 XXX (13)
uid://X1/1/0/1 YYY
--integration_boundary--

(1) CRLF line endings are required throughout.

(2) Boundary delimiters must be generated carefully to adhere to the MIME standard, especially in light of the nested multipart nature of this file. While it would be desirable to conform to the standard, it is not clear whether conformance in this regard is necessary or practical for EVLA purposes.
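One way to approach careful boundary generation is sketched below: draw a random candidate and reject it if its delimiter already occurs in any payload or collides with a boundary used at another nesting level. The function name `make_boundary`, the `bdf_` prefix, and the use of UUIDs are assumptions for illustration; this specification does not prescribe a generation scheme.

```python
import uuid


def make_boundary(payloads, avoid=()):
    """Return a boundary whose delimiter occurs in none of the payloads.

    payloads: iterable of bytes objects that will be enclosed.
    avoid:    boundaries already used at other nesting levels.
    """
    payloads = list(payloads)
    while True:
        # 36 characters, all legal boundary characters, within the
        # 70-character limit of RFC 2046.
        candidate = "bdf_" + uuid.uuid4().hex
        delimiter = b"--" + candidate.encode("ascii")
        if candidate in avoid:
            continue
        if any(delimiter in payload for payload in payloads):
            continue
        return candidate


outer = make_boundary([b"[BINARY DATA]"])
inner = make_boundary([b"[BINARY DATA]"], avoid=[outer])
print(outer, inner)
```

Because every candidate has the same length, checking for equality against `avoid` is enough to ensure that no boundary is a prefix of another.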

(3) Note: ALMA has other MIME header fields, but they are optional according to the standard.

(4) Each MIME sub-part at the top level corresponds to one backend integration. The data for each integration are themselves formatted as a multipart MIME message. This MIME header provides the MIME container for the integration, and associates an identifier with this part of the data file.

(5) Content-id uniquely identifies the data in this sub-part (i.e., the data of this integration).

(6) Data header for this integration. This sub-part contains a true data header, providing metadata for the binary data.

(7) The Content-id value is derived from the unique identifier for the data of this integration.
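The derivation can be sketched as follows. The rule used here (replace the `uid:` scheme with the name of the sub-part) is inferred only from the identifiers shown in the example above, and the helper name `part_content_id` is hypothetical.

```python
def part_content_id(integration_uid, part_name):
    """Derive a sub-part identifier from an integration identifier.

    Assumes the pattern seen in the example:
    "uid://X1/1/0/0" with part "hdr" gives "hdr//X1/1/0/0".
    """
    prefix = "uid://"
    if not integration_uid.startswith(prefix):
        raise ValueError("expected an identifier starting with 'uid://'")
    return part_name + "//" + integration_uid[len(prefix):]


header_id = part_content_id("uid://X1/1/0/0", "hdr")
table_id = part_content_id("uid://X1/1/0/0", "actualTimes")
print(header_id, table_id)
```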

(8) References to Content-id values within multipart/related MIME messages are described in [RFC2111].
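As a sketch of how such a reference might be resolved: under RFC 2111, a `cid:` URL carries the Content-id value without its angle brackets and with URL hex-encoding applied. The helper names and the dictionary of parts below are assumptions for illustration.

```python
from urllib.parse import unquote


def cid_to_content_id(url):
    """Convert a cid: URL into the matching Content-id header value."""
    scheme, sep, rest = url.partition(":")
    if not sep or scheme.lower() != "cid":
        raise ValueError("not a cid: URL: %r" % url)
    # Undo URL hex-encoding, then restore the angle brackets.
    return "<" + unquote(rest) + ">"


def resolve(parts_by_id, url):
    """Look up the sub-part referenced by a cid: URL.

    parts_by_id is a hypothetical mapping from Content-id values
    (including angle brackets) to sub-part payloads.
    """
    return parts_by_id[cid_to_content_id(url)]


parts = {"<actualTimes//X1/1/0/0>": b"[BINARY DATA]"}
payload = resolve(parts, "cid:actualTimes//X1/1/0/0")
print(payload)
```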

(9) First binary data table for this integration. The Content-id value is again derived from the unique identifier for the data of this integration.

(10) Note: ALMA uses the binary/octet-stream Content-type.

(11) Other binary tables.

(12) The second integration in the data file.

(13) Index for mapping the Content-id value of each integration to the byte offset within the file of the beginning of the MIME part for that integration.

Note: ALMA has an alma-uid MIME header field, which, for EVLA, has been integrated with the Content-ID field of the MIME sub-part of each integration.
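A reader could use the index in the final sub-part to locate an integration without scanning the whole file. The sketch below assumes that each index line holds an identifier followed by whitespace and a decimal byte offset; the example above shows only the placeholders XXX and YYY, so that encoding, the sample offsets, and the name `parse_index` are assumptions.

```python
def parse_index(text):
    """Parse index lines of the assumed form '<identifier> <decimal offset>'."""
    index = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last run of whitespace so the identifier stays intact.
        uid, offset = line.rsplit(None, 1)
        index[uid] = int(offset)
    return index


# Hypothetical offsets standing in for XXX and YYY.
index = parse_index("uid://X1/1/0/0 120\r\nuid://X1/1/0/1 4096\r\n")
print(index["uid://X1/1/0/1"])
```

With the offset in hand, a reader can seek directly to that position in the file and parse only the MIME part for the requested integration.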

A more complete example of an output file may be found in the section called “Output data file example”.



[1] Multipurpose Internet Mail Extensions

[2] Ordering of the sub-parts is required only because the final sub-part of the top-level message does not contain integration data, but rather provides an index to the integrations within the file.