partition

NRAO Home > CASA > CASA Task Reference Manual

0.1.87 partition

Requires:

Synopsis
Task to produce Multi-MSs using parallelism

Description

Partition is a task to create a Multi-MS out of an MS. General selection parameters are included, and one or all of the various data columns (DATA, LAG_DATA and/or FLOAT_DATA, and possibly MODEL_DATA and/or CORRECTED_DATA) can be selected.

The partition task creates a Multi-MS in parallel, using the CASA MPI framework. The user should start CASA as follows in order to run it in parallel.

1) Start CASA on a single node with 8 engines. The first engine will be used as the MPIClient, where the user will see the CASA prompt. All other engines will be used as MPIServers and will process the data in parallel. mpicasa -n 8 casa –nogui –log2term partition(.....)

2) Running on a group of nodes in a cluster. mpicasa -hostfile user_hostfile casa .... partition(.....)

where user_hostfile contains the names of the nodes and the number of engines to use in each one of them. Example: pc001234a, slots=5 pc001234b, slots=4

If CASA is started without mpicasa, it is still possible to create an MMS, but the processing will be done in sequential.

A multi-MS is structured to have a reference MS on the top directory and a sub-directory called SUBMSS, which contain each partitioned sub-MS. The reference MS contains links to the sub-tables of the first sub-MS. The other sub-MSs contain a copy of the sub-tables each. A multi-MS looks like this in disk.

ls ngc5921.mms ANTENNA FLAG_CMD POLARIZATION SPECTRAL_WINDOW table.dat DATA_DESCRIPTION HISTORY PROCESSOR STATE table.info FEED OBSERVATION SORTED_TABLE SUBMSS WEATHER FIELD POINTING SOURCE SYSCAL

ls ngc5921.mms/SUBMSS/ ngc5921.0000.ms/ ngc5921.0002.ms/ ngc5921.0004.ms/ ngc5921.0006.ms/ ngc5921.0001.ms/ ngc5921.0003.ms/ ngc5921.0005.ms/

Inside casapy, one can use the task listpartition to list the information from a multi-MS.

When partition processes an MMS in parallel, each sub-MS is processed independently in an engine. The log messages of the engines are identified by the string MPIServer-#, where # gives the number of the engine running that process. When the task runs sequentially, it shows the MPIClient text in the origin of the log messages or does not show anything.

Arguments


Inputs
vis	Name of input measurement set
	allowed:	string
	Default:
outputvis	Name of output measurement set
	allowed:	string
	Default:
createmms	Should this create a multi-MS output
	allowed:	bool
	Default:	True
separationaxis	Axis to do parallelization across(scan, spw, baseline, auto)
	allowed:	string
	Default:	auto
numsubms	The number of SubMSs to create (auto or any number)
	allowed:	any
	Default:	variant auto
flagbackup	Create a backup of the FLAG column in the MMS.
	allowed:	bool
	Default:	True
datacolumn	Which data column(s) to process.
	allowed:	string
	Default:	all
field	Select field using ID(s) or name(s).
	allowed:	any
	Default:	variant
spw	Select spectral window/channels.
	allowed:	any
	Default:	variant
scan	Select data by scan numbers.
	allowed:	any
	Default:	variant
antenna	Select data based on antenna/baseline.
	allowed:	any
	Default:	variant
correlation	Correlation: ” ==> all, correlation=”XX,YY”.
	allowed:	any
	Default:	variant
timerange	Select data by time range.
	allowed:	any
	Default:	variant
intent	Select data by scan intent.
	allowed:	any
	Default:	variant
array	Select (sub)array(s) by array ID number.
	allowed:	any
	Default:	variant
uvrange	Select data by baseline length.
	allowed:	any
	Default:	variant
observation	Select by observation ID(s).
	allowed:	any
	Default:	variant
feed	Multi-feed numbers: Not yet implemented.
	allowed:	any
	Default:	variant
disableparallel	Create a multi-MS in parallel.
	allowed:	bool
	Default:	False
ddistart	Do not change this parameter. For internal use only.
	allowed:	int
	Default:	-1
taql	Table query for nested selections
	allowed:	string
	Default:

Example

----- Detailed description of keyword arguments -----

    vis -- Name of input visibility file
        default: none; example: vis=’ngc5921.ms’

    outputvis -- Name of output visibility file
        default: none; example: outputvis=’ngc5921.mms’

    createmms -- Create a multi-MS as the output.
        default: True
        If False, it will work like the split task and create a
        normal MS, split according to the given data selection parameters.
        Note that, when this parameter is set to False, a cluster
        will not be used.

        separationaxis -- Axis to do parallelization across.
            default: ’auto’
            Options: ’scan’, ’spw’, ’baseline’, ’auto’

            * The ’auto’ option will partition per scan/spw to obtain optimal load balancing with the
             following criteria:

               1 - Maximize the scan/spw/field distribution across sub-MSs
               2 - Generate sub-MSs with similar size

            * The ’scan’ or ’spw’ axes will partition the MS into scan or spw. The individual sub-MSs may
            not be balanced with respect to the number of rows.

            * The ’baseline’ axis is mostly useful for Single-Dish data. This axis will partition the MS
              based on the available baselines. If the user wants only auto-correlations, use the
              antenna selection such as antenna=’*&&&’ together with this separation axis. Note that in
              if numsubms=’auto’, partition will try to create as many subMSs as the number of available
              servers in the cluster. If the user wants to have one subMS for each baseline, set the numsubms
              parameter to a number higher than the number of baselines to achieve this.

        numsubms -- The number of sub-MSs to create.
            default: ’auto’
            Options: any integer number (example: numsubms=4)

               The default ’auto’ is to partition using the number of available servers in the cluster.
               If the task is unable to determine the number of running servers, or the user did not start CASA
               using mpicasa, numsubms will use 8 as the default.

                Example: Launch CASA with 5 engines, where 4 of them will be used to create the MMS. The first
                    engine is used as the MPIClient.

                mpicasa -n 5 casa --nogui --log2term
                CASA> partition(’uid__A1’, outputvis=’test.mms’)

        flagbackup -- Make a backup of the FLAG column of the output MMS. When the
                      MMS is created, the .flagversions of the input MS are not transferred,
                      therefore it is necessary to re-create it for the new MMS. Note
                      that multiple backups from the input MS will not be preserved. This
                      will create a single backup of all the flags present in the input
                      MS at the time the MMS is created.
            default: True

    datacolumn -- Which data column to use when partitioning the MS.
        default=’all’; example: datacolumn=’data’
        Options: ’data’, ’model’, ’corrected’, ’all’,
                ’float_data’, ’lag_data’, ’float_data,data’, and
                ’lag_data,data’.
            N.B.: ’all’ = whichever of the above that are present.

---- Data selection parameters (see help par.selectdata for more detailed
    information)

    field -- Select field using field id(s) or field name(s).
             [run listobs to obtain the list iof d’s or names]
        default: ’’=all fields If field string is a non-negative
           integer, it is assumed to be a field index
           otherwise, it is assumed to be a field name
           field=’0~2’; field ids 0,1,2
           field=’0,4,5~7’; field ids 0,4,5,6,7
           field=’3C286,3C295’; fields named 3C286 and 3C295
           field = ’3,4C*’; field id 3, all names starting with 4C

    spw -- Select spectral window/channels
        default: ’’=all spectral windows and channels
           spw=’0~2,4’; spectral windows 0,1,2,4 (all channels)
           spw=’<2’;  spectral windows less than 2 (i.e. 0,1)
           spw=’0:5~61’; spw 0, channels 5 to 61
           spw=’0,10,3:3~45’; spw 0,10 all channels, spw 3 - chans 3 to 45.
           spw=’0~2:2~6’; spw 0,1,2 with channels 2 through 6 in each.
           spw = ’*:3~64’  channels 3 through 64 for all sp id’s
                   spw = ’ :3~64’ will NOT work.
           spw = ’*:0;60~63’  channel 0 and channels 60 to 63 for all IFs
                  ’;’ needed to separate different channel ranges in one spw
           spw=’0:0~10;15~60’; spectral window 0 with channels 0-10,15-60
           spw=’0:0~10,1:20~30,2:1;2;4’; spw 0, channels 0-10,
                    spw 1, channels 20-30, and spw 2, channels, 1, 2 and 4

    antenna -- Select data based on antenna/baseline
        default: ’’ (all)
            Non-negative integers are assumed to be antenna indices, and
            anything else is taken as an antenna name.

            Examples:
            antenna=’5&6’: baseline between antenna index 5 and index 6.
            antenna=’VA05&VA06’: baseline between VLA antenna 5 and 6.
            antenna=’5&6;7&8’: baselines 5-6 and 7-8
            antenna=’5’: all baselines with antenna 5
            antenna=’5,6,10’: all baselines including antennas 5, 6, or 10
            antenna=’5,6,10&’: all baselines with *only* antennas 5, 6, or
                                   10.  (cross-correlations only.  Use &&
                                   to include autocorrelations, and &&&
                                   to get only autocorrelations.)
            antenna=’!ea03,ea12,ea17’: all baselines except those that
                                       include EVLA antennas ea03, ea12, or
                                       ea17.

    timerange -- Select data based on time range:
        default = ’’ (all); examples,
           timerange = ’YYYY/MM/DD/hh:mm:ss~YYYY/MM/DD/hh:mm:ss’
           Note: if YYYY/MM/DD is missing date, timerange defaults to the
           first day in the dataset
           timerange=’09:14:0~09:54:0’ picks 40 min on first day
           timerange=’25:00:00~27:30:00’ picks 1 hr to 3 hr 30min
           on next day
           timerange=’09:44:00’ data within one integration of time
           timerange=’>10:24:00’ data after this time

    array -- (Sub)array number range
        default: ’’=all

    uvrange -- Select data within uvrange (default units meters)
        default: ’’=all; example:
            uvrange=’0~1000klambda’; uvrange from 0-1000 kilo-lambda
            uvrange=’>4klambda’;uvranges greater than 4 kilo-lambda
            uvrange=’0~1000km’; uvrange in kilometers

    scan -- Scan number range
        default: ’’=all

    observation -- Select by observation ID(s)
        default: ’’=all

------ EXAMPLES ------

1) Create a Multi-MS of some spws, partitioned per spw. The MS contains 16 spws.
    partition(’uid001.ms’, outpuvis=’source.mms’, spw=’1,3~10’, separationaxis=’spw’)

2) Create a Multi-MS but select only the first channels of all spws. Do not back up the FLAG
column.
    partition(’uid0001.ms’, outputvis=’fechans.mms’, spw=’*:1~10’, flagbackup=False)

3) Create a Multi-MS using both separation axes.
    partition(’uid0001.ms’, outputvis=’myuid.mms’, createmms=True, separationaxis=’auto’)

4) Create a single-dish Multi-MS using the baseline axis only for the auto-correlations.
    partition(’uid0001.ms’, outputvis=’myuid.mms’, createmms=True, separationaxis=’baseline’, antenna=’*&&&’)

[next] [prev] [prev-tail] [front] [up]

More information about CASA may be found at the CASA web page

This code is available under the terms of the GNU General Public Lincense