CONSOLIDATED REPORT MONITOR AND CONTROL SOFTWARE PDR
                     14 - 15 May 2002
                   Gustaaf van Moorsel
                        7/31/2002

CONTENTS
========

OVERALL DESIGN ISSUES
MIB SELECTION/RFI CONSIDERATIONS
M&C SOFTWARE / MIBs
OPERATIONAL INTERFACE
DEVICE BROWSER
CORRELATOR MONITOR AND CONTROL
CORRELATOR BACKEND/INTERFACE WITH ARCHIVE
RECRUITMENT

OVERALL DESIGN ISSUES
=====================

Question
--------

There is a distinct lack of an architectural design.  Producing such a
design should be one of the highest priorities in the months ahead.
Without a design that shows how everything fits together a schedule
cannot be created.  The missing design means a high element of risk.
The schedule is very tight; we may be forced into using existing
software, such as the GBT M&C system.

Reply
-----

The EVLA M&C software group strongly agrees with this point.  We are
working now to develop an overall architectural design.  Real-time
software does not always have the luxury of following the classic
software development scenarios.  Real-time software is part of a
combined software and hardware development effort, and must respond
both to the overall needs of the project, and to hardware development
schedules.  The current lack of an architectural design is a direct
result of the decision to respond to the need to configure the systems
software and the development environment for the antenna MIBs in
support of the schedule for the test antenna.  Those MIB issues have
now been settled, and we have now turned our attention to
architectural issues.

Question
--------

The schedule is being driven by the hardware needs, not the software
design.  The software schedules don't mean much until a software
design is in place.

Reply
-----

We are of the opinion that real-time software must sometimes depart
from the classic development scenarios.  The EVLA does not become a
reality until we have an outfitted test antenna and can determine the
RFI environment.  It is the test antenna schedule that is driving the
development schedule, and rightly so.

We agree that software schedules have very little meaning until a
software design has been developed.  However, it is our opinion that
responding to the test antenna schedule as the 1st priority was the
correct course to follow for the case of the EVLA M&C software effort.

Question
--------

An object diagram is needed of the design as understood, or being
thought about, so far.  Its most important use is before the design is
finished.  It will guide one's thoughts into better object-oriented
designs.  It should exist whether a review is scheduled or not.

Reply
-----

Agreed.  A 1st cut of a high level sketch of the objects in the core
of the system will appear on or about 7/8/2002.  It is not
particularly detailed, but goes in the direction of capturing overall
structure and will serve as the basis for further elaboration.  That
same document will contain the beginnings of our thinking on a
standard device interface.  This document is very informal, and will
appear on evla-sw-discuss.

Question
--------

For the overall architecture, we need a software equivalent of the
block diagram that exists for hardware.

Reply
-----

Some people insist on diagram-rich architecture and design documents
because "nobody reads".  We agree with this sentiment.  We will try to
ensure that our current efforts to develop an architecture include
informative diagrams.

MIB SELECTION/RFI CONSIDERATIONS
================================

Question
--------

What drove us to the particular choice of chip?  What drove us to the
choice of RTOS?

Reply
-----

The chip choice was driven almost entirely by RFI considerations:
On-chip RAM is an important guideline for the reduction of RFI.  The
choice of RTOS was made primarily because of its small footprint,
fitting comfortably in the available RAM.

Question
--------

Are we backing ourselves too far into a corner with certain
requirements on the chip?  Is an RFI-free design using off-chip RAM
out of the question?

Reply
-----

The only way we will know if the MIB chip requirements were
unnecessarily stringent is to build the MIB prototype board, and
to then compare its RFI levels to both the detrimental levels
of RFI as developed from theory and tests, and to the RFI
levels of COTS boards that might have been used for the MIB.
Both steps will be taken.

We are attempting to mitigate the severity of the current constraints
by relying on standards for MIB communications with the external world.
As long as standard methods for MIB communication are used, such
as Ethernet as the wire protocol and UDP & TCP/IP as the underpinning
for information exchange, we should be able to preserve our options,
and have a path for changes/upgrades to the antenna MIB that will
not impact the rest of the system.

An RFI-free design using off-chip RAM seems unlikely.  The ability to
use off-chip RAM would have greatly expanded our options for the MIB,
and would have reduced costs, but in the absence of actual test data
(soon to be developed), we believe the MIB choice was the right one.

Question
--------

The selection of the antenna MIB is driven by RFI considerations and
the choice of Ethernet.  While RFI concerns must be addressed
successfully (or we will be swamped with noise), the question is
whether the complexities of Ethernet are really worth it.  Having only
one chip on the market that meets all requirements should raise red
flags.  What if that chip is withdrawn from the market - actually it
really is not on the market yet?

Reply
-----

We are not overly concerned with the fact that only one suitable chip
was located at the time of the search.  It seems clear that the
System-on-Chip (SOC) market is in a state of dynamic expansion, and
that the sort of SOC we are using for the MIB will become much more
common in the near future.  At least two other chips, well along in
development, were located, but they were a few months further out from
production than the TC11IB.  More chips will follow.  At least for the
next several years, the sort of SOC chip we need for the antenna MIB
will become increasingly commonplace.

As to Ethernet, we will risk the prediction that 5 to 7 years from
now, the extensive use of COTS networking will be seen as one of the
most powerful features of the EVLA.  Putting the necessary network
infrastructure in place is expensive, time-consuming, and requires
attention to a myriad of details.  However, once in place, it allows
the use of readily available, volume-priced hardware for maintenance
and upgrades, makes possible the use of the very wide range of
commercial and open source software technologies/packages that are
Ethernet/IP based, and will give the EVLA an unprecedented degree of
flexibility, which is absolutely essential to satisfying the
longer-term requirements of inter-operation with VLBA antennas,
operation of the NM Array antennas, and satisfaction of the
requirements for remote observing and observing modes that are more
interactive.

Question
--------

What if it turns out that the chip has to be changed?

Reply
-----

As long as communications between the chip and the external world are
based on standard wire & software protocols we have alternatives.  If
need be the MIB _systems software_ could be ported to a new chip and
the applications software would not then require modification.
Another alternative is to use different systems software, which
includes the same basic functionality.  Then a port, but not a
rewrite, of the MIB applications software would be required.

Question
--------

Should we look into other MIB/software combinations in case the risks
for this MIB look bad?

Reply
-----

To develop alternative chip/systems software scenarios at this time
would probably not be a good idea, with emphasis on "at this time".
Investigation of alternative scenarios takes time, money, and
manpower.  Since we are terribly short on time and manpower, we would
rather stall the development of alternative scenarios until the case
for the possible need of them is much clearer and stronger.  The time
and manpower not currently spent on the MIB would best be spent on the
development of applications software for the current MIB, and on the
overall EVLA M&C software architecture.

M&C SOFTWARE / MIBs
===================

Question
--------

Whenever programs have the same or similar interfaces, every attempt
should be made to use object-oriented methods to capture those
interfaces and, whenever possible, make them identical.  This has been
the single, biggest win in the GBT software.  The EVLA project seems
to have an excellent start on such a strategy, but a careful review
would bring out others; also, beware of the trap of something being
"too simple" or "not needing" a standard interface, which greatly
diminishes the utility of those interfaces which are standard.  A good
example is that George Peck's engineering "high-level screens" will
interface to the device interface described by Kevin Ryan, which is
the same interface used by the M&C system.

It is not clear how the software development responsibilities are
being divided between software and hardware personnel.  Whoever does
it should work toward developing standard interfaces between the
device software per se and the software labeled I/O Area, Device
Functionality, and Other Low-Level Device Specific Code.  Those things
that the Correlator/Backend provides to or needs from M&C should be
done like other devices, i.e., use the "device" interface.

Reply
-----

As of June 2002 we began to work on an overall design.  One of the
1st issues raised was that of a standard device interface.  We are
investigating that issue now, with the goal of specifying a standard
interface that will function at all levels of the software.
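
A minimal sketch of what a standard device interface might look like,
assuming Python purely for illustration.  The class names, the
"frequency_mhz" monitor point, and the "set_frequency" command are all
hypothetical; none of them is taken from the EVLA design.  The point
shown is only that client code (an engineering screen or the M&C
system) is written once against the common interface.

```python
from abc import ABC, abstractmethod


class Device(ABC):
    """Hypothetical standard device interface (names are illustrative)."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def monitor_points(self):
        """Return a dict mapping monitor point name -> current value."""

    @abstractmethod
    def command(self, verb, **params):
        """Execute a command and return a short status string."""


class LocalOscillator(Device):
    """Example device; the 'frequency_mhz' point is an assumption."""

    def __init__(self, name):
        super().__init__(name)
        self._frequency_mhz = 0.0

    def monitor_points(self):
        return {"frequency_mhz": self._frequency_mhz}

    def command(self, verb, **params):
        if verb == "set_frequency":
            self._frequency_mhz = float(params["mhz"])
            return "ok"
        return "unknown command: " + verb


def snapshot(devices):
    """Generic client code: works for any Device, screen or M&C alike."""
    return {d.name: d.monitor_points() for d in devices}
```

Here snapshot() never needs to know which concrete device it is
talking to, which is the property a standard interface is meant to
preserve at every level of the software.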

Question
--------

The GBT project lost a significant amount of development time due to
lack of resolution on a number of issues involving requirements.  They
seemed minor, but when attempting to make design decisions, vague or
missing requirements slow design down significantly.  I wish we had
had a clear mechanism for resolving such issues.  It was not clear to
me that such a mechanism exists for the EVLA, e.g., as to whether you
need an integrated or global reset for the MIB.

Reply
-----

I believe we do have reasonably clear lines of responsibility drawn -
Rick Perley as Project Scientist, Jim Jackson as Systems Engineer for
Hardware, Barry Clark as Systems Engineer for Software, Peggy Perley
as Head of Operations.  For the example given, poll these parties for
their opinions.  If opinions differ, put them all in a room together
and lock the door until a consensus is reached.

Question
--------

We need clearly defined, standardized MIB screens for standardized
hardware to reduce development time.  This requires input from the
hardware engineers.  Has this been given consideration?

Reply
-----

Hardware engineers have been and will continue to be solicited for
their input.  Sometimes it is necessary to wrestle them to the ground
to get anything more than the obvious "I need access to the hardware".
We are practicing our holds and throws.  We are also thinking about
and experimenting with ideas that will allow us to speed up the
development of the initial, lowest level screens that will be needed
at the start of bench testing.

Question
--------

Has re-using some of the VLA Software been considered?

Reply
-----

Elements of the VLA design have already found their way into our
thinking.  Reuse of the VLA code is, for the most part, not practical.
The VLA system is not object-oriented, does not include the notion of
intelligence at the antennas, does not have to deal with different
antenna types, and is written almost entirely in assembler and
Fortran.  Modcomp assembler code is entirely non-portable.  Fortran IV
and Fortran 77 are not candidate languages for the EVLA software.

Question
--------

Has contractor work been considered?

Reply
-----

We have already contracted for the port of the systems software
for the antenna MIB, and we are constantly on the lookout for other
tasks suitable for contract work.  However, for the core applications,
we strongly prefer to use in-house personnel in order to keep the
expertise and knowledge of the software within NRAO.

Question
--------

It was stated that test and operational software will be written by
the hardware designer.  That seems realistic, but the implementation
needs definition: who writes what; interfacing requirements with
system; standards, languages, other details.

Reply
-----

The Computer Division will supply a "skeleton" for the MIB software
that will include methods for getting data out of the MIB and commands
into it.  We expect that Wayne Koski and George Peck will want to
handle the MIB device interfaces - SPI, parallel I/O lines, etc., and
that the designers of specific hardware will want to write the code
for that hardware.  However, we are flexible and will remain so.  As
development proceeds, the task list of what needs to be done will grow
more detailed.  Allocation of the work from that task list will
proceed as the task list grows.

Question
--------

Do we have any idea about the reliability of the antenna MIBs?

Reply
-----

We don't have the means for accelerated life testing at NRAO, but we
have plans to purchase thermal analysis software.

Question
--------

Whichever Communication Protocol is selected, it should be "Discovery
Based" to an extent that monitor point data can be logged/archived
based solely on the information in the system, i.e., logging programs
are completely generic.

Reply
-----

Agreed, and that is where our thinking has gone.
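
A minimal sketch of the "completely generic logging" idea, assuming
that each monitor point can describe itself.  The MonitorPoint class,
its field names, and the JSON record layout are assumptions made for
illustration; the actual protocol had not been selected at the time of
this report.

```python
import json


class MonitorPoint:
    """Self-describing monitor point; field names are assumptions."""

    def __init__(self, name, units, read):
        self.name = name
        self.units = units
        self.read = read  # zero-argument callable returning the value

    def describe(self):
        return {"name": self.name, "units": self.units}


def log_all(device_name, points, timestamp):
    """Generic logger: knows nothing about the device beyond what the
    points report about themselves."""
    records = []
    for p in points:
        rec = p.describe()
        rec.update(device=device_name, time=timestamp, value=p.read())
        records.append(json.dumps(rec, sort_keys=True))
    return records
```

Because log_all() relies only on describe() and read(), adding a new
device or monitor point would require no change to the logging
program.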

Question
--------

The differences and requirements for detecting, reporting, and
signaling bad values (data or monitoring) were not clear.  Where are
messages, indicators, and/or flags used?  And how to handle
alarm/message cascading (information overload)?

Reply
-----

These points were not clear because they have not yet been defined.
I don't expect that we will get to this point until Sept - Oct 2002.

Question
--------

How will power failures on the arms be addressed?  Power to the arms
fails often during summer months, albeit for short periods.
Currently, someone has to go out to a failed antenna when this
happens.  The new system should provide remote power reset.

Reply
-----

Discussions about providing a global antenna reset have taken place,
and Wayne Koski has been urged to give serious consideration to this
feature.

We also feel that a discovery based device interface will be of
considerable usefulness w.r.t. power outages.  Assuming that crucial
portions of the system are on battery backup, a discovery based device
interface will discover when portions of the array have disappeared
and will adapt to that fact, and will also discover the reappearance
of the hardware when power has been restored, and will adapt to those
new circumstances.

Question
--------

Is it possible to interrogate an antenna, i.e. "are you out there?"

Reply
-----

Control will flow downward in the system.  Monitor data will flow
upward (and perhaps laterally).  There will be a constant flow of
monitor data that will serve to keep us informed as to who is out
there.
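
The idea that a steady flow of monitor data shows who is out there can
be sketched as follows.  This is an illustration only: the
PresenceTracker class, its method names, and the use of a simple
timeout are assumptions, not part of the design.

```python
class PresenceTracker:
    """Tracks which antennas are 'out there' from monitor data arrival.

    The timeout value and method names are assumptions for illustration.
    """

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_heard = {}  # antenna id -> time of last monitor packet

    def on_monitor_data(self, antenna_id, now):
        """Call whenever any monitor packet arrives from an antenna."""
        self.last_heard[antenna_id] = now

    def present(self, now):
        """Antennas heard from within the timeout window."""
        return sorted(a for a, t in self.last_heard.items()
                      if now - t <= self.timeout_s)

    def missing(self, now):
        """Antennas known to the tracker but silent for too long."""
        return sorted(a for a, t in self.last_heard.items()
                      if now - t > self.timeout_s)
```

An antenna that loses power simply ages into missing(), and it returns
to present() as soon as its monitor data resume, with no explicit
"are you out there?" query required.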

Question
--------

The MIB is needed by the fall of 2002 for RFI testing and module
development.  Is this realistic?

Reply
-----

Yes.  While we will not meet the date of 7/15/2002 for a MIB prototype
board, we should have one in time for RFI testing in the fall of 2002.

Question
--------

Who is doing the MIB software?  It is advised to use resources in
both Electronics and the Computer Division.

Reply
-----

The MIB software will be done by Wayne Koski, George Peck, the actual
device designers, Kevin Ryan, and effort will also be contributed by
the person hired to replace Bruce Rowen in his previous capacity as
one of the people who helps to maintain the VLBA.

Question
--------

It is highly recommended that a more detailed schedule/scenario be put
together.  For instance, for the bench integration, what do we need at
the various phases in terms of M&C support?

Reply
-----

We agree that a more detailed schedule/scenario is needed.  It is a
high priority item.

Question
--------

A security plan is required.  Will this be given attention?

Reply
-----

It indeed is necessary to develop such a plan.  As of May 2002, the
four highest priority items for the EVLA M&C software effort are:

1. Development of an overall software architecture and design.
2. Development of a detailed, timelined scenario/schedule for
   the test antenna.
3. Security requirements and a design that satisfies the requirements
   for security.
4. Development of a more detailed, timelined scenario/schedule for
   the hybrid array (the transition plan).

plus, of course, the actual hardware and applications software
development for the antenna MIB.  We will deploy our manpower with
these priorities in mind.

OPERATIONAL INTERFACE
=====================

Question
--------

Is the operational interface only supported on one main platform?

Reply
-----

No, all displays will work on all platforms.  We feel that there is no
reason to unnecessarily limit the software, especially the operational
software, to a single platform.  We need to build a system that is
highly adaptable and flexible and free of such limitations. And with
programming languages such as Java there is absolutely no reason to
build to a single platform.

Question
--------

How etched in stone is the level of access for the various
groups?

Reply
-----

It is not cast in stone. The diagram in the presentation is simply a
first cut at a diagram that shows a top-level view of the security
requirements. It is meant to show the primary user categories and what
types of permissions those users will have from different locations.
It is my understanding that an EVLA security document will be
generated in the near future. The diagram will be modified as those
requirements become better known.

Question
--------

What about the overhead for XML, SOAP, etc?

Reply
-----

There is, without question, overhead associated with the use of XML or
SOAP as a communication protocol, namely that the data are sent over
the wire as ASCII text, which typically means that packet/stream sizes
will be larger.  There is also the cost of serializing and
deserializing the data, as the received packets/streams must be parsed
using an XML parser.  We do feel, however, that due to the strong
industry backing
and acceptance of these technologies, we should not disregard them
without giving them a serious look. These technologies might not be
the end-all solution to all parts of the system, but we do believe
they will play a role in selected parts of the system.
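
The two costs mentioned above (larger messages and a parsing step) can
be seen in a small sketch that encodes one monitor sample both ways.
The sample layout and the XML tag names are invented for illustration
and do not represent any EVLA message format.

```python
import struct
import xml.etree.ElementTree as ET


def encode_binary(point_id, timestamp, value):
    """Fixed binary layout: unsigned 32-bit id, two 64-bit floats."""
    return struct.pack("!Idd", point_id, timestamp, value)


def encode_xml(point_id, timestamp, value):
    """The same sample as an XML element; tag names are made up."""
    elem = ET.Element("sample", id=str(point_id))
    ET.SubElement(elem, "time").text = repr(timestamp)
    ET.SubElement(elem, "value").text = repr(value)
    return ET.tostring(elem)


def decode_xml(payload):
    """Parsing step the receiver must perform on every XML message."""
    elem = ET.fromstring(payload)
    return (int(elem.get("id")),
            float(elem.find("time").text),
            float(elem.find("value").text))
```

The binary form of this sample is always 20 bytes, while the XML form
carries the tag names with every message and is several times larger.
Whether that matters depends on the message rate of the part of the
system in question.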

Question
--------

I take it that the EVLA -- like the VLA -- does not worry about data
monitoring during a scan, whereas it was agreed that basic viewing of
the astronomical data during a scan is imperative for the GBT.

Reply
-----

Actually, the VLA does support data monitoring during a scan.  The
function of the F/D10 display, which is in constant use at the VLA
during an observation, is to provide the VLA Operators with a measure
of the quality of the science data during the scan.  Additionally,
there is the checker screen which displays alarms and warning messages
generated from monitor data, and a third screen, with an associated
software process, that warns of conditions such as a potential array
stall due to missing files, failure of antennas to converge to a
solution during a reference pointing scan, and other conditions.  The
EVLA will include similar capabilities, but of an enhanced nature.  We
do worry about data monitoring during a scan, and the capacity to
monitor data quality is very much a part of the plans for the EVLA.


Question
--------

If one is serious about following the design outlined by Kevin Ryan in
his talk, i.e., using distributed processing - "Putting the
Intelligence where the Action is" (which I strongly recommend since
that was the guiding principle for the GBT), then the computer(s) on
the antenna, MIB or otherwise, should have enough power and memory to
accept only high-level commands and on the whole act autonomously.

Reply
-----

The TC11IB has a TriCore core which is clocked at 96 MHz.  In addition
it has a peripheral control processor which appears to be clocked at
48 MHz, and 1.5 MB of on-chip RAM of which approximately 1 MB will be
available for application code.  If there were only one of these
processors available for each antenna, there would be legitimate cause
for concern that there was insufficient computing power and resources
to place a sufficiently high level of intelligence in the antenna to
implement the desired software architecture.  However, there will be
40 to 50 MIBs, each with its own TC11IB chip in each antenna.  Taken
together, we do not see this amount of distributed processing power as
insufficient to implement an approach that consists of sending high
level commands to the MIB, with the actual implementation of those
commands performed by the MIB.

DEVICE BROWSER
==============

Question
--------

For the VLBA, Bob Greschke and the Electronics Division have written
important test software.  Moeser's Device Browser nicely does some of
what VLCj does for the VLBA.  It is not clear who will write the test
macros to tie together multiple screens and equipment for things like
PCAL, BBC, RFI tests.  Does the requirement document cover features
provided by VLCj?  Does it cover features provided by test sets
written by Mack Stephenson for the VLBA?

Reply
-----

It is our understanding that the test software will be written by
those individuals that know how to test the hardware, namely the
hardware engineers. The test software should be written to a
standardized software interface that allows the test software to be
plugged into the system.

We are not familiar enough with VLCj to comment on whether or not the
features in VLCj are in the operational requirements document. If
anyone is familiar with VLCj and has read the EVLA operational
requirements document and finds that requirements are missing, they
should bring it to the attention of Bill Sahr or Rich Moeser.

Question
--------

Does such a browser mean overhead on the MIB side?

Reply
-----

No, these data are already in the MIB.

Question
--------

What about the number of packets/second needed?

Reply
-----

This is indeed a big issue; we will have to take efficiency seriously.

CORRELATOR MONITOR AND CONTROL
==============================

Question
--------

Is the virtual correlator interface a device?

Reply
-----

Yes, and it should follow the standard interface rules.

Question
--------

What about PCMCIA (vs. PC104+)?

Reply
-----

PCMCIA is newer, smaller, and it is where the industry is going.

Question
--------

Why separate the various control computers?

Reply
-----

This greatly enhances reliability.  For instance, when the Correlator
Power Control Computer (CPCC) goes down, the correlator does not go
down with it.

Question
--------

Is there a need for the CPCC to talk to the antenna devices?

Reply
-----

No, all this goes through the Main Correlator Control Computer (MCCC).

CORRELATOR BACKEND/INTERFACE WITH ARCHIVE
=========================================

Question
--------

Does the e2e only expect frequency domain data?

Reply
-----

In general the Backend (BE) can output data in any form that the e2e
can accept. A major design point of the BE was to perform FFTs of
lags to spectra. There are currently no requirements to produce
anything other than spectra.

Question
--------

Why is Ethernet needed between the correlator and the correlator
backend?

Reply
-----

To minimize interprocessor communications on the BE, all work for a
given Baseline will be done on a single BE node.  Depending on the
Correlator mode, Baseline data can come from a number of Correlator
output points and can vary from mode to mode, thus there is no fixed
mapping of correlator output points to Baselines.  As a result we will
need maximum flexibility in the connection scheme between the
Correlator and the Backend.  Currently, switched Gigabit Ethernet
provides the optimal combination of flexibility, speed and cost.
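
The routing requirement described above can be sketched as follows.
This is an illustration under stated assumptions: the round-robin
assignment, the function names, and the frame layout (output point,
baseline, payload) are all hypothetical.

```python
def assign_baselines(baselines, n_nodes):
    """Pin each baseline to exactly one node (round-robin here; the
    actual assignment policy is not specified in the source)."""
    return {b: i % n_nodes for i, b in enumerate(baselines)}


def route(frames, baseline_to_node):
    """Group lag frames by destination node.

    Each frame is (output_point, baseline, payload).  The destination
    depends only on the baseline, never on the output point.
    """
    per_node = {}
    for output_point, baseline, payload in frames:
        node = baseline_to_node[baseline]
        per_node.setdefault(node, []).append((baseline, payload))
    return per_node
```

Since any output point may carry data for any baseline, every output
point must be able to reach every node, which is the flexibility a
switched network provides.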

Question
--------

Can the reversibility requirement be loosened?

Reply
-----

Irreversible processes will not be hidden from the user. All
irreversible processes will be under user control. All reversible
processes (e.g., FFT) will produce sufficient metadata to allow the
process to be undone at some future time.
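
A sketch of what "sufficient metadata to allow the process to be
undone" could mean for a transform step.  A plain discrete Fourier
transform stands in for the FFT, and the metadata keys ("process",
"n_points", "window") are assumptions; the actual Backend processing
and metadata have not been specified here.

```python
import cmath


def dft(values, inverse=False):
    """Plain O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def lags_to_spectrum(lags):
    """Return the spectrum plus the metadata needed to undo the step."""
    meta = {"process": "dft", "n_points": len(lags), "window": "none"}
    return dft(lags), meta


def spectrum_to_lags(spectrum, meta):
    """Undo lags_to_spectrum using only the recorded metadata."""
    if meta["process"] != "dft" or meta["window"] != "none":
        raise ValueError("cannot reverse this process")
    if len(spectrum) != meta["n_points"]:
        raise ValueError("spectrum length does not match metadata")
    return dft(spectrum, inverse=True)
```

The round trip recovers the original lags to within floating-point
error.  An irreversible step such as time averaging would have no
counterpart to spectrum_to_lags(), which is why such steps are to be
left under user control.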

Question
--------

Constant phase rotation needs to be added to the data processing
requirements.

Reply
-----

This will be taken into consideration for addition as an optional
process.

Question
--------

Is a heterogeneous cluster a possibility, allowing us to replace
components one at a time?

Reply
-----

The advantage of having a homogeneous cluster is that we simply assign
the same number of baselines to each node to distribute the workload
evenly.  For a heterogeneous cluster we will have to take into account
the relative speeds of the nodes when determining how many baselines
each will do.
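
For the heterogeneous case, one possible way to account for relative
node speeds is a proportional split.  The sketch below is an
assumption, not a Backend design decision: the function name and the
use of the largest-remainder method are ours, and the speed figures
would have to come from some benchmark.

```python
def baselines_per_node(n_baselines, speeds):
    """Split n_baselines across nodes in proportion to relative speed.

    Uses the largest-remainder method so the counts sum exactly to
    n_baselines.  'speeds' are relative, unitless figures (assumed to
    come from some benchmark not described in the source).
    """
    total = sum(speeds)
    exact = [n_baselines * s / total for s in speeds]
    counts = [int(x) for x in exact]
    leftover = n_baselines - sum(counts)
    # Hand remaining baselines to the nodes with the largest fractions.
    order = sorted(range(len(speeds)),
                   key=lambda i: exact[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts
```

With equal speeds this reduces to the homogeneous case of the same
number of baselines on each node; with speeds of 2, 1 and 1 the first
node receives half of the baselines.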

Question
--------

What about alarms?  Are we going to use the same screen for monitor
and for control data?

Reply
-----

The Backend will not have a GUI of its own and hence will not need to
directly produce user screens. The BE functional design will be
coordinated with M&C to provide the needed alarms and user data in an
agreed upon format via an agreed upon delivery mechanism.

Question
--------

Please explain the plans re flags.  Is there a large flag with each
visibility, or a combined flag?  How will the e2e system deal with
those flags?

Reply
-----

The specifics of data flagging of output to the e2e have not yet been
worked out. This will depend in part on requirements imposed by the
e2e.

Question
--------

Could you comment on networking?

Reply
-----

The correlator and the antennas will constitute two networks, each of
which is larger than that at the AOC.  It is important to use identical
switches all around, and make use of network management software.

Question
--------

Why use segments at all?

Reply
-----

Lag Data will be distributed to the various BE processes running on a
given node via shared memory. (No off-node distribution is anticipated
until final delivery to the e2e.) This shared memory cache space will
be logically segmented or blocked into a few to possibly ten large
chunks. These will be filled one at a time by the Input process as
data comes in from the Correlator. They will be accessed by the Input
Manager to do a logical sort of the lag frames, and emptied by the
Data Processing process which applies math functions. Empty blocks are
released back to Input for reuse. The size and number of blocks will
be set (possibly dynamically) to optimize throughput. A key factor
will be minimization of the number of Input interrupts at the end of
filling a block (pushing towards having a minimal number of blocks). A
competing factor is the need to avoid having other processes waiting
for Input to finish filling a block (pushing toward more blocks).
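
The block cycle described above (Input fills, Data Processing empties,
blocks return to Input) can be sketched in a single process as
follows.  Real shared memory, interrupts, and the Input Manager's sort
are not modeled; the BlockPool class and its method names are
illustrative assumptions.

```python
from collections import deque


class BlockPool:
    """Fixed set of cache blocks cycling between Input and Data
    Processing.  A single-process stand-in for the shared-memory
    scheme; class and method names are illustrative assumptions."""

    def __init__(self, n_blocks, frames_per_block):
        self.frames_per_block = frames_per_block
        self.free = deque([] for _ in range(n_blocks))
        self.full = deque()
        self.filling = None

    def put_frame(self, frame):
        """Input side: returns False if no empty block is available."""
        if self.filling is None:
            if not self.free:
                return False
            self.filling = self.free.popleft()
        self.filling.append(frame)
        if len(self.filling) == self.frames_per_block:
            self.full.append(self.filling)  # hand off one whole block
            self.filling = None
        return True

    def take_block(self):
        """Processing side: returns a full block, or None if none ready."""
        return self.full.popleft() if self.full else None

    def release(self, block):
        """Processing side: empty the block and return it to Input."""
        block.clear()
        self.free.append(block)
```

The trade-off in the text shows up directly in the two constructor
arguments: fewer, larger blocks mean fewer hand-offs per frame, while
more, smaller blocks mean take_block() has something ready sooner.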

Question
--------

Will the prototype of the backend have any functionality?

Reply
-----

The BE prototype is intended to have full internal functionality. It
will be able to accept Correlator lag frames (synthetically generated
and stored on disk, since there will be no actual correlator
available).  It will be capable of doing FFTs and integration but will
most likely not have any optional functions deployed. It will probably
not produce output formatted for the e2e in its first incarnation. The
underlying message passing layer used may not be the same one
ultimately used in the production code.

Question
--------

How do the correlator data and the M&C data relate in the archive?

Reply
-----

The intent is to have the Backend combine all data not directly
received from the Correlator with the lag frame data to produce a
single output stream to the e2e. All Correlator non-lag frame data and
all non-Correlator data will come to the BE from M&C and be combined
before being sent to the archive. Thus, there should be no need to
further relate (assemble) data in the archive itself.

Question
--------

How do monitor data attach to a measurement set that is still open?

Reply
-----

This question is still mostly an open issue.  It is likely that
monitor data will be attached to a measurement set as an extension or
extensions to the AIPS++ measurement set format.  Having said that
much, no further details can be supplied at this time.

RECRUITMENT
===========

Question
--------

Establishing a close relationship with Tech, UNM, and NMSU has helped
the Electronics program.  There have been student hires that show
great potential, and visibility with alumni has helped with two hires.
Hiring all seasoned veterans is a great goal, but in view of the
location and salary it is sometimes necessary to hire someone with
promise.

Reply
-----

For all advertised positions we need staff with at least several years
of work experience and demonstrated abilities in the fields required.
We think student hires are too risky and would need too much time to
come up to speed, even if eventually successful.  We have, though,
considered candidates with only a few years' experience who would
require additional training and seasoned veterans alike.

Question
--------

There has been talk of an Albuquerque office.  And what about
Charlottesville as a location?  What about employing help from other
parts of the observatory, especially GB?

Reply
-----

Creation of an Albuquerque office has been considered, but we don't
feel availability of such an office would have made a difference in
hires that fell through.  As for hiring people at other NRAO sites,
for obvious reasons we would prefer not to have some of our staff
thousands of miles away, but if that's what it takes to hire suitable
staff, we may consider it.

As we proceed with the initial architectural design we hope to open
robust, strong communication channels with other parts of the
observatory, especially Green Bank.  The possibility of inviting
various Green Bank personnel to Socorro for periods of work on the
EVLA software architecture and design has been discussed, and will be
pursued.

Question
--------

We need to continue to push on recruiting software engineers.  We need
not to be afraid to 'experiment' with one of the open positions.

Reply
-----

We are pushing.  At the time of writing (July 2002) we have filled
three of our four vacancies, and have a number of suitable candidates
for the fourth.  We are steadily becoming more flexible in our view of
candidates, and in the terms of our offers.  We are also aided by the
cooling off of the job market.