Date: Fri, 31 Mar 2006 16:20:42 -0700 (MST)
From: Michael Rupen
Subject: A few uber comments

Hi Steve,

after this afternoon's discussion I got to wondering whether part of our problem lies in confusing appearance, documentation, and implementation. We want all the commands to appear alike to the user; we want to be able to find a sub-command by looking in obvious groupings (vs. the current tools); and it doesn't really matter how all this is implemented, so long as the user doesn't have to care about it. So, here are a couple basic principles I'm hoping we can agree on. * The actual system should be as "flat" as possible. : All commands -- tasks, methods, whatever -- should be available directly, without (e.g.) entering a different environment. : All commands -- tasks, methods, whatever -- should look alike to the user. I.e., one should enter parameters in a similar fashion; those parameters should have a consistent naming convention; the commands should be similarly documented and maintained; and all commands should be usable in the same way in scripts. The user should not for instance care whether the implementation is through python, C++, or FORTRAN, or even just a simple script+interface. : There should NOT be another level of commands which only experts know how to use. Note that I do not say such commands should not exist -- there are commands in any package that only the cognoscenti use or appreciate. But those esoteric commands should be accessible in the standard fashion, and should be documented and maintained in the same way. Assuming the "normal" CASA approach conceals objects (for instance) from the user, intelligently handles opens/closes so the user doesn't have to pay attention to that, etc., the "esoteric" commands should share this approach. One should not have to learn another level of complexity in style/syntax to write expert scripts -- one should be able to build directly on the basic system. 
In this same vein the scripts (and pipelines) we provide as examples should look like what one types at the command line, not some more advanced form of OOP. : My (crazy?) interpretation of this is that - tools, from the user point of view, should vanish - methods, from the user point of view, should be callable directly, with any requisite instantiation, opening, closing, etc., hidden from the innocent eyes of the user. Maybe this means there are really boring "tasks" written which check whether the thing is already open correctly, open it if need be, run the method, and return. That is an *implementation issue*, and not appropriate for us to comment on at this stage. If Joe comes back and says "gosh, this is hard", we negotiate until we find an acceptable compromise. - tasks can be anything the programmers desire, but should look and feel the same as methods. : A subsidiary requirement in all this is that we retain access to the full functionality of the current methods. : A side effect of this is that the concept of tools basically vanishes from the user's p.o.v. A Good Thing for the OOP-challenged. : THIS IS MORE CONTROVERSIAL BUT IS A DETAIL: Ideally no "setup" command would be required, before running any other command. The aim here is to avoid the complex open/specify subsets of/run/close sequence which seems to annoy users currently. - This obviously breaks for e.g. plotting a bandpass (which requires that you have created said bandpass!) -- one also cannot image if the MS hasn't yet been loaded. I don't think anyone will complain about that sort of restriction, which is common to most packages. - More seriously, this argues against the SELECT and SETMASK commands we discussed running before CLEAN etc. I can weasel out of those by thinking of those pre-commands as basically being a simpler way of specifying input parameters to the only actual command from the user's point of view, which is CLEAN. 
That's my own preference -- I view the whole SELECT business as left over from the part of the current interface that we most dislike -- but I don't think there's any consensus on that. I guess I would do this as 1- SELECT is possible but not essential to running CLEAN. SELECT gives you a nice way of doing complex selections, and returns a pointer (or a variable, or whatever) which then becomes the default input to uv tasks like CLEAN. You can over-ride this default within CLEAN if you like. 2- SETMASK creates a mask or masks, and again returns a pointer/variable/map which is also made the default input to appropriate tasks. Again you can over-ride this in CLEAN if you want. So in this model neither SELECT nor SETMASK are necessary -- they are conveniences, but can be over-ridden. Nor are they simply prequels to the main action -- they actually do something, even if it's only returning a pointer/variable which can then be referenced. [I do wonder about how long that thing sticks around -- if I over-ride it in CLEAN, presumably it vanishes.] * This "flat" approach is IN CONTRAST to the documentation, which must be intelligently hierarchical as well as flat. : Information should be grouped in areas that make sense to the astronomer, rather than according to the "tool" in which a method or whatever resides. Hence help imaging gets you the list of all imaging commands (with one-line descriptions), help imanalysis gets you the list of all image analysis commands, etc. Within these we could organize further -- I'm not sure of this -- maybe the most common commands near the top, and more unusual, special purpose, or low-level commands further down, or separated in some fashion. This is NOT what AIPS does -- there you just get alphabetical, which has its uses as well. I'm not sure of the details here. : This same information hierarchy should be reflected in the cookbook, and in the URM. : Some version of "apropos" would be useful too. 
: On the "flat" side: one should be able to say HELP or EXPLAIN COMMAND (or PARAMETER) for every command/parameter, at any time, and get the appropriate information. Note this is another push for consistent parameter names. Note that in all this I've concealed the association of methods and tools from the user. I'm not sure whether this means horrible hits in efficiency, programming pain, etc. -- maybe for speed one shouldn't switch back and forth between im. and cb. or some such -- but again I see that as something for the programmers to weigh in on. There were things in AIPS that just made things faster, but were a pain for the user: switching between TB and XY order of uv-data, transposing a cube before calculating moments, etc. etc. Somehow we put up with those, and similar considerations may pop implementation details up into the user domain in CASA. But we should try to avoid forcing this on the astronomer unless it really is *necessary*. Anyhow that's my two cents. I do think we should have some basic philosophy up at the top of the document, to set the basic stage, before getting into details about how input works etc. Thanks again for all your work on this. I know it's been painful. Cheers, Michael
===============================================================================
Date: Fri, 31 Mar 2006 22:58:56 -0700 (MST)
From: Walter Brisken
Subject: Re: A few uber comments

Hi Michael,

I think I am amongst those (including Steve) whom you may consider to straddle the line between developer and user, and I have some objections to your comments, though I believe that your ideas should represent ideals to strive toward, in the presence of the realities of the underlying software and the desire to unleash the full power to expert users. Below are my "Walter-centric" responses to your comments -- I take full credit/blame for my statements below. Feel free to forward this on to others as you think appropriate, or to respond to me in kind :) ... 
On Fri, 31 Mar 2006, Michael Rupen wrote: > Hi Steve, > > after this afternoon's discussion I got to wondering whether part of our > problem lies in confusing appearance, documentation, and implementation. > We want all the commands to appear alike to the user; we want to be able > to > find a sub-command by looking in obvious groupings (vs. the current > tools); > and it doesn't really matter how all this is implemented, so long as the > user doesn't have to care about it. So, here are a couple basic > principles I'm hoping we can agree on. Remember that the "user" not only includes those which just want an answer and easy path to it, but those developing pipelines, not only for universal use, but also for "large" individual projects. > * The actual system should be as "flat" as possible. > : All commands -- tasks, methods, whatever -- should be available > directly, without (e.g.) entering a different environment. I think though that all possibilities of "assisted parameter setting environments", be them graphical or text, already violate this. Even at the aips++ script level behavior within the ( ) of a method is pedantically a different environment, so I don't buy your request. I do (and think that everyone in the project does too) agree that everything that is entered at the user interface should be offered at the same face value to those writing scripts. This requirement is more or less undebatable. > : All commands -- tasks, methods, whatever -- should look alike to > the user. I.e., one should enter parameters in a similar fashion; > those parameters should have a consistent naming convention; the > commands should be similarly documented and maintained; and all > commands should be usable in the same way in scripts. The user > should not for instance care whether the implementation is through > python, C++, or FORTRAN, or even just a simple script+interface. I think this is a project (ALMA,EVLA) requirement. 
Ultimately the user is providing parameters for functions/tasks which eventually get executed. I think no one is suggesting that the entry of these parameters should be different whether it be a task or procedure. HOWEVER, it is equally undebatable that all the low-level tool-based functionality is allowed through the user interface, perhaps via slightly more esoteric incantations, given that python-level tools will almost certainly be a staple amongst expert users. > : There should NOT be another level of commands which only experts know > how to use. Note that I do not say such commands should not exist -- > there are commands in any package that only the cognoscenti use or > appreciate. But those esoteric commands should be accessible in the > standard fashion, and should be documented and maintained in the same > way. Assuming the "normal" CASA approach conceals objects (for > instance) from the user, intelligently handles opens/closes so the > user doesn't have to pay attention to that, etc., the "esoteric" > commands should share this approach. One should not have to learn > another level of complexity in style/syntax to write expert scripts > -- > one should be able to build directly on the basic system. In this > same vein the scripts (and pipelines) we provide as examples should > look like what one types at the command line, not some more advanced > form of OOP. I think you are asking for two mutually incompatible features : full support for beginner users with a stateless interface and full complexity for experts. The object oriented (state-full objects in memory) approach is hugely powerful and really cannot be side-stepped for complex scripting. I think there is a huge misconception regarding the inner workings of the toolkit-level objects based on the rejection by those not used to them. 
I'd happily admit that this interface is not for the person that just wants an image to be made from UV data (that's what tasks/pipelines are for), but requiring someone to spend a few days to learn the toolkit in order to build a pipeline is not unreasonable. The toolkit is really quite logically arranged (though could use a bit of love to smooth off some rough edges). However, due to its complexity it cannot be fully exploited in single line execution examples or purely through tasks. > : My (crazy?) interpretation of this is that > - tools, from the user point of view, should vanish > - methods, from the user point of view, should be callable directly, > with any requisite instantiation, opening, closing, etc., hidden > from the innocent eyes of the user. Maybe this means there are > really boring "tasks" written which check whether the thing is > already open correctly, open it if need be, run the method, and > return. That is an *implementation issue*, and not appropriate for > us to comment on at this stage. If Joe comes back and says "gosh, > this is hard", we negotiate until we find an acceptable compromise. > - tasks can be anything the programmers desire, but should look and > feel the same as methods. I think my perspective on this is that users doing mundane processing should not realize that they are using the toolkit and should be able to go end-to-end without explicit open/close, using a common input environment, but this should not preclude the experts from enjoying the full power of the system. I think we actually *agree* here. > : A subsidiary requirement in all this is that we retain access to > the full functionality of the current methods. Yes > : A side effect of this is that the concept of tools basically > vanishes from the user's p.o.v. A Good Thing for the OOP-challenged. Vanishes from the user's p.o.v. if the user chooses to ignore it. Alternatively the user could spend a couple days learning about a powerful way to interact with data. 
I am sort of put off by the assumption that users don't want to learn how to use a new package. Astronomers are not stupid people. So long as the learning curve is not steep, this is not a problem. > : THIS IS MORE CONTROVERSIAL BUT IS A DETAIL: > Ideally no "setup" command would be required, before running any > other command. The aim here is to avoid the complex open/specify > subsets of/run/close sequence which seems to annoy users currently. > - This obviously breaks for e.g. plotting a bandpass (which requires > that you have created said bandpass!) -- one also cannot image if > the MS hasn't yet been loaded. I don't think anyone will complain > about that sort of restriction, which is common to most packages. > - More seriously, this argues against the SELECT and SETMASK commands > we discussed running before CLEAN etc. I can weasel out of those > by thinking of those pre-commands as basically being a simpler > way of specifying input parameters to the only actual command from > the user's point of view, which is CLEAN. That's my own preference I think this perspective ignores the clash between the requirement that the UI be homeomorphic to scripting and that pipelines use the same environment. Unless you are willing to be really narrow in the focus of the UI you need to think of extreme examples and allow for the worst. Think of all the reasons you cannot write end-to-end processing easily in POPS and make sure that none of those enter into the new system. It is not an easy task! I've written pipelines for AIPS and know many of the pitfalls, and would not be happy if the same were true down the road. Global variables for example can be a huge pain in the pipeline/scripting environment -- hence Steve's (and my own) cringes at these ideas. AIPS has bitten me so many times by assuming I want to follow the standard procedure -- for example I want to use a different image model in calibration for each IF and this is _really_ hard in AIPS. 
My take is that implementation and interface need not clash, but you need to allow the interface to allow for everything that you may want. Given almost any interface that relies heavily on global variables I can almost certainly think of pains caused by such. Not relying on globals is one step that can be taken that vastly simplifies things. > I view the whole SELECT business as left over from the part of the > current interface that we most dislike -- but I don't think there's > any consensus on that. Agreed. > I guess I would do this as > 1- SELECT is possible but not essential to running CLEAN. > SELECT gives you a nice way of doing complex selections, and > returns a pointer (or a variable, or whatever) which then becomes > the default input to uv tasks like CLEAN. You can over-ride > this default within CLEAN if you like. Yes -- a good old local variable, completely specifying the state of a selection operation. A prime example of OOP in action! I am fully in agreement here. Note there may be efficiency issues associated with maintaining unused selection objects, especially for huge datasets. > 2- SETMASK creates a mask or masks, and again returns a > pointer/variable/map which is also made the default input to > appropriate tasks. Again you can over-ride this in CLEAN if you > want. Again I agree, with the same concern. > So in this model neither SELECT nor SETMASK are necessary -- they > are > conveniences, but can be over-ridden. Nor are they simply prequels > to the main action -- they actually do something, even if it's only > returning a pointer/variable which can then be referenced. [I do > wonder about how long that thing sticks around -- if I over-ride > it in CLEAN, presumably it vanishes.] Perfect. In fact, I think in many cases "select" --> new ms -> image without further selection is likely a good way to proceed. > > * This "flat" approach is IN CONTRAST to the documentation, which must > be intelligently hierarchical as well as flat. 
> : Information should be grouped in areas that make sense to the > astronomer, rather than according to the "tool" in which a method or > whatever resides. Hence > help imaging > gets you the list of all imaging commands (with one-line > descriptions), > help imanalysis > gets you the list of all image analysis commands, etc. Within > these we could organize further -- I'm not sure of this -- maybe > the most common commands near the top, and more unusual, special > purpose, or low-level commands further down, or separated in some > fashion. This is NOT what AIPS does -- there you just get > alphabetical, which has its uses as well. I'm not sure of the > details here. Good. > : This same information hierarchy should be reflected in the cookbook, > and in the URM. Definitely. > : Some version of "apropos" would be useful too. Absolutely. > : On the "flat" side: one should be able to say HELP or EXPLAIN COMMAND > (or PARAMETER) for every command/parameter, at any time, and get the > appropriate information. Note this is another push for consistent > parameter names. Be slightly careful here. A balance between number of distinct parameter names and consistency of use is non-trivial. There will almost certainly be compromises in both directions. As far as flatness goes, I think that all task-level literals should be available at the root level (in English, all tasks should be accessible without a . ) Enough tasks will transcend the tool level (i.e. use multiple, different .'s) and there will be few enough of them ( << 1000) that further classification is likely to cause more confusion than good. > Note that in all this I've concealed the association of methods and tools > from the user. Bad idea, I think, for anything not at task level. > I'm not sure whether this means horrible hits in > efficiency, > programming pain, etc. -- maybe for speed one shouldn't switch back and > forth between im. and cb. 
or some such -- but again I see that as > something > for the programmers to weigh in on. There were things in AIPS that just > made things faster, but were a pain for the user: switching between TB and > XY order of uv-data, transposing a cube before calculating moments, etc. > etc. Somehow we put up with those, and similar considerations may pop > implementation details up into the user domain in CASA. But we should try > to avoid forcing this on the astronomer unless it really is *necessary*. I agree with the latter part of this for the case of task-based applications. For tool-based usage (for efficiency's sake) the programmer should be responsible. > Anyhow that's my two cents. I do think we should have some basic > philosophy up at the top of the document, to set the basic stage, before > getting into details about how input works etc. I think your ideas are in general good from a pure end-user, non-developer perspective, but from an algorithm writer's point of view I think there are holes. I think instead of imagining a gap between the tools and the tasks, think of an overlap in functionality between tasks and tools. Better, think of tool-space as a superset of task-space, and hope for a task-space big enough for most use cases. And expect that (and allow for) complicated analysis will use tool-based functionality.
==============================================================================
Date: Mon, 3 Apr 2006 09:14:49 -0600 (MDT)
From: Michael Rupen
Subject: Re: A few uber comments

Hi Walter,

many thanks for the comments -- they are really excellent, particularly the ones slamming on me for separating user and developer. That's a very good point. I fear I partly fell into the "us vs. them" trap I was trying to avoid, after the last week of very heated (all smoke, no fire? no wait... :) discussions. 
It is desperately important not to drive any further wedges between astronomer and programmer, and to allow individuals to move easily between the two worlds. One of my concerns going into last week was whether all the confusion about the OOP approach and vocabulary (tools, methods, objects, instantiation) just resulted from our choice of testers. The non-developers are mostly pretty experienced old fogies who grew up with FORTRAN, and think function calls are a pretty cool idea :) We're building an instrument for the next generation; maybe the current postdocs and gradstu are all hep to this stuff, and wouldn't find it confusing at all. In this regard I was discouraged that Crystal and John, who are at least a little closer to the bleeding edge of astronomical _researchers_ (not developers, I know!), seem to share the pain of the older folks. I've also raised the point with a small random sampling of current gradstu and postdocs, and it seems to me that those who use rather than write software are pretty clueless about this stuff. So I fear this approach is a real issue for the interface, for many if not most of our likely users. Of course one of the most *useful* segments of users is the astronomically inclined software developer, someone who will actually dive in and make tasks that the rest of us can gratefully turn to our own use. There are alas far too few of these people, but that just means we have to work much harder to be sure we keep them happy and productive. One of the great promises of AIPS++ (and now CASA) was to make the development and implementation process easier and more enjoyable, and we should not back away from that. I had a long chat with Sanjay on Friday on this point: we can't worry so much about spiffy interfaces that we put a straitjacket on the developer, at any level, whether at the procedure/scripting level or way down in the code. So I fear this really is a hard problem. 
I do NOT want to make your life, or Steve's life, or Sanjay's life, any harder. But I DO want to be sure that astronomers totally focussed on data exploration, imaging, etc. will be happy to jump into the system and use it for their comparatively basic work. My worry on this side is that the development has now been going on so long with the current approach that the developers are all very familiar and happy with the current environment, while the basic users, the boring old astronomers, are left clueless. We need to strike a balance, and currently the system feels (from the pure astronomer's p.o.v.) very tilted towards the software developer. A few more direct responses follow...I've clipped bits of the original message to make those a little easier to find. On Fri, 31 Mar 2006, Walter Brisken wrote: > > : All commands -- tasks, methods, whatever -- should be available > > directly, without (e.g.) entering a different environment. > > I think though that all possibilities of "assisted parameter setting > environments", be them graphical or text, already violate this. Even at > the aips++ script level behavior within the ( ) of a method is > pedantically a different environment, so I don't buy your request. I do > (and think that everyone in the project does too) agree that everything > that is entered at the user interface should be offered at the same face > value to those writing scripts. This requirement is more or less > undebatable. > We agree on this point I think, just express it differently. I was referring to commands that do things, as compared to ways of inputting parameters that tell those commands what to do. I was hoping this was as you say so undebatable that we didn't need to debate it :) > > > : There should NOT be another level of commands which only experts > > know how to use. Note that I do not say such commands should not exist > > -- > > there are commands in any package that only the cognoscenti use or > > appreciate. 
But those esoteric commands should be accessible in the > > standard fashion, and should be documented and maintained in the same > > way. Assuming the "normal" CASA approach conceals objects (for > > instance) from the user, intelligently handles opens/closes so the > > user doesn't have to pay attention to that, etc., the "esoteric" > > commands should share this approach. One should not have to learn > > another level of complexity in style/syntax to write expert scripts > > -- > > one should be able to build directly on the basic system. In this > > same vein the scripts (and pipelines) we provide as examples should > > look like what one types at the command line, not some more advanced > > form of OOP. > > I think you are asking for two mutually incompatible features : full > support for beginner users with a stateless interface and full complexity > for experts. The object oriented (state-full objects in memory) approach > is hugely powerful and really cannot be side-stepped for complex > scripting. I think there is a huge misconception regarding the inner > workings of the toolkit-level objects based on the rejection by those not > used to them. I'd happily admit that this interface is not for the person > that just wants an image to be made from UV data (that's what > tasks/pipelines are for), but requiring someone to spend a few days to > learn the toolkit in order to build a pipeline is not unreasonable. The toolkit is > really quite logically arranged (though could use a bit of love to smooth > off some rough edges). However, due to its complexity it cannot be fully > exploited in single line execution examples or purely through tasks. What I do NOT want is a completely different way of thinking when I write scripts, to when I run tasks in the "standard" way. I would like some way of leading the basic user (we need a term here :) towards scripting, gently, beckoning onward rather than raising barriers. If we say, "oh, you're writing a _script_! 
Well then, first you should learn about objects..." that could really stop people from using this. NOT a good thing. One response -- the current response -- is to say "Well, this is just too powerful, and everyone just has to think this way." That's not good enough. In fact it drives the basic users crazy! (well, crazier, anyhow :) If we must have all the baggage to get all the power, we have to come up with some way to either 1- lead people gently from the sort of task approach where objects are hidden, to the full-up OOPing needed for the most powerful scripts, or 2- make the use of OOP at the basic level more intuitive to those not used to programming. I didn't say this was easy. I just want to make clear our top-level requirements, that basic user, developer, and pure programmer ALL be happy. > > : My (crazy?) interpretation of this is that > > - tools, from the user point of view, should vanish > > - methods, from the user point of view, should be callable directly, > > with any requisite instantiation, opening, closing, etc., hidden > > from the innocent eyes of the user. Maybe this means there are > > really boring "tasks" written which check whether the thing is > > already open correctly, open it if need be, run the method, and > > return. That is an *implementation issue*, and not appropriate for > > us to comment on at this stage. If Joe comes back and says "gosh, > > this is hard", we negotiate until we find an acceptable compromise. > > - tasks can be anything the programmers desire, but should look and > > feel the same as methods. > > I think my perspective on this is that users doing mundane processing > should not realize that they are using the toolkit and should be able to > go end-to-end without explicit open/close, using a common input > environment, but this should not preclude the experts from enjoying the > full power of the system. I think we actually *agree* here. Yes! Hooray! 
:) > > : A subsidiary requirement in all this is that we retain access to > > the full functionality of the current methods. > > Yes > Sorry, I just had to leave in a few YESes... > > : A side effect of this is that the concept of tools basically > > vanishes from the user's p.o.v. A Good Thing for the > > OOP-challenged. > > Vanishes from the user's p.o.v. if the user chooses to ignore it. > Alternatively the user could spend a couple days learning about a powerful > way to interact with data. > > I am sort of put off by the assumption that users don't want to learn how > to use a new package. Astronomers are not stupid people. So long as the > learning curve is not steep, this is not a problem. But the current approach has not been very well received. Astronomers are willing to learn to use a new package; learning a new way of thinking is something else again. If we don't want to lose astronomer/developers like you, we also don't want to lose interferometric pundits like Ed, and we don't want to p--- off users the first time they try out the system. The differences between the styles of IRAF, Miriad, AIPS, IDL-based packages, DS9, Karma, difmap, etc. etc. are vastly less than those between CASA and any one of these systems. If you know of another package in general astronomical use which looks and feels like CASA, please tell me about it! Maybe we could learn how they got their community to use the thing. I would love a counter example here. I guess another way of putting this is that the learning curve DOES seem to be steep, so we do have a problem. > I think your ideas are in general good from a pure end-user, non-developer > perspective, but from an algorithm writer's point of view I think there > are holes. I think instead of imagining a gap between the tools and > the tasks, think of an overlap in functionality between tasks and tools. > Better, think of tool-space as a superset of task-space, and hope for a > task-space big enough for most use cases. 
And expect that (and allow for) > complicated analysis will use tool-based functionality. My view of this group's task was indeed to think of this from the pure end-user's p.o.v., since the developers seem basically happy at the moment while the end-users have been fairly grumpy. Both groups may have to compromise. The beginning of that process seems to me to be to state the ideals, then move on to the discussion and what compromises may have to be made, on both sides. Again, many thanks for all the comments. I didn't respond to every one individually but I read and am thinking about them all! Cheers, Michael
==============================================================================