Software that might be interesting for Hall D

E.Wolin

26-Nov-2002




Introduction

Below I list and comment on a number of software packages and strategies that might be interesting for Hall D. This list is incomplete, and will change without warning. For an interesting perspective on open-source programming and other topics, see essays by Eric Steven Raymond, especially "The Cathedral and the Bazaar".


XML

XML has become the standard portable data interchange format. We could use it for calibration files, configuration files, GUI building, and documentation. I agree with Larry Dennis that all data interchange should be done in XML format, except perhaps for event and Monte Carlo data.

Note that the base document for this page is an XML document that validates against its Document Type Definition (DTD, but see below). You may need to view the document source, since many browsers don't display XML and related files properly. I used an XSL stylesheet to transform the base XML into HTML (using XSLT, or XSL Transformations). See also a Cascading Style Sheet (CSS), an older method for displaying XML in a browser; CSS styles the XML directly rather than converting it to HTML.

XSLT generally transforms one XML document into another.

Note that DTDs have recently been superseded by XML Schemas.

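Many parsers are available in C/C++, Java, Perl, etc. The two main parsing strategies are SAX (stream-oriented: the parser reports elements as it reads them) and DOM (Document Object Model: the parser builds the whole document as a tree in memory).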
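To make the difference between the two strategies concrete, here is a minimal sketch using the parsers in the Python standard library. The calibration fragment, its element and attribute names, and the function names are all invented for illustration; they are not a proposed file format.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical calibration fragment; element and attribute names are invented.
CALIB_XML = """<calibration detector="example">
  <channel id="1" gain="1.02"/>
  <channel id="2" gain="0.97"/>
</calibration>"""


class GainHandler(xml.sax.ContentHandler):
    """SAX: the parser streams through the text and calls us once per element."""

    def __init__(self):
        super().__init__()
        self.gains = {}

    def startElement(self, name, attrs):
        if name == "channel":
            self.gains[int(attrs["id"])] = float(attrs["gain"])


def gains_via_sax(text):
    """Collect channel gains without ever holding the whole document."""
    handler = GainHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.gains


def gains_via_dom(text):
    """DOM: load the whole document into a tree, then query the tree."""
    doc = minidom.parseString(text)
    return {
        int(node.getAttribute("id")): float(node.getAttribute("gain"))
        for node in doc.getElementsByTagName("channel")
    }
```

Both functions return the same dictionary for the fragment above. SAX tends to suit large files that are read once, while DOM is more convenient when the document must be navigated or modified.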

(I use XML in this document as an example; I am not proposing a document format.)


Electronic logbook

The CODA group at JLab is working on an integrated notebook/logbook tailored for experiment use (Codalog). It is still under development.

The Control Room Logbook uses XML to define a custom GUI built from predefined widgets. The DOE2000 logbook was developed by a group of three national labs.

The Fermilab logbook is an extension of the DOE2000 logbook. The MIDAS project at TRIUMF provides a simple but robust elog. The CLAS online logbook can support the features in all of these, but lacks a sophisticated GUI.

See also a Fermilab review of electronic logbooks.

Products range from simple, single-threaded, glorified flat-file managers with a web interface, to sophisticated, multi-threaded, self-notarizing "notebook" systems that accept documents and images of all types. Some are 100% web-based, others stand-alone or mixed. They use databases, XML files, HTML files, all three, or just flat files.


Code Management

CVS is a standard release and version control package. I personally don't like CVS much (single repository bias and lack of detailed history). There are good commercial alternatives, but I'd rather use CVS than pay. There are some public domain alternatives, none widely used yet.

One very promising package is BitKeeper. Read the FAQ for an excellent overview and criticism of CVS. BitKeeper was developed for, and adopted by, projects such as Linux kernel development. I have been using BitKeeper for a while and think we should use it.

See also the Sun Teamware document which describes a case study of a similar code management system.

Others: Subversion is intended as an improved replacement for CVS.

Some sites devoted to code management: www.dmoz.org and CMTools.


Frameworks

We need to structure our software around some sort of formal framework. Gaudi from LHCb can be used as a starting point for discussion, although I believe it is too much for us.

See also an article on Evolutionary Delivery and other interesting notes from the ATLAS DAQ Software Development Site.


Geant4

Geant4 is the future of simulations in HENP. We should start learning Geant4.


GGE and GPE

Graphical Geant4 geometry and physics generators might be useful.


MCFAST

We use MCFAST now. It might even be useful for the life of the experiment, but it would need to be kept consistent with the standard (Geant4) simulation.


Inter-Process Communication

I have extensive experience with SmartSockets, a publish/subscribe interprocess communication package (used in CLAS and also in CDF). CORBA is an old standard for remote method invocation, but seems to be disappearing. I think CORBA is over-designed, overly complicated, old-fashioned, and shouldn't be used.

Other products exist. We need to decide between message-based and object-based communication. The former is simpler and more flexible; the latter is for more tightly coupled systems.
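The distinction between the two styles can be sketched with a toy, in-process example. This is only an illustration of the idea and does not use the SmartSockets or CORBA APIs; the class names, subject string, and message contents are all invented.

```python
from collections import defaultdict


class MessageBus:
    """Toy in-process publish/subscribe broker (illustration only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, subject, callback):
        """Register a callable to receive every message sent to a subject."""
        self._subscribers[subject].append(callback)

    def publish(self, subject, message):
        """Deliver a message; the publisher does not know who is listening."""
        listeners = self._subscribers.get(subject, [])
        for callback in listeners:
            callback(message)
        return len(listeners)


class HighVoltage:
    """Object-based style: the caller holds a reference and invokes a method."""

    def __init__(self):
        self.setpoint = 0.0

    def set_voltage(self, volts):
        self.setpoint = volts
        return self.setpoint


# Message-based: sender and receiver share only a subject name.
bus = MessageBus()
received = []
bus.subscribe("hv/status", received.append)
delivered = bus.publish("hv/status", {"channel": 3, "volts": 1500.0})

# Object-based: the caller must know the object and its interface.
hv = HighVoltage()
hv.set_voltage(1500.0)
```

In the message-based half, adding a second listener requires no change to the publisher, which is the flexibility referred to above. In the object-based half, the caller depends directly on the `HighVoltage` interface.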

Agents are a new concept which might be useful (there's loads of info...use Google). FIPA agents are being tested by the CODA group at JLab.

Perhaps SOAP is the wave of the future. Or maybe JMS or SOAP's cousin XML-RPC, or a newcomer XML Blaster. See the latter under internet resources for links to all kinds of middleware info.


Controls Software

Many commercial and public domain products and systems exist. Perhaps we can unite them all under CDEV (see below). Software must be evaluated along with available hardware by the online group. The CERN Controls Group has evaluated many systems. We should look carefully at the agent-based package being developed by the CODA group.


CDEV

CDEV unifies control systems via a thin dispatch layer to multiple protocols, and is mainly being developed at JLab. It uses a "device, context, command, datain, dataout" paradigm. CORBA, SmartSockets, and Java RMI could be layers under CDEV, but with some loss of functionality.
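The idea of a thin dispatch layer can be sketched as follows. This is not the CDEV API: the `Dispatcher` class, the `send` signature, the backend function, and the device and command strings are all hypothetical, chosen only to mirror the "device, context, command, datain, dataout" vocabulary.

```python
class Dispatcher:
    """Hypothetical thin layer routing (device, command) requests to a backend."""

    def __init__(self):
        self._backends = {}

    def register(self, device, backend):
        """Associate a device name with the protocol layer that serves it."""
        self._backends[device] = backend

    def send(self, device, command, datain=None, context=None):
        """Forward a request and return the backend's reply ('dataout')."""
        backend = self._backends[device]  # raises KeyError for unknown devices
        return backend(command, datain, context or {})


def fake_magnet_backend(command, datain, context):
    """Stand-in for a real protocol layer; replies are plain dictionaries."""
    if command == "get current":
        return {"value": 12.5, "units": context.get("units", "A")}
    if command == "set current":
        return {"value": datain["value"], "status": "ok"}
    raise ValueError(f"unsupported command: {command}")


dispatcher = Dispatcher()
dispatcher.register("magnet1", fake_magnet_backend)
reading = dispatcher.send("magnet1", "get current")
reply = dispatcher.send("magnet1", "set current", datain={"value": 20.0})
```

The caller sees only device names and command strings, so the protocol behind `magnet1` could be swapped without changing client code. The price is that every backend is reduced to the common request shape, which is one way to read the "loss of functionality" mentioned above.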


EPICS

I dislike EPICS. I think it's overblown, too complicated, out in left field, closed and insular. If you don't buy into EPICS completely you soon start cursing it. EPICS has just now (2003) realized that Java exists, an indication of neanderthal-think in the EPICS community. If the old guard gets overthrown and sensible people take over, perhaps EPICS may be interesting.

The Abeans/Cosybeans framework looks interesting, and the JLab DAQ group may develop a controls framework.


UML

Unified Modeling Language for software design...should look into this, but maybe wouldn't help much...should we get this formal? UML seems to have fizzled out as fast as it became popular...


DOE2000

The DOE2000 project will develop collaboration tools as well as numerical programming and visualization packages. They may produce something useful for us.


Databases

Do we need an object-oriented database? Relational databases (e.g. MySQL) should be adequate. Postgres seems very good.


Data format

I have extended and improved the CODA binary I/O package (EVIO) to meet Hall D requirements. In particular, I added a new 1-word header fragment type, and added full support for 8-byte data types. I further wrote a pair of routines that transform EVIO to XML (evio2xml) and back (xml2evio). evio2xml replaces cefdump, and probably xcefdump. I also wrote eviocopy, a utility that extracts selected events from an existing EVIO file. I think we should use EVIO for our binary format.

For all other formats I propose we use XML or a compressed version of XML.


ROOT

Should we use ROOT? How deeply? Many big experiments are using ROOT for their data format, online histogramming, etc.


JAS

Should we use JAS? The plotting widgets look good. BaBar is using it for online histogramming and perhaps for offline analysis. CNU has been using JAS for the trigger analysis and is very happy with it.


Code Documentation

Literate programming is an effort to document programs by writing documents that contain code (i.e. a document-centric approach). These efforts haven't been too successful. CWEB is a variant for C and C++.

The most successful strategies are those that document the code via comments, then use a syntax-sensitive preprocessor to extract and reorganize the comments into documentation (i.e. code-centric systems). Javadoc is widely used for Java. See also Doxygen, which handles multiple languages.

If we don't use a multi-language documentation package like Doxygen, I believe we should use Javadoc for all Java programs, and pick an equivalent for C++ (perhaps ccdoc).


CLHEP

CLHEP is a widely used standard C++ HENP class library.


HTL

HTL is the LHC++ Histogram Template Library.


STDHEP

STDHEP is another standard C++ HENP Monte Carlo class library.


CERN ASD

Who knows what else might be useful from the CERN ASD.


ZOOM, FNAL stuff, etc

We need to check out ZOOM and other Fermilab and Fermilab Run II offerings. FreeHEP lists lots of stuff. Also: HepVis, OpenInventor from SGI, and OpenScientist.


NAG libraries

NAG replaces mathematical parts of Cernlib, but costs money.


Velocity, servlets, PHP, JSP, JavaScript, etc.

For server- and client-side scripting, document transformation, etc. Velocity + servlets looks very good; JSP and PHP violate the MVC (model-view-controller) paradigm. JavaScript still seems useful.


Colt

Java numeric and analysis libraries from CERN.


SoftRelTools

Software release tools used by BaBar, CDF, D0, etc.


STL

C++ Standard Template Library, widely used.


Alternatives to Make

I don't like Make. CONS from GNU is worth looking at, and perhaps COOK.

See also a critique of recursive make.

Recent alternatives: ANT from Apache (Java only), nmake from Lucent, and JAM.


WIRED

CERN web-based event display package.


GUI Markup Languages

XML-based GUI generation may be interesting: UIML, BML, probably others.


Kalman Filter fitting

We must learn how to do a Kalman filter track fit. A package from CERN might be interesting, if they did a good job. Better perhaps is an algorithm from CLEO called "Billoir" fitting. Billoir apparently worked out how to include energy loss in the usual Kalman fit formalism, a major advance.
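As a starting point for learning the technique, here is a minimal sketch of a Kalman filter fit of a straight-line track in one projection. It is a textbook predict/update loop, not the CERN package or the Billoir algorithm: there is no magnetic field, no multiple scattering (zero process noise), and no energy loss. The function name, arguments, and the vague starting covariance are assumptions made for illustration.

```python
def kalman_line_fit(hits, sigma, y0=0.0, slope0=0.0, big=1.0e6):
    """Fit y = a + b*z to hits with a two-parameter Kalman filter.

    hits  : list of (z, y_measured) pairs, ordered in z
    sigma : measurement resolution in y (same for every hit, an assumption)
    big   : large starting variance, i.e. almost no prior knowledge
    Returns (y, slope, (pyy, pyt, ptt)) evaluated at the last hit.
    """
    y, t = y0, slope0
    pyy, pyt, ptt = big, 0.0, big  # symmetric 2x2 covariance of (y, slope)
    r = sigma * sigma              # measurement variance
    z_prev = hits[0][0]
    for z, measured in hits:
        dz = z - z_prev
        # Predict: x' = F x and P' = F P F^T with F = [[1, dz], [0, 1]].
        y = y + t * dz
        pyy = pyy + 2.0 * dz * pyt + dz * dz * ptt
        pyt = pyt + dz * ptt
        # Update with a measurement of y only, H = [1, 0].
        s = pyy + r                     # variance of the predicted residual
        ky, kt = pyy / s, pyt / s       # Kalman gain
        residual = measured - y
        y = y + ky * residual
        t = t + kt * residual
        # P = (I - K H) P'; the right-hand side uses the predicted values.
        pyy, pyt, ptt = (1.0 - ky) * pyy, (1.0 - ky) * pyt, ptt - kt * pyt
        z_prev = z
    return y, t, (pyy, pyt, ptt)


# Noise-free hits on the line y = 1 + 2 z, at five detector planes.
example_hits = [(float(z), 1.0 + 2.0 * z) for z in range(5)]
y_last, slope, cov = kalman_line_fit(example_hits, sigma=0.1)
```

For these noise-free hits the filter should return a position close to 9 at the last plane and a slope close to 2. In a real track fit, the prediction step would add a process-noise term for multiple scattering and adjust the state for energy loss, which is where the Billoir work comes in.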


FNAL OO experts

Should we invite them here for discussions?


Mozilla tools

Mozilla has a number of interesting tools, e.g. Bugzilla, Bonsai, Tinderbox, etc.


Software Configuration

Should we use a software configuration tool, e.g. autoconf?


Video/Audio/PPT archives

CERN has an interesting system for archiving conferences. They combine video, audio, and ppt presentations in one unified presentation called webcast. An LHC XML conference was recorded this way.