ProcessRun starts the whole process from the CUE machines and takes a run number as its first argument. The usage for ProcessRun is as follows:
Usage: ProcessRun [-h] run_number +env <string> +exe <exec> [+o <OS>]
                  [+queue <queue>] [-OPTS] ARGS
  -h               Print this message.
  run_number       Run number to be processed.
  +env <string>    File name, w/path, of environment variable text file.
  +exe <exec>      Executable to be run w/full path specified.
  +o <OS>          Operating system to run job on [default = Linux].
  +queue <queue>   Farm queue to run on [default = ].
  -OPTS            Options preceded by '-' passed to a1c.
  ARGS             Arguments are passed to executable.
Any arguments, options preceded by a minus sign, or options that ProcessRun does not recognize are passed automatically to the executable that will be run on the batch farm.
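The pass-through rule can be illustrated with a short Python sketch. This is not the ProcessRun source: the function name split_arguments and the OWN_OPTIONS set are hypothetical, and the sketch assumes that each of ProcessRun's own '+' options takes exactly one value and that the run number, when given, is the first argument. Handling of -h is omitted.

```python
# Hypothetical sketch of the pass-through rule; not the actual ProcessRun code.
# Assumption: each option ProcessRun consumes itself takes exactly one value.
OWN_OPTIONS = {"+env", "+exe", "+o", "+queue"}


def split_arguments(argv):
    """Split argv into (run_number, own_options, forwarded_arguments)."""
    run_number = None
    own = {}
    forwarded = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in OWN_OPTIONS and i + 1 < len(argv):
            # Consumed by ProcessRun itself, together with its value.
            own[arg] = argv[i + 1]
            i += 2
            continue
        if i == 0 and arg.isdigit():
            # Assumption: a leading all-digit argument is the run number.
            run_number = arg
        else:
            # '-' options and anything unrecognized go to the farm executable.
            forwarded.append(arg)
        i += 1
    return run_number, own, forwarded
```

Under these assumptions, options such as +P and +base are not recognized by ProcessRun and are forwarded to the executable in their original order.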
The other way in which the "cooking" scripts share information is through environment variables. ProcessRun's +env argument is therefore essential to the execution of all of the scripts and must be specified when running ProcessRun. This option lets the user specify a text file that contains all of the environment variables used by the scripts, along with their definitions.
The variables CLAS_PROG, CLAS_DB, CLAS_OUT_PATH, and CLAS_TAPE_PATH define the locations where output from RunJob and the executables it runs will be placed. If these directories do not exist before ProcessRun is executed, the processing scripts create them as they are needed. The standard directory structure used by these scripts is shown in Fig. 1. A sample text file to use with the +env option can be found under CVS with the name scripts/cooking_scripts/ENV_SRC_FILE.
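The create-on-demand behavior can be sketched as follows. The layout of the environment file is not described here, so the sketch assumes one NAME=value definition per line with '#' comments; the function names load_env_file and ensure_output_dirs are hypothetical. Only the four variable names are taken from the text.

```python
import os

# Variable names taken from the text above.
OUTPUT_VARS = ("CLAS_PROG", "CLAS_DB", "CLAS_OUT_PATH", "CLAS_TAPE_PATH")


def load_env_file(path):
    """Parse an environment file, ASSUMING one NAME=value pair per line."""
    env = {}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and lines without a definition
            name, value = line.split("=", 1)
            env[name.strip()] = value.strip()
    return env


def ensure_output_dirs(env):
    """Create any missing output directory and return the ones created."""
    created = []
    for name in OUTPUT_VARS:
        directory = env.get(name)
        if directory and not os.path.isdir(directory):
            os.makedirs(directory)
            created.append(directory)
    return created
```

Calling ensure_output_dirs a second time returns an empty list, since every directory already exists by then.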
Returning to the initial discussion, if a run number is not specified, ProcessRun goes to the directory specified by the CLAS_DB environment variable and opens the previously mentioned file list. This ASCII file should contain the run numbers to be processed, one per line. Once ProcessRun has a run number, it passes it to nextRun, which checks that the run has not already been submitted and has not been marked to be skipped. nextRun does this by searching for the run number in two files: done, which ProcessRun generates automatically after a job is submitted to LSF, and skip, which the user can create. Both files have the same format as list and also reside in the CLAS_DB directory. If the skip file does not exist, nextRun looks only at the done file. If the user supplied the run number and the run has already been processed or has been marked to be skipped, the user is asked whether to continue submitting that particular run. If the run was picked from list, nextRun continues to take runs from list until it finds one that is suitable for processing.
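The selection logic for runs taken from list can be summarized in a minimal Python sketch. It is an illustration under stated assumptions, not the nextRun source: the function names are hypothetical, run numbers are compared as plain strings, and the interactive prompt for user-supplied runs is left out.

```python
import os


def read_run_file(path):
    """Return the run numbers in a one-per-line file; a missing file is empty."""
    if not os.path.exists(path):
        return set()
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def next_run(db_dir):
    """Return the first run in 'list' found in neither 'done' nor 'skip'."""
    excluded = read_run_file(os.path.join(db_dir, "done"))
    # A missing skip file contributes nothing, so only 'done' is consulted.
    excluded |= read_run_file(os.path.join(db_dir, "skip"))
    list_path = os.path.join(db_dir, "list")
    if not os.path.exists(list_path):
        return None
    with open(list_path) as handle:
        for line in handle:
            run = line.strip()
            if run and run not in excluded:
                return run
    return None  # every listed run is already done or skipped
```

Reading list in file order keeps the processing order under the user's control, which is why the sketch scans the file line by line rather than loading it into a set.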
Once a run has been selected, ProcessRun calls makeSub. makeSub creates a submission file that tells the farm node how to execute the job passed to it. It takes arguments from ProcessRun about the operating system to run on, the queue to place the job into, and the location of the executable that will perform the job.
After this submission script is completed, a file of the form
PROJECT: clasE1
JOBNAME: Process-16159
COMMAND: /work/clas/production/e1/pass0.x/cooking_scripts/RunJob
MAIL: claschef@cebaf.gov
OS: Linux
QUEUE: redhat52
OPTIONS: 16159 +P 0x4000 +base prod17
INPUT_FILES: /mss/clas/e1b/production/pass0.x/prod17/cooked/
/run16159_prod17.A00.00 /mss/clas/e1b/production/pass0.x/prod17/
cooked/run16159_prod17.A01.00
exists in the CLAS_DB directory. The above example is for the first two files of run 16159, to be processed on a Linux machine that is part of the redhat52 queue. The OPTIONS line contains the arguments that were passed to ProcessRun and are in turn passed on to the executable running on the farm, which in this case is RunJob. ProcessRun then submits the job to LSF on the user's behalf.
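A file with this layout could be assembled along the following lines. The sketch only mirrors the keywords visible in the example above; the function name build_submission and its parameters are hypothetical, the "Process-" job name prefix is inferred from the example, and nothing here is taken from the makeSub source.

```python
def build_submission(run_number, command, options, input_files,
                     project, mail, os_name="Linux", queue=""):
    """Return submission-file text using the keywords shown in the example."""
    lines = [
        "PROJECT: %s" % project,
        # Assumption: the job name is 'Process-' followed by the run number.
        "JOBNAME: Process-%s" % run_number,
        "COMMAND: %s" % command,
        "MAIL: %s" % mail,
        "OS: %s" % os_name,
        "QUEUE: %s" % queue,
        # The run number leads the forwarded options, as in the example.
        "OPTIONS: %s" % " ".join([str(run_number)] + list(options)),
        "INPUT_FILES: %s" % " ".join(input_files),
    ]
    return "\n".join(lines) + "\n"
```

For instance, passing run number 16159 with the options +P 0x4000 +base prod17 yields the OPTIONS line shown in the example.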
In their current form, the scripts ProcessRun, nextRun, and makeSub can be used to process a number of runs on the batch farm using any executable. For the calibration and cooking processes, though, the program that is run on the farm nodes is the Perl script RunJob. RunJob performs the task of executing other binaries, managing their output files, and parsing their output for important information that will be stored in the off-line database. A schematic diagram of the operation of RunJob is shown in Fig. 2. RunJob has the following usage:
Usage: RunJob run_num [-h] +P 0x# +env <string> [+base <string>]
              [+electron|+photon] [+S] [+se | +sp]
  -h               Print this help message.
  run_num          Run number to be processed.
  +P [0x#]         Bitwise process flag.
                     0x1      Run a1c.
                     0x2      Run pid_mon.
                     0x4      Run pdu.
                     0x8      Run scaler_mon.
                     0x10     Run photon_mon.
                     0x20     Run elastic_mon.
                     0x40     Run inelastic_mon.
                     0x80     Run cc_mon.
                     0x100    Run e_filter.
                     0x200    Run g_filter.
                     0x400    Run physfilter.
                     0x800    Run trk_mon.
                     0x1000   Run rf_mon.
                     0x2000   Run sc_mon.
                     0x4000   Run Cole Smith's EC ntuplizer.
                     0x8000   Run Sync check.
                     0x10000  Run KK_filter.
                     0x20000  Run Italian filter.
  +electron        Set run type as electron.
  +photon          Set run type as photon.
  +S               Run pid_mon as seb_mon also.
  +base <string>   Basename to attach to each file.
  +se              Run standard electron processing (+electron
                   +P 0x3fdef +S -O -i -F -D0xf3dd -P0x3fb -se)
  +sp              Run standard photon processing (+photon -i
                   +P 0xb88f -T1 -st1 -O -D0x103d -cm0 -P0x1fff)
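Because the +P value is a bit mask, it can be easier to read once decoded. The following Python sketch uses the bit assignments from the usage text above; the function name decode_process_flag and the shortened program labels are illustrative and do not come from RunJob itself.

```python
# Bit assignments copied from the RunJob usage text; labels are shortened.
PROCESS_BITS = [
    (0x1, "a1c"),
    (0x2, "pid_mon"),
    (0x4, "pdu"),
    (0x8, "scaler_mon"),
    (0x10, "photon_mon"),
    (0x20, "elastic_mon"),
    (0x40, "inelastic_mon"),
    (0x80, "cc_mon"),
    (0x100, "e_filter"),
    (0x200, "g_filter"),
    (0x400, "physfilter"),
    (0x800, "trk_mon"),
    (0x1000, "rf_mon"),
    (0x2000, "sc_mon"),
    (0x4000, "EC ntuplizer"),
    (0x8000, "sync check"),
    (0x10000, "KK_filter"),
    (0x20000, "Italian filter"),
]


def decode_process_flag(flag):
    """Return the programs selected by a +P value, in bit order."""
    return [name for bit, name in PROCESS_BITS if flag & bit]
```

For example, decode_process_flag(0x4000) selects only the EC ntuplizer, while the standard electron value 0x3fdef selects every program except photon_mon and g_filter.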
The executable a1c itself has a large number of flags that can be set. Therefore, any argument preceded by a '-' that is passed to RunJob is automatically passed along to a1c. All of the other monitoring and calibration programs have their command-line arguments hard-coded into their specific subroutines. These subroutines are all found in CLAS_PACK/packages/scripts/cooking_scripts. The premise of these packages, which use the naming convention RunNameofProgram.pl, is that the data file they run on has already been "cooked" using the standard a1c options. Thus, they will remake the banks that are typically dropped in production "cooking".
One other RunJob option that deserves mention is the flag +base. All of the programs that RunJob executes allow the user to specify the output file name. The +base flag lets the user select a standard name that is added to these output file names in order to identify them. The run number is also added automatically to the name of each output file. When using a1c, if the +base option is specified for RunJob, it is not necessary to specify the -N option for a1c. Similarly, if RunJob executes both a1c and sync, it is not necessary for the user to specify the -y option for a1c; RunJob takes care of this automatically as well.
The most important task that RunJob performs is to monitor and report the status of the executables that it runs. The first method RunJob uses is to constantly update a progress file specific to the data file being processed. This ASCII file is located in the CLAS_PROG directory and gives the chef an invaluable tool for analyzing the health of a particular job. The file primarily stores time stamps recording when each executable was started and when it completed. The second method RunJob uses to monitor the executables it runs is to parse their output and store it in the Perl off-line database that was previously discussed in this paper.
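The first monitoring method, time-stamping the start and end of each executable, can be sketched as follows. The layout of the progress file is not specified in this paper, so the line format, the event labels, and the function names below are all assumptions made for illustration; the callable passed as action stands in for running the real executable.

```python
import time


def log_progress(progress_path, executable, event):
    """Append one time-stamped line (ASSUMED format) to the progress file."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(progress_path, "a") as handle:
        handle.write("%s %s %s\n" % (stamp, executable, event))


def run_with_progress(progress_path, executable, action):
    """Record when 'action' (a stand-in for the executable) starts and ends."""
    log_progress(progress_path, executable, "begun")
    result = action()
    # Reached only if action() returned, so a missing 'completed' line
    # indicates that the step did not finish.
    log_progress(progress_path, executable, "completed")
    return result
```

Appending rather than rewriting the file means a job that dies partway through still leaves its last "begun" entry behind, which is what makes such a file useful for diagnosing a stalled job.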