Jianglai 11/06/01
For comments, please email me at jianglai@jlab.org.

This is a note on how to retrieve data files from the JLab SILO system and use the batch system to do the analysis for G0. If you are familiar with 'jcache' and 'jsub', go to part 3.

1. Data files are dumped automatically onto the Mass Storage System (mss). You can see them from any CUE machine (jlabs1, jlabs2, etc.). The directory for G0 in the chaintest was /mss/hallc/gzero/chaintest. However, you cannot access the files until they are moved onto the temporary cache disk. The command to move the files (tapes) from mss to the cache disk is 'jcache'. For instance, 'jcache /mss/hallc/gzero/chaintest/run100.dat' will take run100.dat from mss to the cache disk as /cache/mss/hallc/gzero/chaintest/run100.dat. Note the difference in paths: the cache path is the mss path with /cache in front.

If you want to move a bunch of files at once, it is better to put all the filenames in a file and use the command 'jcache -f myfile'. I think the last character of this file has to be a 'return' (newline). Another trick is to request files that sit on the same tape together. To get the tape ID and status information of a file, do 'jls -a -f filename'. The full description of the silo commands can be found on the web site: http://cc.jlab.org/scicomp/

2. You may need to wait a while for jcache to finish its job. After that, you can access the data files on the cache disk. To analyze the data files on the batch system, use the 'jsub' command. Basically, you need to create a 'command file' (say, myjob.command) with a certain format. The details of the format are described on the web page mentioned above. You need to provide the 'PROJECT NAME' (which project's computing time you will use, for example, 'hall c') and the 'COMMAND' to run, given with its full path (it can be an EXECUTABLE or a SCRIPT). I will provide an example later. The syntax for jsub is 'jsub $commandfile'. After jsub submits a job, the next available farm machine will process your job.
You can choose to receive an email notification once it is done.

3. The above two steps are basically what one needs to get a job done. To really take advantage of the batch system, you may want to submit several jobs to the farm so that individual data files can be analyzed in parallel. A smarter approach is to write some scripts that execute jcache and loop jsub over all the cached files. Here are some examples of the scripts that Paul and I wrote. I hope they will give you some idea of how this works.

On the CUE, go to /home/jianglai/analyzer/work/. You will find a file job_submitter.csh. This is a script that performs both the 'jcache' and the loop over 'jsub'. Sorry for the clumsy 'awk' and 'sed'. We may replace them later with neater perl commands. There are some files (e.g. 'goodfile') in the same directory that contain the filenames (with full paths) that we wanted to move to the cache disk and analyze. So, step by step,

>> The script does a jcache -f goodfile1.
>> Loop through all the file names
   ++++ Once one of the files is on the cache, create a UNIQUE command file. The format of the command file is generic. Note that the 'COMMAND' we provide in this command file is 'asym_analysis_silo.csh $cachefile', where asym_analysis_silo.csh is another script in the same directory that does the analysis job and $cachefile is the argument of this script. It is the data file we just put onto the cache, and it will be analyzed by the 'g0analysis -r' command inside the asym_analysis_silo.csh script. You may want to take a look at an existing command file, e.g., '8319.command' in the same directory.
   ++++ Then jsub the command file that we just created.
>> End of the loop

Now, let's take a look at asym_analysis_silo.csh. It also looks very clumsy, but I hope I can explain the general idea. The basic thing here is that, since we want to do the analysis in parallel on several different data files, we first initialize the G0Analysis code each time we call this script.
Then we link the argument of the script (the data file we want to analyze) into the default $G0SCRATH/data directory and call the executable 'g0analysis -r runnumber ...'. The rest is just a matter of putting the output files (root files and some error files) into the right places with the right names. We had to deal with some tricky things, since sometimes the SAME run is split into several files and we need to work around that when we analyze these files in parallel. This makes the script ugly. Please ignore that part if it is irrelevant to you.
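
For readers who prefer something other than csh, here is a rough Python sketch of the bookkeeping that job_submitter.csh does, as described in part 3. It is not the actual script. It only covers what is stated in this note: the /mss to /cache/mss path mapping, a list file for 'jcache -f' that ends with a newline, and one 'jsub' per file that has arrived on the cache. The function names, the 'is_cached' hook, and naming the command file after the data file (run100.command) are assumptions made for illustration. The sketch only builds the commands and does not run them, and it does not write the command file contents, since the format is described on the scicomp web page rather than here.

```python
import os
from pathlib import PurePosixPath

# From the note: /mss/... shows up on the cache disk as /cache/mss/...
CACHE_ROOT = "/cache"


def cache_path(mss_path):
    """Map an mss path to the path the file will have on the cache disk."""
    if not mss_path.startswith("/mss/"):
        raise ValueError("not an mss path: " + mss_path)
    return CACHE_ROOT + mss_path


def file_list_text(mss_paths):
    """Contents of the list file for 'jcache -f': one path per line.

    Every line, including the last one, ends with a newline, which is
    what the note suspects jcache needs.
    """
    return "".join(path + "\n" for path in mss_paths)


def plan_submission(mss_paths, list_file="goodfile", is_cached=os.path.exists):
    """Return the commands the workflow would run, as argument lists.

    Dry run only: nothing is executed. 'is_cached' is a hypothetical hook
    (default: check that the cache file exists) so the logic can be tried
    on a machine without the cache disk.
    """
    commands = [["jcache", "-f", list_file]]
    for mss_path in mss_paths:
        cached = cache_path(mss_path)
        if not is_cached(cached):
            continue  # still on tape; a real script would wait and retry
        # Assumed naming scheme: one unique command file per data file.
        command_file = PurePosixPath(cached).stem + ".command"
        commands.append(["jsub", command_file])
    return commands


if __name__ == "__main__":
    files = ["/mss/hallc/gzero/chaintest/run100.dat"]
    print(file_list_text(files), end="")
    # Pretend every file is already on the cache disk.
    for command in plan_submission(files, is_cached=lambda path: True):
        print(" ".join(command))
```

In a real script each planned command would be handed to the shell, and the loop would wait for files that are not yet on the cache instead of skipping them.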