Installation Summary for Grid Engine
To install Grid Engine and verify that it is set up correctly, proceed with the following tasks:
General Overview

Create a Grid Engine administrator account and set up a service port

An administrator account must be specified. The administrator can be an existing user, or a new user may be created for this task. This account owns all of the Grid Engine files and is used to configure and maintain the cluster once the software is installed. The administrator account must exist prior to installation. We recommend 'sgeadmin' as the administrator account, belonging to the 'adm' group.

The software uses a TCP port for communication. All hosts in the cluster must use the same port number. The port number can be set in the following places:
Add the following to the services database (the exact port number does not matter, but it must be unused on your system and should be a reserved port):

    sge_commd 536/tcp # communication port for Grid Engine
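Before adding the entry, it may help to confirm that the chosen port is not already claimed. The following is a minimal sketch, assuming a local services file in the usual "name port/protocol" layout. The function name port_in_use and its optional file argument are illustrative and not part of Grid Engine, and the check does not consult any network-wide services map.

```shell
#!/bin/sh
# Hypothetical helper: report whether PORT/tcp already appears in a
# services-format file (default: /etc/services).
# Usage: port_in_use PORT [FILE]
# Exit status 0 means the port is already listed, 1 means it is not.
port_in_use() {
    port=$1
    file=${2:-/etc/services}
    # Strip comments, then compare the second field against "PORT/tcp".
    sed 's/#.*//' "$file" |
        awk -v want="$port/tcp" '$2 == want { found = 1 } END { exit !found }'
}

# Example: only suggest the entry when port 536 is free.
if port_in_use 536; then
    echo "port 536/tcp is already assigned; pick another port"
else
    echo "sge_commd 536/tcp # communication port for Grid Engine"
fi
```

The comparison is made on the whole second field, so a lookup for port 53 does not match an entry for 536.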
Create a directory and unpack the distribution

As the Grid Engine administrator, do the following:

If you received the distribution in "pkgadd" format:

    % mkdir <your_gridengine_root_directory>

Install the packages with "pkgadd" on your file server (all files will have the correct permissions and ownership).

Or, if you received the distribution in "tar.gz" format, create the root directory (e.g. /share/gridengine) and unpack the archives:

    % mkdir <sge_root>
    % cd <sge_root>
    % gzip -dc sge_<version>_common.tar.gz | tar xvf -
    % gzip -dc sge_<version>_<arch>.tar.gz | tar xvf -

(repeat the last command for all architectures you need)

Then set the file permissions:

    <sge_root>/util/setfileperm.sh

(all Grid Engine directories and files should be owned by the administrator; some files need to be installed suid root)

This script must be run on a machine where user root has appropriate permissions to chown/chmod the files. It does not necessarily need to be run on the qmaster machine.

Additional information before installing
The Grid Engine installation program needs to be run as root in order to start the daemons. Root does NOT need write permission on the file server. Once Grid Engine is installed, the administrator can handle all day-to-day operations. The machines DO NOT need to be rebooted as part of the Grid Engine installation.

Wrap any stty commands in a terminal check so that they run only when a terminal is present:

    #!/bin/csh
    # check terminal status; tty -s exits 0 if a terminal is present
    tty -s
    if ($status == 0) then
        <place all stty commands in here>
    endif

Install Grid Engine

The installation is a two-step process. First, the Grid Engine files are installed and configured on the master. Then, a small installation is done on each execution host to configure and start the daemons, and to add automatic daemon startup to the init area. This requires logging on to each execution host as root and manually running the install program. Alternatively, if there is a secure machine with root rsh access to all machines, the execution host install can be done from a single machine.
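For accounts that use a Bourne-type shell rather than csh, the terminal check shown under "Additional information before installing" can be written along the following lines. This is a minimal sketch. The helper name run_if_tty and the stty erase setting are illustrative assumptions and not part of the Grid Engine distribution.

```shell
#!/bin/sh
# Hypothetical helper: run a command only when standard input is a terminal.
# "tty -s" prints nothing and exits 0 only if stdin is attached to a terminal.
run_if_tty() {
    if tty -s; then
        "$@"
    fi
}

# Example: terminal settings are applied only in interactive sessions,
# so sessions started without a terminal skip them silently.
run_if_tty stty erase '^H'
```

When no terminal is present the helper does nothing and returns success, so it does not abort a startup file.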
As root, on the master host, run:

    % ./install_qmaster

(This is a shortcut for ./inst-sge -fast -m.) This will install the Grid Engine master.

As root, on each execution host, run:

    % ./install_execd

(This is a shortcut for ./inst-sge -fast -x.) This will install the Grid Engine execution daemon.

Verify installation

After the installation is completed, it can be verified using the sample scripts in $SGE_ROOT/examples/jobs. First, source the proper settings file to set up the Grid Engine environment:
If you use csh or tcsh:

    % source $SGE_ROOT/default/common/settings.csh

If you use sh or ksh:

    $ . $SGE_ROOT/default/common/settings.sh

Then submit one of the sample scripts:

    % qsub $SGE_ROOT/examples/jobs/sleeper.sh

qsub should respond with a message confirming that the job has been submitted.

Verify that all of the queues have been installed properly by running the following:

    % qstat -f

(full listing of the queues)

Using Grid Engine

The main submit commands are qsub, qrsh and qtcsh. See the man pages for submit(1) and qtcsh(1) for more details.
In general, qsub is used for traditional batch submission, that is, where I/O is directed to a file. Note that qsub only accepts shell scripts, not executable files. There is an application script, qs, which allows qsub to accept executable files directly.

qrsh acts similarly to the rsh command, except that a host name is not given. Instead, a shell script or an executable file is run, potentially on any node in the cluster. I/O is directed back to the submitter's terminal window. By default, if the job cannot be run immediately, qrsh will not queue the job. Using the '-now no' flag with qrsh allows jobs to queue. Note that I/O can be redirected with the shell redirect operators.

For example, to run the uname -a command:

    % qrsh uname -a

The uname output of whichever machine the scheduler selects in the cluster will then be displayed on the submitting terminal. To redirect the output:

    % qrsh uname -a > /tmp/myfile

The output from uname will be written to /tmp/myfile on the submitting host. To allow the command to queue:

    % qrsh -now no uname -a

If a suitable host is not immediately available, the command will block until one is. At that time, the command output will be displayed on the submitting terminal. See the qrsh(1) man page for more details.

Grid Engine contains a modified tcsh, qtcsh, which will automatically submit jobs listed in a task file to the cluster. See the qtcsh(1) man page for more details.
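The qrsh usage described above (queueing with '-now no' and redirecting output on the submitting host) can be combined in a small wrapper. The following is a minimal sketch, assuming qrsh is on the PATH after the settings file has been sourced. The function name cluster_run and the output path are illustrative.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command on a host chosen by the scheduler,
# waiting in the queue if no host is free ("-now no"), and save the
# command's standard output to a file on the submitting host.
# Usage: cluster_run OUTFILE COMMAND [ARGS...]
cluster_run() {
    outfile=$1
    shift
    qrsh -now no "$@" > "$outfile"
}

# Example (requires a working Grid Engine setup):
#   cluster_run /tmp/myfile uname -a
```

Because the redirection is done by the local shell, the output file is created on the submitting host, not on the host that runs the command.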