- What is Grid Computing?
A simple way to think about Grid Computing is that it is a way of taking advantage of the unused CPU cycles of a pool of interconnected computers. Grids are often composed of many machines of different architectures, running different operating systems. If the owner of a Windows or Linux workstation agrees to participate in the Grid system, the workstation’s CPU cycles can be used by programs needing them whenever the workstation is idle. You may be familiar with internet computing projects such as Berkeley’s SETI@home or Stanford’s Folding@home, which are smart screen-savers that employ idle CPU cycles from participants’ machines by distributing calculations on vast amounts of data over a huge number of computers. Grid computing is a somewhat similar idea, but usually uses only machines within a particular institution. For more details on the U of A grid, see http://uagrid.arizona.edu/. Another useful site is http://www.ggf.org/UnderstandingGrids/ggf_grid_understand.php.
- What are the advantages of Grid Computing?
The sharing of CPU cycles from a large number of workstations provides a tremendous resource to grid users, because at any given time it is highly likely that the grid will have some idle machines that can do computing on the user’s behalf. Another advantage of using Grid Computing is that its computing power continually increases as older workstations are replaced by faster computers.
- If my workstation is part of the grid, how will my own jobs be impacted by Grid Computing?
There is no negative impact on local jobs on the workstation; these are given priority. If a local job is started on the workstation, a grid job will be stopped and transferred to an idle workstation where it is resumed. It is also possible for workstation owners to specify a range of hours in which the workstation is available for grid jobs.
- Can I use the grid if my workstation is not part of it?
Yes.
- What is the Condor system?
Condor is a software system developed by Miron Livny’s group at the University of Wisconsin and made available to the public in 2003. Condor is a specialized batch processing system that manages the jobs for a Grid Computing system. Jobs are prepared and submitted to Condor, and it takes care of finding the correct machine type and running the job.
- How does Condor enable programs to run on a grid?
The source code of the program is linked with the Condor library, which provides the functionality needed for running as a grid application. Condor also provides a Java Universe for Java applications and a front end for submitting shell command lines.
- How does Condor run jobs?
Condor places your job into a queue until the required resources are available, then starts your job on an available grid machine. It also takes care of switching the job to another machine if necessary, and of sharing available resources with other jobs in the queue. When the job completes, Condor will send you an email notification.
- What are the requirements for grid applications?
For C, C++ or Fortran programs, the source code for the application must be available to link with the Condor library. On Linux systems the code must be statically linked. Multi-process jobs (e.g. using fork() or exec()) are not allowed, and neither is interprocess communication. All files used by the application must be opened either read-only or write-only. The complete list of restrictions can be found in the Condor User’s Manual. Perl or shell scripts can be executed using the condor_run command, but must read from STDIN and write to STDOUT (which can be redirected to a file). Java programs can be submitted using the Condor Java universe.
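As a rough illustration of the script requirement above, the sketch below creates a small filter script that reads only from STDIN and writes only to STDOUT, then checks it locally. The script name upcase.sh and the data file names are made up for this example, and the commented condor_run line assumes the usual form of that command (a single quoted shell command). Check the Condor User’s Manual for your installed version before relying on it.

```shell
# Create a hypothetical filter script: it reads STDIN and writes STDOUT, nothing else.
cat > upcase.sh <<'EOF'
#!/bin/sh
# Convert lower-case letters to upper case.
tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
EOF
chmod +x upcase.sh

# Local check before sending the script to the grid; prints ACGT.
printf 'acgt\n' | ./upcase.sh

# On a machine with Condor installed (assumed invocation, hypothetical file names):
#   condor_run "./upcase.sh < input.txt > output.txt"
```

Testing the script locally with a small input first makes it easier to tell a script problem from a grid problem later.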
- What applications are available to be run on the Condor grid?
At the present time we have the following applications Condor-enabled on the amadeus server: PAUP, genetree, puzzle, proml, baseml, codeml, hon3, hon-new, im, protdist, protpars, and mrbayes. To see the exact list of programs, type the command 'ls /usr/local/bin/linux*' on amadeus. Other applications can be linked with the Condor library if the above requirements are met.
Please contact us to find out if a particular application can run on the grid.
- How can I submit a job to run on the grid?
At a minimum you will need to prepare a “.submit” file telling Condor what resources your job needs, where to find the executable file, what arguments to pass to the program, where to put the program’s output, and where to email the job notifications. Resource specifications include the computer architecture and operating system required by the program. Some examples are INTEL/LINUX, SUN4u/SOLARIS28, and INTEL/WINNT51. If your application prompts the user to choose files or other options, you will also need a “.in” file that contains the entries you would have typed if you were running the program interactively. The “.in” file is specified inside the “.submit” file. You may also specify a log file and an error file. There are several sample “.submit” and “.in” files in /etc/condor on amadeus. You may copy these to your own directory and edit them for your purposes. Once you have prepared your “.submit” file (and “.in” file if needed), simply use the command ‘condor_submit myjob.submit’. Condor will return a number that identifies your job.
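The pieces described above might fit together roughly as in the following sketch, which writes a “.submit” file and a matching “.in” file and then submits the job if Condor is installed. Everything specific here is an assumption for illustration: the program name myprog, its arguments, the file names, the answers in the “.in” file, and the e-mail address are all hypothetical, and the standard universe is assumed because the program is taken to be relinked with the Condor library. The sample files in /etc/condor on amadeus remain the better starting point.

```shell
# Write a sample submit description file (all names below are hypothetical).
cat > myjob.submit <<'EOF'
# Program relinked with the Condor library, so the standard universe is assumed.
universe     = standard
executable   = myprog
arguments    = -n 100
# Answers that would otherwise be typed interactively.
input        = myjob.in
output       = myjob.out
error        = myjob.err
log          = myjob.log
# Architecture and operating system the program was built for.
requirements = (Arch == "INTEL") && (OpSys == "LINUX")
# Where job notifications are e-mailed.
notify_user  = you@example.edu
queue
EOF

# Write the ".in" file: one line per answer to the program's prompts.
cat > myjob.in <<'EOF'
mydata.nex
y
EOF

# Submit only if Condor is available on this machine.
if command -v condor_submit >/dev/null 2>&1; then
    condor_submit myjob.submit
else
    echo "condor_submit not found; run this on a machine with Condor installed"
fi
```

Keeping the output, error, and log files separate makes it easier to see whether a failed job was a problem in the program or in the submission itself.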
- How can I check the status of my job after submission?
Use the command ‘condor_q’.
- How can I stop my Condor job?
The Condor command ‘condor_rm job_number’ will stop your job and remove it from the queue.
- How can I find out what version of Condor is running?
Use the ‘condor_version’ command.
- Can I see what resources are available to Condor?
Yes. Use the ‘condor_status’ command.
- How can I add my computer(s) to the grid?
Begin by reading the Administrator’s section of the Condor manual at http://www.cs.wisc.edu/condor/manual/v6.4/ref.html. Also see http://ccit.web.arizona.edu/index.php?id=uagrid. If you need more assistance, contact the HPC Help Desk.
- Where can I find out more about Condor?
There is an excellent manual at http://www.cs.wisc.edu/condor/
- What if I don’t understand something in this FAQ?
Please contact us to report any answers that are not clear or any errors that may be present in this document.