Queue load-balancing/distributed batch processing and local rsh replacement system

News

May 18, 2000 Queue 1.12.9 stable released. Fixes problems with obtaining load averages on some Solaris systems.

May 12, 2000 News flash! GNU Queue is now a participating project on Sourceforge. Developers and interested onlookers should visit the project's Sourceforge page for more info.

Mar 7, 1999 Versions 1.12.7 (stable) and 1.20.1-pre2 (developmental) came out today.

The 1.20 development release supports experimental kernel-level and user-level checkpoint migration, which allows Queue to dynamically move running processes from host to host. The experimental code, using the Linux kernel checkpointing API, can checkpoint migrate the interactive 'vi' editor, along with many simpler scientific applications, including multi-process ones.

These versions may be obtained from the author's home page.

Feb 14. Version 1.12p2 (RPM 1.12-3) came out today. It fixes a number of bugs, and all users should upgrade to it.

Feb 6. Version 1.12p1 fixes a bit of bad code that accidentally made it out into a briefly released version that was downloaded by a few sites. The new code is up on this site under the "1.12" filenames and should be corrected on other sites soon. The corrected code returns "1.12p1" as its "--version" string and lists the bug in its ChangeLog file.

Version 1.12 came out on Feb 5. It fixes the h_name/h_aliases[0] segfault bug that was being seen on some platforms. It also fixes some minor compilation issues and significantly reduces the number of distracting compiler warnings on some platforms. Between 1.11 and 1.12, there is also one feature improvement: error mail is now shut down after a fixed number of messages.

Version 1.11 came out on January 22. It fixes some platform issues (Digital UNIX, SunOS, and probably IRIX as well). There are also a few minor changes that affect GNU/Linux (a slightly improved load-balancing formula).

Submitting Bug Reports

If it's not security related, submit it to GNU Queue's Bug Tracker Database on Sourceforge.

Otherwise, send a detailed email describing the problem as best as possible to <bug-queue@gnu.org>.

If you are subscribed to the queue-developers development list, you are encouraged to mail the bug report directly to the development group as well for discussion. (Security bug reports, if there are any, should be sent to the bug-queue list, not to the public development list!)

Generating, Submitting, and Publishing Patches

If you needed to make modifications to get Queue to compile on your platform, or if you've fixed a bug or made improvements to Queue, you are strongly encouraged to submit a patch. This includes changes to the documentation or manual. Submitting a patch is relatively easy and will help make your improvements available to everyone. Patch submission is the process through which the net continues to gradually improve Queue.

  1. If the change involves only very simple changes (no more than two lines) to a single source file in the most current release, or involves changes to a documentation file, you may just run the diff command on the file:
    diff -u oldfile newfile > patch
    You may then skip to step 6, emailing and publishing the patch, below. Be sure the note accompanying the patch clearly states the file that was patched, the reason for the patch, and the release version of the distribution of the original file (usually the same as the name of the directory).
  2. Otherwise, to make a patch, you'll want a pristine copy of the source directory upon which your modified version of Queue is based. You can obtain this from the tar file you originally unpacked (if you still have it), the closest snapshot in the CVS repository on Sourceforge, or, if worst comes to worst, the latest source release off the homepage.
  3. Place the two source directories together in the same top level directory with informative names. The original source directory should keep the same name that it unpacks with. Name your modified source directory something descriptive that describes what your patch does, e.g., 'queue-kerberos-hook-mod'.
  4. Get rid of any extraneous files in your custom source directory. (You might want to use "cp -rp" to create a backup of your source directory first if you haven't done so already.) If you've installed Queue in the source directory tree (you have "bin", "sbin", "com" files lying around in the source tree), run "make uninstall" to get rid of these directories. Do a "\rm -rf bin sbin com var share" without privileges to make sure. Do a "make maintainer-clean" and a "\rm *~ */*~" to get rid of any extraneous files that were created during the configuration process and also any editor backup files.

    Now compare your custom source directory with the pristine source distribution; they should contain approximately the same files. Any new files should have been deliberately added by you; similarly, any missing files should have been deliberately deleted by you. Any missing or added files will be added verbatim to the patch file, so you want to keep these to a minimum, as they will greatly increase the size of the patch. You want to use a source release as similar to your modified code as possible; this will keep the patch file as simple as possible.

  5. Run diff to generate the patch:

    diff -uNr queue-1.099-p2 queue-kerberos-hook-mod > patch

    This would generate a file "patch" that would convert the queue-1.099-p2 distribution directory into your custom queue-kerberos-hook-mod distribution directory.

  6. If the patch is not security related, submit the patch to Queue's patch system on Sourceforge.

    Otherwise, email the patch to <bug-queue@gnu.org> using your favorite mailer. For example, if you are using mailx,
    mailx -s "Kerberos_patch" bug-queue@gnu.org < patch
    would mail the file "patch" to the bug list with the subject "Kerberos_patch".

    You should also send a detailed email describing the patch to the queue-developers mailing list. This way, users will know what the patch is for. Included in the email should be the name of the pristine source release (usually the same as the directory it unpacks as) that the patch was generated against. If the patch is very small (a few lines) you may want to post it to the list in its entirety as well.
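
The comparison in step 4 and the diff in step 5 can be wrapped in two small shell functions. This is a minimal sketch, not part of the Queue distribution: the names list_tree_differences and make_patch are hypothetical, and it assumes a diff that accepts the -q, -u, -N, and -r options (GNU diff does). It also handles one detail that is easy to trip over in scripts: diff exits with status 1 when it finds differences, which is the normal case when making a patch.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of Queue.

# list_tree_differences PRISTINE MODIFIED
# Prints files that exist in only one tree or whose contents differ,
# so stray build products can be spotted before the patch is generated.
list_tree_differences() {
    # diff exits 1 when the trees differ; treat that as success.
    diff -rq "$1" "$2" || [ $? -eq 1 ]
}

# make_patch PRISTINE MODIFIED OUTFILE
# Writes a unified, recursive diff; -N includes added and removed files.
make_patch() {
    # Only an exit status above 1 (a real diff error) is reported as failure.
    diff -uNr "$1" "$2" > "$3" || [ $? -eq 1 ]
}
```

Run from the directory that holds both trees, "make_patch queue-1.099-p2 queue-kerberos-hook-mod patch" writes the same file as the command in step 5, and "list_tree_differences queue-1.099-p2 queue-kerberos-hook-mod" shows which files would end up in it.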

Development projects


For up-to-the-minute information, help, or support with our development projects, please be sure to subscribe to our development list, queue-developers.

queue-developers archives are currently available online.

Currently supported platforms are GNU/Linux (most flavors), Solaris, SunOS, and HP-UX; successful compilation on and/or porting to additional platforms should be reported to the queue-developers mailing list.

Volunteers are still needed for the following projects. (You can sign up but you don't need to; if you implement the improvement, just follow the instructions above and send us a patch! Subscribing to the queue-developers list is a good idea as well.)

  1. Update: checkpointing support in 1.20; this item will be moved to the "completed" section soon.

    My latest pet project (after adding heterogeneous cluster support via the ASCII mechanism---volunteers?) is to allow Queue to do checkpoint migration for both interactive and batch jobs. What this means is this: you run your big interactive CPU-guzzling job (say, SAS or matlab) via Queue. It migrates out to a slow machine. After a period of time, the large SMP machine becomes free, and your interactive job stops on the slow machine, and then, a few moments later, picks up from where it left off on the fast machine. (The job might also want to leave because a local user wants to use the CPU or for other reasons. In this way, Queue could also scavenge spare CPU cycles by migrating jobs to unused machines.)

    To get this to work, (almost) all syscalls need to be trapped, sent over the network, and evaluated on a common machine. (Files are accessed this way, so /tmp file migration is not an issue, etc. Later this could be integrated into QFS below for jobs that need a fast local disk; temp file migration would then become an issue.)

    Two strategies are being considered: a strategy using ptrace to trap system calls, and a strategy to trap system calls using the dynamic linker. Since many commercial apps are statically linked, and so bypass the dynamic linker, I'm leaning towards the first strategy for now.

    AFS does some of this and is very nice, but is commercial and rather pricey. Its cost has probably impacted the acceptance of this otherwise exciting technology. A simple global filesystem is a logical extension for Queue.

    An alpha version would probably be a user-level variant of amd and a user-level report server that exports only a single UID and maps that UID from one system to another; the login would be done using Queue.

    Three free (open source) projects look like promising starts. Coda is a free networked file system based on AFS2, and GFS is aimed at FibreChannel networking and shared disk solutions. All projects are kernel-level. Coda currently has kernel modules only for GNU/Linux, FreeBSD, and NetBSD, and, unlike commercial AFS, apparently does not provide an NFS translator, so you can't just mount Coda via NFS on unsupported platforms the way you can with commercial AFS. This means our Solaris users would be out of luck. GFS only supports GNU/Linux and SGI IRIX platforms.

    Most promising of all (so far) is Arla, a free AFS implementation with an alpha server. It supports GNU/Linux, SunOS, Solaris, OpenBSD, FreeBSD, and NetBSD, with alpha support on IRIX, Digital Unix, AIX, and Rhapsody.

    AFS (and its free variants) have difficulties. ACLs are needed in global networked filesystems, but AFS's are not completely backwards compatible with Unix, so some software has trouble running. AFS stores files in a non-standard way on its servers, so you need to run a special fsck on the servers to prevent it from nuking your server's filesystem. Finally, AFS is very complicated (I was hoping for a very simple system based partially on NFS with user-level support daemons.) However, these projects look like a very promising start.

    One thing that we might want to do is write an NFS translator client for either Coda or GFS, so that unsupported systems with NFS access can import these filesystems. The translator might include a user-level (and thus far more portable) daemon that handles things like access-control and authentication negotiation with Coda or GFS.

    Comments on this project are welcome.

  2. PVM support.
  3. Transmission of process information via TCP/IP sockets rather than NFS; elimination of requirement to export shared NFS directory root-writable in multiuser-installation mode.
  4. High efficiency process spawning; queued pre-forks and uses vfork to spawn processes very rapidly.
  5. Hooks for secure-socket support (best done using an internationally available secure socket library due to U.S. ITAR restrictions.)
  6. Augmentation of binary information exchange mechanism with telnet-style ASCII WILL/WONT mechanism to allow use in non-homogeneous clusters.
  7. Hooks for AFS and/or Kerberos support, which, together with WILL/WONT mechanism would allow jobs to be started anywhere on the internet via queue.
  8. Standard password authentication mechanism to optionally allow jobs to be started from outside the cluster, so that queue would be a true rsh-replacement even when the local file system is not available.
  9. Output multiplexing, sort of like implementing tee(1) into GNU Queue (except much simpler to implement) whereby an option to queue causes all output to be logged to a file (after giving up privileges) as well as being sent to standard out. Should be easy.
  10. Ports (or information on successful compilation) on additional platforms, such as DEC OSF, AIX, adding working pty support to the IRIX port (straightforward; see code in "man pty" on the IRIX), etc.
  11. Use of PostgreSQL or GNU SQL Database (or other database; if ANSI C/ESQL is used it could be written for any ANSI SQL database at the site) as a high-reliability central repository of job information for user-scripts and global caps in large clusters. Queued would periodically send its internal tables to the central server (when it is up) and, when the server is up, check to make sure no global constraints (e.g., on the number of global jobs) have been violated. A daemon, probably on the SQL server, would delete outdated table entries (presumably from clients that have been taken off the network.) Hooks would exist to allow management scripts to query the database for the number of jobs present. Folks on the development list can help with the SQL side of things. Suggested by Andrew Morgan.
  12. Optional automated resubmission of a job that has died due to a downed client. (Queue sort of supports this already; the job will restart when queued restarts on the downed machine. However, other machines won't pick it up because they will set the job status to ANOTHER_HOST once it starts running on that host. We would like a feature to control this behavior on a per-job-queue basis. Some queues might restart old jobs, others might delete them. On restartable queues, another host might pick it up if it senses the first has lost a lock.) Suggested by a number of people.
  13. No more than 'NN' jobs running throughout the cluster. (Currently only supports per-machine limits.) Suggested by Andrew Morgan.
  14. Parallel make (already possible if MPI is installed, but we could have our own that just uses Queue). Suggested by Andrew Morgan.
  15. Ability to advertise resources, e.g., a RAID is attached to a specific host, and then a client job requests hosts with a local RAID disk larger than a certain size. We sort of can do this already using specific job queues, e.g., a job queue could be set to only accept jobs on a certain machine. Suggested by Andrew Morgan.
  16. Maybe more information on job completion status in batch mode, i.e., maximum size of memory. How long it took and where it ran are already emailed back, but it could also explicitly list the hostname.
  17. Cap on number of email error messages per unit time. For example, if you make the mistake of deleting the spool directory (destroying Queue) while queued is running, it will send lots of email. This is a bug. The cap needs somehow to take into consideration that a lot of jobs could be spawned via batch mode, so job result mails should run on a separate counter from regular email. Also, there might be an environment variable like QMAIL that turns off mail completely (rather than just nixing the job's output.)
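
Item 9 above compares the proposed output multiplexing to tee(1). As a rough illustration of the intended behavior only (not of how Queue would implement it internally), the same effect can be had today by wrapping a command in the shell. The function name run_logged is hypothetical and is not an existing Queue option.

```shell
#!/bin/sh
# Sketch only: run_logged is a hypothetical helper, not a Queue feature.

# run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND, writing its standard output and standard error to LOGFILE
# while also passing them through to standard output.
run_logged() {
    logfile=$1
    shift
    # stderr is merged into stdout so both end up in the log.
    # Note: the pipeline's exit status is tee's, not COMMAND's.
    "$@" 2>&1 | tee "$logfile"
}
```

For example, "run_logged job.log make" shows the build output on the terminal and leaves a copy in job.log. A built-in option would differ mainly in that Queue itself would open the log file after giving up privileges.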

The current development teams for GNU Queue are below. Give them a hand!

  1. Improved MPI support; Jason Abate <abate@ticam.utexas.edu> has volunteered to look into this.
  2. Project to have GNU Queue send the configuration file over the network using TCP/IP sockets rather than NFS. Dave van Leeuwen <dave@elec.canterbury.ac.nz> has offered to look into this.

Completed projects that were implemented in 1.099:


Return to GNU's home page.

Please send FSF & GNU inquiries & questions to gnu@gnu.org. There are also other ways to contact the FSF.

Please send comments on these web pages and the Queue system to wkrebs@gnu.org; send other questions to gnu@gnu.org.

Copyright (C) 1998, 1999 W. G. Krebs, Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111, USA

Verbatim copying and distribution is permitted in any medium, provided this notice is preserved.

Updated: Jan 4 1999 wgk