Large Scale Cluster Computing Workshop
Fermilab, IL, May 22nd to 25th, 2001
Proceedings

1.0 Introduction

Recent revolutions in computer hardware and software technologies have paved the way for the large-scale deployment of clusters of off-the-shelf commodity computers to address problems that were previously the domain of tightly-coupled SMP computers. As the data and processing needs of physics research grow while budgets remain stable or decrease and staffing levels increase only incrementally, there is a fiscal and computational need that can probably only be met by large-scale clusters of commodity hardware running Open Source or lab-developed software. Near-term projects within high-energy physics and other computing communities will deploy clusters of some thousands of processors serving hundreds or even thousands of independent users. This will expand the reach in both dimensions by an order of magnitude from the current, successful production facilities. A Large-Scale Cluster Computing Workshop was held at the Fermi National Accelerator Laboratory (Fermilab, or FNAL), Batavia, Illinois in May 2001 to examine these issues. The goals of this workshop were:
To determine from practical experience what tools exist that can scale up to the cluster sizes foreseen for the next generation of HENP experiments (several thousand nodes) and, by implication, to identify areas where some investment of money or effort is likely to be needed;
To compare and record experiences gained with such tools;
To produce a practical guide to all stages of designing, planning, installing, building and operating a large computing cluster in HENP;
To identify and connect groups with similar interests within HENP and the larger clustering community.
Computing experts with responsibility for and/or experience of such large clusters were invited, a criterion for invitation being experience with clusters of at least 100-200 nodes. The clusters of interest were those equipping centres of the sizes of Tier 0 (thousands of nodes) for CERN's LHC project or Tier 1 (at least 200-1000 nodes), as described in the MONARC (Models of Networked Analysis at Regional Centres for LHC) project at http://monarc.web.cern.ch/MONARC/. The attendees came not only from various particle physics sites worldwide but also from other branches of science, including biophysics and various Grid projects, as well as from industry. The attendees shared their experiences and ideas freely, and these proceedings are currently being edited from material collected by the convenors and offered by the attendees. In addition, the convenors, again with the help of material offered by the attendees, are producing a Guide to Building and Operating a Large Cluster. This is intended to describe all phases in the life of a cluster and the tools used or planned to be used. The guide will then be publicised (made available on the web and presented at appropriate meetings and conferences) and kept regularly up to date as more experience is gained. It is planned to hold a similar workshop in 18-24 months to update the guide.
All the material for the workshop is available at the following web site: http://conferences.fnal.gov/lccws/. In particular, we shall publish at this site various summaries, including a full conference summary with links to relevant web sites, a summary paper to be presented to the CHEP conference in September, and the eventual Guide to Building and Operating a Large Cluster referred to above.

2.0 Opening Session (Chair, Dane Skow, FNAL) (slides: http://conferences.fnal.gov/lccws/papers/tues/Workshop Goals.ppt)

The meeting was opened by the co-convenors Alan Silverman from CERN (http://www.cern.ch/) in Geneva and Dane Skow from Fermilab (http://www.fnal.gov/). They explained briefly the original idea behind the meeting (from the HEPiX Large Cluster Special Interest Group, http://wwwinfo.cern.ch/hepix/cluster/) and the goals of the meeting, as described above.

2.1 Welcome and Fermilab Computing (http://conferences.fnal.gov/lccws/papers/tues/Kasemann_Welcome.pdf)

The meeting proper began with an overview of the challenge facing high-energy physics. Matthias Kasemann, head of the Computing Division at Fermilab, described the laboratory's current and near-term scientific programme covering a myriad of experiments, not only at the Fermilab site but worldwide, including participation in CERN's future LHC programme, notably in the CMS experiment (http://cmsinfo.cern.ch/Welcome.html/), in NuMI/MINOS (http://www-numi.fnal.gov:8875/), MiniBooNE (http://www-boone.fnal.gov/) and the Pierre Auger Cosmic Ray Observatory (http://www.auger.org/) in Argentina. He described Fermilab's current and future computing needs for its Tevatron Collider Run II experiments (http://runiicomputing.fnal.gov/), pointing out where clusters, or computing farms as they are sometimes known, are already used. He laid out the challenges of conducting meaningful and productive computing within worldwide collaborations when computing resources are widely spread and software development and physics analysis must be performed across great distances. He noted that the overwhelming importance of data in current and future generations of high-energy physics experiments had prompted the interest in Data Grids. He posed some questions for the workshop to consider over the coming 3 days:
Should or could a cluster emulate a mainframe?
How much could particle physics computing models be adjusted to make the most efficient use of clusters?
Where do clusters not make sense?
What is the real total cost of ownership of clusters?
Could we harness the unused CPU power of desktops?
How to use clusters for high I/O applications?
How to design clusters for high availability?

2.2 LHC Scale Computing (http://conferences.fnal.gov/lccws/papers/tues/LHC_Tues_WvR.ppt)

Wolfgang von Rueden, head of the Physics Data Processing group in CERN's Information Technology Division, presented the LHC experiments' computing needs. He described CERN's role in the project, displayed the relative event sizes and data rates expected from Fermilab Run II and LHC experiments (http://conferences.fnal.gov/lccws/papers/tues/Datavolume.ppt), and presented a table of their main characteristics, pointing out in particular the huge increases in data expected at LHC and consequently the huge increases in computing power that must be installed and operated for the LHC experiments.
The other problem posed by modern experiments is their geographical spread, with collaborators throughout the world requiring access to data and to computer power. He noted that typical particle physics computing is more appropriately characterised as High Throughput Computing as opposed to High Performance Computing. The need to exploit national resources and to reduce the dependence on links to CERN has produced the ( HYPERLINK "http://monarc.web.cern.ch/MONARC/" MONARC) multi-layered model. This is based on a large central site to collect and store raw data (Tier 0 at CERN) and multiple tiers (for example National Computing Centres, Tier 1, down to individual users desks, Tier 4) each with data extracts and/or data copies and each one performing different stages of physics analysis. Von Rueden showed where Grid Computing would be applied. He ended by expressing the hope that the workshop could provide answers to a number of topical problem questions such as cluster scaling and making efficient use of resources, and some good ideas to make progress in the domain of the management of large clusters 2.3  HYPERLINK "http://conferences.fnal.gov/lccws/papers/tues/fermi-tfcc-final.ppt" IEEE Task Force on Cluster Computing Bill Gropp of Argonne National Laboratory (ANL) presented the HYPERLINK "http://www.ieeetfcc.org/"IEEE Task Force for Cluster Computing. This group was established in 1999 to create an international focal point in the areas of design, analysis and development of cluster-related activities. It aims to set up technical standards in the development of cluster systems and their applications, sponsor appropriate meetings (see web site for the upcoming events) and publish a bi-annual newsletter on clustering. Given that an IEEE Task Forces life is usually limited to 2-3 years, the group will submit an application shortly to the IEEE to be upgraded to a full Technical Committee. For those interested in its activities, the Task Force has established 3 mailing lists see  HYPERLINK "http://conferences.fnal.gov/lccws/papers/tues/fermi-tfcc-final.ppt" overheads. One of the most visible deliverables thus far by the Task Force is a HYPERLINK "http://www.dcs.port.ac.uk/~mab/tfcc/WhitePaper"White Paper covering many aspects of clustering. 2.4  HYPERLINK "http://conferences.fnal.gov/lccws/papers/tues/fermi-tfcc-final.ppt" Scalable Clusters Bill Gropp from Argonne described some issues facing cluster builders. The  HYPERLINK "http://www.top500.org/" http://www.top500.org/ web site list of the 500 largest computers in the world includes 26 clusters with more than 128 nodes and 8 with more than 500 nodes. Most of these run Linux. Since these are devoted to production applications, where do system administrators test their changes? For low values of N, one can usually assume that if a procedure or tool works for N nodes then it will work for N+1. But this may no longer stay true as N rises to large values. A developer needs access to large-scale clusters for realistic tests, which often conflicts with running production services. How to define scalable? One possible definition is that operations on a cluster must complete fast enough (for example within 0.5 to 3 seconds for an interactive operation) and operations must be reliable. Another issue is the selection of tools how to choose from a vast array, public domain and commercial? One solution is to adopt the UNIX philosophy, build from small blocks. 
This is what the  HYPERLINK "http://www-unix.mcs.anl.gov/sut/" Scalable UNIX Tools project in Argonne is based on basically parallel versions of the most common UNIX tools such as ps, cp, and ls and so on layered on top of  HYPERLINK "http://www-unix.mcs.anl.gov/mpi/fastchoice.html" MPI with the addition of a few new utilities to fill out some gaps. An example is the ptcp command which was used to copy a 10MB file to 240 nodes in 14 seconds. As a caveat, it was noted that this implementation relies on accessing trusted hosts behind a firewall but other implementations could be envisaged based on different trust mechanisms. There was a lengthy discussion on when a cluster could be considered as a single system (as in the  HYPERLINK "http://www.scyld.com/" Scyld project where parallel ps makes sense) or as separate systems where it may not. 3.0 Usage Panel (Chair, Dane Skow, FNAL) Delegates from a number of sites presented short reviews of their current configurations. Panellists had been invited to present a brief description of their cluster, its size, its architecture, its purpose; any special features and what decisions and considerations had been taken in its design, installation and operation. HYPERLINK "http://conferences.fnal.gov/lccws/papers/weds/lccws_seminar.ps"RHIC Computing Facility (RCF) (Tom Yanuklis, BNL) Most CPU power in BNL serving the  HYPERLINK "http://www.rhic.bnl.gov/" RHIC experiments is Linux-based, including 338 2U high, rack-mounted dual or quad CPU Pentium Pro, II and Pentium III PCs with speeds ranging from 200 to 800 MHz for a total of more than 18K SPECint95 units. Memory size varies but the later models have increasingly more. Currently Redhat 6.1, kernel 2.2.18, is used. Operating System upgrades are usually performed via the network but sometimes initiated by a boot floppy which then points to a target image on a master node. Both OpenAFS and  HYPERLINK "http://nfs.sourceforge.net/" NFS are used. There are two logical farms Central Reconstruction System (CRS) and Central Analysis System (CAS). CRS uses a locally designed software for resource distribution with an HPSS interface to STK tape silos. It is used for batch only, no interactive use; it is consequently rather stable. The CAS cluster permits user login including offsite access via gateways and  HYPERLINK "http://www.ssh.com/" ssh, the actual cluster nodes being closed to inbound traffic. Users can then submit jobs manually or via LSF for which several GUIs are available. There are LSF queues per experiment and priority. LSF licence costs are an issue. Concurrent batch and interactive use creates more system instability than is seen on the CRS nodes. BNL built their own web-based monitoring scheme to provide 24-hour coverage with links to e-mail and to a pager in case of alarms. They monitor major services ( HYPERLINK "http://www.transarc.ibm.com/Product/EFS/AFS/index.html" AFS, LSF, etc) as well as gathering performance statistics. BNL also makes use the  HYPERLINK "http://vacm.sourceforge.net/" VACM tool developed by VA Linux, especially for remote system administration. They are transitioning to open SSH and the few systems upgraded so far have not displayed any problems. An important issue before deploying  HYPERLINK "http://web.mit.edu/kerberos/www/" Kerberos on the farm nodes concerns token passing such that users do not have to authorise themselves twice. BNL uses CTS for its trouble-ticket scheme. 
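BNL's web-based monitoring scheme with e-mail and pager alarms, mentioned above, is typical of the home-grown tools described at the workshop. As an illustration only (this is not BNL's code; the node names, mail addresses and the choice of ping as the health check are invented), the basic poll-and-alert loop of such a scheme might look like this:

#!/usr/bin/env python
# Minimal poll-and-alert sketch (illustrative only, not BNL's tool).
# Assumes hypothetical node names, a local SMTP relay on "localhost",
# and that an unanswered ping means the node needs attention.

import subprocess
import smtplib
from email.message import EmailMessage

NODES = ["farm%03d" % i for i in range(1, 11)]   # hypothetical host names
ALERT_TO = "farm-admins@example.org"             # hypothetical mail alias

def node_alive(host):
    """One ICMP echo with a short timeout; True if the node answers."""
    result = subprocess.run(["ping", "-c", "1", "-W", "2", host],
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0

def send_alert(down_nodes):
    msg = EmailMessage()
    msg["Subject"] = "Farm alert: %d node(s) not responding" % len(down_nodes)
    msg["From"] = "farm-monitor@example.org"
    msg["To"] = ALERT_TO
    msg.set_content("Unreachable nodes:\n" + "\n".join(down_nodes))
    with smtplib.SMTP("localhost") as smtp:      # local relay assumed
        smtp.send_message(msg)

if __name__ == "__main__":
    down = [n for n in NODES if not node_alive(n)]
    if down:
        send_alert(down)

In a real deployment such a check would of course run from cron or a daemon and feed a pager gateway as well as e-mail, as BNL's scheme does.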
3.2 BaBar (Charles Young, SLAC) (http://conferences.fnal.gov/lccws/papers/tues/BaBar Use Case.ppt)

At its inception, BaBar had tried to support a variety of platforms but rapidly concentrated solely on Solaris, although recently Linux has been added. They have found that having two platforms has advantages but that more than two does not. BaBar operates multiple clusters at multiple sites around the world; this talk concentrated solely on those at SLAC, where the experiment acquires its data. The reconstruction farm, a 200 CPU cluster, is quasi-real time with feedback to the data taking and so should operate round the clock while the experiment is operational. There is no general user access; it is a fully controlled environment. The application is customised for the experiment and very tightly coupled to running on a farm. This cluster is today 100% SUN/Solaris but will move to Linux at some future time because Linux-based dual-CPU systems offer a much more cost-effective solution, running on fewer nodes and with lower infrastructure costs, especially network interfacing costs. There is also a farm dedicated to running Monte Carlo simulations; it is also a controlled environment with no general users. This one has about 100 CPUs with a mixture of SUN/Solaris and PC/Linux. Software issues on this farm relate to the use of commercial packages such as LSF (for batch control), AFS for file access and Objectivity for object-oriented database use. The application is described as embarrassingly parallel. Next there is a 200 CPU data analysis cluster. Data is stored in HPSS and accessed via Objectivity. This cluster offers general access to users who have widely varying access patterns with respect to the amount of CPU load and data accessed. This cluster is a mixture of Solaris and Linux and uses LSF. Still at SLAC, there is a smaller, 40 CPU, offline cluster with a mixture of SUN/Solaris and PC/Linux nodes. One of the issues facing the experiment is the licensing cost of the commercial software, which is becoming a significant fraction of the hardware acquisition cost. LSF is also showing strain under scaling, with many jobs to schedule requiring a complex priority algorithm. Finally the speaker noted that BaBar's computing model is already split into tiers, in their case Tiers A, B and C, where Tier A centres are already established at SLAC, IN2P3 in Lyon and RAL in England. He ended by making reference to Grid computing and how it could possibly help BaBar in the future.

FNAL Offline Computing Farms (Steve Wolbers) (http://conferences.fnal.gov/lccws/papers/tues/clusters_wolbers_may2001.ppt)

These are based on 314 dual-CPU Pentium PCs running Linux, with 124 more on order. The latest PCs are rack-mounted (the previous incarnations were in boxes). The farms are logically divided by experiment and are designed to run a small number of large jobs submitted by a small number of expert users. The CDF experiment has two I/O nodes (large SGI systems) as front ends to the farms and a tape system directly attached to one of these. Wolbers anticipates that the data volume per experiment per year will approximately double every 2.4 years, and this is taken into consideration during farm design. In the future, disk farms may replace tapes, and analysis farms may be on the horizon.
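The growth assumption quoted above (data volume per experiment roughly doubling every 2.4 years) translates directly into capacity planning. A minimal sketch of that arithmetic, with a purely illustrative starting volume, is:

# Projection under the stated assumption that yearly data volume doubles
# every 2.4 years; the starting volume (100 TB/year) is purely illustrative.

DOUBLING_PERIOD_YEARS = 2.4
start_volume_tb = 100.0            # hypothetical volume in year 0

for year in range(0, 11, 2):
    volume = start_volume_tb * 2 ** (year / DOUBLING_PERIOD_YEARS)
    print("year %2d: ~%6.0f TB/year" % (year, volume))

Over a decade this gives roughly an 18-fold increase, which is why the growth rate, and not just the current data volume, drives the farm design.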
The primary goal of today's clusters is to return event reconstruction results quickly and provide cost-effective CPU power, not necessarily to achieve 100% CPU usage. The batch scheme in use is the locally developed FBSng (http://www-isd.fnal.gov/fbsng/). Rare among HEP labs, FNAL executes a 30-day burn-in on new PCs. Their expansion plan is to acquire more of the same, since it appears to work and does not create a heavy support load.

Sanger Centre (UK) (James Cuff) (http://conferences.fnal.gov/lccws/papers/tues/Tues_Sanger.ppt)

The Sanger Centre is a research centre funded primarily by The Wellcome Trust and performing high performance computing for the Human Genome Project. It was founded in 1993 and now has approximately 570 staff members. They have a total of some 1600 computing devices, nearly half of them Compaq Alpha systems, but there are also 300 PC desktops as well as a few vector systems. The Alpha clusters are based on Compaq's FibreChannel Tru64 clustering and use a single system image. The data volume for the human genome is 3000 MB. The central backbone is ATM. The largest cluster is the Node Sequence Annotation Farm, consisting of 8 racks of 40 Alpha systems, each with 1GB of memory, plus a number of PCs for a total of 440 nodes, a configuration driven by the needs of the applications. There are also 19.2 Terabytes of spinning disk and the total system corresponds to 1000 KW of power. The batch scheme in place is LSF (http://www.platform.com/products/LSF/). Such tightly coupled clusters offer good territory for NFS, and the applications use sockets for fast inter-node data transfer. Unlike typical HEP applications, Sanger's applications are better suited to large memory systems with high degrees of SMP. However, in the longer term, they are considering moving towards wide-area clusters, looking perhaps towards Grid-type solutions.

H1 (Ralf Gerhards, DESY) (http://conferences.fnal.gov/lccws/papers/tues/H1.pdf)

For historical reasons, the H1 experiment (http://www-h1.desy.de/) at DESY operate a mixed cluster, originally based on SGI workstations but now also containing PCs running Linux. Today the SGI Origin 2000 operates as one of the disc servers, the others being PCs serving IDE discs based on a model developed by CERN (http://wwwinfo.cern.ch/pdp/vm/diskserver.html). The farm, one of several similar farms for the various major experiments at DESY, is arranged as a hierarchical series of sub-farms; within the 10-20 node sub-farms the interconnect is Fast Ethernet and the sub-farms are connected together by Gigabit Ethernet. These sub-farms are assigned to different activities such as high-level triggering, Monte Carlo production and batch processing. The node allocation between these tasks is quite dynamic. The farm nodes, 50 dual-CPU nodes in total, are installed using the same procedures developed for DESY's desktops, thus offering a site-wide SuSE Linux environment (http://www.suse.com/index_us.html) for the H1 experiment users and permitting them to benefit from the general DESY Linux support services. Over and above this is an H1-developed framework based on CORBA for event distribution. The batch system in use is PBS, integrated with AFS, with some difficulty.
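Several of the farms in this session, H1's among them, drive their batch work through PBS. As a hedged illustration of the submission side only (the queue name, resource request and payload below are invented and are not H1's configuration), a job script can be generated and handed to the standard qsub command:

# Minimal PBS submission sketch. Assumes a standard PBS installation with
# qsub on the PATH; queue, resources and payload are invented.

import subprocess
import tempfile

JOB_SCRIPT = """#!/bin/sh
#PBS -N h1-reco-example
#PBS -q simulation
#PBS -l nodes=1:ppn=2
cd $PBS_O_WORKDIR
./run_reconstruction input.raw output.dst
"""

with tempfile.NamedTemporaryFile("w", suffix=".pbs", delete=False) as f:
    f.write(JOB_SCRIPT)
    script_path = f.name

# qsub prints the assigned job identifier on success
job_id = subprocess.run(["qsub", script_path],
                        capture_output=True, text=True,
                        check=True).stdout.strip()
print("submitted", job_id)

The AFS integration difficulties mentioned above arise outside this simple picture, in token handling for the running job rather than in submission itself.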
DESY is also working on a Data Cache project in conjunction with other HEP labs; this project should offer transparent access to data whether on disc or on tape.  HYPERLINK "http://conferences.fnal.gov/lccws/papers/tues/Kek.ppt" Belle (Atsushi Manabe, KEK) Historically KEK experiments have been based on RISC systems (UNIX workstations) and movement to PCs has been slowed by in-place lease agreements. Nevertheless, Belle has 3 PC-based farms: 158 CPUs housed in mainly quad-CPU boxes for event reconstruction and using SUN servers for I/O. The nodes use home-built software to make them look like SMP systems. There are only 5 users accessing this cluster. Installation and cabling was done by in-house staff and physicists. The second cluster, with 88 CPUs, has dual Pentium III PCs; it was fully installed (hardware and software) by Compaq The third 60-node cluster again has quad-CPU systems. It was acquired under a government-funded 5 year lease agreement within which architecture upgrades will be possible All the clusters use  HYPERLINK "http://nfs.sourceforge.net/" NFS for file access to the data. There are 14TB of disk (10TB of RAID and 4TB local) on 100BaseT network. The data servers consist of 40 SUN servers each with a DTF2 tape drive. The tape library has a 500TB capacity. The farm is used for PC farm I/O R&D, HPSS (Linux HPSS client API driver by IBM), SAN (with Fujitsu), and GRID activity for ATLAS in Japan. There is no batch system in place but the batch jobs are long running and so a manual scheme is thought sufficient. HYPERLINK "http://conferences.fnal.gov/lccws/papers/tues/22SimoneQCD.pdf"Lattice QCD Clusters (Jim Simone, Fermilab) This theoretical physics application needs sophisticated algorithms and advanced computing power, typically massively parallel computing capabilities where each processor can communicate easily with its neighbours and near-neighbours. In previous implementations, such systems were based on commercial or specially built supercomputers. The current project is to build a national infrastructure across the USA with three 10 Tflop/s facilities by 2005, two (Fermilab and Jefferson Lab) with commodity PCs and the third (BNL/Columbia) using custom-built chips. Moving from super computers to clusters has allowed the project to benefit from fast-changing technology, open source Linux and lots of easy-to-use tools. The FNAL-based 80-node cluster was installed in Fall 2000 with the intention to ramp up by 300 more nodes per year. This is a true Beowulf cluster system whereas many of the farms described above are not Beowulf. This is because Lattice QCD requires tight communication between processes. Myrinet 2000 is used for networking as it is much more efficient than Ethernet for TCP traffic in terms of CPU overhead at high throughput. The PCs BIOS is PXE-enabled for remote boot over Ethernet and it has been modified to permit access to the serial port for remote health checking and certain management tasks. The software installed includes PBS with the  HYPERLINK "http://mauischeduler.sourceforge.net/" Maui scheduler for batch queuing and  HYPERLINK "http://www-unix.mcs.anl.gov/mpi/fastchoice.html" MPI for inter-node communication. The choice of which MPI architecture was made after some tests and mpich/vmi was chosen with the virtual driver layer provided by NCSA but there is a lingering doubt on the final choice with respect to its ability to scale. 
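The nearest-neighbour communication that lattice QCD requires is the classic halo-exchange pattern. The sketch below is not the project's code (which used mpich with NCSA's VMI layer, in a compiled language); it is a minimal one-dimensional illustration written against the mpi4py Python binding to MPI, purely to show the pattern:

# One-dimensional halo exchange: every rank sends its boundary value to the
# next rank and receives from the previous one (periodic ring). Illustrative
# only; run under an MPI launcher, e.g. "mpirun -np 4 python halo.py".

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

right = (rank + 1) % size        # neighbour to send to
left = (rank - 1) % size         # neighbour to receive from

local_boundary = float(rank)     # stand-in for real lattice boundary data

# Send to the right neighbour while receiving from the left one.
received = comm.sendrecv(local_boundary, dest=right, source=left)

print("rank %d received boundary value %.1f from rank %d"
      % (rank, received, left))

The point of a low-overhead interconnect such as Myrinet is precisely that exchanges like this happen every iteration, so per-message CPU overhead dominates the scaling behaviour.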
Discussion Given that current-day PC configurations usually include very large disc systems (typically 20-40GB), how to make use of this space? One suggestion was  HYPERLINK "http://homepage.tinet.ie/~djkoshea/cachefs.html" CacheFS which is a scheme using a local cache in synch with the original file system. There was a mixed reaction to this: SLAC had been using it indirectly but has stopped using it but the Biomed team from the University of Minnesota reported good experiences. Another alternative is to use the local disc for read-only storage of input files for the application, copied automatically on demand or manually from a central data store, essentially creating a local data replica. The next topic was how to use unused CPU cycles, especially on desktops (this would come up later in the meeting also). Although there are clearly solutions (see later,  HYPERLINK "http://www.cs.wisc.edu/condor" Condor, HYPERLINK "http://www.seti-inst.edu/science/setiathome.html"Seti@Home, etc) there is the question of resource competition and management overhead. Also, the fact that users can alter the local setup without warning can make this a too chaotic environment. However, there are many successful examples where Condor in particular is used including some US government agencies and, within the HEP world, INFN in Italy and a particular CERN experiment ( HYPERLINK "http://choruswww.cern.ch/welcome.html" CHORUS). Sanger also makes a little use of Condor. On the other hand, DESY had decided that it is cheaper to invest in a few more farm nodes than to dedicate one person to administer such a cycle-stealing scheme. Talking of management overhead, it was noted that one reason to upgrade old CPUs for more recent faster chips is precisely to decrease such overhead. Management overhead typically increases with the number of boxes and a single 800MHz chip PC can approximately replace 3-4 200MHz systems in power. A poll was taken on which sites used asset management packages. BNL has developed a database scheme for this as has H1 at DESY. The BNL scheme uses manual input of the initial data with subsequent auto-update with respect to a central database. CERN too has a database scheme based on the PCs MAC address with facilities for node tracking and location finding (where a given node is located physically in the computer room). They are seriously considering the use of bar codes in the future. How should asset data for new systems be collected when they start arriving in the sort of bulk numbers expected in the future for example 100+ nodes per week when we consider clusters of 5000 nodes. Inputting asset data for this number of PCs by hand does not scale. We really want something more automatic something in the firmware for example. It was noted that IBM has some such a scheme with embedded data, which can be read out with a hand-held scanner. Turning to system administration tools, it was remarked that no site represented had mentioned using  HYPERLINK "http://www.iu.hio.no/cfengine/" cfengine, a public domain system admin tool commonly found in the wider UNIX community. It was questioned if there is experience on using it on clusters of up to 1000 nodes and beyond. Lastly, the question of file systems in use was brought up. There is heavy use of  HYPERLINK "http://nfs.sourceforge.net/" NFS and AFS at various sites but much less of so-called global file systems ( HYPERLINK "http://www.sistina.com/gfs/" GFS for example). 
Inside HEP at least, AFS appears to have a dominant position, although there is a clear move towards OpenAFS.

4.0 Large Site Session (Session Chair, Steve Wolbers, FNAL)

4.1 CERN Clusters (Tim Smith) (http://conferences.fnal.gov/lccws/papers/weds/LCW_CERN_clusters.ppt)

Until recently, CERN's central computer centre comprised a large number (37) of small to medium sized clusters in a variety of architectures, each dedicated to a given experiment. This clearly does not scale and there is currently a migration to a small number (three) of much larger clusters made up from two architectures (Linux and Solaris), where the clusters are distinguished by usage rather than by experiment and the different experiments share the clusters. This also involves reducing the variety of hardware configurations in place. The three clusters will be heterogeneous, which it is accepted will add extra complications in configuration and system administration and possibly means that system administration tools will have to be largely home-grown. There is a 50-node dual-CPU Interactive Cluster: user access is balanced using DNS, where the user connects to a generic DNS name and an algorithm finds the least loaded node, to which the user is then connected. The load algorithm considers not only CPU load but also other factors such as memory occupation. There is a 190-node dual-CPU scheduled batch cluster, which is dedicated at any given time to one or a small number of experiments, for example for dedicated beam running time or for defined data challenge periods. Dedicating all or some nodes of this cluster to given experiments is done simply by declaring only experiment-specific LSF queues (http://www.platform.com/products/LSF/) on these nodes. Rearranging the split between experiments, or replacing one experiment by another, is then simply a matter of redefining the LSF queue assignment, although it is accepted that this is a rather mechanical process today. Lastly, there is a so-called 280-node dual-CPU chaotic batch cluster, which offers general and unscheduled time. There is also a tape and disc server cluster, which does not offer direct user access. Typically the user connects to the Interactive Cluster and launches an LSF job to one of the two batch clusters. There is virtually no user login permitted on the batch clusters, but exceptions can be made for debugging if it appears that there is no alternative for fixing some particular problem. For this purpose LSRUN (from the LSF suite) is run on a few batch nodes in each cluster. Among the tools in use are:
System installation: Kickstart for Linux nodes and Jumpstart for Solaris
Automation of installation: ANIS (CERN developed; http://c.home.cern.ch/c/cborrego/www/anis/anis.html)
Post-installation and system configuration: SUE (CERN developed; http://wwwinfo.cern.ch/pdp/ose/sue/index.html)
Application installation: ASIS (CERN developed; http://wwwinfo.cern.ch/pdp/ose/asis/)
Console management: PCM (DEC's Polycenter Console Manager) and a console concentrator have been used up to now, but they are moving to cross-wiring of the serial ports to better cope with the rapidly rising number of nodes, although cross-wiring such a large number of nodes is a headache to manage. For the future they are investigating public domain tools such as VACM (http://vacm.sourceforge.net/) and Etherlite (http://www.digi.com/solutions/termsrv/etherlite.shtml)
Power management: none
Monitoring: currently a CERN-built alarm scheme known as SURE (http://service-sure.web.cern.ch/service-sure/) plus simple home-written tools for performance monitoring; they are currently working on a larger project to build a performance and alarm scheme which will monitor services rather than objects within the servers (the so-called PEM project, http://proj-pem.web.cern.ch/proj-pem/, described later in the meeting)
The speaker noted that both SUE and ASIS were originally written to ease system administration tasks when there was a wide range of UNIX flavours to support. With this decreasing to only two platforms over time, it is perhaps an opportune moment to reconsider both of these tools. Can they be simplified? Also, for batch nodes, only a very few applications from the ASIS suite are needed.

VA Linux (Steve DuChene)

VA Linux (http://www.valinux.com/) install and support clusters up to quite large numbers of nodes, including installations for two sites represented at the meeting, BNL and SLAC. They have noticed a marked trend of increasing CPU power per unit of floor space in the move from 4U to 2U to 1U systems, and they expect this to continue, with several PCs in a 1U-high unit shortly. But sites should be aware that such dense racking leads inevitably to greater challenges in power distribution and heat dissipation. Each cluster must have a configuration and management structure. For console management, for example, VA Linux recommends the VACM tool (http://vacm.sourceforge.net/), which they developed and which is now in the public domain. In VACM configurations there is a controller PC in each rack of up to 64 nodes on which VACM runs; from there VACM accesses the BIOS directly, or a special board (from Xircom) connected to the motherboard, to get its monitor data. VACM also supports remote power cycling and BIOS serial console port redirection. It can access sensors on the motherboard (fan speeds, temperature, etc.). Further, the code is not x86-specific so, being open source, it is portable to other architectures, and there is an API for adding your own modules. The source can be found on SourceForge. Another tool which they wrote and have made available is SystemImager (http://systemimager.sourceforge.net/): the system administrator configures a master image on a typical node, stores the image on a server and loads it onto client nodes via a network bootstrap or on demand from a floppy boot. Obviously this offers most advantages on a homogeneous cluster. Both the initial load of the clients and their subsequent consistency depend on the standard UNIX rsync protocol (http://rsync.samba.org/), with the originally configured node as the master image. It was noted, however, that at least the current version of rsync suffers from scaling problems. In the current scheme it is recommended to base no more than 20 to 30 nodes on a single master, but larger configurations can be arranged in a hierarchical manner: image a set of masters from a super-master and then cascade the image down to the end nodes. Other effects of scaling could be offset by using faster cluster interconnects. The next version of this tool should offer push updates and it may eventually use the multicast protocol for yet faster on-demand updates by performing the updates in parallel.
Once again the source can be found on SourceForge. Citing perhaps an extreme case of redundancy, one VA Linux customer has a policy of purchasing an extra 10% of systems. Their scheme for reacting to a system problem is first to reboot; if that fails, to re-install; and if that fails, to replace the node and discuss offline with the supplier the repair or replacement of the failed node.

SLAC Computer Centre (Chuck Boeheim) (http://conferences.fnal.gov/lccws/papers/weds/SLAC_WedAM.ppt)

There is a single large physical cluster although, viewed from a user point of view, there are multiple logical ones. This is achieved by the use of LSF queues (http://www.platform.com/products/LSF/). The hardware consists of some 900 single-CPU SUN workstations running Solaris and 512 dual-CPU PCs running Linux (of which the second 256 nodes were about to be installed at the time of the workshop). There are also dedicated servers for AFS (7 nodes, 3TB of disc space), NFS (21 nodes, 16TB of data) and Objectivity (94 nodes, 52TB of data), plus LSF and HPSS (10 tape movers and 40 tape drives). Objectivity manages the disk pool and HPSS manages the tape data. Finally there are 26 systems (a mixture of large and small Solaris servers and Linux boxes) dedicated to offering an interactive service. All these are interconnected via Gigabit Ethernet to the servers and 100Mbit Ethernet to the farm nodes, all linked by 9 CISCO 6509 switches. The major customer these days at SLAC is the BaBar experiment (http://www.slac.stanford.edu/BFROOT/). For this experiment there is a dedicated 12-node (mixed Solaris and Linux) build farm. The BaBar software consists of 7M SLOCs of C++ and the builds, which take up to 24 hours, are scheduled by LSF, although privileged core developers are permitted interactive access. The batch nodes do not accept login access except for a few designated developers who need to debug programs. NFS is used with the automounter on all 1400 batch nodes, controlled by the use of netgroups. This has occasionally been plagued by mount storms and needs to be carefully monitored, although there seem to be fewer problems using the TCP implementation of NFS as opposed to the more common UDP one. [It was noted later in the discussion that a similar-sized CERN cluster downplays the use of NFS, preferring to adopt a staging scheme based on the RFIO protocol (http://consult.cern.ch/writeup/coreuser/node14.html).] The Centre at SLAC is staffed by some 18 persons, some of whom also have a role in desktop support for the Lab. The ratio of systems supported per staff member has gradually risen: in 1998 they estimated 15 systems per staff person; today it appears to be closer to 100 systems per person. One possible reason for this improvement is the reduction in the variety of supported platforms and a reduction in complexity. Asked to explain the move from SUN to PC, the speaker explained that maximising the use of floor space was an important aspect: PCs can be obtained in 1U boxes, which SUN cannot supply today.
As elsewhere, limited floor space is an issue, but one aspect that may be relatively unique to SLAC, or perhaps to California generally, is the risk of seismic activity: physical rack stability is important! In their PC systems, SLAC has enabled remote power management and remote management, the combination of which permits a fully lights-out operation of the centre. They use console servers with up to 500 serial lines per server. As regards burn-in testing, their users never permit them enough time for such luxuries! Further, they have noticed that when systems are returned to a vendor under warranty, sometimes a different configuration comes back! Like CERN and other sites, with so many nodes physical tracking of nodes is an issue; a database is required, with bar codes on the nodes of the clusters. Among the tools in use are:
Network installation: locally-developed scripts wrapped around Kickstart and Jumpstart; they have managed to install some 256 nodes in an hour
Patch management: a local tool
Power management: tools from VA Linux (e.g. VACM, http://vacm.sourceforge.net/); they can power up or down a cluster taking account of inter-node sequence dependencies
Monitoring: the Ranger tool developed by C. Boeheim (http://www.jlab.org/hepix-hepnt/presentations/Ranger_Update/)
Reporting: a local tool which gathers reports across the cluster and produces short summaries (who wants to read the same error 512 times?)
For development purposes, the support team have established a small test bed where they can test scaling effects. The speaker closed with the memorable quote that a cluster is an excellent error amplifier.

Hardware Panel

This panel was led by Lisa Giachetti of Fermilab. The panel and audience were asked to address the following questions:
From among the criteria used to select hardware (price, price/performance, compatibility with another site, in-house expertise, future evolution of the architecture, network interconnect, etc.), which are the 3 most important, in order of significance?
Do you perform your own benchmarking of equipment?
How do you handle life cycles of the hardware, for example the evolution of Pentium processors, where later configurations and generations may need a new system image?
Have you experience, positive or negative, with heterogeneous clusters?

BNL (Tom Yanuklis) (http://conferences.fnal.gov/lccws/papers/weds/farm_hardware.pdf)

Typically BNL consider that PCs have a 3-year lifecycle. In this respect, it is important to understand for how long a vendor will support a particular configuration and what effect future vendor changes might have on the ongoing operation of your farm. Their primary purchase criteria are price/performance, manageability and compatibility with their installation image. Like many labs, BNL do not perform rigorous benchmarking of proposed configurations, but they do negotiate with vendors to obtain loaned systems that they then offer to end-users for evaluation with the target applications. They have noted that with increasing experience, users can better specify the most suitable configuration (memory, I/O needs, etc.) for their application. BNL prefer to install homogeneous clusters and declare each node dedicated either to batch or to interactive logins, although they reserve the right to modify the relative numbers of each within the cluster.
As mentioned earlier by others, they have seen heat and power effects caused by ever-denser racking and they have had to install extra power supplies and supplementary cooling as their clusters have been expanded. Over time, as the current experiments got underway and they built up processing momentum, they were very glad to have had the flexibility to change their processing model: instead of remotely accessing all the data, they were able to move to a model with local caching on the large system discs which are delivered in todays standard configurations. Once installed and running, getting changes to a cluster approved by the various interested groups can be an issue. This brings in the question of who proposes such changes users or administrators and what consensus must be reached among all parties before implementation of significant changes. BNL certainly consider the use of installation tools such as  HYPERLINK "http://www.toolinux.com/linutile/configuration/kickstart/" Kickstart and  HYPERLINK "http://systemimager.sourceforge.net/" SystemImager very important. Also remote power management and remote console operation are absolute essentials for example  HYPERLINK "http://vacm.sourceforge.net/" VACM and IBMs latest Cable Chain System which uses PCI cards with onboard Ethernet allowing commands to be issued to the cards over a private sub-network.  HYPERLINK "http://conferences.fnal.gov/lccws/papers/weds/pdsf.ppt" PDSF at LBNL (Thomas Davis) The  HYPERLINK "http://www.nersc.gov/" NERSC (National Energy Research Scientific Computing Center) at the Lawrence Berkeley National Laboratory has a long-standing tradition of operating supercomputers and the next iteration of this is likely to be a PC cluster so this is where current research is concentrated. PDSF (Parallel Distributed Systems Facility) originated at the US super-collider (SSC) which was planned to be built in Texas but was abandoned around 1995. It originally consisted of a mixture of SUN and HP workstations but recent incarnations are based on PCs running Linux. Users, which include a number of major HEP experiments, purchase time on the facility with real funds and customers include many HEP experiments. But despite buying time on the cluster, clients do not own the systems, but rather rent computing resources (CPU time and disc space). They may actually get more resources than they paid for if the cluster is idle.  HYPERLINK "http://www.platform.com/products/LSF/whatsnew41.asp#c" LSF Fairshare is used to arbitrate resources between users. Customers often want to specify their preferred software environment, down to Linux versions and even compiler versions so one of PDSFs most notable problems is software configuration change control, not surprising faced with a large variety of user groups on the cluster. PC hardware is purchased without ongoing vendor support, relying on the vendors standard warranty: when the warranty period ends, the PCs are used until they die naturally or until they are 3 years old. Another reason given to replace PCs is the need for memory expansion. In general, LBNL purchase systems as late as possible before they are needed in order to benefit from the ever-rising best price/performance of the market, although they tend to purchase systems on the knee of the curve rather than the newest, fastest chip. For example, when the fast chip is 1000 MHz, they will buy 2 650MHz chip systems for a similar price. 
Their memory configurations always comprise the largest DIMMs available at the moment of purchase. They have noticed that disc space per dollar appears to be increasing faster than Moore's Law; they estimate a doubling every 9 months. [This estimate is supported by CERN's PASTA review (http://sverre.home.cern.ch/sverre/PASTA_98/index.htm), which was performed to estimate the evolution of computing systems on the timescale of the LHC (until 2005/6).]

FNAL

FNAL have developed a qualification scheme for selecting PC/Linux vendors. They select a number of candidate suppliers (up to 18 in their first qualification cycle) who must submit a sample farm node and/or a sample desktop. FNAL provide the required Linux system image on a CD. FNAL then subject the samples to various hardware and software tests and benchmarking. They also test vendor support for the chosen hardware and for integration, although no direct software support is requested from vendors. The selection cycle is repeated approximately every 18-24 months or when there is a major technology change. After the last series, Fermilab selected 6 agreed suppliers of desktops and 5 possible suppliers of servers, from which they expect to purchase systems for at least the coming year or so.

5.4 Discussion

Upgrades: at most sites, hardware upgrades are rather uncommon; does this follow from the relatively short life of today's PC configurations? PDSF have performed a major in-place disc upgrade once and they also went through an exercise of adding extra memory. BNL have found it sometimes necessary also to add extra memory, sometimes as soon as 6 months after initial installation; this had been provoked by a change of programming environment. BNL had negotiated an agreement with their suppliers whereby BNL installed the extra memory themselves without invalidating the vendor's warranty. It was noted that it might not always be possible to buy compatible memory for older systems.

Benchmarking: it was generally agreed that the best benchmark is the target application code. One suggestion for a more general benchmark is to join the SPEC organisation (http://www.specbench.org/) as an Associate Member and then acquire the source codes for the SPECmark tests at a relatively inexpensive price. Jefferson Lab use a test based on prime numbers (see for example Mprime, http://www.iaeste.dk/~henrik/projects/mprime.html), which exercises the CPU. VA Linux has a tool (CTCS, http://sourceforge.net/projects/va-ctcs/) which tests extensively the various parts of the CPU.

Acceptance tests: it appears that these are relatively untypical among participating sites, although Fermilab performs burn-in tests using the same codes as for the selection evaluations. They also use Seti@Home (http://www.seti-inst.edu/science/setiathome.html), which has the advantage of fitting on a single floppy disc. NERSC perform some tests at the vendor site to validate complete systems before shipment. On a related issue, when dealing with mixed vendor configurations (hardware from a given supplier and software from another) it is important to define clearly the responsibilities and the boundaries of support.

Vendor relations: it was suggested to channel all contacts with a vendor through only a few named and competent individuals on each side. They serve as filters of problems (in both directions) and they can translate the incoming problems into a description that each side can understand.
Even with this mechanism in place, it is important to establish a relationship at the correct level technical as well as strategic depending on the information to be exchanged. Until long-standing relationships can be built, it can be hard to reach a correct level within the supplier organisation (deep technical expertise for technical problems, senior managerial for strategic). Panel A1 - Cluster Design, Configuration Management The questions posed to this panel, chaired by Thomas Davis, were ---- Do you use modeling tools to design the cluster? Do you use a formal database for configuration management? 6.1 Chiba City (John-Paul Navarro, Argonne)  HYPERLINK "http://www-unix.mcs.anl.gov/chiba/" Chiba City has been built up to 314 nodes since 1999 to be used as a test bed for scalability tests for High Performance Computing and Computer Science studies. The eventual goal is to have a multi-thousand node cluster. The configurations are largely Pentium III systems linked by 64-bit Myrinet. One golden rule is the use of Open Source software and there is only a single commercial product in use. The cluster is split logically into towns of up to 32 nodes each with a mayor managing each town. The mayors themselves are controlled by a city mayor. Installation, control and management can be considered in a hierarchical manner. The actual configurations are stored in a configuration database. Sanity checks are performed at boot time and then daily to monitor that the nodes run the correct target environment. Mayors have direct access to the consoles of all nodes in their towns. Remote power management is also used and both this and the remote management are considered essential. The operating system is currently Redhat 6.2 with the 2.4.2 Linux kernel but all in-house software is developed so as to be independent of the version of Linux. The programming model is  HYPERLINK "http://www-unix.mcs.anl.gov/mpi/fastchoice.html" MPI-based and job management is via the PBS resource manager and the  HYPERLINK "http://mauischeduler.sourceforge.net/" Maui scheduler. The initial installations and configurations were performed by in-house staff but if they have the choice, they will not repeat this! Myrinet was difficult to configure and the network must be carefully planned; it was found to be sensitive to heavy load situations. Since the original installation they have had to replace the memory and upgrade the BIOS, both of which were described as a pain. The environment is overall stressful: for example they suffer power fluctuations and they make the point that a problem, which occurs on 6 nodes, scales to a disaster when it occurs on 600! They have built their own management tools (which they have put into the public domain) and they make use of rsh in system management tasks but find that it does not scale well so work is in progress to get round rshs maximum 256 node restriction.  HYPERLINK "http://nfs.sourceforge.net/" NFS is used but it does not scale well in conditions of heavy use, especially for parallel applications so a parallel file system,  HYPERLINK "http://parlweb.parl.clemson.edu/pvfs/" PVFS is under investigation. They are developing an MPI-based multi-purpose daemon (MPD) as an experiment in job management (job launching, signal propagation, etc). Work is also going on with the  HYPERLINK "http://www.scyld.com/" Scyld Beowulf system use of a single system image on a cluster and emulation of a single process space across the entire cluster. 
The Scyld tests at Chiba City concern scalability and the use of Myrinet. Currently this is limited to 64 nodes, but tests are being carried out using 128 nodes. In the course of their work they have produced two toolkits, one for general systems administration and the second specifically for clusters. Both are available from their web site at http://www.mcs.anl.gov/systems/software. Finally, some lessons learned include:
Use of remote power and console management is essential
Many UNIX tools don't scale well: how would you build, operate and program a million-node cluster? Could you?
A configuration database is essential
Change management is hard
Random failures scale into nightmares on a cluster

6.2 PDSF and the Alvarez Clusters (Shane Cannon, LBNL) (http://conferences.fnal.gov/lccws/papers/weds/pdsf-alvarez.ppt)

NERSC at LBNL has a number of very large clusters installed. For example, they have an IBM SP cluster with more than 2000 nodes, a configuration which is rated fifth in the Top 500 list. There is also a 692-node Cray T3E. In general, the applications in most use at NERSC are embarrassingly parallel and this is reflected in the cluster design. For a new configuration, rather than perform detailed modelling, they find that they get the best value for money by simply buying commodity processors and configurations. The PDSF and Alvarez clusters are planned to offload, and perhaps eventually replace, the IBM and Cray systems, and are directly targeted at parallel applications. In the Alvarez cluster, a high-speed network was specified in the Request For Prices (RFP) and Myrinet (http://www.myri.com/myrinet/overview/) was selected. Among the issues they face are maintaining consistency across a cluster and the scalability not only of the cluster configurations but also of the human resources needed to manage them.

6.3 Linux NetworX (Joshua Harr)

For installing new systems, Linux NetworX (http://www.linuxnetworx.com/index.php) makes use of many tools, including SystemImager (http://sourceforge.net/projects/systemimager/; good for homogeneous clusters) and LUI (http://oss.software.ibm.com/developerworks/projects/lui; better for heterogeneous clusters). They find that neither of these meets all needs today, but the OSCAR project (http://www.csm.ornl.gov/oscar/) inside IBM is reputed to be planning to merge the best features of both. They have found, however, as mentioned by others, that NFS and rsync by themselves do not scale. They use LinuxBIOS (http://www.acl.lanl.gov/linuxbios/), a Linux micro-kernel that can control the boot process, but they agree with previous speakers that a remotely accessible BIOS is really desirable and ought to be supplied by all vendors and properly supported on the PC motherboard.

6.4 Discussion

From a poll of the audience, in-house formal design processes in cluster planning are at best rare. More common is to rely on previous experience, personal or from colleagues or conference presentations. On the other hand, cluster vendors do indeed perform configuration planning, especially with respect to power distribution, cooling requirements and floor space. The use of Uninterruptible Power Supplies (UPS) varies.
NERSC does not have a global UPS, the main argument given that a UPS for a Cray would be prohibitively expensive; they do however protect key servers. SLAC has a UPS for the building, justified by comparing the cost with the clock time, which would be needed to restart an entire building full of computer systems. Likewise, CERNs main computer centre is protected. FNAL at the current time has UPS on certain racks with vital servers, for example  HYPERLINK "http://www.transarc.ibm.com/Product/EFS/AFS/index.html" AFS, but it is currently considering adding a UPS for the whole centre. They have to perform two complete power cycles per year for safety checks and they find it sometimes takes weeks for stable service to be resumed! Consistency of the configurations: it was agreed that hardware consistency can only be obtained by purchasing an entire cluster as a single unit. Software consistency can be obtained by several methods, described more fully elsewhere in this meeting. For example by cloning a single system image; by using a form of synchronisation against a master system or by using a consistent set of RPMs. In this respect Redhat are reputed to be working on a tool based on the use of RPMs, which should take care of inter-RPM dependencies.  HYPERLINK "http://www.debian.org/" Debian, another Linux distribution, has a similar tool. Monitoring tools: described in detail in Panel A3 (section 10) but it was noted here that as clusters increase in size, it is important to set correct thresholds. In a thousand node cluster, who if anyone should be alerted at 3 oclock in the morning that 5 nodes have crashed? Should one buy 1000 systems or a cluster of 1000 nodes? The former may be cheaper because the second is a solution and often costs more. But dont forget the initial and ongoing in-house support costs. On the other hand, if only the vendor-supplied hardware will be used and not the software (as reported elsewhere), is it worth the extra expense to buy the solution? Tools develop or re-use? A number of sites have built various installation utilities which will install a new PC with the system administrator being asked only a very few questions (see Panel A2, section 7). In general there is a trade-off of adaptability versus flexibility when deciding to use or adapt an existing tool or producing ones own. There is a general desire at some level not to reinvent the wheel but it is often accepted to be more intellectually challenging to produce ones own. Configuration Management: apart from the tools already mentioned, one should add the possibility to use  HYPERLINK "http://www.iu.hio.no/cfengine/" cfengine. PDSF has begun to investigate this public domain tool, which is frequently used in the wider UNIX community if not often in the HENP world. One major question concerns its scalability although other sites report no problems in this respect. Apart from such freely available tools, many vendors are rumoured to be working on such tools. Cluster design: what tools exist? In particular, help would be appreciated in selecting the best cluster architecture and cluster interconnect. However, this only seems possible if the application workload can be accurately specified, and modelled.  HYPERLINK "http://www-unix.mcs.anl.gov/mpi/fastchoice.html" MPI applications should be an area where this could be possible. Panel B1 - Data Access, Data Movement The questions posed to this panel, chaired by Don Petravick, were ---- What is the size of the data store and which tools are in use? 
Panel B1 - Data Access, Data Movement
The questions posed to this panel, chaired by Don Petravick, were:
- What is the size of the data store and which tools are in use?
- How to deal with free storage (large local discs on modern PCs, for example)?
- What protocol and software stacks are used to access this data across the LAN/WAN?
Sanger Institute (James Cuff)
 HYPERLINK "http://www.ensembl.org/" ENSEMBL is a $14M scheme for moving data in the context of a programme for automatically annotating genomes. The users require 24-hour turnaround for their analyses. The files concerned are multi-GB in size. For example, using  HYPERLINK "http://www.mysql.com/" mySQL as the access tool, the databases may be more than 25GB with single tables of the order of 8-10GB. Sanger has found that large memory caches (for example 3GB) can significantly increase performance. They experience trouble with  HYPERLINK "http://nfs.sourceforge.net/" NFS across the clusters, so all binaries, configuration files and databases are stored locally and only the home directories, which are neither high-throughput nor especially volatile, are NFS-mounted. They use  HYPERLINK "http://manimac.itd.nrl.navy.mil/MDP/" MDP for reliable multicast. It scaled to 40 MB/s in tests over 100BaseT links (40 times 1MB/s) but it failed in production with incomplete files and file corruption! They have found that it is only possible to multicast reliably at speeds of up to 64KB/s. The problem appears to be with the No Acknowledge (NAK) messages for dropped data and the error follow-up of these.
 HYPERLINK "http://conferences.fnal.gov/lccws/papers/weds/condor-i-o.ppt" Condor I/O (Doug Thain)
The University of Wisconsin has 640 CPUs in its Computer Science building; among other uses, these are used for computer architecture development on Condor. The 264 CPUs at  HYPERLINK "http://www.bo.infn.it/calcolo/condor/index.html" INFN constitute the largest single user community. It seems that one of the attractions for particle physics sites is the real-case usage need. Is there a funding niche in asking for "integration" projects to take computer science projects to a "phase 3" with early usage testers? Do the funding agencies see value in these pilot projects, or do they expect the results to be attractive enough to end users (commercial or academic) that they will volunteer their own effort to integrate the research results? Condor has as a common denominator the principle of hiding errors from running jobs and instead propagating failures back to the scheduler, and eventually the end user, for recovery. This, along with remote I/O, is easily accommodated by linking against the Condor C libraries. A recent development is  HYPERLINK "http://www.cs.wisc.edu/condor/manual/v6.2/2_10Inter_job_Dependencies.html" DAGMan, which could be thought of as a distributed UNIX "make" command with persistency. Another development is Kangaroo. Under this scheme, the aggregate ensemble of network, memory and disc space is considered as a large buffer for the full transfer. The goal is to allow overlap of the computation and the data transport. This presumes there is no (or acceptable) contention for CPU between computation and data transfer. Kangaroo explicitly trades aggregate throughput for data consistency. In summary, correctness is a major obstacle to high-throughput computing: jobs must be protected from all possible errors in data access. The question was asked whether there had been any game-theory analysis of where the optimum lies between the "abort and restart" philosophy and simple checkpoint/restart. Experimentally, users universally chose the former over checkpointing. Where is the break-even point in job size/error frequency space?
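The break-even question can be explored with a toy model (an assumption of this write-up, not something presented at the workshop): take failures as Poisson with rate lambda, assume restarts and recoveries are instantaneous, and compare the standard expected completion time for restart-from-scratch with a checkpointed run that only loses the current segment:

    import math

    def expected_restart(T, lam):
        # expected wall-clock time when any failure forces a restart from scratch
        return (math.exp(lam * T) - 1.0) / lam

    def expected_checkpoint(T, lam, interval, overhead):
        # job split into T/interval segments, each paying `overhead` to write a checkpoint
        return (T / interval) * (math.exp(lam * (interval + overhead)) - 1.0) / lam

    lam = 1.0 / 100.0                      # one failure per 100 hours on average (invented)
    for T in (4, 24, 48, 168):             # job lengths in hours
        print(T, expected_restart(T, lam), expected_checkpoint(T, lam, 4.0, 0.1))

With these invented numbers, plain abort-and-restart wins for jobs of a few hours, while checkpointing wins for day-long and longer jobs, which is consistent with users preferring the simpler scheme for typical job lengths.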
HYPERLINK "http://conferences.fnal.gov/lccws/papers/weds/DataManagement.ppt"NIKHEF Data Management (Kors Bos) What is the current HEP problem size? The data rate is 10**9 events/year and there is a simulation requirement of a minimum of 10% of this (10**8 events/year). A typical CPU in NIKHEFs cluster typically requires 3 minutes to simulate an event, equals 10**5 events per year per CPU. Multiply that by 100 CPUs and there is still a factor of 10 too few nodes for the required simulation. Hence the need to aggregate data from different remote sites. For the  HYPERLINK "http://www-d0.fnal.gov/" D0 experiment, the generated information is 1.5 GB of detector output per CPU plus 0.7GB simulation data. [A very important question relating to the physics of such simulations is how do they deal with the distribution of the random numbers/seeds.] They thus generate some 200GB of data per day overall and transmit half of this to FNAL. Seeing that ftp throughput is limited by roundtrip latency across the network. They use  HYPERLINK "http://ccweb.in2p3.fr/bbftp/" bbftp to bring this up to 25 Mb/s. With multiple bbftps they can get 45 Mb/s, limited mostly by the network connection between the Chicago access point of the transatlantic link and FNAL itself. The conclusion is that producing and storing data is rather simple; moving it is less simple but still easy. The real issue is managing the data. The bulk of the data is in storage of intermediate results, which are never used in reality. There are advocates of not storing these intermediate results but rather to recalculate them. Is there a market in less reliable storage for this bulk data?  HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/LSCCW_umn_ccgb.ppt" Genomics and Bio-Informatics in Minnesota (Chris Dwan) Their planning horizon is much compressed compared to the HEP presentations. They have the genome analysis in hand now that was originally planned for 2008. They are living with the bounty of the funding wave/success. Their computing model is pipelined data processing. The problem size is 100KB and less than 5 seconds processing per event. They run 10,000 jobs/month so the major problem is the job management and job dependency for users with lots of distributed jobs. The users perform similarity searches, searches against one or more 10GB databases where the problem is data movement. They use servers for this. [Could an alternative be to put these on PCs with many replicas of the database, which seems to be the Sanger approach?] The Bio-Informatics component is in data warehousing, including mirroring and crosschecking other public resources. They have local implementations of various public databases stored in ORACLE. They must store large databases of raw information as well as information processed locally with web-based display of results and visualization tools. They have a 100-node  HYPERLINK "http://www.cs.wisc.edu/condor/" Condor pool using desktop PCs, shared with other users. The algorithms are heuristic and so there is a fundamental question about whether different results are erroneous or just produced on different hardware, for example some from 32 bit, some from 64 bit architectures. Biophysics teams experience the same phenomenon as some particle physicists (see previous talk) that it may be easier to recollect the data than to migrate forward from old stored copies, at least for the next decade. Is there a basis for the presumption that there will always be the need for local storage? 
Is there a basis for the presumption that there will always be the need for local storage? What about the option of cataloguing where the data is? What are the overheads of handling concurrent data accesses? What would be the result of comparing the growth curves of network speed with local disk speeds - on pure performance grounds, which is likely to be the winner? The question was raised about interest in  HYPERLINK "http://sourceforge.net/projects/intel-iscsi" iSCSI for accessing disks remotely. Core routers are at the 6GB/s fabric level now; what is the aggregate rate of the local disk busses and how is it changing? There are concerns with the model of splitting data across multiple CPUs and fragmenting jobs to span the data. These relate especially to the complexity of the data catalogue and to the job handling needed to make sure the sub-portions form a complete set and can be restarted.
Discussion
100TB stores are handled today; stores at the level of PB are still daunting. Network capabilities still aggregate well into a large overall capability.  HYPERLINK "http://www.fibrechannel.com/" FibreChannel is used only on a small scale, between large boxes, for redundancy. Local disk is still used as a buffer for remote transfers. There is lots of interest in multicast transmission of data, but current implementations have reliability problems. Parallelization of normal tools seems to be serving people well enough in clusters up to 100-200 nodes. This brings up the question of whether the sites managing clusters in units of 1000+ nodes (SLAC, CERN, FNAL, etc.) have any insights into the mental limits one hits. It is relatively easy to create another layer of hierarchy in systems that already have at least one intermediate layer, but getting that abstraction in the first place is often quite difficult.
Panel A2 - Installation, Upgrading, Testing
The questions posed to this panel, chaired by Steven Timm, were:
- Do you buy in installation services? From the supplier or a third-party vendor?
- Do you buy pre-configured systems or build your own configuration?
- Do you upgrade the full cluster at one time or in rolling mode?
- Do you perform formal acceptance or burn-in tests?
 HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/Dolly.ppt" Dolly+ (Atsushi Manabe, KEK)
Faced with having to install some 100 new PCs, they considered the various options ranging from buying pre-installed PCs to performing the installation on the nodes by hand individually. Eventually they came across a tool for cloning disk images developed by the  HYPERLINK "http://www.google.com/search?q=cops+eth&btnG=Google+Search" CoPs (Clusters of PCs) project at ETH in Zurich and they adapted it locally for their software installations. Their target was to install Linux on 100 PCs in around 10 minutes. The PXE (Preboot Execution Environment) bootstrap of their particular PC model starts a pre-installer, which in turn was modified to fire up a modified version of Redhat's  HYPERLINK "http://www.toolinux.com/linutile/configuration/kickstart/" Kickstart. This formats the disc, sets up network parameters and calls Dolly+ to clone the disc image from the master. The nodes are considered as connected logically in a ring and the software is propagated from one node to the next, with shortcuts in the event of failure of a particular node. Such an arrangement reduces contention problems on a central server. Using SCSI discs, the target was achieved - just over 9 minutes for 100 nodes with a 4GB disc image - although the times are doubled if using IDE discs or an 8GB disc image. A  HYPERLINK "http://corvus.kek.jp/~manabe/pcf/dolly" beta version is available.
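The ring propagation used by Dolly+ can be pictured with a small sketch (a conceptual illustration only, not the actual Dolly+ code): each node receives the disc image from its predecessor and streams it on to its successor, and failed nodes are simply skipped so that the chain stays intact:

    def build_chain(master, nodes, alive):
        # return the forwarding chain, starting at the master and skipping dead nodes
        chain = [master]
        for node in nodes:
            if alive(node):
                chain.append(node)
        return chain

    nodes = ["node%02d" % i for i in range(1, 11)]
    dead = {"node04"}                                   # pretend one node has failed
    chain = build_chain("master", nodes, lambda n: n not in dead)
    for src, dst in zip(chain, chain[1:]):
        print("%s -> %s" % (src, dst))                  # each hop streams the image onward

Because every node both receives and sends, the load on the master stays constant no matter how many nodes are cloned, which is the point of the ring arrangement.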
 HYPERLINK "http://conferences.fnal.gov/lccws/papers/weds/Rocks.ppt" Rocks (Philip Papadopoulos, San Diego Supercomputer Center)
The San Diego Supercomputer Center integrates its clusters itself, especially since up to now their clusters have been relatively modest in size. They tend to update all nodes in a cluster in one cycle since this is rather fast with Rocks. They have performed rolling upgrades, although this has caused configuration problems. They do not perform formal acceptance tests; they suggest that this would be more likely if there were more automation of such tests. Installing clusters by hand has the major drawback of having to keep all the cluster nodes up to date. Disk imaging also is not sufficient, especially if the cluster is not 100% homogeneous. Specialised installation tools do exist -  HYPERLINK "http://oss.software.ibm.com/developerworks/projects/lui" LUI (Linux Utility for cluster Installation) from IBM was mentioned again - but San Diego believe that they should trust the Linux vendors and use their tools where possible, hence the use of Redhat's  HYPERLINK "http://www.toolinux.com/linutile/configuration/kickstart/" Kickstart for example. But they do require automation of the procedures to generate the configuration needed by Kickstart, which Rocks provides. They have evolved two cluster node types: one, the front-end, for login sessions, the other for batch queue execution, where the operating system image is considered disposable - it can be re-installed as needed. The  HYPERLINK "http://rocks.npaci.edu/" Rocks toolkit consists of a bootable CD and a floppy containing all the required packages and the site configuration files to install an entire cluster the first time. Subsequent updates and re-installations are from a network server. It supports heterogeneous architectures within the cluster(s) and parallel re-installations: one node takes 10 minutes, 32 nodes take 13 minutes; adding more nodes implies adding more HTTP servers for the installation. It is so trivial to re-install a node that if there is any doubt about the running configuration, the node is simply re-installed. Apart from re-installations to ensure a consistent environment, a hard power cycle triggers a re-installation by default. The cluster-wide configuration files are stored in a  HYPERLINK "http://www.mysql.com/" mySQL database and all the required software is packaged in  HYPERLINK "http://www.rpm.org/" RPMs. They are currently tracking Redhat Linux 7.1 with the 2.4 kernel. They have developed a program called  HYPERLINK "http://rocks.npaci.edu/manpages/insert-ethers.8.html" insert-ethers which parses /var/log/messages for DHCPDISCOVER messages and extracts the MAC addresses discovered. There is no serial console on their PCs, so BIOS messages are not seen and this makes very tricky problems hard to debug. They have considered various monitoring tools, including those based on  HYPERLINK "http://snmp.cs.utwente.nl/" SNMP,  HYPERLINK "http://www.millennium.berkeley.edu/ganglia/" Ganglia (from UCB),  HYPERLINK "http://www-isd.fnal.gov/ngop/" NGOP from Fermilab,  HYPERLINK "http://proj-pem.web.cern.ch/proj-pem/" PEM from CERN, the  HYPERLINK "http://www.mcs.anl.gov/systems/software" Chiba City tools, etc., but they feel more investigation is needed before coming to a decision.
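The insert-ethers idea of harvesting new MAC addresses from the syslog can be sketched as follows (an illustration only; the exact DHCPDISCOVER log format varies between dhcpd and syslog versions, so the pattern below is an assumption):

    import re

    mac_re = re.compile(r"DHCPDISCOVER from ([0-9a-f]{2}(?::[0-9a-f]{2}){5})", re.I)

    macs = set()
    with open("/var/log/messages") as log:
        for line in log:
            m = mac_re.search(line)
            if m:
                macs.add(m.group(1).lower())

    for mac in sorted(macs):
        print(mac)           # candidate new nodes to be added to the cluster database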
HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/GRID-wp4-install.ppt" European DataGrid Project, WP4  (Jarek Polok, CERN)  HYPERLINK "http://hep-proj-grid-fabric.web.cern.ch/hep-proj-grid-fabric/" WP4, or Fabric Management, is the work package of the  HYPERLINK "http://www.eu-datagrid.org/" European DataGrid project concerned with the management of a computer centre for the DataGrid. There are many fabric management tools in common use today but how well do they scale to multi-thousand node clusters? Tools at that level must be modular, scalable and, above all, automated. One of the first tasks of WP4 therefore has been a  HYPERLINK "http://hep-proj-grid-fabric.web.cern.ch/hep-proj-grid-fabric/Tools/tool_survey.htm" survey of non-commercial tools on the market place. Looking longer term, WP4s work has been split into individual tasks in order to define an overall architecture: - Software packages which consist of data, dependency information and methods (of installation for example) Software repository scheme, which should include an interface for system administrators and which interfaces to the WP4 configuration system Node management scheme which performs operations on the software packages and also consults the WP4 configuration system Bootstrap service system Gridification (including user authentication and security issues) The final installation scheme must be scalable to thousand+ node clusters and should not depend on software platform although at lower levels there may well need to be different implementations. The Datagrid project has set month 9 (September) for a first demonstration of what exists and WP4 have decided for this first testbed to use  HYPERLINK "http://www.dcs.ed.ac.uk/home/paul/publications/ALS2000/" LCFG from Edinburgh University for configuration management  HYPERLINK "http://www.dcs.ed.ac.uk/home/ajs/linux/updaterpms/" Updaterpms, also from Edinburgh University, (and maybe ASIS from CERN) for environment tailoring  HYPERLINK "http://systemimager.sourceforge.net/" SystemImager for installation This is not a long-term commitment to these tools but rather to solve an immediate problem (to have a running testbed in month 9) and also to evaluate these tools further. The actual installation scheme will use  HYPERLINK "http://www.toolinux.com/linutile/configuration/kickstart/" Kickstart for Linux,  HYPERLINK "http://www.securityfocus.com/focus/sun/articles/jumpstart.html" JumpStart for Solaris. 8.4 Discussion The various sites represented use virtually all possible schemes for system installation some use the standard vendor-supplied method for the chosen operating system, others develop their own and use it; some develop their own environment and request the box supplier to install it before shipment; a few require the hardware supplier to install the chosen environment on site and some leave the entire installation to the supplier. However, there does seem to be a trend more and more towards network-based installation schemes of one type or another. The  HYPERLINK "http://www.acl.lanl.gov/linuxbios/" LinuxBIOS utility from Los Alamos National Laboratory is used in some sites but there are reports that at least the current version is difficult to setup although once configured it works well. And since it is open source software, it permits a local site to make changes to the BIOS, for example adding ssh for secure remote access. 
The  HYPERLINK "http://www.acl.lanl.gov/linuxbios/" LinuxBIOS utility from Los Alamos National Laboratory is used at some sites but there are reports that at least the current version is difficult to set up, although once configured it works well. And since it is open source software, it permits a local site to make changes to the BIOS, for example adding ssh for secure remote access. A similar tool is  HYPERLINK "http://bioswriter.sourceforge.net/" BIOSwriter (available on SourceForge), which permits BIOS settings to be cloned across a cluster. It has several weaknesses, the most serious of which is that the time written into all the cloned BIOSs is the one stored when the master BIOS is read; thus all the times written are wrong. It also appears to fail on certain BIOSs.
Software upgrades too are handled differently: for example, CERN performs rolling upgrades on its clusters; KEK performs them all at once, at least on homogeneous clusters; BNL environments are locked during a physics run except for security patches, which tends to concentrate upgrades together; San Diego (with Rocks) performs the upgrade in batch mode all at once.  HYPERLINK "http://systemimager.sourceforge.net/" SystemImager permits the upgrade to take place on a live system, including upgrading the kernel, but the new system is only activated at the following bootstrap. VA Linux uses the  HYPERLINK "http://sourceforge.net/projects/va-ctcs/" CTCS tool (Cerberus Test Control System) for system burn-in. This checks the system very thoroughly, testing memory, disc access, CPU operation, CPU power consumption, etc. In fact, if certain CPU flags are wrongly set, it can actually damage the CPU. Finally, when asked how deeply various sites specify their desired system configurations, LBNL/NERSC and CERN noted that they go down to the chip and motherboard level. Fermilab used to do this also but have recently adopted a higher-level view.
Panel B2 - CPU and Resource Allocation
The questions posed to this panel, chaired by Jim Amundson, were:
- Batch queuing system in use?
- Turnaround guarantees?
- Pre-allocation of resources?
 HYPERLINK "http://conferences.fnal.gov/lccws/papers/weds/Allocation.ppt" BaBar (Charles Young)
Should one buy or develop a batch scheduling system? Development must take account of support and maintenance issues. Purchase costs, including ongoing support licences, start to be comparable to the costs of the latest cheap PC devices. Can one manage a single unit of 10,000 nodes? Will we need to? How does one define queues: by CPU speed, by the expected output file size, by I/O bandwidth, as experiment-specific, as dependent on node configuration, etc.? Are these various alternatives orthogonal? Pre-allocation of nodes reduces wait time for resources but it may reduce CPU utilization and efficiency. What level of efficiency is important? What kind of priority guarantees must be offered for job turnaround? The speaker stated that "users are damn good at gaming the system".
LSF (David Bigagli, Platform)
The challenge in data movement was posed as "can we run the same job against multiple pools of data and collect the output into a unique whole?" Can  HYPERLINK "http://www.platform.com/products/LSF/" LSF handle a list of resources for each machine, being a (dynamic) list of files located on the machine itself? Can it handle cases of multiple returns of the same data? How can/should we deal with the "scheduler" problems? There is traditionally a tension between the desire to let idle resources be used and the need to still provide some guarantee of turnaround. Apart from LSF there are at least 4 other batch systems in use in HEP. What are the reasons? Answers from the audience include cost, portability, customization of scheduler/database integration, etc.
Discussion
The challenges for systems administrators are:
- for network I/O, whether to use TCP or UDP
- how to efficiently maintain host/resource information
- daemon initialization and the propagation of configuration changes
- operating system limitations (file descriptors)
- administrative challenges.
LSF has a large events file that must be common to the fallback server in order for it to take over the jobs. Other than that, failover works smoothly (as indicated by SLAC and Sanger). Problems with jobs finishing in the window during failover have to be dealt with. Typically, the frequency of failover is the hardware failure frequency of the master machine. Most sites are using quad- or dual-CPU Sun servers as the master batch servers; FNAL for example has a single Sun. CERN adds disk pool space to the scheduling algorithm, but that is the only element that people have added to the scheduling algorithms. On very restricted pools of 50-200 machines, experiments are successfully submitting jobs without a batch system. An audience census demonstrated that 30% use  HYPERLINK "http://www.platform.com/products/LSF/" LSF and 70% use something else, made up of 10% using  HYPERLINK "http://www.cs.wisc.edu/condor/" Condor (opportunistic scheduling, the  HYPERLINK "http://www.cs.wisc.edu/condor/manual/v6.2/2_10Inter_job_Dependencies.html" DAGMan component, match-making); 30% using PBS (price, sufficient to the needs, external collaboration, ability to influence scheduler design - although the impact of its commercialization is unknown and worrisome); and 20% with local custom-built tools (historical expertise, sufficient to the needs, guaranteed access to developers, costs, reliability, capabilities to manage resources). Are there any teeth in the  HYPERLINK "http://standards.ieee.org/catalog/olis/index.html" POSIX batch standard? How many people even know there is such a thing? Turning to AFS support in LSF, is the data encrypted in transfer? In LSF 4.0 the claim is that it is possible for root on the client to get the user's password via the token that was transferred. No one in the audience, nor the LSF speaker, recognized the problem. On clusters of about 100 machines, DNS round-robin load balancing works for interactive logins. MOSIX is used by one site for ssh gateway redundancy to allow clean failover to another box; CERN finesses this by forcing the ssh key to be the same. What is the most effective way of dealing with queue abusers? One site relies on user liaison to apply the social pressure needed to change bad habits. Public posting of monitoring information gets peer pressure to reform abusers. However, in shared pools, people are adamant about pointing out "other" abusers. How do sites schedule downtime? Train people that jobs longer than 24 hours are at risk. CERN posts a future shutdown time for the job starter (internal); BQS has this feature built in. Condor has a daemon (eventd) for draining queues. Some labs reboot and have maintenance windows.
Small Site Session (Session chair: Wolfgang von Rueden, CERN)
10.1  HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/HEPFarmConf.ppt" NIKHEF (Kors Bos)
The NIKHEF data centre includes a 50-node farm with a total of 100 Pentium III CPUs dedicated to Monte Carlo event production for the  HYPERLINK "http://www-d0.fnal.gov/" D0 experiment at the Fermilab Tevatron; it is expected to double in size each year for the next few years.
There is also a small, 2-node test farm with the same configuration; this system is considered vital by the support team and is used for 90% of development and testing. NIKHEF is connected to  HYPERLINK "http://www.surfnet.nl/" SURFNET, a 20Gbps backbone in Holland. NIKHEF are also members of a national Grid project now getting underway in the Netherlands. The nodes run Redhat Linux version 6.2, booted via the network, with tools and applications being made available via the  HYPERLINK "http://www.fnal.gov/docs/products/upd/" UPS/UPD system developed by Fermilab. Also much appreciated from Fermilab, the batch scheme is  HYPERLINK "http://www-isd.fnal.gov/fbsng/" FBSng. Jobs to be run are generated by a script and passed to FBSng; typically the script is set up to generate enough jobs to keep the farm busy for 7 days, lowering the administration overhead. The data for D0 is stored in  HYPERLINK "http://d0db.fnal.gov/sam/" SAM, a database scheme developed by D0. SAM, installed at all major D0 sites, knows what data is stored where and how it should be processed. It could be considered an early-generation Grid. [In fact NIKHEF participate in both the European DataGrid project and a local Dutch Grid project.] Data is passed from the local file system back to Fermilab using ftp, so some research has taken place on this particular application. Native UNIX ftp transfers data at up to 4Mbps, a rate partially governed by ftp's handshake protocol. Using  HYPERLINK "http://ccweb.in2p3.fr/bbftp/" bbftp, developed at IN2P3, a maximum rate of about 20 Mbps has been achieved, with up to 45 Mbps using 7 streams in parallel. A development called  HYPERLINK "http://www-unix.globus.org/mail_archive/datagrid/msg00217.html" grid-ftp has achieved 25 Mbps. Increasing ftp parallelism gives better performance but so far the absolute maximum seen is 60 Mbps, and NIKHEF believes that ultimately 100 Mbps will be needed. However, at the current time, the bottleneck is not ftp but rather the line speed from Fermilab itself to the Chicago access point of the transatlantic link, a problem which is acknowledged and which it is planned to solve as soon as possible. A current development at NIKHEF is research into building their own PC systems from specially chosen motherboards and chips. Using in-house skills and bought-in components, they expect to be able to build suitable farm nodes for only $2K (parts only) as compared to a minimum of $4K from Dell, for example, for the previous generation of farm nodes. Once designed, it takes only an hour to build a node. So far there is motivation among the group for this activity, but will this last over a longish production run?
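The "generate a week of jobs with a script" approach described above lends itself to a very small sketch (hypothetical throughout: the executable name, the event counts and the final submission step are invented; the real farm hands the generated jobs to FBSng):

    import random

    EVENTS_PER_JOB = 500
    JOBS_FOR_ONE_WEEK = 200      # tuned so the farm stays busy for about 7 days

    for i in range(JOBS_FOR_ONE_WEEK):
        seed = random.randrange(1, 2 ** 31)      # a distinct seed per job (cf. the seed question above)
        name = "mcjob_%04d.sh" % i
        with open(name, "w") as f:
            f.write("#!/bin/sh\n")
            f.write("./d0_montecarlo --events %d --seed %d\n" % (EVENTS_PER_JOB, seed))
        print("submit", name, "to the batch system here (e.g. via FBSng)")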
 HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/LCCWS.ppt" Jefferson Lab (Ian Bird)
The main farm at JLab is a 250-node PC cluster (soon to expand to 320 nodes) used mainly for reconstruction and some analysis of the data from their on-site experiments, which now produce about 1TB of data per day. There is also some simulation and a little general batch load. Access is possible to the farm and its mass storage from anywhere on the JLab site via locally-written software. Jefferson specifies a particular motherboard and their desired chip configuration and then gets a supplier to build it. Their storage scheme is based on storage units of 1TB and they can fit up to 5 such units in a rack. It used to be controlled by  HYPERLINK "http://mufasa.desy.de/" OSM but they have moved to home-written software. On their backbone they have had good experience with  HYPERLINK "http://www.foundrynetworks.com/products/" Foundry Gigabit switches, which give performance equivalent to the more common  HYPERLINK "http://www.cisco.com/" Cisco switches. Jefferson also participates in the Lattice QCD project mentioned above; in partnership with MIT, they have a 28-node dual-CPU Compaq Alpha cluster installed, connected with  HYPERLINK "http://www.myri.com/myrinet/overview/" Myrinet and front-ended by a login server and a file server. The first implementation of this cluster was with stand-alone box systems but the most recent version has been rack-mounted by an outside vendor. A second, independent cluster will be added in the near future, probably with 128 Pentium 4 PCs, and this is expected to at least double later. The two clusters (Alpha and Pentium) cannot realistically be merged because of the parallel nature of the application, which virtually demands that all nodes in a cluster be of the same architecture. Jefferson has developed its own UNIX user environment ( HYPERLINK "http://cc.jlab.org/services/cue/cuesw.html" CUE). This includes files accessed via  HYPERLINK "http://nfs.sourceforge.net/" NFS or  HYPERLINK "http://www.microsoft.com/Mind/1196/CIFS.htm" CIFS from a  HYPERLINK "http://www.networkappliance.com/" Network Appliance file server. For batch production, Jefferson uses both  HYPERLINK "http://www.platform.com/products/LSF/" LSF and, because of LSF's licence costs, PBS, and it has devised a scheme ( HYPERLINK "http://cc.jlab.org/docs/scicomp/man-page/jsub-man.html" Jsub) on the interactive nodes which is effectively a thin wrapper permitting users to submit jobs to either in a transparent manner. They have devised a local scheduler for PBS. Actual resource allocation is governed by LSF and  HYPERLINK "http://www.platform.com/products/LSF/whatsnew41.asp#c" LSF Fairshare. LSF is also used to publish regular usage plots. They have also developed a web interface to permit users to monitor the progress of their jobs. There are plans to merge these various tools into a web application based on XML and including a web toolkit, which experiments can use to tailor it for their own use. The Lattice QCD cluster uses PBS only, but with a locally developed scheduler plug-in which mimics LSF's hierarchical behaviour. It has a web interface with user authentication via certificates. Users can submit jobs either to the QCD cluster at Jefferson or to the one at MIT.
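The thin-wrapper idea behind Jsub can be illustrated in a few lines (a sketch only, not the actual Jsub implementation; the queue name is invented and only the most basic bsub/qsub options are used):

    import subprocess
    import sys

    def submit(script, system="lsf", queue="production"):
        # route the job to LSF or PBS without the user having to know which is which
        if system == "lsf":
            cmd = ["bsub", "-q", queue, script]
        elif system == "pbs":
            cmd = ["qsub", "-q", queue, script]
        else:
            raise ValueError("unknown batch system: %s" % system)
        return subprocess.call(cmd)

    if __name__ == "__main__":
        # usage: jsub.py myjob.sh [lsf|pbs]
        sys.exit(submit(sys.argv[1], sys.argv[2] if len(sys.argv) > 2 else "lsf"))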
For installing PCs, Jefferson currently uses  HYPERLINK "http://www.toolinux.com/linutile/configuration/kickstart/" Kickstart complemented by post-installation scripts, but they are looking at PXE-equipped (Preboot Execution Environment) motherboards and use of the  HYPERLINK "http://www.isc.org/products/DHCP/" DHCP protocol in the future. For upgrades, the  HYPERLINK "http://www.kaybee.org/~kirk/html/linux.html" autoRPM tool is used but new kernels are installed by hand. Redhat version upgrades are performed in a rolling manner. There is no significant use yet of remote power management or remote console management. For checkout of new systems they use  HYPERLINK "http://www.iaeste.dk/~henrik/projects/mprime.html" Mprime, a public domain tool; system monitoring is also done with a public domain tool ( HYPERLINK "http://www.kernel.org/software/mon" mon), which they accept is perhaps simplistic but consider sufficient for their needs. Statistics produced by LSF are collected and used for performance measurements. A major issue is the pressure on floor space for new systems, especially with their need to add yet more tape storage silos. Operation of their centre is lights-out. Having covered the broad range of work done at the lab, Ian offered the definition of a small site as one where "we just have 2 to 3 times fewer people to do the same jobs as at the larger sites".
 HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/RunningCluster.ppt" CCIN2P3, Lyon (Wojciech Wojcik)
The computer centre at IN2P3 operates a multi-platform, multi-experiment cluster. It has to support a total of 35 different experiments spread across several scientific disciplines, at sites on several continents. The centre has multiple platforms to support, currently 4 but reducing shortly to 3. The reasons for this are largely historical (the need to re-use existing equipment) but are also partly dictated by limited staff resources. The architectures supported are AIX, Solaris and Linux, with HP-UX being the one planned for near-term phase-out. There are three clusters - batch, interactive and data - all heterogeneous. The large central batch farm is shared by all the client experiments and users are assigned to a given group or experiment by executing a particular profile for that group or experiment. Home directories are stored on  HYPERLINK "http://www.transarc.ibm.com/Product/EFS/AFS/index.html" AFS, as are many standard utilities accessed in /usr/local. Data is accessed from local disc, staged there on request. Many experiments use  HYPERLINK "http://consult.cern.ch/writeup/coreuser/node14.html" RFIO for data access (originally from CERN, augmented locally at IN2P3); some experiments (for example BaBar) use  HYPERLINK "http://www4.clearlake.ibm.com/hpss/index.jsp" HPSS and  HYPERLINK "http://www.objectivity.com/" Objectivity for data access; others use  HYPERLINK "http://doc.in2p3.fr/man/xtage.html" Xtage to insulate the user from the tape technology in use (this is common at most sites now, although many have their own locally-developed data stager). The batch scheduler used is  HYPERLINK "http://webcc.in2p3.fr/man/bqs/intro" BQS, a local development from some time ago, still maintained and available on all the installed platforms. Jobs are allocated to a node according to CPU load, the architecture demanded, group allocations and so on. The latest acquisitions at IN2P3 included a major PC purchase where the contract went to IBM, who were also responsible for the physical installation in racks in the centre. IN2P3 is also now gaining experience with SANs; they currently have some 35TB of disc space connected. Being only a data processing site and supporting multiple experiments, IN2P3 see many problems in exchanging data, not only with respect to network bandwidth for online data transfer but also for the physical export and import of data. CERN was using HPSS and is now switching to  HYPERLINK "http://wwwinfo.cern.ch/pdp/castor/Welcome.html" CASTOR. SLAC (BaBar) uses HPSS. Fermilab (D0) uses  HYPERLINK "http://www.rl.ac.uk/cisd/Odds/hepix99/petravic/show.html" Enstore. How can a small centre be expected to cope with this range? Could we, should we, develop a higher level of abstraction when dealing with mass storage? In addition, they are expected to licence commercial products such as  HYPERLINK "http://www.objectivity.com/" Objectivity in order to support their customers.
Apart from the financial cost of this, it only adds to the number of parameters to be considered in selecting a platform or in upgrading a cluster. Not only must they minimise disruption across their client base, they must also consider whether a particular product, or a new version of a product, is compatible with the target environment. A similar challenge is the choice of UNIX environment to be compatible with their multiplicity of customers - for example, which version of a given compiler should be installed on the cluster (in fact they are currently required to install 3 versions of the C compiler).
Software Panel
The questions posed to this panel, chaired by Ian Bird, were:
- How do you select software tools? By reputation, from conference reports, after in-house evaluation, by personal experience, etc.? Obviously all of these may play a role; which are the 3 most important, in order of significance?
- Do you trade off personnel costs against the cost of acquiring commercial tools?
 HYPERLINK "http://conferences.fnal.gov/lccws/papers/tues/22WrightCondor.ppt" CONDOR (Derek Wright, University of Wisconsin)
 HYPERLINK "http://www.cs.wisc.edu/condor/" Condor is a system of daemons and tools that harnesses desktop machines and computing resources for high-throughput computing. Thus it should be noted that Condor is not targeted at High Performance Computing but rather at making use of otherwise unused cycles. It was noted that the average user, even at peak times during the working day, seldom uses more than 50% of his or her CPU as measured on an hourly basis. Condor has a central matchmaker that matches job requirements against resources that are made available via so-called Class Ads published by participating nodes. It can scale to managing thousands of jobs, including inter-job dependencies via the recently developed  HYPERLINK "http://www.cs.wisc.edu/condor/manual/v6.2/2_10Inter_job_Dependencies.html" DAGMan. There is also centralised monitoring of both jobs and nodes and a degree of fault tolerance. Making jobs checkpointable (for example by linking against the Condor libraries) helps Condor but is not essential. Owners of participating nodes can vacate their systems of current jobs at any time and at no notice. Condor is easy to set up and administer since all jobs and resources are under the control, or at least the responsibility, of a central node. There are no batch queues to set up and administer; job control is via a set of daemons. Examples of Condor pools in current use include  HYPERLINK "http://www.bo.infn.it/calcolo/condor/index.html" INFN HEP sites (270 nodes), the CHORUS experiment at CERN (100 nodes),  HYPERLINK "http://caulerpa.marbot.uni-bremen.de/crit/nph-med.cgi/http://www.nas.nasa.gov/Pubs/Highlights/1999/19990302.html" NASA Ames (330 nodes) and  HYPERLINK "http://www.ncsa.uiuc.edu/" NCSA (200 nodes). At its main development site at the  HYPERLINK "http://www.cs.wisc.edu/condor/uwflock/" University of Wisconsin at Madison, the pool consists of some 750 nodes, of which 350 are desktop systems. The speaker noted that the Condor development team consists largely of ex-system administrators and this helps focus development from that viewpoint. One interesting feature of Condor is the eventd daemon, which can be used to run down a Condor node before a scheduled event, such as a reboot, takes place. For  HYPERLINK "http://www-unix.mcs.anl.gov/mpi/fastchoice.html" MPI applications, Condor can gather together a required number of nodes before scheduling the jobs in parallel.
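The matchmaking step can be pictured with a much-simplified sketch: job requirements are compared against resource "ads" published by the nodes. Condor's real ClassAd language is far richer, and the attribute names below are invented:

    machine_ads = [
        {"name": "desktop17", "memory_mb": 256, "arch": "INTEL", "idle": True},
        {"name": "node042",   "memory_mb": 512, "arch": "INTEL", "idle": False},
    ]

    job = {"min_memory_mb": 200, "arch": "INTEL"}

    def matches(ad, job):
        return (ad["idle"]
                and ad["arch"] == job["arch"]
                and ad["memory_mb"] >= job["min_memory_mb"])

    candidates = [ad["name"] for ad in machine_ads if matches(ad, job)]
    print("job can run on:", candidates)        # -> ['desktop17']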
Current work in Condor includes the addition of  HYPERLINK "http://web.mit.edu/kerberos/www/" Kerberos and  HYPERLINK "http://www.openssl.org/docs/apps/x509.html" X509 authentication, now in beta test, and the use of encrypted channels for secure data movement (this could include  HYPERLINK "http://www.transarc.ibm.com/Product/EFS/AFS/index.html" AFS tokens). A lot of work is going on to study how best to schedule I/O. Another issue for the future is Condor's scalability beyond 1000 nodes, which is (relatively at least) untested. Condor will shortly be packaged in  HYPERLINK "http://www.rpm.org/" RPM format for Linux, but it should be noted that Condor is NOT Open Source: binaries are freely available for about 14 platforms but the sources must be licensed. Another option is to take out a support contract with the Condor team, with guaranteed response times to submitted problems.
Meta-Processor Platform (Robert Reynolds)
This is based on the seti@home scheme and is the largest distributed computing project in history. It currently has some 3 million users with over 1 million nodes for a total of 25 Teraflops! It can be packaged to gather resources across the Internet or within an Intranet. Current applications include cancer research as well as the more renowned search for extra-terrestrial intelligence (SETI). One particular application (cancer research) has more than 400,000 members on 600,000 nodes scanning more than 25,000 molecules per second; the sustained throughput is 12 Tflops per day with peaks reaching 33 Tflops. The scheme is based on aggregating resources measured in millions of devices: there is a small (the suggested maximum is 3MB of code plus a maximum of 3MB of resident data), unobtrusive, self-updating agent on the node running at low priority just above the idle loop; indeed it can replace the screen saver. It communicates with a central master by way of an encrypted channel, sharing code that has been authenticated by electronic signatures. The node owner or user has control over when this agent runs and when and how it should be pre-empted. Agents and code are available for Linux (the initial platform) and Windows (where most of the current resources come from). Some 80% or more of the nodes have Internet connection speeds equivalent to ISDN or better. The scheduling involves sufficient redundancy - running the same job N times on different clients - to allow for the fact that the client owner may switch his or her system off at any moment. For people interested in porting their applications to this tool there is a software development kit for code written in C, C++ and Fortran. Apart from the above-mentioned applications, another example is to use this platform to stress-test a web site: how many simultaneous accesses can it cope with? This is typical of the ideal application: coarse-grained parallelism, a small application code footprint and small data transfers.
Panel A3 - Monitoring
The question posed to this panel, chaired by Olof Barring, was:
- Do you monitor services or servers? In other words, do you monitor that a service is being delivered or that a particular hardware or software status is faulty?
 HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/chan.ps" BNL (Tony Chan)
BNL uses a mixture of commercial and locally developed tools for monitoring.
They monitor both hardware and software as well as the health of certain services such as  HYPERLINK "http://www.transarc.ibm.com/Product/EFS/AFS/index.html" AFS and  HYPERLINK "http://www.platform.com/products/LSF/" LSF; for LSF they use tools provided by the vendor of the product. They also monitor the state of what might be called infrastructure: UPS state, cooling system, etc. Alarms are signalled by beeper and by electronic mail. The VA Linux-developed tool  HYPERLINK "http://vacm.sourceforge.net/" VACM is used to monitor hardware system status and permits a limited number of actions to be performed, including a power cycle. From open source modules they have produced a scheme whereby users have access via the web to the current state of CPU load and some similar metrics.
 HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/ngop_lccws.ppt" NGOP (Tanya Levshina, FNAL)
The  HYPERLINK "http://www-isd.fnal.gov/ngop/" NGOP project at Fermilab is targeted at monitoring heterogeneous clusters running a range of services for a number of groups. Its goals are to offer active monitoring, to provide problem diagnostics with early error detection and correction, and also to display the state of the services. Before starting the project, the team looked at existing tools, public domain and commercial, and decided that none offered the flexibility or adaptability they felt they needed. Reasons cited included the limited off-the-shelf functionality of some tools, anticipated difficulties in integrating new packages into the tools, high costs (both initial and for ongoing support) and doubts about scalability without yet further investment. Instead they would have to develop their own. The current, first implementation targets exceptions and diagnostics and it monitors some 6500 objects on 512 nodes. Objects monitored include the presence of certain critical system daemons and file systems, the CPU and memory load, the number of users and processes, disk errors and NFS timeouts, and a few hardware measures such as baseboard temperatures and fan speeds. It stores its information in an ORACLE database, is accessed via a GUI and there is a report generator. It is mostly written in Python with a little C code. XML (partially MATHML) is used to describe the configurations. At the present time it monitors objects rather than services, although NGOP users can use the framework to build deductive monitors and some system administrators have indeed done this, for example the mail support team. Plans for the future include support for scaling up to 10,000 nodes, providing a callable API for monitoring a client and adding historical rules with escalating alarms.
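The kind of per-node checks listed above (daemon presence, filesystem usage and so on) can be sketched in a few lines of Python, the language most of NGOP itself is written in. This is an illustration only, not NGOP code; the metric names are invented:

    import os
    import subprocess

    def daemon_running(name):
        # crude presence check using ps; a real agent would be more careful
        out = subprocess.run(["ps", "-e", "-o", "comm"],
                             capture_output=True, text=True).stdout
        return name in out.split()

    def disk_used_percent(path="/"):
        st = os.statvfs(path)
        return 100.0 * (st.f_blocks - st.f_bfree) / float(st.f_blocks)

    metrics = {
        "sshd_running": daemon_running("sshd"),
        "root_fs_used_pct": disk_used_percent("/"),
    }
    print(metrics)      # a real agent would ship this to a central collector instead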
 HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/PEM_LCCWS.ppt" PEM (Olof Barring, CERN)
Like Fermilab, the CERN PEM team performed a tools survey and, like Fermilab, they decided that they needed to build something tailored to their needs and available resources. Like the NGOP team, they designed an agent-based scheme, but they decided from the beginning that they would target services, a correlation engine being part of the original plan. Scalability would be built in by incorporating brokers that interface to the front-end monitoring agents, currently assigning one broker for every 50 agents. In the current, first version of  HYPERLINK "http://proj-pem.web.cern.ch/proj-pem/" PEM, some 400 nodes are monitored with about 30 measurements taken every 30 seconds. It is planned soon to extend this test scheme to 1000 nodes. Like  HYPERLINK "http://www-isd.fnal.gov/ngop/" NGOP, PEM is designed to be active: it should react and take corrective action if it recognises a problem such as a measurement outside defined limits. The recovery action can be implemented via a broker and/or signalled by an alarm to a registered listener. PEM has a measurement repository to store all data from the brokers. The data is physically stored via JDBC into an ORACLE database; initially this gave scaling problems but work on the interfacing, especially a better implementation of JDBC and better threading of the code, solved this. The correlation engine, today still only a plan, will transform simple metrics into service measurements. Currently, most monitoring code is written in scripts, which permits fast prototyping and easy access to protocols. In the future, PEM plans to duplicate all modules to allow for scaling to very large numbers of nodes. The PEM prototype will be used for the first European DataGrid fabric monitoring tests (see below), with various constituent parts (for example, the configuration manager and the underlying data transport layer) being replaced later by those from the DataGrid project.
Discussion
There was a discussion about how much measurement data should be stored and for how long. CERN considers that saving historical data is essential in order to be able to plot trends. BNL agreed; they save all monitored data, although in condensed mode (partially summarised). With the number of nodes eventually planned for the CERN centre, a PEM database cluster might become necessary. For PEM, the currently assigned resources are 14 people part-time, totalling about 4-5 FTE. FNAL have about the same number of people contributing to NGOP but for a total of only about 1 FTE. One of the most basic questions to answer is whether to buy a commercial product or to develop something in-house. In favour of the former is the sheer amount of local resources needed to develop a viable tool: IN2P3 for example had started a monitoring project last year with a single person and was recently forced to abandon it (they are currently evaluating NGOP and also talking to the PEM team). On the other hand, commercial tools are typically very large, range from expensive to very expensive and need a lot of resources to adapt to local requirements on initial installation. However, their ongoing resource needs are usually much less than those of in-house projects, which often need much support over their lifetimes. According to some of the audience, building the framework is not where most of the work is; the difficult part is deciding what to measure. Others disagreed: a good, scalable architecture is very important, and a subscription-driven correlation engine is a good step. Which communities is monitoring for? System administrators need it for monitoring status and performance and for alarms. Application writers need performance data. Everyone wants to know about exceptional performance states.
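The broker-plus-thresholds idea can also be shown as a conceptual sketch (invented metric names and limits; the real PEM correlation engine is far more general): a broker gathers samples from its ~50 agents and raises an alarm, or triggers a recovery action, when a value falls outside its defined limits:

    LIMITS = {"load1": (0.0, 20.0), "tmp_used_pct": (0.0, 95.0)}

    def check(node, sample):
        alarms = []
        for metric, value in sample.items():
            low, high = LIMITS.get(metric, (float("-inf"), float("inf")))
            if not low <= value <= high:
                alarms.append("%s: %s=%s outside [%s, %s]" % (node, metric, value, low, high))
        return alarms

    reports = {"node001": {"load1": 3.2, "tmp_used_pct": 97.0},
               "node002": {"load1": 0.4, "tmp_used_pct": 40.0}}
    for node, sample in reports.items():
        for alarm in check(node, sample):
            print("ALARM", alarm)          # or hand off to a recovery action via the broker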
Apart from the sites listed above, other sites reported on various monitoring tools. SLAC has a locally-developed tool ( HYPERLINK "http://www.jlab.org/hepix-hepnt/presentations/Ranger_Update/" Ranger) for relatively simple tasks such as monitoring daemons and restarting those which die or hang. The San Diego Supercomputer Center uses  HYPERLINK "http://www.millennium.berkeley.edu/ganglia/" Ganglia from Berkeley, chosen because it was freely available and appeared to satisfy their immediate needs. LBNL (NERSC) looked at various public domain products (including  HYPERLINK "http://www.kernel.org/software/mon/" mon and  HYPERLINK "http://www.maclawran.ca/bb-dnld/" Big Brother) but found they did not scale well; they have now (in the last few weeks) settled on  HYPERLINK "http://www.netsaint.org/" NetSaint, which seems better and more extensible. It relies on  HYPERLINK "http://snmp.cs.utwente.nl/" SNMP daemons. Another tool mentioned was  HYPERLINK "http://www.platform.com/products/siteassure/index.asp" SiteAssure from Platform: it is rule-based - if this condition is true, perform that action. As described above,  HYPERLINK "http://vacm.sourceforge.net/" VACM relies on so-called mayors to control 30-40 nodes each, and scalability can be achieved by designating super-mayors to control mayors. But VACM only monitors node availability; how does one measure a service which might depend on multiple nodes?
Panel B3 - User Issues, Security
The questions posed to this panel, chaired by Ruth Pordes, were:
- Do you have written policies for users - non-abuse of the system, the right to check e-mail, the right to enforce password rules?
- Do you have a dedicated security team?
- Do you permit access from off-site, and do you enforce rules for this?
13.1  HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/FNAL Security & Authentication (LCCWS).ppt" Fermilab Strong Authentication Project (Mark Kaletka)
Like other large labs, Fermilab has established a  HYPERLINK "http://www.fnal.gov/cd/main/cpolcy.html" set of rules that users must accept in order to use computer services. These include rules relating to security, and Fermilab has created a security incident response team with well-practised response processes. A major concern of the team is data and file backup, since the most damaging incidents are those that destroy data or make it unavailable. There is rather less concern, at least among the users, about data privacy. Probably the largest concern, however, is that hacked systems can be used to launch further attacks, both inside the lab and towards the worldwide Internet. A project is underway to implement strong authentication, which is defined as "using techniques that permit entities to provide evidence that they know a particular secret without revealing the secret". The project addresses approximately half of the problems that have been determined to be root causes of incidents. One of the basic tools being used is  HYPERLINK "http://web.mit.edu/kerberos/www/" Kerberos version 5 with some local enhancements, together with  HYPERLINK "http://www.cryptocard.com/" CryptoCard challenge-and-response one-time passwords. Rollout of the project has begun with the goal of creating a secure realm across the site by the end of 2001, with a few declared exceptions. Strong authentication for access to computer farms brings extra issues such as the secure handling of host and user service keytabs across the farm, authenticating processes which do not belong to an individual, and so on. In the discussion, it was remarked that encryption and security options could very badly affect the performance of data transfer. A major issue is how to deal with trust relationships with other realms: the project team are technically comfortable that this works because of internal tests with migrations, but they have not yet gone through the steps of negotiating a trust with another HEP realm. To the question of whether anyone has tested the Java ssh client for web access, apparently there was some initial interest but no production use.
13.2 Security at the University of Minnesota (Mike Karo)
The support team is charged with looking after the security of about 100 computers. Most of these have non-dedicated functions, so any user can log on and do what they choose; this drives the need for any of the machines to accept logins. The first obvious security measure is to disable as many unnecessary services as possible; for example, only one machine accepts incoming telnet or ftp. How to enforce security when using the batch features? The simplest approach is to issue warnings to users asking them not to exhibit bad behaviour. Unfortunately for them, the university mandates that they should not use firewalls, because of the public funding of the university and the desire to keep broad public access. They are looking at  HYPERLINK "http://www.sun.com/products-n-solutions/hardware/infoappliances.html" SunRays as a method of providing privacy and simplicity of administration, but they discovered that the IP/UDP traffic for these machines does not handle network congestion well. They are using smartcards for physical authorization. Another area of study is to understand best practices in the realm of the human interface (GUI). Do you have to select/customize window managers? What value have people found in developing GUIs/custom interfaces to, for example, batch services? Is this a support albatross for limited gain, or is it an interface level that allows for the insertion of necessary local customizations and protection against migrations? It was reported that Jefferson Lab users are looking for a common GUI to aid the training of new users, and University of Minnesota users are looking for the same benefits. Users at Wisconsin require access from Palm Pilots and hence a more esoteric access method and format. The question was raised whether  HYPERLINK "http://www.globus.org/" Globus allows for an abstraction layer above the various batch systems; there appears to be very limited experience with it yet. The speaker was asked if he had looked at  HYPERLINK "http://www.accelrys.com/products/gcg_wisconsin_package/" GCG. It is apparently free to academic sites but it is heavyweight and rather keyed to the biophysics community.
13.3 Discussions
Do people have experience with  HYPERLINK "http://saaz.lanl.gov/LSF/LSF_page5.html" LSTcsh from Platform Inc.? It could be a useful method for novices to access the broader resources of an LSF cluster. About 30% of the attendees were using Kerberos. One drawback is the effort involved in creating a centralized registry of accounts; NERSC has the additional problem of being a global resource provider. About 30% of people regularly run  HYPERLINK "http://www.users.dircon.co.uk/~crypto/download/c50-faq.html" crack on their clusters. Wisconsin appears to have deployed a method of generating Kerberos tickets from  HYPERLINK "http://csrc.nist.gov/pki/" PKI certificates. They offered the conclusion that Kerberos turns out to be not very useful on a cluster and even impossible between clusters. Question: is anyone worrying about application authentication? It appears that the answer is no. Instead, people are currently trying to address the problem by restricting sensitive data to private networks, internal to the sensitive applications. How will we protect against people using their Kerberos passwords as Web passwords? What are people looking toward for the registry of large user databases? Globus will require a mapping interface at each site to present a list of users. We need to distinguish between authentication and authorization.
To the question of whether people are clear on the distinction between security policy and usage policy there was rather a lot of silence! What do different sites do about creating secure environments? Most farms are behind a firewall but, for example, the University of Wisconsin and FNAL are on the public net. Are there centralized methods of dealing with patch selection and installation in relation to security issues? There is usually someone charged with watching the lists and spreading the word. NERSC has 4-5 FTEs, Sanger has 1 dedicated security person, SLAC has 3, JLab has 1, and FNAL has 2. Some university colleges have a floor warden for scans and audits. The University of Wisconsin largely accepts centralized administration. The general theme in the audience seems to be that user administration of machines is a slippery slope to chaos; 100% of the people present administered their own machines. In biophysics, the main concern is the loss of results, which translates directly into a loss of dollars. In addition, bio-physicists are concerned about being a target of upset people. What sort of training is done on security issues? SLAC requires mandatory training for all users; Sanger has a public training series. Are there scaling problems anticipated with 1000-node clusters? Uniformity of a cluster is a big advantage, as is limited direct login to the machines; this helps immensely in dealing with the clusters. Getting the time for maintenance and reconfiguration for (urgent) security patches is probably the biggest question. Scalability of the administration tools will necessarily provide the capabilities for fast turnaround of machine updates.
Panel A4 - Grid Computing
This panel was chaired by Chuck Boeheim.
14.1  HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/WP4_LCCWS.ppt" European DataGrid (Olof Barring, CERN)
The  HYPERLINK "http://www.eu-datagrid.org/" European DataGrid project is funded for 3 years by the European Commission for a total of some 10M Ecus. There are 6 principal partners and 15 associate partners. The project is split into 12 work packages: 5 for middleware, 3 for applications (HEP, Earth Observation and Bio-informatics), 2 concerned with networking and establishing a testbed, and 2 for administration. The work package of most concern to the provision of clustering is  HYPERLINK "http://cern.ch/hep-proj-grid-fabric" Work Package 4, Fabric Management. WP4 is charged with delivering a computing fabric comprised of the tools necessary to manage a centre providing grid services on large clusters. There are some 14 FTEs available, spread across 6 partners. WP4 has identified 6 sub-tasks and the interfaces between them:
- Configuration management: databases of permitted configurations
- Software installation: including software repositories, bootstrap procedures and node management services
- Monitoring
- Fault tolerance
- Resource management
- Gridification, including security
The DataGrid project started in January 2001 and much of the current effort is to gather the detailed requirements of the work packages, especially the interfaces between them, and to define a global architecture. The overall timetable specifies major milestones with testbed demonstrations of current grid functionality at 9, 21 and 33 months. The first prototype, due this September, will be based largely on Globus but in the longer term the work packages will match the user requirements (now being gathered) with what is available and what can be developed.
For this first prototype, WP4 will be able to provide an interim installation scheme and will make available only low-level queries. It will use  HYPERLINK "http://www.dcs.ed.ac.uk/home/paul/publications/ALS2000/" LCFG from the University of Edinburgh for software maintenance,  HYPERLINK "http://sourceforge.net/projects/systemimager/" SystemImager from VA Linux for the initial software installation and  HYPERLINK "http://vacm.sourceforge.net/" VACM, also from VA Linux, for console control. It is assumed that a fabric will execute local jobs as well as jobs submitted via the Grid and a means must be found for these to co-exist. WP4 must take account of multiple operating systems and compiler combinations, and the question is whether to store these centrally and replicate them at Grid sites, request them on demand, store them on local discs, or send them along with client jobs.  HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/ppdg_lccws.ppt" PPDG/GriPhyN (Ruth Pordes, FNAL)  HYPERLINK "http://www.ppdg.net/" PPDG is a US project involving 6 High-energy and Nuclear Physics experiments and 4 Computer Science groups as well as 4 US scientific laboratories. Among the partners are the  HYPERLINK "http://www.globus.org/" Globus team, the  HYPERLINK "http://www.cs.wisc.edu/condor" Condor team and the producers of  HYPERLINK "http://www.npaci.edu/DICE/SRB/" SRB (Storage Request Broker from the San Diego SuperComputer Center). In all there are some 25 participants. It is expected soon to be funded by the US Department of Energy (DoE) for 3 years. It is a follow-on to a similarly named smaller project, which emphasised networking aspects only. In this incarnation the proposal is to build an end-to-end integrated production system for the named experiments (see overhead). To date, PPDG work has achieved 100 MB/sec point-to-point file transfers using common storage management interfaces to various mass storage systems. It has also developed a file replication prototype ( HYPERLINK "http://cmsdoc.cern.ch/cms/grid/" GDMP, the Grid Data Mirroring Package). The proposal to the DoE lists a number of milestones and deliverables and, in detail, many of these and the tasks required to achieve them resemble those of the  HYPERLINK "http://www.eu-datagrid.org/" European DataGrid, with the notable exception that the US project has no equivalent of the DataGrid's  HYPERLINK "http://cern.ch/hep-proj-grid-fabric" Work Package 4, Fabric Management. The project is in an early stage but already there are concerns that the various client experiments need to resolve their differences (for example in data handling models) and work together, especially since they are at different stages of their lifecycles (BaBar in production; FNAL Run II just getting started; the LHC experiments in development, and will be for some time). Going into some detail on PPDG activities, the first major one is to develop a prototype to replicate Monte Carlo data for the CMS experiment between CERN and some remote sites. There is a first prototype for  HYPERLINK "http://www.objectivity.com/" Objectivity files, with flat file support to be added next. A second activity is to provide job definition and global management facilities for the analysis of Fermilab's D0 experiment. The starting point is D0's existing SAM database (see the Nikhef cluster talk in Section 10 above), which offers file replication and disc caching. Condor services will be used in this activity.  
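To make the mirroring idea behind a replication tool such as GDMP concrete, here is a minimal Python sketch that compares a remote file catalogue with the local one and lists what still needs to be transferred. The catalogue format (file name mapped to a checksum) is an assumption made for this example; the real package uses richer metadata and actual transfer machinery.

# Illustrative sketch only: decide which files a mirror still needs.
def files_to_replicate(remote_catalogue, local_catalogue):
    """Return files present in the remote catalogue but missing or stale locally."""
    todo = []
    for name, checksum in remote_catalogue.items():
        if local_catalogue.get(name) != checksum:
            todo.append(name)
    return sorted(todo)

if __name__ == "__main__":
    remote = {"run001.db": "a1", "run002.db": "b2", "run003.db": "c3"}
    local  = {"run001.db": "a1", "run002.db": "XX"}   # second copy is stale
    print(files_to_replicate(remote, local))          # -> ['run002.db', 'run003.db']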
HYPERLINK "http://www.griphyn.org/" GriPhyN is a 3-year project, funded by the National Science Foundation (NSF). It is essentially a computer science research project involving 17 university departments plus the San Diego SuperComputer Center, 3 US science labs and 4 physics and astrophysics experiments. It started formally in September 2000 and has some 40 participants. GriPhyn addresses the concept of virtual data is it cheaper to access the processed data from a remote site or recalculate it from the raw data? The project should develop a virtual data toolkit offering transparency with respect to location but also with respect to materialization of the data. The results should be applicable to a Petascale Datagrid. University computer science departments will do much of the work and the developed software should be stored in a central repository and made available in the public domain. This differs from the PPDG model where work will be done across the collaboration, should transition to the participating computer science departments and experiments would get the software from there. In summary, the high-level challenge facing all the Grid projects is how to translate the technology into something useful by the clients, how to deliver the G word? An added complication is the mushrooming of grid projects, European and US. How to make them communicate and interact, especially since many have common user groups and experiments? Panel B4 -- Application Enviroment, Load Balancing The questions posed to this panel, chaired by Tim Smith, were ---- What kinds of applications run on the cluster? Does the cluster support both interactive and batch jobs Is load balancing automatic or manual? 15.1  HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/cdfl3_online.pdf" CDF Online Cluster (Jeff Tseng, MIT) The principle application for this cluster is real-time event filtering and data recording. The movement from mainframe to clusters is a phenomenon seen in the experimental online world also. [In fact, can this not be extended to the general question of instrumentation and business?] The CDF online cluster consists of some 150 nodes, dual CPU Pentium IIs and IIIs running Linux (expected to expand to 250 nodes by the end of 2001). Their target configuration was one costing less than $2000 per node and they have usually been able to pay less than this. The systems are packaged in commodity boxes on shelves. They are interconnected by a bank of 9  HYPERLINK "http://www.3com.com/products/en_US/prodlist.jsp?tab=cat&pathtype=purchase&cat=11&selcat=LAN+Switches+%28Stackable%2FFixed%29&family=325" 3COM Fast Ethernet switches. Maximum data logging rate at  HYPERLINK "http://www-cdf.fnal.gov/upgrades/upgrades.html" CDF under Run II conditions is 20 MB/s output and this is the current limiting bottleneck. Availability of the cluster must be 100% of the accelerator operation to avoid losing data so there is no regular schedule for maintenance possible. The data taking is subject to a 5 minute start/stop warnings and long uptimes (data waits for no one).  HYPERLINK "http://www.corba.org/" CORBA is used to ensure truly parallel control operations. LAN performance is crucial to operation and this has driven them to install a private network for the online cluster. In short, data taking down time must be determined by the Fermilab Collider, not by the cluster. 
The nodes are largely interchangeable; they are tested on arrival as follows: against the  HYPERLINK "http://www-oss.fnal.gov/fss/documentation/linux/521/rpm2html/521/fermi-benchmark-1.0-1.i386.html" Fermilab standard benchmark (tiny); the CPUs are tested under load for heating effects; and the disc I/O rates are tested against the specifications. On the cluster they are running the full  HYPERLINK "http://www-cdf.fnal.gov/upgrades/computing/computing_cont.html" offline environment and they are using this for the trigger. This raises the problem of delivering an executable to an arbitrary machine, and more work is needed because the problem is not yet solved. The filter executable and libraries are ~100 MB. The trigger table and databases are also ~100 MB, and they must regularly distribute 100 MB files to 150 machines within 5 minutes. For this they have developed a pipelined copy program, which is MPI-like without being MPI. A major effort has gone into online monitoring of the data because the discovery of data corruption further down the data chain causes great problems. There is a high-throughput error reporting scheme with message collation, dealing with periodic status reports every 4 seconds per node! 15.2  HYPERLINK "http://conferences.fnal.gov/lccws/papers/thur/LCW_balancing.ppt" Application Environment, Load Balancing (Tim Smith, CERN) Question: how do you ensure that the user environment is the same across the production machines? FNAL uses group accounts that they maintain for the login accounts; builds are done by the users on a common build platform or using their own tools. Users are expected to provide specialized tools wherever possible. SLAC uses  HYPERLINK "http://www.transarc.ibm.com/Product/EFS/AFS/index.html" AFS for home directories and binaries/products. Users pre-compile their codes interactively. They are encouraged to use the dedicated build machines available through the batch system. The Sanger Institute's Pipeline has its own dedicated user that the jobs run under, so they get a common environment. Direct logins on the worker nodes are administered directly by  HYPERLINK "http://www.tru64unix.compaq.com/" Tru64 UNIX. CERN distributes home directories with AFS and uses  HYPERLINK "http://wwwinfo.cern.ch/pdp/ose/sue/index.html" SUE/ HYPERLINK "http://wwwinfo.cern.ch/pdp/ose/asis/" ASIS for making the same products available. Given access to one's personal home directory, is the group account structure still needed to keep things together? Most sites said individual accounts are sufficient and they didn't need group accounts. What tools do people use to keep track of the environment? We see occasionally that dynamic libraries are changed underneath the executables and that there is an unknown dependency. One technique seems to be to statically link the executable. However, there are some reasons not to statically link codes: improvements in the library may be desired and would be unavailable; sometimes 3rd party libraries are only available as shared libraries; and static links don't allow for dynamic changes of infrastructure without redistributing the binaries. On the other hand, one argument in favour of static linking is the wish to use variations on the machine configuration (operating system versions). Can we merge static and dynamic linking? Only the infrastructure things independent of the OS could be dynamic. 
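To make clearer why the pipelined copy scheme described above for the CDF online cluster wins over a naive one-to-many copy, here is a rough Python model of the two approaches. The model and all the numbers in it (150 nodes, a 100 MB file, a 12.5 MB/s Fast Ethernet link, 1 MB chunks) are illustrative assumptions, not a description of the actual CDF tool.

def naive_copy_time(nodes, file_mb, link_mb_per_s):
    """One server sends the whole file to each node in turn."""
    return nodes * file_mb / link_mb_per_s

def pipelined_copy_time(nodes, file_mb, link_mb_per_s, chunk_mb=1.0):
    """Each node forwards chunks to its successor while still receiving,
    so the chain adds only one chunk of latency per extra hop."""
    per_chunk = chunk_mb / link_mb_per_s
    return (file_mb / chunk_mb) * per_chunk + (nodes - 1) * per_chunk

if __name__ == "__main__":
    for label, fn in (("naive", naive_copy_time), ("pipelined", pipelined_copy_time)):
        print(f"{label:9s}: {fn(150, 100, 12.5) / 60:.1f} minutes")

Under these assumptions the sequential copy takes roughly 20 minutes while the pipelined chain finishes in well under a minute, which is why a chained scheme fits comfortably inside the 5-minute window quoted above.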
On the subject of load balancing, attention was drawn to  HYPERLINK "http://www-4.ibm.com/software/webservers/edgeserver/iss.html" ISS, which IBM has made into a product and CERN has developed further. ISS is the domain name system (DNS) component of the Network Dispatcher that can be used for wide-area load balancing. ISS balances the load on servers by communicating with load-monitoring daemons installed on each server machine, and then it alters the IP address returned to the client via DNS. Creating too many IP domains could create its own problems. Discussion Question: who uses over-subscription techniques or dynamic load levels dependent on the job type? The most common technique appears to be to allocate more concurrent processes than the total number of processors on large SMPs, but to operate with tight restrictions on the worker machines. Lsbind, authored by a Washington University professor, is a program that can submit to nodes that run LSF directly. LBNL uses the technique of putting batch queues on interactive machines to utilise background cycles on lightly loaded machines. SLAC has a DNS daemon that performs round robin scheduling and is run on all the members of the cluster. The Sanger Institute reports that the DEC TruCluster software takes care of selecting machines within the cluster. They believe that the algorithm is not sophisticated but it works reasonably well. One possible disadvantage of some load balancing techniques is that they hide the actual node names on which jobs are running, whereas knowing these names is very useful for retrieving data from partially completed jobs and for diagnosing errors. Sanger uses the technique of wrapping the critical executables with a maximum CPU-time script to prevent runaways. There are a number of sites that run watcher daemons that renice or kill processes that run amok. There are differences among sites on whether they allow mixing of interactive and batch processing on the same machine. djbdns is a name server from Daniel J. Bernstein that has alternate scheduling algorithms. The next subject discussed was job and queue management. SLAC and FNAL both delegate the queue management to the users. SLAC does this queue by queue from a common cluster and this works fine. How does this work with  HYPERLINK "http://www.cs.wisc.edu/condor" Condor? Condor delegation would be done by access control via the configuration files. SLAC uses the ranking system for determining whose jobs they prefer. The University of Minnesota does this by gang-editing of farm machine configuration files. Is there another method? Is this a hole in dealing with dynamic priorities within stable group definitions? What to do with the "pre-exec" scripts to trap and react to job failures? How do you distinguish between single rogue nodes and problem cases where a failed server can blacklist a whole cluster? The feeling is that declaring a node "bad" is an expensive failure mode for transient errors (which can be the class of problem that can most rapidly sweep across the whole cluster). Condor jobs are by default restarted and can be flagged to leave the queue on failure. Queuing in Condor at the level of about 100,000 jobs seems to cause trouble. There seems to be little experience of running LSF against lists of more than 10K jobs. Job chunking in LSF is a method of reducing the demands on the central scheduler when there are lots of small jobs. This sets up an automatic chain, particularly useful for small jobs. 
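As a toy illustration of the ISS-style DNS load balancing mentioned at the start of this discussion, the sketch below picks which node's address to hand back for a generic cluster alias, based on loads reported by per-node monitoring daemons. The node addresses, load figures and tie-breaking rule are invented for the example and are not taken from ISS itself.

import random

def pick_node(load_reports, tolerance=0.1):
    """Return the address of a node whose reported load is within
    `tolerance` of the minimum, breaking ties at random to spread clients."""
    lightest = min(load_reports.values())
    candidates = [addr for addr, load in load_reports.items()
                  if load <= lightest + tolerance]
    return random.choice(candidates)

if __name__ == "__main__":
    reports = {"192.0.2.11": 0.35, "192.0.2.12": 0.10, "192.0.2.13": 0.12}
    # A DNS front end would return this address for the generic cluster alias.
    print("cluster alias ->", pick_node(reports))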
One needs good tools for finding the hotspots and the holes in execution so that people can find when the systems run amok. Effort is underway in Condor to determine the rogue machines and to provide methods for jobs to "vote" against machines and find the anti-social machines. Will this all scale up to clusters of 5000 machines? New question: given a 10K node cluster with 100K jobs a day, how do you digest the logs to present useful information to system administrators, let alone the individual users? What sort of reports are you going to need to show to the management and the funding agencies in order to justify decisions on the cluster? This seems to be a fruitful area for exploration of needs and capabilities. "Digesting such reports is going to be a high performance computing problem itself in these types of cluster", according to Randy Melen of SLAC. What kind of trend reporting is needed? Do we need to start digesting all these logs into a database and using database tools against the information pool? What sort of size of a problem does this end up dealing with, or creating? Given the usual techniques of dealing with log file growth (roll over on size) there will be a very small event horizon on knowing what is going on with these machines. Will it even be possible given the rate of growth, and will high-performance logging techniques need to be brought to bear? Panel Summaries (Session chair: Alan Silverman) Panel A1 - Cluster Design, Configuration Management Some key tools that were identified during this panel included:  HYPERLINK "http://www.iu.hio.no/cfengine/" cfengine -- apparently little used in this community (HENP) but popular elsewhere (does this point to a difference in needs, or just preferences? doubts were expressed as to its scalability but these are thought to be unfounded);  HYPERLINK "http://sourceforge.net/projects/systemimager/" SystemImager and  HYPERLINK "http://sourceforge.net/projects/luis/" LUIS (Linux Unique Init System) -- neither does the full range of tasks for configuration management but there are hopes that the  HYPERLINK "http://www.csm.ornl.gov/oscar/" OSCAR project will merge the best features of both;  HYPERLINK "http://nfs.sourceforge.net/" NFS -- warnings about poor scaling in heavy use; and the  HYPERLINK "http://www-unix.mcs.anl.gov/chiba/" Chiba City tools -- these do appear to be well-designed and to scale up well. Variations in local environments make for difficulties in sharing tools, especially where the support staff may have their own philosophy of doing things. However, almost every site agreed that having remote console facilities and remote control of power was virtually essential in large configurations. On the other hand, modeling of a cluster only makes sense if the target application is well defined and its characteristics well known; this seems to be rarely the case among the sites represented.  HYPERLINK "http://conferences.fnal.gov/lccws/papers/fri/A2summary.txt" Panel A2 - Installation, Upgrading, Testing Across the sites there are a variety of tools and methods in use for installing cluster nodes, some commercial, many locally-developed. Some of the key tools identified for software distribution were: Dolly+, used at KEK as a mechanism that uses a ring topology to support scalable system builds -- it is still in its testing phase and has not been tested at scale. It appears to resolve contention problems where too many clients access a central server at one time. 
Rocks (http://rocks.npaci.edu) is a cluster installation/update/management system, which automates the process of creating a Kickstart file. It is based on standard RedHat 7.1, has been tested on clusters of up to 96 nodes (< 30 minutes to install all 96 nodes), is currently used on more than 10 clusters and is freely available now. In addition, Compaq references Rocks as their preferred cluster integration for customers wanting a 100% freeware solution. CERN has an installation strategy currently being employed as part of their Grid project that leverages Redhat tools and basic shell scripts. They must have something working by September of this year. Many sites purchase hardware installation services but virtually all perform their own software installation. Some sites purchase pre-configured systems, others prefer to specify down to the chip and motherboard level. Three sites gave examples of burn-in of new systems: FNAL performs these on site while BNL and NERSC require the vendor to perform tests at the factory before shipment. VA Linux has a tool,  HYPERLINK "http://sourceforge.net/projects/va-ctcs/" CTCS, which tests CPU functions under heavy load; it is available on  HYPERLINK "http://sourceforge.net/" Sourceforge. On the other hand, cluster benchmarking after major upgrades is uncommon; the resources are never made available for non-production work! Some interest was expressed that the community could usefully share such burn-in tests as exist. But finally the actual target applications are the best benchmarks. Most sites upgrade clusters in one operation but the capacity for performing rolling upgrades was considered important and even vital for some sites.  HYPERLINK "http://conferences.fnal.gov/lccws/papers/fri/MonitoringA3_Summary.ppt" Panel A3 Monitoring Of the three sites that gave presentations, BNL uses a mixture of commercial and home-written tools; they monitor the health of systems but also the NFS, AFS and LSF services. They have built a layer on top of the various tools to permit web access and to archive the data. FNAL performed a survey of existing tools and decided that they needed to develop their own, focusing initially at least on alarms. The first prototype, which monitors for alarms and performs some recovery actions, has been deployed and is giving good results. The CERN team decided from the outset to concentrate on the service level as opposed to monitoring objects, but the tool should also measure performance. Here also a first prototype is running but not yet at the service level. All sites represented built tools in addition to those that come free with a particular package such as LSF. And aside from the plans of both Fermilab ( HYPERLINK "http://www-isd.fnal.gov/ngop/" NGOP) and CERN ( HYPERLINK "http://proj-pem.web.cern.ch/proj-pem/" PEM) ultimately to monitor services, everyone today monitors objects: file system full, daemon missing, etc. In choosing whether to use commercial tools or to develop one's own, it should be noted that so-called enterprise packages are typically priced for commercial sites where downtime is expensive and has a quantifiable cost. They usually have considerable initial installation and integration costs. But one must not forget the often-high ongoing costs for home-built tools as well as the vulnerability to personnel loss/reallocation. There was a discussion about alarm monitoring as opposed to performance monitoring. System administrators usually concentrate first on alarms but users want performance data. 
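As a concrete illustration of the object-level checks mentioned above (file system full, daemon missing), here is a minimal Python sketch of two such alarm probes. The thresholds, mount point and daemon name are placeholders for the example, and the code does not reflect any site's actual monitoring policy or the NGOP/PEM designs.

import os
import subprocess

def disk_alarm(path, threshold=0.9):
    """Return an alarm string if the file system holding `path` is nearly full (POSIX only)."""
    st = os.statvfs(path)
    used = 1.0 - st.f_bavail / st.f_blocks
    if used > threshold:
        return f"ALARM: file system {path} is {used:.0%} full"

def daemon_alarm(name):
    """Return an alarm string if no process with this exact name is running (uses pgrep)."""
    if subprocess.run(["pgrep", "-x", name], capture_output=True).returncode != 0:
        return f"ALARM: daemon {name} not running"

if __name__ == "__main__":
    for alarm in (disk_alarm("/"), daemon_alarm("sshd")):
        if alarm:
            print(alarm)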
A couple of other tools were mentioned:  HYPERLINK "http://www.netsaint.org/" netsaint (public domain software used at NERSC) and  HYPERLINK "http://www.platform.com/products/siteassure/index.asp" SiteAssure from Platform.  HYPERLINK "http://conferences.fnal.gov/lccws/papers/fri/Panel A4 summary.ppt" Panel A4 Grid Computing A number of Grid projects were presented in which major HENP labs and experiments play a prominent part. In Europe there is the  HYPERLINK "http://www.eu-datagrid.org/" European DataGrid project, a 3-year collaboration by 21 partners. It is split into discrete work packages of which one ( HYPERLINK "http://cern.ch/hep-proj-grid-fabric" Work Package 4, Fabric Management) is charged with establishing and operating large computer fabrics. This project has a first milestone at month 9 (September this year) to demonstrate some basic functionality but it is unclear how this will relate to a final, architectured, solution. In the US there are two projects in this space:  HYPERLINK "http://www.ppdg.net/" PPDG and  HYPERLINK "http://www.griphyn.org/" GriPhyN. The first is a follow-on to a project that had concentrated on networking aspects; the new 3-year project aims at full end-to-end solutions for the six participating experiments. The second, GriPhyN, incorporates more computer-science research and education and includes the goal of developing a virtual data toolkit: is it more efficient to replicate processed data or to recalculate it from raw data? It is noted that neither of these Grid projects has a direct equivalent to the European DataGrid Work Package 4. Is this a serious omission? Is this an appropriate area where this workshop can provide a seed to future US collaboration in this area? All these projects are in the project and architecture definition stage and there are many choices to be made, not an easy matter among such wide-ranging groups of computer scientists and end-users. How and when will these projects deliver ubiquitous production services? And what are the boundaries and overlaps between the European and US projects? It is still too early to be sure how to translate the G word into useful and ubiquitous services. 16.5  HYPERLINK "http://conferences.fnal.gov/lccws/papers/fri/Panel B1 summary.ppt" Panel B1 - Data Access, Data Movement A number of sites described how they access data. Within an individual experiment, a number of collaborations have worldwide pseudo-grids operational today. These readily point toward issues of reliability, allocation, scalability and optimization for the more general Grid. Selected tools must be available free to collaborators in order to achieve general acceptance. For the distribution of data, multicast has been used but difficulties with error rates increasing with data size have halted wider use. The  HYPERLINK "http://www.cs.wisc.edu/condor" Condor philosophy for the Grid is to hide data access errors as much as possible from reaching the jobs. Nikhef are concerned about network throughput and they work hard to identify each successive bottleneck (just as likely to be at the main data center as at the remote site). Despite this, however, they consider network transfers much better than transporting physical media. They note for future experiments that a single active physics collaborator can generate up to 20 TB of data per year. Much of this stored data can be recreated, so the challenge was made: why store it, just re-calculate it instead? 
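The store-versus-recompute challenge just raised (and the GriPhyN virtual data question above) comes down to a simple cost comparison. The back-of-the-envelope Python sketch below contrasts the time to fetch a processed dataset over the wide-area network with the time to re-derive it from raw data held locally; every number in it is an invented placeholder, and the point is only the shape of the trade-off.

def fetch_time(size_gb, wan_mb_per_s):
    """Seconds to transfer the processed dataset over the wide-area network."""
    return size_gb * 1024 / wan_mb_per_s

def recompute_time(n_events, sec_per_event, n_cpus):
    """Seconds to re-derive the dataset from locally held raw data."""
    return n_events * sec_per_event / n_cpus

if __name__ == "__main__":
    t_fetch = fetch_time(size_gb=50, wan_mb_per_s=2)                        # assumed WAN share
    t_redo = recompute_time(n_events=10_000_000, sec_per_event=0.5, n_cpus=200)
    print(f"fetch: {t_fetch / 3600:.1f} h, recompute: {t_redo / 3600:.1f} h")
    print("re-calculate locally" if t_redo < t_fetch else "fetch the stored copy")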
The Genomics group at the University of Minnesota reported that they had been forced to use so-called opportunistic cycles on desktops via Condor because of the rapid development of their science. The analysis and computation needs expected for 2008 had already arrived because of the very successful Genome projects. Turning to storage itself as opposed to storage access, one question is how best to use the capacity of the local disc, often 40 GB or more, which is delivered with the current generation of PCs. Or, with the Grid coming (but see the previous section), will we need any local storage? 16.6  HYPERLINK "http://conferences.fnal.gov/lccws/papers/fri/lccws-b2-summary.ppt" Panel B2 - CPU and Resource Allocation This panel started with a familiar discussion - whether to use a commercial tool, this time for batch scheduling, or to develop one's own. Issues such as vendor licence and ongoing maintenance costs must be weighed against development and ongoing support costs. A homegrown scheme possibly has more flexibility but more ongoing maintenance. A poll of the sites represented showed that some 30% use  HYPERLINK "http://www.platform.com/products/LSF/" LSF (commercial but it works well), 30% use  HYPERLINK "http://www.openpbs.org/" PBS (free, public domain), 20% use  HYPERLINK "http://www.cs.wisc.edu/condor" Condor (free and good support) and 2 sites (FNAL and IN2P3) had developed their own tool. Both IN2P3 ( HYPERLINK "http://webcc.in2p3.fr/man/bqs/intro" BQS) and FNAL ( HYPERLINK "http://www-isd.fnal.gov/fbsng/" FBSng) cited historical reasons and cost sensitivity. CERN and the CDF experiment at FNAL are looking at  HYPERLINK "http://www.mosix.cs.huji.ac.il/txt_main.html" MOSIX but initial studies seem to indicate a lack of control at the node level. There is also a low-level investigation at DESY of Codine, recently acquired by SUN and offered under the name  HYPERLINK "http://www.sun.com/software/gridware/" SUN Grid Engine.  HYPERLINK "http://conferences.fnal.gov/lccws/papers/fri/LCCWS_B3_summary.pdf" Panel B3 Security Large sites such as BNL, CERN and FNAL have formal network incident response teams. These sites have looked at using  HYPERLINK "http://web.mit.edu/kerberos/www/" Kerberos to improve security and reduce the passage of clear-text passwords; BNL and Fermilab have carried this through to a pilot scheme. On the other hand, although password security is taken very seriously, data security is less of an issue, largely of course because of the need to support worldwide physics collaborations. Other security measures in place at various sites include: disabling as many server functions as possible; firewalls at most sites, but with various degrees of tightness applied according to the local environment and the date of the most recent serious attack(!); increasing use of smartcards and certificates instead of clear-text passwords; and  HYPERLINK "http://www.users.dircon.co.uk/~crypto/download/c50-faq.html" Crack for password checking, used in some 30% of sites. Many sites have a security policy; others have a usage policy, which often incorporates some security rules. The Panel came to the conclusion that clusters do not in themselves change the issues around security and that great care must be taken when deciding which vendor patches should be applied or ignored. It was noted that a cluster is a very good error amplifier and access controls help limit innocent errors as well as malicious mischief. 
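To illustrate the idea behind the password checking just mentioned (tools such as Crack), the toy Python sketch below tries a small dictionary of words against stored password hashes and flags any account whose password matches. The word list, the demo accounts and the hash scheme (SHA-256 of salt plus password) are all invented for this example; a real audit tool works against the system's actual crypt-style hashes and is far more thorough.

import hashlib

WORDLIST = ["password", "letmein", "physics", "secret"]    # toy dictionary

def hash_pw(password, salt):
    """Invented storage scheme for this example: SHA-256 of salt + password."""
    return hashlib.sha256((salt + password).encode()).hexdigest()

def weak_accounts(entries):
    """Yield (user, word) pairs where a dictionary word reproduces the stored hash."""
    for user, salt, stored in entries:
        for word in WORDLIST:
            if hash_pw(word, salt) == stored:
                yield user, word
                break

if __name__ == "__main__":
    demo = [("alice", "x1", hash_pw("physics", "x1")),      # weak choice
            ("bob",   "y2", hash_pw("7x!Q9v#rT", "y2"))]    # strong choice
    for user, word in weak_accounts(demo):
        print(f"{user}: weak password ({word!r})")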
16.8  HYPERLINK "http://conferences.fnal.gov/lccws/papers/fri/B4_Summary.ppt" Panel B4 - Application Environment, Load Balancing When it comes to accessing system and application tools and libraries, should one use remote file sharing or should the target client node re-synchronise with some declared master node? One suggestion was to use a pre-compiler to hide these differences. Put differently, the local environment could access remote files via a global file system or issue puts and gets on demand; or it could access local files, which are shipped with the job or created by resynchronisation before job execution. The question of dynamic versus statically-linked libraries was summarised as: system administrators prefer static, users prefer dynamic ones. Arguments against dynamic environments are increased system configuration sensitivity (portability) and security implications. However, some third-party applications only exist in dynamic format by vendor design. As mentioned in another panel, DNS name lookup is quite often used for simple load balancing. Algorithms used to determine the current translation of a generic cluster name into a physical node range from simple round-robin to quite complicated metrics covering the number of active jobs on a node, its current free memory and so on. However, most such schemes are fixed: once a job is assigned to a node, it does not move, even if subsequent job mixes make this node the least appropriate choice for execution. These schemes worked well, often surprisingly well, at spreading the load. Where possible, job and queue management are delegated to user representatives and this extends as far as tuning the job mix via the use of priorities and also host affiliation, and applying peer pressure to people abusing the queues. Job dispatching in a busy environment is not easy: what does pending mean to a user, and how does one forecast future needs and availability? Summary of the SuperComputing Scalable Cluster Software Conference This conference in New England ran in parallel to the Workshop and two attendees, Neil Pundit and Greg Lindahl, were kind enough to rush back from it to report to us. The conference is aimed at US Department of Energy sites, in particular the ASCI supercomputer sites. These sites are much larger in scale than those represented in the workshop, often having above 1000 nodes already. Indeed some sites already have 10,000 node clusters and their questions are how to deal with 100,000 nodes. Clearly innovative solutions are required, but they usually have access to the latest technology from major suppliers and, very importantly, they have the resources to re-architect solutions to their precise needs. Among the architectures discussed were  HYPERLINK "http://phys.columbia.edu/~cqft/qcdoc.htm" QCDOC (QCD on a chip using custom silicon) and an IBM-developed ASIC using a PowerPC chip with large amounts of on-board memory. It must be noted that such architectures may be good for some applications (QCDOC is obviously targeted at QCD lattice gauge calculations) but quite unsuited to more general loads. A new ASCI initiative is called Blue Lite and is based at Lawrence Livermore National Laboratory. A 12,000-node cluster is now in prototype and the target is for 64,000 nodes to deliver 384 TeraFlops in 2003. The DoE has launched an initiative to try to get round software barriers by means of a multi-million dollar project for a  HYPERLINK "http://www.csm.ornl.gov/scidac/ScalableSystems/" Scaleable System Software Enabling Technology Center. 
Groups concerned are those in computer science and mathematics. One site chosen to work on this is the San Diego SuperComputing Centre and other sites should be known shortly. As stated above, most of the work described at the conference was commercial or proprietary, but one of the most interesting talks came from Rich Ferri of IBM who spoke about Open Source software. He noted that there are many parallel projects in many computing fields with a high degree of overlap, some collaboration but a lot of competition. This results in many projects dying. He noted that when such projects are aimed at our science, we are rather prone to winnowing out overlapping projects.  HYPERLINK "http://conferences.fnal.gov/lccws/papers/fri/Cplant-Overview.ppt" Cplant (Neil Pundit, Sandia National Laboratories)  HYPERLINK "http://www.cs.sandia.gov/cplant/" Cplant is a development at Sandia Labs, home of  HYPERLINK "http://www.llnl.gov/asci/" ASCI Red as well as others of the world's largest clusters. In fact, Cplant is based on ASCI Red and attempts to extend its features. Cplant is variously described as a concept, a software effort and a software package. It is used to create Massively Parallel Processors (MPPs) at commodity prices. It is currently used at various sites and projects and is licensed under the  HYPERLINK "http://www.gnu.org/copyleft/gpl.html" GPL (GNU General Public Licence) open source licence, but it has also been trade-marked and licensed to a firm for those sites wishing guaranteed support. It can in principle scale to 10,000 node clusters although thus far it has only been used on clusters ranging from 32 to 1024 nodes. Clusters are managed in planes of up to 256 nodes with I/O nodes kept separate and worker nodes being diskless. The configuration consists of Compaq Alpha workstations running a Linux kernel, portals for fast messaging, home-grown management tools and a parallel version of  HYPERLINK "http://nfs.sourceforge.net/" NFS known as ENFS (Extended NFS) where special techniques are added to improve file locking semantics above those of standard NFS. ENFS offers high throughput with peaks of 117 MBps using 8 I/O servers connected to an SGI Origin 2000 data feeder. Other tools in use include PBS for batch queues with an improved scheduler,  HYPERLINK "http://www.myri.com/myrinet/overview/" Myrinet for high speed interconnects and various commercial software tools such as the  HYPERLINK "http://www.etnus.com/" TotalView debugger. Myrinet has been the source of a number of incidents and they have worked with the supplier to solve these. Over the past 2 years, Cplant has required some 25 FTE-years of effort, of which ENFS alone required 3 FTE-years. During the building-up phase of the cluster, there appears to be a transition at around 250 nodes to a new class of problem. Many errors are hidden by (vendor) software stacks; often, getting to the root cause can be difficult. Among other lessons learned during the development was not to underestimate the time needed for testing and releasing the package. Not enough time was allowed for working with real applications before release. A structured, 5-phase test scheme is now being formalised, with phases ranging from internal testing by developers and independent testers to testing by external friendly testers. Closing The workshop ended with the delegates agreeing that it had been useful and should be repeated in approximately 18 months. 
No summary was made, the primary goal being to share experiences, but returning to the questions posed at the start of the workshop by Matthias Kasemann, it is clear that clusters have replaced mainframes in virtually all of the HENP world, although their administration is far from simple and poses increasing problems as cluster sizes scale. In-house support costs must be balanced against bought-in solutions, not only for hardware and software but also for operations and management. Finally the delegates agreed that there are several solutions for, and a number of practical examples of, the use of desktops to increase the overall computing power available. It was agreed to produce a proceedings (this document) and also to fill in sections of the Guide to Cluster Building outline circulated earlier in the week. Both of these will be circulated to all delegates before publication and presented at the Computing in High-energy Physics conference in Beijing in September ( HYPERLINK "http://www.ihep.ac.cn/~chep01/" CHEP01) and at the  HYPERLINK "http://wwwinfo.cern.ch/hepix/meetings.html" Fall HEPiX meeting at LBNL (Lawrence Berkeley National Laboratory) in October. Having judged the workshop generally useful, it was agreed to schedule a second meeting in about 18 to 24 months' time. Alan Silverman 18 January 2002  Symmetric Multi-Processor  High-energy and Nuclear Physics   HYPERLINK "http://user.web.cern.ch/user/Index/LHC.html" LHC, the Large Hadron Collider, is a project now in construction at  HYPERLINK "http://www.cern.ch/" CERN in Geneva. Its main characteristics in computing terms are described below, as are the meanings of Tier 0 and Tier 1. It is expected to come online during 2006.   HYPERLINK "http://wwwinfo.cern.ch/hepix/" HEPiX is a group of UNIX users in the High-energy Physics community who meet regularly to share experiences and occasionally undertake specific projects.  Examples of Tier 1 sites are Fermilab for the US part of CMS and BNL for the US part of ATLAS.  Message Passing Interface   HYPERLINK "http://www.transarc.ibm.com/Product/EFS/AFS/index.html" AFS is a distributed file system developed by Transarc and now owned and marketed by IBM.  HYPERLINK "http://www.openafs.org/frameless/main.html" OpenAFS is an open source derivative of AFS where IBM is a (non-active) participant.   HYPERLINK "http://www4.clearlake.ibm.com/hpss/index.jsp" HPSS is hierarchical storage system software from IBM designed to manage and access 100s of terabytes to petabytes of data. It is used at a number of HEP and other sites represented at this workshop.  Graphical User Interface.   HYPERLINK "http://www.platform.com/products/LSF/" LSF is a workload management software suite from Platform Corp that lets organizations manage their computing workloads. Its batch queuing features are widely used in HEP and other sites represented at this workshop.   HYPERLINK "http://www.objectivity.com/" Objectivity is an object oriented database product from the Objectivity Corporation.   HYPERLINK "http://www.corba.org/" CORBA is the Common Object Request Broker Architecture developed by the Object Management Group (OMG). It is a vendor-independent scheme to permit applications to work together over networks.   HYPERLINK "http://www.openpbs.org/" PBS is the Portable Batch System, developed at Veridian Systems for NASA but now available as open source.  
1 Tflop/s is one Tera (a million million, 10^12) floating point operations per second  PXE is the Preboot Execution Environment, a method to boot a mini-kernel  They had been using SUN's autoclient, which uses cacheFS, but had stopped because of bottlenecks and the risk of a single point of failure for the cluster.  The MAC address (short for Media Access Control address) is the hardware address that uniquely identifies each node of a network.  DNS is the Domain Name System used to translate a logical network name to a physical network address.   HYPERLINK "http://www.toolinux.com/linutile/configuration/kickstart/" Kickstart is a tool from Redhat which lets you automate most/all of a RedHat Linux installation   HYPERLINK "http://www.securityfocus.com/focus/sun/articles/jumpstart.html" JumpStart is Sun's method of providing a turnkey, hands-off solution to installing Solaris   HYPERLINK "http://tim.web.cern.ch/tim/pcm/pcm.html" DEC Polycenter replaces the PC console terminal or monitor. It connects to the serial port of clients via terminal servers and collects the console output from the RS232 serial lines. The host system thus provides a central point for monitoring the console activity of multiple systems, as well as for connecting directly to those systems to perform management tasks.  API: Application Programming Interface, a defined method to write a new module and interface it to the tool.  SLOC: Significant (non-comment) Lines Of Code  Another tip described is to be aware of the vendor's business cycle; the possibility of obtaining a good deal increases in the final days of a quarter.  Moore's Law states that the number of transistors on a chip doubles every 18-24 months.   HYPERLINK "http://www.myri.com/myrinet/overview/" Myrinet is a high-performance, packet-communication and switching technology from Myricom Inc. that is widely used to interconnect clusters.  See Section 5.2 above for an explanation of these sites.  SLAC has estimated that recovering from a power down takes up to 8 to 10 hours with several people helping.   HYPERLINK "http://www.rpm.org/" RPM is the Red Hat Package Manager. While it does contain Red Hat in the name, it is intended to be a completely open packaging system available for anyone to use. It allows users to take source code for new software and package it into source and binary form such that binaries can be installed and tracked and source can be rebuilt. It also maintains a database of all packages and their files that can be used for verifying packages and querying for information about files and/or packages.  The Multicast Dissemination Protocol (MDP) is a protocol framework and software toolkit for reliable multicasting of data objects. It was developed by the US Navy.  DAGman stands for Directed Acyclic Graph Manager  bbftp is described below, in the NIKHEF talk in the Small Site Session.  A cluster or farm in Condor terms is called a pool  iSCSI transmits the native SCSI disc/tape access protocol over a layer of the IP stack.  For example, consider the tool described above by Chuck Boeheim of SLAC that aggregates error reports across a large cluster into an easily digested summary.  MOSIX is a software package that enhances the Linux kernel with cluster computing capabilities. See the Web site at http://www.mosix.cs.huji.ac.il/txt_main.html.  BQS is a batch scheduler developed and used at CCIN2P3. See the talk from this site in the Small Site Session.  SAN: Storage Area Network  A cluster or farm in Condor terms is called a pool.  
HYPERLINK "http://www.seti-inst.edu/science/setiathome.html"Seti@Home is a scheme whereby users volunteer to run an application on their nodes, effectively using up idle time..  FTE Full Time Equivalent, a measure of human resources allocated to a task or project.  The code is available on request from Fermilab (contact M.Kaltetka) but subject to certain DoE restrictions on export.  For each service using Kerberos, there must be a service key known only by Kerberos and the service. On the Kerberos server, the service key is stored in the Kerberos database. On the server host, these service keys are stored in key tables, which are files known as keytabs  ssh secure shell in UNIX  The Genetics Computer Group (or 'Wisconsin') package is a collection of programs used to analyse or manipulate DNA and protein sequence data  The tcsch shell (called tc-shell, t-shell, or trusted shell) has all of the features of csh plus many more. LSTcsh takes this one step further by building in LSF load balancing features directly into shell commands.  Crack is a password-guessing program that is designed to quickly locate insecurities in Unix (or other) password files by scanning the contents of a password file, looking for users who have misguidedly chosen a weak login password. See the appendix from the previous version for more details.  Public Key Infrastructure  PPDG stands for Particle Physics Data Grid  GriPhyn stands for Grid Physics Network  The input rate from the data acquisition system to the ATM switch peaks at 260 MBps.   HYPERLINK "http://www-unix.mcs.anl.gov/mpi/fastchoice.html" MPI (Message Passing Interface) is a library specification for message passing.  ISS Interactive Session Support  DNS is the Domain Name Scheme used to translate a logical network name to a physical network address.  renice is a UNIX command to adjust the priority of a running job.  This summary, indeed this workshop, does not attempt to describe in detail such tools. The reader is referred to the web references for a description of the tools and to conferences such as those sponsored by Usenix for where and how they are used.  Codine is a job queuing system made by  HYPERLINK "http://www.genias.net/geniasde.html" GENIAS Software from Neutraubling, Germany   HYPERLINK "http://www.llnl.gov/asci/" ASCI is the Accelerated Strategic Computing Initiative funded by the US Department of Energy.  This cluster is 31st in the November edition of the  HYPERLINK "http://www.top500.org/" Top500 SuperComputer list.  from Etmus Inc. 
http://conferences.fnal.gov/lccws/papers/fri/MonitoringA3_Summary.pptDyK yK <http://www-isd.fnal.gov/ngop/DyK yK Lhttp://proj-pem.web.cern.ch/proj-pem/DyK yK 2http://www.netsaint.org/DyK yK lhttp://www.platform.com/products/siteassure/index.aspDyK yK http://conferences.fnal.gov/lccws/papers/fri/Panel A4 summary.pptDyK yK 8http://www.eu-datagrid.org/DyK yK Hhttp://cern.ch/hep-proj-grid-fabricDyK yK *http://www.ppdg.net/DyK yK 0http://www.griphyn.org/DyK yK http://conferences.fnal.gov/lccws/papers/fri/Panel B1 summary.pptDyK yK <http://www.cs.wisc.edu/condorDyK yK http://conferences.fnal.gov/lccws/papers/fri/lccws-b2-summary.pptDyK yK Lhttp://www.platform.com/products/LSF/DyK yK 0http://www.openpbs.org/DyK yK <http://www.cs.wisc.edu/condorDyK yK Hhttp://webcc.in2p3.fr/man/bqs/introDyK yK >http://www-isd.fnal.gov/fbsng/DyK yK Zhttp://www.mosix.cs.huji.ac.il/txt_main.htmlDyK yK Lhttp://www.sun.com/software/gridware/DyK yK http://conferences.fnal.gov/lccws/papers/fri/LCCWS_B3_summary.pdfDyK yK Bhttp://web.mit.edu/kerberos/www/DyK yK xhttp://www.users.dircon.co.uk/~crypto/download/c50-faq.htmlDyK yK xhttp://conferences.fnal.gov/lccws/papers/fri/B4_Summary.pptDyK yK Rhttp://phys.columbia.edu/~cqft/qcdoc.htmDyK yK `http://www.csm.ornl.gov/scidac/ScalableSystems/DyK yK http://conferences.fnal.gov/lccws/papers/fri/Cplant-Overview.pptDyK yK Bhttp://www.cs.sandia.gov/cplant/DyK yK 4http://www.llnl.gov/asci/DyK yK Jhttp://www.gnu.org/copyleft/gpl.htmlDyK yK 8http://nfs.sourceforge.net/DyK yK Lhttp://www.myri.com/myrinet/overview/DyK yK ,http://www.etnus.com/DyK yK >http://www.ihep.ac.cn/~chep01/DyK yK Vhttp://wwwinfo.cern.ch/hepix/meetings.htmlDyK yK Xhttp://user.web.cern.ch/user/Index/LHC.htmlDyK yK (http://www.cern.ch/DyK yK <http://wwwinfo.cern.ch/hepix/DyK yK nhttp://www.transarc.ibm.com/Product/EFS/AFS/index.htmlDyK yK Vhttp://www.openafs.org/frameless/main.htmlDyK yK Zhttp://www4.clearlake.ibm.com/hpss/index.jspDyK yK Lhttp://www.platform.com/products/LSF/DyK yK 8http://www.objectivity.com/DyK yK ,http://www.corba.org/DyK yK 0http://www.openpbs.org/DyK yK thttp://www.toolinux.com/linutile/configuration/kickstart/DyK yK ~http://www.securityfocus.com/focus/sun/articles/jumpstart.htmlDyK yK Phttp://tim.web.cern.ch/tim/pcm/pcm.htmlDyK yK Lhttp://www.myri.com/myrinet/overview/DyK yK (http://www.rpm.org/DyK yK bhttp://www.seti-inst.edu/science/setiathome.htmlDyK yK `http://www-unix.mcs.anl.gov/mpi/fastchoice.htmlDyK yK Hhttp://www.genias.net/geniasde.htmlDyK yK 4http://www.llnl.gov/asci/DyK yK .http://www.top500.org/] i4@4 NormalCJ_HmH sH tH 6@6 Heading 1$@& 5CJ\0@0 Heading 2$@&CJ<@< Heading 3$$@&a$ 5CJ\B@B Heading 4$<@&5CJ\aJF@F Heading 5 <@&56CJ\]aJ@@@ Heading 6 <@&5CJ\aJ6@6 Heading 7 <@&aJ<@< Heading 8 <@& 6]aJF @F Heading 9 <@&CJOJQJ^JaJ<A@< Default Paragraph Font.>@. Title$a$ 5CJ \,B@, Body Text$a$6@6 Footnote TextCJaJ8&@!8 Footnote ReferenceH*.U@1. 