This project is expected to start with the GRAPE-4 system currently present at the University of Amsterdam. It will be possible to expand our GRAPE park with the more modern GRAPE-6, but a final decision on this issue will be made on the advice of Prof. J. Makino of the GRAPE team, who will be collaborating in this project (see section 7).

Until now, GRAPE systems have been connected to a single host workstation. Usually the amount of work related to the force calculations, carried out on the GRAPE system, is orders of magnitude larger than the remaining calculations, carried out on the host. However, with fast N-body solvers such as hierarchical tree methods, it has turned out that the speed of the host effectively determines the achievable throughput. Tree-codes reduce the amount of work on GRAPE and at the same time increase the amount of work on the host (see section 4.2.3), so the calculations on the host become the bottleneck. The same situation may occur in the combined stellar evolution and stellar dynamics code, where the evolution code is executed on the host. To remove the host bottleneck we propose to attach GRAPE to a cluster of general purpose computers. The work to be carried out on the host will be parallelized and executed on the cluster.

Currently the Advanced School for Computing and Imaging (ASCI; participants: University of Amsterdam, State University Leiden, Technical University Delft, and Free University Amsterdam) is in the process of realizing the Distributed ASCI Supercomputer (DAS). The DAS system consists of four clusters, installed at the sites of the participants. The clusters at Delft, Leiden and the UvA contain 24 nodes each; the cluster at the VU has 64 nodes. The nodes are 200 MHz Pentium Pro processors running Unix, connected by a fast network (Myrinet). The clusters will be connected by a wide-area network using ATM (Surfnet).
The first DAS was installed in 1997; the second upgrade has been operational since 1999 [25].

Figure 1 schematically shows the proposed architecture of the UvA N-body lab. The exact configurations will be established in close cooperation with Prof. Makino of the GRAPE team and the DAS advisory committee, of which one of the proposers (PS) is a member. The UvA N-body lab will be installed at the computer science department at the Kruislaan.
Figure 1: Architecture of UvA N-body lab.

The UvA N-body lab will be established in three phases.

Phase 1: We start with a "traditional" GRAPE configuration, i.e. a single host attached to a GRAPE. We will realize basic software modules containing direct and hierarchical N-body kernels, relying as much as possible on existing (GRAPE) software. As a first application, the combined stellar evolution and stellar dynamics code will be implemented and used for simulations of the evolution of star clusters that contain a realistic fraction of binaries. Starting with a system of a few tens of thousands of stars, our final aim is to study the evolution of globular clusters containing hundreds of thousands of stars.

Phase 2: The phase 1 system will be attached to the UvA DAS cluster. The hierarchical tree-code will be parallelized and optimized. Important issues are resource management, data locality and caching behavior.

Phase 3: We will investigate to what extent the UvA N-body lab can be further optimized or extended to allow for (more accurate) higher order hierarchical methods and/or P3M methods, depending on specific needs. The extension can be in hardware (dedicated GRAPE boards) or in software running on the DAS nodes. We will also investigate to what extent other applications, specifically molecular dynamics, can benefit from the UvA N-body lab. To do so we will establish a link with the NWO-MPR priority program dedicated to molecular dynamics of Prof. Dr. D. Frenkel, Prof. Dr. S. de Leeuw, Prof. Dr. H. Berendsen and Dr. P. Sloot.

The Phase 2 system will be a prototype N-body lab that is already capable of studying the evolution of dense stellar systems. Phase 3 is more experimental and will be used to address questions from computational science regarding the capability of the proposed hybrid architecture to execute sophisticated fast N-body solvers efficiently.
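
As an illustration of the direct N-body kernel mentioned under Phase 1, the following is a minimal sketch in plain Python of a direct-summation force calculation, the kind of pairwise work that would be offloaded to GRAPE. It is not GRAPE software or its interface: the function names `direct_accelerations` and `pair_interactions`, the softening parameter `eps` (Plummer-style softening), and the choice of units with G = 1 are all assumptions made for this example.

```python
import math


def direct_accelerations(positions, masses, eps=0.0, G=1.0):
    """Gravitational acceleration on every particle by direct summation.

    positions: list of (x, y, z) tuples; masses: list of floats.
    eps is an assumed Plummer-style softening length, G the assumed
    gravitational constant (1.0 in N-body units).
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        xi, yi, zi = positions[i]
        for j in range(n):
            if i == j:
                continue  # a particle exerts no force on itself
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            dz = positions[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps * eps
            # a_i += G * m_j * (r_j - r_i) / (|r_ij|^2 + eps^2)^(3/2)
            f = G * masses[j] / (r2 * math.sqrt(r2))
            acc[i][0] += f * dx
            acc[i][1] += f * dy
            acc[i][2] += f * dz
    return acc


def pair_interactions(n):
    """Number of pairwise force evaluations in one direct-summation step."""
    return n * (n - 1)
```

For two unit masses at unit separation, with G = 1 and no softening, each particle receives a unit acceleration directed toward the other. The number of force evaluations per step grows as N(N-1), which is why this part of the work dominates in a direct code. A hierarchical tree-code replaces most of these pairwise terms with interactions against groups of particles, at the price of tree construction and traversal on the host.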