|
|
This is the Beowulf cluster room. The limited space available
makes rack-mounted nodes a very attractive option. The old
wet-lab bench that came with the lab has enough space for the
non-rack mountable servers, plus some room for a construction
and debugging station.
|
| |
|
|
Here they are - these four racks hold the 66 nodes that make up
the computational muscle of the cluster. The bottom of the
first rack also contains several servers including the cluster
login, batch, and file servers. Nodes are numbered from the
bottom up, and racks from right to left. Thus the oldest nodes
in the cluster are in the top half of the right-most rack.
| |
And here are the non-rack mountable servers, along with their
UPSes. Of the five machines on this side of the room, only the
silver tower on the far left (the database server) is part of
the cluster. The others are research web servers, remote X
servers, and a Windows 2003 terminal server. All servers, both
here and in the rack, are connected to a KVM switch mounted in
the first rack, and then to a console on the lab bench to the
right of this photo.
|
| |
|
|
A view of the back of the cluster racks. Alas, the mess of
wires is something of a necessary evil: the extra slack lets us
pull the nodes out and work on them without removing them from
the rack.
| |
A bird's-eye view of the racks. You can see the temperature
meter perched on top between Cabinets 1 and 2, and the RTD sensor attached
to the top rear of Cabinet 2.
|
| |
|
|
The inside of the cluster login machine.
There's not much here - just a processor, system drive, backup
drive, and a whole row of ethernet cards.
| |
The back of the login machine, with a close-up of the bank of
ethernet cards.
|
| |
|
|
The inside of the cluster file server. Things
are a little tighter in here. The dual Xeon board is larger
(extended ATX), plus the cabling for the two fast SCSI drives
fills up the inside a little more.
| |
The backside of the file server. The motherboard only has two
66 MHz PCI slots, so we use dual-port gigabit cards for
connections to the rack switches.
|
|
|
The inside of the database server.
This is probably the most advanced computer in the cluster. Two
of its four gigabit ports are built into the motherboard and two
are on 66 MHz PCI cards (bottom). The SCSI RAID controller sits
in a 133 MHz PCI slot (top), with the server subnet 100BASE-T
NIC below it. The two SCSI RAID disks are mounted in removable
enclosures (top front), along with the backup drive (below).
The system drive is hard-mounted below that.
|
| |
|
|
The inside of a first
generation node. These are the only computers in the
cluster to use Slot 1 processors. Despite the relatively open
layout of the case, these three nodes run hotter than any of the
other computers in the cluster. Note that the two back case
fans do not blow directly out the back of the case.
| |
The front of a first generation node. The extra length of these
cases combined with the unusually short rails makes it
impossible to open the case while it is still mounted in the
rack. [The first generation nodes were retired in December
2004.]
|
| |
|
|
The inside of a second
generation node. These cases are more compact, but the back
exhaust fan keeps them much cooler than the first generation
nodes. The iWill motherboards have had some stability issues,
and a very odd compatibility problem with the original Intel
NICs.
| |
A second generation node pulled out from the rack. A bad batch
of DDR memory meant that this was a common sight.
|
| |
|
|
The inside of a third
generation node. The Intel cases are a little overkill for
this application, but the combination of the cases and the Intel
server board is rock-solid stable. The cases stay remarkably
cool despite the lack of an exhaust fan (other than through the
power supply). The video card and ethernet interface are built
into the motherboard, so the riser card is not needed.
| |
The front of a third generation node pulled out from the rack.
The CD-ROM drive came with the case. These nodes are the only
ones with SCSI system drives. The hot-swappable backplane was
very convenient during setup and debugging. Beginning with this
generation we quit configuring the nodes with floppy drives. We
now keep 3-4 spare floppy drives and just plug them in when they
are needed (basically only for testing failed hard drives).
|
| |
|
|
The inside of a fourth
generation node. The SuperMicro server board is very
stable, and the built-in NIC and video card again meant we could
forgo the riser card hassle. The new style wind tunnel heat
sinks for the Xeon processors did cause some problems. The
proper configuration and installation sequence that would
prevent damaging the caps on the motherboard was non-obvious
(note that the top processor has the fan on the right blowing in
and the bottom has the fan on the left blowing out).
| |
The front of a fourth generation node. These cases are slightly
longer than the second generation cases, but still allow the top
to be opened without removing the node from the rack. The power
and reset switches are also mounted outside of the locked front
doors which means we can reboot nodes without digging out the
keys.
|
| |
|
|
Here is the main electrical feed to the cluster room. Power
comes in on a single 3-phase, 240-volt, 150-amp line. It is
broken down into twenty 20-amp, 120-volt circuits for the nodes,
plus a 20-amp 3-phase circuit for the chiller motor. The servers
run on the room's original 20-amp circuit. The small box on the
wall to the left is the 100BASE-T uplink to the campus network.
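
As a rough sanity check on those numbers, the sketch below compares the apparent power the feed can deliver with the combined rating of the node circuits. It assumes the 240 volts is the line-to-line voltage and that the load is balanced across the three phases; neither is stated here, and the function names are made up for illustration.

```python
import math

# Figures quoted in the caption above.
FEED_VOLTS = 240      # assumed to be the line-to-line voltage of the feed
FEED_AMPS = 150       # current rating of the feed
NODE_CIRCUITS = 20    # number of branch circuits for the nodes
CIRCUIT_VOLTS = 120
CIRCUIT_AMPS = 20


def three_phase_kva(volts_line_to_line, amps):
    """Apparent power of a balanced 3-phase feed, in kVA."""
    return math.sqrt(3) * volts_line_to_line * amps / 1000.0


def branch_total_kva(count, volts, amps):
    """Combined rating of identical single-phase branch circuits, in kVA."""
    return count * volts * amps / 1000.0


feed_kva = three_phase_kva(FEED_VOLTS, FEED_AMPS)
branch_kva = branch_total_kva(NODE_CIRCUITS, CIRCUIT_VOLTS, CIRCUIT_AMPS)

print(f"Feed capacity:        {feed_kva:.1f} kVA")
print(f"Node circuits (max):  {branch_kva:.1f} kVA")
print(f"Fraction of the feed: {branch_kva / feed_kva:.0%}")
```

Under those assumptions the node circuits, even if every one were loaded to its full rating, would account for roughly three quarters of the feed.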
| |
The power main may feed the cluster, but this is its lifeline.
The two pipes are the chilled water supply and return lines for
the air chiller. They enter the room through the hole formerly
used by the fume hood that came with the lab. The chiller is
rated for 5 tons of refrigeration (about 17.6 kW).
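
For readers unfamiliar with the unit, the chiller's 5-ton rating can be converted to kilowatts with a one-line calculation. This is a minimal sketch using the standard definition of a ton of refrigeration (12,000 BTU/h, about 3.517 kW); the function name is just for illustration.

```python
# 1 ton of refrigeration = 12,000 BTU/h, which is about 3.517 kW.
KW_PER_TON = 3.517


def tons_to_kw(tons):
    """Convert a cooling capacity in tons of refrigeration to kilowatts."""
    return tons * KW_PER_TON


chiller_kw = tons_to_kw(5)
print(f"5 tons of refrigeration is about {chiller_kw:.1f} kW of cooling")
```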
|
| |
|
|
Here is a close-up of the temperature meter used to monitor the ambient
temperature of the cluster room. The meter is sitting on top of
the RS-232 to ethernet converter and web server.
| |
One of the few drawbacks of the rack-mounted option for
designing Beowulf clusters is that everything comes with
a key. This is the minimum complete set of keys for the
cluster. So, do you remember which one of these opens up the
second generation nodes?
|