| Joseph Lappa | |
| Pittsburgh Supercomputing Center | |
| ESCC/Internet2 Joint Techs Workshop |
| Supercomputing 2004 Conference | ||
| Application | ||
| Ultimate Integration | ||
| Resource Overview | ||
| Did it work? | ||
| What did we take from it? | ||
| Annual Conference | ||||
| Supercomputers | ||||
| Storage | ||||
| Network hardware | ||||
| Original reason for application | ||||
| Bandwidth Challenge | ||||
| Didn’t apply due to time | ||||
| Runs on Lemieux (PSC’s supercomputer) | ||
| Application Gateways (AGW) | ||
| Cisco CRS-1 | ||
| 40Gb/sec OC-768 cards | ||
| Few exist | ||
| Single application | ||
| Be used with another demo on the show floor if possible | ||
Ultimate Integration Application
| Checkpoint Recovery System | |||
| Program | |||
| Garden variety Laplace solver instrumented to save its memory state in checkpoint files | |||
| Checkpoints memory to remote network clients | |||
| Runs on 34 Lemieux nodes | |||
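The checkpoint idea on this slide can be sketched in a few lines. This is only an illustration, assuming a Jacobi-style Laplace iteration and a generic byte stream standing in for the remote network client; the function names, the grid size, and the use of `pickle` are hypothetical and are not taken from the PSC code.

```python
import io
import pickle


def make_grid(n):
    """Square grid with a hot top edge (100.0) and cold remaining cells."""
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [100.0] * n
    return grid


def jacobi_step(grid):
    """One Jacobi sweep: each interior cell becomes the mean of its 4 neighbours."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


def save_checkpoint(stream, iteration, grid):
    """Overwrite the stream with the solver's memory state."""
    stream.seek(0)
    stream.truncate()
    pickle.dump({"iteration": iteration, "grid": grid}, stream)


def load_checkpoint(stream):
    """Read back the iteration counter and grid from a checkpoint stream."""
    stream.seek(0)
    state = pickle.load(stream)
    return state["iteration"], state["grid"]


def solve(grid, total_iters, start_iter=0, checkpoint_every=None, stream=None):
    """Iterate from start_iter to total_iters, checkpointing periodically."""
    for it in range(start_iter, total_iters):
        grid = jacobi_step(grid)
        if checkpoint_every and stream is not None and (it + 1) % checkpoint_every == 0:
            save_checkpoint(stream, it + 1, grid)
    return grid


if __name__ == "__main__":
    # In the demo the receiver was a remote machine; BytesIO is a local stand-in.
    remote = io.BytesIO()
    solve(make_grid(8), total_iters=10, checkpoint_every=10, stream=remote)
    resumed_from, state = load_checkpoint(remote)
    final = solve(state, total_iters=20, start_iter=resumed_from)
    print("resumed from iteration", resumed_from)
```

Resuming from the saved state and running the remaining iterations gives the same grid as an uninterrupted run, which is the property a checkpoint recovery system relies on.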
| 750 Compaq Alphaserver ES45 nodes | |||
| SMP | |||
| Four 1GHz Alpha Processors | |||
| 4 GB of Memory | |||
| Interconnection | |||
| Quadrics Cluster Interconnect | |||
| Shared memory library | |||
| 750 GigE connections are very expensive | ||||
| Reuse Quadrics network to attach cheap Linux boxes with GigE | ||||
| 15 AGWs | ||||
| Single processor Xeons | ||||
| 1 Quadrics card | ||||
| 2 Intel GigE | ||||
| Each GigE card maxes out at 990Mb/sec | ||||
| Only need 30 GigE to fill link to TeraGrid | ||||
| Web100 kernel | ||||
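The sizing on this slide is simple arithmetic, sketched below using only the numbers given here (15 gateways, 2 GigE ports each, 990 Mb/sec per card). The TeraGrid link capacity is not stated in the slides, so the sketch only computes the aggregate the gateways can offer; the function name is hypothetical.

```python
GIGE_PEAK_MBPS = 990   # per-card ceiling quoted on the slide
AGW_COUNT = 15         # gateways listed on the slide
GIGE_PER_AGW = 2       # Intel GigE ports per gateway


def aggregate_gbps(agw_count=AGW_COUNT, ports_per_agw=GIGE_PER_AGW,
                   per_port_mbps=GIGE_PEAK_MBPS):
    """Return (total GigE ports, aggregate peak rate in Gb/s)."""
    ports = agw_count * ports_per_agw
    return ports, ports * per_port_mbps / 1000.0


ports, gbps = aggregate_gbps()
print(f"{ports} GigE ports -> {gbps:.1f} Gb/s aggregate")
```

With the slide's figures this gives 30 ports and 29.7 Gb/sec, which is where the "30 GigE" count comes from.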
| Cisco 6509 | ||||
| Sup720 | ||||
| WS-X6748-SFP | ||||
| Two WS-X6704-10GE | ||||
| Used 4 10GE interfaces | ||||
| OSPF load balancing was my real worry | ||||
| >30 GE streams over 4 links | ||||
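The load-balancing worry can be illustrated with a small model. The sketch below assumes flows are pinned to one of the four equal-cost 10GE links by hashing their addresses and ports, which is a common approach but not necessarily what the 6509 did; the SHA-256 hash, the IP addresses, and the port numbers are all hypothetical. At 990 Mb/sec per stream a 10GE link holds at most 10 full-rate streams, so any link that is assigned 11 or more is oversubscribed even though the average over 4 links is well under 10.

```python
import hashlib
from collections import Counter

LINKS = 4              # 10GE interfaces used on the 6509
STREAM_MBPS = 990      # per-GigE peak from the AGW slide
LINK_MBPS = 10_000     # nominal 10GE capacity


def pick_link(src_ip, dst_ip, src_port, dst_port, n_links=LINKS):
    """Map a flow to a link index by hashing its addresses and ports."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links


def flows_per_link(flows, n_links=LINKS):
    """Count how many flows the hash places on each link."""
    counts = Counter(pick_link(*flow, n_links=n_links) for flow in flows)
    return [counts.get(i, 0) for i in range(n_links)]


def oversubscribed(counts):
    """True for each link whose full-rate streams would exceed its capacity."""
    return [c * STREAM_MBPS > LINK_MBPS for c in counts]


# 32 hypothetical GigE senders talking to one hypothetical receiver.
flows = [(f"10.0.0.{i}", "10.1.0.1", 5001, 5001) for i in range(1, 33)]
counts = flows_per_link(flows)
print("flows per link:", counts)
print("oversubscribed:", oversubscribed(counts))
```

Changing the addresses or ports changes the distribution, which is why an uneven hash was a real concern with this many streams over so few links.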
| Cisco CRS-1 | ||||
| 40 Gb/sec slot | ||||
| 16 slots | ||||
| For Demo | ||||
| Two OC-768 cards | ||||
| Ken Goodwin’s and Kevin McGratten’s big worry was the OC-768 transport | ||||
| Two 8 Port 10 GE cards | ||||
| Running production IOS-XR code | ||||
| Had problems with tracking hardware | ||||
| Ran both without 2 Switching Fabrics with no effects on traffic | ||||
| Cisco CRS-1 | ||||
| One at Westinghouse Machine Room | ||||
| One on show floor | ||||
| Fork lift needed to place it | ||||
| 7 feet tall | ||||
| 939 lbs empty | ||||
| 1657 lbs fully loaded | ||||
| Stratalight – OTS 4040 transponder “compresses” the 40 Gb/sec signal to fit into the spectral bandwidth of a traditional 10G wave | ||
| http://www.stratalight.com/ | ||
| Uses proprietary encoding techniques | ||
| The Stratalight transponder was connected to the Mux/DMUX of the 15454 as an alien wavelength | ||
| OC-768 wasn’t worked on until one week before the conference |
| Lustre Filesystem | ||
| http://www.lustre.org/ | ||
| Developed by Cluster File Systems | ||
| http://www.clusterfs.com/ | ||
| POSIX compliant, Open Source, parallel file system | ||
| Separates metadata and data objects to allow for speed and scaling | ||
| 8 Checkpoint Servers with 10GigE and Infiniband connections | |
| 5 Lustre OSTs connected via Infiniband with 2 SCSI disk shelves (RAID5) | |
| Lustre meta-data server (MDS) connected via Infiniband |
| Laplace Solver w/ Checkpoint Recovery | |||
| Using 16 Application Gateways (32 GigE connections): 31.1 Gb/sec | |||
| Only 32 Lemieux nodes were available | |||
| IPERF | |||
| Using 17 Application Gateways + 3 single GigE attached machines: 35 Gb/sec | |||
| Zero SONET errors reported on interface | |||
| Over 44TB were transferred | |||
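To put the two results in per-link terms, the sketch below divides each aggregate by the number of GigE connections involved and compares it with the 990 Mb/sec per-card ceiling quoted earlier. It assumes each of the 17 gateways in the IPERF run used both of its GigE ports (17 x 2 + 3 = 37 connections), which the slide implies but does not state; the function name is hypothetical.

```python
GIGE_PEAK_MBPS = 990  # per-card ceiling from the AGW slide


def per_link_mbps(aggregate_gbps, links):
    """Average rate per GigE connection, in Mb/s."""
    return aggregate_gbps * 1000.0 / links


runs = {
    "Laplace solver (16 AGWs, 32 GigE)": (31.1, 16 * 2),
    "IPERF (17 AGWs + 3 single GigE)": (35.0, 17 * 2 + 3),
}

for name, (gbps, links) in runs.items():
    rate = per_link_mbps(gbps, links)
    share = 100.0 * rate / GIGE_PEAK_MBPS
    print(f"{name}: {rate:.1f} Mb/s per link ({share:.1f}% of card peak)")
```

Under that assumption the checkpoint run averaged about 972 Mb/sec per connection and the IPERF run about 946 Mb/sec, both close to the per-card ceiling.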
| AGWs | |||
| qsub command now has AGW option | |||
| Can do accounting (and possibly billing) | |||
| MySQL database with Web100 stats | |||
| Validated that the AGW was a cost-effective solution | |||
| OC-768 Metro can be done by mere mortals | |||
| Application receiver | ||||
| Laplace solver ran at PSC | ||||
| Checkpoint receiver program tested / run at both NCSA and SDSC | ||||
| Ten IA64 compute nodes as receiver | ||||
| ~10 Gb/sec Network to Network (/dev/null) | ||||
| 990 Mb/sec * 10 streams | ||||