Post by Alexander Pyhalov Post by Udo Grabowski (IMK)
We have 400 TB and are still in...
Could you share some specifications of this installation? I'm interested
in hardware specs, ZFS pool organization, what you use for HA and
backup, any other interesting details, and how you share volumes.
See attached pdf for our current setup. No HA, we rely on NFSv4,
therefore all pools have SSD or Sun F20 flash log devices for
fast sync writes. SGE Grid Engine handles the compute jobs, which number
about 150,000 per 3 days. /home and some rpools are backed up
via IBM Tivoli to our local computing center (a very large
StorageTek tape robot); results are backed up to a very large
11 PB DDN/SONAS storage park.
All old, but extremely reliable Sun storage Thumper X4540 + Sun
Constellation type cluster based on C48 with fast X6275 blades
(768 cores), an X4275 as fast compute server with a 12 TB online
pool, and a Fujitsu RX 900 S2 with 80 cores and 2.1 TB RAM for
larger databases and parallel SMP optimized programs. All connected
with broadband 10Gb/s and 1Gb/s stacked Cisco and BlackDiamond
network switches and a separate local management network for the
The larger Work_Pools are 48 and 96 TB in RAIDZ1 config with
7-8 vdevs of 5-7 disks each (1 or 2 TB disks), plus one RAIDZ2 with
three vdevs of ten 1 TB disks for the precious stuff, all running on
OI 151 a7/a9, and 21 workstations (Sun M27, Dell Precision T3500/T3600,
and Fujitsu Celsius M720) also running OI 151 a7/a9. All of these nodes
are based on a single, well-crafted rpool that we replicate regardless
of hardware with a procedure I described earlier on this list.
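
As a rough cross-check of the pool sizes above, here is a minimal Python
sketch of the nominal RAIDZ capacity arithmetic. The function name is
hypothetical, and the 8 x 7 x 2 TB layout is only one combination picked
from the stated ranges, not a confirmed pool layout. It ignores metadata,
padding, and reserved space, so real usable space will be lower.

```python
def raidz_nominal_tb(vdevs, disks_per_vdev, disk_tb, parity):
    """Nominal data capacity in TB: each vdev gives up `parity` disks."""
    if disks_per_vdev <= parity:
        raise ValueError("a vdev needs more disks than parity devices")
    return vdevs * (disks_per_vdev - parity) * disk_tb


# RAIDZ2 pool from the post: three vdevs of ten 1 TB disks.
print(raidz_nominal_tb(vdevs=3, disks_per_vdev=10, disk_tb=1, parity=2))  # 24

# Assumed RAIDZ1 layout (one choice within the stated 7-8 vdevs, 5-7 disks).
print(raidz_nominal_tb(vdevs=8, disks_per_vdev=7, disk_tb=2, parity=1))  # 96
```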
On the right of the picture, the blue arrows point to the aforementioned
DDN/SONAS storage based on IBM GPFS in our local computing center,
for the stuff we don't work on every day. Which means that most of the
data in our local cellar is indeed used every day in computations;
the current workload produces 3.7 TB of new data every week on average
(we have been doing this for 15 years now; of course we started with a
smaller, but unbelievably expensive, 6 TB FC storage RAID based on DEC
Alpha OSF/1 AdvFS and a 32 core 8x Alpha ES40 cluster).
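
To put the workload figures in perspective, a small Python sketch that
converts the quoted rates (3.7 TB of new data per week, about 150,000
jobs per 3 days) into yearly growth and a job rate. The helper names are
hypothetical, and a 52-week year is assumed.

```python
SECONDS_PER_DAY = 86_400


def yearly_growth_tb(weekly_tb, weeks_per_year=52):
    """Extrapolate average weekly data growth to one year."""
    return weekly_tb * weeks_per_year


def jobs_per_second(jobs, days):
    """Average job rate over the given number of days."""
    return jobs / (days * SECONDS_PER_DAY)


print(round(yearly_growth_tb(3.7), 1))          # 192.4 TB per year
print(round(jobs_per_second(150_000, 3), 2))    # 0.58 jobs per second
```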
The pools are a bit too large for conventional disks: we are often hurt
by 24-day resilvering times for a 2 TB disk due to high fragmentation,
where none of the resilver optimizations via kernel variables help
anymore. Therefore, our current plan is to buy or build a 50 TB
SSD storage machine for the compute cluster workload, which will
eradicate that bottleneck (and others too...).
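
For a sense of scale, a short Python sketch of the effective throughput
implied by a 24-day resilver of a 2 TB disk. It assumes a completely
full disk and decimal units (1 TB = 10^12 bytes); the function name is
hypothetical.

```python
def resilver_rate_mb_s(disk_tb, days):
    """Average rebuild throughput in MB/s for a full disk of `disk_tb` TB."""
    total_bytes = disk_tb * 1e12       # decimal terabytes (assumption)
    seconds = days * 86_400
    return total_bytes / seconds / 1e6


print(round(resilver_rate_mb_s(2, 24), 2))  # 0.96
```

That is roughly 1 MB/s on average, far below the sequential rate of a
conventional disk, which is consistent with the fragmentation described
above.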
What for? It's 10 years of ENVISAT/MIPAS satellite data (infrared
spectra) which we invert to get atmospheric trace gas constituents,
to finally understand the atmospheric chemistry (e.g., ozone
destruction, atmospheric circulation, etc.). Additionally,
about 100 TB of our own balloon data, and data from about 40 other
satellites and ground stations to compare our data with.
So mostly we are doing heavy data movement and have large RAM
requirements (~3-25 GB/core) for our compute jobs.
Hope that gives you enough details.
Dr.Udo Grabowski Inst.f.Meteorology & Climate Research IMK-ASF-SAT
KIT - Karlsruhe Institute of Technology http://www.kit.edu
Postfach 3640,76021 Karlsruhe,Germany T:(+49)721 608-26026 F:-926026