
20041231: Bigbird.tamu.edu status (cont.)



>From: Gerry Creager N5JXS <address@hidden>
>Organization: Texas A&M University -- AATLT
>Keywords: 200412201755.iBKHtxlI027412 IDD

Hi Gerry,

>fdisk and the RAID controller report a 2.399TB array.

Given this, it is totally weird that 'df -k' would report less than
200 GB!

By the way, I changed the scouring scripts to keep _way_ less
data in an attempt to keep the LDM relay working with no latencies.
The original scour invocations are still in ldm's crontab (commented
out), so when the RAID size gets straightened out, all that needs to
be done is to delete the new entries and uncomment the old ones.

>I've started 
>trying some things, so the Bird will be up and down today.  mkfs with 
>xfs, ext3 and jfs ALL find less than 200GB so something's wrong.  The 
>stock kernel should find all 2.4TB according to online and 3Ware docs.

Very strange indeed.

>I'm going to try RAID50 today, which will have the benefit of some 
>redundancy.

OK.

>I'll keep you posted.

Thanks for letting me know what is going on.  I visited bigbird because
I was working on moving folks off of emo.unidata.ucar.edu.  In doing
so, I found two sites feeding NEXRAD2 data from emo when they should be
feeding from bigbird.  As soon as bigbird is back up and stable, I will
work on moving them off of a UPC-based NEXRAD2 feed.

Have a great New Year's eve/day!

Cheers,

Tom
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.

>From address@hidden  Fri Dec 31 12:02:19 2004

I'll let you know.  For what it's worth, when it was up and running 
before the file system filled up, load averages were hovering around 
4.5.  When I was doing 1 simultaneous 160 GB restores, they peaked at 
about 16.  I think, once I figure out the operator error, it's going to 
be working better.

gerry

>From address@hidden  Fri Dec 31 23:05:17 2004

Well, 6+ hours of New Year's Eve partying later...

1.  We had to rebuild the RAID array.  This could have been the root of 
all evil.  We may never know.
2.  We upgraded the 3Ware driver and firmware.  This should have been 
done but I missed it initially.  It may have contributed but it wasn't, 
we think, a show stopper.
3.  We briefly upgraded to the 2.6.10 kernel, but recompile issues were 
precluding us from using it consistently.  Since we were able to get 
2.6.9 to cooperate we left it there.
4.  When we rebuilt the array we did so to create a RAID50 array with 2 
RAID5 units of 4 disks plus one spare.  We've now an aggregate array 
size of 1.7 TB.  NOTE: This is below the magical limit of 2TB for the 
stock Linux kernel.  NOTE2: The RH/Fedora kernel is config'd for Large 
Block Devices already (well, truth be known, that's the config from 
kernel.org, too...) so it should handle RAID larger than 2TB.
5.  A KEY failure element appeared, erroneously, to be that fdisk didn't 
want to perform with a partition size > 2TB.  We believe the combination 
of the fdisk problem and the array build problem conspired to convince 
us that there was an overflow condition and we were seeing active memory 
rewritten to make us think there was a 167GB partition.  Interestingly, 
this is close enough to make us old guys remember the 137GB partition 
limit for logical volumes in older IDE controllers...
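
For readers following the size limits mentioned in items 4 and 5, here is a
minimal Python sketch of the arithmetic. It assumes 512-byte sectors, a
32-bit sector count (as in a classic MS-DOS style partition table), and
28-bit LBA for the older IDE limit. The function and constant names are
hypothetical, chosen only for illustration. Whether a 32-bit wraparound is
what actually happened on bigbird is not established in this thread, and
the result does not reproduce the 167GB figure exactly.

```python
# Illustrative arithmetic only; names are hypothetical, not from any tool.
SECTOR_BYTES = 512  # assumption: classic 512-byte sectors


def max_addressable_bytes(address_bits, sector_bytes=SECTOR_BYTES):
    """Largest size reachable with an address_bits-wide sector count."""
    return (1 << address_bits) * sector_bytes


def wrapped_size_bytes(true_bytes, address_bits=32, sector_bytes=SECTOR_BYTES):
    """Size seen if the sector count is truncated to address_bits bits."""
    sectors = true_bytes // sector_bytes
    return (sectors % (1 << address_bits)) * sector_bytes


# 32-bit sector count: the "magical" 2TB (strictly 2 TiB) limit.
MBR_LIMIT_BYTES = max_addressable_bytes(32)

# 28-bit LBA: the roughly 137GB limit remembered from older IDE hardware.
LBA28_LIMIT_BYTES = max_addressable_bytes(28)

# Array sizes taken from the thread, read as decimal terabytes (assumption).
for label, size in (("2.399TB array", 2_399_000_000_000),
                    ("1.7TB RAID50", 1_700_000_000_000)):
    seen = wrapped_size_bytes(size)
    print(f"{label}: fits={size <= MBR_LIMIT_BYTES}, "
          f"wrapped size={seen / 1e9:.1f} GB")
```

Under these assumptions the 2.399TB array overflows a 32-bit sector count
and wraps to a little under 200 GB, in line with the "less than 200 GB"
that df reported, while the 1.7TB RAID50 array fits without wrapping.
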

The Bird is up and running.  For grins, I'm going to start some restores 
of data.  We won't have the full set of cache, but it's a start.  I'll 
look, in the morning (it's now midnight and I'm tired) at the scour 
scripts to put 'em back where they were...

That's about it.  If you've focused questions, as I'm not real focused 
here, I'll try to answer them.

Happy New Year!
gerry
-- 
Gerry Creager -- address@hidden
Texas Mesonet -- AATLT, Texas A&M University    
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578
Page: 979.228.0173
Office: 903A Eller Bldg, TAMU, College Station, TX 77843

