
Re: 20020208: LDM resources under Linux



David Wojtowicz wrote:
> 
> I went to the boss this morning and got permission to buy a replacement
> server.  Just ordered it now.    Dual Athlon MP 1900+ processors,
> 2GB of RAM, 100GB of fast disk.   That'll hopefully help.  :-)
> 
> --david
> 


That sounds good!  Glad to hear it.

Anne




> >Hi David,
> >
> >David Wojtowicz wrote:
> >>
> >>  Hi Anne,
> >>
> >>    Actually, I was reporting a problem primarily with flood.atmos.uiuc.edu
> >>  which serves NMC2 and NNEXRAD|FNEXRAD.  squall.atmos.uiuc.edu does
> >>  everything else.  To answer some of your questions...
> >>
> >
> >Excuse me for confusing flood with squall.
> >
> >>  The load on flood can range from 3.0 to 10.0+ depending on the time
> >>  of day.  There are presently 22 rpc.ldmd's running.  If you look at
> >>  top there are usually 5-6 in the "runnable" (R) state, rather than
> >>  sleeping.  That means that many of them are waiting for service from
> >>  the processor.
> >>
> >
> >This does indicate that the CPU isn't fast enough for the load.
> >
> >>  There are no topology listings for NNEXRAD or NMC2 like there are for
> >>  MCIDAS or FOS, so I'm not sure who is actually using us at the moment,
> >>  since not everyone who has an allow line is using it at any given time.
> >>
> >
> >Here are a few command lines to give you a close approximation of who is
> >currently feeding from you.  In the logs directory, do:
> >
> >       grep feed ldmd.log | grep Exit | awk '{ print $5 }' | sort | uniq > exited
> >       grep feed ldmd.log | awk '{ print $5 }' | sort | uniq > feeding
> >       diff feeding exited
> >
> >The diff will report the processes that are still feeding from you.
> >
> >Here is similar info (cleaned up) from the stats you are sending us:
> >
> >TOPOLOGY
> >flood.atmos.uiuc.edu redwood.atmos.albany.edu NMC2
> >flood.atmos.uiuc.edu ice.atmos.uiuc.edu NMC2
> >flood.atmos.uiuc.edu cirrus.atmos.uiuc.edu NNEXRAD|FNEXRAD|NMC2
> >flood.atmos.uiuc.edu measol.meas.ncsu.edu NNEXRAD|FNEXRAD
> >flood.atmos.uiuc.edu waterspout.cst.cmich.edu NNEXRAD|FNEXRAD
> >flood.atmos.uiuc.edu anvil.eas.purdue.edu NNEXRAD
> >flood.atmos.uiuc.edu yin.engin.umich.edu NNEXRAD
> >flood.atmos.uiuc.edu ldmdata.sws.uiuc.edu NNEXRAD
> >flood.atmos.uiuc.edu twister.sbs.ohio-state.edu NNEXRAD|FNEXRAD
> >flood.atmos.uiuc.edu aeolus.valpo.edu NNEXRAD
> >flood.atmos.uiuc.edu zelgadis.geol.iastate.edu NNEXRAD
> >flood.atmos.uiuc.edu squall.atmos.uiuc.edu NNEXRAD|FNEXRAD
> >flood.atmos.uiuc.edu papagayo.unl.edu FNEXRAD
> >flood.atmos.uiuc.edu data2.atmos.uiuc.edu FNEXRAD
> >TOPOEND
> >
> >Looks like you are feeding CONDUIT to two local sites, as well as
> >Albany.
> >
> >It would be interesting to know how much NEXRAD data your downstream
> >sites are requesting.   We can get some sense of that here from an
> >analysis of the incoming stats.  You can also grep through your logs for
> >'Start' or 'RECLASS' to see what feed types and patterns sites are
> >requesting (but if the connection has been stable and products timely
> >for a while you might not see any of these).
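> >
> >For example, something like this would pull those lines out (a sketch;
> >the exact log wording can vary between LDM versions, so adjust the
> >pattern if needed):
> >
> >       egrep '(Start|RECLASS)' ldmd.log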
> >
> >
> >>  I'm guessing that it is the bottleneck, especially during the very
> >>  busy 12Z run distribution via CONDUIT.
> >>
> >>  I'm concerned both that we are not servicing our downstream sites
> >>  properly and that we are losing CONDUIT products ourselves when
> >>  latencies exceed one hour.
> >>
> >
> >
> >I can see that redwood is only getting a small percent of all the
> >CONDUIT products you're getting (are they asking for less?).  And, the
> >delay between your site and redwood's site is significant - I'm seeing
> >10 to 15 minutes.
> >
> >Regarding NEXRAD, both umich and purdue are timely, although they must
> >not be requesting a lot.
> >
> >
> >>  Again, the machine does nothing but relay (no pqact or other
> >>  significant processes).  It is a 400 MHz PC running Linux.  Granted,
> >>  this is "slow" now, but it was state of the art when we bought it for
> >>  CONDUIT about two years ago.  So I'm wondering what specs I should be
> >>  looking for in a replacement, or whether a replacement would even help
> >>  if the machine is not the bottleneck.
> >>
> >>  Thanks.
> >>
> >>  --david
> >>
> >
> >Yes, your machine seems overloaded.  And I suspect it is introducing
> >further latencies into the CONDUIT feed.  But, you are receiving and
> >relaying *a lot* of data.
> >
> >Regarding a replacement, for a "relay only" machine I would first
> >consider CPU speed and RAM.  Having enough RAM to hold a sufficient
> >queue is important: the queue is a memory-mapped file, so if it fits in
> >RAM the system can serve it largely from memory.  Disk speed is
> >important for the "read/write through" aspect of the queue, so a fast
> >disk certainly wouldn't hurt.
> >
> >For CONDUIT data, the biggest hour is currently 1.2 Gbytes.  NEXRAD is
> >now averaging 80 Mbytes/hour, so you need somewhat more than that to
> >handle a maximum hour.  A queue size of 1.4 Gbytes should accommodate
> >both.  For this volume, I'd say a gigabyte of RAM or more would be
> >best.  Room to grow is always good.  We have several sites successfully
> >running with nearly 2 Gbyte queues.
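> >
> >As a rough sketch, a 1.4 Gbyte queue could be created with pqcreate
> >(the queue path here is just an example - use your own, and stop the
> >LDM before recreating the queue):
> >
> >       pqcreate -s 1400000000 -q /usr/local/ldm/data/ldm.pq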
> >
> >How big is your queue now?  And how much RAM do you have?  And, how old
> >is the oldest product in your queue?  (Use pqmon to determine the
> >latter.  The last field is the age of the oldest product in the queue in
> >seconds.)
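> >
> >For example (again, substitute your own queue path):
> >
> >       pqmon -q /usr/local/ldm/data/ldm.pq
> >
> >Since the queue is overwritten oldest-first, that age also tells you
> >roughly how long a product survives in your queue.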
> >
> >Regarding your current configuration, Chiz said he could have redwood
> >feed elsewhere if that's better for you.  Or, could one of your local
> >sites relay to the other to reduce the load on flood?  Or, we could
> >reduce the NEXRAD feeds, although I'm guessing that's not making a big
> >impact on your machine, assuming the other NEXRAD sites are being
> >judicious like Purdue and UMich.
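> >
> >For instance, if cirrus were to relay to your other local sites, each
> >of their ldmd.conf files would just need a request line pointing at
> >cirrus instead of flood (the feed type and pattern here are
> >illustrative):
> >
> >       request NNEXRAD ".*" cirrus.atmos.uiuc.edu
> >
> >and cirrus would need a matching allow line for each of them.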
> >
> >Anne
> >--
> >***************************************************
> >Anne Wilson                    UCAR Unidata Program
> >address@hidden                 P.O. Box 3000
> >                                 Boulder, CO  80307
> >----------------------------------------------------
> >Unidata WWW server       http://www.unidata.ucar.edu/
> >****************************************************
> 
> --
> | David Wojtowicz, Sr. Research Programmer
> | Department of Atmospheric Sciences Computer Services
> | University of Illinois at Urbana-Champaign
> | email: address@hidden  phone: (217)333-8390

-- 
***************************************************
Anne Wilson                     UCAR Unidata Program            
address@hidden                 P.O. Box 3000
                                  Boulder, CO  80307
----------------------------------------------------
Unidata WWW server       http://www.unidata.ucar.edu/
****************************************************