Re: [awips2-users] CPU load on EDEX/LDM server

I don't think I've changed the pqact.conf from the default.  It's attached just 
in case.

_____________________________________________
Kevin Tyle, Systems Administrator 
Dept. of Atmospheric & Environmental Sciences   
University at Albany
Earth Science 235, 1400 Washington Avenue                        
Albany, NY 12222
Email: ktyle@xxxxxxxxxx
Phone: 518-442-4578                             
_____________________________________________


-----Original Message-----
From: Michael James [mailto:mjames@xxxxxxxxxxxxxxxx] 
Sent: Thursday, October 24, 2013 3:41 PM
To: Tyle, Kevin R
Cc: awips2-users@xxxxxxxxxxxxxxxx
Subject: Re: [awips2-users] CPU load on EDEX/LDM server

from qpid-stat -q -S msg -L 10


This means that 924k messages (1 msg per text product from the LDM) went into the 
Text decoder and 873k messages came out, leaving 50.4k text products waiting to be 
decoded.
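
To make that arithmetic easy to repeat, here is a minimal Python sketch that turns the rounded figures from the quoted qpid-stat output into numbers and reports the backlog per queue. The function names are made up for illustration, and the suffix table (k, m, g) is an assumption about how qpid-stat abbreviates its counts. Because the printed figures are rounded, msgIn minus msgOut will not necessarily match the msg column exactly.

```python
# Assumed multipliers for the suffixes seen in the qpid-stat summary columns.
SUFFIXES = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}


def parse_count(text):
    """Convert a rounded figure such as '50.4k' or '924k' to a float."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        return float(text[:-1]) * SUFFIXES[text[-1]]
    return float(text)


def backlog_report(name, msg, msg_in, msg_out):
    """Summarize one queue row: current depth and share of input still waiting."""
    depth = parse_count(msg)
    received = parse_count(msg_in)
    delivered = parse_count(msg_out)
    return {
        "queue": name,
        "depth": depth,
        "in_minus_out": received - delivered,
        "pct_waiting": 100.0 * depth / received,
    }


# Values copied from the two queue rows quoted in this thread.
rows = [
    ("Ingest.Text", "50.4k", "924k", "873k"),
    ("Ingest.Grib", "28.6k", "315k", "287k"),
]

for row in rows:
    r = backlog_report(*row)
    print(
        f"{r['queue']}: {r['depth']:.0f} waiting "
        f"({r['pct_waiting']:.1f}% of msgIn), "
        f"msgIn - msgOut = {r['in_minus_out']:.0f}"
    )
```

A depth that keeps growing between successive runs is the sign that the decoders are falling behind; a single snapshot only shows how far behind they are at that moment.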

So yes, this means that the LDM is writing incoming data to Raw Data Store 
(/data_store) faster than the EDEX decoders can decode and write to hdf5 
Processed Data Store (/awips2/edex/data/hdf5/).  The grib decoder has about 29k 
products "waiting to be decoded".

Yeah, the system is I/O bound.  If you tail the text or ingestGrib logs in 
/awips2/edex/logs/ you should see some pretty high latency times.

Kevin, have you changed any entries in pqact.conf since installing?  The
NEXRAD3 entry, for example, only requests certain high-res products, not 
everything, since standalone EDEX cannot process the entire NEXRAD3 feed 
without faster disk writes.  I'm currently testing the 13.5.1 server on identical 
machines, one with a solid state drive, another with a striped RAID.

About that, I hope to make available the next beta (14.1 possibly) in the 
coming months and by then will have better disk recommendations for 
you all.   The SSD standalone server is currently handling all of 
NEXRAD3, NGRID, NIMAGE, WMO feeds as well as 0.5 deg GFS from CONDUIT, all 
without any qpid backup.  I need to turn on the GFS ensemble ingest to really 
turn the fire hose on and test.

I should also mention a problem I've seen.  I call them hiccups because I can't 
pin down an exact cause.  Here's an example: turn on EDEX services and the LDM 
with default pqact entries, watch LDM/EDEX ingest and decode run well for day 
after day, until something "hiccups" and the NEXRAD files or grid files start 
to back up.  This sort of thing happens more frequently with 13.4 and 13.5, 
which is why I have kept the beta testers on 13.2, where these hiccups are less 
frequent.  I think I ran UPC 13.2.1 for two months without any hiccups once; 
that's the longest stretch.

The solution is

as ldm:
* ldmadmin stop

as root:
* edex stop
* rm -rf /awips2/qpid/messageStore/  (there's a subdirectory in there, I forget 
exactly what it's called, in 13.5.1 it's /edex)
* edex start

as ldm again:
* ldmadmin start
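
For anyone who wants to script the sequence above, here is a rough Python sketch that only prints the steps unless told otherwise. The commands and the messageStore path are copied from the list above; the wrapper functions, the use of "su - ldm -c" to switch accounts, and the assumption that the script is started as root are all mine, so treat it as a starting point and check the messageStore subdirectory on your own system before running anything for real.

```python
import subprocess

# Steps copied from the message: (account to run as, command).
# The messageStore path is the one quoted in the email; the subdirectory
# under it differs between releases, so verify it before a real run.
RECOVERY_STEPS = [
    ("ldm", "ldmadmin stop"),
    ("root", "edex stop"),
    ("root", "rm -rf /awips2/qpid/messageStore/"),
    ("root", "edex start"),
    ("ldm", "ldmadmin start"),
]


def build_command(user, command):
    """Build the argv for one step, assuming the script itself runs as root."""
    if user == "root":
        return ["sh", "-c", command]
    # Assumption: su is available and the other account is a login account.
    return ["su", "-", user, "-c", command]


def run_recovery(dry_run=True):
    """Print each step in order; execute it only when dry_run is False.

    Returns the list of argv lists so the plan can be inspected.
    """
    planned = []
    for user, command in RECOVERY_STEPS:
        argv = build_command(user, command)
        planned.append(argv)
        print(f"[{user}] {command}")
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(argv, check=True)
    return planned


if __name__ == "__main__":
    run_recovery(dry_run=True)
```

Keeping dry_run=True as the default means a stray invocation cannot delete the message store; the order matters because the LDM has to be stopped before EDEX and restarted only after EDEX is back up.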





On 10/24/2013 01:18 PM, Tyle, Kevin R wrote:
> Queues
>    queue        dur  autoDel  excl  msg    msgIn  msgOut  bytes  bytesIn  bytesOut  cons  bind
>    ===========================================================================================
>    Ingest.Text  Y                   50.4k  924k   873k    3.49m  64.1m    60.6m     2     2
>    Ingest.Grib  Y                   28.6k  315k   287k    2.62m  28.9m    26.3m     8
>

Attachment: edex_pqact.conf
Description: edex_pqact.conf
