20030324: pqact causes unusually high CPU usage on LDM 6/Solaris Intel



>From: Robert Mullenax <address@hidden>
>Organization: NMSU/NSBF
>Keywords: 200303250413.h2P4D9B2010509 LDM-6 pqact

Robert,

>I converted our two SPARC Solaris 8 machines over to LDM 6 about 10 days ago.
>Everything is working fine with those..but tonight I built and installed LDM 6
>on our Solaris 8 Intel machine (wxmcidas.nsbf.nasa.gov,400MHz PII) and now I
>am seeing pqact using 8 to at times 33% of CPU. Under LDM 5 it never used
>more than 2-3%.

It is possible for the load generated by pqact to go up under LDM-6
simply because your machine is now getting products from its upstream
feed faster.  We did not, however, see this effect on our heavily
loaded Solaris x86 box here at the UPC.  pqact itself should not behave
any differently, since its code did not change from LDM-5.x to LDM-6.x.

>My ldmd.conf entries just contain ".*" for HDS, UNIWISC, and IDS|DDPLUS|FSL2,
>and NNEXRAD.

Those entries specify the patterns for the data to request.  pqact, on
the other hand, does its work based on the actions defined in
~ldm/etc/pqact.conf.

>The feed source is our SPARC Solaris 8 box, psnldm.nsbf.nasa.gov.
>
>I can't sustain that kind of load as the box has only one CPU and must serve
>as a workstation as well.

I would run an experiment to see if the higher-than-normal CPU use by
pqact is caused by LDM-6.0.2 or by something else: try switching back
to LDM-5.x and see if things quiet down.  If they do, switch forward
to LDM-6.0.2 again and see if the high CPU use returns (switching back
and forth should be very easy if you followed our recommendations and
have a runtime link that points to the version to be used).
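
Switching versions amounts to repointing that runtime link.  Here is a
minimal sketch, assuming each version is installed in its own directory
under ~ldm.  The function name switch_runtime and the directory names
in the usage comments are placeholders, not part of the LDM; use
whatever names your installation actually has:

```shell
#!/bin/sh
# Repoint the "runtime" symbolic link at a given LDM version directory.
# switch_runtime is a hypothetical helper, not an LDM utility.
# Usage: switch_runtime <ldm-home> <version-dir>
switch_runtime() {
    home=$1
    version=$2
    # Refuse to point the link at a directory that does not exist.
    if [ ! -d "$home/$version" ]; then
        echo "no such version directory: $home/$version" >&2
        return 1
    fi
    # Remove the old link first; otherwise ln would create the new
    # link inside the directory the old link points to.
    rm -f "$home/runtime"
    ln -s "$version" "$home/runtime"
}

# Typical use, as 'ldm' (directory names are placeholders):
#   ldmadmin stop
#   switch_runtime "$HOME" ldm-5.x.y
#   ldmadmin start
```

Stop the LDM before changing the link and start it again afterwards so
that the running programs all come from the same version.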

The other thing you could try is setting up your machine to run more
than one invocation of pqact.  Chiz did this on our x86 box so that
pqact wouldn't get behind in its processing (our x86 box is receiving
all of the IDD data including a good fraction of the CONDUIT feed).
Here is what the ldmd.conf entries for pqact on our x86 box look like:

#
# Exec GEMPAK specific pqact processing
exec    "pqact -f ANY-NNEXRAD /local/ldm/etc/GEMPAK/pqact.gempak_decoders"
exec    "pqact -f MCIDAS /local/ldm/etc/GEMPAK/pqact.gempak_images"
exec    "pqact -f NNEXRAD|WSI /local/ldm/etc/GEMPAK/pqact.gempak_nexrad"
exec    "pqact -f WMO /local/ldm/etc/GEMPAK/pqact.gempak_nwx"
exec    "pqact -f WMO|SPARE|NMC2 /local/ldm/etc/GEMPAK/pqact.gempak_upc"

As you can see, Chiz created several GEMPAK-specific pqact.conf files
and runs separate invocations of pqact using each one.

One last thing I would try (probably first) is to see if your queue
is somehow damaged:

<as 'ldm'>
cd ~ldm
ldmadmin stop
pqcat -s -q data/ldm.pq -l-

This runs a sanity check on the queue; it will tell you if there is
a mismatch between the number of products seen in the queue and the
number of structures used to access those products.  If there is a
mismatch, I would delete and remake your queue:

ldmadmin delqueue
ldmadmin mkqueue

>I built LDM 6 with the Sun SC5.0 Intel compiler.

We built LDM-6.x using the exact same setup here.

Tom