
20050630: Some LDM questions (cont.)



>From: Celia Chen <address@hidden>
>Organization: NCAR/RAL
>Keywords: 200506292232.j5TMWKjo010289 IDD tuning

Hi Celia,

>Thanks so much for your quick reply to my questions.

No worries.

>We also 
>appreciate your offer to help out with our ldm issues. We are
>thinking of inviting you or someone in your group to our next 
>LDM meeting to help us understand more about the LDM. Will you 
>be able to do that?

Yes, depending on when your next meeting is.  As you know, our
workshop series will begin at the end of the month, and there are
a number of us who will be fairly consumed by preparations.

>Also, I want to point out that chisel has been  a total relay machine
>for over a month now - not running pqact at all.

OK.  While trying to get the NSSL QPESUMS products that RAL is
receiving on the EXP feed, we learned that chisel is ingesting about
5 GB of data per hour.  We didn't realize this before because the
real-time statistics reporting routine, rtstats, defaults to reporting
values for all feeds except EXP (that decision was mine; it seemed
like a good idea at the time, but looks like a bad one now).  As a
first step toward better monitoring of chisel's ingest processing,
please do the following:

<as 'ldm' on chisel>
cd ~ldm/etc
-- edit ldmd.conf and change:

exec   "rtstats -h rtstats.unidata.ucar.edu"

to:

exec   "rtstats -f ANY -h rtstats.unidata.ucar.edu

cd ~ldm
ldmadmin restart

This will cause chisel to start sending in numbers for all of the feeds
that it is receiving.
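If you want to confirm that the change took effect after the restart,
a quick check (output format varies by system) is:

<as 'ldm' on chisel>
ps -ef | grep rtstats
-- the rtstats command line should now include the '-f ANY' flag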

The other thing that I would like to do is instrument chisel with a
script we run on all of our LDM relay hosts.  This script requires
Tcl/Tk, so if that is not already installed on chisel, it would help
if it could be installed.  The script monitors system load, available
memory, swap used, I/O wait, and system idle time once per minute (it
is run out of cron in the 'ldm' account), saving the information to a
log file that is easy to review.
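For reference, the gist of what gets recorded could be captured with
something like the following (a minimal sketch in plain sh, not our
actual Tcl script; the paths, cron entry, and 'free'/'vmstat' column
positions are assumptions that vary by OS):

#!/bin/sh
# Once-per-minute system monitor sketch.  Assumed cron entry in the
# 'ldm' account (path is hypothetical):
#   * * * * * /home/ldm/bin/sysmon >> /home/ldm/logs/sysmon.log 2>&1

STAMP=`date '+%Y%m%d.%H%M%S'`

# 1-minute load average (the last three uptime fields are the
# 1/5/15-minute averages)
LOAD=`uptime | awk '{print $(NF-2)}' | sed 's/,//'`

# free memory and swap used, in KB (Linux 'free' layout assumed)
MEMFREE=`free -k | awk '/^Mem:/ {print $4}'`
SWAPUSED=`free -k | awk '/^Swap:/ {print $3}'`

# %idle and %iowait from the second vmstat sample; the first sample
# is an average since boot (columns 15 and 16 on Linux vmstat)
CPU=`vmstat 1 2 | tail -1`
IDLE=`echo $CPU | awk '{print $15}'`
IOWAIT=`echo $CPU | awk '{print $16}'`

echo "$STAMP load=$LOAD memfree=${MEMFREE}k swapused=${SWAPUSED}k \
idle=${IDLE}% iowait=${IOWAIT}%"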

>One of our system
>folks said there is "no memory problem" when he was watching it 
>running over 110 rpc.ldmd with load avg of 16.

A "memory problem" in the sense that I was talking about would show
up as the system spending a lot of time in I/O waits.
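A quick way to see whether that is happening is to watch the I/O wait
column in vmstat (on Linux the column is 'wa'; names differ on other
systems):

<on chisel>
vmstat 5
-- sustained large values in the 'wa' column, especially together
-- with low free memory, are the signature of the problem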

>I don't understand how 
>memory map works with the LDM. Do we need to run pqact to activate it? 

No.  The LDM queue is memory mapped by all processes that access it.
This includes pqact if it is running, but is mainly done by the
rpc.ldmds that are bringing in or sending out data.
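If you are curious, you can see the mapping directly with pmap (the
queue location below is the usual default and may differ on chisel):

<as 'ldm' on chisel>
pmap `pgrep rpc.ldmd | head -1` | grep ldm.pq
-- each rpc.ldmd maps the queue file (typically ~ldm/data/ldm.pq,
-- depending on how the LDM was configured) into its address space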

By the way, the old thelma.ucar.edu, a Sun SunFire 480, routinely
handled 140 downstream connections and 12 upstream connections without
introducing latency.  The load average on it would vary between 5 and
20, but the system remained completely responsive.  Even though the
old thelma worked well, we moved to a cluster approach so that we can
better handle the higher-volume datastreams that will be coming in
the not-too-distant future.

>Thanks in advance.

No worries.  Again, we will be happy to work closely with RAL folks
to get your system tuned up.  Our ability to do this in the next
one and a half months is governed by our workshop commitments.

Cheers,

Tom
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.