
20050629: Some LDM questions



>From: Celia Chen <address@hidden>
>Organization: NCAR/RAL
>Keywords: 200506292232.j5TMWKjo010289 IDD

Hi Celia,

>I was looking for some info on rpc.ldmd on your website and found this
>statement:
>
>rpc.ldmd - main LDM server. One per incoming and outgoing feed.
>
>Could you please tell me what a "feed" is in this case?

A feed here refers to the rpc.ldmd process that services a request
made by a downstream host for all or part of one or more IDD feed
types in a single request.  For each request line in a downstream's
~ldm/etc/ldmd.conf file, one rpc.ldmd will be run on the upstream
host to which the request is made.
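As a quick sanity check of that correspondence (a hedged sketch; it
assumes a Unix ps(1), that the LDM server processes are named rpc.ldmd
as in LDM 6.x, and the conventional ~ldm account layout):

```shell
# Count running LDM server processes (prints 0 if none are running).
ps -e -o comm= | grep -c '^rpc.ldmd' || true

# Count active request lines in the LDM configuration file.
grep -c '^request' ~ldm/etc/ldmd.conf 2>/dev/null || true
```

On an upstream, the first number should roughly track the number of
downstream request lines being serviced plus your own requests.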

>Does the feed size affect the processing of rpc.ldmd?  

I am not sure I understand the question.  If you mean can a feed request
contain too much data, then the answer is yes, but this typically
happens when the requesting machine is electronically distant from the
upstream.  Here in UCAR, one should be able to request all available
data from any other UCAR machine without introducing latencies.  Please
note that I am talking about requesting data and bringing it into the
local LDM product queue, not about the processing of that data out of
the local queue.

>RAL's ldm host chisel is now configured to be a relay machine. We just
>saw 115  rpc.ldmd running on it this afternoon. The load average is
>greater than 10 most of the time.  We are wondering what we can do to
>make it work better.  The pq is set to 4GB but the memory is only 2GB.
>Will it help if we increase the memory to 4GB?  

A few things here:

- if chisel is trying to process the data using pqact, then you
  will see performance degradation depending on what it is processing

- when the LDM queue size is larger than real memory, there is a real
  possibility that the machine will spend a lot of time swapping memory
  to disk.  This will definitely happen when downstreams disconnect for
  a while and then reconnect (or new connections are initiated), since
  the request will be for data that is further back in the queue, and
  the operating system will be forced to swap that portion of the
  queue back into memory to service the feed request

- yes, adding more memory will help a great deal!  We have found that
  the best way to keep a relay machine working efficiently (not thrashing
  and with low load averages) is to have the size of the queue be some
  fraction of the size of real memory.  The size of that fraction is
  an inverse function of the number of downstream AND upstream feed
  requests that are being serviced.  For instance, the data server
  back ends of the cluster idd.unidata.ucar.edu all have 8 GB LDM
  queues, but real memory is 12 GB.  Each of these data servers is
  capable of handling over 200 downstream feed requests without
  causing the OS to swap pieces of the queue to/from disk.  Because
  of this setup, the load averages on the data servers are typically
  below 1.0 even when the number of connections approaches 200.
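The sizing rule above can be sketched numerically (a hedged example;
the 2/3 fraction simply mirrors the 8 GB queue on a 12 GB machine
cited above, and the fraction should shrink as the number of feed
requests grows):

```shell
# Suggested LDM queue size as a fraction of physical RAM.
mem_gb=12                      # physical RAM on the relay (example value)
queue_gb=$(( mem_gb * 2 / 3 )) # 2/3 fraction, per the example above
echo "suggested queue size: ${queue_gb} GB"
```

For chisel (2 GB of RAM today, a 4 GB queue), this rule would call for
either much more RAM or a much smaller queue.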
  
>Also, we would like to know what kind of configuration you have on thelma
>(uni1/uni?).  What type of machine is thelma and what is its memory size?

thelma.ucar.edu is now the same system as idd.unidata.ucar.edu.
idd.unidata.ucar.edu is a cluster composed of:

1 IPVS 'director'
3 IPVS real server (data server) backends

Each of the data server backends is a Sun V20Z dual Opteron 1U
rackmount box with 12 GB of RAM and 2x36 GB 10,000 RPM SCSI disks.  The
director is also a Sun V20Z, but it has less RAM (4 GB) than the data
servers.  The director does not need to be such a well equipped
machine.  It could be a single processor system with a modest amount of
memory.  In fact, we will be replacing the V20Z with a Dell 2850
sometime in the near future to better use the V20Z.

By the way, we recently ran a stress test of our cluster's ability to
send data to downstream hosts, and we were able to send an average of
500 Mbps (5.4 TB/day) over a 3 day period _without_ the introduction of
product latencies.  During this test, the two data servers being used
in the test and the cluster director were basically idling.  Because of
this, we are convinced that the limitation in the cluster's ability
to relay data was the size of the network pipe, and this is 1 Gbps for
UCAR/NCAR.

>When we have primary and alternate feed setup in ldmd.conf, does each also
>generate a rpc.ldmd process?

Yes.  Each feed request generates a separate rpc.ldmd on the requesting
machine AND on the server from which it is requesting data.  Since the
number of processes will affect the performance of any machine, we
strongly recommend combining feed requests where/when possible.  For
instance, if you have RAL machines requesting data from chisel, then
there is no need to have separate feed requests for different data
streams.  What I mean is that a set of requests like:

request IDS|DDPLUS .* chisel.rap.ucar.edu
request HDS .* chisel.rap.ucar.edu
request NIMAGE .* chisel.rap.ucar.edu
request NNEXRAD .* chisel.rap.ucar.edu
 ...
request CONDUIT .* chisel.rap.ucar.edu

would be much better serviced with a single request line like:

request ANY .* chisel.rap.ucar.edu

** IF ** the requesting machine is on the same subnet or in the same
LAN, or electronically close.  One would not combine feed requests when
the machines are electronically distant and/or if the network bandwidth
between the two machines is not great (UCAR has gigabit networking, so
this is not an issue).
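For the primary/alternate case mentioned above, my understanding is
that LDM 6 handles the redundancy automatically: request the same feed
from two upstreams and the LDMs negotiate which connection runs in
primary mode and which in alternate mode.  A sketch (the hostnames
below are hypothetical):

```
request ANY .* upstream1.example.edu
request ANY .* upstream2.example.edu
```

Each of these lines still costs one rpc.ldmd on the requester and one
on each upstream, so weigh the redundancy against the process count.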

>Could you get back to me asap?  We are going to have a LDM meeting
>tomorrow afternoon and need to have some answers to the above questions.

I hope that this was timely enough.

>Any info will be greatly appreciated.

We would be glad to help you (RAL) develop an efficient setup for
your LDM/IDD use.  Please let us know if you would like some help!

Cheers,

Tom
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.

