
[IDDBrasil #XPZ-611749]: Re: 20060609: CONDUIT Latency (cont.)



Hi Waldenio,

re: 10-way split of CONDUIT feed requests 
> Done!

Thanks.  I think that you have probably already seen that the 10-way split
and switching to idd.unidata.ucar.edu has resulted in you getting all of
the CONDUIT data that you are requesting:

CONDUIT volume received by moingobe:
http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_vol_nc?CONDUIT+moingobe.cptec.inpe.br

CONDUIT latency received by moingobe:
http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?CONDUIT+moingobe.cptec.inpe.br

This tells me that the network pipe into CPTEC is large enough to carry the
CONDUIT data successfully, but that there appears to be some sort of artificial
limit on how much data can be moved on any single feed request.
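
For reference, the 10-way split looks roughly like the following in
ldmd.conf.  This is a sketch rather than a copy of your actual
configuration; it assumes the usual approach of splitting on the last
digit of the sequence number at the end of each CONDUIT product ID:

    REQUEST CONDUIT "0$" idd.unidata.ucar.edu
    REQUEST CONDUIT "1$" idd.unidata.ucar.edu
    REQUEST CONDUIT "2$" idd.unidata.ucar.edu
    REQUEST CONDUIT "3$" idd.unidata.ucar.edu
    REQUEST CONDUIT "4$" idd.unidata.ucar.edu
    REQUEST CONDUIT "5$" idd.unidata.ucar.edu
    REQUEST CONDUIT "6$" idd.unidata.ucar.edu
    REQUEST CONDUIT "7$" idd.unidata.ucar.edu
    REQUEST CONDUIT "8$" idd.unidata.ucar.edu
    REQUEST CONDUIT "9$" idd.unidata.ucar.edu

Each pattern matches about one tenth of the products, so each request
connection only has to move about one tenth of the total volume.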

re: we must find out where the limit is and work to fix it
> I have the same opinion. The question is: how can we show whether the
> software is working?

The current 10-way split shows that the LDM works.  What we need to do
next is find out exactly what the limit is.  My guess yesterday that a
10-way split would work was based on 150 MB/hour being successfully received
in the NNEXRAD datastream.  It could be the case that the limit is 300 MB/hour
or some other value.  I would, therefore, like to change the request lines
on moingobe to find out where the limit is.  The first step in this would be
to return to the 5-way ldmd.conf request split, but with the requests going
to idd.unidata.ucar.edu instead of idd.cise-nsf.gov.  Again, yesterday we
were seeing large packet loss on datafeeds out of idd.cise-nsf.gov.  Our
original theory was that the loss was occurring in the machine itself.
Through further investigation, we found that the problem is in a network
switch at NSF.
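
Concretely, the 5-way split I have in mind would look something like the
following (again a sketch, assuming the same last-digit convention as
above, with each request now carrying two digits' worth of products):

    REQUEST CONDUIT "[05]$" idd.unidata.ucar.edu
    REQUEST CONDUIT "[16]$" idd.unidata.ucar.edu
    REQUEST CONDUIT "[27]$" idd.unidata.ucar.edu
    REQUEST CONDUIT "[38]$" idd.unidata.ucar.edu
    REQUEST CONDUIT "[49]$" idd.unidata.ucar.edu

If the per-connection limit is somewhere between one fifth and one tenth
of the full CONDUIT volume, this split should start dropping data again,
which is exactly the signal we are looking for.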

> Alex Almeida and I tried to detect a sharp change in the latencies when
> the data volume crosses some number (the trigger number). So far, our
> study is inconclusive.

The good news is that the study you and I are conducting _is_ conclusive.

> I have an interesting case for you. I'll send it to you in a separate
> e-mail (in Portuguese - sorry).

No worries about the other information being in Portuguese.

> In one particular weekend, our transfers became very good, just as if
> the "artificial" device had been removed. On Monday at 9h everything
> returned to the way it was before (end of the magic).

Interesting.  It could be the case that whatever is imposing the limit
does so as a function of overall bandwidth use.  This would make sense.

> I am looking for the statistics...

Please give me your reactions to the results of the request modifications
we did yesterday.
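
If you want to double-check things from your side independently of the
rtstats pages, a notifyme against the upstream is a quick sanity test
(this only lists what the upstream would send; the one-hour offset is an
arbitrary choice):

    notifyme -vl- -f CONDUIT -o 3600 -h idd.unidata.ucar.edu

If the product notifications scroll by but the data is not showing up in
your queue, the problem is on the request side rather than at the source.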

re: a separate machine to test with
> The test machine can be my new desktop machine:
> mambae.cptec.inpe.br, an AMD64 running SuSE Linux.

The only problem with SuSE Linux is that it does not support syslogd
with a standard installation.  I am assuming that you are using SuSE
because it is being used at ECMWF.  Is this true?
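
If you do set up the LDM on a SuSE box, the usual workaround is to route
the LDM's local0 syslog facility through syslog-ng, which SuSE ships in
place of the classic syslogd.  Something along these lines should work
(an untested sketch; "src" is the source name in SuSE's stock
/etc/syslog-ng/syslog-ng.conf, and the log path assumes an LDM home of
/usr/local/ldm):

    filter f_ldm      { facility(local0); };
    destination d_ldm { file("/usr/local/ldm/logs/ldmd.log"); };
    log { source(src); filter(f_ldm); destination(d_ldm); };

After adding the fragment, restart syslog-ng so that the LDM's log
messages end up in ldmd.log.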

> I'll install the LDM (maybe tomorrow) and talk to you.
 
Thanks.  This may not be necessary given the success of our 10-way
split yesterday.

Are you willing to try the 5-way split I suggest above even if it means
that you might return to a mode of losing data?

Cheers,

Tom
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: XPZ-611749
Department: Support IDD Brasil
Priority: Normal
Status: Closed