
20050106: clock on brisa is 17 years in the future (cont.)



>From:  David Garrana Coelho <address@hidden>
>Organization:  UFRJ
>Keywords: 200412302001.iBUK1rlI005807 IDD-Brazil LDM clock Linux latency

Hi David,

I was skiing yesterday, so I didn't see your email until this morning.

>       Time is running way too fast for me too. Hope it's just because I 
>am having fun and not noticing it...=)

This would be very nice :-)

>I wish the holidays went fine for 
>you too. I went "totally offline" during new year's eve/day, and I am 
>paying the price until today (checking backlog, so to speak), that's the 
>cause of the delay in my reply.

I was not able to go offline, but I did manage to have a very nice
holiday season.  We have a party every New Year's day, and it is always
a lot of fun.  The good thing about this year is that the next day
(January 2) was a Sunday, so we could recover!

>Re: Brisa back from the future
>
>       Can't even imagine what happened, since it was an isolated event. 
>The problem solved by itself, because no one even logged by that time.

OK.

>       I don't know if you noticed, but solon is suffering with awful 
>latencies since December 30.

I did not notice -- I have been working on other things lately.

I took a quick look at the latencies on solon and noticed that the very
high latencies are, for the most part, coming from duplicate feed
requests; the primary feed request latencies are OK (I looked at
HDS and IDS|DDPLUS).  This can happen when the amount of data in an
upstream's product queue is significantly more than in the downstream's.  A
new product is sent immediately to the downstream by one upstream and
then it is sent again by a different server at a later time.  This
would happen if/when the connection between the downstream and the
primary upstream is broken and the downstream reconnects.  This
situation has been happening on solon mainly due to my testing of a
cluster of machines acting as idd.unidata.ucar.edu.  An upcoming new
version of the LDM will limit the requests to one hour back in time (by
default; but it is configurable), so it would not request those
products that are very old in alternate upstream LDM queues.  I hope
that this explanation is clear enough, but I am afraid that it might
not be.  The simple explanation is:

- it appears that solon is getting the data from its primary feed
  site, atm.geo.nsf.gov, with little to no latency

- very high latencies (e.g., 15000 seconds) for some products can
  be caused by old products being resent when the feed connection
  is broken and reestablished

- the connections from solon to machines here at Unidata like
  idd.unidata.ucar.edu _have_ been going up and down; they
  have been going up and down because of stress testing we are
  conducting
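
To make the resend effect concrete, here is a rough sketch in Python.
It is not LDM code: the function names, the queue contents, and the
product names are hypothetical, and the one-hour constant only mirrors
the default limit described above.  Latency is taken to be the time a
product is received minus the time it was created.

```python
from datetime import datetime, timedelta

# Assumption: mirrors the "one hour back in time" default described above.
MAX_REQUEST_AGE = timedelta(hours=1)


def latency_seconds(created, received):
    """Latency = time received downstream minus time the product was created."""
    return (received - created).total_seconds()


def products_to_request(queue, now, max_age=MAX_REQUEST_AGE):
    """Keep only the products in an upstream queue that are no older than max_age."""
    return [(name, created) for name, created in queue if now - created <= max_age]


# Hypothetical contents of an alternate upstream's queue at reconnect time.
now = datetime(2005, 1, 6, 12, 0, 0)
upstream_queue = [
    ("old-product", now - timedelta(seconds=15000)),
    ("recent-product", now - timedelta(seconds=30)),
]

# With no limit, a reconnect resends everything, including the old product.
for name, created in upstream_queue:
    print(name, latency_seconds(created, now))

# With the one-hour limit, only the recent product would be requested.
wanted = products_to_request(upstream_queue, now)
print([name for name, _ in wanted])
```

The old product is the one that shows up as a very high latency even
though the primary feed is delivering data promptly.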

The only feed that does not fit this scenario is CONDUIT.  The
latencies I see for solon's reception of CONDUIT are very bad.

>I lowered data volume requested to a
>minimum, but no effect. I already checked with network ppl here, but they 
>are still clueless. Any chance the AMPATH/I2 link is down/with problems? 
>Or are you aware of any possible sources for the problem?

The high solon CONDUIT latencies coupled with the low latencies for
low-volume feeds like IDS|DDPLUS have the characteristic signature of
"packet shaping".  "Packet shaping" is a term we use for artificially
induced feed volume limiting.  This kind of volume limiting could be caused
by AMPATH/I2 problems, but CONDUIT reception on moingobe.cptec.inpe.br
has low latencies.  So, I would initially suspect some sort of a
network problem in the I2 connection to UFRJ only (i.e., not with
AMPATH/I2 in general), or on the UFRJ campus.
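
Why volume limiting produces this signature can be shown with a small
model in Python.  This is only a sketch under assumptions: it treats
each feed as having its own connection with a fixed rate cap, and all
of the rates below are made-up numbers, not measurements from solon.

```python
def simulate_backlog(arrival_rate, rate_cap, seconds):
    """Bytes still queued after `seconds` when data arrives at arrival_rate
    (bytes/s) but the link passes at most rate_cap (bytes/s)."""
    backlog = 0.0
    for _ in range(seconds):
        # Each second, new data arrives and at most rate_cap bytes drain.
        backlog = max(0.0, backlog + arrival_rate - rate_cap)
    return backlog


def approx_latency(arrival_rate, rate_cap, seconds):
    """Approximate latency as the time needed to drain the current backlog."""
    return simulate_backlog(arrival_rate, rate_cap, seconds) / rate_cap


# Hypothetical numbers: a 100 kB/s cap, one low-volume and one high-volume feed.
RATE_CAP = 100_000
low_volume = approx_latency(10_000, RATE_CAP, 3600)
high_volume = approx_latency(150_000, RATE_CAP, 3600)
print(low_volume, high_volume)
```

A feed that stays under the cap never builds a backlog, so its latency
stays near zero, while a feed that exceeds the cap falls further behind
the longer it runs.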

>       I am also considering switching solon to Fedora Core 3 instead of 
>FreeBSD. The filesystem issue (files not showing on ls) always leave me 
>with the feeling that any problem I am having could be caused by this 
>issue, that I don't know how to fix. It's just an idea so far. Do you 
>think it's too extreme or not desirable?

We are now running Fedora Core 3 here at Unidata.  Our experience is
not extensive, but we have had good luck so far.  If you feel that
running Linux on solon helps you to maintain it better (for whatever
reasons), I say that you should go ahead and make the change.
I think, however, that this will require us to reinstall just about
everything on solon: LDM, McIDAS, etc.  This is not a problem, but
it will take some time.

By the way, I will be at the American Meteorological Society (AMS)
Annual Meeting in San Diego starting Sunday and lasting pretty much all
of next week; Waldenio will be there also.  It might be better for you
to hold off on upgrading solon until after I am back so I can jump on
and reinstall McIDAS and the data servers that are running there.

Also, before upgrading solon, please take a close look at the setup
on it to make sure that nothing vital gets lost (e.g., scripts, etc.).

Ciao,

Tom
--
+-----------------------------------------------------------------------------+
* Tom Yoksas                                             UCAR Unidata Program *
* (303) 497-8642 (last resort)                                  P.O. Box 3000 *
* address@hidden                                   Boulder, CO 80307 *
* Unidata WWW Service                             http://www.unidata.ucar.edu/*
+-----------------------------------------------------------------------------+