
[CONDUIT #GXA-551280]: Update on UW-Madison AOS Conduit/0.25 GFS



Hi Pete,

re:
> I am back from vacation and now taking a look at the 0.25 GFS stuff.
> 
> While I was on vacation, I happened to look at my idd stats for conduit
> on the 28th after the data started blasting through, even though I had
> restricted my conduit requests to prevent the 0.25 GFS from coming
> through idd.aos.wisc.edu, just to see if the added volume upstream had
> any impact on the rest of the data, and all looked good.
> 
> Fast forward to today, when I checked my ldmd.conf and realized that,
> while I had put in the request lines for everything but the 0.25 deg
> data, I had
> 
> <boneheadmove>
> NEGLECTED TO COMMENT OUT THE ORIGINAL CONDUIT REQUEST LINES THAT ASK FOR
> THE WHOLE FEED
> </boneheadmove>

:-)
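
For anyone else reading this ticket later, the intended end state in
ldmd.conf is just the restricted REQUEST lines with the original
full-feed lines commented out.  A rough sketch (the two hosts are the
upstreams mentioned later in this note; "RESTRICTED_PATTERN" is only a
stand-in for whatever extended regular expression was actually used to
exclude the 0.25 degree GFS):

  # original full-feed requests, now disabled
  #REQUEST CONDUIT ".*" conduit.ncep.noaa.gov
  #REQUEST CONDUIT ".*" ncepldm4.woc.noaa.gov

  # restricted requests that leave out the 0.25 degree GFS
  REQUEST CONDUIT "RESTRICTED_PATTERN" conduit.ncep.noaa.gov
  REQUEST CONDUIT "RESTRICTED_PATTERN" ncepldm4.woc.noaa.gov

With both sets of lines left active, the ".*" requests cover everything
the restricted ones do (and more), so the full feed, 0.25 degree GFS
included, keeps right on coming.  An 'ldmadmin restart' is needed for
ldmd.conf changes to take effect.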

re:
> Which means that we have been ingesting and relaying the entire 0.25 deg
> GFS set along with everything else since July 28th. Go figure..

I noticed the 0.25 degree GFS addition to your ingest via idd.aos.wisc.edu's
rtstats reporting.  Since I did not notice any major difference in
latencies, I did not bother to log in to your machine to tweak things.

re:
> Looking at my latencies, I see some spikes up to around 300 sec
> (presumably) when the 0.25 data is coming in, but no other issues.
> 
> http://rtstats.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?CONDUIT+idd.aos.wisc.edu

I think you will see the same kind of latency spikes for all of
the sites relaying CONDUIT, so this is not unusual.

re:
> My volume graph clearly shows that we've been getting the 0.25 data,
> with peaks of 20+ GB/hr every 6 hours
> 
> http://rtstats.unidata.ucar.edu/cgi-bin/rtstats/iddstats_vol_nc?CONDUIT+idd.aos.wisc.edu

Yup.  Your volumes are a bit lower than what we are seeing here in
the UPC.  More on this at the bottom of this email.

re:
> I have not been saving any of the data yet, but so far I've seen no
> adverse impact on my systems or heard of any from our downstreams as a
> result of this.

OK.

re:
> My queue on idd.aos.wisc.edu is 24 GB (32 GB memory total) but only 1/4
> of the memory slots are full, so I can definitely add more if I need to.

Can you send the output of 'pqmon'?
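
For reference, running 'pqmon' as the LDM user with no arguments should
report on the default product queue; the '-q' form below is only needed
if your queue lives somewhere non-standard (the path shown is just an
example):

  pqmon
  pqmon -q /home/ldm/var/queues/ldm.pq

The "age" column of the report is the age in seconds of the oldest
product in the queue, which is the number we compare against the
hour-of-data goal mentioned at the bottom of this email.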

re:
> The machine only has two ethernet ports on it, one of which is 1 Gbps
> in/out to the world, and the other of which is 1 Gbps private to my
> local data server here (weather.aos.wisc.edu)

OK.  We had to bond two Gbps Ethernet ports together on all of the
real server backends for idd.unidata.ucar.edu because of the number
of downstream connections we are servicing (now well over 1000).

re:
> So far I have not had any issues with saturating my bandwidth to the
> point where people have lost data, but we really aren't feeding CONDUIT
> data to that many downstreams.

Can you get time series plots of the volume of data flowing out
of your machine?
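
If nothing else is handy, even a minimal sketch like the following
(assuming a Linux host and that eth0 is the public-facing interface)
will give a usable time series: sample the interface's transmit counter
once a minute and difference successive samples to get bytes per minute
out.

  # append a UTC timestamp and the cumulative transmitted-bytes counter
  while true; do
      echo "$(date -u +%Y%m%d%H%M%S) $(cat /sys/class/net/eth0/statistics/tx_bytes)"
      sleep 60
  done >> $HOME/tx_bytes.log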

re:
> From my logs, I'm only feeding maybe a
> half a dozen sites the CONDUIT feed, and many of those are not
> requesting the whole feed. I probably can handle a few more CONDUIT
> feeds, if any of you want to lighten your load a bit.

OK.  Sites generally use sets of redundant REQUESTs for high volume
feeds like CONDUIT, so the number of REQUESTs to other relays would
not necessarily drop.
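
By way of illustration, a downstream that treats idd.aos.wisc.edu as
one of its CONDUIT sources would typically carry something like the
following in its ldmd.conf (hosts and pattern here are only an
example), so adding you as an upstream does not mean dropping anyone
else:

  REQUEST CONDUIT ".*" idd.aos.wisc.edu
  REQUEST CONDUIT ".*" idd.unidata.ucar.edu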

re:
> If it gets to the point where 1 Gbps isn't enough, I probably can
> eliminate my private to-my-data-server link and instead bond the two 1
> Gbps connections together as you have at Unidata.

OK.  Just so you know, this would also require some configuration work
on the router that the two Ethernet connections plug into.
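
If you do go that route, the host side is the (relatively) easy part.
A rough sketch of what it might look like on a RHEL/CentOS-style system
(file names, addresses, and bonding options here are only illustrative;
the 802.3ad/LACP mode is what requires the matching configuration on
the router/switch side):

  # /etc/sysconfig/network-scripts/ifcfg-bond0
  DEVICE=bond0
  TYPE=Bond
  BOOTPROTO=none
  ONBOOT=yes
  IPADDR=192.0.2.10        # substitute the machine's real public address
  NETMASK=255.255.255.0
  BONDING_OPTS="mode=802.3ad miimon=100"

  # /etc/sysconfig/network-scripts/ifcfg-eth0 (and similarly ifcfg-eth1)
  DEVICE=eth0
  TYPE=Ethernet
  BOOTPROTO=none
  ONBOOT=yes
  MASTER=bond0
  SLAVE=yes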

re:
> One strange thing I did notice: while I am requesting the CONDUIT feed
> from both ncepldm4.woc.noaa.gov and conduit.ncep.noaa.gov, and both
> are replying (one primary, one alternate), I only see one line of data
> on my stats graph, as if it's only coming from one server
> (vm-lnx-conduit2.ncep.noaa.gov).  This is different from before I left
> on vacation - I used to have two separate lines from two upstream
> servers showing up on that graph.

Hmm... I'll have to look into this.
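
For reference, the relevant configuration is just the pair of REQUESTs
to the two upstream hosts (the ".*" pattern here is only a placeholder
for whatever pattern you are actually using):

  REQUEST CONDUIT ".*" conduit.ncep.noaa.gov
  REQUEST CONDUIT ".*" ncepldm4.woc.noaa.gov

When the same products are requested from two upstreams, the LDM keeps
one connection in primary mode and the other in alternate mode, and
rtstats normally plots a separate trace for each upstream host, so a
single trace is worth a look.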

One of the things that we are concerned with is the minimum residence
time of products in the LDM queues at top-level relay sites, i.e., how
long a product stays in the queue before it is overwritten.  As you
know, we would prefer that relays have queues that are large enough to
hold an hour of data.  Examining the output of 'pqmon', or, better yet,
looking through the log files if you are running metrics monitoring,
will tell us how close to (or far from) "the edge" idd.aos.wisc.edu is.
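
If you are not already running it, metrics monitoring is just a pair
of cron entries for the LDM user, something along these lines (paths
are relative to the LDM home directory; adjust to taste):

  # collect LDM metrics (including product-queue age) once a minute
  * * * * * bin/ldmadmin addmetrics > /dev/null 2>&1
  # start a new metrics file at the beginning of each month
  0 0 1 * * bin/ldmadmin newmetrics > /dev/null 2>&1

'ldmadmin plotmetrics' can then be used to plot the collected values,
including how old the oldest product in the queue gets over time.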

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: GXA-551280
Department: Support CONDUIT
Priority: Normal
Status: Closed