
[Support #MMY-940696]: radar3 LDM user latencies



Holly,

> NCEP/NCO is providing Radar Level 3 data via LDM to a few customers.  We
> are seeing some strange behavior in our logs and are hoping you can provide
> some insight and/or troubleshooting tips.
> 
> First we have 2 sites that are set up with the same data feeds.  One in
> College Park and one in Boulder that are serving as backups of each other.

Are your customers requesting redundantly from both upstream sites?

> This occurs mainly around 04-06Z, but does not impact all of our users.
> Substituting IPs below.  I grepped for a specific filename in our logs to
> see when it was sent during a time window when I know we see latent data.
> Our customers are requesting NEXRAD3 (or FT27) *

Do all of your customers have the same REQUEST entries (i.e., for the same
feedtype and pattern)?

> Results below:
> 
> Jun 12 04:28:48 radar3-ldm customer1(feed)[18840] INFO: sending:    4546170 
> 20170612042848.020 NEXRAD3 000  wsr88d04286.tbz2
> Jun 12 04:28:48 radar3-ldm customer2[5734] INFO: sending:    4546170 
> 20170612042848.020 NEXRAD3 000  wsr88d04286.tbz2
> Jun 12 04:28:48 radar3-ldm customer3(feed)[18841] INFO: sending:    4546170 
> 20170612042848.020 NEXRAD3 000  wsr88d04286.tbz2
> Jun 12 04:28:48 radar3-ldm dataflowcustomer(fee[7260] INFO: sending: 4546170 
> 20170612042848.020 NEXRAD3 000  wsr88d04286.tbz2
> Jun 12 04:28:48 radar3-ldm1b customer4(feed)[24960] INFO: sending: 4546170 
> 20170612042848.020 NEXRAD3 000  wsr88d04286.tbz2
> Jun 12 04:28:48 radar3-ldm customer5[5735] INFO: sending:    4546170 
> 20170612042848.020 NEXRAD3 000  wsr88d04286.tbz2
> 
> *Jun 12 04:28:52 radar3-ldm customer6(feed)[21958] INFO: sending: 4546170 
> 20170612042848.020 NEXRAD3 000  wsr88d04286.tbz2
> Jun 12 04:29:16 radar3-ldm customer7(feed)[18839] INFO: sending:    4546170 
> 20170612042848.020 NEXRAD3 000  wsr88d04286.tbz2
> Jun 12 04:35:41 radar3-ldm customer8(feed)[24927] INFO: sending:    4546170 
> 20170612042848.020 NEXRAD3 000  wsr88d04286.tbz2
> Jun 12 04:39:52 radar3-ldm customer9(feed)[13977] INFO: sending:    4546170 
> 20170612042848.020 NEXRAD3 000 wsr88d04286.tbz2
> Jun 12 04:45:12 radar3-ldm customer10(feed)[22735] INFO: sending:    4546170 
> 20170612042848.020 NEXRAD3 000  wsr88d04286.tbz2
> Jun 12 04:46:54 radar3-ldm customer11(feed)[25115] INFO: sending:    4546170 
> 20170612042848.020 NEXRAD3 000  wsr88d04286.tbz*
> 
> The bolded portion is where we begin to see latent data, but customer6 has
> a delay on the order of a few seconds, and customer11 is seeing an 18
> minute delay.  Can you provide any information for us as to why this may be
> happening?  We have engaged our network team and they have looked but were
> unable to find any issues with asymmetric routing or problems on the user
> ISP side.  Really, any help or insight you can provide would be much
> appreciated.

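There are several possibilities:
    1. Differential subscriptions. For example, customer1 requests
       "NEXRAD3 .*\.tbz2" but customer11 requests "ANY .*", so customer11
       receives far more data over the same connection. grep(1) for the
       string "Starting Up" in the upstream LDM log file to see what each
       downstream LDM actually requested.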
    2. Link congestion between the upstream and downstream sites. This can
       decrease the effective bandwidth between the two sites. The best way
       to investigate this is by making time-series plots of parameters like
           A. The ping(1) times between the two sites
           B. The round-trip-time (RTT) and variance (RTTVAR) parameters from
              the command "ip tcp_metrics show <addr>", where <addr> is
              the IPv4 address of a downstream customer.
    3. Differential process priority. All LDM processes run with the same 
       priority. Something outside the LDM could, however, be messing with
       them. The ps(1) command can be made to show the priority of a
       process.
    4. Unknown downstream problem. Have the downstream LDM user investigate
       the LDM log file for anything untoward around the time of the
       slowdown.
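
To compare customers over a longer period, the "sending:" entries in the
upstream log can be turned into per-customer delays. The following is only a
rough sketch in Python, not part of the LDM. It assumes each entry is a single
line in the log file (the entries quoted above are wrapped by email), that the
syslog clock and the product timestamp are both in UTC, and that month names
are in English. The function name send_latency is made up for illustration.

```python
import re
from datetime import datetime

# Matches an upstream LDM "sending:" log entry laid out like the ones quoted
# above: syslog time, host, process[pid], size, product time, feedtype,
# sequence number, product identifier.
LINE_RE = re.compile(
    r"^(?P<sent>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"(?P<who>\S+?)\[\d+\]\s+INFO:\s+sending:\s+\d+\s+"
    r"(?P<created>\d{14})\.\d{3}\s+(?P<feed>\S+)\s+\d+\s+(?P<ident>\S+)"
)


def send_latency(line):
    """Return (customer, product_id, delay_seconds), or None if no match.

    The delay is the syslog send time minus the product creation time,
    in whole seconds because syslog only records seconds here.
    """
    m = LINE_RE.match(line)
    if m is None:
        return None
    created = datetime.strptime(m["created"], "%Y%m%d%H%M%S")
    # Syslog omits the year, so borrow it from the product timestamp.
    # Assumption: the product was not sent across a year boundary.
    sent = datetime.strptime(
        f"{created.year} {m['sent']}", "%Y %b %d %H:%M:%S"
    )
    customer = m["who"].split("(")[0]  # drop the "(feed)" suffix
    return customer, m["ident"], (sent - created).total_seconds()


if __name__ == "__main__":
    sample = [
        "Jun 12 04:28:52 radar3-ldm customer6(feed)[21958] INFO: sending: "
        "4546170 20170612042848.020 NEXRAD3 000  wsr88d04286.tbz2",
        "Jun 12 04:46:54 radar3-ldm customer11(feed)[25115] INFO: sending:    "
        "4546170 20170612042848.020 NEXRAD3 000  wsr88d04286.tbz2",
    ]
    for entry in sample:
        print(send_latency(entry))
```

Plotting these delays against time for each customer, alongside the ping(1)
and RTT series from item 2, should show whether the slowdowns coincide with
link congestion.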

Please keep us apprised.

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: MMY-940696
Department: Support LDM
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.