
[LDM #EFS-209580]: primary/alternate switching



Hi Karen,

> I've been working on setting up my "one-stop-shopping" ldm servers here
> at NSSL.  That is, two redundant ldm servers that are repositories for
> our agency's internal realtime data, and I've noticed that because I have
> two servers that are close together I get a lot of switching back and
> forth.
> 
> I'm not sure if it's a problem or not

It isn't.

> -- but I was curious if there was
> some way to stop (or at least minimize) the "flapping"?
> 
> This is the kind of thing I am seeing.
> 
> Dec 01 21:35:24 nmqwd32 benjy.protect.nssl[2470] NOTE: Switching
> data-product transfer-mode to primary
> Dec 01 21:35:24 nmqwd32 benjy.protect.nssl[2470] NOTE: LDM-6 desired
> product-class: 20101201203524.729 TS_ENDT {{EXP,  "VILMA"},{NONE,
> "SIG=dd94f3986233a6f9a77a8ac1e3882006"}}
> Dec 01 21:35:24 nmqwd32 benjy.protect.nssl[2470] NOTE: Upstream LDM-6 on
> benjy.protect.nssl is willing to be a primary feeder
> Dec 01 21:35:24 nmqwd32 frankie.protect.nssl[2471] NOTE: Switching
> data-product transfer-mode to alternate
> Dec 01 21:35:24 nmqwd32 frankie.protect.nssl[2471] NOTE: LDM-6 desired
> product-class: 20101201203524.733 TS_ENDT {{EXP,  "VILMA"},{NONE,
> "SIG=dd94f3986233a6f9a77a8ac1e3882006"}}
> Dec 01 21:35:24 nmqwd32 frankie.protect.nssl[2471] NOTE: Upstream LDM-6
> on frankie.protect.nssl is willing to be an alternate feeder
> Dec 01 21:36:33 nmqwd32 frankie.protect.nssl[2469] NOTE: Switching
> data-product transfer-mode to primary
> Dec 01 21:36:33 nmqwd32 frankie.protect.nssl[2469] NOTE: LDM-6 desired
> product-class: 20101201203633.585 TS_ENDT {{EXP,  "NSE64"},{NONE,
> "SIG=16b751a2982b490e8985fd182831d000"}}
> Dec 01 21:36:33 nmqwd32 benjy.protect.nssl[2468] NOTE: Switching
> data-product transfer-mode to alternate
> Dec 01 21:36:33 nmqwd32 benjy.protect.nssl[2468] NOTE: LDM-6 desired
> product-class: 20101201203633.587 TS_ENDT {{EXP,  "NSE64"},{NONE,
> "SIG=16b751a2982b490e8985fd182831d000"}}
> Dec 01 21:36:33 nmqwd32 benjy.protect.nssl[2468] NOTE: Upstream LDM-6 on
> benjy.protect.nssl is willing to be an alternate feeder
> Dec 01 21:36:33 nmqwd32 frankie.protect.nssl[2469] NOTE: Upstream LDM-6
> on frankie.protect.nssl is willing to be a primary feeder
> Dec 01 21:46:58 nmqwd32 benjy.protect.nssl(feed)[23903] NOTE: feed or
> notify failure; COMINGSOON: RPC: Unable to receive; errno = Connection
> reset by peer
> Dec 01 21:46:58 nmqwd32 rpc.ldmd[2457] NOTE: child 23903 exited with
> status 6
> Dec 01 21:46:59 nmqwd32 benjy.protect.nssl(feed)[2874] NOTE: Starting
> Up(6.6.5/6): 20101201204658.669 TS_ENDT {{EXP,  "(.*)"}},
> SIG=c1ff60668b1c88f160a6ebee009027d3, Primary
> Dec 01 21:46:59 nmqwd32 benjy.protect.nssl(feed)[2874] NOTE: topo:
> benjy.protect.nssl {{EXP, (.*)}}
> Dec 01 21:48:01 nmqwd32 benjy.protect.nssl(feed)[2874] ERROR: Couldn't
> flush connection; nullproc_6() failure to benjy.protect.nssl: RPC:
> Unable to receive; errno = Connection reset by peer
> Dec 01 21:48:01 nmqwd32 rpc.ldmd[2457] NOTE: child 2874 exited with status 6
> Dec 01 21:48:02 nmqwd32 benjy.protect.nssl(feed)[3319] NOTE: Starting
> Up(6.6.5/6): 20101201204801.410 TS_ENDT {{EXP,  "(.*)"}},
> SIG=f9754eb28edf52465243191b89389729, Alternate

Here are your options:
    1. Don't do any switching. Modify the LDM REQUEST entries so that the
       feedtype/pattern pairs are syntactically unique but semantically
       identical, so that the LDM treats the two requests as independent
       rather than as redundant. For example:

          REQUEST ALL .* benjy.protect.nssl
          REQUEST ALL (.*) frankie.protect.nssl

       This puts both connections in PRIMARY mode. The disadvantage is that
       bandwidth usage will approximately double. The product-queue will
       automatically reject the duplicate products.

    2. Live with the log file entries. I routinely use grep(1) and egrep(1)
       with the "-v" option when scanning the log files. For example, this
       hides every NOTE-level entry:

          grep -v NOTE ldmd.log
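
If hiding every NOTE entry is too coarse, a narrower filter may be
preferable. The following is only a sketch: the helper names
filter_switching and count_switches are hypothetical, the patterns are
copied from the messages quoted above, and it assumes that each entry
occupies a single line in the log file (the wrapping above comes from the
email) and that the upstream host is the fifth whitespace-separated field.

```shell
#!/bin/sh
# Sketch only: patterns are taken from the NOTE messages quoted in this
# thread; adjust them if your LDM version words the messages differently.

# Hypothetical helper: print the log without the primary/alternate
# switching notes, leaving all other entries (including other NOTEs).
filter_switching() {
    grep -v -E 'Switching data-product transfer-mode to (primary|alternate)|is willing to be an? (primary|alternate) feeder' "$@"
}

# Hypothetical helper: count mode switches per upstream host.
# Field 5 is assumed to look like "benjy.protect.nssl[2470]"; the "[pid]"
# suffix is stripped before counting.
count_switches() {
    cat "$@" |
        grep -E 'Switching data-product transfer-mode to (primary|alternate)' |
        awk '{ sub(/\[.*/, "", $5); n[$5]++ } END { for (h in n) print h, n[h] }' |
        sort
}
```

After sourcing the definitions, "filter_switching ldmd.log" shows the log
without the switching notes, and "count_switches ldmd.log" gives a rough
idea of how often each upstream connection changes mode.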

> --
> -------------------------------------------
> 
> There are 2 kinds of people in the world:
> 
> 1) Those who can extrapolate from incomplete data.
> 
> -------------------------------------------
> address@hidden
> 
> Phone:  405-325-6982
> Cell: 405-834-8559
> INDUS Corporation
> National Severe Storms Laboratory


Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: EFS-209580
Department: Support LDM
Priority: Normal
Status: Closed