20030505: use of PRIMARY/ALTERNATE ldm.conf line vs failover



Neil,

> To: address@hidden
> From: "Neil R. Smith" <address@hidden>
> Subject: use of PRIMARY/ALTERNATE ldm.conf line vs failover
> Organization: UCAR/Unidata

The above message contained the following:

> The 6.0.10 level site manager guide isn't out yet so I'm having
> difficulty figuring out this on my own: Does the PRIMARY/ALTERNATE field
> in 6.0.10 replace ~ldm/bin/ldmfail functionality?

The PRIMARY/ALTERNATE functionality differs from the ldmfail(1)
functionality. A PRIMARY upstream LDM sends products to the downstream
LDM in one transaction without waiting for a reply. In contrast, an
ALTERNATE upstream LDM first asks the downstream LDM if it wants the
product (which requires waiting for a reply) and then sends it in a
single action if the response is affirmative. These protocols don't
change if a PRIMARY upstream LDM goes off-line.
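
To make the difference concrete, here is a toy model in Python. It is only
a sketch of the two transfer modes described above, not LDM code: the
class and method names are hypothetical, and the SHA-256 digest is just a
convenient stand-in for a product signature.

```python
import hashlib


def signature_of(data):
    """Stand-in product signature (assumption: any stable hash will do here)."""
    return hashlib.sha256(data).hexdigest()


class ProductQueue:
    """Hypothetical queue that stores products by signature and rejects duplicates."""

    def __init__(self):
        self._products = {}

    def wants(self, signature):
        # True if no product with this signature has been stored yet.
        return signature not in self._products

    def insert(self, signature, data):
        if signature in self._products:
            return False  # duplicate: rejected
        self._products[signature] = data
        return True


class Downstream:
    """Hypothetical downstream LDM that tracks received bytes and round trips."""

    def __init__(self):
        self.queue = ProductQueue()
        self.bytes_received = 0
        self.round_trips = 0

    def receive_primary(self, data):
        # PRIMARY mode: the product arrives in one transaction, unasked.
        # The bytes cross the network even if the product is a duplicate.
        self.bytes_received += len(data)
        return self.queue.insert(signature_of(data), data)

    def receive_alternate(self, data):
        # ALTERNATE mode: the upstream asks first and waits for the reply,
        # then sends the product only if the downstream wants it.
        sig = signature_of(data)
        self.round_trips += 1
        if not self.queue.wants(sig):
            return False  # declined: no product bytes are transferred
        self.bytes_received += len(data)
        return self.queue.insert(sig, data)


down = Downstream()
product = b"x" * 1000
down.receive_primary(product)    # stored
down.receive_alternate(product)  # offered, declined as a duplicate
```

In this model the duplicate offered in ALTERNATE mode costs one round trip
but no product bytes, whereas a duplicate sent in PRIMARY mode costs the
full product size before it is rejected.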

By contrast, the ldmfail(1) utility restarts the LDM with a new
configuration file.  Frankly, I would avoid using ldmfail(1) because
it's not very smart.

How to use the PRIMARY/ALTERNATE mechanism depends on your situation. If
bandwidth is cheap, then you might well have a bunch of PRIMARY upstream
LDM-s for the same feed (the product-queue will reject duplicates). If,
on the other hand, bandwidth is expensive, then having an ALTERNATE feed
is relatively cheap and will carry you through the hopefully short period
of time that the PRIMARY LDM is off-line (0 to 2 hours?).
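
A back-of-the-envelope estimate of that trade-off might look like the
following Python sketch. The function name, the 64-byte query size, and
the product counts are all assumptions chosen for illustration, and the
model assumes a PRIMARY feed always delivers each product first, so every
ALTERNATE offer is declined.

```python
def inbound_bytes(num_products, product_size, n_primary, n_alternate,
                  query_size=64):
    """Rough inbound traffic estimate for one feed.

    Every PRIMARY upstream sends every product in full (duplicates are
    rejected only after they arrive). Every ALTERNATE upstream costs one
    small query per product; query_size is an assumed figure, not a
    measured one.
    """
    data = n_primary * num_products * product_size
    queries = n_alternate * num_products * query_size
    return data + queries


# Hypothetical feed: 10,000 products of 20,000 bytes each.
two_primaries = inbound_bytes(10_000, 20_000, n_primary=2, n_alternate=0)
one_of_each = inbound_bytes(10_000, 20_000, n_primary=1, n_alternate=1)
```

Under these assumptions the second PRIMARY roughly doubles the inbound
traffic, while the ALTERNATE adds well under one percent.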

If the PRIMARY LDM is offline for a long period of time, then something
is very wrong and something like the ldmfail(1) utility might well be
appropriate. Unfortunately, there's no way, at present, to detect this
situation.

> It would seem a second purpose would be useful .. that of the problem of
> the independent NIDS routing topology. If my designated NNEXRAD feed
> fails, the ldm doesn't know to fail over, or what to fail over to, does
> it? Say my primary feed for NNEXRAD is cirp (which is down right now)
> and primary for everything else (except NLDN) is thelma. I've got iita
> for failover of everything else but have thelma as NNEXRAD backup incase
> cirp is unavailable. How should I set up for failover of the non-NNEXRAD
> and NNEXRAD feeds in the 6.0.10 ldmd.conf?

If bandwidth is cheap, one possibility is to get everything from
everybody (figuratively speaking):

    request NEXRAD .* cirp PRIMARY
    request NEXRAD .* thelma.ucar.edu PRIMARY
    request ANY-NEXRAD .* thelma.ucar.edu PRIMARY
    request ANY-NEXRAD .* iita.ucar.edu PRIMARY

This can be consolidated into

    request NEXRAD .* cirp PRIMARY
    request ANY .* thelma.ucar.edu PRIMARY
    request ANY-NEXRAD .* iita.ucar.edu PRIMARY

(I'd be very careful about the use of ANY, by the way.  Thelma's
receiving A LOT of data.)

If bandwidth is expensive on either your end or the upstream LDM's end,
then the following is a solution for short-term problems:

    request NEXRAD .* cirp PRIMARY
    request NEXRAD .* thelma.ucar.edu ALTERNATE
    request ANY-NEXRAD .* thelma.ucar.edu PRIMARY
    request ANY-NEXRAD .* iita.ucar.edu ALTERNATE

(Again, I really wouldn't use ANY unless I was certain it was
appropriate.)  If either Cirp or Thelma goes off-line, you'll still
receive the data -- but at a reduced rate.

> Thanks, -Neil
> -- 
> Neil R. Smith, Comp. Sys. Mngr.               address@hidden
> Dept. Atmospheric Sci., Texas A&M Univ.       979/845-6272 FAX:979/862-4466

Regards,
Steve Emmerson