
[LDM #SOP-413350]: Fwd: networking LDM configuration query



Gregory,

Sorry for taking so long to respond. I've been out of the office.

> I'm wondering if you can tell me what role the various TCP keepalive
> settings on a linux system have regarding LDM and downstream LDM
> connections?  This is regarding a proposed change to systems at NWS
> running MRMS and feeding data into the LDM systems for downstream
> customers.
> 
> For example the default Linux settings consist of:
> 
> tcp_keepalive_time = 7200 (seconds)
> tcp_keepalive_intvl = 75 (seconds)
> tcp_keepalive_probes = 9 (number of probes)
> 
> and they are going to be changed to:
> 
> tcp_keepalive_time =  120 (seconds)
> tcp_keepalive_intvl = 60 (seconds)
> tcp_keepalive_probes = 6 (number of probes)
> 
> Apparently there were some issues with the NWS MRMS LDM systems the week
> of August 27th where TCP connections were not being closed and took up all
> the connections on the LDM systems in College Park.

That's very odd. It's possible, I suppose, that a downstream LDM could close 
its connection and the upstream LDM not be notified -- but the next time the 
upstream LDM tried to send something, the attempt would fail, the connection 
would be marked as closed, the upstream LDM would be notified and would 
terminate.

How often are MRMS products sent?

You can use the uldbutil(1) utility to list the active upstream LDM processes.

> It is thought the
> proposed changes above will decrease the time that the connections will
> exist on the Virtual Machines.  During the week of the 27th connections
> were being monitored and they were not being disconnected until 2 hours
> later, and the VMs ran out of connections.

Very odd. It's as if your network or firewall were dropping the disconnect 
packets from the downstream LDMs.
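
For what it's worth, the two-hour figure is consistent with the default 
keepalive settings. As a rough sketch (the function name is mine, and the 
formula is the usual approximation: idle time plus one interval per 
unanswered probe), the time the kernel takes to drop an idle, dead connection 
works out as:

```python
def keepalive_detection_seconds(idle_time, interval, probes):
    """Approximate worst-case seconds before an idle, dead connection is dropped.

    idle_time -> tcp_keepalive_time
    interval  -> tcp_keepalive_intvl
    probes    -> tcp_keepalive_probes
    """
    return idle_time + interval * probes


current = keepalive_detection_seconds(7200, 75, 9)   # default Linux values
proposed = keepalive_detection_seconds(120, 60, 6)   # values proposed for MRMS

print(f"current:  {current} s (~{current / 60:.0f} min)")
print(f"proposed: {proposed} s (~{proposed / 60:.0f} min)")
```

That gives about 7875 seconds (a little over 2 hours) for the defaults and 
480 seconds (8 minutes) for the proposed values.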

> What is the impact to immediate downstream LDM customers receiving lots
> of data (e.g. MRMS)?

There should be no impact to downstream LDMs with an open connection.

As I previously wrote, an attempt by an upstream LDM to send on a disconnected 
connection will result in the termination of that connection's upstream LDM 
process.
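
To illustrate the general TCP behavior I'm describing (this is not LDM code, 
and the function name is hypothetical), here is a minimal sketch on a loopback 
connection. The peer closes its end, and the sender finds out when a later 
send fails. It assumes the peer's host is still reachable and answers with a 
reset; it does not model a firewall that silently drops packets.

```python
import socket
import time


def sends_before_failure(max_attempts=50):
    """Count sends that succeed after the peer has closed the connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))          # any free loopback port
    listener.listen(1)

    upstream = socket.create_connection(listener.getsockname())
    downstream, _ = listener.accept()

    downstream.close()                       # the "downstream" goes away
    listener.close()

    succeeded = 0
    try:
        for _ in range(max_attempts):
            upstream.send(b"product")        # fails once the reset is seen
            succeeded += 1
            time.sleep(0.05)                 # give the reset time to arrive
    except ConnectionError:
        pass                                 # broken pipe or connection reset
    finally:
        upstream.close()
    return succeeded


if __name__ == "__main__":
    print("sends that succeeded before the failure:", sends_before_failure())
```

Typically the first send after the close still succeeds and a subsequent one 
raises an error, which is the point at which a sending process would learn 
that the connection is gone.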

> Does LDM have any predefined connection limit, for example 1024, that
> has some role in these settings as well?

The "/server/max-clients" registry parameter can be used to set the maximum 
number of allowed upstream LDM processes. The default is 256.

> On a telcon a little earlier today when the proposed changes were
> mentioned I mentioned UNIDATA IDD moves a LOT of data as well (e.g. many
> small products and many big products, etc) and I had not heard of
> similar changes needing to be made for the IDD community.

That's correct. We move something like 400 petabytes per year and haven't seen 
anything like this.

> I'm hoping
> the LDM feeds of MRMS don't have any issues if these changes are made,
> hence my email.

I don't see how the new parameters would cause a problem -- but without knowing 
the cause of the problem, I'm not sure this will fix it.
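
One thing to keep in mind is that, on Linux, the kernel-wide keepalive values 
only apply to sockets on which SO_KEEPALIVE has been enabled. The sketch below 
is not taken from the LDM; it just shows, under that assumption, how an 
application can enable keepalive on a socket and override the three values for 
that socket alone. The helper name is hypothetical, and the TCP_KEEP* options 
are guarded because they aren't available on every platform.

```python
import socket


def enable_keepalive(sock, idle_time=120, interval=60, probes=6):
    """Enable TCP keepalive on one socket and, where supported, tune it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket equivalents of tcp_keepalive_time, _intvl and _probes.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_time)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive enabled:",
          bool(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)))
    s.close()
```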

Please keep us apprised.

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: SOP-413350
Department: Support LDM
Priority: Normal
Status: Closed
===================