Mike,

> I have a situation on the COD Noaaport servers Gilbert has been testing LDM
> on that I wanted to bring to your attentions. We have two Noaaport
> servers, and to my knowledge each are running the latest beta version
> (looks like 184.108.40.206).

More like "alpha".

> On a downstream server running LDM-6.13.6 I've
> been monitoring bandwidth, and I've been noticing sharp increases in
> traffic coming from Unidata and Wisc.edu idd servers via LDM. When I look
> at the logs, this downstream server isn't able to connect to either of the
> two Noaaport servers, so it fails over to retrieve data from the other
> sites. (noaaport1.cod.edu WARN error.c:236:err_log() Couldn't
> connect to LDM on noaaport1.cod.edu using either port 388 or portmapper; :
> RPC: Remote system error - Connection refused)

A "connection refused" message usually means that there's no appropriate ALLOW entry in the upstream LDM's configuration-file. We've also seen that message due to a firewall or intrusion detection/prevention system (IDS, IPS). In such cases, one can try using telnet(1) and/or ncat(1):

    telnet noaaport1.cod.edu 388
    ncat noaaport1.cod.edu 388

ncat(1) is called nc(1) on some systems.

> A remake of the product queue and a restart of LDM on the Noaaport servers,
> and the downstream server connects immediately and traffic from the
> fail-over sites ceases. If I run pqcheck prior to deleting the queue it
> returns status 3, which in my experience is normal / no problems found.

An exit code of 3 indicates that writer-counter in the queue isn't zero, either because a process has the queue open for writing or because a process terminated without closing the queue (because it was killed, for example). If no process has the queue open for writing, then it's a good idea to execute the command "pqcat -l- -s && pqcheck -F -q".
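If it helps to script that check, here is a minimal Python sketch of acting on the exit status. It is not part of the LDM package, and the function name run_and_classify is hypothetical. It interprets only status 3, as described above. Treating 0 as "no problem reported" is an assumption based on ordinary Unix convention, and any other value is left for pqcheck(1) to explain.

```python
import subprocess

# The only pqcheck exit status explained in this message.
WRITER_COUNT_NONZERO = 3


def run_and_classify(command):
    """Run `command` (a list of arguments) and return (status, note).

    Only status 3 is interpreted from the message above; the meaning
    given to 0 is an assumption (usual Unix convention), and every other
    status is deliberately left uninterpreted.
    """
    status = subprocess.run(command).returncode
    if status == 0:
        note = "no problem reported (assumed meaning of exit status 0)"
    elif status == WRITER_COUNT_NONZERO:
        note = ("writer-counter is not zero: a process has the queue open "
                "for writing, or one terminated without closing it")
    else:
        note = "status not interpreted here; consult pqcheck(1)"
    return status, note


# Example call, assuming pqcheck is on PATH and the default queue is used:
#     status, note = run_and_classify(["pqcheck"])
#     print(status, note)
```

A wrapper like this only reports the status. Whether to go on and reset the counter still depends on confirming that no process has the queue open for writing.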
> It
> looks like there is more verbose logging turned on with these betas so the
> logs are hundreds of megs, sometimes over a gig in size, but after
> filtering out things like decoding and grib2 errors I do see what seems
> like an excessive number of "Gap in packet sequence" messages. Hard to for
> me to tell if this is related or a red herring, but figured I'd mention
> it.

That's a separate issue caused by poor NOAAPort reception.

> Also, it's common that some combination of ldmd, noaaportIngester or
> even rtstats procs hang on 'ldmadmin stop' where I need to forcibly kill
> them in order to restart LDM, and in some cases reclaim disk space from the
> product queue; it's far from cleanly shutting down.

You should give it at least 30 seconds.

> Today was the second time within a week where I've had to restart LDM on
> those servers to get the downstream server to reconnect. I don't recall
> having this kind of issue prior to testing the beta version. And since
> I've been watching bandwidth more closely I suspect this has been happening
> pretty regularly in the last month at least maybe longer, but I'm only just
> now connecting the dots of the connection issue, fail-over & the rest. I
> don't like that I can't depend on either of the Noaaport servers, so if
> this keeps happening I'm probably going to revert to 6.13.6 on at least one
> of them. I'll let you all know if/when I do.

You might try running 6.13.6 on one of the NOAAPort ingest systems to see if it terminates better.

> If anyone has any ideas I'd love to hear them. Otherwise I'll keep you
> posted. Appreciate it.

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: WKI-561206
Department: Support LDM
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web.
If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.