
[LDM #UXU-834546]: Crash of NGRID feed on noaaport1/2.cod.edu



Gilbert,

The log messages on or before 2016-08-05T08:27:07Z are normal. The upstream LDM 
process terminated because the downstream LDM decided to switch the transfer 
mode from alternate to primary. This did not cause the top-level LDM server to 
terminate.

Over 3 minutes later, at 2016-08-05T08:30:35Z, the top-level LDM server starts 
terminating for no apparent reason. I suspect that it was sent a SIGTERM.
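
For reference, the gap can be checked directly from the two timestamps in the
quoted log: the restart of the upstream LDM in primary mode and the first
"Exiting" message from ldmd[120426]. This is only a minimal sketch; the helper
name parse_ldm_timestamp is made up for illustration, and the format string is
inferred from the log lines below rather than taken from the LDM documentation.

```python
from datetime import datetime, timezone

# Format inferred from the quoted log lines, e.g. "20160805T082707.638227Z".
LDM_TS_FORMAT = "%Y%m%dT%H%M%S.%fZ"


def parse_ldm_timestamp(stamp: str) -> datetime:
    """Parse one LDM log timestamp as a timezone-aware UTC datetime.

    Hypothetical helper for this example, not part of the LDM package.
    """
    return datetime.strptime(stamp, LDM_TS_FORMAT).replace(tzinfo=timezone.utc)


# Upstream LDM restarted in primary mode (process 116868).
upstream_restart = parse_ldm_timestamp("20160805T082707.638227Z")
# Top-level server ldmd[120426] logs "Exiting".
server_exit = parse_ldm_timestamp("20160805T083035.871180Z")

gap = server_exit - upstream_restart
print(f"{gap.total_seconds():.3f} s between restart and server exit")
```

That works out to roughly 3 minutes 28 seconds, which is why the two events
look unrelated.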

Is there anything relevant in the system (not LDM) log file around this time?
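
If it helps, something like the following could pull out the system log
entries near that moment. It is a rough sketch under several assumptions: that
the system log uses the traditional "Mon DD HH:MM:SS" prefix in local time,
that you supply the host's UTC offset yourself, and that the function name
syslog_lines_near is purely illustrative. Adjust it to whatever your system
log actually looks like.

```python
from datetime import datetime, timedelta, timezone


def syslog_lines_near(lines, when_utc, utc_offset_hours, window_s=120):
    """Return the lines whose timestamp is within window_s seconds of when_utc.

    Assumes each line starts with a classic syslog stamp such as
    "Aug  5 03:30:35" in local time. when_utc must be timezone-aware.
    utc_offset_hours is the local offset from UTC, supplied by the caller.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    window = timedelta(seconds=window_s)
    hits = []
    for line in lines:
        try:
            # Classic syslog stamps carry no year, so borrow it from when_utc.
            local = datetime.strptime(
                f"{when_utc.year} {line[:15]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # Not a line in the assumed format; skip it.
        stamp = local.replace(tzinfo=local_tz)
        if abs(stamp - when_utc) <= window:
            hits.append(line)
    return hits
```

For example, with the log file read into a list of lines, calling
syslog_lines_near(lines, datetime(2016, 8, 5, 8, 30, 35, tzinfo=timezone.utc), offset)
would keep only the entries within two minutes of the time the server started
exiting, where offset is your host's UTC offset in hours.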

> Unfortunately, earlier this week, using LDM 6.13.3, their NOAAport
> ingestor crashed the NGRID feed. You should know that, due to
> some satellite dish issues (water buildup in the snow cover, and wasps),
> NOAAport reception right now is not optimal.
> 
> Twice, almost back to back and simultaneously each time,
> the NGRID feed crashed on noaaport1.cod.edu and
> noaaport2.cod.edu. All other feeds stayed up.
> 
> This was the error message and the lead-in I see in the
> ldmd.log file when this happened:
> 
> 20160805T082602.072770Z climate.cod.edu(feed)[116852] NOTE
> up6.c:445:up6_run() Starting Up(6.13.3/6): 20160805072601.066574 TS_ENDT
> {{NOTHER|NGRAPH|NGRID|HDS, ".*"}}, SIG=eb0e565fa08e0c2793ca3409e5533043,
> Alternate
> 20160805T082602.072801Z climate.cod.edu(feed)[116852] NOTE
> up6.c:448:up6_run() topo:  climate.cod.edu {{NOTHER|NGRAPH|NGRID|HDS,
> (.*)}}
> 20160805T082706.633382Z climate.cod.edu(feed)[116852] NOTE
> error.c:236:err_log() Failure; COMINGSOON: RPC: Unable to receive; errno =
> Connection reset by peer
> 20160805T082706.635423Z climate.cod.edu(feed)[116852] NOTE
> ldmd.c:187:cleanup() Exiting
> 20160805T082706.636124Z ldmd[120426] NOTE ldmd.c:170:reap() child 116852
> exited with status 6
> 20160805T082707.638227Z climate.cod.edu(feed)[116868] NOTE
> up6.c:445:up6_run() Starting Up(6.13.3/6): 20160805072706.633104 TS_ENDT
> {{NOTHER|NGRAPH|NGRID|HDS, ".*"}}, SIG=48b0d339df35d179bde87bfd71cc3129,
> Primary
> 20160805T082707.638266Z climate.cod.edu(feed)[116868] NOTE
> up6.c:448:up6_run() topo:  climate.cod.edu {{NOTHER|NGRAPH|NGRID|HDS,
> (.*)}}
> 20160805T083035.871180Z ldmd[120426] NOTE ldmd.c:187:cleanup() Exiting
> 20160805T083035.871264Z ldmd[120426] NOTE ldmd.c:258:cleanup() Terminating
> process group
> 20160805T083035.871730Z noaaportIngester[120432] ERROR
> fifo.c:340:fifo_transferFromFd() Interrupted system call
> 20160805T083035.871747Z noaaportIngester[120432] ERROR
> fifo.c:340:fifo_transferFromFd() Couldn't read up to 65507 bytes from file
> descriptor 5
> 20160805T083035.871816Z noaaportIngester[120437] ERROR
> fifo.c:340:fifo_transferFromFd() Interrupted system call
> 20160805T083035.871898Z noaaportIngester[120437] ERROR
> fifo.c:340:fifo_transferFromFd() Couldn't read up to 65507 bytes from file
> descriptor 5
> 20160805T083035.871907Z noaaportIngester[120431] ERROR
> fifo.c:340:fifo_transferFromFd() Interrupted system call
> 20160805T083035.871927Z noaaportIngester[120431] ERROR
> fifo.c:340:fifo_transferFromFd() Couldn't read up to 65507 bytes from file
> descriptor 5
> 20160805T083035.872034Z noaaportIngester[120433] ERROR
> fifo.c:340:fifo_transferFromFd() Interrupted system call
> 20160805T083035.872102Z noaaportIngester[120433] ERROR
> fifo.c:340:fifo_transferFromFd() Couldn't read up to 65507 bytes from file
> descriptor 5
> 20160805T083035.872226Z noaaportIngester[120430] ERROR
> fifo.c:340:fifo_transferFromFd() Interrupted system call
> 20160805T083035.872241Z noaaportIngester[120430] ERROR
> fifo.c:340:fifo_transferFromFd() Couldn't read up to 65507 bytes from file
> descriptor 5
> 20160805T083035.872388Z noaaportIngester[120436] ERROR
> fifo.c:340:fifo_transferFromFd() Interrupted system call
> 20160805T083035.872431Z noaaportIngester[120436] ERROR
> fifo.c:340:fifo_transferFromFd() Couldn't read up to 65507 bytes from file
> descriptor 5
> 20160805T083035.872478Z noaaportIngester[120429] ERROR
> fifo.c:340:fifo_transferFromFd() Interrupted system call
> 20160805T083035.872492Z noaaportIngester[120429] ERROR
> fifo.c:340:fifo_transferFromFd() Couldn't read up to 65507 bytes from file
> descriptor 5
> 20160805T083035.872667Z noaaportIngester[120434] ERROR
> fifo.c:340:fifo_transferFromFd() Interrupted system call
> 20160805T083035.872715Z noaaportIngester[120434] ERROR
> fifo.c:340:fifo_transferFromFd() Couldn't read up to 65507 bytes from file
> descriptor 5
> 20160805T083035.872826Z noaaportIngester[120435] ERROR
> fifo.c:340:fifo_transferFromFd() Interrupted system call
> 20160805T083035.872885Z noaaportIngester[120435] ERROR
> fifo.c:340:fifo_transferFromFd() Couldn't read up to 65507 bytes from file
> descriptor 5
> 20160805T083035.872901Z noaaportIngester[120428] ERROR
> fifo.c:340:fifo_transferFromFd() Interrupted system call
> 20160805T083035.872914Z noaaportIngester[120428] ERROR
> fifo.c:340:fifo_transferFromFd() Couldn't read up to 65507 bytes from file
> descriptor 5
> 20160805T083035.873050Z weather.cod.edu(feed)[115655] NOTE
> ldmd.c:187:cleanup() Exiting
> 20160805T083035.873758Z noaaportIngester[120434] NOTE
> noaaportIngester.c:754:reportStats()
> ----------------------------------------
> Ingestion Statistics:
> Since Previous Report (or Start):
> Duration          P13DT7H10M23.080553S
> Raw Data:
> Octets        0
> Mean Rate:
> Octets    0/s
> Bits      0/s
> Received frames:
> Number        0
> Mean Rate     0/s
> Missed frames:
> Number        0
> %             -nan
> Full FIFO:
> Number        0
> %             -nan
> Products:
> Inserted      0
> Mean Rate     0/s
> Since Start:
> 
> (etc).
> 
> 
> Since this was done, LDM 6.13.4 was installed, and, so far,
> this hasn't happened. But, it also happened under 6.13.2.
> No core was dumped, even though it is now
> explicitly allowed.
> 
> Any ideas as to what could have happened here?

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: UXU-834546
Department: Support LDM
Priority: Normal
Status: Closed
===================