20040708: LDM - Linux Red Hat 9 - LDM death caused by upstream LDM?



Tom,

> To: address@hidden
> From: "Tom Baltzer" <address@hidden>
> Subject: LDM - Linux Red Hat 9 - LDM death caused by upstream LDM?
> Organization: UCAR/Unidata
> Keywords: 200407081333.i68DXxBv002726 LDM

The above message contained the following:

> Institution: Unidata
> Package Version: 6.0.14
> Operating System: Linux Red Hat 9
> Hardware Information: Dual AMD 2 GHz w/2GB main memory
> Inquiry: Hey Steve,
> 
> While we were away, the LDM on lead1 up and died seemingly triggered
> by the upstream system (emo) - here is the log info:
> 
> Jul 02 07:09:29 lead1 pqact[4124]: pbuf_flush 10: time elapsed   3.183142 
> Jul 02 08:00:34 lead1 pqact[4124]: pbuf_flush 10: time elapsed   2.211820 
> Jul 02 09:57:07 lead1 pqact[4124]: pbuf_flush 8: time elapsed   4.056640 
> Jul 02 11:00:49 lead1 pqact[4124]: pbuf_flush 8: time elapsed   4.893119 
> Jul 02 13:28:00 lead1 emo[4130]: assertion "rlix != RL_NONE" failed: file "pq.c", line 4092
> Jul 02 13:28:05 lead1 emo[4129]: assertion "rlix != RL_NONE" failed: file "pq.c", line 4092
> Jul 02 13:28:12 lead1 rpc.ldmd[4121]: child 4129 terminated by signal 6 
> Jul 02 13:28:12 lead1 rpc.ldmd[4121]: Killing (SIGINT) process group 
> Jul 02 13:28:12 lead1 rpc.ldmd[4121]: SIGINT 
> Jul 02 13:28:12 lead1 pqact[4124]: Interrupt 
> Jul 02 13:28:12 lead1 pqact[4123]: Interrupt 
> Jul 02 13:28:12 lead1 pqact[4125]: Interrupt 
> Jul 02 13:28:12 lead1 pqact[4127]: Interrupt 
> Jul 02 13:28:12 lead1 pqact[4126]: Interrupt 
> Jul 02 13:28:12 lead1 pqact[4127]: Exiting 
> Jul 02 13:28:12 lead1 rtstats[4128]: Interrupt 
> Jul 02 13:28:12 lead1 eldm4[4131]: SIGINT 
> Jul 02 13:28:12 lead1 rtstats[4128]: Exiting 
> Jul 02 13:28:12 lead1 pqact[4124]: Exiting 
> Jul 02 13:28:12 lead1 pqact[4125]: Exiting 
> Jul 02 13:28:12 lead1 pqact[4126]: Exiting 
> Jul 02 13:28:12 lead1 pqact[4123]: Exiting 
> Jul 02 13:28:14 lead1 rpc.ldmd[4121]: Terminating process group 
> Jul 02 13:28:14 lead1 rpc.ldmd[4121]: child 4130 terminated by signal 6 
> Jul 02 13:28:14 lead1 rpc.ldmd[4121]: Killing (SIGINT) process group 

Interesting.  I don't recall seeing this before.

> I did a queuecheck and it indicated that the queue was corrupt, so I
> saved it in case that might be useful.

Good.

> What do you think?

I don't know yet.

Would you please send me the output of the command "ldmadmin config"?

> Thanks,
> Tom.

Regards,
Steve Emmerson
> NOTE: All email exchanges with Unidata User Support are recorded in the
> Unidata inquiry tracking system and then made publicly available
> through the web.  If you do not want to have your interactions made
> available in this way, you must let us know in each email you send to us.

