[TIGGE #EEC-237791]: Memory problem with LDM Server at CMA

MA Qiang,

> The LDM system crashed on Dec. 16, 2007 and Jan. 7, 2008. The console
> displayed a message saying that swap was full and that some processes, such as
> mmfsd, had been killed when it crashed.

Can you find that error message and send it to me?

> The current output of 'free' is as follows:
> # free
>              total       used       free     shared    buffers     cached
> Mem:       8161208    8125968      35240          0     150564    2716312
> -/+ buffers/cache:    5259092    2902116
> Swap:      2097144        212    2096932
> I have attached the ldmd.conf to this email.  Please check it.

The "ldmd.conf" file looks OK and explains why you have so many connections.
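
As a cross-check on the 'free' output quoted above, here is a minimal
sketch of how the "-/+ buffers/cache" line follows from the "Mem:" line.
The function names are illustrative only and are not part of LDM or of
'free' itself; the numbers are copied from your output (in kB).

```python
# Values copied from the "Mem:" line of the quoted 'free' output (kB).
TOTAL, USED, FREE = 8161208, 8125968, 35240
BUFFERS, CACHED = 150564, 2716312


def used_by_processes(used, buffers, cached):
    """Memory held by processes, excluding buffers and page cache."""
    return used - buffers - cached


def available_to_processes(free, buffers, cached):
    """Free memory plus buffers and page cache, which the kernel can reclaim."""
    return free + buffers + cached


print(used_by_processes(USED, BUFFERS, CACHED))       # 5259092
print(available_to_processes(FREE, BUFFERS, CACHED))  # 2902116
```

Both results match the "-/+ buffers/cache" line, so at the time of this
snapshot roughly 64% of the 8 GB was held by processes and swap was
almost unused.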

> There are 7 PC servers (tgn00 to tgn06) in the TIGGE platform at CMA. tgn01
> has 8 GB of memory, and LDM is installed on it. IBM CSM is installed on all 7
> servers. tgn00 is the CSM server. IBM GPFS is installed on tgn01 to tgn06.

Is the LDM product-queue on a disk that is local to tgn01?  Can
that disk be accessed from another computer via GPFS?

Is the swap-file for tgn01 on a local disk?  Can that disk be
accessed from another computer via GPFS?
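
If it helps to answer the two questions above, the following is a rough
sketch of one way to check, assuming a Linux system where "/proc/swaps"
lists the swap areas and GPFS file systems appear in "/proc/mounts" with
type "gpfs". The function names and the example queue path are
hypothetical; substitute the real location of your product-queue.

```python
import os


def swap_areas(swaps_text):
    """Return (filename, type) for each swap area in /proc/swaps-style text."""
    areas = []
    for line in swaps_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) >= 2:
            areas.append((fields[0], fields[1]))
    return areas


def gpfs_mount_points(mounts_text):
    """Return mount points whose file system type is 'gpfs' (assumed type name)."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "gpfs":
            points.append(fields[1])
    return points


def on_gpfs(path, mount_points):
    """True if path is a GPFS mount point or lies underneath one."""
    return any(path == m or path.startswith(m.rstrip("/") + "/")
               for m in mount_points)


if __name__ == "__main__" and os.path.exists("/proc/swaps"):
    with open("/proc/swaps") as f:
        swaps = swap_areas(f.read())
    with open("/proc/mounts") as f:
        gpfs = gpfs_mount_points(f.read())
    queue = "/path/to/ldm.pq"  # hypothetical: replace with the real queue path
    print("swap areas:", swaps)
    print("GPFS mount points:", gpfs)
    print("queue on GPFS:", on_gpfs(queue, gpfs))
```

A swap area of type "file" whose path lies under a GPFS mount point, or a
product-queue under one, would be worth telling me about.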

I hope to send you another email soon.

> The output of 'netstat -pat | grep ldm' is as follows:
> tgn01::/usr/local/ldm01/etc:
> # netstat -pat | grep ldm
> tcp        0      0 *:ldm                       *:*                         LISTEN      9896/rpc.ldmd
> tcp        0      0              tigge-ldm.ecmwf.int:47308   ESTABLISHED 23099/rpc.ldmd
> tcp        0      0            depot.cmc.ec.gc.ca:ldm      ESTABLISHED 9924/rpc.ldmd
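
To keep an eye on the number of LDM connections over time, something like
the following sketch could summarize the 'netstat -pat' output by
connection state. It is only an illustration: the function name is made
up, and it relies on the state column immediately preceding the
"PID/Program name" column, which holds for the output quoted above even
when long lines are wrapped.

```python
from collections import Counter


def ldm_connection_states(netstat_text, program="rpc.ldmd"):
    """Count connections per TCP state for the given program name."""
    tokens = netstat_text.split()
    counts = Counter()
    # The token just before "PID/program" is the connection state.
    for previous, token in zip(tokens, tokens[1:]):
        if token.endswith("/" + program):
            counts[previous] += 1
    return counts


# Sample taken from the quoted output, including its wrapped lines.
sample = """\
tcp        0      0 *:ldm                       *:*
LISTEN      9896/rpc.ldmd
tcp        0      0              tigge-ldm.ecmwf.int:47308
ESTABLISHED 23099/rpc.ldmd
tcp        0      0            depot.cmc.ec.gc.ca:ldm
ESTABLISHED 9924/rpc.ldmd
"""

print(dict(ldm_connection_states(sample)))  # {'LISTEN': 1, 'ESTABLISHED': 2}
```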

> The domain name of tgn01 is 'tigge-ldm.cma.gov.cn'. For some reason, you
> cannot log into this server, but I can execute commands to collect the
> relevant information according to your instructions.
> Thank you!
> Best regards,
> MA Qiang
> National Meteorological Information Center
> 2008-01-15

Steve Emmerson

