
[TIGGE #EEC-237791]: Memory problem with LDM Server at CMA



MA Qiang,

> We have replaced the LDM server with a box which has 32GB memory and 2 Intel 
> Xeon 5050 3.0GHz CPU today.

That is a powerful system.

> The configuration of tigge-ldm.cma.gov.cn is as follows:
> 
> Linux version 2.6.9-42.ELsmp
> 2 CPU: Intel(R) Xeon(TM) 5050 CPU 3.0GHz
> Memory (output of command 'free'):
> total       used       free     shared    buffers     cached
> Mem:      32908872   14898472   18010400          0      62848   13043336
> -/+ buffers/cache:    1792288   31116584
> Swap:      2097144          0    2097144

With that much memory, I'm surprised that swap space is being
used.

> Product queue size:
> $pq_size = "12G";
> 
> ldm01@tgn06 [ /space/ldm_pq ]
> $ pqmon -q /space/ldm_pq/ldm.pq
> Feb 05 03:48:06 pqmon NOTE: Starting Up (24674)
> Feb 05 03:48:06 pqmon NOTE: nprods nfree  nempty      nbytes  maxprods  maxfree  minempty    maxext  age
> Feb 05 03:48:06 pqmon NOTE:  30841     1 2898845 11999572912     32795        3   2896891    429136 3237
> Feb 05 03:48:06 pqmon NOTE: Exiting

Was "pqmon" executed after the product-queue had reached
equilibrium (i.e., after several hours)?  If so, then the
output indicates that, at that time, the product-queue was
limited by the size of the data portion ("nbytes" is close to
the 12G queue size) and not by the number of slots (which is
much larger than necessary: about 2.9 million were empty).  It
also indicates that the product-queue still can't hold one
hour's worth of data (if it's reached equilibrium): the "age"
of the oldest product was 3237 seconds, which is less than 3600.
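
To make the reading of the "pqmon" line above concrete, here is a
minimal Python sketch.  It assumes the nine numbers after "NOTE:" appear
in the order given by the header line in your output, and that "age" is
the age in seconds of the oldest product.  The function names are
hypothetical (they are not part of the LDM), and the size estimate
assumes a steady data rate, so treat it only as a rough guide.

```python
# Sketch: interpret one "pqmon" data line. Field order is taken from the
# header line in the quoted output; all helper names here are hypothetical.
FIELDS = ["nprods", "nfree", "nempty", "nbytes", "maxprods",
          "maxfree", "minempty", "maxext", "age"]


def parse_pqmon(line):
    """Return a dict of the nine integer fields that follow 'NOTE:'."""
    values = line.split("NOTE:", 1)[1].split()
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d"
                         % (len(FIELDS), len(values)))
    return dict(zip(FIELDS, (int(v) for v in values)))


def holds_one_hour(stats, target_seconds=3600):
    """True if the oldest product is at least target_seconds old."""
    return stats["age"] >= target_seconds


def scaled_size_bytes(stats, target_seconds=3600):
    """Rough data-portion size needed to hold target_seconds of data.

    Assumes the queue is full and the data rate is steady (an assumption,
    not something pqmon reports).
    """
    return stats["nbytes"] * target_seconds // stats["age"]


line = ("Feb 05 03:48:06 pqmon NOTE:  30841     1 2898845 11999572912     "
        "32795        3   2896891    429136 3237")
stats = parse_pqmon(line)
print("age (s):", stats["age"])
print("holds one hour:", holds_one_hour(stats))
print("estimated bytes for one hour:", scaled_size_bytes(stats))
```

The estimate comes out only modestly larger than the current data
portion, so it is a lower bound at best; peak data rates would call for
additional headroom.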

I suggest that you use "gnuplot" to plot time series of the
parameters in the "uptime" output file.  I can send you
some scripts that you can adapt, if you like.

> I think this box can undertake the LDM product-queue now.
> 
> In ldmd.log, there are some error such as "Feb 05 05:00:04 tgn06 
> rpc.ldmd[10083] NOTE: Denying connection from "accessdepot-nat.cmc.ec.gc.ca". 
> I had commented the request directives about cmc.ec.gc.ca in ldmd.conf on Jan 
> 16, but I still get the error now. Could you please tell me how to resolve 
> this problem?

I tried to log onto "tgn06" to investigate but couldn't.  Would
you please send me the LDM configuration-file (etc/ldmd.conf) or
enable me to log onto the system?

> Thank you!
> 
> Best regards,
> MA Qiang
> 
> National Meteorological Information Center
> 2008-02-05

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: EEC-237791
Department: Support IDD TIGGE
Priority: Normal
Status: Closed