
[LDM #JGM-300195]: LDM Solaris memory usage



Jeremy,

> We are running LDM 6.0.14 on Solaris 5.10.  We have been
> having some memory problems on this machine, and we're trying to track
> down the memory usage of our various processes.  I know that the LDM
> uses a memory-mapped file for the queue (which is set to 5GB for us) and
> that each process spawned by the LDM should share this same memory-mapped
> file.  That said, prstat is returning some strange results for us:
> 
> % prstat -a -n 16 -U ldm
>   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
>  6116 ldm      4774M 1032K sleep   59    0   0:00:00 0.0% rpc.ldmd/1
>  9924 ldm      4774M 1080K sleep   59    0   0:00:02 0.0% rpc.ldmd/1
>  8247 ldm      4774M  664K sleep   59    0   0:00:02 0.0% rpc.ldmd/1
>    14 ldm      4774M 3528K sleep   59    0   0:00:05 0.0% rpc.ldmd/1
>  9923 ldm      4774M  648K sleep   59    0   0:00:02 0.0% rpc.ldmd/1
>  4585 ldm      4774M 3360K sleep   59    0   0:00:08 0.0% rpc.ldmd/1
>  6388 ldm      4774M 3336K sleep   59    0   0:00:13 0.0% rpc.ldmd/1
>  6387 ldm      4774M  792K sleep   59    0   0:00:03 0.0% rpc.ldmd/1
>   299 ldm      4774M  768K sleep   59    0   0:00:01 0.0% rpc.ldmd/1
>  4588 ldm      4774M  792K sleep   59    0   0:00:03 0.0% rpc.ldmd/1
>  5322 ldm      4774M  672K sleep   59    0   0:00:02 0.0% rpc.ldmd/1
>   998 ldm      4784M 2992K sleep   59    0   0:00:39 0.0% rpc.ldmd/1
>   994 ldm      4544K 1392K sleep   59    0   0:00:41 0.0% rpc.ldmd/1
>  1000 ldm      5784M   71M sleep   59    0   0:01:41 0.0% rpc.ldmd/1
>   997 ldm      4772M 3248K sleep   59    0   0:01:32 0.0% pqact/1
>   999 ldm      4775M 1600K sleep   59    0   0:00:27 0.0% rpc.ldmd/1
> NPROC USERNAME  SIZE   RSS MEMORY      TIME  CPU
>    16 ldm        71G   96M   0.2%   0:05:41 0.0%
> 
> Is this just a quirk of prstat, or is the LDM actually trying to use 71GB
> of memory?  The machine has 48GB of memory plus an additional 8GB of swap,
> so it won't be able to find 71GB, no matter how hard it tries.

I think you just deduced the answer to your question. :-)  If prstat(1) shows 
more memory usage than is physically possible, then the "problem" lies with 
prstat(1) and not with the LDM.

What you're seeing is shared memory being counted once per process: every 
rpc.ldmd and pqact process maps the same 5GB product-queue file into its 
address space, and prstat(1) charges that entire shared mapping to each 
process's SIZE (virtual size) column. Roughly fifteen processes times roughly 
4.7GB comes to about 71GB of virtual size, but the physical memory the LDM is 
actually using is the 96MB shown in the RSS total.

I do recall some ps(1) implementations doing the same thing (not accounting 
for shared memory).
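
If you'd like to confirm this for yourself, "pmap -x <pid>" on one of the 
rpc.ldmd processes should show a single large mapping of the product-queue 
file with a much smaller resident portion. The program below is only a 
minimal sketch of the mechanism (it is not LDM source, and the queue 
pathname is hypothetical): a file mapped with MAP_SHARED is charged in full 
to the process's virtual size the moment mmap() returns, while resident 
memory grows only as pages are touched, and those pages are shared by every 
process that maps the same file.

    /*
     * Illustrative sketch only -- not LDM code.  Maps a (hypothetical)
     * product-queue file read-only with MAP_SHARED and then sleeps so
     * the mapping can be inspected with prstat(1) or pmap(1).
     */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "/usr/local/ldm/data/ldm.pq";  /* hypothetical path */
        int         fd = open(path, O_RDONLY);
        struct stat sb;

        if (fd < 0 || fstat(fd, &sb) != 0) {
            perror(path);
            return 1;
        }

        /*
         * The whole file is added to this process's virtual address space
         * (prstat's SIZE column), but physical memory (RSS) grows only as
         * pages are actually referenced -- and those pages are shared with
         * every other process that maps the same file.
         */
        void *queue = mmap(NULL, sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (queue == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        printf("mapped %lld bytes; try \"pmap -x %ld\"\n",
               (long long)sb.st_size, (long)getpid());
        pause();    /* keep the mapping alive for inspection */
        return 0;
    }

Run two copies of something like that against the same large file and prstat 
will report the file's full size under SIZE for both processes, even though 
the pages exist only once in physical memory.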

Incidentally, version 6.0.14 of the LDM is quite old: it was released on 
2003-07-21. There have been very significant improvements in the LDM since 
then, so I strongly suggest upgrading to the latest version.

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: JGM-300195
Department: Support LDM
Priority: Normal
Status: Closed