
[LDM #MMY-812358]: Possible memory leak in ldm



Bill,

> I have done some more testing with this problem.
> 
> First I removed the "EXEC pqexpire" from the ldmd.conf file.  That did
> change how the products were expired from the queue but did not change
> the memory usage behavior.  Now the products reside in the queue much
> longer than they did before (around 10000-13000 seconds vs the 3600 from
> before).
> 
> To ensure the queue is large enough I have a queue size set of 1.5GB.
> There are 6 files containing model data that we transfer between WFO
> Seattle and WFO Portland 2 times a day, in addition to a lot of other
> smaller files that we request from several other LDM servers.  The file
> sizes are as follows:
> 
> 815M
> 341M
> 277M
> 232M
> 187M
> 164M
> 
> 37M of other files in assorted requests
> 
> I have been using the "top" command to diagnose where the memory is
> being used (not sure that is entirely reliable).  When I start the LDM
> it immediately uses up 1.5 GB (plus a little more) of memory, which is
> the size of the queue.  I notice that it starts a new process for each
> request line in the ldmd.conf file.   Initially I had put the 6 files on
> 6 different request lines in the ldmd.conf file.  The LDM then seems to
> start a separate process for each of these requests.   As each of the
> large files comes in at a different time of the day, each one of these
> processes begins to use more memory until it reaches the size of the
> file it is requesting.  After the file is received and processed the
> memory is not released so it continues to be used even after the product
> is removed from the queue.  In the end, these 6 different processes are
> using memory that totals the size of the 6 files ~2GB (plus the size of
> the other request files 37M).   Once this happens the memory usage
> stabilizes.  The total memory usage is the sum of the size of the files
> being received plus the queue size or about 3.5 GB in our case.

Your observations are consistent with the design of the LDM.  Each REQUEST 
entry results in a downstream LDM process.  A downstream LDM process will grow 
to a size determined by the largest data-product it receives.  Allocated memory 
will not be released back to the operating system because of the way the 
sbrk(2) system call works.

> Since I noticed that there was memory allocated for each request line
> in the ldmd.conf file, I modified it to wildcard all of the
> 6 large files into one request line.  After doing this only one process
> seems to be started for the large files.  As the 164M file comes in at
> the beginning of the day, 164M of ram is allocated to this process.  As
> the next file comes in the memory usage grows until finally the 815M
> file arrives and the memory used by this process is ~815M.  By doing
> this the LDM is now using 1.5G + 815M = ~2.315G (plus a little more).
> 
> If this is the normal behavior of the program then we will just need to
> have sufficient memory to hold the queue and the sum of all the largest
> files in each request line.  However, I did also note that when I stop
> the ldm only part of the memory is released.  It looks like it is the
> memory used by the individual rpc.ldmd processes that is released, and
> not the 1.5G allocated when the ldm starts?   Also of interest, it
> appears that WFO Seattle, which inserts the products directly into the
> queue and uses a queue size of 1G, does not have this problem (except
> that they are using all of the physical memory and about 3M of swap
> space).
> 
> Is this behavior what I should expect to see, or am I misinterpreting
> something?  Any information you can provide would be helpful.

Your observations and interpretation are correct.  Combining the REQUEST 
entries was a great thing to do.
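For anyone reading this in the archive, the consolidation described above looks roughly like the following in ldmd.conf (the feedtype, hostname, and product-ID patterns here are made up for illustration; they are not the actual entries from this site):

```
# Before: one downstream rpc.ldmd process per REQUEST entry
REQUEST EXP "^modelA\.grib$"  upstream.example.edu
REQUEST EXP "^modelB\.grib$"  upstream.example.edu
# ... four more entries like the above ...

# After: one entry whose extended-regular-expression pattern matches
# all six products, so only one downstream process is started
REQUEST EXP "^model[A-F]\.grib$"  upstream.example.edu
```

With a single REQUEST entry, only one downstream process grows to the size of the largest matching product, instead of six processes each holding the size of its own largest product.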

> Thanks,
> Bill

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: MMY-812358
Department: Support LDM
Priority: Normal
Status: Closed