Re: [ldm-users] ldm data dir question

Hi Jack,

First thing I want to point out is (barring any symlink or similar
shenanigans) your product queue is not under /home/ldm/var/data/.   As
shown by LDM's error message, the product queue is the
/home/ldm/var/queues/ldm.pq file.  That single file will house the entire
queue, so you wouldn't see excessive files from that.

That being said, the times I've had issues like yours with not being able
to log in or issue commands, it was usually because of a full root
partition ("/"), a full /tmp partition (unlikely to be relevant here, but
just FYI), exhausted memory, or exhausted inodes on a partition.  I see Tom
already asked about "df -h" output, and you already checked inodes and that
appears fine.  Still, those are the usual suspects in my experience.
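
For reference, here is a rough sketch of the checks I mean, bundled into one
function.  The function name is made up for illustration; the commands are
the standard ones, and "free" is skipped if the system doesn't have it.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): show disk space, inode,
# and memory usage so a full partition or exhausted memory stands out.
check_resources() {
    target="${1:-/}"

    echo "== disk space =="
    df -h "$target" /tmp

    echo "== inodes =="
    df -i "$target"

    echo "== memory =="
    # free is part of procps on Linux; skip quietly where it is missing
    if command -v free >/dev/null 2>&1; then
        free -h
    fi
}

# Example: check_resources /
```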

So what IS in /home/ldm/var/data ?  My guess is that's where LDM is saving
data to, and that configuration would be found in your pqact file(s).  One
thing you could try is running the following command to see what LDM will
attempt to save in that directory (assuming your pqact file(s) are named
"pqact..." and live in ~/etc, otherwise adjust accordingly):  "grep var/data
~/etc/pqact* | grep -i file"  (without quotes)

Side-note to the above:  By default, relative paths with the FILE action
will start in the "/home/ldm" directory.  This is set in ~/etc/registry.xml
under /pqact/datadir-path, and you can check it with "regutil
/pqact/datadir-path" (without quotes).  If that points straight to your
/home/ldm/var/data/ dir then THAT becomes the default starting point for
relative paths (and it might make the above grep command come back empty).
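
To catch both cases (absolute paths and relative ones), a rough sketch is to
list every uncommented FILE action and eyeball the destinations.  This
assumes the pqact files are in ~/etc and named "pqact..."; the function name
is just something I made up.

```shell
#!/bin/sh
# Hypothetical helper: list uncommented lines in pqact* files that use
# the FILE action, regardless of the destination path.
list_file_actions() {
    etc_dir="${1:-$HOME/etc}"
    # ^[^#]*  : only look at text before any "#", so comments are skipped
    # FILE must stand alone as a whitespace-delimited word
    grep -H -n -i -E '^[^#]*(^|[[:space:]])FILE([[:space:]]|$)' \
        "$etc_dir"/pqact*
}

# Example:
#   list_file_actions ~/etc
#   regutil /pqact/datadir-path   # where relative paths start
```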

If there are actions to save data there they should (hopefully, though it's
not guaranteed) be listed by that grep command, and that could point you
where to look next.  If it comes back empty then maybe something's getting
PIPEd to a script which is in turn saving data there, but that might be
harder to track down.  Either way, it's hard to know without looking in
that directory or your pqact(s) what might be happening, but hopefully this
will yield a clue or two.  It's possible you're getting more than you think
you're asking for, and it's leading to that directory filling up... and if
that's on the root partition it could explain the login / lock-up issues.

You also mentioned ldmadmin scour doesn't seem to be doing much.  Check
~/etc/scour.conf to see where it's doing actual scouring.  Maybe it's not
looking in that data directory, or maybe it is letting files stay too long.
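
A quick way to see what scour is actually configured to do is to strip the
comments and blank lines out of that file.  A sketch, with a made-up
function name; as far as I recall each remaining entry is a directory, a
days-old value, and an optional filename pattern, but check the scour.conf
reference page for your version.

```shell
#!/bin/sh
# Hypothetical helper: print only the active (non-comment, non-blank)
# entries of scour.conf.
show_scour_entries() {
    conf="${1:-$HOME/etc/scour.conf}"
    grep -v -E '^[[:space:]]*(#|$)' "$conf"
}

# Example: show_scour_entries ~/etc/scour.conf
```

If /home/ldm/var/data (or a parent of it) doesn't show up in the output,
scour isn't cleaning that directory.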

I'd also be curious about the size of your product queue vs. the size of
the partition it's on.  If the queue can be created and LDM starts at all
it's probably fine, but it is worth paying attention to.  The size of the
queue is defined in ~/etc/registry.xml; compare the output of "ls -lh
/home/ldm/var/queues/ldm.pq" and "df -h" to see how much of the partition
the queue takes up.  I try to ensure the partition it's on stays at 75% or
less, though I don't think that's a hard-and-fast rule, just guidance.
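
A minimal sketch of that comparison.  The function name is made up, and the
75% threshold is only my rule of thumb from above, not anything LDM
enforces.

```shell
#!/bin/sh
# Hypothetical helper: show the queue file size next to the usage of
# the partition that holds it, and flag usage above 75%.
queue_vs_partition() {
    pq="${1:-/home/ldm/var/queues/ldm.pq}"

    ls -lh "$pq"
    df -h "$pq"

    # df -P gives a stable one-line-per-filesystem layout; field 5 is
    # the capacity column, e.g. "42%"
    pct="$(df -P "$pq" | awk 'NR==2 { gsub("%", "", $5); print $5 }')"

    case "$pct" in
        ''|*[!0-9]*)
            echo "could not determine partition usage"
            return 1
            ;;
    esac

    if [ "$pct" -gt 75 ]; then
        echo "WARNING: partition is ${pct}% full"
    else
        echo "OK: partition is ${pct}% full"
    fi
}

# Example: queue_vs_partition /home/ldm/var/queues/ldm.pq
```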

Some reference pages that may be useful to you if you haven't seen these
already:
https://www.unidata.ucar.edu/software/ldm/ldm-current/basics/ldmd.conf.html
https://www.unidata.ucar.edu/software/ldm/ldm-current/basics/pqact.conf.html
https://www.unidata.ucar.edu/software/ldm/ldm-current/basics/scour.conf.html

https://www.unidata.ucar.edu/software/ldm/ldm-current/basics/LDM-registry.html


Per your last email:
> just to confirm... find and rm on the data dir won't mess up / confuse
> the ldm queue stuff?

It shouldn't.  Again, from what I've seen in your original email that's not
where the queue is.  And even if it were, scour shouldn't touch it as long
as it keeps updating (though rm -rf would).  I'd double-check
~/etc/registry.xml to verify the queue is housed elsewhere, but it sounds
like you should be fine on this.
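
If you do go the find route, here is a cautious sketch: list first, delete
second.  The function names are made up, and "-delete" assumes GNU find
(which CentOS 7 has).  Point it only at the data directory, never at
var/queues.

```shell
#!/bin/sh
# Hypothetical helpers for cleaning a data directory by file age.
# -mtime +N matches files whose age in whole days is greater than N,
# so +1 means at least 48 hours old.

# Dry run: print the files that would be removed.
list_old_files() {
    dir="$1"
    days="${2:-1}"
    find "$dir" -type f -mtime +"$days" -print
}

# Actually remove them (regular files only; directories are left alone).
purge_old_files() {
    dir="$1"
    days="${2:-1}"
    find "$dir" -type f -mtime +"$days" -delete
}

# Example:
#   list_old_files /home/ldm/var/data 1 | head
#   purge_old_files /home/ldm/var/data 1
```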

Hope some of this helps you out,

-Mike

======================
Mike Zuranski
Meteorology Support Analyst
College of DuPage - Nexlab
Weather.cod.edu <http://weather.cod.edu/>
======================


On Fri, Apr 24, 2020 at 1:32 PM Jack Snodgrass <jack@xxxxxxxxxxxxxx> wrote:

> having issues with our server ( centos7 ) that runs ldm... locking up. It
> has happened 2 times in the last 3 weeks or so.
> The server is pingable... so it's not totally dead.. but you can't get a
> local or remote console to start. can't figure out if it is out of memory
> or file handles or what.... it's like a ghost of itself.
>
> After rebooting... the  /home/ldm/var/data/ has around 350,000 files in
> it.  I am not sure if that is 'ok' or a bit extra.
>
> We are running a
>
> ldmadmin scour
>
> command... via cron but I don't know what that is doing exactly or if it's
> doing much.
>
> when I try and restart ldm it says:
>
> Checking the product-queue...
> The writer-counter of the product-queue isn't zero.  Either a process
> has the product-queue open for writing or the queue might be corrupt.
> Terminate the process and recheck or use
>     pqcat -l- -s -q /home/ldm/var/queues/ldm.pq && pqcheck -F -q
>     /home/ldm/var/queues/ldm.pq
> to validate the queue and set the writer-counter to zero.
> LDM not started
>
>
> In the past.... during testing and what not.. I've been able to run:
> pqcat -l- -s -q /home/ldm/var/queues/ldm.pq && pqcheck -F -q
> /home/ldm/var/queues/ldm.pq
>
> and ldm would start after that. This time.. with the 350K files or so..
> that pqcat stuff fails.
>
> I am deleting older ( than a day ) files from the /home/ldm/var/data/
> directory... going to see if
>
> pqcat -l- -s -q /home/ldm/var/queues/ldm.pq && pqcheck -F -q
> /home/ldm/var/queues/ldm.pq
>
>
> will work or if I have to rm -rf /home/ldm/var/data/ and start a new q.
>
>
> If  ldmadmin scour does not let us remove enough files from
> /home/ldm/var/data/ can I use find and rm to remove files or do they have
> to be removed using ldm to keep any queues or indexes in sync?
>
> - jack
>
> --
> *jack* - Southlake Texas - http://mylinuxguy.net - *817-601-7338*