Dave,

> Yeah, I might have been the one who originally encountered this
> problem and asked you'all for help with it.

You are (now that you reminded me -- it's truly been a while).

> Apple's lack of response
> has been disappointing.

To say the least.

> Maybe they'll be more responsive during this
> period immediately following Snow Leopard's release, while they try to
> get the bugs out of it.  (Either that, or they'll be more overwhelmed
> than usual with bug reports.)

From your fingers to their eyes. :-)

I suspect that there just aren't that many programs running on Mac OS X
10.4 (or higher) systems that use fcntl() file-locking and mmap()
memory-mapping as much as the LDM.

> Unidata's "known bug" entry for this problem notes that there is no
> workaround.  That's true in a sense, but if it were strictly true then
> I'd never be able to stop the LDM at all, even when processes hang
> (which eventually some of them do on a semi-regular basis).

I think a hung downstream LDM will, nevertheless, terminate upon
reception of a SIGTERM, which is what the top-level LDM server sends all
child processes when it's told to terminate.  I could be wrong, however.

One thing I have noticed is that attaching to the hung process with
gdb(1) and then exiting gdb(1) will free the process from its hung
state.  I'm at a loss to understand how that happens without
intervention by the operating system.

> I've
> written scripts that try to deal with the inability to run "ldmadmin
> stop" to stop the LDM; maybe you could comment on whether or not I've
> got the bases covered acceptably:
>
> (1) Run "ldmadmin stop", redirecting the output to a file.
>
> (2) Check that file for the word "isn't" (as in "the LDM isn't
> running", or something like that).
>
> (3a) If the LDM isn't running, check to see if there's a ldmd.pid
> file.
>    -- If there's no ldmd.pid file, run pqcat & pqcheck, then run
> "ldmadmin clean".
>
> (3b) If the LDM is running, wait 30 seconds to give it a chance to
> shut down.  (This is typically doomed, at least for some rpc processes.)

Maybe you should give it a minute.

> (4) Get a list of pids for rpc processes owned by the ldm account.
>
> (5) If this list isn't empty:
>    -- Run "kill -9" on them all.
>    -- Run "ldmadmin clean".
>    -- Run pqcat and pqcheck.  If this doesn't produce a "tallied
> consistent" message, run "ldmadmin delqueue" and "ldmadmin mkqueue".
>
> (6) Run "ldmadmin start".

This procedure should result in a restarted LDM.  I just wish it wasn't
necessary.

> (Script attached.)
>
> -- Dave

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: USJ-914724
Department: Support LDM
Priority: Normal
Status: Closed
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.
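
For readers without the attached script, the six-step restart procedure quoted in the message can be sketched roughly as follows. This is not Dave's script, which is not reproduced here; it is a minimal Python outline written under several assumptions: the helper names (`sh`, `rpc_pids`, `queue_is_consistent`, `restart_ldm`), the `PID_FILE` path, the use of `ps -U ldm -o pid= -o comm=` to find the rpc processes, and the bare `pqcat` and `pqcheck` invocations (any flags or redirections they need at a given site) are all illustrative. The wait uses the one minute suggested in the reply rather than the original 30 seconds.

```python
import os
import subprocess
import time

# Illustrative assumptions, not taken from the attached script.
LDM_USER = "ldm"
PID_FILE = "ldmd.pid"      # hypothetical path; adjust to the local LDM home
WAIT_SECONDS = 60          # the reply suggests a minute instead of 30 seconds


def sh(cmd):
    """Run a shell command and return its combined stdout/stderr as text."""
    proc = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True, errors="replace")
    return proc.stdout


def rpc_pids(ps_output):
    """Return PIDs from 'pid command' lines whose command contains 'rpc'."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2 and "rpc" in fields[1]:
            pids.append(fields[0])
    return pids


def queue_is_consistent(run):
    """Run pqcat and pqcheck; report whether 'tallied consistent' appears."""
    output = run("pqcat") + run("pqcheck")
    return "tallied consistent" in output


def restart_ldm(run=sh, sleep=time.sleep, pid_file_exists=os.path.exists):
    stop_output = run("ldmadmin stop")                  # step 1
    if "isn't" in stop_output:                          # step 2
        if not pid_file_exists(PID_FILE):               # step 3a
            queue_is_consistent(run)
            run("ldmadmin clean")
    else:
        sleep(WAIT_SECONDS)                             # step 3b
    ps_output = run(f"ps -U {LDM_USER} -o pid= -o comm=")
    pids = rpc_pids(ps_output)                          # step 4
    if pids:                                            # step 5
        run("kill -9 " + " ".join(pids))
        run("ldmadmin clean")
        if not queue_is_consistent(run):
            run("ldmadmin delqueue")
            run("ldmadmin mkqueue")
    run("ldmadmin start")                               # step 6
```

Calling `restart_ldm()` as the ldm user would run the steps against the real commands. The `run`, `sleep`, and `pid_file_exists` parameters exist only so the control flow can be exercised with stand-ins before it is trusted on a live system.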