>From: "Dan A. Dansereau" <address@hidden>
>Organization: USU
>Keywords: 200305201449.h4KEnaLd000625 McIDAS-X 2002 ldm-mcidas Tru64

Hi Dan,

re: why images keep getting decoded into the same output AREA

> Thanks - I'm at the wrong end of a rope!

Well, I think I just joined you at the end of that rope!

>P.S. I really want to know what I did or did not do!

The decoding on climatemine is working now, but I am not sure what I did
to get it to this point (if anything).  Here is what I did:

- mucked around with the directory permissions under /var/data.  I had
  found that /var/data was owned by root, and that 'mcidas' couldn't
  change the permissions on ROUTE.SYS and SYSKEY.TAB in /var/data/mcidas.
  This doesn't make much sense since 'mcidas' owned the files I was
  trying to change permissions on!

- changed permissions on ROUTE.SYS and SYSKEY.TAB to rw-rw-r--; they
  were rwxrwxrwx

- created the ~ldm/mcidas/data directory

- brought over the source for ldm-mcidas v2002b and built the package
  from source.  I then added more and more debug output to pnga2area.c
  and pngsubs.c to try to get a handle on what was happening.  During
  this process, I changed one line of code that compares two strings.
  The original check compared the first two characters; I changed it to
  compare the first four.  This _should not_ have had any effect on the
  decoding of images.

When I left work yesterday afternoon, the decoding was not working
correctly.  The decoder, pnga2area, would read the routing table,
ROUTE.SYS, to get the last AREA number that an image of the type being
processed was stored in.  What it was not doing -- for some unknown
reason -- was incrementing that number by 1 so that the new image would
be decoded into a different AREA (this was the crux of the problem,
btw).  The debug statements that I added were to find out why that
output AREA number was not being incremented.
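To make the expected behavior concrete, here is a toy sketch of the
read/increment/write-back cycle described above.  It is NOT the
pnga2area code and does not read the real (binary) ROUTE.SYS: the
dictionary layout, the product name "PRODUCT_X", the function names,
and the wrap-around within a first/last AREA range are all assumptions
made for illustration.

```python
# Toy model only.  The dict stands in for the routing table; its keys,
# the product name, and the wrap-around range are hypothetical.

def next_area(last_area, first, last):
    """Return the AREA number that follows last_area, wrapping in [first, last]."""
    if last_area < first or last_area >= last:
        return first
    return last_area + 1


def decode_images(routing, product, count, write_back=True):
    """Simulate decoding `count` images; return the AREA number used for each.

    With write_back=False the new AREA number is computed but never
    stored, which models an update that fails to reach the routing table.
    """
    used = []
    for _ in range(count):
        entry = routing[product]
        area = next_area(entry["last_area"], entry["first"], entry["last"])
        used.append(area)
        if write_back:
            entry["last_area"] = area  # the step that must succeed
    return used


if __name__ == "__main__":
    table = {"PRODUCT_X": {"first": 140, "last": 142, "last_area": 140}}
    print(decode_images(table, "PRODUCT_X", 4))

    stuck = {"PRODUCT_X": {"first": 140, "last": 142, "last_area": 140}}
    print(decode_images(stuck, "PRODUCT_X", 3, write_back=False))
```

In this model, when the write-back step is skipped every image lands in
the same AREA, which matches the symptom seen on climatemine.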
I suspected that the code _was_ actually incrementing the number, but
the information was not getting written back to the routing table for
some reason (hence the mucking with file/directory permissions).  This
would cause the decoder to think that the image being processed was the
first one ever received, so new images would keep getting decoded into
the same AREA numbers.

After I got home after dinner, I logged back onto climatemine and found
to my great surprise that there were multiple images of each type on
disk, indicating that the information was successfully getting written
back to the routing table.  The only thing I changed just before leaving
work was the permissions on the ~ldm directory itself.  It had been
rwx--, and I changed it to be rwxrwxr--.  I did not expect that this
would make any difference, but, combined with the creation of the
~ldm/mcidas/data directory, it might have.  In fact, since all of the
debug statements and the one-line change were in place before I changed
the ~ldm directory permissions, and decoding was not working correctly
then but started working afterwards, this is the only thing that could
have made things start working (unless you did something different to
the OS in the interim).

The _REALLY_ puzzling thing for me is that the compositing of GOES-East
and West images _was_ working throughout this entire process and the
routing table was getting updated to reflect those changes.  This means
that the processes being run by 'ldm' had to be able to write to the
routing table.  This all gives me a headache, and makes me feel that I
am at the end of that rope with you :-(

Let's move on.

I did some more things on climatemine that had nothing to do with the
decoding, but did have a lot to do with keeping things running.

1) Your /var file system ran out of room while I was working.  I
   recognized this since I got a message while editing using vi.
   I changed the number of days of GRID data being kept online by
   modifying ~ldm/decoders/mcscour.sh and by deleting by hand all GRID
   files in /var/data/xcd that were one day old.

2) I added a cron entry to rotate the ldm-mcidas.log files.  I did this
   since ~ldm/logs/ldm-mcidas was getting excessively large (the size
   before rotation grew to 1.7 MB).

3) There were a number of orphaned shared memory segments (indicated by
   running 'ipcs') and associated subdirectories in ~ldm/.mctmp.  These
   were created by McIDAS processes (like compositing of East and West
   images), but were not removed for some reason when the processes
   exited.  I removed those segments (using 'ipcrm -m <segno>') and the
   .mctmp subdirectories (using 'rm') while I had the LDM shut down
   (it is important not to do this while the LDM is running since you
   might be deleting a segment/directory that is in use).

4) While I was on climatemine, I took the opportunity to upgrade the LDM
   to LDM-6.0.11.  I did this to see if it eliminated a problem which I
   mention below.

Some observations:

- you are currently decoding imagery into /var/data/mcidas and XCD files
  into /var/data/xcd.  I recommend combining the output directories so
  that everything goes into /var/data/mcidas.  The reason for this is
  that with the current setup (that works), you have to have copies of
  SCHEMA, ROUTE.SYS, and SYSKEY.TAB in both of these directories AND
  really the copies of ROUTE.SYS and SYSKEY.TAB should be the same.  The
  only way to do that now is to have ROUTE.SYS and SYSKEY.TAB in one
  directory and then make links to those copies in the other directory.
  It is just simpler in the long run to combine the output directories.

- I am seeing a mysterious memory fault when running
  'ldmadmin pqactcheck':

  sh: 134600 Memory fault

  The error is causing a core dump of pqact when the limit on coredump
  is changed from its default size of 0 to unlimited.  This memory fault
  is associated with the ldmadmin action that checks the pqact.conf
  file's use of /dev/null.
  pqact is running normally when processing actions from
  ~ldm/etc/pqact.conf, so there is no urgent need to find out what the
  problem is.  I don't understand this memory problem, but I think that
  it must be looked into fairly soon.  I suspect that it has something
  to do with an OS configuration/permission.

Further investigations:

- I want to continue to try to understand why the ldm-mcidas image
  decoding was not working correctly, and what actually changed to make
  it start working.  With your permission, I will continue to log on to
  climatemine over the next few days to poke around.

- We need to understand what is causing the memory fault problem when
  running 'ldmadmin pqactcheck'.

Lastly, I am hoping that you will upgrade the LDM on allegan to 6.0.11
or, if it gets cut, 6.0.12 today, this weekend, or Monday.

I have got to run right now...

Tom

>From address@hidden Fri May 23 09:58:04 2003

Tom

I have not changed a thing on the OS, so some of your magic must have
worked, however - all of the composites (mdrtopo, gwvistopo, gew-vis)
are now blank/black.

Anyway FEEL FREE to log on, and do whatever is needed to fix this thing!
And - what can I do to help, or pay back you/unidata for your help?

Dan

NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.