Hi Gilbert,

OK, I am just about done tweaking the setup on weather3...

re: FILEing of NEXRAD Level III products on weather

>Well the CPU load is incorporated into the overall load average, and
>that's what's critical.

I am suspicious that the negative effects you were seeing (e.g., high
load averages) may have been caused by having too many processing
actions for too many feeds in your ~ldm/etc/pqact.gempak file.  The
reason I say that is the following observation on weather3:

1) the list of feeds for which there are actions in pqact.gempak is:

   CMC|CONDUIT|FNEXRAD|FSL2|GPS|NNEXRAD|NGRID|NIMAGE|NLDN|NOGAPS|PCWS|UNIDATA|WSI

2) the number of actions in pqact.gempak is:

   /home/ldm/etc% grep -v ^# pqact.gempak | grep -v ^" " | grep -v ^$ | wc -l
   487

3) the Data Volume Summary page for weather3.admin.niu.edu is as follows:

   http://www.unidata.ucar.edu/cgi-bin/rtstats/rtstats_summary_volume?weather3.admin.niu.edu

   Data Volume Summary for weather3.admin.niu.edu

   Maximum hourly volume      623.716 M bytes/hour
   Average hourly volume      411.020 M bytes/hour
   Average products per hour    49993 prods/hour

   Feed          Average              Maximum        Products
                 (M byte/hour)        (M byte/hour)  number/hour
   HDS           191.371 [ 46.560%]   376.626        18196.435
   NEXRAD2        79.296 [ 19.293%]   129.694         4481.261
   FNEXRAD        69.901 [ 17.007%]    88.611           70.674
   NNEXRAD        24.180 [  5.883%]    29.013         2054.261
   UNIWISC        20.784 [  5.057%]    32.314           23.826
   IDS|DDPLUS     17.690 [  4.304%]    21.914        25127.891
   DIFAX           5.543 [  1.349%]    22.425            6.957
   FSL2            2.026 [  0.493%]     2.156           21.848
   NLDN            0.229 [  0.056%]     0.577            9.696

This listing shows that there are about 49500 products each hour that
are checked for processing by the pqact that is handling the
pqact.gempak actions.  Before I changed the list of feeds processed by
the pqact that is responsible for the pqact.gempak actions, it had to
scan ALL products in ALL feeds each hour, or almost 50000 products on
average.  This means that this single pqact has to do
49500 * 487 = 24106500 pattern comparisons each hour, and it acts on
some fraction of these.  NOTE that pqact does not stop working its way
through a pattern/action file when a match is found; it continues
looking for additional matches.

The amount of processing that this single pqact would have to do leads
me to believe the following:

- it would likely fall behind in its processing if it were tasked with
  FILEing all NEXRAD Level III products

- it should consume a lot of CPU

Now, splitting the actions into more pqact.conf files will help keep
any one pqact from falling behind in the processing it is attempting
to do.  It should not, however, decrease the overall CPU use; in fact,
it should increase it over a shorter time interval.

So, what's my point?

- Chiz added the ability to generate multiple pqact.conf files for
  GEMPAK processing based on his observation that if one leaves all
  processing in one pqact.conf file, the processing can fall behind
  enough that products are not processed out of the LDM queue before
  they are overwritten by newly received ones.

- it may be the case that moving the NIMAGE processing to a pqact that
  is not already overloaded would result in weather3 being able to
  process the data without the very high load averages you experienced
  (see the ldmd.conf sketch below)

re: I just logged into weather as 'mcidas' and:
    - pointed at weather2 for RTNEXRAD data
    - removed the ADDE definitions for the RTNEXRAD dataset from the
      server mapping table, $MCDATA/RESOLV.SRV

>OK, great. Thanks!

No worries.
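To make the idea of splitting the pattern/action processing concrete,
here is a minimal sketch of how the pqact invocations could look in
~ldm/etc/ldmd.conf.  The file name pqact.conf_nimage and the exact feed
list are illustrative assumptions, not the actual entries on weather3:

   #
   # Hypothetical split: one pqact scans only NIMAGE products...
   #
   exec    "pqact -f NIMAGE etc/pqact.conf_nimage"
   #
   # ...while a second handles the remaining GEMPAK actions, so neither
   # invocation has to test every received product against every pattern
   #
   exec    "pqact -f CMC|CONDUIT|FNEXRAD|FSL2|GPS|NNEXRAD|NGRID|NLDN|NOGAPS|PCWS|UNIDATA|WSI etc/pqact.gempak"

With a split like this, each pqact only compares the products in its
own feed set against its own (smaller) pattern/action file, which is
what keeps any one process from falling behind.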
re: weather3 is either a dual 3 GHz machine or a single with hyperthreading

>It's the latter, so is weather2. They're identical.

OK.

re: weather, on the other hand, has a single 3 GHz processor

>Yep.

OK.

re: Given the hardware I see, I would think that weather would struggle
    more than the other two machines

>Yes, and...

re: One of the biggest loads on any machine is X Windows -- it is a
    HUGE memory user

>Unfortunately, for WXP I have to use it.

Hmm... Can't you use a virtual frame buffer for generating the WXP
products for your web site?

re: processing of NEXRAD Level II data

>Correct, but the limited amount keeps the load from getting too high. I
>used to have all LEVEL2 data on weather3, and when I get the new machines,
>I will do so again.

OK.

re: I finished adjusting the processing being done by McIDAS pqact.conf
    actions to remove duplication of those being done for GEMPAK

>Good!

Yes, this will save disk AND CPU.

re: I propose that we investigate the high load averages seen when
    processing NIMAGE data

>I did a "yum -y install *iostat*" but didn't find any packages. Any clues?

Yup.  I installed the package containing iostat as follows:

   yum install sysstat-7.0.4-3.fc7

This installed /usr/bin/iostat and /usr/bin/sar.  I then copied over
the script we use for system monitoring, ~ldm/util/uptime.tcl, and
adjusted some entries to work on your system (like the PATH defined in
uptime.tcl).  I then added running of the script once per minute from
cron:

   #
   # Monitor system performance
   #
   * * * * * util/uptime.tcl logs/weather3.uptime
   0 0 1 * * bin/newlog logs/weather3.uptime 12

The items listed in each line of the log are (using the first entry as
an example):

   20070827.2121 0.51 1.00 1.27 10 18 28 7481 39M 6M 38.00 18.50 0.50 43.00

   20070827   date [ccyymmdd]
   2121       time [UTC]
   0.51       1 minute load average
   1.00       5 minute load average
   1.27       15 minute load average
   10         # downstream connections
   18         # upstream connections
   28         total # connections
   7481       age of oldest product in LDM queue [s]
   39M        free memory
   6M         swap in use
   38.00      %user
   18.50      %system
   0.50       I/O wait
   43.00      %idle

The output from this file will give us a time history of the
performance on weather3.

I have adjusted things on the McIDAS ADDE side to use GEMPAK-processed
images where needed.  I believe that the redundancy in processing/disk
use between GEMPAK and McIDAS is now gone.

>Great.

re: I think that weather3 should easily be able to handle the
    processing load you have on it AND file the NIMAGE products.  The
    fact that it can't leads me to suspect that something is wrong
    somewhere.  The thing to do is find out where the problem(s)
    is(are) and fix it(them).

>OK.

I will turn on NIMAGE processing in the combined McIDAS pqact.conf
file, ~ldm/etc/pqact.conf_mcidas, to see what happens on weather3.  I
will write the NIMAGE data into the directory structure needed for
GEMPAK, but the action now in pqact.gempak will be commented out (a
sketch of what that looks like follows below).
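For reference, the kind of pattern/action entries involved look roughly
like the following.  The regular expression and output path here are
purely illustrative, not the actual entries on weather3:

   # Active entry in ~ldm/etc/pqact.conf_mcidas (hypothetical pattern
   # and path): FILE each NIMAGE product into the directory tree that
   # GEMPAK expects
   NIMAGE	^satz/ch[0-9]/(.*)
   	FILE	-close	data/gempak/images/sat/\1

   # Matching entry in ~ldm/etc/pqact.gempak, commented out so the same
   # product is not FILEd twice
   #NIMAGE	^satz/ch[0-9]/(.*)
   #	FILE	-close	data/gempak/images/sat/\1

In pqact.conf the fields (feedtype, extended regular expression, and
action) must be separated by tab characters, and continuation lines
must begin with a tab.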
re: I can't see how what you have right now in weather3 is not able to
    keep up with what you are trying to do

>Hmm.

OK.

re: take care with overclocking

>I do it now, no problems so far, but I only go 5% over.

Yes, but you ran into a heat problem on weather...

>Gotta run...

More as the NIMAGE testing proceeds.

Cheers,

Tom
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                            Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                              http://www.unidata.ucar.edu
****************************************************************************

Ticket Details
===================
Ticket ID: KDN-271049
Department: Support McIDAS
Priority: Normal
Status: Closed