
20020724: LDM / CPU Issue



Patrick,

It looks like most of the resources are being taken up by decoders
(as you see with the high IOWAIT).

When you first start your LDM, you will have a backlog of data to digest
(generally 1 hour's worth, unless you have modified the values).
This usually means the LDM is hitting the disk pretty hard filing data and
kicking off decoders.

If this CPU usage is temporary, e.g. it goes down after your LDM has
caught up (use ldmadmin watch to see when your arriving
products have caught up), then it probably can't be avoided.

If this remains the condition, then you may need to look at the
2 biggest CPU users, which are the McIDAS programs (dmsyn.k and
dmsfc.k in your top listing). Could they be looking for something
that you have to recreate after your file system was fixed? Are you
seeing your McIDAS surface data? Those seem to be the decoders with
the top usage.
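
As a rough illustration of how that ranking can be read off the listing, here is a minimal Python sketch that parses a few rows copied from the top output quoted below and sorts them by the CPU column. It is not part of the LDM or McIDAS; the names TOP_ROWS and rank_by_cpu are hypothetical, and it assumes the 11-column layout shown in that listing.

```python
# Rows copied from the "ldm running" top listing quoted later in this message.
TOP_ROWS = """\
732 ldm 1 60 0 5824K 4704K sleep 3:30 28.02% dmsyn.k
730 ldm 1 60 0 3616K 2416K sleep 1:20 12.16% dmsfc.k
756 ldm 1 58 0 25M 5160K sleep 0:16 2.07% dcgrib2
699 ldm 1 59 0 245M 33M sleep 0:40 1.25% pqact
"""


def rank_by_cpu(text, user="ldm"):
    """Return (command, cpu_percent) pairs for one user, highest CPU first.

    Assumes the columns PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU
    COMMAND, as in the quoted listing; rows with another shape are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 11 or fields[1] != user:
            continue
        rows.append((fields[10], float(fields[9].rstrip("%"))))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Show the two heaviest processes owned by the ldm user.
for command, cpu in rank_by_cpu(TOP_ROWS)[:2]:
    print(f"{command:10s} {cpu:5.2f}%")
```

Running the same parse on a listing taken after the LDM has caught up would show whether these two decoders are still on top.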

I'll pass on to Tom Yoksas to see if he has input on 
your xcd decoders.

Steve Chiswell
Unidata User Support




>From: "Patrick O'Reilly" <address@hidden>
>Organization: UCAR/Unidata
>Keywords: 200207242009.g6OK9o901665

>Hi there,
>
>I had my ldm machine go down, and brought it back up after fixing the
>root filesystem.  This happened while I was away (of course).  After
>getting things up, I noticed the ldm making almost 100% of the use of
>the CPU.  Below, the results from top with the ldm running:
>
>load averages:  0.88,  1.24,  1.00                            15:01:30
>62 processes:  61 sleeping, 1 on cpu
>CPU states:  0.0% idle, 27.3% user, 11.4% kernel, 61.3% iowait,  0.0% swap
>Memory: 256M real, 86M free, 211M swap in use, 487M swap free
>
>   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
>   732 ldm        1  60    0 5824K 4704K sleep    3:30 28.02% dmsyn.k
>   730 ldm        1  60    0 3616K 2416K sleep    1:20 12.16% dmsfc.k
>   756 ldm        1  58    0   25M 5160K sleep    0:16  2.07% dcgrib2
>   699 ldm        1  59    0  245M   33M sleep    0:40  1.25% pqact
>  1267 ldm        1  58    0 2256K 1696K cpu      0:00  0.36% top
>   228 root       8  55    0 2992K 2256K sleep    0:00  0.23% nscd
>   728 ldm        1  59    0 2616K 1440K sleep    0:09  0.22% ingetext.k
>   701 ldm        1  59    0  244M   33M sleep    0:05  0.22% rpc.ldmd
>  1254 root       1  58    0 2840K 1856K sleep    0:00  0.21% sshd
>   765 ldm        1  59    0 2480K 1280K sleep    0:00  0.20% ingebin.k
>   698 ldm        1  59    0  244M   21M sleep    0:08  0.17% pqbinstats
>  1258 ldm        1  52    0 1448K 1256K sleep    0:00  0.12% csh
>  1256 ldm        1  52    2 1864K 1416K sleep    0:00  0.07% traceroute
>  1251 ldm        1  52    2 2696K 2328K sleep    0:00  0.06% netcheck
>   745 ldm        1  59    0   21M 2176K sleep    0:22  0.02% dcmetr
>
>And now with the ldm stopped:
>
>load averages:  0.04,  0.36,  0.65                            15:08:44
>36 processes:  35 sleeping, 1 on cpu
>CPU states: 99.2% idle,  0.2% user,  0.6% kernel,  0.0% iowait,  0.0% swap
>Memory: 256M real, 112M free, 28M swap in use, 670M swap free
>
>   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
>  1321 ldm        1  50    0 2232K 1672K cpu      0:00  0.40% top
>  1254 root       1  58    0 2840K 1856K sleep    0:00  0.04% sshd
>   363 root      12  58    0 3176K 2904K sleep    0:00  0.02% mibiisa
>   440 root       1  38    0 6592K 4888K sleep    0:01  0.00% Xvfb
>   177 root       1  55    0 2808K 1472K sleep    0:01  0.00% sshd
>   151 root       1  13    0 1808K 1176K sleep    0:00  0.00% inetd
>   265 root       1  31    0  968K  792K sleep    0:00  0.00% htt
>   185 daemon     4  33    0 2640K 1928K sleep    0:00  0.00% statd
>   186 root       1  33    0 2064K 1336K sleep    0:00  0.00% lockd
>  1208 root       1  39    0 2448K 1584K sleep    0:00  0.00% fbconsole
>    53 root       5  40    0 1344K  808K sleep    0:00  0.00% syseventconfd
>   281 root       1  42    0  320K  320K sleep    0:00  0.00% rc3
>   437 root       1  42    0  320K  312K sleep    0:00  0.00% sh
>   259 root       1  46    0 3168K 1384K sleep    0:00  0.00% sendmail
>   288 root       4  48    0 5176K 2520K sleep    0:00  0.00% dtlogin
>
>Any ideas about why it would be hogging the resources as such?
>
>Thanks ....
>
>Patrick
>
>_______________________________________
>Patrick O'Reilly
>Meteorological Decision Support Scientist
>The STORM Project - University of Northern Iowa
>address@hidden  ~  ph: 319-273-3789
>