
[McIDAS #TAL-270196]: Not getting MD data



Hi Heather,

re:
> So we stopped decoding the MD data again.  Exactly at the end of the day
> yesterday.  Not good!

I agree, this is not good!

re:
> I did the greps that you ask me to do.  The first one came up with nothing,
> and the second grep came up just fine.

I should have had you do one or two more 'ps' invocations to make sure that
what I was expecting to see would be listed.  On all of our systems, the
'ps -eaf | grep DM | grep -v grep' invocation will list all of the McIDAS-XCD
data monitors running on the system.  I assumed that this would be the
case for your system as ** I think ** you are running either RedHat Enterprise
6.x or CentOS 6.x.  I guess it might be possible that the data monitors wouldn't
show up in this 'ps' invocation, but the real executables (e.g., dmsfc.k, etc.)
would in that case.
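
A rough sketch of a check that covers both possibilities in one pass is
shown here.  The function name 'xcd_monitor_lines' is hypothetical, and the
'dm<name>.k' pattern is an assumption generalized from the dmsfc.k example
above; adjust it to whatever the executables are actually called on your
system.

```shell
# Filter a 'ps -eaf' listing (read from stdin) down to lines that look like
# McIDAS-XCD data monitors.  Assumption: a monitor shows up either with "DM"
# in its command line or as an executable named dm<name>.k (e.g., dmsfc.k).
xcd_monitor_lines() {
    grep -E 'DM|(^|[ /])dm[a-z0-9]+\.k' | grep -v grep
}

# Run against the live process table; report if nothing matches.
ps -eaf | xcd_monitor_lines || echo "no XCD data monitors found"
```

If this prints only the "no XCD data monitors found" message while the
'inge' processes are still listed, the data monitors themselves are most
likely not running.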

re:
> Yesterday while everything was running okay, both
> came up with the processes that you said they would.

OK, good.  This answered a couple of questions that I was
ready to pose.

re:
> Here is a screen grab of what I am seeing:
> 
> [ldm@npxcd ~]$ cd /data/
> [ldm@npxcd data]$ ls -ltr MDXX*
> -rw-rw-r-- 1 ldm ldm  5604680 Jul 27 09:13 MDXX0018
> -rw-rw-r-- 1 ldm ldm  2897136 Jul 27 09:13 MDXX0028
> -rw-rw-r-- 1 ldm ldm   542880 Jul 27 09:13 MDXX0057
> -rw-rw-r-- 1 ldm ldm  7097768 Jul 27 10:50 MDXX0109
> -rw-rw-r-- 1 ldm ldm  5961852 Jul 27 13:25 MDXX0068
> -rw-rw-r-- 1 ldm ldm 14156120 Jul 27 18:07 MDXX0119
> -rw-rw-r-- 1 ldm ldm 12630748 Jul 27 20:00 MDXX0038
> -rw-rw-r-- 1 ldm ldm 49844936 Jul 27 20:00 MDXX0008
> -rw-rw-r-- 1 ldm ldm  8819736 Jul 27 20:03 MDXX0058
> -rw-rw-r-- 1 ldm ldm  6020504 Jul 27 21:02 MDXX0069
> -rw-rw-r-- 1 ldm ldm  6615720 Jul 27 22:05 MDXX0019
> -rw-rw-r-- 1 ldm ldm  4817376 Jul 27 22:07 MDXX0029
> -rw-rw-r-- 1 ldm ldm  3569768 Jul 27 22:50 MDXX0110
> -rw-rw-r-- 1 ldm ldm 50471136 Jul 27 23:02 MDXX0009
> -rw-rw-r-- 1 ldm ldm 12976672 Jul 27 23:47 MDXX0039
> -rw-rw-r-- 1 ldm ldm   657952 Jul 27 23:53 MDXX0030
> -rw-rw-r-- 1 ldm ldm  1020352 Jul 27 23:53 MDXX0020
> -rw-rw-r-- 1 ldm ldm 10408436 Jul 27 23:59 MDXX0010
> -rw-rw-r-- 1 ldm ldm  8863248 Jul 27 23:59 MDXX0059
> -rw-rw-r-- 1 ldm ldm  2319312 Jul 27 23:59 MDXX0060
> -rw-rw-r-- 1 ldm ldm 12438832 Jul 27 23:59 MDXX0040
> -rw-rw-r-- 1 ldm ldm  1095628 Jul 27 23:59 MDXX0070

OK.  For reference, here is a long listing of the sizes of our
SFCHOURLY MD files from one of our motherlode-class machines:

% ls -alt /data/ldm/pub/decoded/mcidas/RTPTSRC/SFCHOURLY
total 860468
-rw-rw-r--   1 ldm      ustaff   35359336 Jul 28 15:53 MDXX0010
-rw-rw-r--   1 ldm      ustaff   50468136 Jul 28 15:51 MDXX0009
-rw-rw-r--   1 ldm      ustaff   50517136 Jul 28 00:00 MDXX0008
-rw-rw-r--   1 ldm      ustaff   50553936 Jul 27 00:00 MDXX0007
-rw-rw-r--   1 ldm      ustaff   50517136 Jul 25 21:48 MDXX0006
-rw-rw-r--   1 ldm      ustaff   50468136 Jul 24 21:55 MDXX0005
-rw-rw-r--   1 ldm      ustaff   50546936 Jul 24 00:00 MDXX0004
-rw-rw-r--   1 ldm      ustaff   50530636 Jul 22 21:18 MDXX0003
-rw-rw-r--   1 ldm      ustaff   50530636 Jul 21 21:05 MDXX0002

Notice that a complete SFCHOURLY (METAR) MD file should be
on the order of 50 MB per day.
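
As a quick way to spot days on which decoding stopped early, the sketch
below lists MD files that are smaller than a chosen size.  The function
name 'small_md_files' and the 45000000-byte cutoff are assumptions picked
for illustration (a little under the roughly 50 MB figure above); the file
for the current day will naturally be listed until it has filled in.

```shell
# Print "<size> <path>" for each given file smaller than min_bytes.
# Usage: small_md_files MIN_BYTES FILE...
small_md_files() {
    min_bytes=$1
    shift
    for f in "$@"; do
        # Skip anything that is not a regular file (e.g., an unmatched glob).
        [ -f "$f" ] || continue
        # Arithmetic expansion strips any padding some 'wc' versions emit.
        size=$(( $(wc -c < "$f") ))
        if [ "$size" -lt "$min_bytes" ]; then
            printf '%s %s\n' "$size" "$f"
        fi
    done
}

# Example: flag SFCHOURLY files under ~45 MB (path taken from the listing above).
small_md_files 45000000 /data/ldm/pub/decoded/mcidas/RTPTSRC/SFCHOURLY/MDXX*
```

On your machine the directory argument would be wherever your SFCHOURLY MD
files are written.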

re:
> [ldm@npxcd data]$ date
> Thu Jul 28 06:42:31 EDT 2016
> [ldm@npxcd data]$ ps -eaf | grep DM | grep -v grep

The fact that this listing was empty is disturbing.  XCD is designed
to automatically restart data monitors that exit, so even if they
died/were killed, they should have been restarted.

re:
> [ldm@npxcd data]$ ps -eaf | grep inge | grep -v grep
> ldm      21115 21102  0 Jul27 ?        00:00:00 ingetext.k DDS
> ldm      21116 21102  0 Jul27 ?        00:00:00 ingebin.k HRS
> ldm      21135 21116  1 Jul27 ?        00:12:13 ingebin.k HRS
> ldm      21138 21115  0 Jul27 ?        00:01:55 ingetext.k DDS
> [ldm@npxcd data]$

This looks correct.

re:
> Any idea why my mcidas decoder is stopping? How can I fix this?

Unfortunately, I do not yet know why the decoders are stopping or how
to fix the problem.

Comment:

- if your McIDAS-XCD decoders continue to run with no problems
  when using the previously installed version of the LDM on
  your machine, you should switch back to it immediately

  Again, I have _no_ idea why/how this would/could be the case,
  but we can sort those issues out later.

re:
> Thanks!

Sorry for your problems...

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: TAL-270196
Department: Support McIDAS
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.