
[LDM #BRX-675440]: Incomplete files on wheezy OS



George,

Thanks for the information.

Which of the two systems is missing the data-products?
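
In case it helps answer that, here is a rough sketch (not part of the LDM package) that tallies pqact PIPE errors per output file from an ldmd.log. The regular expression and the function name tally_pipe_errors are assumptions based only on the pipe_put(), pbuf_flush() and pipe_sync() messages quoted below; other LDM versions may word these messages differently.

```python
import re
import sys
from collections import Counter

# Assumed message shape, taken from the excerpts quoted in this message:
#   ... pqact[PID] ERROR: pipe_put(): write error: pid=N, cmd=(-close ucsat - PATH)
#   ... pqact[PID] ERROR: pbuf_flush(): fd=N, cmd=(-close ucsat - PATH): Broken pipe
PIPE_ERR = re.compile(
    r"ERROR: (?:pipe_put|pbuf_flush|pipe_sync)\(\).*?cmd=\((.*?)\)"
)


def tally_pipe_errors(lines):
    """Count PIPE errors per output path (the last token of the PIPE command)."""
    counts = Counter()
    for line in lines:
        match = PIPE_ERR.search(line)
        if not match:
            continue
        tokens = match.group(1).split()
        if tokens:
            counts[tokens[-1]] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python tally_pipe_errors.py ldmd.log
    with open(sys.argv[1], errors="replace") as log:
        for path, n in tally_pipe_errors(log).most_common():
            print(n, path)
```

Files that show up here on one host but not the other would be the first candidates to check for truncation.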

> Hi,
> 
> It will be easier to give you this information instead of giving you
> access to the machines.  We may be able to if this is not enough.  Also
> attached is the pqact.conf file used by both machines.
> 
> Thanks,
> George
> 
> nikara.rap.ucar.edu (128.117.196.12)
> load average: 2.68, 2.63, 2.60
> nikara:~/cvs/third_party/open/apps/unisys_decoders/src/ucsat% ldmadmin config
> 
> hostname:              nikara.rap.ucar.edu
> os:                    Linux
> release:               3.2.0-4-amd64
> ldmhome:               /home/ldm
> LDM version:           6.11.5
> PATH:
> /home/ldm/ldm-6.11.5/bin:.:/home/ldm/bin:/home/ldm/util:/home/ldm/decoders:/home/ldm/rap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11
> LDM conf file:         /home/ldm/etc/ldmd.conf
> pqact(1) conf file:    /home/ldm/etc/pqact.conf
> scour(1) conf file:    /home/ldm/etc/scour.conf
> product queue:         /home/ldm/var/queues/ldm.pq
> queue size:            500M bytes
> queue slots:           default
> reconciliation mode:   do nothing
> pqsurf(1) path:        /home/ldm/var/queues/pqsurf.pq
> pqsurf(1) size:        2M
> IP address:            0.0.0.0
> port:                  388
> PID file:              /home/ldm/ldmd.pid
> Lock file:             /home/ldm/.ldmadmin.lck
> maximum clients:       256
> maximum latency:       3600
> time offset:           3600
> log file:              /home/ldm/var/logs/ldmd.log
> numlogs:               7
> log_rotate:            1
> netstat:               /bin/netstat -A inet -t -n
> top:                   /usr/bin/top -b -n 1
> metrics file:          /home/ldm/var/logs/metrics.txt
> metrics files:         /home/ldm/var/logs/metrics.txt*
> num_metrics:           4
> check time:            1
> delete info files:     0
> ntpdate(1):            /usr/sbin/ntpdate
> ntpdate(1) timeout:    5
> time servers:          ntp.ucsd.edu ntp1.cs.wisc.edu ntppub.tamu.edu otc1.psu.edu timeserver.unidata.ucar.edu
> time-offset limit:     10
> 
> REQUEST NIMAGE  "satz.*EAST-CONUS.*"    khufu.rap.ucar.edu
> REQUEST NIMAGE  "satz.*WEST-CONUS.*"    khufu.rap.ucar.edu
> 
> nikara:~/logs% egrep 'ERR|WARN' `ls -rt ldmd.log*` | tail -22
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_put(): write
> error: pid=12986, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/3.9/20131009/3.9_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: [filel.c:305]
> Deleting failed PIPE entry: pid=12986, cmd="-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/3.9/20131009/3.9_20131009_2200"
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_prodput: trying
> again:   840869 20131009221042.784  NIMAGE 211512
> satz/ch2/GOES-15/3.9/20131009 2200/WEST-CONUS/4km/ TIGW04 KNES 092200
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pbuf_flush(): fd=26,
> cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/3.9/20131009/3.9_20131009_2200): Broken
> pipe
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_put(): write
> error: pid=12987, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/3.9/20131009/3.9_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: [filel.c:305]
> Deleting failed PIPE entry: pid=12987, cmd="-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/3.9/20131009/3.9_20131009_2200"
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pbuf_flush(): fd=26,
> cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/WV/20131009/WV_20131009_2200): Broken pipe
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_sync():
> pid=12988, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/WV/20131009/WV_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pbuf_flush(): fd=26,
> cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/IR/20131009/IR_20131009_2200): Broken pipe
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_put(): write
> error: pid=12989, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/IR/20131009/IR_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: [filel.c:305]
> Deleting failed PIPE entry: pid=12989, cmd="-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/IR/20131009/IR_20131009_2200"
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_prodput: trying
> again:   819854 20131009221044.789  NIMAGE 211514
> satz/ch2/GOES-15/IR/20131009 2200/WEST-CONUS/4km/ TIGW02 KNES 092200
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pbuf_flush(): fd=26,
> cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/IR/20131009/IR_20131009_2200): Broken pipe
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_put(): write
> error: pid=12990, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/IR/20131009/IR_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: [filel.c:305]
> Deleting failed PIPE entry: pid=12990, cmd="-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/IR/20131009/IR_20131009_2200"
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pbuf_flush(): fd=26,
> cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/13.3/20131009/13.3_20131009_2200):
> Broken pipe
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_put(): write
> error: pid=12991, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/13.3/20131009/13.3_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: [filel.c:305]
> Deleting failed PIPE entry: pid=12991, cmd="-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/13.3/20131009/13.3_20131009_2200"
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_prodput: trying
> again:   673834 20131009221046.023  NIMAGE 211515
> satz/ch2/GOES-15/13.3/20131009 2200/WEST-CONUS/4km/ TIGW06 KNES 092200
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pbuf_flush(): fd=26,
> cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/13.3/20131009/13.3_20131009_2200):
> Broken pipe
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_put(): write
> error: pid=12992, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/13.3/20131009/13.3_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: [filel.c:305]
> Deleting failed PIPE entry: pid=12992, cmd="-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/13.3/20131009/13.3_20131009_2200"
> 
> 
> 
> khafra.rap.ucar.edu (128.117.196.23)
> load average: 4.94, 7.82, 7.39
> khafra:~% ldmadmin config
> 
> hostname:      khafra.rap.ucar.edu
> os:            Linux
> release:       2.6.32-5-amd64
> ldmhome:       /home/ldm
> bin path:      /home/ldm/bin
> conf file:     /home/ldm/etc/ldmd.conf
> log file:      /home/ldm/logs/ldmd.log
> numlogs:       7
> log_rotate:    1
> data path:     /home/ldm/data
> product queue: /home/ldm/data/ldm.pq
> queue size:    400M bytes
> queue slots:   default
> IP address:    all
> port:          388
> PID file:      /home/ldm/ldmd.pid
> LDMHOSTNAME:   khafra.rap.ucar.edu
> PATH:
> /home/ldm/bin:/bin:/usr/bin:/usr/sbin:/sbin:/usr/ucb:/usr/usb:/usr/etc:/etc:.:/home/ldm/bin:/home/ldm/util:/home/ldm/decoders:/home/ldm/rap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11:/rap/bin
> 
> REQUEST NIMAGE  "satz.*EAST-CONUS.*"    khufu.rap.ucar.edu
> REQUEST NIMAGE  "satz.*WEST-CONUS.*"    khufu.rap.ucar.edu
> 
> khafra:~/logs% egrep 'ERR|WARN' `ls -rt ldmd.log*` | tail -22
> ldmd.log:Oct  9 21:51:59 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 21:53:11 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 21:54:23 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 21:55:35 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 21:56:47 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 21:57:59 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 21:59:11 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:00:23 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:01:35 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:02:47 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:03:59 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:05:11 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:06:23 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:07:35 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:08:47 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:09:59 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:11:11 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:12:23 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:13:35 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:14:47 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:15:59 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:17:11 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: BRX-675440
Department: Support LDM
Priority: Normal
Status: Closed