
20050701: Linux LDM server problem



>From: "James M. Pelagatti" <address@hidden>
>Organization: MIT Lincoln Laboratory, Group 43
>Keywords: 200507012042.j61Kgpjo019119 LDM

Hi James,

Our best guess is that you did not run 'make install_setuids'
as 'root' when you installed LDM-6.1.0 on tornado.  This step
ensures that the LDM can bind port 388 (ports below 1024 are
available only to 'root' or to processes with setuid root privilege).
It may well be that you did not run 'make install_setuids'
on your other systems either.
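
As a rough illustration of the port rule above, here is a minimal sketch
of a shell check for whether a port number falls in the privileged
range.  The function name is made up for this example; the debug log
quoted further down shows rpc.ldmd reporting "port 44526" after binding,
which is outside that range.

```shell
#!/bin/sh
# Hypothetical helper (not part of the LDM): succeeds if the given TCP
# port is in the privileged range 1-1023, i.e. one that normally only
# 'root' or a setuid-root process can bind.
is_privileged_port() {
    case "$1" in
        ''|*[!0-9]*) return 2 ;;   # not a plain non-negative integer
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 1023 ]
}

for port in 388 44526; do
    if is_privileged_port "$port"; then
        echo "port $port: privileged (needs root to bind)"
    else
        echo "port $port: not privileged"
    fi
done
```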

So, our recommendation is:

<as 'root'>
cd ~ldm
cd ldm-6.1.0/src
make install_setuids
exit
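
If you want to double-check the result afterwards, something along
these lines should work.  This is only a sketch: the function names are
made up for the example, and the default directory ($HOME/bin) is an
assumption about where your LDM binaries live, so pass the real
directory as the first argument.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file exists and has its setuid bit set.
has_setuid_bit() {
    [ -u "$1" ]
}

# Report the setuid status of the two programs named in this message.
# $1 is the directory holding the LDM binaries (an assumption; adjust it).
check_ldm_setuid() {
    bindir=$1
    for prog in rpc.ldmd hupsyslog; do
        if has_setuid_bit "$bindir/$prog"; then
            echo "$bindir/$prog: setuid bit set"
            ls -ln "$bindir/$prog"   # third column is the owner uid; 0 is root
        else
            echo "$bindir/$prog: setuid bit NOT set (or file missing)"
        fi
    done
}

check_ldm_setuid "${1:-$HOME/bin}"
```

On tornado the directory would presumably be
/ll/ciws/projects/ldm/bin/Linux, going by the listing in your message.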

After installing setuid root privilege (on rpc.ldmd and hupsyslog),
you should be able to restart your LDM and have the downstreams
connect:

<as 'ldm'>
ldmadmin stop
ldmadmin start

Please let us know if this fixes the problem you are seeing.

Cheers,

Tom Yoksas

James M. Pelagatti said:

>We're having two problems feeding data from a Linux LDM server to another
>machine on our network. We've been using LDM here for many years in a mixed
>Solaris/Linux environment but I think this is the first time we've attempted to
>use a Linux machine as a server and likely we've set something up wrong on that
>system.
>
>First, here's some technical information:
>
>   o LDM server 1: tornado.ll.mit.edu (129.55.60.17), Linux 2.4.21
>   o LDM server 2: whirl.ll.mit.edu   (129.55.60.9),  Solaris 8
>   o LDM client:   typhoon.ll.mit.edu (129.55.60.16), Linux 2.4.21
>   o all machines run LDM 6.1.0
>   o we're feeding NEXRD2 data (only radar KPUX for now)
>   o no firewalls or IP filters among these machines
>
>*** PROBLEM 1: "typhoon" cannot receive data from "tornado" using port 388.
>
>Here are some log messages (debugging on):
>
>   Jul 01 19:44:26 rpc.ldmd[2220]: Starting Up (version: 6.1.0; built: Nov 22 2004 12:07:43)
>           main(): Opening product-queue
>           main(): Creating service portal
>           create_ldm_tcp_svc(): Checking for another LDM
>           create_ldm_tcp_svc(): Getting TCP socket
>           create_ldm_tcp_svc(): Eliminating EADDRINUSE problem.
>           create_ldm_tcp_svc(): Getting root privs
>           create_ldm_tcp_svc(): Binding socket
>           create_ldm_tcp_svc(): Calling getsockname()
>           port 44526
>           create_ldm_tcp_svc(): Calling listen()
>           create_ldm_tcp_svc(): Checking portmapper
>           create_ldm_tcp_svc(): Registering
>   Jul 01 19:44:26 rpc.ldmd[2220]: Can't register TCP service 300029 on port 44526
>   Jul 01 19:44:26 rpc.ldmd[2220]: Downstream LDMs won't be able to connect via the RPC portmapper daemon (rpcbind(8), portmap(8), etc.)
>           create_ldm_tcp_svc(): Releasing root privs
>           tcp sock: 0
>           main(): Reading configuration-file
>   Jul 01 19:44:26 129.55.60.17[2222]: Starting Up(6.1.0): 129.55.60.17: TS_ZERO TS_ENDT {{NEXRAD2,  "^L2.*KPUX"}}
>           main(): Serving socket
>   Jul 01 19:44:26 129.55.60.9[2223]: Starting Up(6.1.0): 129.55.60.9: TS_ZERO TS_ENDT {{NEXRAD2,  "^L2.*KPUX"}}
>   Jul 01 19:44:26 129.55.60.17[2222]: Desired product class: 20050701192926.919 TS_ENDT {{NEXRAD2,  "^L2.*KPUX"}}
>   Jul 01 19:44:26 129.55.60.17[2222]: INFO: ldm_clnt.c:226: Couldn't connect to LDM 6 on 129.55.60.17 using port 388; ldm_clnt.c:113: : RPC: Remote system error - Connection refused
>   Jul 01 19:44:26 129.55.60.17[2222]: INFO: ldm_clnt.c:245: Couldn't connect to LDM 6 on 129.55.60.17 using portmapper; ldm_clnt.c:113: : RPC: Remote system error - Connection refused
>   Jul 01 19:44:26 129.55.60.17[2222]: ERROR: requester6.c:455; ldm_clnt.c:256: Couldn't connect to LDM 6 on 129.55.60.17
>   Jul 01 19:44:26 129.55.60.17[2222]: Desired product class: 20050701192926.919 TS_ENDT {{NEXRAD2,  "^L2.*KPUX"}}
>   Jul 01 19:44:26 129.55.60.17[2222]: INFO: ldm_clnt.c:226: Couldn't connect to LDM 6 on 129.55.60.17 using port 388; ldm_clnt.c:113: : RPC: Remote system error - Connection refused
>   Jul 01 19:44:26 129.55.60.17[2222]: INFO: ldm_clnt.c:245: Couldn't connect to LDM 6 on 129.55.60.17 using portmapper; ldm_clnt.c:113: : RPC: Remote system error - Connection refused
>   Jul 01 19:44:26 129.55.60.17[2222]: ERROR: requester6.c:455; ldm_clnt.c:256: Couldn't connect to LDM 6 on 129.55.60.17
>   Jul 01 19:44:26 129.55.60.17[2222]: Sleeping 1 seconds before retrying...
>   Jul 01 19:44:26 129.55.60.9[2223]: Desired product class: 20050701192926.925 TS_ENDT {{NEXRAD2,  "^L2.*KPUX"}}
>   Jul 01 19:44:26 129.55.60.9[2223]: Connected to upstream LDM-6
>           requester6.c:268: Calling feedme_6(...)
>   Jul 01 19:44:26 129.55.60.9[2223]: Upstream LDM is willing to feed
>
>Typhoon gets a "connection refused" from tornado (*.17). (It also sometimes
>fails using the portmapper; see PROBLEM 2 below.) But it successfully connects
>to whirl (*.9) and later receives data from there. We cannot figure out why it
>can't similarly connect using 388 on tornado. Do you have any ideas?
>
>Here are the contents of the LDM config files (whirl included for comparison):
>
>tornado:
>
>   exec "/ll/ciws/projects/ldm/bin/Linux/pqexpire -q /ll/ciws/projects/ldm/data/address@hidden -a 0.5"
>   allow ANY ^(localhost|loopback)|^((127\.0\.0\.1\.?)|greco|typhoon)$
>   allow NONE ^.*
>   request NEXRD2 "^L2.*KPUX" 129.55.66.13  # llwxldm2
>
>whirl:
>
>   exec "/ll/ciws/projects/ldm/bin/SunOS/pqexpire -q /ll/ciws/projects/ldm/data/address@hidden -a 0.5"
>   allow ANY ^(localhost|loopback)|^((127\.0\.0\.1\.?)|greco|typhoon)$
>   allow NONE ^.*
>   request NEXRD2 "^L2.*KPUX" 129.55.66.13  # llwxldm2
>
>typhoon:
>
>   exec "/ll/ciws/projects/ldm/bin/Linux/pqexpire -q /ll/ciws/projects/ldm/data/address@hidden -a 0.5"
>   allow ANY ^(localhost|loopback)|^((127\.0\.0\.1\.?))$
>   allow NONE ^.*
>   request NEXRD2 "^L2.*KPUX" 129.55.60.17  PRIMARY # tornado
>   request NEXRD2 "^L2.*KPUX" 129.55.60.9   PRIMARY # whirl
>
>And here's some sundry info (whirl again included for comparison):
>
>tornado:
>
>   [ciws]tornado:/ldm/etc 37 % grep ldm /etc/services
>   ldm             388/tcp         ldmd            # UCAR Unidata LDM
>   [ciws]tornado:/ldm/etc 38 % grep ldm /etc/rpc
>   ldmd            300029  ldm
>   [ciws]tornado:/ldm/etc 39 % cat /etc/hosts.allow
>   #
>   # hosts.allow   This file describes the names of the hosts which are
>   #               allowed to use the local INET services, as decided
>   #               by the '/usr/sbin/tcpd' server.
>   #
>
>   ALL: LOCAL
>   [ciws]tornado:/ldm/etc 40 % ls -las /ll/ciws/projects/ldm/bin/Linux/rpc.ldmd
>    168 -rwsrwxr-x    1 root     ciws       158643 Nov 22  2004 /ll/ciws/projects/ldm/bin/Linux/rpc.ldmd*
>
>whirl:
>
>   [ciws]whirl:/ldm/etc 82 % grep ldm /etc/services
>   [ciws]whirl:/ldm/etc 83 % grep ldm /etc/rpc
>   [ciws]whirl:/ldm/etc 84 % cat /etc/hosts.allow
>   ALL: LOCAL
>   [ciws]whirl:/ldm/etc 85 % ls -las /ll/ciws/projects/ldm/bin/SunOS/rpc.ldmd
>    352 -rwsrwxr-x   1 root     ciws      171768 Nov 18  2004 /ll/ciws/projects/ldm/bin/SunOS/rpc.ldmd*
>
>typhoon:
>
>   [ciws]typhoon:/ldm/etc 119 % grep ldm /etc/services
>   ldm             388/tcp         ldmd            # UCAR Unidata LDM
>   [ciws]typhoon:/ldm/etc 120 % grep ldm /etc/rpc
>   ldmd            300029  ldm
>   [ciws]typhoon:/ldm/etc 121 % cat /etc/hosts.allow
>   #
>   # hosts.allow   This file describes the names of the hosts which are
>   #               allowed to use the local INET services, as decided
>   #               by the '/usr/sbin/tcpd' server.
>   #
>
>   ALL: LOCAL
>   [ciws]typhoon:/ldm/etc 122 % ls -las /ll/ciws/projects/ldm/bin/Linux/rpc.ldmd
>    168 -rwsrwxr-x    1 root     ciws       158643 Nov 22  2004 /ll/ciws/projects/ldm/bin/Linux/rpc.ldmd*
>
>*** PROBLEM 2: "typhoon" can receive data from "tornado" via the portmapper
>scheme but if the tornado LDM is restarted this stops working.
>
>We have coaxed typhoon into receiving data from tornado using the portmapper
>scheme. In this case, the log shows the port 388 error but then the LDM connects
>and starts receiving data. However, if we restart the LDM on tornado, then
>typhoon can no longer connect to it using the portmapper. Instead, we get the
>errors you see in the logs above.
>
>The only way we know to fix this is to reboot tornado. But if we later restart
>the LDM on tornado, then the feed breaks and we have to reboot again. Do you
>think we need to do something else regarding the portmapper to make this work
>reliably?
>
>We are aware that using this scheme is less desirable than using port 388 but we
>want to do some fail-over testing involving both methods of access.
>
>Thanks for any help you can give us.
>
>---------------------------+---------------------------
>James M. Pelagatti (Jamie) | MIT Lincoln Laboratory
>   Software Engineer        | Group 43 (Weather Sensing)
>   (781) 981-1886           | 244 Wood St., Room S1-611
>   FAX: (781) 981-0632      | Lexington, MA 02420-9108
>   mailto:address@hidden  | http://www.ll.mit.edu
>
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.