
20030903: CAM and linux x86-64



Michael,

>Date: Wed, 3 Sep 2003 09:06:55 -0700 (PDT)
>From: "Michael R. Hanulec" <address@hidden>
>Organization: UCAR/Unidata
>To: Steve Emmerson <address@hidden>
>Subject: CAM and linux x86-64

The above message contained the following:

> Hello Steve,
> 
> Yesterday I began working with Jeff Johnson on getting NetCDF and CAM
> working on linux x86-64.  He had passed along your contact information as
> you were helpful to him in the past.
> 
> I was finally able to compile NetCDF w/ PGI 5.0 and now I began working on
> CAM.  After editing and running cam2.0.1/models/atm/cam/bld/run-pc.csh as
> a normal user (non-root, who has a tcsh shell) I receive the following
> error after about a minute of running (I've removed all of the other
> output):
> 
>  QNEG3 from pcond/Q:m=  1 lat/lchnk=533 Min. mixing ratio violated at   41 
> points.  Reset to  1.0E-12 Worst =    -nan at i,k=  16 19
>  QNEG3 from pcond/Q:m=  1 lat/lchnk=534 Min. mixing ratio violated at   32 
> points.  Reset to  1.0E-12 Worst =    -nan at i,k=  16 18
>  QNEG3 from pcond/Q:m=  1 lat/lchnk=535 Min. mixing ratio violated at   31 
> points.  Reset to  1.0E-12 Worst =    -nan at i,k=  16 17
>  QNEG3 from pcond/Q:m=  1 lat/lchnk=536 Min. mixing ratio violated at   25 
> points.  Reset to  1.0E-12 Worst =    -nan at i,k=  16 16
>  QNEG3 from pcond/Q:m=  1 lat/lchnk=537 Min. mixing ratio violated at   17 
> points.  Reset to  1.0E-12 Worst =    -nan at i,k=  16 15
>  QNEG3 from pcond/Q:m=  1 lat/lchnk=538 Min. mixing ratio violated at   12 
> points.  Reset to  1.0E-12 Worst =    -nan at i,k=  10 14
>  QNEG3 from pcond/Q:m=  1 lat/lchnk=539 Min. mixing ratio violated at    4 
> points.  Reset to  1.0E-12 Worst =    -nan at i,k=  10 13
>  QNEG3 from pcond/Q:m=  1 lat/lchnk=540 Min. mixing ratio violated at    3 
> points.  Reset to  1.0E-12 Worst =    -nan at i,k=  10 12
>  QNEG3 from pcond/Q:m=  1 lat/lchnk=542 Min. mixing ratio violated at    1 
> points.  Reset to  1.0E-12 Worst =    -nan at i,k=   8 10
> 0: _mp_barrier: bad lcpu 3488
> 56.720u 1.300s 0:58.01 100.0%   0+0k 0+0io 6182pf+0w
> CAM failed

I'm afraid that the above error messages mean nothing to me.

When building the netCDF package, the "make test" command tests the
package quite comprehensively.  If the command was successful, then the
netCDF library is very likely to be correct.

Can you reduce the problem to something that shows a definite netCDF 
error?
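
As a first step toward such a reduction, the QNEG3 lines quoted above could be summarized to see which chunks report a NaN "Worst" value. The following is only a minimal sketch: the function name `parse_qneg3` and the regular expression are illustrative assumptions derived solely from the excerpt in this message, not part of CAM or netCDF, and the summary does not by itself show whether the NaNs come from netCDF, the compiler, or the model.

```python
import math
import re

# Assumed layout of one QNEG3 report, inferred only from the quoted log excerpt.
QNEG3_RE = re.compile(
    r"QNEG3 from (?P<source>\S+):m=\s*(?P<m>\d+)\s+lat/lchnk=\s*(?P<chunk>\d+)"
    r".*?violated at\s+(?P<points>\d+)\s+points\."
    r".*?Worst =\s*(?P<worst>\S+)\s+at i,k=\s*(?P<i>\d+)\s+(?P<k>\d+)"
)


def parse_qneg3(log_text):
    """Return one dict per QNEG3 report found in log_text (hypothetical helper)."""
    # Drop mail quote markers and rejoin reports that were wrapped over two lines.
    joined = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())
    reports = []
    for match in QNEG3_RE.finditer(joined):
        reports.append({
            "source": match["source"],
            "m": int(match["m"]),
            "chunk": int(match["chunk"]),
            "points": int(match["points"]),
            "worst": float(match["worst"]),  # float() accepts "-nan"
            "i": int(match["i"]),
            "k": int(match["k"]),
        })
    return reports


# Two reports copied from the quoted output, including the mail quoting.
SAMPLE = """\
>  QNEG3 from pcond/Q:m=  1 lat/lchnk=533 Min. mixing ratio violated at   41
> points.  Reset to  1.0E-12 Worst =    -nan at i,k=  16 19
>  QNEG3 from pcond/Q:m=  1 lat/lchnk=534 Min. mixing ratio violated at   32
> points.  Reset to  1.0E-12 Worst =    -nan at i,k=  16 18
"""

if __name__ == "__main__":
    reports = parse_qneg3(SAMPLE)
    nan_reports = [r for r in reports if math.isnan(r["worst"])]
    print(f"{len(nan_reports)} of {len(reports)} reports have a NaN worst value")
    if nan_reports:
        print("first affected chunk:", nan_reports[0]["chunk"])
```

Running the same function over the complete, unedited output would show the first chunk at which a NaN is reported, which may help decide where to look next.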

I'm afraid I'll be incommunicado until 2003-09-10.

> This failure point is repeatable as both root and a non-root user.  
> During this minute of execution CAM only ends up using one of the four
> available processors even though 'nthreads' is set to four.
> 
> If you could provide us any assistance with this problem we would 
> appreciate it.  
> 
> Cheers!
> 
> Michael Hanulec
> 
> -- 
> <address@hidden> / o: 858.565.6699 x216

Regards,
Steve Emmerson