
Re: Problem on SGI



>To: address@hidden
>From: Matthew Bettencourt <address@hidden>
>Subject: Re: 20030903: Problem on SGI 
>Organization: 
>Keywords: C++, sync, bug, SGI, File exists

Matt,

> I think I may have fixed (read kluged) it on my end.  Here is what I 
> have done.  I have wrapped all netcdf commands in locks via
> 
> #define ncCommand(cmd,str) { \
> string s2;                                                      \
> s2 = "running "; s2 += str; err.pushError(s2.c_str());          \
> netCDFLock.getNetCDFLock();                                     \
> s2 = "Got lock "; s2 += str; err.pushError(s2.c_str());         \
> cmd                                                             \
> s2 = "Done ";s2 += str;  err.pushError(s2.c_str());             \
> netCDFLock.freeNetCDFLock(); err.pushError("Freed"); }
> 
> 
> That didn't work, ugh... So what I did was wrap all my output (cout, 
> cerr) in the same lock, and that looks like it has fixed the problem.  
> What this is saying to me is that the I/O layer is not thread safe on 
> the SGI.
> 
> I have run my code twice now and gotten to the end both times.

Great.  I'll assume that means we don't have to look for a bug in the
netCDF C++ interface.
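
For reference, the same serialization can be sketched with a scoped
lock instead of a macro.  This is only a minimal sketch, assuming a
compiler with std::mutex; ioLock, withIoLock and lockedPrint are
hypothetical names standing in for your netCDFLock object, and the
lambda passed to withIoLock is where a netCDF call would go.

```cpp
#include <cassert>
#include <iostream>
#include <mutex>
#include <sstream>
#include <string>

// Hypothetical process-wide lock, standing in for netCDFLock.
std::mutex& ioLock() {
    static std::mutex m;
    return m;
}

// Run any callable while holding the lock.  The lock_guard releases
// the lock on normal return and also if the callable throws.
template <typename F>
auto withIoLock(F&& f) -> decltype(f()) {
    std::lock_guard<std::mutex> guard(ioLock());
    return f();
}

// Console output goes through the same lock as the netCDF calls.
// Do not call this from inside withIoLock: the mutex is not
// recursive, so nesting would deadlock.
void lockedPrint(std::ostream& out, const std::string& msg) {
    withIoLock([&] { out << msg << '\n'; });
}
```

A call such as withIoLock([&] { return 42; }) hands back whatever the
wrapped call returns, so a status code can still be checked by the
caller after the lock has been released.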

> There is a problem with this as a long-term solution for me, however: 
> it kills my performance, because if I am writing one file I can't be 
> reading another, which happens all the time in my code (only on the 
> SGI, however).

Have you considered using the parallel netcdf library (pnetcdf)
developed by the Argonne/Northwestern group that uses MPI?  It's
available from

  http://www-unix.mcs.anl.gov/parallel-netcdf/

and may be a reasonable long-term solution.

> One thing that I noticed about the compiler is that I don't know if it 
> conforms to the standard when it comes to static objects.  For example, 
> given
> 
> 
> file.hh
> 
> class foo{
>       int i;
> };
> 
> static foo bar;
> 
> 
> I notice that all files that
> 
> #include "file.hh"
> have different bar's.  This is also only on the SGI.  I have gone through 
> and replaced those with extern declarations and defined them elsewhere, 
> and that fixed the problem.
> 
> Do you know if this is a bug or a feature?  I might be off base on 
> this one, but I don't think I am.

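It may not be a bug: a file-scope object declared static has internal
linkage, so each source file that includes file.hh gets its own bar,
and replacing it with an extern declaration plus one definition is the
standard fix.  I also know that static data at file scope is deprecated
in C++, and that the initialization order of such objects across source
files is indeterminable.  See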

  http://www.parashift.com/c++-faq-lite/classes-and-objects.html#faq-7.5

for example.
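
The extern arrangement described above can be sketched as follows.  In
standard C++ a namespace-scope object declared static has internal
linkage, so one copy per including source file is the expected result;
an extern declaration with a single definition gives one shared
object.  This is only an illustration: the three files are collapsed
into one listing, the member i is made public so it can be inspected,
and setFromA and readFromB are hypothetical names.

```cpp
#include <cassert>

// file.hh (hypothetical layout): declare the object, do not define it.
class foo {
public:
    int i = 0;
};
extern foo bar;   // declaration only; every includer names the same object

// file.cc: exactly one definition in the whole program.
foo bar;

// a.cc and b.cc would each include file.hh; shown here as two functions.
void setFromA(int v) { bar.i = v; }
int readFromB() { return bar.i; }
```

With this layout a value stored through setFromA is the one that
readFromB sees, which is not the case when the header says
"static foo bar;" and the two functions live in different source files.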

--Russ