
[netCDF #OYM-954504]: netcdf/lustre bug



Hi Avi,

A number of read/write and general I/O issues have been fixed, although
nothing Lustre-specific.  I'll see if we have an anonymous FTP site you can
upload the file to; it would be interesting to look at.  If you have the
binary netCDF file and it is of manageable size, you could send that instead
and I would extract it myself.

Thanks!

-Ward

address@hidden> wrote:

> New Client Reply: netcdf/lustre bug
>
> Hi Ward, Dennis
> I’ll be happy to try the more recent versions. But do you know whether any
> of the read issues were addressed in recent releases?
>
> I have a file, auxhist2, that I extracted into ASCII format using ncdump
> for a couple of variables, which show zeros on lines 786677-796041. However,
> the file is 70 MB and my mailer refuses to send it. If you are still
> interested, point me to an anonymous FTP site and I’ll upload it there.
>
> As for a patch for the Lustre file system, I would suggest making it a
> configure option for users on Lustre systems, with the default being no.
>
> Thanks for your help.
>
>         — Avi
>
>
> address@hidden> wrote:
>
> > Hi Avi,
> >
> > Just to jump in, there have been several releases since 4.3.0.  It would
> > be helpful if you could test this with the current release, 4.4.1.1, or
> > even the just-released 4.5.0 release candidate 1, so that we can confirm
> > that this is still an issue.
> >
> > Thanks!
> >
> > -Ward
> >
> >> Just to update this issue: the netCDF version was stated incorrectly.
> >> The version in which this bug was seen is 4.3.0. So the enquiry is
> >> whether a later version has addressed the improper read issue or offers
> >> a workaround.
> >>
> >> — Avi
> >>
> >> On Jun 6, 2017, at 1:45 PM, Avi Purkayastha <address@hidden> wrote:
> >>
> >> Hello,
> >> Recently we became aware of a netCDF/Lustre issue to which both pieces
> >> of software contribute, in the following way.
> >> The bug in the Lustre file system software (version 2.7, which we have
> >> at NREL and which is in use at other sites) is related to a client losing
> >> the layout lock on a multi-stripe I/O.  Some sites with multiple Lustre
> >> file systems see the problem on only some of them, but race conditions
> >> are funny that way.
> >>
> >> NetCDF (v1.4.1) is also at fault because it doesn’t properly fail or
> >> retry on error. When a read doesn’t complete properly due to the bug, it
> >> zeros the unread part of the buffer instead of attempting another read,
> >> and then continues without reporting an error!
> >>
> >> I was wondering whether you were aware of this issue and the connection
> >> between the two pieces of software, and whether some recent version of
> >> netCDF has independently addressed this read issue or there is a
> >> workaround for it.
> >>
> >> Thanks
> >> — Avi
> >>
> >> _________________________________
> >> Avi Purkayastha, PhD
> >> Computational Science Center
> >> National Renewable Energy Laboratory
> >> 15013 Denver West Parkway
> >> Golden, CO 80401-3305
> >> Phone: (303)275-4243 Fax: (303)275-4007
> >> address@hidden
> >>
> >>
> >>
> >>
> >
> >
> > Ticket Details
> > ===================
> > Ticket ID: OYM-954504
> > Department: Support netCDF
> > Priority: Normal
> > Status: Closed
> > ===================
> > NOTE: All email exchanges with Unidata User Support are recorded in the
> > Unidata inquiry tracking system and then made publicly available through
> > the web.  If you do not want to have your interactions made available in
> > this way, you must let us know in each email you send to us.
> >
> >
> >
>
> _________________________________
> Avi Purkayastha, PhD
> Computational Science Center
> National Renewable Energy Laboratory
> 15013 Denver West Parkway
> Golden, CO 80401-3305
> Phone: (303)275-4243 Fax: (303)275-4007
> address@hidden
>
>
>
> Ticket Details
> ===================
> Ticket ID: OYM-954504
> Department: Support netCDF
> Priority: Normal
> Status: Open
> Link:  https://andy.unidata.ucar.edu/esupport/staff/index.php?_m=tickets&_a=viewticket&ticketid=28356
>
>
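
For anyone trying to locate the kind of zero-filled stretch Avi describes in the ncdump output, a minimal sketch along the following lines might help. It assumes the values of one variable have already been loaded into a Python list; the function name `zero_runs` and the `min_len` threshold are hypothetical and not part of netCDF or ncdump. Note that data which is legitimately zero would be flagged too, so any hits still need to be checked against what the variable should contain.

```python
def zero_runs(values, min_len=1):
    """Return (start, end) index pairs (end exclusive) for every run of
    consecutive zeros that is at least min_len values long."""
    runs = []
    start = None  # index where the current run of zeros began, if any
    for i, v in enumerate(values):
        if v == 0:
            if start is None:
                start = i
        elif start is not None:
            # A non-zero value closes the current run.
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Handle a run of zeros that extends to the end of the data.
    if start is not None and len(values) - start >= min_len:
        runs.append((start, len(values)))
    return runs
```

A long run reported for a variable that should never be zero over that range would be consistent with the zero-filling behaviour described in the original report, although it would not prove the cause.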



Ticket Details
===================
Ticket ID: OYM-954504
Department: Support netCDF
Priority: Normal
Status: Open
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.
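
The behaviour the original report asks for (retry a read that comes back short, and fail loudly rather than zero-filling the rest of the buffer) can be sketched as follows. This is only an illustration of the general technique using Python's standard `os` module on a plain file descriptor; it is not how the netCDF library is implemented, and the function name `read_fully` is hypothetical.

```python
import os


def read_fully(fd, nbytes, offset):
    """Read exactly nbytes from fd starting at offset.

    Short reads are retried until the request is satisfied. If end of
    file is reached first, EOFError is raised instead of padding the
    result with zeros.
    """
    os.lseek(fd, offset, os.SEEK_SET)
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than asked
        if not chunk:
            # No more data: report the shortfall to the caller.
            got = nbytes - remaining
            raise EOFError(f"short read: got {got} of {nbytes} bytes")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

The point of the design is that a caller either gets all the bytes it asked for or an exception, so a partially filled buffer can never be mistaken for valid data.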