Mark,

> I've been looking closer at the cause of the second problem, and have a
> hypothesis. When you look at how nccopy iterates through a variable when
> making the copy (i.e. in up_start_by_chunks() in nciter.c), it goes in reverse
> order of the dimensions, e.g. for CHL1_mean[date,lon,lat] it scans
> through lat first, then lon, then date. However, this can be very memory
> inefficient in the situation where you are trying to make the rearrangement
> along the date dimension - you essentially have to load the entire file to
> get enough data to write an entire date chunk....
>
> I could see two solutions.
>
> 1. automagically work out which dimension to scan in (hard to implement
>    robustly)
> 2. infer the scan direction from the -c argument, i.e. if you only specify
>    date/5186 (and nothing else), and you have a variable with
>    date/1,lat/30,lon/30, then the most efficient way to rechunk it would be to
>    read along the date dimension first, then the lon and lats.....
>
> Hmmm. I'm not sure that makes any sense - it's kind of hard to explain. Can
> you follow my logic?

Yes, but I see some complications that make my head hurt. If you want to rechunk a variable, it's not clear whether it's better to read the input one input chunk at a time and write the output in an inefficient order, or to read the input in an inefficient order so that you can write the output one output chunk at a time. Currently the nc_next_iter() function in nciter.c does the former, but it sounds like you think it would be better if it did the latter.

I think you can construct examples where either strategy is efficient or horribly inefficient, depending on the shapes of chunks in the input and output files. I think the right thing to do would be to determine, from the chunk shapes of input and output, which strategy to implement, or even whether to use a hybrid strategy involving multiple passes and an intermediate file or in-memory structure.
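To make the trade-off a bit more concrete, here is a rough Python sketch that counts chunk accesses for the two strategies. It is only a back-of-the-envelope model, not what nccopy does: the function names are made up for illustration, it assumes there is no chunk cache at all, and it assumes every partial access to a chunk costs a full chunk read or write. The lat and lon sizes (60 and 90) and the output chunk shape are also assumptions, since only date/5186 was given.

```python
from math import prod


def chunks_along(n, c):
    """Number of chunks of length c needed to cover a dimension of length n."""
    return -(-n // c)  # ceiling division


def overlaps_1d(n, c_in, c_out):
    """Count (input chunk, output chunk) pairs that share an index along one dimension."""
    count = 0
    for i in range(chunks_along(n, c_in)):
        lo = i * c_in
        hi = min((i + 1) * c_in, n)  # exclusive end, clipped at the dimension edge
        first = lo // c_out          # first output chunk touched
        last = (hi - 1) // c_out     # last output chunk touched
        count += last - first + 1
    return count


def rechunk_costs(shape, chunk_in, chunk_out):
    """Chunk reads/writes for each strategy, ASSUMING no cache (hypothetical model)."""
    n_in = prod(chunks_along(n, c) for n, c in zip(shape, chunk_in))
    n_out = prod(chunks_along(n, c) for n, c in zip(shape, chunk_out))
    # Two N-d chunks overlap only if they overlap in every dimension,
    # so the number of overlapping pairs is the product of the 1-d counts.
    pairs = prod(
        overlaps_1d(n, ci, co) for n, ci, co in zip(shape, chunk_in, chunk_out)
    )
    return {
        # Read each input chunk once; one partial write per overlapping pair.
        "input_order": {"chunk_reads": n_in, "chunk_writes": pairs},
        # Write each output chunk once; one read per overlapping pair.
        "output_order": {"chunk_reads": pairs, "chunk_writes": n_out},
    }


if __name__ == "__main__":
    # Assumed sizes: date=5186, lat=60, lon=90 (lat/lon sizes are hypothetical).
    print(rechunk_costs((5186, 60, 90), (1, 30, 30), (5186, 30, 30)))
```

With those assumed sizes, both strategies perform 31116 chunk reads, but input order makes 31116 partial writes while output order makes only 6 full writes. The catch is that output order must hold a whole 5186 x 30 x 30 output chunk in memory while it is being assembled, which is the kind of cost this simple count does not capture.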
I tried to determine whether this research has already been done, but couldn't find a paper that provided a clear solution. Maybe it's easier than I'm making it out to be, and there's a clear and simple solution. If so, I'd like to implement it!

--Russ

Russ Rew
UCAR Unidata Program
address@hidden
http://www.unidata.ucar.edu

Ticket Details
===================
Ticket ID: AWT-862217
Department: Support netCDF
Priority: Normal
Status: Closed