netcdf-hdf mailing list is no longer active. The list archives are made available for historical reasons.
Mike

At 12:23 PM 3/16/2006, Albert Cheng wrote:
Hi, Ed and everyone,

I would like to share some of my experience installing HDF5 on various platforms, some classic like Linux/AIX/Sun and some one-of-a-kind new platforms like IBM Blue Gene and Cray XT3.

On the classic platforms, nearly all hosts have zlib installed, so for them it is a non-issue. Of those that don't have zlib installed, some have a policy of no non-supported software, and zlib is considered non-supported according to them. In those cases, I usually just build HDF5 without zlib. (You may ask, "Is HDF5 software supported?" For them, yes, because they can contact the HDF group for support.)

On some new platforms, things are still in flux. Zlib compression is a luxury for them at the moment. If it works, great. If not, well, they have other higher priorities like file systems or compilers. :-) I have to build HDF5 without zlib in order for HDF5 users to test their software on these new platforms.

Yes, it is true that HDF5 files with data compression produced on other systems will not be fully usable on the above platforms. (Actually, one can still access the parts of the file that are not compressed; one just cannot read the data of any compressed datasets.) On the other hand, HDF5 files without compression will be totally usable on the above platforms.

Then there is a different dimension to the zlib issue, due to its own success. Since zlib is so popular now, 95% of the machines I run into have a version of zlib installed by the system. If HDF5 brings in its own version, HDF5 users now have a new confusion: which version of zlib is my application linking with, the system version or the HDF5 version? There are also applications that already use zlib on their own and have been linking with their system version of zlib. Now they want to use the HDF5 library in their application, and if HDF5 insists on "inserting" its own version of zlib, these groups of users would be annoyed.
To further complicate the issue, if shared-library versions of zlib are available, the user's application code will use different versions of zlib depending on how the runtime environment (such as $LD_LIBRARY_PATH) is set up. It is confusing. (Honestly, shared libraries, though a great tool, can be confusing at times.)

There are valid pros and cons to each approach. I think the current HDF5 setting is a reasonable compromise for the current conditions.

Since netCDF-4 must run with zlib compression, would it work for netCDF-4 to be distributed with a version of the zlib code? I know, it is not quite right for netCDF-4, being the upper layer, to provide the source code of a library two layers down. But is it a possible solution?

-Albert

At 09:47 AM 3/16/2006, you wrote:

Quincey Koziol <koziol@xxxxxxxxxxxxx> writes:

Howdy Quincey!

>> I agree with Ed's concern about the netCDF library not being able to read
>> netCDF files that are created using zlib. Same goes for HDF5
>> itself. However it's done, the libraries should have zlib available.
>
> If we want to make zlib a requirement, we should decide that for
> certain.

By "we" in this, you mean the HDF5 team, right? Because for netCDF-4, zlib is a requirement.

We have the following netCDF-4 requirement: "Compression can be applied on a per-variable basis." I think this is an important feature of netCDF-4. I don't think we want the user confusion that would go with some installations having zlib and others not. So I don't want to build netCDF-4 without zlib. It's a firm requirement for netCDF-4.

For HDF5, my opinion would be that you are making it harder for the users. With the current situation, it is possible to write an HDF5 file on one system and be unable to read it on another. It is possible to write an HDF5 program that works great on one platform but won't compile on another. Does this not cause a lot of user confusion? I suspect most people just install HDF5 without zlib.
But per-variable compression is a *GREAT* feature, and one that I think most researchers would very much like to use (my opinion).

> Then, if it's a requirement, we should make the absolute best attempt we can
> to use the system's installed zlib - that's what it's there for. Only if there
> is no zlib installed on the system should we attempt to help the user install
> a copy of zlib from the _zlib_ web-site. We should _not_ be in the "business"
> of distributing zlib.

Here I must disagree. Your current method makes it considerably harder for the user to ensure that zlib is installed and that HDF5 is built with the --with-zlib option. In fact, I would speculate that the vast majority of users will screw this part up. (I often do myself!)

It is usual in autotools to provide what cannot be found on the system. For example, if I want to use the "strtok" function, the autotools way would be to look for it on the host system, and, if not found, to provide one, *not* to have my code work one way if it's there, and another way if it's not.

Similarly, it would make sense for HDF5 to look for zlib, and, if not found, to provide it. The amount of extra code in your distribution is trivial. The amount of extra work to get it working this way is zero for you, since I am going to do it anyway, and you can just take my solution if you want it.

In general, this also guards against versioning problems. What if the user has a version of zlib other than 1.2.3 installed? Surely you don't test against every possible version of zlib.

I understand your concerns about taking on some responsibility for zlib. But that is already the case - it is already an important component of our products. Making it harder to install does not really help this problem. If there is a bug in zlib, we all will be hurting - no matter how it was installed. In fact, by taking a known, tested version of zlib inside our products, we guard against zlib risks.
If version 1.2.4 of zlib comes out, and it sucks, we will be protected. All our users will be using 1.2.3, which they got with their tarball.

>> At first blush, Ed's solution seems to me like a good one.
>
> It's _very much_ against all the current open source "package" installation
> systems' philosophy and I don't think it's a good idea. All package
> installation systems allow each package in the system to list other packages as
> prerequisites that must be installed on the system before the package the
> user is interested in is installed. We just _finished_ removing zlib and
> libjpeg from the HDF4 distribution... :-)

In the world of package distribution, if zlib is required for HDF5, that means that HDF5 will not build without zlib. The problem is that HDF5 *will* build without zlib, yielding an HDF5 build which is not usable for netCDF-4. To fit the current package model, you need to make the HDF5 configure script dumb enough to just give up when zlib is not present. Then there would be only one way to build HDF5 - with zlib.

Also, package distribution, while wonderful, is not the usual way for people to get either HDF5 or netCDF - they get these as tarballs from our site. So we cannot rely on every user having a package manager to hide this complexity.

I would be interested to hear why you removed zlib from HDF4. Was there a problem?

> Shipping someone else's library with our package and installing it
> on a user's system (most likely _in place of_ the system version of that
> library) has caused _lots_ of headaches for us in the past.

Installation is different from building. I would not suggest that zlib be installed from the HDF5 build. You can build it and use it internally within the HDF5 package, without exposing it to anyone else. Then you have not installed anything on the user's system other than HDF5.

> Frankly, I think this is the course that netCDF-4 should take
> with HDF5 also, but that'll ultimately be up to them.

I would love to.
Instead of programming autoconf and automake I would be out on a bike ride! But that's just going to kill uptake of netCDF-4. Our users are Earth scientists, not computer scientists.

I intend that users will be able to install netCDF-4 in one step, just like netCDF-3. It makes extra work for me up front, but also saves me hundreds or thousands of support emails later. (Recall - we don't have a large support team here at netCDF - just me and Russ!)

In this new build method, users will not get HDF5 from you and netCDF-4 from me; they will get the whole tarball from me, and build it all at once. (There will also be provision for users who want to use an already-installed version of HDF5 - but this will be a small minority of users. However, for them, everything can work just as it does now.)

It was also not my intention to install HDF5 on their machines with netCDF-4, but just to use it internally, and end up installing netCDF-4 only. (This also would mean that the user would not have to provide the link arguments "-lhdf5 -lhdf5_hl -lz -lnetcdf", just "-lnetcdf". It will make the netcdf library files larger, but so what?)

If the HDF5 build can take on the task of building zlib when it is not present, that would make things a lot easier for me. I can give you the changes that you would need for that to happen. If not, then I will possibly have to hack every HDF5 release before including it in the master tarball with netCDF-4. This will add some work for me every time you do a release, and some delay in adopting new HDF5 versions in netCDF-4.

My goal is to make life easier for the netCDF user. Current netCDF-4 installation is far too complicated - it *must* be simplified. Otherwise we will lose half our users or more before they even get to try it.

Thanks,

Ed
--
Ed Hartnett -- ed@xxxxxxxxxxxxxxxx

--
Mike Folk, Scientific Data Tech (HDF)
http://hdf.ncsa.uiuc.edu
NCSA/U of Illinois at Urbana-Champaign
Voice: 217-244-0647
1205 W. Clark St, Urbana IL 61801
Fax: 217-244-5521
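
Albert's question earlier in the thread, which version of zlib an application actually ends up linked with, can be checked from inside a running program. The following is only a minimal sketch in Python, not anything HDF5 or netCDF-4 provides: it relies on the standard `zlib` module, which reports both the zlib version it was compiled against and the version of the library loaded at run time. The helper names `zlib_versions` and `versions_differ` are made up for this illustration.

```python
import zlib


def zlib_versions():
    """Return (build_version, runtime_version) as reported by Python's zlib module."""
    # ZLIB_VERSION: version of the zlib headers the module was compiled against.
    # ZLIB_RUNTIME_VERSION: version of the zlib library actually loaded right now.
    return zlib.ZLIB_VERSION, zlib.ZLIB_RUNTIME_VERSION


def versions_differ(build_version, runtime_version):
    """Return True when the loaded zlib is not the one the module was built with."""
    return build_version != runtime_version


build, runtime = zlib_versions()
print("built against zlib:", build)
print("running with zlib: ", runtime)
if versions_differ(build, runtime):
    print("note: a different zlib was picked up at run time")
```

If the two strings differ, the shared library found at run time (for example through $LD_LIBRARY_PATH) is not the one used at build time, which is the situation described above as confusing.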