Re: [netcdfgroup] netcdf4 python

ahh yes, so many options!

I think there is a key difference in use-case. It looks like spack was
designed with supercomputers in mind, which means, as you say, a very
particular configuration, so collections of binary packages aren't very
useful.

conda, on the other hand, was designed with "standard" systems in mind --
it's pretty easy on Windows and OS-X, and it takes a least-common-denominator
approach that works pretty well for Linux. But yeah, the repositories of
binaries are only useful because of this standardization.

And building is a pain, so conda is also all about enabling a small number
of experts to build packages that the rest of us can all use -- not so
useful for highly customized systems (as above).

>> Yes.  Every installed Spack package is uniquely versioned based on the
> version of that package, plus all its dependencies.  So...

hmm -- that may be worth looking into -- seems quite robust. conda does
this with Python and numpy versions, and the versions of dependencies are
encoded in the metadata, but you can't tell at a glance whether a given
package depends on, for example, netCDF 4.1 or 4.2. Though I'm still not
sure how important that is. It is in the metadata, and it will be resolved
(or the conflict reported) when you try to install.

> if you have GCC and Clang, and Python 2.7 and Python 3.5, then you can
> install 4 versions of Numpy, etc. --- all simultaneously.  If you upgrade
> to a new version of GCC, you can install another 2 versions of Numpy.  One
> Spack installation uses this feature to provide 24 versions of a particular
> package to the system's users, based on different choices of compiler, MPI,
> etc.

conda can handle this as well, though the C runtime is usually standardized.
Though there's nothing about conda that requires that -- it's convention,
and the fact that one of conda's goals is to share binaries.
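For what it's worth, the way this usually looks on the conda side is one
environment per combination -- something like the following (the environment
names here are just for illustration):

```
# one environment per Python/numpy combination -- names are arbitrary
conda create -n py27-numpy python=2.7 numpy
conda create -n py35-numpy python=3.5 numpy

# then switch between them as needed
source activate py35-numpy
```

So the combinations live side by side, just partitioned by environment
rather than tracked per-package the way spack does it.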

>> As far as I can tell, a single Conda recipe builds a single version of a
> package.  A repo of Conda recipes will build a single version of your
> software stack, analogous to the single set of packages you get with your
> Linux distro.

More or less, yes, but we're working on auto-updating of versions in
recipes -- you have to specify the version SOMEWHERE, and changing it in
the recipe's yaml is a bit clunky, but it accomplishes similar ends.
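For context, the version typically lives at the top of the recipe's
meta.yaml as a jinja2 variable, so a bump is a one-line edit -- a minimal
sketch, with a made-up package and version:

```
# meta.yaml -- hypothetical recipe; only the version line changes on a bump
{% set version = "4.4.0" %}

package:
  name: libnetcdf
  version: {{ version }}
```

Clunky mostly in the sense that it's a manual edit-and-rebuild, not that
it's hard.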

I think the reason for that clunkiness is that you don't actually know
a priori whether you can use the exact same build procedure for different
versions of a package -- by tying the version to the build procedure you
are assured not to have problems with that.

In fact, build procedures don't really change that much, at least across
minor version bumps, so this may be solving a minimal problem in exchange
for additional complication (or more need for hand-maintenance, anyway).

> A single Spack recipe can build many versions of a package.  And if the
> same nominal version of a package is built with different dependencies,
> that is considered a different version as well.
> This ability to handle combinatorial complexity is Spack's killer feature.

that does look pretty cool.
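If I'm reading the spack docs right, that combinatorial versioning shows up
directly in the command-line "spec" syntax -- roughly like this (the
compiler and dependency versions here are just examples):

```
# same package, different compilers and dependency versions,
# all installed side by side
spack install netcdf %gcc@4.9
spack install netcdf %clang@3.7
spack install netcdf@4.4.0 ^zlib@1.2.8
```

Each of those ends up as a distinct installation, hashed by its full
dependency tree.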

>> No.  There are combinatorially many combinations of toolchains, package
> versions, etc. that one MIGHT want to use.


> I have some C/Fortran libraries I've built, and use largely in a C/Fortran
> context.  They rely on a software stack of about 50 dependencies.  I've
> also built Python extensions (using Cython) that allow these libraries to
> be scripted from within Python.  In order for my Python extensions to work,
> everything must be built with the same software stack.  Not just the same
> compiler, but also the same zlib, NetCDF, etc.

yup - and that is a problem conda solves as well.

> I tell people if they just want a Python that works, install with
> Anaconda.  But if they need to build Python extensions, use Spack.

I'm going to extend that -- building Python extensions with conda is VERY
doable -- it really does make it much easier (than not using conda or
spack...). And you can build conda packages that you can then distribute to
users who may not have the "chops" to do the building themselves.

IIUC, the big difference, and the use case for spack, is that it makes all
this doable on a highly specialized system.

So I would edit your recommendations:

If you are running on a "standard" system -- OS-X, Windows, common Linux
distros -- conda is an excellent option.

If you are running a specialized system -- spack sounds like the way to go.

I'm sure their use cases overlap quite a bit as well.

Thanks for all the info -- I'll make a point of keeping an eye on spack.



Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception