Re: [ldm-users] the grid

To: patrick <patrick@xxxxxxxxxxxxxx>
Subject: Re: [ldm-users] the grid
From: Gerry Creager <gerry.creager@xxxxxxxx>
Date: Tue, 08 Apr 2008 08:28:59 -0500

patrick wrote:

an interesting article about 'the grid,' which
we will likely be using in the future.

http://www.techtree.com/India/News/The_Grid_to_Render_the_Web_Obsolete/551-88299-643.html

Or maybe not. Neither grid, nor cloud, computing is nearly so easy to enable/enact as our friends at CERN would have you believe. The concept of ubiquitous data apparently local to you is something we should investigate, and there are several decent (grid-related!) projects of note. However, CERN researchers fail to mention that they've worked with their less fortunate partners (do you know how much CERN and the EU spend on basic research, compared to the US, especially in physics? Or, for that matter, weather prediction?) to standardize software across all the "grid" nodes wherever they may be. This to the point that, not knowing that Scientific Linux, already based on recompiling RedHat Enterprise, had decided to go with CentOS, researchers on our campus flatly stated they'd not use any resources my organization stood up, because it didn't meet their software requirements. All they had to remember was, "Scientific Linux" to make CERN compliance happy...

Our experiences with "grid" enablement, so far, have been both good and bad. On one hand, we've been working with a group that feels they can make any app work on any system as long as the Globus Tool Kit is installed. Let's just say that their progress to date on the grid resources we've got in that group is, well, slow. Another group we work with has adopted LDM to get weather model data and share other data between sites. They employ a standard minimum software "stack" (which does include Globus... but not a random version, rather, they specify which is required) and publish guidelines for porting code to different resources. Their results are a bit better.

Really, the high energy physics guys have it right, though: Standardize ALL the software and require it be kept reasonably up to date, then distribute the applications they expect second and third tier researchers to use. In other words, they control the OS, utilities, compilers, and applications. It's a homogeneous computing environment, which, for grid enablement, is a good environment.

We've looked at similar approaches for the atmospheric sciences: What about a common-hardware, common-software, 100 TF distributed environment (that's 0.1 petaFlop, or a significant chunk of computing horsepower)? What if it were set up to handle on-demand, near-real-time forecasting, e.g., LEAD-ish event-driven models, triggered when SPC were to issue a mesoscale discussion, or a watch, and updated in a RUC-ish manner? What if that were available for competitive scheduling for YOUR weather model run? Oh, you don't run a model? Select from extant models and request a graphics run. You don't have a project that needs that much computing time but you're developing a proposal? Ask for a development allocation.

Using a homogeneous approach, grid enablement becomes much more manageable. Commonizing the hardware and interconnect between nodes makes the software interactions easier to manage. But the investment is still non-trivial and who's gonna do it is still a problem.

One other thing unmentioned in the article is scheduling. In the scenario I described above, the idea of preemptive, prioritized, and reservation-enabled scheduling is implied... and pretty well mandatory. Today's implementation of "the grid", and the implementations I anticipate for the next 5 years or more, are well suited to batch jobs with no element of urgency. You enqueue your job and when you get the result, you analyze it. You might enqueue several jobs, and post-process the results all together. You care about the job(s) running to completion and being programmatically sound, in that, if you run the job several times, with the same input, you get the same output. You don't want that job to get in right now, run fast on enough CPUs to complete in minutes, and then get out. In batch mode, you're not in a hurry. That model works for retrospective analysis in our field, but not really well for forecasting. Forecasting has a deadline by which the job has to be done, post-processing completed, and a human has reviewed the results. Batch processing has no such guarantees (today).
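
To make the batch-versus-urgent contrast concrete, here is a minimal toy sketch, not anything a real grid scheduler does verbatim. It assumes a single machine, one-minute time slices, and made-up job names, runtimes, priorities, and a made-up 60-minute deadline; all of those are hypothetical. It compares a run-to-completion FIFO queue with a priority-preemptive policy. Reservations are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    submit: int        # minute the job enters the queue
    runtime: int       # minutes of machine time needed
    priority: int = 0  # higher runs first; used only by the preemptive policy


def fifo_finish_times(jobs):
    """Batch-style policy: run each job to completion in submission order."""
    clock, finish = 0, {}
    for job in sorted(jobs, key=lambda j: j.submit):
        clock = max(clock, job.submit) + job.runtime
        finish[job.name] = clock
    return finish


def preemptive_finish_times(jobs):
    """Each minute, run the highest-priority job that has been submitted."""
    remaining = {j.name: j.runtime for j in jobs}
    finish, clock = {}, 0
    while remaining:
        ready = [j for j in jobs if j.name in remaining and j.submit <= clock]
        if ready:
            # Ties on priority go to the job submitted earlier.
            job = max(ready, key=lambda j: (j.priority, -j.submit))
            remaining[job.name] -= 1
            if remaining[job.name] == 0:
                del remaining[job.name]
                finish[job.name] = clock + 1
        clock += 1
    return finish


# Hypothetical workload: two retrospective jobs plus one urgent forecast.
jobs = [
    Job("reanalysis", submit=0, runtime=120),
    Job("climate_stats", submit=5, runtime=90),
    Job("severe_wx_forecast", submit=10, runtime=20, priority=10),
]
deadline = 60  # hypothetical: forecast output needed by minute 60

fifo = fifo_finish_times(jobs)
preempt = preemptive_finish_times(jobs)
for label, result in (("FIFO", fifo), ("preemptive", preempt)):
    done = result["severe_wx_forecast"]
    print(f"{label}: forecast done at minute {done}, "
          f"deadline met: {done <= deadline}")
```

Under these assumptions the forecast waits behind both retrospective jobs in the FIFO queue and misses the deadline, while the preemptive policy gets it through at the cost of delaying the job it interrupted. The last job still finishes at the same time either way, since total work is unchanged.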

There are metaschedulers (Spruce) that purport to do things like this for you, but they honestly need a lot more work to be "right".

So: Will we all be using "the grid" in the future? Maybe, almost certainly, yes. (As to why, I suspect that at some point NSF will say, "Enough!" to every project buying a sub-teraflop cluster to do specific project computation, and then to be aged out at the end of the project... say, in three years... rather than being sustained.) I doubt it will be TeraGrid facilities in the near-term, although some will try to support this sort of usage, and NSF will likely foster it. But they are used to, and understand, batch processing, and lack our sense of real-time urgency. A dedicated infrastructure for Atmospheric Sciences (or Ocean/Atmosphere)? Makes sense to me, but NSF has denied this once, at least.

Short form is, I think we have a long way to go to make "the grid" our standard computing environment.


gerry
--
Gerry Creager -- gerry.creager@xxxxxxxx
Texas Mesonet -- AATLT, Texas A&M University        
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843
