
20030314: FreeBSD as a Unidata-supported platform (cont.)



>From: Jim Koermer <address@hidden>
>Organization: Plymouth State
>Keywords: 200303121902.h2CJ2EB2009171 LDM FreeBSD mmap

Hi Jim,

>Here is what Ted was able to find out about the 2GB memory map
>limitation.

Thanks for sending along Ted's thoughts/findings.  We looked hard at
system C include files and see that the show stopper is not easily
gotten around:

o the mmap prototype shows that the size of the object that can be
  memory mapped is a size_t:

  void *
  mmap(void *addr, size_t len, int prot, int flags, int fd, off_t offset);

o the size of a size_t is 32 bits

It may be possible to change these type definitions and rebuild the
kernel, but that is unlikely to be practical.
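
One way to confirm the two points above on a given machine is to print
the widths of the types in the mmap prototype.  This is a minimal
sketch, not code from the LDM; the function names are made up for
illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Width in bits of size_t, the type of mmap()'s len argument. */
unsigned size_t_bits(void)
{
    return (unsigned)(sizeof(size_t) * CHAR_BIT);
}

/* Width in bits of off_t, the type of mmap()'s offset argument. */
unsigned off_t_bits(void)
{
    return (unsigned)(sizeof(off_t) * CHAR_BIT);
}

/* Print both widths and the largest length that a single mmap()
 * call can even be asked for on this platform. */
void report_mmap_widths(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "size_t: %u bits, largest len = %ju bytes\n",
            size_t_bits(), (uintmax_t)SIZE_MAX);
    fprintf(out, "off_t:  %u bits\n", off_t_bits());
}
```

Calling report_mmap_widths(stdout) from main() prints both lines.  A
32-bit size_t caps the len argument just under 4 GB, and the usable
address space of a process is smaller than that.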

As to being able to mmap files larger than 2 GB on Linux, it is possible.
Daryl Hertzman at Iowa State is running an LDM under Redhat 7.? Linux
using a queue that is something like 4 GB.  At one time he was using
a 10 GB queue, so it must be possible.

As to the cost difference between a Sun 280R and a PC running FreeBSD:
this is exactly why we and the rest of the world are so interested
in PC solutions.

Please pass along our thanks to Ted for the information.  Also, if
you/he catches wind of any changes in FreeBSD that would help us
get around the 2 GB limit, we would really appreciate it if you would
pass the info along.

Tom

>Jim
>
>-------- Original Message --------
>Subject: Re: [Fwd: 20030313: FreeBSD as a Unidata-supported platform]
>Date: Thu, 13 Mar 2003 23:15:07 -0500 (EST)
>From: Ted Wisniewski <address@hidden>
>To: Jim Koermer <address@hidden>
>
>Ok.   I heard back from the mailing list;  the real reason behind the
>limit appears to be the i386 architecture being able to only address
>4GB of total addressable space.  1GB is reserved for the kernel,
>leaving only 3GB remaining, some of which appears to be reserved.  My
>contact on the list was not 100% sure what it is reserved for.  He sent
>me a test program, which I ran on my BSD box and it did have problems
>at about 2GB.  I ran the same program on a Linux box and it had the
>same issue.  So, it definitely points to an architectural limit.  Now,
>if you install FreeBSD or Linux on a Sparc or Alpha or an IA64 machine
>you would get by the limit.  The linux mmap man page does not give you
>the warning, which could be misleading....  I suspect that Intel
>Solaris would behave the same way.
>
>I hope that helps to frame the issue.   Bottom line, the man page is
>correct, unless ldm makes sure not to mmap more than the limit without
>freeing some of the mmaped memory before asking for more...   It will
>fail at around the 2GB size (unless they keep the total below the
>limit).
>
>One thing is certain about their tests...  A Sunfire 280 is far more
>expensive than a comparable Intel Box.. ;->
>
>Ted
>
>
>
>(* 
>(* I passed on the info of your e-mail to Tom Yoksas at Unidata. I
>(* thought that you would like to hear the details of their test
>(* results.
>(* 
>(* Jim
>(* 
>(* -------- Original Message --------
>(* Subject: 20030313: FreeBSD as a Unidata-supported platform
>(* Date: Thu, 13 Mar 2003 08:20:11 -0700
>(* From: Unidata Support <address@hidden>
>(* Organization: UCAR/Unidata
>(* To: Jim Koermer <address@hidden>
>(* CC: address@hidden
>(* 
>(* >From: Jim Koermer <address@hidden>
>(* >Organization: Plymouth State
>(* >Keywords: 200303121902.h2CJ2EB2009171 LDM FreeBSD mmap
>(* 
>(* Hi Jim,
>(* 
>(* >I've been trying to tell everyone for years (since 1996) that I
>(* >thought FreeBSD was a great OS.
>(* 
>(* You were right!
>(* 
>(* >I'm glad it is nearing adoption at Unidata.
>(* 
>(* We are very impressed with FreeBSD after running it hard on an amply
>(* configured machine.  Right now, I am doing a stress test of LDM-6 on
>(* our FreeBSD box: the machine has 10 inbound feeds and 50 outbound
>(* feeds, mostly of the full CONDUIT datastream.  CONDUIT is the high
>(* resolution model output from NCEP.  Its volume is roughly 75% of the
>(* entire IDD volume for all other feeds including CRAFT (which is the
>(* NEXRAD Level II/full volume scan data).  Relaying CONDUIT data is
>(* _the_ acid test for LDM performance, and our FreeBSD box is doing a
>(* great job at it (but, so is our Sun Sunfire 280 SPARC box, thelma).
>(* The bit rates on our test machine are peaking at about 59 Mbps.  As
>(* a comparison, in our stress test of our Sunfire 280 (thelma), we ran
>(* it at speeds averaging 54 Mbps and peaking at over 100 Mbps for
>(* several days.  Doing the numbers shows that this is moving 520 GB/day
>(* of data off of the server.  This was done without introducing any
>(* noticeable latency to the products!
>(* 
>(* >Here is what my ITS FreeBSD guru has to say about the memory map
>(* >constraint that you mentioned:
>(* >
>(* >"I will look into the MMAP thing a little more.  The man page
>(* >of mmap says there is a limitation imposed, however, I cannot find
>(* >the limitation in the source code.  It could be that the limitation
>(* >is not actually there anymore;  meaning that the issue was resolved
>(* >a while back and just left in the manpage.   I have posted a
>(* >question about this to the FreeBSD mailing list to find out for
>(* >sure.  I'll get back to you when the response comes."
>(* >
>(* >We'll keep you updated when we find out any more information.
>(* 
>(* We tried to get this to work, but stopped short of doing some code
>(* modification after reading the mmap man page.  If you find that the
>(* man page is out of date and there really is no limitation, I will be
>(* even more keen on FreeBSD.  Since the Weather Service is now
>(* investigating hardware for relaying all of the Level II NEXRAD data,
>(* it would be _very_ useful to learn that FreeBSD did not have a 2 GB
>(* mmap limit.  If it doesn't, we will lean towards recommending a
>(* PC/FreeBSD solution rather than a Sun/Solaris one.
>(* 
>(* Thanks for the input, and I will be eagerly looking for follow-up
>(* information from you.
>(* 
>(* Tom
>(* ****************************************************************************
>(* Unidata User Support                                    UCAR Unidata Program
>(* (303)497-8643                                                  P.O. Box 3000
>(* address@hidden                                   Boulder, CO 80307
>(* ----------------------------------------------------------------------------
>(* Unidata WWW Service                        http://www.unidata.ucar.edu/
>(* ****************************************************************************
>(* 
>
>
>-- 
>|   Ted Wisniewski                  E-Mail:  address@hidden        |
>|   Manager, Systems Group           WEB:     http://oz.plymouth.edu/~ted/ |
>|   Information Technology Services                                 |
>|   Plymouth State College           Phone:   (603) 535-2661        |
>|   Plymouth NH, 03264               Fax:     (603) 535-2263        |
>

>From address@hidden Fri Mar 14 09:34:15 2003
Subject: [Fwd: Re: [Fwd: 20030314: FreeBSD as a Unidata-supported platform (cont.)]]

Tom,

Here is some additional info from Ted.

Jim

-------- Original Message --------
Subject: Re: [Fwd: 20030314: FreeBSD as a Unidata-supported platform (cont.)]
Date: Fri, 14 Mar 2003 11:24:14 -0500 (EST)
From: Ted Wisniewski <address@hidden>
To: Jim Koermer <address@hidden>

The guy I got E-Mail from indicated that you can use smaller windows
(say, 256MB) and just map the areas that are actually being used vs.
the whole file all the time.  It is possible they are doing something
special on the Linux machine; I got RedHat 7.1 to fail in the same way
it did on BSD.  The value in a 32 bit integer can be up to 4GB;
however, some of that is reserved (in terms of mmap) for the kernel
(probably half).  At any rate, both the BSD and Linux mmap routines use
a 32 bit size_t value.  So, I don't know how it would get above the
4GB limit on Linux...
I can see some creative ways to go over 2GB actual file size by not
mapping the whole file all the time.  There is also the possibility
of not using mmap() at all and handling the file reads/writes the
way you would do it without mmap.  In that case file offsets are
of type "off_t", which is a 64 bit integer, which would give you limits
out at the max file size the system supports.  On BSD that is about
1TB; I am not sure what it is on Linux.  There may be a performance
penalty in not using mmaped files.
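
The windowing idea can be sketched in C as follows.  This is only an
illustration under stated assumptions, not how the LDM manages its
queue: struct window and the window_* functions are hypothetical names,
and the _FILE_OFFSET_BITS define is a glibc convention for getting a
64-bit off_t on 32-bit Linux (FreeBSD's off_t is already 64 bits).

```c
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64    /* glibc: request a 64-bit off_t */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* One mapped window onto part of a file (hypothetical helper type). */
struct window {
    void  *base;   /* address returned by mmap() */
    size_t len;    /* number of bytes actually mapped */
    size_t skew;   /* bytes between base and the requested offset */
};

/* Map `want` bytes of fd starting at byte `offset`.  mmap() needs a
 * page-aligned file offset, so round down and remember the skew.
 * Returns 0 on success, -1 if mmap() fails. */
int window_open(struct window *w, int fd, off_t offset, size_t want)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    off_t aligned = offset - (offset % page);
    w->skew = (size_t)(offset - aligned);
    w->len  = want + w->skew;
    w->base = mmap(NULL, w->len, PROT_READ | PROT_WRITE, MAP_SHARED,
                   fd, aligned);
    return w->base == MAP_FAILED ? -1 : 0;
}

/* Pointer to the byte that lives at the requested file offset. */
unsigned char *window_data(const struct window *w)
{
    return (unsigned char *)w->base + w->skew;
}

/* Unmap the window so its address space can be reused. */
int window_close(struct window *w)
{
    return munmap(w->base, w->len);
}
```

Only the window length has to fit in a size_t and in the free address
space; the offset is an off_t, so the file itself can be larger than
2GB as long as each window is closed before too many are open at once.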

I believe I did see an option in the LDM code which would allow you not
to use mmap;  perhaps this is how the problem was addressed on the
Linux machine?  The best choice is to use a 64-bit hardware platform if
you want to use mmap on very large files.
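
For the no-mmap route, positioned reads and writes take the offset as
an off_t, so nothing has to fit in the address space.  A minimal sketch
using POSIX pread() and pwrite(); read_at and write_at are hypothetical
helper names, not functions from the LDM source, and the
_FILE_OFFSET_BITS define is a glibc convention that other systems
ignore.

```c
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64    /* glibc: request a 64-bit off_t */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly len bytes starting at offset.
 * Returns 0 on success, -1 on error or if the file ends early. */
int read_at(int fd, void *buf, size_t len, off_t offset)
{
    unsigned char *p = buf;
    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = pread(fd, p, len, offset);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, try again */
            return -1;
        }
        if (n == 0)
            return -1;          /* end of file before len bytes */
        p += n;
        offset += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes starting at offset.
 * Returns 0 on success, -1 on error. */
int write_at(int fd, const void *buf, size_t len, off_t offset)
{
    const unsigned char *p = buf;
    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, offset);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, try again */
            return -1;
        }
        if (n == 0)
            return -1;          /* no progress, give up */
        p += n;
        offset += n;
        len -= (size_t)n;
    }
    return 0;
}
```

Each access costs a system call and a copy, which is the likely source
of the performance penalty mentioned above.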

Hope this helps.


Ted




(* I passed on your info to Tom and he is appreciative. He indicates
(* that there could be a way around the limitation.
(* 
(* Jim

