NetCDF  4.7.1
inmemory.md
NetCDF In-Memory Support
====================================

<!-- double header is needed to workaround doxygen bug -->

NetCDF In-Memory Support {#inmemory}
====================================

[TOC]

Introduction {#inmemory_intro}
--------------

It can be convenient to operate on a netcdf file whose
content is held in memory instead of in a disk file.
The netcdf API has been modified in a number of ways
to support this capability.

Three distinct but related capabilities are provided.

1. DISKLESS -- Read a file into memory, operate on it, and optionally
write it back out to disk when nc_close() is called.
2. INMEMORY -- Tell the netcdf-c library to treat a provided block
of memory as if it were a netcdf file. At close, it is possible to ask
for the final contents of the memory chunk. Be warned that there is
some complexity to this as described below.
3. MMAP -- Tell the netcdf-c library to use the *mmap()* operating
system functionality to access a file.

The first two capabilities are intertwined in the sense that the
*diskless* capability makes use internally of the *inmemory*
capability (for netcdf classic only). However, the *inmemory*
capability can be used independently of the *diskless*
capability.

The *mmap()* capability is similar to *diskless* but
relies on special facilities of the underlying operating system.
Note also that *diskless* and *inmemory* can be used for both
*netcdf-3* (classic) and *netcdf-4* (enhanced) data. The *mmap*
capability can only be used with *netcdf-3*.

Enabling Diskless File Access {#Enable_Diskless}
--------------
The *diskless* capability can be used relatively transparently
using the *NC_DISKLESS* mode flag.

Note that since the file is stored in memory, size limitations apply.
On a machine with 32-bit pointers, the file size must be less than 2^32
bytes. On a 64-bit machine, the size must be less than 2^64 bytes.
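
As a rough illustration of the limit above, the following sketch checks whether a given file size can be addressed by a single memory block on the current platform. The helper name `fits_in_address_space` is hypothetical and not part of netcdf-c; it tests only the pointer-width bound, not how much memory is actually available.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of netcdf-c): report whether a file of
 * file_size bytes could be held in one in-memory block on this platform.
 * SIZE_MAX is 2^32 - 1 when size_t is 32 bits wide and 2^64 - 1 when it
 * is 64 bits wide. Returns 1 if the size is addressable, 0 otherwise. */
int fits_in_address_space(uint64_t file_size)
{
    return file_size <= (uint64_t)SIZE_MAX;
}
```

Passing this check is necessary but not sufficient: a diskless open can presumably still fail if the system cannot supply that much memory.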

Also note that for a diskless file, there are two notions of
*write* with respect to the file. The first notion is that the
file is writeable through the netCDF API, but on disk, the file is
read-only. This means a call to, for example, _nc_def_dim()_ will succeed,
but no changes will be written to disk.
The second notion of *write* refers to the file on disk to which
the contents of memory might be persisted.

WARNING: control of the two kinds of *write* has changed since
release 4.6.1.

The mode flag NC_WRITE determines the first kind of *write*.
If NC_WRITE is set, the file can be modified through
the netCDF API; otherwise it is read-only. This is a change since
release 4.6.1.

The new mode flag NC_PERSIST determines the second kind of
*write*. If NC_PERSIST is set, the memory contents
will be persisted to disk, possibly overwriting the previous
file contents. Otherwise, the default is to throw away the
in-memory contents.
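
The two kinds of *write* can be summarized as a small truth table. The sketch below is only a model of the rules described above: `DEMO_WRITE` and `DEMO_PERSIST` are placeholder bits standing in for NC_WRITE and NC_PERSIST (the real constants come from `netcdf.h` and their values are not reproduced here), and `demo_policy` is a hypothetical name.

```c
/* Placeholder bits for illustration only. Real programs use NC_WRITE and
 * NC_PERSIST from netcdf.h; these values are NOT the library's values. */
#define DEMO_WRITE   0x1
#define DEMO_PERSIST 0x2

typedef struct demo_write_policy {
    int api_writable;      /* first kind: file can be modified through the API */
    int persist_on_close;  /* second kind: memory is written to disk at close */
} demo_write_policy;

/* Derive both notions of "write" from a set of mode bits. */
demo_write_policy demo_policy(int mode)
{
    demo_write_policy p;

    p.api_writable = (mode & DEMO_WRITE) != 0;
    p.persist_on_close = (mode & DEMO_PERSIST) != 0;
    return p;
}
```

Under this reading, a mode with only the write bit gives a file that can be modified in memory but whose changes are discarded at close.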

### Diskless File Open
Calling *nc_open()* using the mode flag *NC_DISKLESS* will cause
the file being opened to be read into memory. When calling *nc_close()*,
the file will optionally be re-written (aka "persisted") to disk. This
persist capability will be invoked if and only if *NC_PERSIST* is specified
in the mode flags at the call to *nc_open()*.

### Diskless File Create
Calling *nc_create()* using the mode flag *NC_DISKLESS* will cause
the file to initially be created and kept in memory.
When calling *nc_close()*, the file will be written
to disk if and only if *NC_PERSIST* is specified
in the mode flags at the call to *nc_create()*.

Enabling Inmemory File Access {#Enable_Inmemory}
--------------

The netcdf API has been extended to support the inmemory capability.
The relevant API is defined in the file `netcdf_mem.h`.

The important data structure to use is `NC_memio`.
````
typedef struct NC_memio {
    size_t size;
    void* memory;
    int flags;
} NC_memio;

````
An instance of this data structure is used when providing or
retrieving a block of data. It specifies the memory and its size
and also some relevant flags that define how to manage the memory.

Currently only one flag is defined -- *NC_MEMIO_LOCKED*.
This tells the netcdf library that it should never try to
*realloc()* the memory nor to *free()* the memory. Note
that this does not mean that the memory cannot be modified, but
only that the modifications will be within the confines of the provided
memory. If doing such modifications is impossible without
reallocating the memory, then the modification will fail.

### In-Memory API

The new API consists of the following functions.
````
int nc_open_mem(const char* path, int mode, size_t size, void* memory, int* ncidp);

int nc_create_mem(const char* path, int mode, size_t initialsize, int* ncidp);

int nc_open_memio(const char* path, int mode, NC_memio* info, int* ncidp);

int nc_close_memio(int ncid, NC_memio* info);

````
### The **nc_open_mem** Function

The *nc_open_mem()* function is a convenience
function that internally invokes *nc_open_memio()*.
It essentially provides simple read-only access to a chunk of memory
of some specified size.
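
Before calling *nc_open_mem()*, the caller needs a block of memory holding the file image. One way to obtain it, sketched here with standard C I/O only, is to read an existing file into a malloc'd buffer. The helper name `read_file_into_memory` is an assumption for illustration and is not part of netcdf-c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of netcdf-c): read an entire file into a
 * freshly malloc'd block. On success returns the block and stores its
 * length in *sizep. Returns NULL on any failure, including an empty file.
 * The caller is responsible for calling free() on the returned block. */
void *read_file_into_memory(const char *path, size_t *sizep)
{
    FILE *fp;
    void *memory = NULL;
    long length = 0;

    assert(path != NULL && sizep != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;
    /* Find the file length by seeking to the end, then rewind. */
    if (fseek(fp, 0L, SEEK_END) == 0 && (length = ftell(fp)) > 0 &&
        fseek(fp, 0L, SEEK_SET) == 0) {
        memory = malloc((size_t)length);
        if (memory != NULL &&
            fread(memory, 1, (size_t)length, fp) != (size_t)length) {
            free(memory);   /* short read: give up */
            memory = NULL;
        }
    }
    fclose(fp);
    if (memory != NULL)
        *sizep = (size_t)length;
    return memory;
}
```

The returned pointer and size could then be passed as the *memory* and *size* arguments of *nc_open_mem()*. Because `ftell()` returns a `long`, this sketch is limited to files whose size fits in that type.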

### The **nc_open_memio** Function

This function provides a more general read/write capability with respect
to a chunk of memory. It has a number of constraints and its
semantics are somewhat complex. This is primarily due to limitations
imposed by the underlying HDF5 library.

The constraints are as follows.

1. If the *NC_MEMIO_LOCKED* flag is set, then the netcdf library will
make no attempt to reallocate or free the provided memory.
If the caller invokes the *nc_close_memio()* function to retrieve the
final memory block, it should be the same
memory block as was provided when *nc_open_memio()* was called.
Note that it is still possible to modify the in-memory file if the NC_WRITE
mode flag was set. However, failures can occur if an operation
cannot complete because the memory needs to be expanded.
2. If the *NC_MEMIO_LOCKED* flag is <b>not</b> set, then
the netcdf library will take control of the incoming memory.
This means that the user should not make any attempt to free
or even read the incoming memory block in this case.
The netcdf library is free to reallocate the incoming
memory block to obtain a larger block when an attempt to modify
the in-memory file requires more space. Note that implicit in this
is that the old block -- the one originally provided -- may be
freed as a side effect of re-allocating the memory using the
*realloc()* function.
The caller may invoke the *nc_close_memio()* function to retrieve the
final memory block, which may not be the same as the block originally
provided by the caller. In any case, the returned block must always be freed
by the caller and the original block should not be freed.
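
The ownership rules above can be pictured with a toy model of how a block might be grown. This is not the netcdf-c implementation: `demo_block`, `DEMO_LOCKED`, and `demo_grow` are hypothetical names that only mimic the shape of *NC_memio* and the behavior described in the two constraints.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: NOT the netcdf-c implementation. DEMO_LOCKED stands in
 * for NC_MEMIO_LOCKED and demo_block mimics the fields of NC_memio. */
#define DEMO_LOCKED 1

typedef struct demo_block {
    size_t size;    /* bytes currently available */
    void *memory;   /* start of the block */
    int flags;      /* DEMO_LOCKED or 0 */
} demo_block;

/* Try to make the block at least `needed` bytes long.
 * Returns 0 on success, -1 if the block is locked and too small
 * or if reallocation fails. */
int demo_grow(demo_block *blk, size_t needed)
{
    void *bigger;

    assert(blk != NULL);
    if (needed <= blk->size)
        return 0;                  /* already fits in place */
    if (blk->flags & DEMO_LOCKED)
        return -1;                 /* locked: never realloc, so fail */
    bigger = realloc(blk->memory, needed);
    if (bigger == NULL)
        return -1;                 /* old block is still valid here */
    blk->memory = bigger;          /* the old pointer may now be invalid */
    blk->size = needed;
    return 0;
}
```

In the unlocked case `realloc()` may move the data, which is why the pointer originally handed to the library should no longer be used or freed by the caller.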

### The **nc_create_mem** Function

This function allows a user to create an in-memory file, write to it,
and then retrieve the final memory using *nc_close_memio()*.
The *initialsize* argument to *nc_create_mem()* tells the library
how much initial memory to allocate. Technically, this is advisory only
because it may be ignored by the underlying HDF5 library.
It is used, however, for netcdf-3 files.

### The **nc_close_memio** Function

The ordinary *nc_close()* function can be called to close an in-memory file.
However, it is often desirable to obtain the final size and memory block
for the in-memory file when that file has been modified.
The *nc_close_memio()* function provides a means to do this.
Its second argument is a pointer to an *NC_memio* object
into which the final memory and size are stored. WARNING:
the returned memory is owned by the caller and so the caller
is responsible for calling *free()* on that returned memory.

### Support for Writing with *NC_MEMIO_LOCKED*

When the NC_MEMIO_LOCKED flag is set in the *NC_memio* object
passed to *nc_open_memio()*, it is still possible to modify
the opened in-memory file (using the NC_WRITE mode flag).

The big problem is that any changes must fit into the memory provided
by the caller via the *NC_memio* object. This problem can be
mitigated, however, by using the "trick" of overallocating
the caller-supplied memory. That is, if the original file is, say, 300 bytes,
then it is possible to allocate, say, 65000 bytes and copy the original file
into the first 300 bytes of the larger memory block. This will allow
the netcdf-c library to add to the file up to that 65000 byte limit.
In this way, it is possible to avoid memory reallocation while still
allowing modifications to the file. You will still need to call
*nc_close_memio()* to obtain the size of the final, modified, file.
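
The overallocation step can be sketched in plain C. The helper below is hypothetical (it is not part of netcdf-c) and assumes the original file image is already in memory; it copies the image to the front of a larger, zero-filled block.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of netcdf-c): copy a file image of `used`
 * bytes into the front of a new zero-filled block of `capacity` bytes.
 * Returns NULL if image is NULL, if capacity < used, or if allocation
 * fails. The caller owns the returned block. */
void *overallocate_image(const void *image, size_t used, size_t capacity)
{
    void *block;

    if (image == NULL || capacity < used)
        return NULL;
    block = calloc(1, capacity);    /* spare space starts out zeroed */
    if (block == NULL)
        return NULL;
    memcpy(block, image, used);     /* original file occupies the front */
    return block;
}
```

With the numbers from the text, `overallocate_image(image, 300, 65000)` would produce the larger block to hand to the library together with *NC_MEMIO_LOCKED*. Setting the *size* field to the full capacity is an assumption based on the description above, which does not state it outright.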

Enabling MMAP File Access {#Enable_MMAP}
--------------

Some operating systems provide a capability called MMAP.
This allows disk files to automatically be mapped to chunks of memory.
It operates in a fashion somewhat similar to operating system virtual
memory, except with respect to a file.
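
For readers unfamiliar with the facility, here is a minimal sketch of mapping a file read-only with the POSIX *mmap()* call. It illustrates the operating system mechanism only; how netcdf-c uses it internally is not shown, and the helper name `map_file_readonly` is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not part of netcdf-c): map a whole file read-only.
 * On success returns the mapping and stores its length in *lenp.
 * Returns NULL on failure, including for an empty file.
 * Release the mapping with munmap(ptr, len). */
void *map_file_readonly(const char *path, size_t *lenp)
{
    struct stat st;
    void *ptr;
    int fd;

    assert(path != NULL && lenp != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    ptr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (ptr == MAP_FAILED)
        return NULL;
    *lenp = (size_t)st.st_size;
    return ptr;
}
```

Pages of the file are brought into memory by the operating system as they are touched, which is the sense in which the text compares MMAP to virtual memory.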

By setting mode flag NC_MMAP, it is possible to do the equivalent
of NC_DISKLESS but using the operating system's mmap capabilities.

Currently, MMAP support is only available when using netcdf-3 or cdf5
files.

Known Bugs {#Inmemory_Bugs}
--------------

1. If you are modifying a locked memory chunk (using
   NC_MEMIO_LOCKED) and are accessing it as a netcdf-4 file, and
   you overrun the available space, then the HDF5 library will
   fail with a segmentation fault.

2. You will get an HDF5 error when both of the following conditions hold.

   1. You call nc_open on a file with the flags NC_DISKLESS|NC_WRITE
      but without NC_PERSIST.
   2. The file to be read is read-only (i.e. mode 0444).

   Note that this should be ok because the modifications to the file
   are not intended to be pushed back into the disk file. However, the
   HDF5 core driver does not allow this.

References {#Inmemory_References}
--------------

1. https://support.hdfgroup.org/HDF5/doc1.8/Advanced/FileImageOperations/HDF5FileImageOperations.pdf

Point of Contact
--------------

__Author__: Dennis Heimbigner<br>
__Email__: dmh at ucar dot edu<br>
__Initial Version__: 2/3/2018<br>
__Last Revised__: 2/5/2018

Generated on Tue Aug 27 2019 15:28:55 for NetCDF. NetCDF is a Unidata library.