Memory Management and Free Lists
At Release 1.2.2, free list management code was introduced to the HDF5
library. This included one user-level function, H5garbage_collect, which
garbage collects on all the free-lists. H5garbage_collect is the only user-
accessible (i.e., application developer-accessible) element of this
functionality.
The free-lists generally reduce the amount of dynamic memory used to around
75% of the pre-optimized amount as well as speed up the time in the library
code by ~5% The free-lists also help linearize the amount of memory used with
increasing numbers of datasets or re-writes on the data, so the amount of
memory used for the 1500/45 free-list case is only 66% of the memory used for
the unoptimized case.
Overall, the introduction of free list management is a win: the library is
slightly faster and uses much less system resources than before. Most of the
emphasis has been focused on the main "thouroughfares" through the code;
less attention was paid to the "back streets" which are used much less
frequently and offer less potential for abuse.
Adding a free-list for a data structure in the HDF5 library code is easy:
Old code:
---------
int foo(void)
{
H5W_t *w;
for(i=0; i<x; i++) {
w=H5MM_malloc(sizeof(H5W_t));
<use w>
H5MM_xfree(w);
}
}
New code:
---------
H5FL_DEFINE(H5W_t);
int foo(void)
{
H5W_t *w;
for(i=0; i<x; i++) {
w=H5FL_ALLOC(H5W_t,0);
<use w>
H5FL_FREE(H5W_t,w);
}
}
There are three kinds of free-lists:
-- for "regular" objects,
-- arrays of fixed size object (both fixed length and unknown length), and
-- "blocks" of bytes.
"Regular" free-lists use the H5FL_<*> macros in H5FLprivate.h and are
designed for single, fixed-size data structures like typedef'ed structs,
etc.
Arrays of objects use the H5FL_ARR_<*> macros and are designed for arrays
(both fixed in length and varying lengths) of fixed length data structures
(like typedef'ed types).
"Block" free-lists use the H5FL_BLK_<*> macros and are designed to hold
varying sized buffers of bytes, with no structure.
H5S.c contains examples for "regular" and fixed-sized arrays;
H5B.c contains examples for variable-sized arrays and "blocks".
A free-list doesn't have to be used for every data structure allocated and
freed, just for those which are prone to abuse when multiple operations are
being performed. It is important to use the macros for declaring and
manipulating the free-lists however; they allow the free'd objects on the
lists to be garbage collected by the library at the library's termination
or at the user's request.
One public API function has been added: H5garbage_collect, which garbage
collects on all the free-lists of all the different types. It's not required
to be called and is only necessary in situations when the application
performs actions which cause the library to allocate many objects and then
the application eventually releases those objects and wants to reduce the
memory used by the library from the peak usage required. The library
automatically garbage collects all the free lists when the application ends.
Questions should be sent to the THG Help Desk at 'help' at hdfgroup.org.
===========================================
BENCHMARK INFORMATION
===========================================
New version with free lists:
Datasets=500, Data Rewrites=15:
Peak Heap Usage: 18210820 bytes
Time in library code: 2.260 seconds
# of malloc calls: 22864
Datasets=1000, Data Rewrites=15:
Peak Heap Usage: 31932420 bytes
Time in library code: 5.090 seconds
# of malloc calls: 43045
Datasets=1500, Data Rewrites=15:
Peak Heap Usage: 41566212 bytes
Time in library code: 8.623 seconds
# of malloc calls: 60623
Datasets=500, Data Rewrites=30:
Peak Heap Usage: 19456004 bytes
Time in library code: 4.274 seconds
# of malloc calls: 23353
Datasets=1000, Data Rewrites=30:
Peak Heap Usage: 33988612 bytes
Time in library code: 9.955 seconds
# of malloc calls: 43855
Datasets=1500, Data Rewrites=30:
Peak Heap Usage: 43950084 bytes
Time in library code: 17.413 seconds
# of malloc calls: 61554
Datasets=500, Data Rewrites=45:
Peak Heap Usage: 20717572 bytes
Time in library code: 6.326 seconds
# of malloc calls: 23848
Datasets=1000, Data Rewrites=45:
Peak Heap Usage: 35807236 bytes
Time in library code: 15.146 seconds
# of malloc calls: 44572
Datasets=1500, Data Rewrites=45:
Peak Heap Usage: 46022660 bytes
Time in library code: 27.140 seconds
# of malloc calls: 62370
Older version with no free lists:
Datasets=500, Data Rewrites=15:
Peak Heap Usage: 25370628 bytes
Time in library code: 2.329 seconds
# of malloc calls: 194991
Datasets=1000, Data Rewrites=15:
Peak Heap Usage: 39550980 bytes
Time in library code: 5.251 seconds
# of malloc calls: 417971
Datasets=1500, Data Rewrites=15:
Peak Heap Usage: 68870148 bytes
Time in library code: 8.913 seconds
# of malloc calls: 676564
Datasets=500, Data Rewrites=30:
Peak Heap Usage: 31670276 bytes
Time in library code: 4.435 seconds
# of malloc calls: 370320
Datasets=1000, Data Rewrites=30:
Peak Heap Usage: 44646404 bytes
Time in library code: 10.325 seconds
# of malloc calls: 797125
Datasets=1500, Data Rewrites=30:
Peak Heap Usage: 68870148 bytes
Time in library code: 18.057 seconds
# of malloc calls: 1295336
Datasets=500, Data Rewrites=45:
Peak Heap Usage: 33906692 bytes
Time in library code: 6.577 seconds
# of malloc calls: 545656
Datasets=1000, Data Rewrites=45:
Peak Heap Usage: 56778756 bytes
Time in library code: 15.720 seconds
# of malloc calls: 1176285
Datasets=1500, Data Rewrites=45:
Peak Heap Usage: 68870148 bytes
Time in library code: 28.138 seconds
# of malloc calls: 1914097
===========================================
Last Modified: 3 May 2000
Last Modified: 27 March 2007 (Help Desk contact only)
THG Help Desk: 'help' at hdfgroup.org