HDF5 supports a filter pipeline that provides the capability for standard and customized raw data processing during I/O operations. HDF5 is distributed with a small set of standard filters such as compression (gzip, SZIP, and a shuffling algorithm) and error checking (Fletcher32 checksum). For further flexibility, the library allows a user application to extend the pipeline through the creation and registration of customized filters.
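The flexibility of the filter pipeline implementation enables the definition of additional filters by a user application. A filter is associated with a dataset when the dataset is created, can be used only with chunked data (i.e., datasets stored in the H5D_CHUNKED storage layout), and is applied independently to each chunk of the dataset.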
The HDF5 Library does not support filters for contiguous datasets because of the difficulty of implementing random access for partial I/O. Filters for compact datasets are not supported because they would not produce significant results.
Filter identifiers for the filters distributed with the HDF5 Library are as follows:
H5Z_FILTER_DEFLATE | The gzip compression, or deflation, filter |
H5Z_FILTER_SZIP | The SZIP compression filter |
H5Z_FILTER_NBIT | The N-bit compression filter |
H5Z_FILTER_SCALEOFFSET | The scale-offset compression filter |
H5Z_FILTER_SHUFFLE | The shuffle algorithm filter |
H5Z_FILTER_FLETCHER32 | The Fletcher32 checksum, or error checking, filter |
Custom filters that have been registered with the library will have additional unique identifiers.
See HDF5 Dynamically Loaded Filters for more information on how an HDF5 application can apply a filter that is not registered with the HDF5 Library.
H5Zfilter_avail(H5Z_filter_t filter)
H5Zfilter_avail determines if the filter specified in filter is available to the application. If the filter is a dynamic plugin, the function will load and register it.
H5Z_filter_t filter |
IN: Filter identifier. See the introduction to this section of the reference manual for a list of valid filter identifiers. |
SUBROUTINE h5zfilter_avail_f(filter, status, hdferr)
  IMPLICIT NONE
  INTEGER, INTENT(IN)  :: filter  ! Filter
                                  ! Valid values are:
                                  !    H5Z_FILTER_DEFLATE_F
                                  !    H5Z_FILTER_SHUFFLE_F
                                  !    H5Z_FILTER_FLETCHER32_F
                                  !    H5Z_FILTER_SZIP_F
  LOGICAL, INTENT(OUT) :: status  ! Flag indicating whether
                                  ! filter is available:
                                  !    .TRUE.
                                  !    .FALSE.
  INTEGER, INTENT(OUT) :: hdferr  ! Returns 0 if successful and -1 if fails
END SUBROUTINE h5zfilter_avail_f
Release | C |
1.6.0 | Function introduced in this release. |
H5Zget_filter_info(H5Z_filter_t filter, unsigned int *filter_config)
H5Zget_filter_info retrieves information about a filter. At present, this means that the function retrieves a filter's configuration flags, indicating whether the filter is configured to decode data, to encode data, neither, or both.
If filter_config is not set to NULL prior to the function call, the returned parameter contains a bit field specifying the available filter configuration. The configuration flag values can then be determined through a series of bitwise AND operations, as described below.
Valid filter configuration flags include the following:
H5Z_FILTER_CONFIG_ENCODE_ENABLED | Encoding is enabled for this filter. In Fortran, H5Z_FILTER_ENCODE_ENABLED_F. |
H5Z_FILTER_CONFIG_DECODE_ENABLED | Decoding is enabled for this filter. In Fortran, H5Z_FILTER_DECODE_ENABLED_F. |
(These flags are defined for C in the HDF5 Library source code file H5Zpublic.h.)
A bitwise AND of the returned filter_config and a valid filter configuration flag will reveal whether the related configuration option is available. For example, if the value of H5Z_FILTER_CONFIG_ENCODE_ENABLED & filter_config is true, i.e., not equal to 0 (zero), the queried filter is configured to encode data; if the value is FALSE, i.e., equal to 0 (zero), the filter is not so configured.
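The bitwise AND test described above can be sketched in C as follows. To keep the sketch compilable without the HDF5 headers, the two flag macros are local stand-ins (the `SKETCH_` names and their values are assumptions made for illustration); a real program would include `hdf5.h`, obtain `filter_config` from H5Zget_filter_info, and test it against H5Z_FILTER_CONFIG_ENCODE_ENABLED and H5Z_FILTER_CONFIG_DECODE_ENABLED.

```c
/* Local stand-ins so the sketch compiles without the HDF5 headers.
 * In a real program, use H5Z_FILTER_CONFIG_ENCODE_ENABLED and
 * H5Z_FILTER_CONFIG_DECODE_ENABLED from H5Zpublic.h instead. */
#define SKETCH_ENCODE_ENABLED 0x0001u
#define SKETCH_DECODE_ENABLED 0x0002u

/* Returns 1 if the encode bit is set in the bit field, 0 otherwise. */
int sketch_can_encode(unsigned int filter_config)
{
    return (filter_config & SKETCH_ENCODE_ENABLED) != 0;
}

/* Returns 1 if the decode bit is set in the bit field, 0 otherwise. */
int sketch_can_decode(unsigned int filter_config)
{
    return (filter_config & SKETCH_DECODE_ENABLED) != 0;
}

/* Summarizes the four possible configurations in words. */
const char *sketch_describe(unsigned int filter_config)
{
    int enc = sketch_can_encode(filter_config);
    int dec = sketch_can_decode(filter_config);

    if (enc && dec)
        return "encode and decode";
    if (enc)
        return "encode only";
    if (dec)
        return "decode only";
    return "neither";
}
```

Each flag is tested separately because the bit field can carry any combination of the two bits.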
If a filter is not encode-enabled, the corresponding H5Pset_* function will return an error if the filter is added to a dataset creation property list (which is required if the filter is to be used to encode that dataset). For example, if the H5Z_FILTER_CONFIG_ENCODE_ENABLED flag is not returned for the SZIP filter, H5Z_FILTER_SZIP, a call to H5Pset_szip will fail.
If a filter is not decode-enabled, the application will not be able to read an existing file encoded with that filter.
This function should be called, and the returned filter_config analyzed, before calling any other function, such as H5Pset_szip, that might require a particular filter configuration.
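H5Z_filter_t filter | IN: Identifier of the filter to query. |
unsigned int *filter_config | OUT: Bit field indicating whether the filter's encoder and/or decoder are available. |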
SUBROUTINE h5zget_filter_info_f(filter, config_flags, hdferr)
  IMPLICIT NONE
  INTEGER, INTENT(IN)  :: filter        ! Filter, may be one of the
                                        ! following:
                                        !    H5Z_FILTER_DEFLATE_F
                                        !    H5Z_FILTER_SHUFFLE_F
                                        !    H5Z_FILTER_FLETCHER32_F
                                        !    H5Z_FILTER_SZIP_F
  INTEGER, INTENT(OUT) :: config_flags  ! Bit field indicating whether
                                        ! a filter's encoder and/or
                                        ! decoder are available
  INTEGER, INTENT(OUT) :: hdferr        ! Error code
END SUBROUTINE h5zget_filter_info_f
Release | C |
1.6.3 | Function introduced in this release. Fortran subroutine introduced in this release. |
H5Zregister(const H5Z_class_t *filter_class)
H5Zregister registers a new filter with the HDF5 library.
Making a new filter available to an application is a two-step process. The first step is to write the three filter callback functions described below: can_apply, set_local, and filter. This call to H5Zregister, registering the filter with the library, is the second step. The can_apply and set_local fields can be set to NULL if they are not required for the filter being registered.
H5Zregister accepts a single parameter, a pointer to a buffer for the filter_class data structure. That data structure must conform to one of the following definitions:
typedef struct H5Z_class1_t {
    H5Z_filter_t id;
    const char *name;
    H5Z_can_apply_func_t can_apply;
    H5Z_set_local_func_t set_local;
    H5Z_func_t filter;
} H5Z_class1_t;

typedef struct H5Z_class2_t {
    int version;
    H5Z_filter_t id;
    unsigned encoder_present;
    unsigned decoder_present;
    const char *name;
    H5Z_can_apply_func_t can_apply;
    H5Z_set_local_func_t set_local;
    H5Z_func_t filter;
} H5Z_class2_t;
version is a library-defined value reporting the version number of the H5Z_class_t struct. This currently must be set to H5Z_CLASS_T_VERS.
id is the identifier for the new filter. This is a user-defined value between H5Z_FILTER_RESERVED and H5Z_FILTER_MAX. These values are defined in the HDF5 source file H5Zpublic.h, but the symbols H5Z_FILTER_RESERVED and H5Z_FILTER_MAX should always be used instead of the literal values.
encoder_present is a library-defined value indicating whether the filter’s encoding capability is available to the application.
decoder_present is a library-defined value indicating whether the filter’s decoding capability is available to the application.
name is a descriptive comment used for debugging. It may contain a descriptive name for the filter, and it may be the null pointer.
can_apply, described in detail below, is a user-defined callback function which determines whether the combination of the dataset creation property list values, the datatype, and the dataspace represents a valid combination to apply this filter to.
set_local, described in detail below, is a user-defined callback function which sets any parameters that are specific to this dataset, based on the combination of the dataset creation property list values, the datatype, and the dataspace.
filter, described in detail below, is a user-defined callback function which performs the action of the filter.
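As an illustration of how the fields above fit together, the following C sketch fills in a structure shaped like H5Z_class2_t for a do-nothing filter. So that it compiles without the HDF5 headers, every `sketch_` type and `SKETCH_` macro is a local stand-in, and the numeric values and the filter id are assumptions made for illustration. A real program would include `hdf5.h`, declare an `H5Z_class2_t`, use the library's H5Z_CLASS_T_VERS, H5Z_FILTER_RESERVED, and H5Z_FILTER_MAX symbols, and pass the structure's address to H5Zregister.

```c
#include <stddef.h>

/* ---- Local stand-ins for HDF5 declarations (assumptions, not the API) ---- */
typedef int sketch_filter_t;                 /* stands in for H5Z_filter_t */
typedef int sketch_hid_t;                    /* stands in for hid_t */
typedef int (*sketch_can_apply_t)(sketch_hid_t, sketch_hid_t, sketch_hid_t);
typedef int (*sketch_set_local_t)(sketch_hid_t, sketch_hid_t, sketch_hid_t);
typedef size_t (*sketch_func_t)(unsigned int flags, size_t cd_nelmts,
                                const unsigned int cd_values[], size_t nbytes,
                                size_t *buf_size, void **buf);

#define SKETCH_CLASS_T_VERS    1      /* assumed value of H5Z_CLASS_T_VERS */
#define SKETCH_FILTER_RESERVED 256    /* assumed value of H5Z_FILTER_RESERVED */
#define SKETCH_FILTER_MAX      65535  /* assumed value of H5Z_FILTER_MAX */

typedef struct sketch_class2_t {      /* same field order as H5Z_class2_t */
    int version;
    sketch_filter_t id;
    unsigned encoder_present;
    unsigned decoder_present;
    const char *name;
    sketch_can_apply_t can_apply;
    sketch_set_local_t set_local;
    sketch_func_t filter;
} sketch_class2_t;
/* ---- End of stand-ins ---- */

/* Pass-through filter: leaves the buffer alone and reports that all
 * nbytes of it are still valid. */
static size_t sketch_noop(unsigned int flags, size_t cd_nelmts,
                          const unsigned int cd_values[], size_t nbytes,
                          size_t *buf_size, void **buf)
{
    (void)flags; (void)cd_nelmts; (void)cd_values;
    (void)buf_size; (void)buf;
    return nbytes;
}

/* Filter class with no can_apply or set_local callbacks. */
const sketch_class2_t SKETCH_NOOP_CLASS = {
    .version = SKETCH_CLASS_T_VERS,
    .id = SKETCH_FILTER_RESERVED + 1,   /* hypothetical user-defined id */
    .encoder_present = 1,               /* encoding capability available */
    .decoder_present = 1,               /* decoding capability available */
    .name = "sketch no-op filter",
    .can_apply = NULL,
    .set_local = NULL,
    .filter = sketch_noop,
};

/* Checks that an id lies in the user-defined range, treating both
 * bounds as inclusive (an assumption about the word "between"). */
int sketch_id_in_user_range(sketch_filter_t id)
{
    return id >= SKETCH_FILTER_RESERVED && id <= SKETCH_FILTER_MAX;
}
```

Designated initializers are used so that each value is visibly tied to its field name, which helps when switching between the two structure layouts.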
The statistics associated with a filter are not reset by this function; they accumulate over the life of the library.
H5Z_class_t is a macro which maps to either H5Z_class1_t or H5Z_class2_t, depending on the needs of the application. To affect only this macro, H5Z_class_t_vers may be defined to either 1 or 2. Otherwise, it will behave in the same manner as other API compatibility macros. See “API Compatibility Macros in HDF5” for more information. H5Z_class1_t matches the H5Z_class_t structure that is used in the 1.6.x versions of the HDF5 library.
H5Zregister will automatically detect which structure type has been passed in, regardless of the mapping of the H5Z_class_t macro. However, the application must make sure that the fields are filled in according to the correct structure definition if the macro is used to declare the structure.
The callback functions
Before H5Zregister can link a filter into an application, three callback functions must be defined as described in the HDF5 Library header file H5Zpublic.h.
When a filter is applied to the fractal heap for a group (e.g., when compressing group metadata) and if the can apply and set local callback functions have been defined for that filter, HDF5 passes the value -1 for all parameters for those callback functions. This is done to ensure that the filter will not be applied to groups if it relies on these parameters, as they are not applicable to group fractal heaps; to operate on group fractal heaps, a filter must be capable of operating on an opaque block of binary data.
The can apply callback function is defined as follows:
typedef htri_t (*H5Z_can_apply_func_t)(hid_t dcpl_id, hid_t type_id, hid_t space_id)
Before a dataset is created, the can apply callbacks for any filters used in the dataset creation property list are called with the dataset's dataset creation property list, dcpl_id, the dataset's datatype, type_id, and a dataspace describing a chunk, space_id (for chunked dataset storage).
This callback must determine whether the combination of the dataset creation property list settings, the datatype, and the dataspace represents a valid combination to which to apply this filter. For example, an invalid combination may involve the filter not operating correctly on certain datatypes, on certain datatype sizes, or on certain sizes of the chunk dataspace. If this filter is enabled through H5Pset_filter as optional and the can apply function returns FALSE, the library will skip the filter in the filter pipeline.
This callback can be the NULL pointer, in which case the library will assume that the filter can be applied to a dataset with any combination of dataset creation property list values, datatypes, and dataspaces.
The can apply callback function must return a positive value for a valid combination, zero for an invalid combination, and a negative value for an error.
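The three-way return convention, combined with the optional-filter rule stated earlier, can be modeled with a small C sketch. This is an illustrative model of the decision, not library code: the `sketch_` names are hypothetical, and the treatment of a filter that is not optional (a zero result is reported as a failure) is an assumption that goes beyond what this section states.

```c
/* Possible outcomes for one filter in the pipeline (hypothetical names). */
typedef enum {
    SKETCH_APPLY,  /* valid combination: the filter is used */
    SKETCH_SKIP,   /* invalid combination, optional filter: left out */
    SKETCH_FAIL    /* callback error, or invalid combination for a
                      filter that is not optional (assumed behavior) */
} sketch_action_t;

/* can_apply_result: value returned by a can apply callback
 *                   (positive = valid, zero = invalid, negative = error).
 * is_optional:      nonzero if the filter was added as optional. */
sketch_action_t sketch_decide(int can_apply_result, int is_optional)
{
    if (can_apply_result < 0)
        return SKETCH_FAIL;
    if (can_apply_result > 0)
        return SKETCH_APPLY;
    return is_optional ? SKETCH_SKIP : SKETCH_FAIL;
}
```

Checking the negative case first keeps an error from being mistaken for an ordinary "invalid combination" answer.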
The set local callback function is defined as follows:
typedef herr_t (*H5Z_set_local_func_t)(hid_t dcpl_id, hid_t type_id, hid_t space_id)
After the can apply callbacks are checked for a new dataset, the set local callback functions for any filters used in the dataset creation property list are called. These callbacks receive dcpl_id, the dataset's private copy of the dataset creation property list passed in to H5Dcreate (i.e., not the actual property list passed in to H5Dcreate); type_id, the datatype identifier passed in to H5Dcreate, which is not copied and should not be modified; and space_id, a dataspace describing the chunk (for chunked dataset storage), which should also not be modified.
The set local callback must set any filter parameters that are specific to this dataset, based on the combination of the dataset creation property list values, the datatype, and the dataspace. For example, some filters perform different actions based on different datatypes, datatype sizes, numbers of dimensions, or dataspace sizes.
The set local callback may be the NULL pointer, in which case the library will assume that there are no dataset-specific settings for this filter.
The set local callback function must return a non-negative value on success and a negative value for an error.
The filter operation callback function, defining the filter's operation on the data, is defined as follows:
typedef size_t (*H5Z_func_t)(unsigned int flags, size_t cd_nelmts, const unsigned int cd_values[], size_t nbytes, size_t *buf_size, void **buf)
The parameters flags, cd_nelmts, and cd_values are the same as for the function H5Pset_filter. The one exception is that an additional flag, H5Z_FLAG_REVERSE, is set when the filter is called as part of the input pipeline.
The parameter *buf points to the input buffer which has a size of *buf_size bytes, nbytes of which are valid data.
The filter should perform the transformation in place if possible. If the transformation cannot be done in place, then the filter should allocate a new buffer with malloc() and assign it to *buf, assigning the allocated size of that buffer to *buf_size. The old buffer should be freed by calling free().
If successful, the filter operation callback function returns the number of valid bytes of data contained in *buf. In the case of failure, the return value is 0 (zero) and all pointer arguments are left unchanged.
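The buffer rules above can be illustrated with a toy filter written against the H5Z_func_t parameter list. This is a minimal sketch, not a filter shipped with HDF5: the function name is hypothetical, and `SKETCH_FLAG_REVERSE` is a local stand-in (with an assumed value) for H5Z_FLAG_REVERSE so that the code compiles without the HDF5 headers. On output it appends a one-byte XOR checksum, growing the buffer when there is no spare room; on input it verifies and strips that byte in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for H5Z_FLAG_REVERSE (assumed value); a real filter would
 * include H5Zpublic.h and use the library's symbol. */
#define SKETCH_FLAG_REVERSE 0x0100u

/* Hypothetical toy filter with the H5Z_func_t parameter list. */
size_t sketch_xor_filter(unsigned int flags, size_t cd_nelmts,
                         const unsigned int cd_values[], size_t nbytes,
                         size_t *buf_size, void **buf)
{
    unsigned char *data;
    unsigned char sum = 0;
    size_t i;

    (void)cd_nelmts;   /* this toy filter takes no client data */
    (void)cd_values;
    assert(buf != NULL && buf_size != NULL);
    data = (unsigned char *)*buf;

    if (flags & SKETCH_FLAG_REVERSE) {
        /* Input pipeline: the last valid byte is the checksum. At least
         * one payload byte is required so that success never returns 0. */
        if (nbytes < 2)
            return 0;
        for (i = 0; i + 1 < nbytes; i++)
            sum ^= data[i];
        if (sum != data[nbytes - 1])
            return 0;              /* failure: *buf and *buf_size untouched */
        return nbytes - 1;         /* transformation done in place */
    }

    /* Output pipeline: compute the checksum of the valid bytes. */
    if (nbytes == 0)
        return 0;
    for (i = 0; i < nbytes; i++)
        sum ^= data[i];

    if (nbytes + 1 > *buf_size) {
        /* No room to work in place: allocate a new buffer, copy the
         * valid data, free the old buffer, and publish the new one. */
        unsigned char *bigger = (unsigned char *)malloc(nbytes + 1);
        if (bigger == NULL)
            return 0;              /* failure: pointers left unchanged */
        memcpy(bigger, data, nbytes);
        free(data);
        *buf = bigger;
        *buf_size = nbytes + 1;
        data = bigger;
    }
    data[nbytes] = sum;
    return nbytes + 1;             /* number of valid bytes now in *buf */
}
```

The old buffer is freed only after the new allocation succeeds, so every failure path leaves the caller's pointers exactly as they were.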
If a C routine that takes a function pointer as an argument is called from within C++ code, the C routine should be returned from normally.
Examples of this kind of routine include callbacks such as H5Pset_elink_cb and H5Pset_type_conv_cb and functions such as H5Tconvert and H5Ewalk2.
Exiting the routine in its normal fashion allows the HDF5 C Library to clean up its work properly. In other words, if the C++ application jumps out of the routine back to the C++ “catch” statement, the library is not given the opportunity to close any temporary data structures that were set up when the routine was called. The C++ application should save some state as the routine is started so that any problem that occurs might be diagnosed.
const H5Z_class_t *filter_class |
IN: A pointer to a buffer for the struct containing filter-definition information. |
Release | Change |
1.6.0 | This function was substantially revised in Release 1.6.0 with a new H5Z_class_t struct and new set local and can apply callback functions. |
1.8.0 | The fields version, encoder_present, and decoder_present were added to the H5Z_class_t struct in this release. |
1.8.3 | H5Z_class_t renamed to H5Z_class2_t, H5Z_class1_t structure introduced for backwards compatibility with release 1.6.x, and H5Z_class_t macro introduced in this release. Function modified to accept either structure type. |
1.8.5 | Semantics of the can apply and set local callback functions changed to accommodate the use of filters with group fractal heaps. |
1.8.6 | Return type for the can apply callback function, H5Z_can_apply_func_t, changed to htri_t. |
H5Zunregister(H5Z_filter_t filter)
H5Zunregister unregisters the filter specified in filter.
This function first iterates through all opened datasets and groups. If an open object that uses this filter is found, the function will fail with a message indicating that an object using the filter is still open. All open files are then flushed to make sure that all cached data that may use this filter are written out.
If the application is a parallel program, all processes that participate in collective data write should call this function to ensure that all data is flushed.
After a call to H5Zunregister, the filter specified in filter will no longer be available to the application.
H5Z_filter_t filter |
IN: Identifier of the filter to be unregistered. See the introduction to this section of the reference manual for a list of identifiers for standard filters distributed with the HDF5 Library. |
SUBROUTINE h5zunregister_f(filter, hdferr)
  IMPLICIT NONE
  INTEGER, INTENT(IN)  :: filter  ! Filter; one of the possible values:
                                  !    H5Z_FILTER_DEFLATE_F
                                  !    H5Z_FILTER_SHUFFLE_F
                                  !    H5Z_FILTER_FLETCHER32_F
                                  !    H5Z_FILTER_SZIP_F
  INTEGER, INTENT(OUT) :: hdferr  ! Error code
                                  ! 0 on success, and -1 on failure
END SUBROUTINE h5zunregister_f
Release | C |
1.8.12 | Function modified to check for open objects using the filter. |
1.6.0 | Function introduced in this release. |
Describes HDF5 Release 1.8.20, November 2017. |
Copyright by The HDF Group and the Board of Trustees of the University of Illinois |