HDF5  1.14.4.3
API Reference
 
Dataset Creation Properties

Detailed Description

Use dataset creation properties to control aspects of dataset creation such as fill time, storage layout, compression methods, etc. Unlike dataset access and transfer properties, creation properties are stored with the dataset, and cannot be changed once a dataset has been created.

Dataset creation property list functions (H5P)
Function Purpose
H5Pset_layout Sets the type of storage used to store the raw data for a dataset.
H5Pget_layout Returns the layout of the raw data for a dataset.
H5Pset_chunk Sets the size of the chunks used to store a chunked layout dataset.
H5Pget_chunk Retrieves the size of chunks for the raw data of a chunked layout dataset.
H5Pset_chunk_opts/H5Pget_chunk_opts Sets/gets the edge chunk option setting from a dataset creation property list.
H5Pset_deflate Sets compression method and compression level.
H5Pset_fill_value Sets the fill value for a dataset.
H5Pget_fill_value Retrieves a dataset fill value.
H5Pfill_value_defined Determines whether the fill value is defined.
H5Pset_fill_time Sets the time when fill values are written to a dataset.
H5Pget_fill_time Retrieves the time when fill values are written to a dataset.
H5Pset_alloc_time Sets the timing for storage space allocation.
H5Pget_alloc_time Retrieves the timing for storage space allocation.
H5Pset_filter Adds a filter to the filter pipeline.
H5Pall_filters_avail Verifies that all required filters are available.
H5Pget_nfilters Returns the number of filters in the pipeline.
H5Pget_filter Returns information about a filter in a pipeline. The C function is a macro; see API Compatibility Macros.
H5Pget_filter_by_id Returns information about the specified filter. The C function is a macro; see API Compatibility Macros.
H5Pmodify_filter Modifies a filter in the filter pipeline.
H5Premove_filter Deletes one or more filters in the filter pipeline.
H5Pset_fletcher32 Sets up use of the Fletcher32 checksum filter.
H5Pset_nbit Sets up use of the n-bit filter.
H5Pset_scaleoffset Sets up use of the scale-offset filter.
H5Pset_shuffle Sets up use of the shuffle filter.
H5Pset_szip Sets up use of the Szip compression filter.
H5Pset_external Adds an external file to the list of external files.
H5Pget_external_count Returns the number of external files for a dataset.
H5Pget_external Returns information about an external file.
H5Pset_char_encoding Sets the character encoding used to encode a string. Use to set ASCII or UTF-8 character encoding for object names.
H5Pget_char_encoding Retrieves the character encoding used to create a string.
H5Pset_virtual Sets the mapping between virtual and source datasets.
H5Pget_virtual_count Gets the number of mappings for the virtual dataset.
H5Pget_virtual_dsetname Gets the name of a source dataset used in the mapping.
H5Pget_virtual_filename Gets the filename of a source dataset used in the mapping.
H5Pget_virtual_srcspace Gets a dataspace identifier for the selection within the source dataset used in the mapping.
H5Pget_virtual_vspace Gets a dataspace identifier for the selection within the virtual dataset used in the mapping.
H5Pset_dset_no_attrs_hint/H5Pget_dset_no_attrs_hint Sets/gets the flag to create minimized dataset object headers.

Functions

htri_t H5Pall_filters_avail (hid_t plist_id)
 Verifies that all required filters are available.
 
herr_t H5Pset_deflate (hid_t plist_id, unsigned level)
 Sets deflate (GNU gzip) compression method and compression level.
 
herr_t H5Pfill_value_defined (hid_t plist, H5D_fill_value_t *status)
 Determines whether fill value is defined.
 
herr_t H5Pget_alloc_time (hid_t plist_id, H5D_alloc_time_t *alloc_time)
 Retrieves the timing for storage space allocation.
 
int H5Pget_chunk (hid_t plist_id, int max_ndims, hsize_t dim[])
 Retrieves the size of chunks for the raw data of a chunked layout dataset.
 
herr_t H5Pget_chunk_opts (hid_t plist_id, unsigned *opts)
 Retrieves the edge chunk option setting from a dataset creation property list.
 
herr_t H5Pget_dset_no_attrs_hint (hid_t dcpl_id, hbool_t *minimize)
 Retrieves the setting for whether or not to create minimized dataset object headers.
 
herr_t H5Pget_external (hid_t plist_id, unsigned idx, size_t name_size, char *name, off_t *offset, hsize_t *size)
 Returns information about an external file.
 
int H5Pget_external_count (hid_t plist_id)
 Returns the number of external files for a dataset.
 
herr_t H5Pget_fill_time (hid_t plist_id, H5D_fill_time_t *fill_time)
 Retrieves the time when fill values are written to a dataset.
 
herr_t H5Pget_fill_value (hid_t plist_id, hid_t type_id, void *value)
 Retrieves a dataset fill value.
 
H5D_layout_t H5Pget_layout (hid_t plist_id)
 Returns the layout of the raw data for a dataset.
 
herr_t H5Pget_virtual_count (hid_t dcpl_id, size_t *count)
 Gets the number of mappings for the virtual dataset.
 
ssize_t H5Pget_virtual_dsetname (hid_t dcpl_id, size_t index, char *name, size_t size)
 Gets the name of a source dataset used in the mapping.
 
ssize_t H5Pget_virtual_filename (hid_t dcpl_id, size_t index, char *name, size_t size)
 Gets the filename of a source dataset used in the mapping.
 
hid_t H5Pget_virtual_srcspace (hid_t dcpl_id, size_t index)
 Gets a dataspace identifier for the selection within the source dataset used in the mapping.
 
hid_t H5Pget_virtual_vspace (hid_t dcpl_id, size_t index)
 Gets a dataspace identifier for the selection within the virtual dataset used in the mapping.
 
herr_t H5Pset_alloc_time (hid_t plist_id, H5D_alloc_time_t alloc_time)
 Sets the timing for storage space allocation.
 
herr_t H5Pset_chunk (hid_t plist_id, int ndims, const hsize_t dim[])
 Sets the size of the chunks used to store a chunked layout dataset.
 
herr_t H5Pset_chunk_opts (hid_t plist_id, unsigned opts)
 Sets the edge chunk option in a dataset creation property list.
 
herr_t H5Pset_dset_no_attrs_hint (hid_t dcpl_id, hbool_t minimize)
 Sets the flag to create minimized dataset object headers.
 
herr_t H5Pset_external (hid_t plist_id, const char *name, off_t offset, hsize_t size)
 Adds an external file to the list of external files.
 
herr_t H5Pset_fill_time (hid_t plist_id, H5D_fill_time_t fill_time)
 Sets the time when fill values are written to a dataset.
 
herr_t H5Pset_fill_value (hid_t plist_id, hid_t type_id, const void *value)
 Sets the fill value for a dataset.
 
herr_t H5Pset_shuffle (hid_t plist_id)
 Sets up use of the shuffle filter.
 
herr_t H5Pset_layout (hid_t plist_id, H5D_layout_t layout)
 Sets the type of storage used to store the raw data for a dataset.
 
herr_t H5Pset_nbit (hid_t plist_id)
 Sets up the use of the N-Bit filter.
 
herr_t H5Pset_scaleoffset (hid_t plist_id, H5Z_SO_scale_type_t scale_type, int scale_factor)
 Sets up the use of the scale-offset filter.
 
herr_t H5Pset_szip (hid_t plist_id, unsigned options_mask, unsigned pixels_per_block)
 Sets up use of the SZIP compression filter.
 
herr_t H5Pset_virtual (hid_t dcpl_id, hid_t vspace_id, const char *src_file_name, const char *src_dset_name, hid_t src_space_id)
 Sets the mapping between virtual and source datasets.
 
H5Z_filter_t H5Pget_filter1 (hid_t plist_id, unsigned filter, unsigned int *flags, size_t *cd_nelmts, unsigned cd_values[], size_t namelen, char name[])
 Returns information about a filter in a pipeline (DEPRECATED)
 
herr_t H5Pget_filter_by_id1 (hid_t plist_id, H5Z_filter_t id, unsigned int *flags, size_t *cd_nelmts, unsigned cd_values[], size_t namelen, char name[])
 Returns information about the specified filter.
 

Function Documentation

◆ H5Pall_filters_avail()

htri_t H5Pall_filters_avail ( hid_t  plist_id)

Verifies that all required filters are available.

Parameters
[in] plist_id Property list identifier
Returns
Returns a positive value (true), zero (false), or a negative value (failure).

H5Pall_filters_avail() verifies that all of the filters set in the dataset or group creation property list plist_id are currently available.

Version
1.8.5 Function extended to work with group creation property lists.
Since
1.6.0

◆ H5Pfill_value_defined()

herr_t H5Pfill_value_defined ( hid_t  plist,
H5D_fill_value_t *  status 
)

Determines whether fill value is defined.

Parameters
[in] plist Dataset creation property list identifier
[out] status Status of fill value in property list
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pfill_value_defined() determines whether a fill value is defined in the dataset creation property list plist. Valid values returned in status are as follows:

H5D_FILL_VALUE_UNDEFINED Fill value is undefined.
H5D_FILL_VALUE_DEFAULT Fill value is the library default.
H5D_FILL_VALUE_USER_DEFINED Fill value is defined by the application.
Since
1.6.0

◆ H5Pget_alloc_time()

herr_t H5Pget_alloc_time ( hid_t  plist_id,
H5D_alloc_time_t *  alloc_time 
)

Retrieves the timing for storage space allocation.

Parameters
[in] plist_id Dataset creation property list identifier
[out] alloc_time The timing setting for allocating dataset storage space
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pget_alloc_time() retrieves the timing for allocating storage space for a dataset's raw data. This property is set in the dataset creation property list plist_id. The timing setting is returned in alloc_time as one of the following values:

H5D_ALLOC_TIME_DEFAULT Uses the default allocation time, based on the dataset storage method. See the alloc_time description in H5Pset_alloc_time() for default allocation times for various storage methods.
H5D_ALLOC_TIME_EARLY All space is allocated when the dataset is created.
H5D_ALLOC_TIME_INCR Space is allocated incrementally as data is written to the dataset.
H5D_ALLOC_TIME_LATE All space is allocated when data is first written to the dataset.
Note
H5Pget_alloc_time() is designed to work in concert with the dataset fill value and fill value write time properties, retrieved with the functions H5Pget_fill_value() and H5Pget_fill_time().
Since
1.6.0

◆ H5Pget_chunk()

int H5Pget_chunk ( hid_t  plist_id,
int  max_ndims,
hsize_t  dim[] 
)

Retrieves the size of chunks for the raw data of a chunked layout dataset.

Parameters
[in] plist_id Dataset creation property list identifier
[in] max_ndims Size of the dim array
[out] dim Array to store the chunk dimensions
Returns
Returns chunk dimensionality if successful; otherwise returns a negative value.

H5Pget_chunk() retrieves the size of chunks for the raw data of a chunked layout dataset. This function is only valid for dataset creation property lists. At most, max_ndims elements of dim will be initialized.

Since
1.0.0

◆ H5Pget_chunk_opts()

herr_t H5Pget_chunk_opts ( hid_t  plist_id,
unsigned *  opts 
)

Retrieves the edge chunk option setting from a dataset creation property list.

Parameters
[in] plist_id Dataset creation property list identifier
[out] opts Edge chunk option flag. Valid values are described in H5Pset_chunk_opts(). The option status can be retrieved using the bitwise AND operator ( & ). For example, the expression (opts & H5D_CHUNK_DONT_FILTER_PARTIAL_CHUNKS) will evaluate to H5D_CHUNK_DONT_FILTER_PARTIAL_CHUNKS if that option has been enabled. Otherwise, it will evaluate to 0 (zero).
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pget_chunk_opts() retrieves the edge chunk option setting stored in the dataset creation property list plist_id.

Since
1.10.0

◆ H5Pget_dset_no_attrs_hint()

herr_t H5Pget_dset_no_attrs_hint ( hid_t  dcpl_id,
hbool_t *  minimize 
)

Retrieves the setting for whether or not to create minimized dataset object headers.

Parameters
[in] dcpl_id Dataset creation property list identifier
[out] minimize Flag indicating whether the library will or will not create minimized dataset object headers
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pget_dset_no_attrs_hint() retrieves the no dataset attributes hint setting for the dataset creation property list dcpl_id. This setting is used to inform the library to create minimized dataset object headers when true. The setting value is returned in the boolean pointer minimize.

Since
1.10.5

◆ H5Pget_external()

herr_t H5Pget_external ( hid_t  plist_id,
unsigned  idx,
size_t  name_size,
char *  name,
off_t *  offset,
hsize_t *  size 
)

Returns information about an external file.

Parameters
[in] plist_id Dataset creation property list identifier
[in] idx External file index
[in] name_size Maximum length of name array
[out] name Name of the external file
[out] offset Pointer to a location to return an offset value
[out] size Pointer to a location to return the size of the external file data
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pget_external() returns information about an external file. The external file is specified by its index, idx, which is a number from zero to N-1, where N is the value returned by H5Pget_external_count(). At most name_size characters are copied into the name array. If the external file name, including its null terminator, is longer than name_size, the returned name is not null terminated (similar to strncpy()).

If name_size is zero or name is the null pointer, the external file name is not returned. If offset or size are null pointers then the corresponding information is not returned.

Note
On Windows, off_t is typically a 32-bit signed long value, which limits the valid offset that can be returned to 2 GiB.
Version
1.6.4 idx parameter type changed to unsigned.
Since
1.0.0

◆ H5Pget_external_count()

int H5Pget_external_count ( hid_t  plist_id)

Returns the number of external files for a dataset.

Parameters
[in]plist_idDataset creation property list identifier
Returns
Returns the number of external files if successful; otherwise returns a negative value.

H5Pget_external_count() returns the number of external files for the specified dataset.

Since
1.0.0

◆ H5Pget_fill_time()

herr_t H5Pget_fill_time ( hid_t  plist_id,
H5D_fill_time_t *  fill_time 
)

Retrieves the time when fill values are written to a dataset.

Parameters
[in] plist_id Dataset creation property list identifier
[out] fill_time Setting for the timing of writing fill values to the dataset
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pget_fill_time() examines the dataset creation property list plist_id to determine when fill values are to be written to a dataset. Valid values returned in fill_time are as follows:

H5D_FILL_TIME_IFSET Fill values are written to the dataset when storage space is allocated only if there is a user-defined fill value, i.e., one set with H5Pset_fill_value(). (Default)
H5D_FILL_TIME_ALLOC Fill values are written to the dataset when storage space is allocated.
H5D_FILL_TIME_NEVER Fill values are never written to the dataset.
Note
H5Pget_fill_time() is designed to work in coordination with the dataset fill value and dataset storage allocation time properties, retrieved with the functions H5Pget_fill_value() and H5Pget_alloc_time().
Since
1.6.0

◆ H5Pget_fill_value()

herr_t H5Pget_fill_value ( hid_t  plist_id,
hid_t  type_id,
void *  value 
)

Retrieves a dataset fill value.

Parameters
[in] plist_id Dataset creation property list identifier
[in] type_id Datatype identifier for the value passed via value
[out] value Pointer to buffer to contain the returned fill value
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pget_fill_value() returns the dataset fill value defined in the dataset creation property list plist_id. The fill value is returned through the value pointer and will be converted to the datatype specified by type_id. This datatype may differ from the fill value datatype in the property list, but the HDF5 library must be able to convert between the two datatypes.

If the fill value is undefined, i.e., set to NULL in the property list, H5Pget_fill_value() will return an error. H5Pfill_value_defined() should be used to check for this condition before H5Pget_fill_value() is called.

Memory must be allocated by the calling application.

Note
H5Pget_fill_value() is designed to coordinate with the dataset storage allocation time and fill value write time properties, which can be retrieved with the functions H5Pget_alloc_time() and H5Pget_fill_time(), respectively.
Since
1.0.0

◆ H5Pget_filter1()

H5Z_filter_t H5Pget_filter1 ( hid_t  plist_id,
unsigned  filter,
unsigned int *  flags,
size_t *  cd_nelmts,
unsigned  cd_values[],
size_t  namelen,
char  name[] 
)

Returns information about a filter in a pipeline (DEPRECATED)

Parameters
[in] plist_id Property list identifier
[in] filter Sequence number within the filter pipeline of the filter for which information is sought
[out] flags Bit vector specifying certain general properties of the filter
[in,out] cd_nelmts Number of elements in cd_values
[out] cd_values Auxiliary data for the filter
[in] namelen Anticipated number of characters in name
[out] name Name of the filter
Returns
Returns the filter identifier if successful; otherwise returns a negative value. See: H5Z_filter_t
Deprecated:
As of HDF5-1.8.0, when H5Pget_filter() was renamed to H5Pget_filter1(), this function was deprecated.

H5Pget_filter1() returns information about a filter, specified by its filter number, in a filter pipeline, specified by the property list with which it is associated.

plist_id must be a dataset or group creation property list.

filter is a value between zero and N-1, as described in H5Pget_nfilters(). The function will return a negative value if the filter number is out of range.

The structure of the flags argument is discussed in H5Pset_filter().

On input, cd_nelmts indicates the number of entries in the cd_values array, as allocated by the caller; on return, cd_nelmts contains the number of values defined by the filter.

If name is a pointer to an array of at least namelen bytes, the filter name will be copied into that array. The name will be null terminated if namelen is large enough. The filter name returned will be the name appearing in the file, the name registered for the filter, or an empty string.

Version
1.8.5 Function extended to work with group creation property lists.
1.8.0 N-bit and scale-offset filters added.
1.8.0 Function H5Pget_filter() renamed to H5Pget_filter1() and deprecated in this release.
1.6.4 filter parameter type changed to unsigned.
Since
1.0.0

◆ H5Pget_filter_by_id1()

herr_t H5Pget_filter_by_id1 ( hid_t  plist_id,
H5Z_filter_t  id,
unsigned int *  flags,
size_t *  cd_nelmts,
unsigned  cd_values[],
size_t  namelen,
char  name[] 
)

Returns information about the specified filter.

Parameters
[in] plist_id Property list identifier
[in] id Filter identifier
[out] flags Bit vector specifying certain general properties of the filter
[in,out] cd_nelmts Number of elements in cd_values
[out] cd_values Auxiliary data for the filter
[in] namelen Anticipated number of characters in name
[out] name Name of the filter
Returns
Returns a non-negative value if successful; otherwise returns a negative value.
Deprecated:
As of HDF5-1.8 this function was deprecated in favor of H5Pget_filter_by_id2() or the macro H5Pget_filter_by_id().

H5Pget_filter_by_id1() returns information about a filter, specified in id, a filter identifier.

plist_id must be a dataset or group creation property list and id must be in the associated filter pipeline.

The id and flags parameters are used in the same manner as described in the discussion of H5Pset_filter().

Aside from the fact that they are used for output, the parameters cd_nelmts and cd_values[] are used in the same manner as described in the discussion of H5Pset_filter(). On input, the cd_nelmts parameter indicates the number of entries in the cd_values[] array allocated by the calling program; on exit it contains the number of values defined by the filter.

On input, the namelen parameter indicates the number of characters allocated for the filter name by the calling program in the array name[]. On exit name[] contains the name of the filter with one character of the name in each element of the array.

If the filter specified in id is not set for the property list, the function fails and returns a negative value.

Version
1.8.5 Function extended to work with group creation property lists.
1.8.0 Function H5Pget_filter_by_id() renamed to H5Pget_filter_by_id1() and deprecated in this release.
Since
1.6.0

◆ H5Pget_layout()

H5D_layout_t H5Pget_layout ( hid_t  plist_id)

Returns the layout of the raw data for a dataset.

Parameters
[in] plist_id Dataset creation property list identifier
Returns
Returns the layout type (a non-negative value) of a dataset creation property list if successful. Valid return values are:
  • H5D_COMPACT: Raw data is stored in the object header in the file.
  • H5D_CONTIGUOUS: Raw data is stored separately from the object header in one contiguous chunk in the file.
  • H5D_CHUNKED: Raw data is stored separately from the object header in chunks in separate locations in the file.
  • H5D_VIRTUAL: Raw data is drawn from multiple datasets in different files.
Otherwise, returns a negative value indicating failure.

H5Pget_layout() returns the layout of the raw data for a dataset. This function is only valid for dataset creation property lists.

Note that a compact storage layout may affect writing data to the dataset with parallel applications. See the H5Dwrite() documentation for details.

Version
1.10.0 H5D_VIRTUAL added in this release.
Since
1.0.0

◆ H5Pget_virtual_count()

herr_t H5Pget_virtual_count ( hid_t  dcpl_id,
size_t *  count 
)

Gets the number of mappings for the virtual dataset.

Parameters
[in] dcpl_id Dataset creation property list identifier
[out] count The number of mappings
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pget_virtual_count() gets the number of mappings for a virtual dataset that has the creation property list specified by dcpl_id.

See also
Supporting Functions: H5Pget_layout(), H5Pset_layout(), H5Sget_regular_hyperslab(), H5Sis_regular_hyperslab(), H5Sselect_hyperslab()
VDS Functions: H5Pget_virtual_count(), H5Pget_virtual_dsetname(), H5Pget_virtual_filename(), H5Pget_virtual_prefix(), H5Pget_virtual_printf_gap(), H5Pget_virtual_srcspace(), H5Pget_virtual_view(), H5Pget_virtual_vspace(), H5Pset_virtual(), H5Pset_virtual_prefix(), H5Pset_virtual_printf_gap(), H5Pset_virtual_view()
Since
1.10.0

◆ H5Pget_virtual_dsetname()

ssize_t H5Pget_virtual_dsetname ( hid_t  dcpl_id,
size_t  index,
char *  name,
size_t  size 
)

Gets the name of a source dataset used in the mapping.

Parameters
[in] dcpl_id Dataset creation property list identifier
[in] index Mapping index. The value of index is 0 (zero) or greater and less than count (0 ≤ index < count), where count is the number of mappings returned by H5Pget_virtual_count().
[out] name A buffer containing the name of the source dataset
[in] size The size, in bytes, of the name buffer. Must be the size of the dataset name in bytes plus 1 for a NULL terminator
Returns
Returns the length of the dataset name if successful; otherwise returns a negative value.

H5Pget_virtual_dsetname() takes the dataset creation property list for the virtual dataset, dcpl_id, the mapping index, index, the size of the dataset name for a source dataset, size, and retrieves the name of the source dataset used in the mapping.

Up to size characters of the dataset name are returned in name; additional characters, if any, are not returned to the user application.

If the length of the dataset name, which determines the required value of size, is unknown, a preliminary call to H5Pget_virtual_dsetname() with the last two parameters set to NULL and zero respectively can be made. The return value of this call will be the size in bytes of the dataset name. That value, plus 1 for a NULL terminator, must then be assigned to size for a second H5Pget_virtual_dsetname() call, which will retrieve the actual dataset name.

See also
Supporting Functions: H5Pget_layout(), H5Pset_layout(), H5Sget_regular_hyperslab(), H5Sis_regular_hyperslab(), H5Sselect_hyperslab()
VDS Functions: H5Pget_virtual_count(), H5Pget_virtual_dsetname(), H5Pget_virtual_filename(), H5Pget_virtual_prefix(), H5Pget_virtual_printf_gap(), H5Pget_virtual_srcspace(), H5Pget_virtual_view(), H5Pget_virtual_vspace(), H5Pset_virtual(), H5Pset_virtual_prefix(), H5Pset_virtual_printf_gap(), H5Pset_virtual_view()
Since
1.10.0

◆ H5Pget_virtual_filename()

ssize_t H5Pget_virtual_filename ( hid_t  dcpl_id,
size_t  index,
char *  name,
size_t  size 
)

Gets the filename of a source dataset used in the mapping.

Parameters
[in] dcpl_id Dataset creation property list identifier
[in] index Mapping index. The value of index is 0 (zero) or greater and less than count (0 ≤ index < count), where count is the number of mappings returned by H5Pget_virtual_count().
[out] name A buffer containing the name of the file containing the source dataset
[in] size The size, in bytes, of the name buffer. Must be the size of the filename in bytes plus 1 for a NULL terminator
Returns
Returns the length of the filename if successful; otherwise returns a negative value.

H5Pget_virtual_filename() takes the dataset creation property list for the virtual dataset, dcpl_id, the mapping index, index, the size of the filename for a source dataset, size, and retrieves the name of the file for a source dataset used in the mapping.

Up to size characters of the filename are returned in name; additional characters, if any, are not returned to the user application.

If the length of the filename, which determines the required value of size, is unknown, a preliminary call to H5Pget_virtual_filename() with the last two parameters set to NULL and zero respectively can be made. The return value of this call will be the size in bytes of the filename. That value, plus 1 for a NULL terminator, must then be assigned to size for a second H5Pget_virtual_filename() call, which will retrieve the actual filename.

See also
Supporting Functions: H5Pget_layout(), H5Pset_layout(), H5Sget_regular_hyperslab(), H5Sis_regular_hyperslab(), H5Sselect_hyperslab()
VDS Functions: H5Pget_virtual_count(), H5Pget_virtual_dsetname(), H5Pget_virtual_filename(), H5Pget_virtual_prefix(), H5Pget_virtual_printf_gap(), H5Pget_virtual_srcspace(), H5Pget_virtual_view(), H5Pget_virtual_vspace(), H5Pset_virtual(), H5Pset_virtual_prefix(), H5Pset_virtual_printf_gap(), H5Pset_virtual_view()
Since
1.10.0

◆ H5Pget_virtual_srcspace()

hid_t H5Pget_virtual_srcspace ( hid_t  dcpl_id,
size_t  index 
)

Gets a dataspace identifier for the selection within the source dataset used in the mapping.

Parameters
[in] dcpl_id Dataset creation property list identifier
[in] index Mapping index. The value of index is 0 (zero) or greater and less than count (0 ≤ index < count), where count is the number of mappings returned by H5Pget_virtual_count().
Returns
Returns a valid dataspace identifier if successful; otherwise returns H5I_INVALID_HID.

H5Pget_virtual_srcspace() takes the dataset creation property list for the virtual dataset, dcpl_id, and the mapping index, index, and returns a dataspace identifier for the selection within the source dataset used in the mapping.

See also
Supporting Functions: H5Pget_layout(), H5Pset_layout(), H5Sget_regular_hyperslab(), H5Sis_regular_hyperslab(), H5Sselect_hyperslab()
VDS Functions: H5Pget_virtual_count(), H5Pget_virtual_dsetname(), H5Pget_virtual_filename(), H5Pget_virtual_prefix(), H5Pget_virtual_printf_gap(), H5Pget_virtual_srcspace(), H5Pget_virtual_view(), H5Pget_virtual_vspace(), H5Pset_virtual(), H5Pset_virtual_prefix(), H5Pset_virtual_printf_gap(), H5Pset_virtual_view()
Since
1.10.0

◆ H5Pget_virtual_vspace()

hid_t H5Pget_virtual_vspace ( hid_t  dcpl_id,
size_t  index 
)

Gets a dataspace identifier for the selection within the virtual dataset used in the mapping.

Parameters
[in] dcpl_id Dataset creation property list identifier
[in] index Mapping index. The value of index is 0 (zero) or greater and less than count (0 ≤ index < count), where count is the number of mappings returned by H5Pget_virtual_count().
Returns
Returns a valid dataspace identifier if successful; otherwise returns H5I_INVALID_HID.

H5Pget_virtual_vspace() takes the dataset creation property list for the virtual dataset, dcpl_id, and the mapping index, index, and returns a dataspace identifier for the selection within the virtual dataset used in the mapping.

See also
Supporting Functions: H5Pget_layout(), H5Pset_layout(), H5Sget_regular_hyperslab(), H5Sis_regular_hyperslab(), H5Sselect_hyperslab()
VDS Functions: H5Pget_virtual_count(), H5Pget_virtual_dsetname(), H5Pget_virtual_filename(), H5Pget_virtual_prefix(), H5Pget_virtual_printf_gap(), H5Pget_virtual_srcspace(), H5Pget_virtual_view(), H5Pget_virtual_vspace(), H5Pset_virtual(), H5Pset_virtual_prefix(), H5Pset_virtual_printf_gap(), H5Pset_virtual_view()
Since
1.10.0

◆ H5Pset_alloc_time()

herr_t H5Pset_alloc_time ( hid_t  plist_id,
H5D_alloc_time_t  alloc_time 
)

Sets the timing for storage space allocation.

Parameters
[in] plist_id Dataset creation property list identifier
[in] alloc_time When to allocate dataset storage space
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pset_alloc_time() sets up the timing for the allocation of storage space for a dataset's raw data. This property is set in the dataset creation property list plist_id. Timing is specified in alloc_time with one of the following values:

H5D_ALLOC_TIME_DEFAULT Allocate dataset storage space at the default time. (Defaults differ by storage method.)
H5D_ALLOC_TIME_EARLY Allocate all space when the dataset is created. (Default for compact datasets.)
H5D_ALLOC_TIME_INCR Allocate space incrementally, as data is written to the dataset. (Default for chunked storage datasets.)
  • Chunked datasets: Storage space allocation for each chunk is deferred until data is written to the chunk.
  • Contiguous datasets: Incremental storage space allocation for contiguous data is treated as late allocation.
  • Compact datasets: Incremental allocation is not allowed with compact datasets; H5Pset_alloc_time() will return an error.
H5D_ALLOC_TIME_LATE Allocate all space when data is first written to the dataset. (Default for contiguous datasets.)
Note
H5Pset_alloc_time() is designed to work in concert with the dataset fill value and fill value write time properties, set with the functions H5Pset_fill_value() and H5Pset_fill_time().
See H5Dcreate() for further cross-references.
Since
1.6.0

◆ H5Pset_chunk()

herr_t H5Pset_chunk ( hid_t  plist_id,
int  ndims,
const hsize_t  dim[] 
)

Sets the size of the chunks used to store a chunked layout dataset.

Parameters
[in] plist_id Dataset creation property list identifier
[in] ndims The number of dimensions of each chunk
[in] dim An array defining the size, in dataset elements, of each chunk
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pset_chunk() sets the size of the chunks used to store a chunked layout dataset. This function is only valid for dataset creation property lists.

The ndims parameter must currently equal the rank of the dataset.

The values of the dim array define the size of the chunks to store the dataset's raw data. The unit of measure for dim values is dataset elements.

As a side-effect of this function, the layout of the dataset is changed to H5D_CHUNKED, if it is not already so set.

Note
Chunk size cannot exceed the size of a fixed-size dataset. For example, a dataset consisting of a 5x4 fixed-size array cannot be defined with 10x10 chunks.
Chunk maximums:
  • The maximum number of elements in a chunk is 2^32-1, which is equal to 4,294,967,295. If the number of elements in a chunk is set via H5Pset_chunk() to a value greater than 2^32-1, then H5Pset_chunk() will fail.
  • The maximum size for any chunk is 4GB. If an attempt is made to write a chunk larger than 4GB with H5Dwrite(), then H5Dwrite() will fail.
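The limits in this note can be checked before a property list is built. The following is a minimal sketch in plain C, assuming a fixed-size dataset; chunk_dims_plausible() is a hypothetical helper that only mirrors the rules stated above and is not part of the HDF5 API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an HDF5 call): returns true when the chunk
 * dimensions satisfy the documented limits for a fixed-size dataset. */
bool chunk_dims_plausible(int rank, const uint64_t dset_dims[],
                          const uint64_t chunk_dims[], size_t elem_size)
{
    const uint64_t max_elems = UINT32_MAX;        /* 2^32 - 1 elements */
    const uint64_t max_bytes = UINT64_C(4) << 30; /* 4GB, read here as 4 GiB */
    uint64_t nelems = 1;

    if (rank <= 0 || elem_size == 0)
        return false;

    for (int i = 0; i < rank; i++) {
        /* A chunk may not be empty or larger than the fixed-size dataset. */
        if (chunk_dims[i] == 0 || chunk_dims[i] > dset_dims[i])
            return false;
        /* Reject the product before it can exceed 2^32 - 1. */
        if (nelems > max_elems / chunk_dims[i])
            return false;
        nelems *= chunk_dims[i];
    }
    /* Total chunk size in bytes must not exceed the 4GB limit. */
    return nelems <= max_bytes / elem_size;
}
```

With the 5x4 dataset from the note, 10x10 chunks are rejected and 5x2 chunks are accepted.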
See also
H5Pset_layout(), H5Dwrite()
Since
1.0.0

◆ H5Pset_chunk_opts()

herr_t H5Pset_chunk_opts ( hid_t  plist_id,
unsigned  opts 
)

Sets the edge chunk option in a dataset creation property list.

Parameters
[in]plist_idDataset creation property list identifier
[in]optsEdge chunk option flag. Valid values are:
  • H5D_CHUNK_DONT_FILTER_PARTIAL_CHUNKS When enabled, filters are not applied to partial edge chunks. When disabled, partial edge chunks are filtered. Enabling this option will improve performance when appending to the dataset and, when compression filters are used, prevent reallocation of these chunks. Datasets created with this option enabled will be inaccessible with HDF5 library versions before Release 1.10. Default: Disabled
  • 0 (zero) Disables option; partial edge chunks will be compressed.
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pset_chunk_opts() sets the edge chunk option in the dataset creation property list plist_id.

The available option is detailed in the parameters section. Only chunks that are not completely filled by the dataset's dataspace are affected by this option. Such chunks are referred to as partial edge chunks.

Motivation: H5Pset_chunk_opts() is used to specify storage options for chunks on the edge of a dataset's dataspace. This capability allows the user to tune performance in cases where the dataset size may not be a multiple of the chunk size and the handling of partial edge chunks can impact performance.
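Whether the option matters for a given dataset depends on whether partial edge chunks exist at all. A minimal sketch of that test in plain C follows; has_partial_edge_chunks() is a hypothetical helper, not an HDF5 function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not an HDF5 call): returns true when at least one
 * dimension of the dataspace is not a whole multiple of the chunk size,
 * so that some chunks are only partly covered by the dataspace. */
bool has_partial_edge_chunks(int rank, const uint64_t dset_dims[],
                             const uint64_t chunk_dims[])
{
    for (int i = 0; i < rank; i++) {
        if (chunk_dims[i] != 0 && dset_dims[i] % chunk_dims[i] != 0)
            return true;
    }
    return false;
}
```

For example, a 100x50 dataset with 32x25 chunks has partial edge chunks in the first dimension, while a 64x50 dataset with the same chunks has none.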

Since
1.10.0

◆ H5Pset_deflate()

herr_t H5Pset_deflate ( hid_t  plist_id,
unsigned  level 
)

Sets deflate (GNU gzip) compression method and compression level.

Parameters
[in]plist_idObject creation property list identifier
[in]levelCompression level
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.
Attention
If you are planning to use compression with parallel HDF5, ensure that calls to H5Dwrite() occur in collective mode. In other words, all MPI ranks (in the relevant communicator) call H5Dwrite() and pass a dataset transfer property list with the MPI-IO collective option property set to H5FD_MPIO_COLLECTIVE_IO.
Note that data transformations are currently not supported when writing to datasets in parallel and with compression enabled.

H5Pset_deflate() sets the deflate compression method and the compression level, level, for a dataset or group creation property list, plist_id.

The filter identifier set in the property list is H5Z_FILTER_DEFLATE.

The compression level, level, is a value from zero to nine, inclusive. A compression level of 0 (zero) indicates no compression; compression improves but speed slows progressively from levels 1 through 9:

Compression Level Gzip Action
0 No compression
1 Best compression speed; least compression
2 through 8 Compression improves; speed degrades
9 Best compression ratio; slowest speed

Note that setting the compression level to 0 (zero) does not turn off use of the gzip filter; it simply sets the filter to perform no compression as it processes the data.

HDF5 relies on GNU gzip for this compression.

Version
1.8.5 Function extended to work with group creation property lists.
Since
1.0.0

◆ H5Pset_dset_no_attrs_hint()

herr_t H5Pset_dset_no_attrs_hint ( hid_t  dcpl_id,
hbool_t  minimize 
)

Sets the flag to create minimized dataset object headers.

Parameters
[in]dcpl_idDataset creation property list identifier
[in]minimizeFlag for indicating whether or not a dataset's object header will be minimized
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pset_dset_no_attrs_hint() sets the no dataset attributes hint setting for the dataset creation property list dcpl_id. Datasets created with the dataset creation property list dcpl_id will have their object headers minimized if the boolean flag minimize is set to true. By setting minimize to true, the library expects that no attributes will be added to the dataset. Attributes can be added, but they are appended with a continuation message, which can reduce performance.

This setting interacts with H5Fset_dset_no_attrs_hint(): if either is set to true, then the created dataset's object header will be minimized.

Since
1.10.5

◆ H5Pset_external()

herr_t H5Pset_external ( hid_t  plist_id,
const char *  name,
off_t  offset,
hsize_t  size 
)

Adds an external file to the list of external files.

Parameters
[in]plist_idDataset creation property list identifier
[in]nameName of an external file
[in]offsetOffset, in bytes, from the beginning of the file to the location in the file where the data starts
[in]sizeNumber of bytes reserved in the file for the data
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

The first call to H5Pset_external() sets the external storage property in the property list, thus designating that the dataset will be stored in one or more non-HDF5 file(s) external to the HDF5 file. This call also adds the file name as the first file in the list of external files. Subsequent calls to the function add the named file as the next file in the list.

If a dataset is split across multiple files, then the files should be defined in order. The total size of the dataset is the sum of the size arguments for all the external files. If the total size is larger than the size of a dataset, then the dataset can be extended (provided the dataspace also allows the extension).

The size argument specifies the number of bytes reserved for data in the external file. If size is set to H5F_UNLIMITED, the external file can be of unlimited size and no more files can be added to the external files list. If size is set to 0 (zero), no external file will actually be created.

All of the external files for a given dataset must be specified with H5Pset_external() before H5Dcreate() is called to create the dataset. If one of these files does not exist on the system when H5Dwrite() is called to write data to it, the library will create the file.
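Because the external files are consumed in the order they were added, a byte position in the dataset's raw data corresponds to one file and one offset within it. The sketch below illustrates that bookkeeping in plain C under stated assumptions: ext_segment, SEG_UNLIMITED (a stand-in for H5F_UNLIMITED), and locate_external_byte() are hypothetical names, and the code models the description above rather than the library's implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define SEG_UNLIMITED UINT64_MAX /* stand-in for H5F_UNLIMITED in this sketch */

/* One entry per H5Pset_external() call: name, offset, size. */
typedef struct {
    const char *name;
    uint64_t    offset; /* where the data starts in the file */
    uint64_t    size;   /* bytes reserved for the data */
} ext_segment;

/* Returns the index of the segment that holds raw-data byte `pos`, and
 * stores the byte position inside that file in *file_off.
 * Returns -1 when `pos` lies beyond the total reserved size. */
int locate_external_byte(const ext_segment segs[], size_t nsegs,
                         uint64_t pos, uint64_t *file_off)
{
    for (size_t i = 0; i < nsegs; i++) {
        if (segs[i].size == SEG_UNLIMITED || pos < segs[i].size) {
            *file_off = segs[i].offset + pos;
            return (int)i;
        }
        pos -= segs[i].size; /* skip this file's reserved bytes */
    }
    return -1;
}
```

With a 100-byte segment followed by a 50-byte segment that starts at offset 16, raw-data byte 120 falls in the second file at byte 36.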

Note
On Windows, off_t is typically a 32-bit signed long value, which limits the valid offset that can be set to 2 GiB.
Since
1.0.0

◆ H5Pset_fill_time()

herr_t H5Pset_fill_time ( hid_t  plist_id,
H5D_fill_time_t  fill_time 
)

Sets the time when fill values are written to a dataset.

Parameters
[in]plist_idDataset creation property list identifier
[in]fill_timeWhen to write fill values to a dataset
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pset_fill_time() sets up the timing for writing fill values to a dataset. This property is set in the dataset creation property list plist_id. Timing is specified in fill_time with one of the following values:

H5D_FILL_TIME_IFSET Write fill values to the dataset when storage space is allocated only if there is a user-defined fill value, i.e., one set with H5Pset_fill_value(). (Default)
H5D_FILL_TIME_ALLOC Write fill values to the dataset when storage space is allocated.
H5D_FILL_TIME_NEVER Never write fill values to the dataset.
Note
H5Pset_fill_time() is designed for coordination with the dataset fill value and dataset storage allocation time properties, set with the functions H5Pset_fill_value() and H5Pset_alloc_time(). See H5Dcreate() for further cross-references.
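The three fill_time values amount to a small decision table. A minimal sketch in plain C follows; the enum and function names are hypothetical stand-ins chosen for this illustration, not the library's own definitions.

```c
#include <stdbool.h>

/* Stand-ins for H5D_FILL_TIME_IFSET, H5D_FILL_TIME_ALLOC, and
 * H5D_FILL_TIME_NEVER in this sketch. */
typedef enum { FILL_IFSET, FILL_ALLOC, FILL_NEVER } fill_time_model;

/* Returns true when, per the table above, fill values are written at the
 * moment storage space is allocated. */
bool fill_written_on_alloc(fill_time_model fill_time, bool user_defined_fill)
{
    switch (fill_time) {
    case FILL_ALLOC:
        return true;              /* always written on allocation */
    case FILL_IFSET:
        return user_defined_fill; /* only for a user-defined fill value */
    case FILL_NEVER:
    default:
        return false;             /* never written */
    }
}
```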
Since
1.6.0

◆ H5Pset_fill_value()

herr_t H5Pset_fill_value ( hid_t  plist_id,
hid_t  type_id,
const void *  value 
)

Sets the fill value for a dataset.

Parameters
[in]plist_idDataset creation property list identifier
[in]type_idDatatype of value
[in]valuePointer to buffer containing value to use as fill value
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pset_fill_value() sets the fill value for a dataset in the dataset creation property list. value is interpreted as being of datatype type_id. This datatype may differ from that of the dataset, but the HDF5 library must be able to convert value to the dataset datatype when the dataset is created.

The default fill value is 0 (zero), which is interpreted according to the actual dataset datatype.

Setting value to NULL indicates that the fill value is to be undefined.

Note
Applications sometimes write data only to portions of an allocated dataset. It is often useful in such cases to fill the unused space with a known fill value. This function allows the user application to set that fill value; the functions H5Dfill() and H5Pset_fill_time(), respectively, provide the ability to apply the fill value on demand or to set up its automatic application.
A fill value should be defined so that it is appropriate for the application. While the HDF5 default fill value is 0 (zero), it is often appropriate to use another value. It might be useful, for example, to use a value that is known to be impossible for the application to legitimately generate.
H5Pset_fill_value() is designed to work in concert with H5Pset_alloc_time() and H5Pset_fill_time(). H5Pset_alloc_time() and H5Pset_fill_time() govern the timing of dataset storage allocation and fill value write operations and can be important in tuning application performance.
See H5Dcreate() for further cross-references.
Since
1.0.0

◆ H5Pset_layout()

herr_t H5Pset_layout ( hid_t  plist_id,
H5D_layout_t  layout 
)

Sets the type of storage used to store the raw data for a dataset.

Parameters
[in]plist_idDataset creation property list identifier
[in]layoutType of storage layout for raw data
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pset_layout() sets the type of storage used to store the raw data for a dataset. This function is only valid for dataset creation property lists.

Valid values for layout are:

  • H5D_COMPACT: Store raw data in the dataset object header in the file. This should only be used for datasets with small amounts of raw data. The raw data size limit is 64K (65520 bytes). Attempting to create a dataset with raw data larger than this limit will cause the H5Dcreate() call to fail.
  • H5D_CONTIGUOUS: Store raw data separately from the object header in one large chunk in the file.
  • H5D_CHUNKED: Store raw data separately from the object header as chunks of data in separate locations in the file.
  • H5D_VIRTUAL: Draw raw data from multiple datasets in different files.

Note that a compact storage layout may affect writing data to the dataset with parallel applications. See the note in H5Dwrite() documentation for details.

Version
1.10.0 H5D_VIRTUAL added in this release.
Since
1.0.0

◆ H5Pset_nbit()

herr_t H5Pset_nbit ( hid_t  plist_id)

Sets up the use of the N-Bit filter.

Parameters
[in]plist_idDataset creation property list identifier
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.
Attention
If you are planning to use compression with parallel HDF5, ensure that calls to H5Dwrite() occur in collective mode. In other words, all MPI ranks (in the relevant communicator) call H5Dwrite() and pass a dataset transfer property list with the MPI-IO collective option property set to H5FD_MPIO_COLLECTIVE_IO.
Note that data transformations are currently not supported when writing to datasets in parallel and with compression enabled.

H5Pset_nbit() sets the N-Bit filter, H5Z_FILTER_NBIT, in the dataset creation property list plist_id.

The HDF5 user can create an N-Bit datatype with the following code:

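         hid_t nbit_datatype = H5Tcopy(H5T_STD_I32LE); /* start from a 32-bit little-endian integer */
         H5Tset_precision(nbit_datatype, 16);          /* keep 16 significant bits */
         H5Tset_offset(nbit_datatype, 4);              /* significant bits start at bit 4 */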

In memory, one value of the N-Bit datatype in the above example will be stored on a little-endian machine as follows:

byte 3 byte 2 byte 1 byte 0
???????? ????SPPP PPPPPPPP PPPP????

Note: S - sign bit, P - significant bit, ? - padding bit. For a signed integer, the sign bit is included in the precision.

When data of the above datatype is stored on disk using the N-bit filter, all padding bits are chopped off and only significant bits are stored. The values on disk will be something like:

1st value 2nd value ...
SPPPPPPPPPPPPPPP SPPPPPPPPPPPPPPP ...
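The bit layout above can be reproduced with a shift and a mask. The following plain C sketch is an illustration of the layout only, not the filter's implementation; nbit_extract() and nbit_restore() are hypothetical names, and the restore step assumes that padding bits come back as zero.

```c
#include <stdint.h>

/* Keeps only the `precision` significant bits that start at bit `offset`
 * of a 32-bit word; padding bits on both sides are dropped. */
uint32_t nbit_extract(uint32_t word, unsigned precision, unsigned offset)
{
    uint32_t mask = (precision >= 32) ? UINT32_MAX
                                      : ((UINT32_C(1) << precision) - 1);
    return (word >> offset) & mask;
}

/* Puts the stored bits back at their original position. Padding bits are
 * set to zero here, which is an assumption of this sketch. */
uint32_t nbit_restore(uint32_t bits, unsigned offset)
{
    return bits << offset;
}
```

For the datatype in the example (precision 16, offset 4), the word 0xFFFABCDF is reduced to the 16 bits 0xABCD.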

The N-Bit filter is used effectively for compressing data of an N-Bit datatype as well as a compound and an array datatype with N-Bit fields. However, the datatype classes of the N-Bit datatype or the N-Bit field of the compound datatype or the array datatype are limited to integer or floating-point.

The N-Bit filter supports complex situations where a compound datatype contains member(s) of a compound datatype or an array datatype that has a compound datatype as the base type. However, it does not support the situation where an array datatype has a variable-length or variable-length string as its base datatype. The filter does support the situation where a variable-length or variable-length string is a member of a compound datatype.

The N-Bit filter allows all other HDF5 datatypes (such as time, string, bitfield, opaque, reference, enum, and variable length) to pass through as a no-op.

Like other I/O filters supported by the HDF5 library, an application using the N-Bit filter must store data with chunked storage.

By nature, the N-Bit filter should not be used together with other I/O filters.

Version
1.8.8 Fortran subroutine introduced in this release.
Since
1.8.0

◆ H5Pset_scaleoffset()

herr_t H5Pset_scaleoffset ( hid_t  plist_id,
H5Z_SO_scale_type_t  scale_type,
int  scale_factor 
)

Sets up the use of the scale-offset filter.

Parameters
[in]plist_idDataset creation property list identifier
[in]scale_typeFlag indicating compression method
[in]scale_factorParameter related to scale. Must be non-negative
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.
Attention
If you are planning to use compression with parallel HDF5, ensure that calls to H5Dwrite() occur in collective mode. In other words, all MPI ranks (in the relevant communicator) call H5Dwrite() and pass a dataset transfer property list with the MPI-IO collective option property set to H5FD_MPIO_COLLECTIVE_IO.
Note that data transformations are currently not supported when writing to datasets in parallel and with compression enabled.

H5Pset_scaleoffset() sets the scale-offset filter, H5Z_FILTER_SCALEOFFSET, for a dataset.

Generally speaking, scale-offset compression performs a scale and/or offset operation on each data value and truncates the resulting value to a minimum number of bits (MinBits) before storing it. The current scale-offset filter supports integer and floating-point datatypes.

For an integer datatype, the parameter scale_type should be set to H5Z_SO_INT (2). The parameter scale_factor denotes MinBits. If the user sets it to H5Z_SO_INT_MINBITS_DEFAULT (0), the filter will calculate MinBits. If scale_factor is set to a positive integer, the filter does not do any calculation and just uses the number as MinBits. However, if the user gives a MinBits that is less than what would be generated by the filter, the compression will be lossy. Also, the MinBits supplied by the user cannot exceed the number of bits to store one value of the dataset datatype.

For a floating-point datatype, the filter adopts the GRiB data packing mechanism, which offers two alternate methods: E-scaling and D-scaling. Both methods are lossy compression. If the parameter scale_type is set to H5Z_SO_FLOAT_DSCALE (0), the filter will use the D-scaling method; if it is set to H5Z_SO_FLOAT_ESCALE (1), the filter will use the E-scaling method. Since only the D-scaling method is implemented, scale_type should be set to H5Z_SO_FLOAT_DSCALE or 0.

When the D-scaling method is used, the original data is "D" scaled (multiplied by 10 to the power of scale_factor), and the "significant" part of the value is moved to the left of the decimal point. Care should be taken in setting the decimal scale_factor so that the integer part will have enough precision to contain the appropriate information of the data value. For example, if scale_factor is set to 2, the number 104.561 will be 10456.1 after "D" scaling. The last digit 1 is not "significant" and is discarded during rounding. The user should make sure that after "D" scaling and rounding, the data values are within the range that can be represented by the integer (same size as the floating-point type).
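The scale-and-round step described above can be sketched in a few lines of plain C. This illustrates only that one step, assuming rounding to the nearest integer; dscale() is a hypothetical helper and the filter's remaining work (such as computing MinBits) is not shown.

```c
#include <stdint.h>

/* Multiplies `value` by 10 to the power of `scale_factor` and rounds the
 * result to the nearest integer (halves round away from zero). */
int64_t dscale(double value, int scale_factor)
{
    double scaled = value;

    for (int i = 0; i < scale_factor; i++)
        scaled *= 10.0; /* positive factor: shift digits left */
    for (int i = 0; i > scale_factor; i--)
        scaled /= 10.0; /* negative factor: shift digits right */

    return (int64_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}
```

With scale_factor set to 2, the value 104.561 becomes 10456, matching the example in the text.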

Valid values for scale_type are as follows:

H5Z_SO_FLOAT_DSCALE (0) Floating-point type, using variable MinBits method
H5Z_SO_FLOAT_ESCALE (1) Floating-point type, using fixed MinBits method
H5Z_SO_INT (2) Integer type

The meaning of scale_factor varies according to the value assigned to scale_type:

scale_type value scale_factor description
H5Z_SO_FLOAT_DSCALE Denotes the decimal scale factor for D-scaling and can be positive, negative or zero. This is the current implementation of the library.
H5Z_SO_FLOAT_ESCALE Denotes MinBits for E-scaling and must be a positive integer. This is not currently implemented by the library.
H5Z_SO_INT Denotes MinBits and it should be a positive integer or H5Z_SO_INT_MINBITS_DEFAULT (0). If it is less than 0, the library will reset it to 0 since negative values are not implemented.

Like other I/O filters supported by the HDF5 library, an application using the scale-offset filter must store data with chunked storage.

Version
1.8.8 Fortran90 subroutine introduced in this release.
Since
1.8.0

◆ H5Pset_shuffle()

herr_t H5Pset_shuffle ( hid_t  plist_id)

Sets up use of the shuffle filter.

Parameters
[in]plist_idDataset creation property list identifier
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.
Attention
If you are planning to use compression with parallel HDF5, ensure that calls to H5Dwrite() occur in collective mode. In other words, all MPI ranks (in the relevant communicator) call H5Dwrite() and pass a dataset transfer property list with the MPI-IO collective option property set to H5FD_MPIO_COLLECTIVE_IO.
Note that data transformations are currently not supported when writing to datasets in parallel and with compression enabled.

H5Pset_shuffle() sets the shuffle filter, H5Z_FILTER_SHUFFLE, in the dataset creation property list plist_id. The shuffle filter de-interlaces a block of data by reordering the bytes. All the bytes from one consistent byte position of each data element are placed together in one block; all bytes from a second consistent byte position of each data element are placed together in a second block; etc. For example, given three data elements of a 4-byte datatype stored as 012301230123, shuffling will re-order data as 000111222333. This can be a valuable step in an effective compression algorithm because the bytes in each byte position are often closely related to each other and putting them together can increase the compression ratio.

As implied above, the primary value of the shuffle filter lies in its coordinated use with a compression filter; it does not provide data compression when used alone. When the shuffle filter is applied to a dataset immediately prior to the use of a compression filter, the compression ratio achieved is often superior to that achieved by the use of a compression filter without the shuffle filter.
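The byte reordering can be written as a pair of nested loops. The sketch below is a plain C illustration of the transformation described above, not the library's filter code; shuffle_bytes() and unshuffle_bytes() are hypothetical names.

```c
#include <stddef.h>

/* Groups byte 0 of every element, then byte 1 of every element, and so on.
 * `src` and `dst` must not overlap and must hold nelems * elem_size bytes. */
void shuffle_bytes(const unsigned char *src, unsigned char *dst,
                   size_t nelems, size_t elem_size)
{
    for (size_t b = 0; b < elem_size; b++)
        for (size_t e = 0; e < nelems; e++)
            dst[b * nelems + e] = src[e * elem_size + b];
}

/* Inverse of shuffle_bytes(): restores the original element order. */
void unshuffle_bytes(const unsigned char *src, unsigned char *dst,
                     size_t nelems, size_t elem_size)
{
    for (size_t b = 0; b < elem_size; b++)
        for (size_t e = 0; e < nelems; e++)
            dst[e * elem_size + b] = src[b * nelems + e];
}
```

Applied to the twelve bytes 012301230123 with nelems set to 3 and elem_size set to 4, shuffle_bytes() produces 000111222333.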

Since
1.6.0

◆ H5Pset_szip()

herr_t H5Pset_szip ( hid_t  plist_id,
unsigned  options_mask,
unsigned  pixels_per_block 
)

Sets up use of the SZIP compression filter.

Parameters
[in]plist_idDataset creation property list identifier
[in]options_maskA bit-mask conveying the desired SZIP options; Valid values are H5_SZIP_EC_OPTION_MASK and H5_SZIP_NN_OPTION_MASK.
[in]pixels_per_blockThe number of pixels or data elements in each data block (max H5_SZIP_MAX_PIXELS_PER_BLOCK)
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.
Attention
If you are planning to use compression with parallel HDF5, ensure that calls to H5Dwrite() occur in collective mode. In other words, all MPI ranks (in the relevant communicator) call H5Dwrite() and pass a dataset transfer property list with the MPI-IO collective option property set to H5FD_MPIO_COLLECTIVE_IO.
Note that data transformations are currently not supported when writing to datasets in parallel and with compression enabled.

H5Pset_szip() sets an SZIP compression filter, H5Z_FILTER_SZIP, for a dataset. SZIP is a compression method designed for use with scientific data.

Before proceeding, all users should review the “Limitations” section below.

Users familiar with SZIP outside the HDF5 context may benefit from reviewing the Note “For Users Familiar with SZIP in Other Contexts” below.

In the text below, the term pixel refers to an HDF5 data element. This terminology derives from SZIP compression's use with image data, where pixel referred to an image pixel.

The SZIP bits_per_pixel value (see Note, below) is automatically set, based on the HDF5 datatype. SZIP can be used with atomic datatypes that may have size of 8, 16, 32, or 64 bits. Specifically, a dataset with a datatype that is 8-, 16-, 32-, or 64-bit signed or unsigned integer; char; or 32- or 64-bit float can be compressed with SZIP. See Note, below, for further discussion of the SZIP bits_per_pixel setting.

SZIP options are passed in an options mask, options_mask, as follows.

Option Description (Mutually exclusive; select one)
H5_SZIP_EC_OPTION_MASK Selects entropy coding method
H5_SZIP_NN_OPTION_MASK Selects nearest neighbor preprocessing followed by entropy coding

The following guidelines can be used in determining which option to select:

  • The entropy coding method, the EC option specified by H5_SZIP_EC_OPTION_MASK, is best suited for data that has been processed. The EC method works best for small numbers.
  • The nearest neighbor coding method, the NN option specified by H5_SZIP_NN_OPTION_MASK, preprocesses the data and then applies the EC method as above.

Other factors may affect results, but the above criteria provide a good starting point for optimizing data compression.

SZIP compresses data block by block, with a user-tunable block size. This block size is passed in the parameter pixels_per_block and must be even and not greater than 32, with typical values being 8, 10, 16, or 32. This parameter affects compression ratio; the more pixel values vary, the smaller this number should be to achieve better performance.

In HDF5, compression can be applied only to chunked datasets. If pixels_per_block is bigger than the total number of elements in a dataset chunk, H5Pset_szip() will succeed but the subsequent call to H5Dcreate() will fail; the conflict can be detected only when the property list is used.

To achieve optimal performance for SZIP compression, it is recommended that a chunk's fastest-changing dimension be equal to N times pixels_per_block where N is the maximum number of blocks per scan line allowed by the SZIP library. In the current version of SZIP, N is set to 128.

SZIP compression is an optional HDF5 filter.

Limitations:

  • SZIP compression cannot be applied to compound, array, variable-length, enumeration, or any other user-defined datatypes. If an SZIP filter is set in a dataset creation property list used to create a dataset containing a non-allowed datatype, the call to H5Dcreate() will fail; the conflict can be detected only when the property list is used.
  • Certain factors affect one's rights and ability to use SZIP compression; users should review the SZIP copyright notice. (This limitation does not apply to the libaec library.)
Note
For Users Familiar with SZIP in Other Contexts:
The following notes are of interest primarily to those who have used SZIP compression outside of the HDF5 context. In non-HDF5 applications, SZIP typically requires that the user application supply additional parameters:
  • pixels_in_object, the number of pixels in the object to be compressed
  • bits_per_pixel, the number of bits per pixel
  • pixels_per_scanline, the number of pixels per scan line
These values need not be independently supplied in the HDF5 environment as they are derived from the datatype and dataspace, which are already known. In particular, HDF5 sets pixels_in_object to the number of elements in a chunk and bits_per_pixel to the size of the element or pixel datatype.
The following algorithm is used to set pixels_per_scanline:
  • If the size of a chunk's fastest-changing dimension, size, is greater than 4K, set pixels_per_scanline to 128 times pixels_per_block.
  • If size is less than 4K but greater than pixels_per_block, set pixels_per_scanline to the minimum of size and 128 times pixels_per_block.
  • If size is less than pixels_per_block but pixels_per_block does not exceed the number of elements in the chunk, set pixels_per_scanline to the minimum of the number of elements in the chunk and 128 times pixels_per_block.
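The selection rules above can be sketched as a single function. This is a plain C illustration under stated assumptions, not the library's code: szip_pixels_per_scanline() is a hypothetical name, 4K is taken as 4096, the boundary cases (size equal to 4096 or to pixels_per_block) are assigned to the middle rule, and the third rule is read as applying when the chunk holds at least pixels_per_block elements, which is consistent with the H5Dcreate() failure described earlier for the opposite case.

```c
#include <stdint.h>

/* Returns the pixels_per_scanline value chosen by the rules above, or 0
 * when pixels_per_block exceeds the number of elements in the chunk. */
uint64_t szip_pixels_per_scanline(uint64_t size, uint64_t chunk_elems,
                                  unsigned pixels_per_block)
{
    const uint64_t cap = UINT64_C(128) * pixels_per_block;

    if (size > 4096)                  /* fastest dimension above 4K */
        return cap;
    if (size >= pixels_per_block)     /* at least one full block per line */
        return size < cap ? size : cap;
    if (chunk_elems < pixels_per_block)
        return 0;                     /* conflict: chunk too small */
    return chunk_elems < cap ? chunk_elems : cap;
}
```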
The HDF5 datatype may have precision that is less than the full size of the data element, e.g., an 11-bit integer can be defined using H5Tset_precision(). To a certain extent, SZIP can take advantage of the precision of the datatype to improve compression:
  • If the HDF5 datatype size is 24-bit or less and the offset of the bits in the HDF5 datatype is zero (see H5Tset_offset() or H5Tget_offset()), the data is in the lowest N bits of the data element. In this case, the SZIP bits_per_pixel is set to the precision of the HDF5 datatype.
  • If the offset is not zero, the SZIP bits_per_pixel will be set to the number of bits in the full size of the data element.
  • If the HDF5 datatype precision is 25-bit to 32-bit, the SZIP bits_per_pixel will be set to 32.
  • If the HDF5 datatype precision is 33-bit to 64-bit, the SZIP bits_per_pixel will be set to 64.
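Taken together, the four rules above map a datatype's size, precision, and offset to one bits_per_pixel value. A minimal plain C sketch of one consistent reading follows; szip_bits_per_pixel() is a hypothetical name, and the assumption that a nonzero offset is resolved before the 32/64 rounding is this sketch's interpretation.

```c
/* size_bytes: full size of the data element in bytes
 * precision:  number of significant bits
 * offset:     bit offset of the significant bits */
unsigned szip_bits_per_pixel(unsigned size_bytes, unsigned precision,
                             unsigned offset)
{
    unsigned bits = precision;

    if (offset != 0)
        bits = size_bytes * 8;        /* fall back to the full element */
    if (bits > 24)
        bits = (bits <= 32) ? 32 : 64; /* round up to 32 or 64 */
    return bits;
}
```

For instance, an 11-bit integer stored in 2 bytes at offset 0 yields 11, while the same type at offset 3 yields 16.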
HDF5 always modifies the options mask provided by the user to set up usage of RAW_OPTION_MASK, ALLOW_K13_OPTION_MASK, and one of LSB_OPTION_MASK or MSB_OPTION_MASK, depending on endianness of the datatype.
Since
1.6.0

◆ H5Pset_virtual()

herr_t H5Pset_virtual ( hid_t  dcpl_id,
hid_t  vspace_id,
const char *  src_file_name,
const char *  src_dset_name,
hid_t  src_space_id 
)

Sets the mapping between virtual and source datasets.

Parameters
[in]dcpl_idDataset creation property list identifier
[in]vspace_idThe dataspace identifier with the selection within the virtual dataset applied, possibly an unlimited selection
[in]src_file_nameThe name of the HDF5 file where the source dataset is located or a "." (period) for a source dataset in the same file. The file might not exist yet. The name can be specified using a C-style printf statement as described below.
[in]src_dset_nameThe path to the HDF5 dataset in the file specified by src_file_name. The dataset might not exist yet. The dataset name can be specified using a C-style printf statement as described below.
[in]src_space_idThe source dataset's dataspace identifier with a selection applied, possibly an unlimited selection
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

H5Pset_virtual() maps elements of the virtual dataset (VDS) described by the virtual dataspace identifier vspace_id to the elements of the source dataset described by the source dataset dataspace identifier src_space_id. The source dataset is identified by the name of the file where it is located, src_file_name, and the name of the dataset, src_dset_name.

C-style printf Formatting Statements:
C-style printf formatting allows a pattern to be specified in the name of a source file or dataset. Strings for the file and dataset names are treated as literals except for the following substitutions:
"%%" Replaced with a single "%" (percent) character.
"%<d>b" Where "<d>" is the virtual dataset dimension axis (0-based) and "b" indicates that the block count of the selection in that dimension should be used. The full expression (for example, "%0b") is replaced with a single numeric value when the mapping is evaluated at VDS access time. Example code for many source and virtual dataset mappings is available in the "Examples of Source to Virtual Dataset Mapping" chapter in the RFC: HDF5 Virtual Dataset.
If the printf form is used for the source file or dataset names, the selection in the source dataset's dataspace must be fixed-size.
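The two substitutions can be illustrated with a small expansion routine. The plain C sketch below is not the library's parser: expand_source_name() is a hypothetical helper, the caller supplies the numeric value to substitute for each axis in values[], and any other use of "%" is copied literally, following the statement above that the rest of the string is treated as a literal.

```c
#include <stddef.h>
#include <stdio.h>

/* Expands "%%" and "%<d>b" in `pattern` into `out`.
 * Returns the length written, or -1 on an invalid axis or a full buffer. */
int expand_source_name(const char *pattern, const unsigned long values[],
                       size_t rank, char *out, size_t out_size)
{
    size_t      n = 0;
    const char *p = pattern;

    if (out_size == 0)
        return -1;

    while (*p != '\0') {
        char piece[32];
        int  len = 1;

        if (p[0] == '%' && p[1] == '%') {
            piece[0] = '%';            /* "%%" becomes a single "%" */
            p += 2;
        } else if (p[0] == '%' && p[1] >= '0' && p[1] <= '9') {
            const char *q    = p + 1;
            size_t      axis = 0;
            while (*q >= '0' && *q <= '9')
                axis = axis * 10 + (size_t)(*q++ - '0');
            if (*q == 'b') {           /* "%<d>b": substitute the value */
                if (axis >= rank)
                    return -1;
                len = snprintf(piece, sizeof piece, "%lu", values[axis]);
                p   = q + 1;
            } else {
                piece[0] = *p++;       /* not a substitution: keep "%" */
            }
        } else {
            piece[0] = *p++;           /* ordinary literal character */
        }

        if (n + (size_t)len >= out_size)
            return -1;
        for (int i = 0; i < len; i++)
            out[n++] = piece[i];
    }
    out[n] = '\0';
    return (int)n;
}
```

With values {3, 7}, the pattern "src-%0b_%1b-100%%.h5" expands to "src-3_7-100%.h5".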
Source File Resolutions:
When a source dataset residing in a different file is accessed, the library will search for the source file src_file_name as described below:
  • If src_file_name is a "." (period) then it refers to the file containing the virtual dataset.
  • If src_file_name is a relative pathname, the following steps are performed:
    • The library reads the prefix(es) set in the environment variable HDF5_VDS_PREFIX and tries prepending each prefix to src_file_name to form a new src_file_name. If the new src_file_name does not exist, or if HDF5_VDS_PREFIX is not set, the library takes the prefix set via H5Pset_virtual_prefix() and prepends it to src_file_name to form a new src_file_name. If that src_file_name does not exist, or if no prefix was set by H5Pset_virtual_prefix(), the library obtains the path of the file containing the virtual dataset. This path can be the absolute path or the current working directory plus the relative path of that file when it was created or opened. The library prepends this path to src_file_name to form a new src_file_name.
    • If the new src_file_name does not exist, then the library will look for src_file_name and will return failure/success accordingly.
  • If src_file_name is an absolute pathname, the library will first try to find src_file_name. If src_file_name does not exist, src_file_name is stripped of directory paths to form a new src_file_name. The search for the new src_file_name then follows the same steps as described above for a relative pathname. See examples below illustrating how src_file_name is stripped to form a new src_file_name.
Note that src_file_name is considered to be an absolute pathname when the following condition is true:
  • For Unix, the first character of src_file_name is a slash (/).
    For example, consider a src_file_name of /tmp/A.h5. If that source file does not exist, the new src_file_name after stripping will be A.h5.
  • For Windows, there are 6 cases:
    1. src_file_name is an absolute drive with absolute pathname.
      For example, consider a src_file_name of /tmp/A.h5. If that source file does not exist, the new src_file_name after stripping will be A.h5.
    2. src_file_name is an absolute pathname without specifying drive name.
      For example, consider a src_file_name of /tmp/A.h5. If that source file does not exist, the new src_file_name after stripping will be A.h5.
    3. src_file_name is an absolute drive with relative pathname.
      For example, consider a src_file_name of /tmp/A.h5. If that source file does not exist, the new src_file_name after stripping will be tmp/A.h5.
    4. src_file_name is in UNC (Uniform Naming Convention) format with server name, share name, and pathname.
      For example, consider a src_file_name of /tmp/A.h5. If that source file does not exist, the new src_file_name after stripping will be A.h5.
    5. src_file_name is in Long UNC (Uniform Naming Convention) format with server name, share name, and pathname.
      For example, consider a src_file_name of /tmp/A.h5. If that source file does not exist, the new src_file_name after stripping will be A.h5.
    6. src_file_name is in Long UNC (Uniform Naming Convention) format with an absolute drive and an absolute pathname.
      For example, consider a src_file_name of /tmp/A.h5. If that source file does not exist, the new src_file_name after stripping will be A.h5.
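For the Unix case, the absolute-path test and the stripping step can be sketched as follows. This is a plain C illustration only; is_absolute_unix() and strip_directories() are hypothetical helpers, and the Windows cases are not covered.

```c
#include <stdbool.h>
#include <string.h>

/* On Unix, a name is absolute when its first character is a slash. */
bool is_absolute_unix(const char *src_file_name)
{
    return src_file_name[0] == '/';
}

/* Returns a pointer to the part of the name after the last slash, or the
 * whole name when it contains no slash. */
const char *strip_directories(const char *src_file_name)
{
    const char *slash = strrchr(src_file_name, '/');
    return slash != NULL ? slash + 1 : src_file_name;
}
```

Given /tmp/A.h5, strip_directories() returns A.h5, which would then be searched for using the relative pathname steps.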
See also
Virtual Dataset Overview
Supporting Functions: H5Pget_layout(), H5Pset_layout(), H5Sget_regular_hyperslab(), H5Sis_regular_hyperslab(), H5Sselect_hyperslab()
VDS Functions: H5Pget_virtual_count(), H5Pget_virtual_dsetname(), H5Pget_virtual_filename(), H5Pget_virtual_prefix(), H5Pget_virtual_printf_gap(), H5Pget_virtual_srcspace(), H5Pget_virtual_view(), H5Pget_virtual_vspace(), H5Pset_virtual(), H5Pset_virtual_prefix(), H5Pset_virtual_printf_gap(), H5Pset_virtual_view()
Version
1.10.2 A change was made to the method of searching for VDS source files.
Since
1.10.0