February 14, 2002
Versions 2.6 and earlier of HDF-EOS were built on top of the HDF4 library (this combination is now termed HDFEOS4) [2]. Version 5 of HDF-EOS is built on top of HDF5 [3]. Files created with HDFEOS4 cannot be read with HDFEOS5, and vice versa. In some cases, programs will use data in one or both formats, with multiple reader or writer modules. In other cases, it may be desirable to convert older files from HDFEOS4 to an equivalent HDFEOS5 file.
Since the HDF-EOS objects are equivalent, files can be translated by reading the HDFEOS4 file and writing an equivalent HDFEOS5 file. For example, the heconvert program [4] converts an HDFEOS4 file to an equivalent HDFEOS5 file. All the EOS objects--the Grid, Swath, and Point objects, and associated metadata--are read from the HDFEOS4 (HDF4) file and written to an equivalent HDFEOS5 (HDF5) file.
This is not the end of the story, however. Most HDF-EOS data products contain standard HDF objects as well as the HDF-EOS objects. In addition to the Grid, Swath, and Point objects managed by the HDF-EOS library, these "hybrid" files may also contain standard HDF4 objects such as SDS datasets, Vdata tables, images, and annotations.
The heconvert utility converts only the objects and metadata managed by the HDF-EOS library, and consequently cannot convert other HDF objects that may be present. The result is that the file created by heconvert may omit some of the objects and metadata from the original. To fully convert "hybrid" files, it is necessary to read the additional HDF4 objects and write equivalent HDF5 objects.
The NCSA HDF4 to HDF5 Conversion Library (h3toH5 Library) [5] is a library of routines that read individual HDF objects or groups of objects from an HDF4 file and write equivalent HDF5 objects to an HDF5 file, using a default translation [6]. This library can be used by C applications to create a custom conversion for specific data products.
In this experiment, the heconvert utility was augmented using h3toH5 Library. In addition to the standard conversion of HDF-EOS objects, the experimental program identifies and converts non-EOS objects in an EOS dataset, creating a more complete HDFEOS5 file.
A sample of HDF-EOS files was selected from the EOS-SAMPLER CD [7] (Table 1). The heosls utility [2] was used to summarize the objects in each file (Table 1, left column). In each case, the file has one or more HDF-EOS objects (Grid, Swath, Point) and the corresponding StructMetadata.0. Each file also contains standard HDF4 objects (Vdata tables, SDS datasets, annotations). These latter objects are not managed by the HDF-EOS library. They are product-specific, created by the data processing software and written directly using the HDF4 library. Thus, these files are different examples of "hybrid" files, containing both HDF-EOS and regular HDF objects.
Case (link to heosls) | Original File From HDF-EOS SAMPLER | HDF4 size (bytes) |
ceres | CER_ES8_Terra-FM2_Test_SCF_016011.20000830.subset_70_20_-140_-40.20001012_204110Z.hdf | 76,196,301 |
aster | ASTL1B_0008301851.hdf | 124,464,518 |
mod02hk | MODIS/MOD02HKM.A2000242.0140.002.2000247230108.hdf | 275,064,875 |
mod04-242 | MODIS/MOD04_L2.A2000242.0140.002.2000264223516.hdf | 10,455,467 |
mod04-243 | MODIS/MOD04_L2.A2000243.1850.002.2000252164712.hdf | 10,455,471 |
mod05 | MODIS/MOD05_L2.A2000243.1850.002.2000252164414.hdf | 20,149,501 |
mod06 | MODIS/MOD06_L2.A2000243.1850.002.2000252173103.hdf | 69,590,548 |
mod35 | MODIS/MOD35_L2.A2000243.1850.002.2000244222700.hdf | 47,404,592 |
For example, the heosls utility [2] shows the contents of ceres.hdf (slightly edited for space):
FILE NAME: ./ceres.hdf
NCSA HDF Version 4.1 Release 2, March 1998
HDF-EOS Version: HDFEOS_V2.6
"CERES_ES8_subset"       SWATH
"StructMetadata.0"       Global Attribute
"coremetadata"           Global Attribute
"archivemetadata"        Global Attribute
"SubsetMetadata.0"       Global Attribute
VDATA "CERES_metadata"   (CERES)

This file contains one Swath and the corresponding StructMetadata.0. The HDF file also contains four important HDF4 objects: the three global attributes and the Vdata.
The other test files had different objects, but all had both EOS and non-EOS objects.
The experiment used the heconvert utility [4] to convert the sample data files from HDFEOS4 to HDFEOS5. The heconvert utility was augmented with calls to a prerelease of the NCSA h3toH5 Library [5], in order to additionally convert some or all of the non-EOS objects.
The software configuration is summarized in Table 2.
Component | Version |
heconvert | ? |
HDF-EOS (4) | V. 2.6 |
HDF4 | 4.1.R5 |
HDF-EOS 5 | 5.1 |
HDF5 | 5.1.4.2-patch1 |
libh3toh5 | 1.0 beta (pre release) |
Sun SPARC 5 | Solaris 2.7 |
NFS partition | 10Mbit/s network |
In the control condition, the heconvert utility was run on the test input file, with a command similar to:
heconvert -i ceres.hdf -o ceres-cont.he5

For the experimental condition, the source code of heconvert was modified to add a single subroutine that attempts to convert the regular HDF4 objects in the file into the HDFEOS5 file. The subroutine makes a series of calls to the h3toH5 Library [5]. These calls locate and read the objects of interest in the HDF4 file, and write equivalent objects to the HDF5 file. The experimental code handles file annotations, SDS global attributes, and "lone" images, Vdatas, and SDS datasets (objects not contained in a Vgroup).
A sketch of this code is given in Appendix 3. The experimental code is called when the '-hybrid' option is selected, e.g.:
heconvert -hybrid -i ceres.hdf -o ceres-exper.he5

Each conversion was repeated at least five times to estimate a best-case time. The output files were examined with the standard h5dump utility and other tools.
Data file | HDF4 size (bytes) | Control: HDF5 size (bytes) | Conversion time (mm:ss) | Exper: HDF5 size (bytes) | Conversion time (mm:ss) |
mod04-242 | 10,455,467 | 11,255,895 | 0:31 | 11,314,359 | 0:32 |
mod04-243 | 10,455,471 | 11,255,895 | 0:32 | 11,314,367 | 0:32 |
mod05 | 20,149,501 | 20,906,324 | 0:50 | 20,964,036 | 0:50 |
mod35 | 47,404,592 | 47,623,160 | 2:00 | 47,682,720 | 2:03 |
mod06 | 69,590,548 | 69,155,580 | 3:00 | 69,216,068 | 2:59 |
ceres | 76,196,301 | 77,127,264 | 2:51 | 77,183,172 | 2:13 |
aster | 124,464,518 | 123,872,432 | 5:40 | 123,982,208 | 5:39 |
mod02hk | 275,064,875 | 275,876,364 | 11:11 | 276,082,296 | 11:10 |
The conversion times varied because of system and network load. The times reported in Table 3 are the best observed times from at least five trials. As would be expected, these times are highly correlated with the size of the data, and show a conversion rate of about 700-800 KB/s across the different files. The network disk has a maximum theoretical speed of about 1100 KB/s, so the conversion appears to be substantially I/O bound, as might be expected.
The content of the output files was examined using the h5dump utility. As expected, in the control condition, the product specific objects were not copied by heconvert, and consequently they are missing from the output file.
For example, in the case of the ceres.hdf file, the control conversion created an output file with the Swath and the StructMetadata.0. The other objects are not present in the output. (Appendix 1.)
The other datasets were similar: the HDF-EOS objects are converted, but the product specific objects are not.
In the experimental condition, the output files have all the objects converted in the control condition, plus some or all of the other objects. The listings are linked from the file sizes in Table 3.
For example, for the ceres.hdf file, the HDF5 file contains the same HDF5 objects for the Swath and the HDF-EOS metadata as the control. In addition, the product-specific annotations are copied to the output file (as attributes of "/") and the Vdata is created (as a Compound Dataset under "/"). Appendix 2 shows a summary of the HDF5 file. The non-EOS HDF4 objects are highlighted.
In some cases, the demonstration program does not convert all of the HDF4 objects. For example, the aster.hdf dataset has several Vgroups with product-specific Vdatas and SDSs (e.g., the Vgroup "Ancillary_Data"). These objects were not converted, so the output HDF5 file is still incomplete.
The h3toH5 Library can convert Vgroups and their members. However, in order for a general-purpose program to identify which Vgroups are from HDF-EOS and which are not, it is necessary to check every Vgroup individually. This was not attempted for this experiment. We would expect that users would create product-specific conversion utilities, in which case the objects that need to be converted from the HDF4 file should be well understood for each product.
It should be noted that the h3toH5 Library performs a default conversion of the HDF4 objects, which may not be the desired result in all cases. It seems likely that when a data product is developed for HDF5, it will be designed to use HDF5 most effectively, which need not and should not be expected to conform to the default mapping in [6].
For example, the way HDFEOS5 stores the Grid, Swath, and Point objects in HDF5 is not the default conversion of the constituent HDF4 objects. Note, for example, that in HDFEOS4 the 'StructMetadata.0' is stored as a global annotation. In HDFEOS5, this is stored as a Dataset (which is definitely the best choice), rather than an attribute. This is an example of a case where the design for HDF5 should not follow the default mapping.
The non-HDF-EOS objects in datasets may well deserve non-default conversions as well. For example, in the ceres.hdf dataset, the HDF4 objects are created in a default location (under "/"), and the attributes have default names, such as "coremetadata_GLOSDS". For a realistic conversion, it is likely that these would be put in more appropriate locations in the output file, under the Group "/HDFEOS/ADDITIONAL" or some other place in the file. Thus, a product-specific converter should design the desired HDF5 file, and then create a custom conversion to implement it.
The h3toH5 Library API has optional parameters which can customize the conversion. For example, the group and name of the HDF5 object to be created can be specified. In this demonstration, the conversion used the default locations and names for the objects it created. For a product-specific conversion, parameters could be set to implement the desired layout of the HDF5 file. The h3toH5 Library cannot do all possible conversions; there will likely be cases where the conversion must be specifically designed for a dataset or project. For example, converting the annotation 'StructMetadata.0' to an HDF5 Dataset cannot be done by the h3toH5 Library.
It is important to point out that the h3toH5 Library can be mixed with other calls to HDF4 and HDF5. It would be possible to insert one or more objects from an HDF4 file (or from several different HDF4 files) into an HDF5 file along with other objects written through HDF5. Also, the converted objects can be modified after they are converted. For example, we have used the NCSA H5View [9] program to delete and rename attributes created by the conversion library that are not needed in HDF5.
In conclusion, this experiment shows that the h3toH5 Library provides a toolkit to more easily construct conversion utilities for NASA HDFEOS files. Specifically, we showed that the h3toH5 Library could be used to extend the heconvert utility to handle at least some hybrid files.
This toolkit might be used to create a standard converter for standardized data products that are defined in both HDF4 and HDF5. It might also be used for more ad hoc conversions, e.g., for a small science team that needs to convert HDF output from a program to be read using HDF5, or a future data service that needs to construct a value added data product based on data from both HDFEOS4 and HDFEOS5.
Overall, it is clear that it is feasible to convert HDFEOS4 files to HDFEOS5 when needed. For some purposes and users, it may be sufficient to continue to use current HDFEOS4 data and add future HDFEOS5 data when needed. In other cases, it may be necessary to migrate software to HDFEOS5, and to convert HDFEOS4 data into HDFEOS5. In this experiment, we have shown that both of these options are technically viable.
Other support provided by NCSA and other sponsors and agencies [10].
1. HDF
4. heconvert
6. Mapping HDF4 Objects to HDF5 Objects
7. "EOSDIS Terra Data Sampler #1", 2000
9. Java-HDF5
10. Acknowledgements
Output of h5dump -H, edited for space.
HDF5 "ceres-control.he5" {
GROUP "/" {
   GROUP "HDFEOS" {
      GROUP "ADDITIONAL" {
         GROUP "FILE_ATTRIBUTES" { }
      }
      GROUP "SWATHS" {
         GROUP "CERES_ES8_subset" {
            GROUP "Data Fields" {
               DATASET "CERES LW flux at TOA" {}
               DATASET "CERES LW unfiltered radiance" {}
               DATASET "CERES SW filtered radiance" {}
               DATASET "CERES SW flux at TOA" {}
               DATASET "CERES SW unfiltered radiance" {}
               DATASET "CERES TOT filtered radiance" {}
               DATASET "CERES WN filtered radiance" {}
               DATASET "CERES WN unfiltered radiance" {}
               DATASET "CERES relative azimuth at TOA" {}
               DATASET "CERES solar zenith at TOA" {}
               DATASET "CERES viewing zenith at TOA" {}
               DATASET "Colatitude of Sun at observation" {}
               DATASET "Colatitude of satellite nadir at record end" {}
               DATASET "Colatitude of satellite nadir at record start" {}
               DATASET "ERBE scene identification at observation" {}
               DATASET "Earth-Sun distance at record start" {}
               DATASET "Longitude of Sun at observation" {}
               DATASET "Longitude of satellite nadir at record end" {}
               DATASET "Longitude of satellite nadir at record start" {}
               DATASET "Rapid retrace flag words" {}
               DATASET "SW channel flag words" {}
               DATASET "Scanner FOV flag words" {}
               DATASET "TOT channel flag words" {}
               DATASET "WN channel flag words" {}
               DATASET "X component of satellite position at record end" {}
               DATASET "X component of satellite position at record start" {}
               DATASET "X component of satellite velocity at record end" {}
               DATASET "X component of satellite velocity at record start" {}
               DATASET "Y component of satellite position at record end" {}
               DATASET "Y component of satellite position at record start" {}
               DATASET "Y component of satellite velocity at record end" {}
               DATASET "Y component of satellite velocity at record start" {}
               DATASET "Z component of satellite position at record end" {}
               DATASET "Z component of satellite position at record start" {}
               DATASET "Z component of satellite velocity at record end" {}
               DATASET "Z component of satellite velocity at record start" {}
            }
            GROUP "Geolocation Fields" {
               DATASET "Colatitude of CERES FOV at TOA" {}
               DATASET "Longitude of CERES FOV at TOA" {}
               DATASET "Time of observation" {}
            }
            GROUP "Profile Fields" { }
         }
      }
   }
   GROUP "HDFEOS INFORMATION" {
      ATTRIBUTE "HDFEOSVersion" { }
      DATASET "StructMetadata.0" { }
   }
}
}
HDF5 "ceres-exper.he5" {
GROUP "/" {
   ATTRIBUTE "coremetadata_GLOSDS" { }
   ATTRIBUTE "archivemetadata_GLOSDS" { }
   ATTRIBUTE "SubsetMetadata.0_GLOSDS" { }
   DATASET "CERES_metadata" {
      DATATYPE H5T_COMPOUND {
         "SHORTNAME";
         "RANGEBEGINNINGDATE";
         "RANGEBEGINNINGTIME";
         "RANGEENDINGDATE";
         "RANGEENDINGTIME";
         "AUTOMATICQUALITYFLAG";
         "AUTOMATICQUALITYFLAGEXPLANATION";
         "ASSOCIATEDPLATFORMSHORTNAME";
         "ASSOCIATEDINSTRUMENTSHORTNAME";
         "LOCALGRANULEID";
         "LOCALVERSIONID";
         "CERPRODUCTIONDATETIME";
         "NUMBEROFRECORDS";
         "PRODUCTGENERATIONLOC";
      }
      ATTRIBUTE "HDF4_OBJECT_TYPE" { }
      ATTRIBUTE "HDF4_OBJECT_NAME" { }
      ATTRIBUTE "HDF4_REF_NUM" { }
   }
   GROUP "HDFEOS" {
      GROUP "ADDITIONAL" {
         GROUP "FILE_ATTRIBUTES" { }
      }
      GROUP "SWATHS" {
         GROUP "CERES_ES8_subset" {
            GROUP "Data Fields" {
               DATASET "CERES LW flux at TOA" {}
               DATASET "CERES LW unfiltered radiance" {}
               DATASET "CERES SW filtered radiance" {}
               DATASET "CERES SW flux at TOA" {}
               DATASET "CERES SW unfiltered radiance" {}
               DATASET "CERES TOT filtered radiance" {}
               DATASET "CERES WN filtered radiance" {}
               DATASET "CERES WN unfiltered radiance" {}
               DATASET "CERES relative azimuth at TOA" {}
               DATASET "CERES solar zenith at TOA" {}
               DATASET "CERES viewing zenith at TOA" {}
               DATASET "Colatitude of Sun at observation" {}
               DATASET "Colatitude of satellite nadir at record end" {}
               DATASET "Colatitude of satellite nadir at record start" {}
               DATASET "ERBE scene identification at observation" {}
               DATASET "Earth-Sun distance at record start" {}
               DATASET "Longitude of Sun at observation" {}
               DATASET "Longitude of satellite nadir at record end" {}
               DATASET "Longitude of satellite nadir at record start" {}
               DATASET "Rapid retrace flag words" {}
               DATASET "SW channel flag words" {}
               DATASET "Scanner FOV flag words" {}
               DATASET "TOT channel flag words" {}
               DATASET "WN channel flag words" {}
               DATASET "X component of satellite position at record end" {}
               DATASET "X component of satellite position at record start" {}
               DATASET "X component of satellite velocity at record end" {}
               DATASET "X component of satellite velocity at record start" {}
               DATASET "Y component of satellite position at record end" {}
               DATASET "Y component of satellite position at record start" {}
               DATASET "Y component of satellite velocity at record end" {}
               DATASET "Y component of satellite velocity at record start" {}
               DATASET "Z component of satellite position at record end" {}
               DATASET "Z component of satellite position at record start" {}
               DATASET "Z component of satellite velocity at record end" {}
               DATASET "Z component of satellite velocity at record start" {}
            }
            GROUP "Geolocation Fields" {
               DATASET "Colatitude of CERES FOV at TOA" {}
               DATASET "Longitude of CERES FOV at TOA" {}
               DATASET "Time of observation" {}
            }
            GROUP "Profile Fields" { }
         }
      }
   }
   GROUP "HDFEOS INFORMATION" {
      ATTRIBUTE "HDFEOSVersion" { }
      DATASET "StructMetadata.0" {}
   }
}
}
void get_the_rest(char * h3file, char *h5file);

int main (int argc, char *argv[])
{
   /* ... */
   status = DoSwathConversion(hdf4Info);
   /* ... */
   status = DoGridConversion(hdf4Info);
   /* ... */
   status = DoPointConversion(hdf4Info);

   /**************************************************************
    * All EOS objects have been recreated in the output file.
    * Now pick up at least some of the regular HDF objects,
    * i.e., not managed by HDF-EOS.
    **************************************************************/
   if (convertHybrid == CONVERT_TRUE) {
      get_the_rest(inNameGlobal, outNameGlobal);
      /* check for errors ... */
   }
   return 0;
}

/*
 * Demonstration of the use of libh3toh5
 *
 * This routine finds some of the HDF objects that are not managed
 * by the HDF-EOS library and does a default conversion.
 *
 * This is a generic routine; you would write a customized
 * version for specific data products.
 */
void get_the_rest(char * h3file, char *h5file)
{
   hid_t h5fid;
   hid_t h5gid;
   hid_t h5aid;
   hid_t h35id;
   herr_t res;
   int32 nvd;

   h35id = h3toh5open(h3file, h5file, h325_OPEN);

   /*
    * HDF-EOS doesn't use file annotations, so any we find
    * should be converted.
    */
   h3toh5annofil_alldescs(h35id);
   h3toh5annofil_alllabels(h35id);

   /*
    * HDF-EOS uses SDS Global attributes for 'StructMetadata.0'
    * and HDFEOSVersion.
    *
    * Other SDS global attributes have important product
    * metadata but are not managed by HDF-EOS.
    *
    * This section moves them all to default HDF5 attributes,
    * and then gets rid of the ones already moved by heconvert.
    */
   h3toh5_glosdsattr(h35id);   /* this moves everything */

   /* Find the duplicate HDFEOSVersion and StructMetadata.0 and
    * delete them from the output file. */
   h5fid = H5Fopen(h5file, H5F_ACC_RDWR, H5P_DEFAULT);
   if (h5fid >= 0) {
      h5gid = H5Gopen(h5fid, "/");
      res = H5Adelete(h5gid, "HDFEOSVersion_GLOSDS");
      res = H5Adelete(h5gid, "StructMetadata.0_GLOSDS");
      H5Gclose(h5gid);
      H5Fclose(h5fid);
   }

   /*
    * The file may have images, e.g. a browse image.
    * Find them and move to HDF5.
    */
   res = h3toh5allloneimage(h35id, "/", NULL, h325_ALLATTRS, h325_PAL);

   /*
    * The file may have Vdata tables.
    * Find them and move to HDF5.
    */
   nvd = h3toh5alllonevdata(h35id, "/", h325_ALLATTRS);

   /*
    * The file may have SDS datasets.
    * Find them and move to HDF5.
    */
   res = h3toh5alllonesds(h35id, "/", NULL, h325_DIMSCALE, h325_ALLATTRS);

   /*
    * ... there may be other objects as well, which need to be
    * located and converted.
    */
   h3toh5close(h35id);
}
For other build environments, such as Windows or Macintosh, equivalent include and library paths are needed.

#
# Need paths to everything!
#
# HDF4 and HDFEOS2.x
HDF=/usr/local/hdf
HE2=/usr/local/hdfeos
# HDF5 and HDFEOS5.x
HE5=/usr/local/hdfeos5
HDF5=/usr/local/hdf5
# The libh3toh5 installed
h35=/usr/local/h3toh5

CFLAGS = -I$(HDF5)/include -I$(HE2)/include -I$(HE5)/include -I$(HDF)/include -I$(h35)/include
LIBS = -L$(h35)/lib -L$(HDF5)/lib -L$(HDF)/lib -L$(HE2)/lib/sun5 -L$(HE5)/lib/sun5 -L$(h35LIB)/lib -lh3toh5 \
	-lhdfeos -lhe5_hdfeos -lhdf5 -lmfhdf -ldf -lz -ljpeg -lGctp -lnsl -lm -lsocket -lnsl

heconvert: convert.o
	$(CC) -o heconvert convert.o $(LIBS)

convert.o: convert.c