BioHDF version 0.3 alpha
Scalable NGS Data Storage Based on HDF5
Represents NGS reads (FASTQ, etc. entries).
Data Structures | |
struct | bioh5g_read_data |
Read data container. | |
Typedefs | |
typedef struct _bioh5g_reads * | bioh5g_reads |
BioHDF reads collection handle. | |
typedef struct _bioh5g_reads_iterator * | bioh5g_reads_iterator |
BioHDF reads iterator handle. | |
typedef struct _bioh5g_reads_properties * | bioh5g_reads_properties |
BioHDF reads creation properties. | |
Enumerations | |
enum | bioh5g_reads_type { BASE_SPACE, COLOR_SPACE } |
Describes the data "space" of the reads (base, color, etc.). | |
enum | bioh5g_reads_format { FASTQ_FORMAT, FASTA_FORMAT } |
The text file format for read I/O. | |
Functions | |
BIOHDF_API biohdf_error | BIOH5Gcheck_reads_presence (biohdf_file file, const char *path, int *presence) |
Test if a reads collection exists. | |
BIOHDF_API biohdf_error | BIOH5Gcreate_reads_collection (biohdf_file file, bioh5g_reads_properties properties, const char *path, bioh5g_reads *reads) |
Create (and open) a new reads collection. | |
BIOHDF_API biohdf_error | BIOH5Gopen_reads_collection (biohdf_file file, const char *path, biohdf_open_mode mode, bioh5g_reads *reads) |
Open an existing reads collection. | |
BIOHDF_API biohdf_error | BIOH5Gclose_reads_collection (bioh5g_reads *reads) |
Close an open reads collection. | |
BIOHDF_API biohdf_error | BIOH5Gget_reads_count (const bioh5g_reads reads, int64_t *count) |
Get the number of stored reads in a collection. | |
BIOHDF_API biohdf_error | BIOH5Gcreate_reads_iterator (const bioh5g_reads reads, bioh5g_reads_iterator *iter) |
Create an iterator for a reads collection. | |
BIOHDF_API biohdf_error | BIOH5Gdestroy_reads_iterator (bioh5g_reads_iterator *iter) |
Destroy an iterator for a reads collection. | |
BIOHDF_API biohdf_error | BIOH5Gadd_read (const bioh5g_reads reads, const bioh5g_read_data *data) |
Add a read to a collection. | |
BIOHDF_API biohdf_error | BIOH5Gget_index_of_last_added_read (const bioh5g_reads reads, int64_t *index) |
Get the index of the last read that was added. | |
BIOHDF_API biohdf_error | BIOH5Gget_next_read (bioh5g_reads_iterator iter, int64_t *index, bioh5g_read_data **data) |
Get the next read from a reads collection. | |
BIOHDF_API biohdf_error | BIOH5Gget_read (const bioh5g_reads reads, int64_t index, bioh5g_read_data **data) |
Given a read index, get the read from a reads collection. | |
BIOHDF_API biohdf_error | BIOH5Gfree_read (bioh5g_read_data **data) |
Free read data that has been obtained from the library. | |
BIOHDF_API biohdf_error | BIOH5Gcreate_read_string (const bioh5g_read_data *read, bioh5g_reads_format format, char **read_string) |
Create a read string in a given format (FASTQ, etc.) | |
BIOHDF_API biohdf_error | BIOH5Gwrite_read_to_stream (const bioh5g_read_data *read, bioh5g_reads_format format, FILE *stream) |
Output a read to a stream in a given format. | |
Functions: Data Accessors | |
BIOHDF_API biohdf_error | BIOH5Gcreate_read_data (bioh5g_read_data **data) |
Create a read data container. | |
BIOHDF_API biohdf_error | BIOH5Gget_read_identifier (bioh5g_read_data *data, char **identifier) |
Get the read identifier. | |
BIOHDF_API biohdf_error | BIOH5Gset_read_identifier (bioh5g_read_data *data, char *identifier) |
Set the read identifier. | |
BIOHDF_API biohdf_error | BIOH5Gget_read_sequence (bioh5g_read_data *data, char **sequence) |
Get the read sequence. | |
BIOHDF_API biohdf_error | BIOH5Gset_read_sequence (bioh5g_read_data *data, char *sequence) |
Set the read sequence. | |
BIOHDF_API biohdf_error | BIOH5Gget_read_quality_values (bioh5g_read_data *data, char **quality_values) |
Get the read quality values. | |
BIOHDF_API biohdf_error | BIOH5Gset_read_quality_values (bioh5g_read_data *data, char *quality_values) |
Set the read quality values. | |
Functions: Collection Creation Properties | |
BIOHDF_API biohdf_error | BIOH5Gcreate_reads_properties (bioh5g_reads_properties *props) |
Create a reads properties container. | |
BIOHDF_API biohdf_error | BIOH5Gdestroy_reads_properties (bioh5g_reads_properties *props) |
Destroy a reads properties container. | |
BIOHDF_API biohdf_error | BIOH5Gset_reads_properties_reads_type (bioh5g_reads_properties props, bioh5g_reads_type reads_type) |
Set reads properties reads type. | |
BIOHDF_API biohdf_error | BIOH5Gset_reads_properties_chunk_size (bioh5g_reads_properties props, int64_t chunk_size) |
Set reads properties chunk size. | |
BIOHDF_API biohdf_error | BIOH5Gset_reads_properties_compression_level (bioh5g_reads_properties props, compression_level level) |
Set reads properties compression level. | |
BIOHDF_API biohdf_error | BIOH5Gset_reads_properties_sequences_scheme (bioh5g_reads_properties props, biohdf_string_storage_scheme scheme) |
Set reads properties sequences storage scheme. | |
BIOHDF_API biohdf_error | BIOH5Gset_reads_properties_identifiers_scheme (bioh5g_reads_properties props, biohdf_string_storage_scheme scheme) |
Set reads properties identifiers storage scheme. | |
BIOHDF_API biohdf_error | BIOH5Gset_reads_properties_sequences_length (bioh5g_reads_properties props, size_t length) |
Set reads properties sequences length. | |
BIOHDF_API biohdf_error | BIOH5Gset_reads_properties_identifiers_length (bioh5g_reads_properties props, size_t length) |
Set reads properties identifiers length. |
Represents NGS reads (FASTQ, etc. entries).
enum bioh5g_reads_format |
enum bioh5g_reads_type |
BIOHDF_API biohdf_error BIOH5Gadd_read (const bioh5g_reads reads, const bioh5g_read_data *data)
Add a read to a collection.
reads | A BioHDF reads handle |
data | A BioHDF read |
BIOHDF_API biohdf_error BIOH5Gcheck_reads_presence (biohdf_file file, const char *path, int *presence)
Test if a reads collection exists.
This function sets presence to TRUE if a collection of the same type (a reads collection) exists at the named location. If any other HDF5 or BioHDF object with that same name exists, presence is also set to TRUE and an error code is returned. The assumption is that Bad Code(tm) that does not check return values will then be more likely to attempt to open the object (and fail) than to create something (which may partially succeed, making a mess).
file | A BioHDF file handle | |
path | The BioHDF path to the collection | |
[out] | presence | TRUE if the collection exists, FALSE if it does not. |
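The contract described above can be modeled in a few lines. This is only a sketch of the decision logic, not BioHDF code: sketch_occupant, sketch_check_presence, sketch_safe_to_create and the SKETCH_* constants are hypothetical names invented for illustration, and the real error values are not shown on this page.

```c
#include <stddef.h>

/* Hypothetical model of what lives at a path; not a BioHDF type. */
typedef enum { SKETCH_NOTHING, SKETCH_READS, SKETCH_OTHER_OBJECT } sketch_occupant;

/* Hypothetical status codes; the real biohdf_error values are not assumed. */
enum { SKETCH_OK = 0, SKETCH_ERROR = -1 };

/* Mirrors the documented behavior: presence is TRUE whenever the name is
 * taken, and an error is returned as well if the occupant is not a reads
 * collection. */
int sketch_check_presence(sketch_occupant occupant, int *presence)
{
    if (presence == NULL)
        return SKETCH_ERROR;

    switch (occupant) {
    case SKETCH_NOTHING:
        *presence = 0;
        return SKETCH_OK;
    case SKETCH_READS:
        *presence = 1;
        return SKETCH_OK;
    default:
        /* The name is taken by something else: present AND an error. */
        *presence = 1;
        return SKETCH_ERROR;
    }
}

/* A careful caller creates only when the call succeeded and nothing exists. */
int sketch_safe_to_create(sketch_occupant occupant)
{
    int presence = 1; /* pessimistic default */

    if (sketch_check_presence(occupant, &presence) != SKETCH_OK)
        return 0;
    return presence == 0;
}
```

Even a caller that ignores the return code sees presence set to TRUE for a foreign object, so it is steered toward opening (which fails cleanly) rather than creating.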
BIOHDF_API biohdf_error BIOH5Gclose_reads_collection (bioh5g_reads *reads)
Close an open reads collection.
This function will set the collection handle to NULL after freeing it.
[in,out] | reads | A BioHDF reads handle |
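Taking the handle by address is what lets the close function reset it. The following is a minimal sketch of that pattern with hypothetical stand-in names (sketch_reads, sketch_open, sketch_close); it does not reproduce the BioHDF implementation, and returning an error for an already-NULL handle is an assumption of the sketch.

```c
#include <stdlib.h>

/* Hypothetical opaque handle, shaped like the typedefs on this page. */
struct sketch_reads_s { int is_open; };
typedef struct sketch_reads_s *sketch_reads;

/* Allocate a handle; returns 0 on success, -1 on failure. */
int sketch_open(sketch_reads *reads)
{
    if (reads == NULL)
        return -1;
    *reads = calloc(1, sizeof **reads);
    if (*reads == NULL)
        return -1;
    (*reads)->is_open = 1;
    return 0;
}

/* Free the handle and set the caller's copy to NULL, so a stale handle
 * cannot be used or freed twice by accident. */
int sketch_close(sketch_reads *reads)
{
    if (reads == NULL || *reads == NULL)
        return -1; /* assumption: closing a NULL handle is an error */
    free(*reads);
    *reads = NULL;
    return 0;
}
```

After a successful close the caller's variable compares equal to NULL, which makes "is this handle still open?" a cheap check in cleanup paths.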
BIOHDF_API biohdf_error BIOH5Gcreate_read_string (const bioh5g_read_data *read, bioh5g_reads_format format, char **read_string)
Create a read string in a given format (FASTQ, etc.)
The read sequence and identifier must not be NULL. If the format is FASTQ, the quality values must not be NULL. The sequence and quality values must also have the same length for FASTQ output.
read | The BioHDF read | |
format | The format for the output string | |
[out] | read_string | The output string |
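The preconditions above can be illustrated with a small standalone formatter. This is a sketch under stated assumptions, not the library's code: sketch_format and sketch_format_read are hypothetical names, the record layouts are the conventional four-line FASTQ and two-line FASTA forms, and the result is released with free() (how BioHDF expects read_string to be released is not stated on this page).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the format enum; not the BioHDF type. */
typedef enum { SKETCH_FASTQ, SKETCH_FASTA } sketch_format;

/* Build a newly allocated FASTQ or FASTA record.
 * Returns NULL if a precondition or the allocation fails.
 * The caller owns the result and releases it with free(). */
char *sketch_format_read(const char *id, const char *seq,
                         const char *qual, sketch_format fmt)
{
    int len;
    char *out;

    /* Identifier and sequence are always required. */
    if (id == NULL || seq == NULL)
        return NULL;

    /* FASTQ also needs quality values of the same length as the sequence. */
    if (fmt == SKETCH_FASTQ &&
        (qual == NULL || strlen(qual) != strlen(seq)))
        return NULL;

    /* First pass: measure the formatted length without writing. */
    if (fmt == SKETCH_FASTQ)
        len = snprintf(NULL, 0, "@%s\n%s\n+\n%s\n", id, seq, qual);
    else
        len = snprintf(NULL, 0, ">%s\n%s\n", id, seq);
    if (len < 0)
        return NULL;

    out = malloc((size_t)len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: write the record into the buffer. */
    if (fmt == SKETCH_FASTQ)
        snprintf(out, (size_t)len + 1, "@%s\n%s\n+\n%s\n", id, seq, qual);
    else
        snprintf(out, (size_t)len + 1, ">%s\n%s\n", id, seq);
    return out;
}
```

Quality values are ignored for FASTA output in this sketch, which is why a NULL quality string is accepted there.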
BIOHDF_API biohdf_error BIOH5Gcreate_reads_collection (biohdf_file file, bioh5g_reads_properties properties, const char *path, bioh5g_reads *reads)
Create (and open) a new reads collection.
The collection handle returned by this function will be ready to accept I/O.
file | A BioHDF file handle | |
properties | Collection creation properties | |
path | The BioHDF path to the collection | |
[out] | reads | A BioHDF reads handle |
BIOHDF_API biohdf_error BIOH5Gcreate_reads_iterator (const bioh5g_reads reads, bioh5g_reads_iterator *iter)
Create an iterator for a reads collection.
reads | A BioHDF reads handle | |
[out] | iter | An iterator for a reads collection |
BIOHDF_API biohdf_error BIOH5Gdestroy_reads_iterator (bioh5g_reads_iterator *iter)
Destroy an iterator for a reads collection.
The iterator is set to NULL as a part of deletion.
[in,out] | iter | An iterator for a reads collection |
BIOHDF_API biohdf_error BIOH5Gfree_read (bioh5g_read_data **data)
Free read data that has been obtained from the library.
The data is set to NULL as a part of deletion.
[in,out] | data | The BioHDF read |
BIOHDF_API biohdf_error BIOH5Gget_index_of_last_added_read (const bioh5g_reads reads, int64_t *index)
Get the index of the last read that was added.
Useful for adding SAM lines where a link to a given read must be created.
reads | A BioHDF reads handle | |
[out] | index | The index of the last read that was added |
BIOHDF_API biohdf_error BIOH5Gget_next_read (bioh5g_reads_iterator iter, int64_t *index, bioh5g_read_data **data)
Get the next read from a reads collection.
When the iterator has finished traversing the collection, data will be NULL, index will be -1 and the return value will be BIOHDF_NO_ERROR.
iter | An iterator for a reads collection | |
[out] | index | The index of this read |
[out] | data | The BioHDF read |
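Because the end of the collection is reported as success, a loop has to test the data pointer as well as the return code. The sketch below models that convention with a hypothetical in-memory iterator (sketch_iter, sketch_next, sketch_count_all); none of these names belong to BioHDF, and the value 0 standing in for BIOHDF_NO_ERROR is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical iterator over an array of strings; not a BioHDF type. */
typedef struct {
    const char **items;
    int64_t count;
    int64_t next;
} sketch_iter;

/* Returns 0 both when an item is delivered and at end of data.
 * End of data is signalled by *item == NULL and *index == -1. */
int sketch_next(sketch_iter *it, int64_t *index, const char **item)
{
    if (it == NULL || index == NULL || item == NULL)
        return -1;

    if (it->next >= it->count) {
        *index = -1;
        *item = NULL;
        return 0; /* finished, but not an error */
    }

    *index = it->next;
    *item = it->items[it->next];
    it->next++;
    return 0;
}

/* Caller loop: stop on an error OR on the NULL end-of-data sentinel. */
int64_t sketch_count_all(sketch_iter *it)
{
    int64_t index;
    int64_t seen = 0;
    const char *item;

    for (;;) {
        if (sketch_next(it, &index, &item) != 0)
            return -1;
        if (item == NULL)
            break;
        seen++;
    }
    return seen;
}
```

With the real API, each read returned inside such a loop would presumably also be released with BIOH5Gfree_read before the next call, since that function frees read data obtained from the library.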
BIOHDF_API biohdf_error BIOH5Gget_read (const bioh5g_reads reads, int64_t index, bioh5g_read_data **data)
Given a read index, get the read from a reads collection.
reads | A BioHDF reads handle | |
index | The index of this read | |
[out] | data | The BioHDF read |
BIOHDF_API biohdf_error BIOH5Gget_reads_count (const bioh5g_reads reads, int64_t *count)
Get the number of stored reads in a collection.
reads | A BioHDF reads handle | |
[out] | count | The number of reads in the collection |
BIOHDF_API biohdf_error BIOH5Gopen_reads_collection (biohdf_file file, const char *path, biohdf_open_mode mode, bioh5g_reads *reads)
Open an existing reads collection.
file | A BioHDF file handle | |
path | The BioHDF path to the collection | |
mode | The access mode (read-only or read-write) | |
[out] | reads | A BioHDF reads handle |
BIOHDF_API biohdf_error BIOH5Gwrite_read_to_stream (const bioh5g_read_data *read, bioh5g_reads_format format, FILE *stream)
Output a read to a stream in a given format.
This saves you from having to create a temporary string whose only purpose is to be written to a stream.
The read sequence and identifier must not be NULL. If the format is FASTQ, the quality values must not be NULL. The sequence and quality values must also have the same length for FASTQ output.
read | The BioHDF read |
format | The format for the output string |
stream | The output stream (can be stdout) |