Class hdf_file (o2scl_hdf)
-
class hdf_file
Store data in an O2scl-compatible HDF5 file.
See also the File I/O with HDF5 section of the O2scl User’s Guide.
The member functions which write or get data from an HDF file begin with either get or set. Where appropriate, the next character is either c for character, d for double, f for float, or i for int.
By default, vectors and matrices are written to HDF files in a chunked format, so their length can be changed later as necessary. The chunk size is chosen in def_chunk() to be the closest power of 10 to the current vector size.
All files not closed by the user are closed in the destructor, but the destructor does not automatically close groups.
- Idea for Future:
This class opens all files in R/W mode, which may cause I/O problems in file systems. This needs to be fixed by allowing the user to open a read-only file. (AWS: 3/16/18 I think this is fixed now.)
The HDF functions do not consistently choose between throwing O2scl exceptions and throwing HDF5 exceptions. Check and/or fix this.
Automatically close groups, e.g. by storing hid_t’s in a stack?
Rewrite the _arr_alloc() functions so that they return a shared_ptr?
Move the code from the ‘filelist’ acol command here into hdf_file.
Note
Currently, HDF I/O functions write data to HDF files assuming that int and float have 4 bytes, while size_t and double have 8 bytes. All output is done in little-endian format. While the get functions can read data with different sizes or in big-endian format, the set functions cannot currently write data this way.
Note
It does make sense to write a zero-length vector to an HDF file if the vector does not have a fixed size, in order to create a placeholder for future output. Thus the set_vec() functions allow zero-length vectors, and the set_arr() functions allow the size_t parameter to be zero, in which case the pointer parameter is ignored. The set_vec_fixed() and set_arr_fixed() functions do not allow this, and will throw an exception if sent a zero-length vector.
Warning
This class is still in development. Because of this, HDF5 files generated by this class may not be easily read by future versions. Later versions of O2scl may have stronger guarantees on backwards compatibility.
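The size and byte-order assumptions in the first note can be checked on the host before writing. The following is a minimal sketch and not part of hdf_file; the function names sizes_match_hdf_assumptions() and host_is_little_endian() are hypothetical helpers introduced only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns true when the in-memory sizes match what the
// note says the set functions assume (4-byte int and float, 8-byte size_t
// and double).
bool sizes_match_hdf_assumptions() {
  return sizeof(int) == 4 && sizeof(float) == 4 &&
         sizeof(std::size_t) == 8 && sizeof(double) == 8;
}

// Hypothetical helper: returns true on a little-endian host by inspecting
// the first byte in memory of a 32-bit integer whose value is 1.
bool host_is_little_endian() {
  const std::uint32_t one = 1;
  unsigned char first_byte = 0;
  std::memcpy(&first_byte, &one, 1);
  return first_byte == 1;
}
```

A caller might run both checks once at startup and warn if either returns false. What should happen in that case depends on the application.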
Mode values for iterate_parms
-
static const int ip_filelist = 1
-
static const int ip_name_from_type = 2
-
static const int ip_type_from_name = 3
-
static const int ip_type_from_pattern = 4
-
static const int ip_name_list_from_type = 5
-
static void type_process(iterate_parms &ip, int mode, size_t ndims, hsize_t dims[100], hsize_t max_dims[100], std::string base_type, std::string name)
Process a type for iterate_func().
-
static herr_t iterate_func(hid_t loc, const char *name, const H5L_info_t *inf, void *op_data)
HDF5 object iteration function.
-
static herr_t iterate_copy_func(hid_t loc, const char *name, const H5L_info_t *inf, void *op_data)
HDF5 object iteration function when copying.
Open and close files
-
int open(std::string fname, bool write_access = false, bool err_on_fail = true)
Open a file named fname.
If err_on_fail is true, this calls the error handler if opening the file fails (e.g. because the file does not exist). If err_on_fail is false and opening the file fails, nothing is done and the function returns the value o2scl::exc_efilenotfound. If the open succeeds, this function returns o2scl::success.
-
void open_or_create(std::string fname)
Open a file named fname, or create it if it doesn’t already exist.
-
void close()
Close the file.
Manipulate ids
-
hid_t get_file_id()
Get the current file id.
-
void set_current_id(hid_t cur)
Set the current working id.
-
hid_t get_current_id()
Retrieve the current working id.
Simple get functions
If the specified object is not found, the error handler will be called.
-
int getc(std::string name, char &c)
Get a character named name.
-
int getd(std::string name, double &d)
Get a double named name.
-
int getf(std::string name, float &f)
Get a float named name.
-
int geti(std::string name, int &i)
Get an integer named name.
-
int get_szt(std::string name, size_t &u)
Get an unsigned integer named name.
-
int gets(std::string name, std::string &s)
Get a string named name.
Note
Strings are stored as character arrays and thus retrieving a string from a file requires loading the information from the file into a character array, and then copying it to the string. This will be slow for very long strings.
-
int gets_var(std::string name, std::string &s)
Get a variable-length string named name.
-
int gets_fixed(std::string name, std::string &s)
Get a fixed-length string named name.
-
int gets_def_fixed(std::string name, std::string def, std::string &s)
Get a fixed-length string named name, with default value def.
Simple set functions
-
void setc(std::string name, char c)
Set a character named name to value c.
-
void setd(std::string name, double d)
Set a double named name to value d.
-
void setf(std::string name, float f)
Set a float named name to value f.
-
void seti(std::string name, int i)
Set an integer named name to value i.
-
void set_szt(std::string name, size_t u)
Set an unsigned integer named name to value u.
-
void sets(std::string name, std::string s)
Set a string named name to value s.
The string is stored in the HDF file as an extensible character array rather than a string.
-
void sets_fixed(std::string name, std::string s)
Set a fixed-length string named name to value s.
This function stores s as a fixed-length string in the HDF file. If a dataset named name is already present, then s must not be longer than the string length already specified in the HDF file.
Generic floating point I/O
-
template<class fp_t>
inline int setfp_copy(std::string name, fp_t &f)
Set a generic floating point named name to value f.
-
template<class vec_fp_t>
inline int setfp_vec_copy(std::string name, vec_fp_t &f)
Set a generic floating point vector named name to value f.
-
template<class fp_t>
inline int getfp_copy(std::string name, fp_t &f)
Get a generic floating point named name.
Warning
No checks are made to ensure that the stored precision matches the precision of the floating point which is used.
-
inline int getfp_copy(std::string name, long double &f)
Get a long double named name.
Warning
No checks are made to ensure that the stored precision matches the precision of the floating point which is used. Note that the precision of the long double type is also not platform-independent.
-
template<size_t N>
inline int getfp_copy(std::string name, boost::multiprecision::number<boost::multiprecision::cpp_dec_float<N>> &f)
Get a Boost multiprecision floating point named name (specialization for Boost multiprecision numbers).
Warning
No checks are made to ensure that the stored precision matches the precision of the floating point which is used.
-
template<class vec_fp_t>
inline int getfp_vec_copy(std::string name, vec_fp_t &f)
Get a generic floating point vector named name.
Warning
No checks are made to ensure that the stored precision matches the precision of the floating point which is used.
-
template<size_t N>
inline int getfp_vec_copy(std::string name, std::vector<boost::multiprecision::number<boost::multiprecision::cpp_dec_float<N>>> &f)
Get a generic floating point vector named name (specialization for Boost multiprecision numbers).
Warning
No checks are made to ensure that the stored precision matches the precision of the floating point which is used.
Group manipulation
-
hid_t open_group(hid_t init_id, std::string path)
Open a group relative to the location specified in init_id.
Note
In order to ensure that future objects are written to the newly-created group, the user must use set_current_id() using the newly-created group ID for the argument.
-
hid_t open_group(std::string path)
Open a group relative to the current location.
Note
In order to ensure that future objects are written to the newly-created group, the user must use set_current_id() using the newly-created group ID for the argument.
-
inline int close_group(hid_t group)
Close a previously created group.
Vector get functions
These functions automatically free any previously allocated memory in v and then allocate the proper space required to read the information from the HDF file.
-
int getd_vec(std::string name, std::vector<double> &v)
Get vector dataset and place data in v.
-
template<class vec_t>
inline int getd_vec_copy(std::string name, vec_t &v)
Get vector dataset and place data in v.
This works with any vector class which has a resize() method.
- Idea for Future:
This currently requires a copy, but there may be a way to write a new version which does not.
-
int geti_vec(std::string name, std::vector<int> &v)
Get vector dataset and place data in v.
-
template<class vec_int_t>
inline int geti_vec_copy(std::string name, vec_int_t &v)
Get vector dataset and place data in v.
- Idea for Future:
This currently requires a copy, but there may be a way to write a new version which does not.
-
int get_szt_vec(std::string name, std::vector<size_t> &v)
Get vector dataset and place data in v.
-
template<class vec_size_t>
inline int get_szt_vec_copy(std::string name, vec_size_t &v)
Get vector dataset and place data in v.
- Idea for Future:
This currently requires a copy, but there may be a way to write a new version which does not.
-
int gets_vec_copy(std::string name, std::vector<std::string> &s)
Get a vector of strings named name and store it in s.
-
int gets_vec_vec_copy(std::string name, std::vector<std::vector<std::string>> &s)
Get a vector of vectors of strings named name and store it in s.
-
int getd_vec_vec_copy(std::string name, std::vector<std::vector<double>> &s)
Get a vector of vectors of doubles named name and store it in s.
Vector set functions
These functions automatically write all of the vector elements to the HDF file, if necessary extending the data that is already present.
-
int setd_vec(std::string name, const std::vector<double> &v)
Set vector dataset named name with v.
-
template<class vec_t>
inline int setd_vec_copy(std::string name, const vec_t &v)
Set vector dataset named name with v.
This requires a copy before the vector is written to the file.
-
int seti_vec(std::string name, const std::vector<int> &v)
Set vector dataset named name with v.
-
template<class vec_int_t>
inline int seti_vec_copy(std::string name, vec_int_t &v)
Set vector dataset named name with v.
This requires a copy before the vector is written to the file.
-
int set_szt_vec(std::string name, const std::vector<size_t> &v)
Set vector dataset named name with v.
-
template<class vec_size_t>
inline int set_szt_vec_copy(std::string name, const vec_size_t &v)
Set vector dataset named name with v.
This requires a copy before the vector is written to the file.
-
int sets_vec_copy(std::string name, const std::vector<std::string> &s)
Set a vector of strings named name.
Developer note: String vectors are reformatted as a single character array, in order to allow each string to have a different length and to make each string extensible. The size of the vector s is stored as an integer named nw.
Warning
This function copies the data in the vector of strings to a new string before writing the data to the HDF5 file and thus may be less useful for larger vectors or vectors which contain longer strings.
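The developer note for sets_vec_copy() describes packing a vector of strings into one character array. A minimal sketch of that idea follows. The struct flat_strings, the functions flatten() and unflatten(), and the per-string length array are all assumptions made for illustration. The layout hdf_file actually writes may differ. Only the element count (called nw in the note) is confirmed by the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the flattened form: one character buffer plus
// the length of each original string. lengths.size() plays the role of the
// stored element count.
struct flat_strings {
  std::vector<char> chars;
  std::vector<std::size_t> lengths;
};

// Concatenate all strings into a single buffer, recording each length.
// This is the extra copy that the warning above refers to.
flat_strings flatten(const std::vector<std::string> &s) {
  flat_strings out;
  for (const std::string &str : s) {
    out.lengths.push_back(str.size());
    out.chars.insert(out.chars.end(), str.begin(), str.end());
  }
  return out;
}

// Rebuild the original vector of strings from the flattened form.
std::vector<std::string> unflatten(const flat_strings &in) {
  std::vector<std::string> out;
  std::size_t offset = 0;
  for (std::size_t len : in.lengths) {
    assert(offset + len <= in.chars.size());
    const auto first = in.chars.begin() + static_cast<std::ptrdiff_t>(offset);
    out.emplace_back(first, first + static_cast<std::ptrdiff_t>(len));
    offset += len;
  }
  return out;
}
```

Because every string is copied into the buffer once, the cost grows with the total number of characters, which is consistent with the warning about large vectors and long strings.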
-
int sets_vec_vec_copy(std::string name, const std::vector<std::vector<std::string>> &s)
Set a vector of vectors of strings named name (experimental).
Developer note: String vectors are reformatted as a single character array, in order to allow each string to have a different length and to make each string extensible. The size of the vector s is stored as an integer named nw.
Warning
This function copies the data in the vector of strings to a new string before writing the data to the HDF5 file and thus may be less useful for larger vectors or vectors which contain longer strings.
-
int setd_vec_vec_copy(std::string name, const std::vector<std::vector<double>> &vvd)
Set a vector of vectors named name.
Matrix get functions
These functions automatically free any previously allocated memory in m and then allocate the proper space required to read the information from the HDF file.
-
int geti_mat_copy(std::string name, ubmatrix_int &m)
Get matrix dataset and place data in m.
Matrix set functions
These functions automatically write all of the matrix elements to the HDF file, if necessary extending the data that is already present.
-
int seti_mat_copy(std::string name, const ubmatrix_int &m)
Set matrix dataset named name with m.
-
template<class arr2d_t>
inline int setd_arr2d_copy(std::string name, size_t r, size_t c, const arr2d_t &a2d)
Set a two-dimensional array dataset named name with a2d.
Tensor I/O functions
-
int getd_ten(std::string name, o2scl::tensor<double, std::vector<double>, std::vector<size_t>> &t)
Get a tensor of double-precision numbers from an HDF file.
This version does not require a full copy of the tensor.
-
int geti_ten(std::string name, o2scl::tensor<int, std::vector<int>, std::vector<size_t>> &t)
Get a tensor of integers from an HDF file.
This version does not require a full copy of the tensor.
-
int get_szt_ten(std::string name, o2scl::tensor<size_t, std::vector<size_t>, std::vector<size_t>> &t)
Get a tensor of size_t from an HDF file.
This version does not require a full copy of the tensor.
-
template<class vec_t, class vec_size_t>
inline int getd_ten_copy(std::string name, o2scl::tensor<double, vec_t, vec_size_t> &t)
Get a tensor of double-precision numbers from an HDF file.
This version requires a full copy of the tensor from the HDF5 file into the o2scl::tensor object.
-
template<class vec_t, class vec_size_t>
inline int geti_ten_copy(std::string name, o2scl::tensor<int, vec_t, vec_size_t> &t)
Get a tensor of integers from an HDF file.
This version requires a full copy of the tensor from the HDF5 file into the o2scl::tensor object.
-
int setd_ten(std::string name, const o2scl::tensor<double, std::vector<double>, std::vector<size_t>> &t)
Write a tensor of double-precision numbers to an HDF file.
You may overwrite a tensor already present in the HDF file only if it has the same rank. This version does not require a full copy of the tensor.
-
int seti_ten(std::string name, const o2scl::tensor<int, std::vector<int>, std::vector<size_t>> &t)
Write a tensor of integers to an HDF file.
You may overwrite a tensor already present in the HDF file only if it has the same rank. This version does not require a full copy of the tensor.
-
int set_szt_ten(std::string name, const o2scl::tensor<size_t, std::vector<size_t>, std::vector<size_t>> &t)
Write a tensor of size_t to an HDF file.
You may overwrite a tensor already present in the HDF file only if it has the same rank. This version does not require a full copy of the tensor.
-
template<class vec_t, class vec_size_t>
inline int setd_ten_copy(std::string name, const o2scl::tensor<double, std::vector<double>, std::vector<size_t>> &t)
Write a tensor of double-precision numbers to an HDF file.
You may overwrite a tensor already present in the HDF file only if it has the same rank. This version requires a full copy of the tensor from the o2scl::tensor object into the HDF5 file.
-
template<class vec_t, class vec_size_t>
inline int seti_ten_copy(std::string name, const o2scl::tensor<int, std::vector<int>, std::vector<size_t>> &t)
Write a tensor of integers to an HDF file.
You may overwrite a tensor already present in the HDF file only if it has the same rank. This version requires a full copy of the tensor from the o2scl::tensor object into the HDF5 file.
Array get functions
All of these functions assume that the pointer has been allocated beforehand and that its size matches the size of the array in the HDF file. If the specified object is not found, the error handler will be called.
-
int getc_arr(std::string name, size_t n, char *c)
Get a character array named name of size n.
Note
The pointer c must be allocated beforehand to hold n entries, and n must match the size of the array in the HDF file.
-
int getd_arr(std::string name, size_t n, double *d)
Get a double array named name of size n.
Note
The pointer d must be allocated beforehand to hold n entries, and n must match the size of the array in the HDF file.
-
int getd_arr_compr(std::string name, size_t n, double *d, int &compr)
Get a double array named name of size n and put the compression type in compr.
Note
The pointer d must be allocated beforehand to hold n entries, and n must match the size of the array in the HDF file.
-
int getf_arr(std::string name, size_t n, float *f)
Get a float array named name of size n.
Note
The pointer f must be allocated beforehand to hold n entries, and n must match the size of the array in the HDF file.
-
int geti_arr(std::string name, size_t n, int *i)
Get an integer array named name of size n.
Note
The pointer i must be allocated beforehand to hold n entries, and n must match the size of the array in the HDF file.
Array get functions with memory allocation
These functions allocate memory with new, which should be freed by the user with delete.
-
int getc_arr_alloc(std::string name, size_t &n, char *c)
Get a character array named name of size n.
-
int getd_arr_alloc(std::string name, size_t &n, double *d)
Get a double array named name of size n.
-
int getf_arr_alloc(std::string name, size_t &n, float *f)
Get a float array named name of size n.
-
int geti_arr_alloc(std::string name, size_t &n, int *i)
Get an integer array named name of size n.
Array set functions
-
int setc_arr(std::string name, size_t n, const char *c)
Set a character array named name of size n to value c.
-
int setd_arr(std::string name, size_t n, const double *d)
Set a double array named name of size n to value d.
-
int setf_arr(std::string name, size_t n, const float *f)
Set a float array named name of size n to value f.
-
int seti_arr(std::string name, size_t n, const int *i)
Set an integer array named name of size n to value i.
-
int set_szt_arr(std::string name, size_t n, const size_t *u)
Set a size_t array named name of size n to value u.
Fixed-length array set functions
If a dataset named name is already present, then the user-specified array must not be longer than the array already present in the HDF file.
-
int setc_arr_fixed(std::string name, size_t n, const char *c)
Set a character array named name of size n to value c.
-
int setd_arr_fixed(std::string name, size_t n, const double *c)
Set a double array named name of size n to value c.
-
int setf_arr_fixed(std::string name, size_t n, const float *f)
Set a float array named name of size n to value f.
-
int seti_arr_fixed(std::string name, size_t n, const int *i)
Set an integer array named name of size n to value i.
Get functions with default values
If the requested dataset is not found in the HDF file, the object is set to the specified default value and the error handler is not called.
-
int getc_def(std::string name, char def, char &c)
Get a character named name.
-
int getd_def(std::string name, double def, double &d)
Get a double named name.
-
int getf_def(std::string name, float def, float &f)
Get a float named name.
-
int geti_def(std::string name, int def, int &i)
Get an integer named name.
-
int get_szt_def(std::string name, size_t def, size_t &i)
Get a size_t named name.
-
int gets_def(std::string name, std::string def, std::string &s)
Get a string named name.
-
int gets_var_def(std::string name, std::string def, std::string &s)
Get a variable-length string named name.
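The fallback behaviour of the default-value getters can be illustrated without a real HDF file. The sketch below uses a std::map as a stand-in for the stored datasets. The alias double_store and the function getd_def_sketch() are hypothetical and are not part of hdf_file. The return value of 0 in both cases is an assumption, since the return values are not documented above.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the named doubles stored in a file.
using double_store = std::map<std::string, double>;

// Mirrors the described behaviour: copy the stored value into d when name
// exists, otherwise copy def. No error is raised in either case.
int getd_def_sketch(const double_store &store, const std::string &name,
                    double def, double &d) {
  const auto it = store.find(name);
  d = (it != store.end()) ? it->second : def;
  return 0;  // assumed; the real return values are not specified here
}
```

This pattern is convenient for optional settings, where a missing dataset is expected and should not trigger the error handler.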
Get functions with pre-allocated pointer
-
int getd_vec_prealloc(std::string name, size_t n, double *d)
Get a double array d pre-allocated to have size n.
-
int geti_vec_prealloc(std::string name, size_t n, int *i)
Get an integer array i pre-allocated to have size n.
-
int getd_mat_prealloc(std::string name, size_t n, size_t m, double *d)
Get a double matrix d pre-allocated to have size (n,m).
-
int geti_mat_prealloc(std::string name, size_t n, size_t m, int *i)
Get an integer matrix i pre-allocated to have size (n,m).
Find a group
-
int find_object_by_type(std::string type, std::string &name, bool use_regex = false, int verbose = 0)
Look in the file for an object of type type and, if found, set name to the associated object name.
This function returns 0 if an object of type type is found and o2scl::exc_enoprog if it fails.
-
int list_objects_by_type(std::string type, std::vector<std::string> &vs, bool use_regex = false, int verbose = 0)
Find all objects in the file of type type and store the names in vs.
This function returns 0 if an object of type type is found and o2scl::exc_enoprog if it fails.
-
int find_object_by_name(std::string name, std::string &type, bool use_regex = false, int verbose = 0)
Look in the file for an object with name name and, if found, set type to the associated type.
This function returns 0 if an object with name name is found and o2scl::exc_enoprog if it fails.
-
int find_object_by_pattern(std::string name, std::string &type, bool use_regex = false, int verbose = 0)
Look in the file for an object with a name which matches a regular expression.
If an object is found, type is set to the associated type. This function returns 0 if an object with a name matching name is found and o2scl::exc_enoprog if it fails.
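The pattern lookup can be pictured as a scan over (name, type) pairs. The sketch below is an illustration only. The alias object_list and the function find_by_pattern_sketch() are hypothetical, the type strings in the usage note are made up, and whole-name matching with std::regex_match is an assumption about how a pattern is applied. The real class finds objects by iterating over the HDF5 file.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the objects in a file: (name, type) pairs.
using object_list = std::vector<std::pair<std::string, std::string>>;

// Returns 0 and sets type when some name matches pattern in full,
// or a nonzero value when nothing matches (type is then left unchanged).
int find_by_pattern_sketch(const object_list &objects,
                           const std::string &pattern, std::string &type) {
  const std::regex re(pattern);
  for (const auto &obj : objects) {
    if (std::regex_match(obj.first, re)) {
      type = obj.second;
      return 0;
    }
  }
  return 1;  // placeholder for an error code such as o2scl::exc_enoprog
}
```

For example, with objects named table_a and vec_1, the pattern vec_[0-9]+ would select the second entry and report its type.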
-
void file_list(bool use_regex = false, int verbose = 0)
List datasets and objects in the top-level of the file.
Public Types
-
typedef boost::numeric::ublas::vector<double> ubvector
-
typedef boost::numeric::ublas::matrix<double> ubmatrix
-
typedef boost::numeric::ublas::vector<int> ubvector_int
-
typedef boost::numeric::ublas::matrix<int> ubmatrix_int
Public Functions
-
hdf_file()
-
virtual ~hdf_file()
-
inline bool has_write_access()
If true, then the file has read and write access.
Public Members
-
int compr_type
Compression type (support experimental)
-
size_t min_compr_size
Minimum size to compress by default.
Protected Functions
-
inline virtual hsize_t def_chunk(size_t n)
Default chunk size.
Choose the closest power of 10 which is greater than or equal to 10 and less than or equal to \( 10^6 \).
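The rule above can be sketched as follows. This is not the library's implementation. The function name chunk_size_sketch() is hypothetical, and "closest" is interpreted here as nearest on a logarithmic scale, which is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Sketch of a chunk-size rule: the power of 10 nearest to n (nearest on a
// log scale, which is an assumption), clamped to the range [10, 10^6].
std::size_t chunk_size_sketch(std::size_t n) {
  if (n < 10) return 10;  // also covers n == 0, where log10 is undefined
  double exponent = std::round(std::log10(static_cast<double>(n)));
  if (exponent > 6.0) exponent = 6.0;
  std::size_t chunk = 1;
  for (int k = 0; k < static_cast<int>(exponent); ++k) chunk *= 10;
  assert(chunk >= 10 && chunk <= 1000000);
  return chunk;
}
```

Under this interpretation a vector of 50 elements gets a chunk size of 100, while very small and very large vectors are clamped to 10 and 10^6 respectively.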
Protected Attributes
-
hid_t file
File ID.
-
bool file_open
True if a file has been opened.
-
hid_t current
Current file or group location.
-
bool write_access
If true, then the file has read and write access.
-
struct iterate_copy_parms
Parameters for iterate_copy_func()
-
struct iterate_parms
Parameters for iterate_func()
Public Members
-
std::string tname
Object name.
-
bool found
True if found.
-
std::string type
Object type.
-
int verbose
Verbose parameter.
-
int mode
Iteration mode, either ip_filelist, ip_name_from_type, ip_type_from_name, ip_type_from_pattern, or ip_name_list_from_type.
-
bool use_regex
If true, then use regex to match names.
-
std::vector<std::string> name_list
The list of names, used by list_objects_by_type()
-
std::string tname