Mesh Oriented datABase  (version 5.5.1)
An array-based unstructured mesh library
c_h5mformat.h
1 /** \page h5mmain H5M File Format API
2  *
3  *\section Intro Introduction
4  *
5  * MOAB's native file format is based on the HDF5 file format.
6  * The most common file extension used for such files is .h5m.
7  * A .h5m file can be identified by the top-level \c tstt group
8  * in the HDF5 file.
9  *
10  * The API implemented by this library is a wrapper on top of the
11  * underlying HDF5 library. It provides the following features:
12  * - Enforces and hides MOAB's expected file layout
13  * - Provides a slightly higher-level API
14  * - Provides some backwards compatibility for file layout changes
15  *
16  *
17  *\section Overview H5M File Layout
18  *
19  * The H5M file format relies on the use of a unique entity ID space for
20  * all vertices, elements, and entity sets stored in the file. This
21  * ID space is defined by the application. IDs must be unique over all
22  * entity types (a vertex and an entity set may not have the same ID).
23  * The IDs must be positive (non-zero) integer values.
24  * There are no other requirements imposed by the format on the ID space.
25  *
26  * Elements, with the exception of polyhedra, are defined by a list of
27  * vertex IDs. Polyhedra are defined by a list of face IDs. Entity sets
28  * have a list of contained entity IDs, and lists of parent and child
29  * entity set IDs. The set contents may include any valid entity ID,
30  * including other sets. The parent and child lists are expected to
31  * contain only entity IDs corresponding to other entity sets. A zero
32  * entity ID may be used in some contexts (tag data with the mhdf_ENTITY_ID
33  * property) to indicate a 'null' value.
34  *
35  * Element types are defined by the combination of a topology identifier (e.g.
36  * hexahedral topology) and the number of nodes in the element.
37  *
38  *
39  *\section Root The tstt Group
40  *
41  * All file data is stored in the \c tstt group in the HDF5 root group.
42  * The \c tstt group may have an optional scalar integer attribute
43  * named \c max_id . This attribute, if present, should contain the
44  * value of the largest entity ID used internally to the file. It can
45  * be used to verify that the code reading the file is using an integer
46  * type of sufficient size to accommodate the entity IDs.
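 *
 * As a minimal sketch of such a check (the helper \c max_id_fits is
 * hypothetical and not part of the mhdf API), a reader that stores IDs in an
 * unsigned integer of a given bit width could compare that width against the
 * value read from \c max_id :
 *
```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper (not part of mhdf): returns 1 if every entity ID up to
// max_id can be represented in an unsigned integer type of id_bits bits.
int max_id_fits(uint64_t max_id, unsigned id_bits)
{
    assert(id_bits >= 1 && id_bits <= 64);
    if (id_bits == 64) return 1;  // avoid an undefined shift by 64 bits
    return max_id < ((uint64_t)1 << id_bits);
}
```
 *
 * A reader using 32-bit IDs would call \c max_id_fits(max_id, 32) and refuse
 * the file, or switch to a wider type, if the result is 0.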
47  *
48  * The \c tstt group contains four sub-groups, a datatype object, and a
49  * dataset object. The four sub-groups are: \c nodes, \c elements,
50  * \c sets, and \c tags. The dataset is named \c history .
51  *
52  * The \c elemtypes datatype is an enumeration of the element topologies
53  * used in the file. The element topologies understood by MOAB are:
54  * - \c Edge
55  * - \c Tri
56  * - \c Quad
57  * - \c Polygon
58  * - \c Tet
59  * - \c Pyramid
60  * - \c Prism
61  * - \c Knife
62  * - \c Hex
63  * - \c Polyhedron
64  *
65  *
66  *\section History The history DataSet
67  *
68  * The \c history DataSet is a list of variable-length strings with
69  * application-defined meaning.
70  *
71  *\section Nodes The nodes Group
72  *
73  *
74  * The \c nodes group contains a single DataSet named \c coordinates and an
75  * optional \c tags subgroup. The \c tags subgroup is described in the
76  * \ref Dense "section on dense tag storage".
77  *
78  * The \c coordinates
79  * DataSet contains the coordinates of all vertices in the mesh.
80  * The DataSet should contain floating point values and have dimensions
81  * \f$ n \times d \f$, where \c n is the number of vertices and \c d
82  * is the number of coordinate values for each vertex.
83  *
84  * The \c coordinates DataSet must have an integer attribute named \c start_id .
85  * The vertices are then defined to have IDs beginning with this value
86  * and increasing sequentially in the order that they are defined in the
87  * \c coordinates table.
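 *
 * As a minimal sketch of this numbering rule (the helper names below are
 * illustrative and not part of the mhdf API), the mapping between rows of the
 * \c coordinates table and vertex IDs could be written as:
 *
```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: ID of the vertex stored in row `row` of a coordinates
// table whose start_id attribute is `start_id`.
long vertex_id_of_row(long start_id, size_t row)
{
    assert(start_id > 0);  // entity IDs must be positive
    return start_id + (long)row;
}

// Inverse mapping: returns the row holding vertex `id`, or -1 if the ID lies
// outside a table of `num_rows` vertices.
long row_of_vertex_id(long start_id, size_t num_rows, long id)
{
    if (id < start_id || id >= start_id + (long)num_rows) return -1;
    return id - start_id;
}
```
 *
 * For example, with a \c start_id of 100, row 5 holds the vertex with ID 105.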
88  *
89  *
90  *\section Elements The elements Group
91  *
92  * The \c elements group contains an application-defined number of
93  * subgroups. Each subgroup defines one or more mesh elements that
94  * have the same topology and length of connectivity (number of nodes
95  * for any topology other than \c Polyhedron). The names of the subgroups
96  * are application defined. MOAB uses a combination of the element
97  * topology name and connectivity length (e.g. "Hex8").
98  *
99  * Each subgroup must have an attribute named \c element_type that
100  * contains one of the enumerated element topology values defined
101  * in the \c elemtypes datatype described in a \ref Root "previous section".
102  *
103  * Each subgroup contains a single DataSet named \c connectivity and an
104  * optional subgroup named \c tags. The \c tags subgroup is described in the
105  * \ref Dense "section on dense tag storage".
106  *
107  * The \c connectivity DataSet is an \f$ n \times m \f$ array of integer
108  * values. The DataSet contains one row for each of the \c n contained
109  * elements, where the connectivity of each element contains \c m IDs. For
110  * all element types supported by MOAB, with the exception of polyhedra,
111  * the element connectivity list is expected to contain only IDs
112  * corresponding to nodes.
113  *
114  * Each element \c connectivity DataSet must have an integer attribute
115  * named \c start_id . The elements defined in the connectivity table
116  * are defined to have IDs beginning with this value and increasing
117  * sequentially in the order that they are defined in the table.
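 *
 * The following sketch shows one way to locate the connectivity of an element
 * by ID once the table has been read into memory. It assumes the
 * \f$ n \times m \f$ table was read into a flat, row-major array; the function
 * name is illustrative and not part of the mhdf API.
 *
```c
#include <assert.h>
#include <stddef.h>

// Returns a pointer to the m connectivity IDs of element `id`, or NULL if the
// ID does not belong to this group. `table` holds n rows of m IDs each,
// stored row by row (assumed layout of the in-memory copy).
const long* element_connectivity(const long* table, size_t n, size_t m,
                                 long start_id, long id)
{
    assert(m > 0);
    if (id < start_id || id >= start_id + (long)n) return NULL;
    return table + (size_t)(id - start_id) * m;
}
```
 *
 * With two triangles stored from a \c start_id of 10, the element with ID 11
 * is the second row of the table.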
118  *
119  *
120  *\section Sets The sets Group
121  *
122  * The \c sets group contains the definitions of any entity sets stored
123  * in the file. It contains 1 to 4 DataSets and the optional \c tags
124  * subgroup. The \c contents, \c parents, and \c children data sets
125  * are one dimensional arrays containing the concatenation of the
126  * corresponding lists for all of the sets represented in the file.
127  *
128  * The \c lists DataSet is a \f$ n \times 4 \f$ table, having one
129  * row of four integer values for each set. The first three values
130  * for each set are the indices into the \c contents, \c children,
131  * and \c parents DataSets, respectively, at which the \em last value
132  * for the set is stored. The contents, child, and parent lists for
133  * sets are stored in the corresponding datasets in the same order as
134  * the sets are listed in the \c lists DataSet, such that the index of
135  * the first value in one of those tables is one greater than the
136  * corresponding end index in the \em previous row of the table. The
137  * number of content entries, parents, or children for a given set can
138  * be calculated as the difference between the corresponding end index
139  * entry for the current set and the same entry in the previous row
140  * of the table. If the first set in the \c lists DataSet had no parent
141  * sets, then the corresponding index in the third column of the table
142  * would be \c -1. If it had one parent, the index would be \c 0. If it
143  * had two parents, the index would be \c 1, as the first parent would be
144  * stored at position 0 of the \c parents DataSet and the second at position
145  * 1.
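 *
 * The end-index arithmetic described above can be sketched as follows. The
 * function and enumerator names are illustrative and not part of the mhdf
 * API, and the table is assumed to have been read into a flat array of four
 * values per set.
 *
```c
#include <assert.h>
#include <stddef.h>

// Column positions within one row of the set description table.
enum { COL_CONTENTS = 0, COL_CHILDREN = 1, COL_PARENTS = 2, COL_FLAGS = 3 };

// For set number `row` and one of the three index columns, computes the first
// index and the number of entries in the corresponding concatenated DataSet.
// `lists` holds four values per set, row after row.
void set_list_range(const long* lists, size_t row, int col,
                    long* first, long* count)
{
    assert(col >= COL_CONTENTS && col <= COL_PARENTS);
    // The row before the first set is treated as ending at index -1.
    long prev_end = (row == 0) ? -1 : lists[(row - 1) * 4 + col];
    *first = prev_end + 1;
    *count = lists[row * 4 + col] - prev_end;
}
```
 *
 * For a first set whose parent end index is 1, this yields a first index of 0
 * and a count of 2, matching the two-parent case described above.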
146  *
147  * The fourth column of the \c lists DataSet is a series of bit flags
148  * defining some properties of the sets. The four bit values currently
149  * defined are:
150  * - 0x1 owner
151  * - 0x2 unique
152  * - 0x4 ordered
153  * - 0x8 range compressed
154  *
155  * The fourth (most significant) bit indicates that, in the \c contents
156  * data set, the contents list for the corresponding set is stored
157  * using a simple range compression. Rather than storing the IDs of the
158  * contained entities individually, each ID \c i is followed by a count
159  * \c n indicating that the set contains the contiguous range of IDs
160  * \f$ [i, i+n-1] \f$.
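 *
 * A minimal sketch of decoding such a list is shown below. The function name
 * and the caller-supplied output buffer are assumptions made for illustration;
 * they are not part of the mhdf API.
 *
```c
#include <assert.h>
#include <stddef.h>

// Expands a range-compressed contents list, stored as (start ID, count)
// pairs, into individual IDs. Writes at most `capacity` IDs to `out` and
// returns the total number of IDs the pairs describe, so a caller can
// detect a buffer that was too small.
size_t expand_ranges(const long* pairs, size_t num_values,
                     long* out, size_t capacity)
{
    assert(num_values % 2 == 0);  // values come in (start, count) pairs
    size_t total = 0;
    for (size_t p = 0; p < num_values; p += 2) {
        long start = pairs[p];
        long count = pairs[p + 1];
        for (long k = 0; k < count; ++k) {
            if (total < capacity) out[total] = start + k;
            ++total;
        }
    }
    return total;
}
```
 *
 * The stored values 5, 3, 20, 1 therefore describe the four IDs 5, 6, 7
 * and 20.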
161  *
162  * The three least significant bits specify intended properties of the
163  * set and are unrelated to how the set data is stored in the file. These
164  * properties, described briefly from least significant bit to most
165  * significant are: contained entities should track set membership;
166  * the set should contain each entity only once (strict set); and
167  * that the order of the entries in the set should be preserved.
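 *
 * As an illustration, the flag column could be split into its storage bit and
 * its property bits as sketched below. The enumerator and function names are
 * hypothetical; they are not the identifiers declared by mhdf.
 *
```c
#include <assert.h>

// Bit values as listed for the fourth column; the names are illustrative.
enum {
    SET_FLAG_OWNER = 0x1,
    SET_FLAG_UNIQUE = 0x2,
    SET_FLAG_ORDERED = 0x4,
    SET_FLAG_RANGE_COMPRESSED = 0x8
};

// Returns only the bits that describe intended properties of the set.
unsigned set_property_bits(unsigned flags)
{
    unsigned props =
        flags & (SET_FLAG_OWNER | SET_FLAG_UNIQUE | SET_FLAG_ORDERED);
    // The storage-only bit must never leak into the property bits.
    assert((props & SET_FLAG_RANGE_COMPRESSED) == 0);
    return props;
}

// Returns 1 if the contents list of the set is stored as (ID, count) pairs.
int set_is_range_compressed(unsigned flags)
{
    return (flags & SET_FLAG_RANGE_COMPRESSED) != 0;
}
```
 *
 * A flag value of 0xA, for instance, describes a range-compressed set whose
 * only property bit is the 'unique' bit.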
168  *
169  * Similar to the \c nodes/coordinates and \c elements/.../connectivity
170  * DataSets, the \c lists DataSet must have an integer attribute
171  * named \c start_id . IDs are assigned to sets in the order that
172  * they occur in the \c lists table, beginning with the attribute value.
173  *
174  * The \c sets group may contain a subgroup named \c tags. The \c tags
175  * subgroup is described in the \ref Dense "section on dense tag storage".
176  *
177  *
178  * \section Tags The tags Group
179  *
180  * The \c tags group contains a sub-group for each tag defined
181  * in the file. These sub-groups contain the definition of the
182  * tag and may contain some or all of the tag values associated with
183  * entities in the file. However, it should be noted that tag values
184  * may also be stored in the "dense" format as described in the
185  * \ref Dense "section on dense tag storage".
186  *
187  * Each sub-group of the \c tags group contains the definition for
188  * a single tag. The name of each sub-group is the name of the
189  * corresponding tag. Non-printable characters, characters
190  * prohibited in group names in the HDF5 file format, and the
191  * backslash ('\') character are encoded
192  * in the name string by a backslash ('\') character followed by
193  * the ASCII value of the character expressed as a pair of hexadecimal
194  * digits. Thus the backslash character would be represented as \c \5C .
195  * Each tag group should also contain a comment which contains the
196  * unencoded tag name.
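 *
 * The encoding rule can be sketched as follows. This is not the routine used
 * by the library: the function name is hypothetical, and the sketch assumes
 * that the forward slash is the only printable character that needs escaping
 * because of HDF5 group naming, which may be too narrow for a real writer.
 *
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

// Encodes `name` into `out` as a group name. Non-printable bytes, the
// backslash, and '/' (assumed to be the only prohibited printable character)
// become a backslash followed by two uppercase hexadecimal digits.
// Returns the encoded length without the terminator, or -1 if `out` is
// too small.
long encode_tag_name(const char* name, char* out, size_t out_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;
    assert(out_size > 0);
    for (const unsigned char* p = (const unsigned char*)name; *p; ++p) {
        int escape = !isprint(*p) || *p == '\\' || *p == '/';
        size_t need = escape ? 3 : 1;
        // Keep one byte free for the terminating null character.
        if (n + need >= out_size) { out[0] = '\0'; return -1; }
        if (escape) {
            out[n++] = '\\';
            out[n++] = hex[*p >> 4];
            out[n++] = hex[*p & 0xF];
        } else {
            out[n++] = (char)*p;
        }
    }
    out[n] = '\0';
    return (long)n;
}
```
 *
 * Under these assumptions a tag name consisting of a single backslash is
 * stored as a three-character group name: a backslash followed by 5C.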
197  *
198  * The tag sub-group may have any or all of the following four attributes:
199  * \c default, \c global, \c is_handle, and \c variable_length.
200  * The \c default attribute, if present,
201  * must contain a single tag value that is to be considered the 'default'
202  * value of the tag. The \c global attribute, if present, must contain a
203  * single tag value that is the value of the tag as set on the mesh instance
204  * (MOAB terminology) or root set (ITAPS terminology). The presence of the
205  * \c is_handle attribute (the value, if any, is meaningless) indicates
206  * that the tag values are to be considered entity IDs. After reading the
207  * file, the reader should map any such tag values to whatever mechanism
208  * it uses to reference the corresponding entities read from the file.
209  * The presence of the \c variable_length attribute indicates that each
210  * tag value is a variable-length array. The reader should rely on the
211  * presence of this attribute rather than the presence of the \c var_indices
212  * DataSet discussed below because the file may contain the definition of
213  * a variable length tag without containing any values for that tag. In such
214  * a case, the \c var_indices DataSet will not be present.
215  *
216  * Each tag sub-group will contain a committed type object named \c type .
217  * This type must be the type instance used by the \c global and \c default
218  * attributes discussed above and any tag value data sets. For fixed-length
219  * tag data, the tag types understood by MOAB are:
220  * - opaque data
221  * - a single floating point value
222  * - a single integer value
223  * - a bit field
224  * - an array of floating point values
225  * - an array of integer values
226  * Any other data types will be treated as opaque data.
227  * For variable-length tag data, MOAB expects the \c type object to be
228  * one of:
229  * - opaque data
230  * - a single floating point value
231  * - a single integer value
232  *
233  * For fixed-length tags, the tag sub-group may contain 'sparse' formatted
234  * tag data, which is comprised of two data sets: \c id_list and \c values.
235  * Both data sets must be 1-dimensional arrays of the same length. The
236  * \c id_list data set contains a list of entity IDs and the \c values
237  * data set contains a list of corresponding tag values. The data stored in
238  * the \c values table must be of type \c type. Fixed-length tag values
239  * may also be stored in the "dense" format as described in the
240  * \ref Dense "section on dense tag storage". A mixture of both sparse-
241  * and dense-formatted tag values may be present for a single tag.
242  *
243  * For variable-length tags the tag values, if any, are always stored
244  * in the tag sub-group of the \c tags group and are represented by
245  * three one-dimensional data sets: \c id_list, \c var_indices, and \c values.
246  * Similar to the fixed-length sparse-formatted tag data, the \c id_list
247  * contains the IDs of the entities for which tag values are defined.
248  * The \c values dataset contains the concatenation of the tag values
249  * for each of the entities referenced by ID in the \c id_list table,
250  * in the order that the entities are referenced in the \c id_list table.
251  * The \c var_indices table contains an index into the \c values data set
252  * for each entity in \c id_list. The index indicates the position of
253  * the \em last tag value for the entity in \c values. The index of
254  * the first value is one greater than the
255  * corresponding end index for the \em previous entry in \c var_indices. The
256  * number of tag values for a given entity can
257  * be calculated as the difference between the corresponding end index
258  * entry for the current entity and the previous value in the \c var_indices
259  * dataset.
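 *
 * A minimal sketch of this calculation is given below. The function name is
 * illustrative, and the sketch assumes that the entry before the first one is
 * treated as ending at index -1, by analogy with the set description table.
 *
```c
#include <assert.h>
#include <stddef.h>

// For entry `i` of id_list, computes where its values start in the values
// DataSet and how many values it has, given the end indices in var_indices.
void var_tag_range(const long* var_indices, size_t i, long* first, long* count)
{
    // Assumption: the entry before the first one ends at index -1.
    long prev_end = (i == 0) ? -1 : var_indices[i - 1];
    assert(var_indices[i] >= prev_end);  // end indices never decrease
    *first = prev_end + 1;
    *count = var_indices[i] - prev_end;
}
```
 *
 * With end indices 2, 2, 5 the three entities have 3, 0, and 3 values, and
 * the values of the third entity start at index 3.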
260  *
261  *
262  * \section Dense The tags Sub-Groups
263  *
264  * Data for fixed-length tags may also be stored in the \c tags sub-group
265  * of the \c nodes, \c sets, and subgroups of the \c elements group.
266  * Values for a given tag are stored in a dataset within the \c tags sub-group
267  * that has the following properties:
268  * - The name must be the same as that of the tag definition in the main
269  * \c tags group
270  * - The type of the data set must be the committed type object stored
271  * as \c /tstt/tags/<tagname>/type .
272  * - The data set must have the same length as the data set in the
273  * parent group with the \c start_id attribute.
274  *
275  * If dense-formatted data is specified for any entity in the group, then
276  * it must be specified for every entity in the group. The table is
277  * expected to contain one value for each entity in the corresponding
278  * primary definition table (\c /tstt/nodes/coordinates ,
279  * \c /tstt/elements/<name>/connectivity , or \c /tstt/sets/list), in the
280  * same order as the entities in that primary definition table.
281  *
282  *
283  *\section mhdf_set mhdf Meshset data
284  *
285  * Meshset data is divided into three groups of data: the set-list/meta-information table,
286  * the set contents table, and the set children table. Each is written and read independently.
287  *
288  * The set list table contains one row for each set. Each row contains four values:
289  * {content list end index, child list end index, parent list end index, and flags}. The flags
290  * value is a collection of bits with
291  * values defined in \ref mhdf_set_flag . All the flags except \ref mhdf_SET_RANGE_BIT are
292  * saved properties of the mesh data and are not relevant to the actual file in any way. The
293  * \ref mhdf_SET_RANGE_BIT flag is a toggle for how the meshset contents (not children) are saved.
294  * It is an internal property of the file format and should not be passed on to the mesh database.
295  * The content list end index and child list end index are the indices of the last entry for the
296  * set in the contents and children tables respectively. In the case where a set has either no
297  * children or no contents, the last index should be the same as the last index of the previous
298  * set in the table, or -1 for the first set in the table. Thus the first index is always one
299  * greater than the last index of the previous set. If the first index, calculated as one greater
300  * than the last index of the previous set, is greater than the last index of the current set, then
301  * there are no values in the corresponding contents or children table for that set.
302  *
303  * The set contents table is a vector of integer global IDs that is the concatenation of the contents
304  * data for all of the mesh sets. The values are stored corresponding to the order of the sets
305  * in the set list table. Depending on the value of \ref mhdf_SET_RANGE_BIT in the flags field of
306  * the set list table, the contents for a specific set may be stored in one of two formats. If the
307  * flag is set, the contents list is a list of pairs where each pair is a starting global ID and a
308  * count. For each pair, the set contains the range of global IDs beginning at the start value.
309  * If the \ref mhdf_SET_RANGE_BIT flag is not set, the meshset contents are a simple list of global IDs.
310  *
311  * The meshset child table is a vector of integer global IDs. It is a concatenation of the child
312  * lists for all the mesh sets, in the order the sets occur in the meshset list table. The values
313  * are always simple lists. The child table may never contain ranges of IDs.
314  *
315  *
316  *\section mhdf_tag mhdf Tag data
317  *
318  * The data for each tag can be stored in two places/formats: sparse and/or
319  * dense. The data may be stored in both, but there should not be redundant
320  * values for the same entity.
321  *
322  * Dense tag data is stored as multiple tables of tag values, one for each
323  * element group. (Note: special \ref mhdf_ElemHandle values are available
324  * for accessing dense tag data on nodes or meshsets via the \ref mhdf_node_type_handle
325  * and \ref mhdf_set_type_handle functions.) Each dense tag table should contain
326  * the same number of entries as the element connectivity table. The tag values
327  * are associated with the corresponding element in the connectivity table.
328  *
329  * Sparse tag data is stored as a global table pair for each tag type. The first
330  * of the pair of tables is a list of Global IDs. The second is the corresponding
331  * tag value for each entity in the ID list.
332  */
333