:mod:`~grand.io` --- Utilities for reading or writing data
==========================================================

.. module:: grand.io

----

Overview
--------

The :mod:`grand.io` module provides utilities for reading Python objects from,
and writing them to, data files. The following types are supported: numpy
:class:`~numpy.array`, :class:`~astropy.coordinates.BaseRepresentation`,
:class:`~bytes`, :class:`~float`, :class:`~int`, astropy
:class:`~astropy.units.Quantity` and :class:`~str`. In addition, any
:class:`~list` or :class:`~tuple` of numpy :class:`~numpy.array` or astropy
:class:`~astropy.units.Quantity` objects is stored as a numeric table with
annotated columns.

Reading and writing coordinate frames is only supported for the :mod:`~grand`
specific :class:`~grand.ECEF` and :class:`~grand.LTP` frames. Note that the
coordinates data, if any, are not stored. If needed, they must be written
explicitly as a separate entry.

Inside a file, data are structured under :class:`~grand.io.DataNode` objects.
The :func:`~grand.io.open` function gives access to the root node of a file.
Conceptually, a :class:`~grand.io.DataNode` can be seen as a folder within a
file system, while the data would be files inside the folder.

.. note::
   A C compliant memory layout is used for storing the data, allowing for an
   efficient read back from C. Numeric values are annotated, e.g. with a unit,
   a column name or a metatype. Note also that Python objects are preserved
   when writing to and reading back from a data file, e.g. the object type is
   restored.

.. warning::
   The HDF5 format is currently used since it allows a hierarchical
   organization of data, has bindings for both C and Python, and provides
   automatic compression. Note however that several `issues`_ have been
   reported when using HDF5, e.g. concerning reliability and performance.
   Therefore, the underlying data format might change in the future, e.g. to a
   tar archive, which provides the same features.

Accessing data files
--------------------

Data files are accessed using the :func:`~grand.io.open` function. The
semantics are the same as for the Python :func:`~open` or C `fopen` functions.

.. autofunction:: grand.io.open

   Open a file and return the root :class:`~grand.io.DataNode` object. If the
   file cannot be opened, an :class:`~OSError` is raised. For example, the
   following creates a new data file using the root
   :class:`~grand.io.DataNode` as a closing context manager.

   >>> with io.open('data.hdf5', 'w') as root:
   ...     pass

   In order to read from (append to) a file use the `'r'` (`'a'`) mode.


Managing data nodes
-------------------

.. autoclass:: grand.io.DataNode

   Sub-nodes can be accessed by index, providing their path relative to this
   node. For example, the following gets a reference to the sub-node named
   *apples*.

   ..
      >>> root = io.open('data.hdf5', 'w')
      >>> node = root.branch('apples')

   >>> node = root['apples']

   ..
      >>> root.close()

   Note that an :class:`~IndexError` is raised if the sub-node does not exist.
   Use the :func:`~grand.io.DataNode.branch` method in order to create a new
   sub-node.

   The :func:`~grand.io.DataNode.read` and :func:`~grand.io.DataNode.write`
   methods read data from and write data to this node.

   .. autoproperty:: grand.io.DataNode.children

      Iterator over the sub-nodes inside this node. For example, the loop
      below iterates over all sub-nodes directly below the root one.

      ..
         >>> root = io.open('data.hdf5')

      >>> for node in root.children:
      ...     pass

      ..
         >>> root.close()

   .. autoproperty:: grand.io.DataNode.elements

      Iterator over the data elements inside the node. For example, the loop
      below iterates over all data elements in the root node.

      ..
         >>> root = io.open('data.hdf5')

      >>> for name, data in root.elements:
      ...     pass

      ..
         >>> root.close()

      .. note::
         The data are loaded from disk at each loop iteration. Use the
         :func:`~grand.io.DataNode.read` method instead if you only want to
         load a specific data element.

   .. autoproperty:: grand.io.DataNode.filename

      The name of the data file containing this :class:`~grand.io.DataNode`.

   .. autoproperty:: grand.io.DataNode.name

      The name of this :class:`~grand.io.DataNode`.

   .. autoproperty:: grand.io.DataNode.parent

      A reference to the parent :class:`~grand.io.DataNode` or `None`.

   .. autoproperty:: grand.io.DataNode.path

      The full path of this :class:`~grand.io.DataNode` w.r.t. the root node.

   .. automethod:: grand.io.DataNode.branch

      Get a reference to a sub-node.

      .. note::
         If the node does not exist it is created and initialised empty. Use
         an indexed access instead if you want to access only existing
         sub-nodes.

   .. automethod:: grand.io.DataNode.close

      Close the data file containing the current node.

      .. warning::
         Closing the data file disables all related nodes, parents and
         children, which might lead to unexpected results. Therefore it is
         strongly recommended to wrap all I/Os within a root node context
         (i.e. using a `with` statement as shown below) instead of explicitly
         calling the :func:`~grand.io.DataNode.close` method.

      >>> with io.open('data.hdf5') as root:
      ...     # Do all I/Os within this context
      ...     pass

      The data file is automatically closed when exiting the root node's
      context. Note that only the root node is a closing context. Contexts
      spawned from a sub-node do not close the data file.

   .. automethod:: grand.io.DataNode.read

      Read data from this node. The optional *dtype* argument specifies the
      data type to use for the values read. By default the native data type in
      the file is used.

      Multiple data elements can be read at once by providing multiple
      arguments. For example, the following reads two data elements from the
      root node.

      ..
         >>> root = io.open('data.hdf5', 'w')

      >>> root.write('frequency', 1 * u.Hz)
      >>> root.write('position', CartesianRepresentation(1, 2, 3, unit='m'))
      >>> frequency, position = root.read('frequency', 'position')

      ..
         >>> root.close()

   .. automethod:: grand.io.DataNode.write

      Write data to this node. The data type (*dtype*) can be explicitly
      specified as a `numpy data type`_. For example, the following writes an
      astropy :class:`~astropy.units.Quantity` as a 32-bit floating point
      number.

      ..
         >>> root = io.open('data.hdf5', 'w')

      >>> root.write('frequency', 1 * u.Hz, dtype='f')

      ..
         >>> root.close()

      Note that if *dtype* is omitted the native Python precision is used when
      writing the data to file.

      The *unit* keyword specifies the unit to use when writing an astropy
      :class:`~astropy.units.Quantity`. If omitted, the native unit is used.

      The *units* and *columns* keywords specify the units and names of the
      columns when writing a table, i.e. a :class:`~list` or :class:`~tuple`
      of numpy :class:`~numpy.array` or astropy
      :class:`~astropy.units.Quantity` objects.


Examples
--------

Serialising Python data
^^^^^^^^^^^^^^^^^^^^^^^

The following example shows how to write basic Python objects to a data file.

..
   >>> import numpy as np

>>> with io.open('data.hdf5', 'w') as root:
...     root.write('example_of_cstring', b'This is a C like string\x00')
...     root.write('example_of_str', 'This is a Python string')
...     root.write('example_of_number', 1)
...     root.write('example_of_array', np.array((1, 2, 3)))

.. note::
   Python :class:`~str` objects differ from C ones. In order to generate a C
   like string, a :class:`bytes` object must be used with an explicit null
   termination.

Conversely, the data can be read back as follows.

>>> with io.open('data.hdf5') as root:
...     cstring = root.read('example_of_cstring')
>>> python_string = cstring.decode()


Working with physical data
^^^^^^^^^^^^^^^^^^^^^^^^^^

The following example illustrates how to create a new data file and populate it
with some physical data organised under various branches.

>>> with io.open('data.hdf5', 'w') as root:
...     root.write('energy', 1E+18 * u.eV)
...
...     with root.branch('fields/a0') as a0:
...         r = CartesianRepresentation(0, 0, 0, unit='m')
...         a0.write('r', r, dtype='f') # Store the data with a specific format
...
...         E = CartesianRepresentation(
...             np.array((0, 0, 0)),
...             np.array((0, 1, 0)),
...             np.array((0, 0, 0)),
...             unit='uV/m'
...         )
...         a0.write('E', E, unit='V/m') # Store the data with a specific unit


.. _issues: https://cyrille.rossant.net/moving-away-hdf5
.. _numpy data type: https://numpy.org/devdocs/user/basics.types.html
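
The note on C like strings in *Serialising Python data* can be illustrated
without :mod:`grand.io`. The following is a minimal sketch using only the
Python standard library. The ``to_cstring`` and ``from_cstring`` helpers are
hypothetical names introduced here for illustration; they are not part of
:mod:`grand`. The sketch relies on the fact that a plain ``decode()`` keeps any
null byte present in the :class:`bytes` object, so the terminator is dropped
explicitly before decoding.

```python
def to_cstring(text: str, encoding: str = "utf-8") -> bytes:
    """Encode a Python string as null-terminated bytes (hypothetical helper)."""
    data = text.encode(encoding)
    if b"\x00" in data:
        # An embedded null byte would be read as the end of the string from C.
        raise ValueError("embedded null byte in string")
    return data + b"\x00"


def from_cstring(data: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes up to, but not including, the first null byte."""
    return data.split(b"\x00", 1)[0].decode(encoding)


# Round trip: the terminator is added on the way out and dropped on the way in.
cstring = to_cstring("This is a C like string")
python_string = from_cstring(cstring)
```

Whether such a conversion is needed depends on how the bytes are consumed. It
matters mostly when the same data element is also read from C code that expects
a terminator.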