HL-HDF
Python Interface - PyHL

Like the HL-HDF library, PyHL allows the user to work with HDF5 at a high level.

PyHL is designed to work at the highest level of abstraction, using the Python programming language to interact directly with HDF5 files. In fact, PyHL is little more than a wrapper around HL-HDF, with some additional functionality that is only available in very high-level languages such as Python. As with HL-HDF, it is up to the user to define appropriate ways of representing data and to use the building blocks available in PyHL to store that data in HDF5.

(PyHL is pronounced "pile", which is an appropriate description of a hierarchy ...)

Compilation and installation

The Python programming language, version 2.5.2, is required along with the Numpy package. Python can be found at http://www.python.org/ and Numpy at http://numpy.scipy.org/.

Create module _pyhl

If the configure script was not called with --with-python=no, the _pyhl module is compiled together with the rest of the code. If the configure script was called with --with-python=no, the best option is to rebuild the whole HL-HDF package (with --with-python=yes) and install it as described in the Compilation and Installation sections.

NOTE: Python version 2.5.2 is required to compile _pyhl; otherwise there will be unresolved symbols. Also, be aware that the hdf5 library is linked dynamically, which requires that LD_LIBRARY_PATH contains the path to where libhdf5.so has been installed.

Examples

The creation of HDF5 files with PyHL is quite easy, and there are not too many things one has to know about the HDF5 internals. However, in order to build an HDF5 file, one has to understand that the file should be built sequentially, i.e. it is not possible to create a subgroup of a group before the group itself has been created. Neither is it possible to create an attribute or a dataset in a group before the group has been created. In other words, always create the top nodes before trying to create nodes under them in the hierarchy.
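This top-down ordering can be made explicit in code. The helper below is not part of PyHL; it is a small illustrative sketch showing one way to sort node paths so that every group precedes the nodes created inside it:

```python
# Hypothetical helper (not part of PyHL): order node paths so that every
# group comes before any node that will be created inside it, matching
# the top-down build order that PyHL requires.
def build_order(paths):
    # Sorting by depth (number of "/" separators) guarantees that
    # "/info" is handled before "/info/xscale", and so on.
    return sorted(paths, key=lambda p: (p.count("/"), p))

print(build_order(["/info/xscale", "/data", "/info"]))
# -> ['/data', '/info', '/info/xscale']
```

Adding nodes to a nodelist in this order ensures that no node is created before its parent group exists.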

Another thing to bear in mind is that when the addNode method has been called, the nodelist instance takes ownership of the node, so it is not possible to alter the node after the call to addNode has been made.
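The ownership rule can be sketched in plain Python. The classes below are not PyHL's implementation, only a conceptual model of the semantics described above: once a node has been added to the list, further modification is refused.

```python
# Conceptual sketch (plain Python, not PyHL) of the ownership transfer:
# after add_node, the node belongs to the list and rejects changes.
class Node(object):
    def __init__(self, name):
        self.name = name
        self._owned = False

    def set_value(self, value):
        if self._owned:
            raise RuntimeError("node is owned by a nodelist; cannot be altered")
        self.value = value

class NodeList(object):
    def __init__(self):
        self.nodes = []

    def add_node(self, node):
        node._owned = True   # the list takes ownership of the node
        self.nodes.append(node)

n = Node("/info/xscale")
n.set_value(10.0)          # allowed: the node is still free
nl = NodeList()
nl.add_node(n)
try:
    n.set_value(20.0)      # rejected: the list now owns the node
except RuntimeError:
    print("altering after add_node is rejected")
```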

When working with compound types, remember that the data passed to setScalarValue and setArrayValue must be a Python string. Also, when working with compound types, the itemSize and lhid arguments must be provided; otherwise the compound data will most likely be corrupted.
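To illustrate what such a string looks like, a compound value can be packed into a binary string with the standard struct module. The field layout below (two ints followed by two doubles) mirrors the xsize/ysize/xscale/yscale fields of the rave_info_type example used later in this section, but it is only an assumption about how a real compound module lays out its data:

```python
import struct

# Assumed layout: two ints followed by two doubles, native byte order.
# A real compound module (such as rave_info_type) may pack differently.
fmt = "iidd"
item_size = struct.calcsize(fmt)   # what would be passed as itemSize

# This binary string is the kind of value setScalarValue expects
# for compound data.
data = struct.pack(fmt, 10, 10, 150.0, 150.0)

print(struct.unpack(fmt, data))    # round-trips to (10, 10, 150.0, 150.0)
```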

Another thing to be aware of when working with compound types is that the hdf5 library has to be linked dynamically; otherwise it will not be possible to pass the hid_t references between the Python modules.

Time to look at some simple examples. These examples can be found in HLHDF/examples. In order to run them, you need to set LD_LIBRARY_PATH to point at the HDF5 shared library, and PYTHONPATH needs to point at both the test directory where rave_info_type is defined and the pyhl directory.

prompt$ export LD_LIBRARY_PATH=<path to hdf5 library>
prompt$ export PYTHONPATH=<path to HLHDF/pyhl>:<path to HLHDF/test/python>

Then run the different examples by typing:

prompt$ python <TestFileName.py>

Writing a simple HDF5 file

###########################################################################
# Copyright (C) 2009 Swedish Meteorological and Hydrological Institute, SMHI,
#
# This file is part of HLHDF.
#
# HLHDF is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# HLHDF is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with HLHDF.  If not, see <http://www.gnu.org/licenses/>.
###########################################################################

## @package examples
# @file WriteSimpleFile.py
#
# Example that writes a simple hdf5 file
import _pyhl
from numpy import *

def writeFile():
  # Create an empty node list instance
  aList = _pyhl.nodelist()

  # Create a group called info
  aNode = _pyhl.node(_pyhl.GROUP_ID,"/info")

  # Add the node to the nodelist
  # Remember that the nodelist takes responsibility
  aList.addNode(aNode)

  # Insert the attribute xscale in the group "/info"
  aNode = _pyhl.node(_pyhl.ATTRIBUTE_ID,"/info/xscale")

  # Set the value to a double with value 10.0
  # Note the -1's that are used since the data is not compound
  aNode.setScalarValue(-1,10.0,"double",-1)
  aList.addNode(aNode)

  # Similar for yscale,xsize and ysize
  aNode = _pyhl.node(_pyhl.ATTRIBUTE_ID,"/info/yscale")
  aNode.setScalarValue(-1,20.0,"double",-1)
  aList.addNode(aNode)
  aNode = _pyhl.node(_pyhl.ATTRIBUTE_ID,"/info/xsize")
  aNode.setScalarValue(-1,10,"int",-1)
  aList.addNode(aNode)
  aNode = _pyhl.node(_pyhl.ATTRIBUTE_ID,"/info/ysize")
  aNode.setScalarValue(-1,10,"int",-1)
  aList.addNode(aNode)

  # Add a description
  aNode = _pyhl.node(_pyhl.ATTRIBUTE_ID,"/info/description")
  aNode.setScalarValue(-1,"This is a simple example","string",-1)
  aList.addNode(aNode)

  # Add an array of data
  myArray = arange(100)
  myArray = array(myArray.astype('i'),'i')
  myArray = reshape(myArray,(10,10))
  aNode = _pyhl.node(_pyhl.DATASET_ID,"/data")

  # Set the data as an array; note the list [10,10], which
  # indicates that it is an array of 10x10 items
  aNode.setArrayValue(-1,[10,10],myArray,"int",-1)
  aList.addNode(aNode)

  # And now just write the file as "simple_test.hdf" with
  # Compression level 9 (highest compression)
  aList.write("simple_test.hdf",9)

if __name__ == "__main__":
  writeFile()

When checking this file with h5dump, the command syntax would be:

prompt$ h5dump simple_test.hdf

and the result would be:

HDF5 "simple_test.hdf" {
GROUP "/" {
   DATASET "data" {
      DATATYPE { H5T_STD_I32LE }
      DATASPACE { SIMPLE ( 10, 10 ) / ( 10, 10 ) }
      DATA {
         0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
         10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
         20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
         30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
         40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
         50, 51, 52, 53, 54, 55, 56, 57, 58, 59,
         60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
         70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
         80, 81, 82, 83, 84, 85, 86, 87, 88, 89,
         90, 91, 92, 93, 94, 95, 96, 97, 98, 99
      }
   }
   GROUP "info" {
      ATTRIBUTE "xscale" {
         DATATYPE { H5T_IEEE_F64LE }
         DATASPACE { SCALAR }
         DATA {
            10
         }
      }
      ATTRIBUTE "yscale" {
         DATATYPE { H5T_IEEE_F64LE }
         DATASPACE { SCALAR }
         DATA {
            20
         }
      }
      ATTRIBUTE "xsize" {
         DATATYPE { H5T_STD_I32LE }
         DATASPACE { SCALAR }
         DATA {
            10
         }
      }
      ATTRIBUTE "ysize" {
         DATATYPE { H5T_STD_I32LE }
         DATASPACE { SCALAR }
         DATA {
            10
         }
      }
      ATTRIBUTE "description" {
         DATATYPE {
            { STRSIZE 25;
              STRPAD H5T_STR_NULLTERM;
              CSET H5T_CSET_ASCII;
              CTYPE H5T_C_S1;
            }
         }
         DATASPACE { SCALAR }
         DATA {
            "This is a simple example"
         }
      }
   }
}
}

Writing an HDF5 file containing a compound datatype

This is a bit more complex, since it requires the implementation of a Python C module that contains the datatype definition and a couple of methods for converting data to a string and back again.

There is a small example located in the test/python directory, called rave_info_type.c, which implements a small compound type definition. Basically, this module defines an object containing xscale, yscale, xsize and ysize variables. The module also provides a type class which should be used.

###########################################################################
# Copyright (C) 2009 Swedish Meteorological and Hydrological Institute, SMHI,
#
# This file is part of HLHDF.
#
# HLHDF is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# HLHDF is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with HLHDF.  If not, see <http://www.gnu.org/licenses/>.
###########################################################################

## @package examples
# @file WriteCompoundType.py
#
# Example that writes a compound datatype.
import _pyhl
import _rave_info_type

## Function that writes the actual file
def writeFile():
  # Create the rave info HDF5 type
  typedef = _rave_info_type.type()

  # Create the rave info HDF5 object
  obj = _rave_info_type.object()

  # Set the values
  obj.xsize=10
  obj.ysize=10
  obj.xscale=150.0
  obj.yscale=150.0

  aList = _pyhl.nodelist()

  # Create a datatype node
  aNode = _pyhl.node(_pyhl.TYPE_ID,"/MyDatatype")

  # Make the datatype named
  aNode.commit(typedef.hid())
  aList.addNode(aNode)

  # Create an attribute containing the compound type
  aNode = _pyhl.node(_pyhl.ATTRIBUTE_ID,"/myCompoundAttribute")
  
  # Note that I use both itemSize and lhid
  # Also note how I translate the compound object to a string
  aNode.setScalarValue(typedef.size(),obj.tostring(),"compound",typedef.hid())
  aList.addNode(aNode)

  # Also create a dataset with the compound type
  obj.xsize=1
  obj.ysize=1
  aNode = _pyhl.node(_pyhl.DATASET_ID,"/myCompoundDataset")

  # I use setArrayValue instead
  aNode.setArrayValue(typedef.size(),[1],obj.tostring(),"compound",typedef.hid())
  aList.addNode(aNode)

  # And finally write the HDF5 file.
  aList.write("compound_test.hdf")

if __name__== "__main__":
  writeFile()

When checking this file with h5dump, the command syntax would be:

prompt$ h5dump compound_test.hdf

and the result would be:

HDF5 "compound_test.hdf" {
GROUP "/" {
   ATTRIBUTE "myCompoundAttribute" {
      DATATYPE {
         H5T_STD_I32LE "xsize";
         H5T_STD_I32LE "ysize";
         H5T_IEEE_F64LE "xscale";
         H5T_IEEE_F64LE "yscale";
      }
      DATASPACE { SCALAR }
      DATA {
         {
            [ 10 ],
            [ 10 ],
            [ 150 ],
            [ 150 ]
         }
      }
   }
   DATATYPE "MyDatatype" {
      H5T_STD_I32LE "xsize";
      H5T_STD_I32LE "ysize";
      H5T_IEEE_F64LE "xscale";
      H5T_IEEE_F64LE "yscale";
   }
   DATASET "myCompoundDataset" {
      DATATYPE {
         "/MyDatatype"
      }
      DATASPACE { SIMPLE ( 1 ) / ( 1 ) }
      DATA {
         {
            [ 1 ],
            [ 1 ],
            [ 150 ],
            [ 150 ]
         }
      }
   }
}
}

Reading a simple HDF5 file

The following example code will read the /info/xscale, /info/yscale and /data fields from the HDF5 file simple_test.hdf.

###########################################################################
# Copyright (C) 2009 Swedish Meteorological and Hydrological Institute, SMHI,
#
# This file is part of HLHDF.
#
# HLHDF is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# HLHDF is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with HLHDF.  If not, see <http://www.gnu.org/licenses/>.
###########################################################################

## @package examples
# @file ReadSimpleFile.py
#
# Example that reads a simple hdf5 file
import _pyhl

## Reads the file
def readFile():
  aList = _pyhl.read_nodelist("simple_test.hdf")

  # Select individual nodes, instead of all of them
  aList.selectNode("/info/xscale")
  aList.selectNode("/info/yscale")
  aList.selectNode("/data")

  # Fetch the data for selected nodes
  aList.fetch()

  # Print the data
  aNode = aList.getNode("/info/xscale")
  print "XSCALE=" + `aNode.data()`
  aNode = aList.getNode("/info/yscale")
  print "YSCALE=" + `aNode.data()`
  aNode = aList.getNode("/data")
  print "DATA=" + `aNode.data()`

if __name__ == "__main__":
  readFile()

Reading an HDF5 file containing a compound type

This example shows how an HDF5 file containing a compound type can be read. It reads the file "compound_test.hdf" that was generated above. Note that this code might not be portable to other machines, due to the use of the rawdata method.
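The portability caveat exists because rawdata returns the bytes exactly as they are stored in the file. A short sketch with the standard struct module shows how the same integer has different byte layouts on little- and big-endian machines, which is why interpreting raw compound bytes only works when writer and reader share the same layout:

```python
import struct

# The same value packed with explicit little- and big-endian byte order.
little = struct.pack("<i", 10)
big = struct.pack(">i", 10)

print(little == big)                 # -> False: the byte layouts differ
print(struct.unpack("<i", big)[0])   # -> 167772160: misreading gives the wrong value
```

The same mismatch (plus possible differences in struct padding) can occur when raw compound bytes written on one machine are reinterpreted on another, which is what makes the rawdata approach non-portable.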

###########################################################################
# Copyright (C) 2009 Swedish Meteorological and Hydrological Institute, SMHI,
#
# This file is part of HLHDF.
#
# HLHDF is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# HLHDF is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with HLHDF.  If not, see <http://www.gnu.org/licenses/>.
###########################################################################

## @package examples
# @file ReadCompoundTypeRaw.py
#
# Example that reads a compound datatype by using the raw method in pyhl
import _pyhl
import _rave_info_type

def readFile():
  # There is no need to create the type here, only the object
  obj = _rave_info_type.object()
  aList = _pyhl.read_nodelist("compound_test.hdf")

  # Select everything for retrieval
  aList.selectAll()
  aList.fetch()
  aNode = aList.getNode("/myCompoundAttribute")

  # Translate from the string representation to an object
  obj.fromstring(aNode.rawdata())

  # Display the values
  print "XSIZE="+`obj.xsize`
  print "YSIZE="+`obj.ysize`
  print "XSCALE="+`obj.xscale`
  print "YSCALE="+`obj.yscale`

if __name__ == "__main__":
  readFile()

Reading an HDF5 file containing a compound type (alternative)

This example shows how an HDF5 file containing a compound type can be read. It reads the file "compound_test.hdf" that was generated above. This example should work on any supported platform.

###########################################################################
# Copyright (C) 2009 Swedish Meteorological and Hydrological Institute, SMHI,
#
# This file is part of HLHDF.
#
# HLHDF is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# HLHDF is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with HLHDF.  If not, see <http://www.gnu.org/licenses/>.
###########################################################################

## @package examples
# @file ReadCompoundType.py
#
# Example that reads a compound datatype by using the compound_data method in pyhl
import _pyhl

def readFile():
  # There is no need to create the type or the object here
  aList = _pyhl.read_nodelist("compound_test.hdf")

  # Fetch the node
  aNode = aList.fetchNode("/myCompoundAttribute")

  # Retrieve the compound data as a dictionary
  cdescr = aNode.compound_data()
  print "XSIZE="+`cdescr["xsize"]`
  print "YSIZE="+`cdescr["ysize"]`
  print "XSCALE="+`cdescr["xscale"]`
  print "YSCALE="+`cdescr["yscale"]`


if __name__ == "__main__":
  readFile()
  

Creating an HDF5 image with a reference

This example shows how to create an HDF5 file that is viewable in, for example, the H5View visualization tool.

###########################################################################
# Copyright (C) 2009 Swedish Meteorological and Hydrological Institute, SMHI,
#
# This file is part of HLHDF.
#
# HLHDF is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# HLHDF is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with HLHDF.  If not, see <http://www.gnu.org/licenses/>.
###########################################################################

## @package examples
# @file HDF5ImageWithReference.py
#
# Example that creates an HDF5 image with a reference to a palette.

import _pyhl
from numpy import *

# Function for creating a dummy palette
def createPalette():
  a = zeros((256, 3), 'b')
  for i in range(0, 256):
    a[i][0] = i
  return a


# Function for creating a dummy image
def createImage():
  a = zeros((256, 256), 'b')
  for i in range(0, 256):
    for j in range(0, 256):
      a[i][j] = i
  return a

# Function for creating the HDF5 file
def create_test_image():
  a=_pyhl.nodelist()

  # First create the palette
  b=_pyhl.node(_pyhl.DATASET_ID,"/PALETTE")
  c=createPalette()
  b.setArrayValue(-1,[256,3],c,"uchar",-1)
  a.addNode(b)
  b=_pyhl.node(_pyhl.ATTRIBUTE_ID,"/PALETTE/CLASS")
  b.setScalarValue(-1,"PALETTE","string",-1)
  a.addNode(b)
  b=_pyhl.node(_pyhl.ATTRIBUTE_ID,"/PALETTE/PAL_VERSION")
  b.setScalarValue(-1,"1.2","string",-1)
  a.addNode(b)
  b=_pyhl.node(_pyhl.ATTRIBUTE_ID,"/PALETTE/PAL_COLORMODEL")
  b.setScalarValue(-1,"RGB","string",-1)
  a.addNode(b)
  b=_pyhl.node(_pyhl.ATTRIBUTE_ID,"/PALETTE/PAL_TYPE")
  b.setScalarValue(-1,"STANDARD8","string",-1)
  a.addNode(b)

  # Now create the image to display
  b=_pyhl.node(_pyhl.DATASET_ID,"/IMAGE1")
  c=createImage()
  b.setArrayValue(-1,[256,256],c,"uchar",-1)
  a.addNode(b)
  b=_pyhl.node(_pyhl.ATTRIBUTE_ID,"/IMAGE1/CLASS")
  b.setScalarValue(-1,"IMAGE","string",-1)
  a.addNode(b)
  b=_pyhl.node(_pyhl.ATTRIBUTE_ID,"/IMAGE1/IMAGE_VERSION")
  b.setScalarValue(-1,"1.2","string",-1)
  a.addNode(b)

  # Finally insert the reference
  b=_pyhl.node(_pyhl.REFERENCE_ID,"/IMAGE1/PALETTE")
  b.setScalarValue(-1,"/PALETTE","string",-1)
  a.addNode(b)

  a.write("test_image.hdf")

# The main function
if __name__ == "__main__":
  create_test_image()