Reading EEGUnity HDF5 Files

This tutorial explains how to read .hdf5 files exported by UnifiedDataset.eeg_batch.export_h5Dataset.

HDF5 Layout Used by EEGUnity

For each EEG file, EEGUnity creates one HDF5 group at the root level:

  • <group_name>/eeg: EEG array of shape (n_channels, n_samples)

  • <group_name>/info: pickled mne.Info object, stored as a uint8 byte array

  • Attributes on <group_name>/eeg:

    • rsFreq: sampling rate

    • chOrder: channel order
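To try the reading code without a real export, the layout above can be mimicked with a small synthetic file. This is only a sketch, not EEGUnity's export code: the helper build_demo_group, the group name demo_recording, the attribute types (a float for rsFreq, a byte-string array for chOrder), and the plain dict standing in for mne.Info are all assumptions made for illustration.

```python
import pickle

import h5py
import numpy as np


def build_demo_group(f, name="demo_recording"):
    """Add one group that mimics the layout described above (hypothetical helper)."""
    grp = f.create_group(name)

    # eeg: array of shape (n_channels, n_samples)
    eeg = np.zeros((3, 500), dtype=np.float32)
    dset = grp.create_dataset("eeg", data=eeg)

    # Attribute types are assumptions; a real export may store them differently.
    dset.attrs["rsFreq"] = 250.0
    dset.attrs["chOrder"] = np.array(["Fz", "Cz", "Pz"], dtype="S")

    # info: pickled bytes stored as uint8. A plain dict stands in for mne.Info.
    info_bytes = pickle.dumps({"sfreq": 250.0, "ch_names": ["Fz", "Cz", "Pz"]})
    grp.create_dataset("info", data=np.frombuffer(info_bytes, dtype=np.uint8).copy())
    return grp


# In-memory file (core driver, no backing store): nothing is written to disk.
with h5py.File("demo_layout.hdf5", "w", driver="core", backing_store=False) as f:
    grp = build_demo_group(f)
    demo_shape = grp["eeg"].shape
    demo_rs_freq = float(grp["eeg"].attrs["rsFreq"])
    demo_info = pickle.loads(grp["info"][()].tobytes())
    print(sorted(grp.keys()), demo_shape, demo_rs_freq)
```

Dropping the driver and backing_store arguments would write the file to disk so that it can be opened by other scripts.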

Example Script

import h5py
import pickle

file_path = r"path/to/EEGUnity_export.hdf5"

with h5py.File(file_path, "r") as f:
    group_names = list(f.keys())
    print("Number of groups:", len(group_names))

    for i, grp_name in enumerate(group_names):
        grp = f[grp_name]
        print(f"\n=== Group {i}: {grp_name} ===")

        if "eeg" not in grp:
            print("Missing dataset: eeg")
            continue

        eeg_dset = grp["eeg"]
        eeg_data = eeg_dset[:]  # reads the full (n_channels, n_samples) array into memory
        rs_freq = eeg_dset.attrs.get("rsFreq", None)
        ch_order = eeg_dset.attrs.get("chOrder", None)

        print("EEG shape:", eeg_data.shape)
        print("Sampling rate:", rs_freq)
        print("Channel order:", ch_order)

        if "info" in grp:
            # Unpickling needs the mne package installed; only unpickle trusted files.
            info_bytes = grp["info"][()].tobytes()
            mne_info = pickle.loads(info_bytes)
            print("MNE info keys:", list(mne_info.keys())[:10])
        else:
            print("Missing dataset: info")
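Once eeg_data and ch_order have been loaded as in the script above, rows can be picked by channel name rather than by position. The sketch below assumes that ch_order is a sequence of channel names (str or bytes) in the same order as the rows of the EEG array; if your file stores chOrder in another form, such as a single delimited string, decode it first. The helper select_channels is hypothetical and not part of EEGUnity.

```python
import numpy as np


def select_channels(eeg_data, ch_order, wanted):
    """Return the rows of eeg_data for the channel names in `wanted`, in that order."""
    # HDF5 attributes may come back as bytes; normalize every name to str.
    names = [c.decode() if isinstance(c, bytes) else str(c) for c in ch_order]
    index = {name: i for i, name in enumerate(names)}

    missing = [w for w in wanted if w not in index]
    if missing:
        raise KeyError(f"Channels not found: {missing}")

    rows = [index[w] for w in wanted]
    return eeg_data[rows, :]


# Toy data: three channels, four samples each.
eeg = np.array([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]], dtype=np.float32)
picked = select_channels(eeg, [b"Fz", b"Cz", b"Pz"], ["Pz", "Fz"])
print(picked.shape)
```

Raising on missing channels, instead of silently skipping them, keeps the output rows aligned with the requested channel list.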

Optional: Load Only Metadata

If you only need metadata, read the dataset's shape and attributes. h5py does not load the EEG array until you index the dataset:

import h5py

file_path = r"path/to/EEGUnity_export.hdf5"

with h5py.File(file_path, "r") as f:
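    for grp_name in f.keys():
        grp = f[grp_name]
        if "eeg" not in grp:
            continue  # skip groups without an EEG dataset
        eeg_dset = grp["eeg"]  # opening the dataset does not read the array
        print(
            grp_name,
            eeg_dset.shape,
            eeg_dset.attrs.get("rsFreq", None),
        )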
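The shape and rsFreq values printed above are enough to derive the length of each recording without reading any samples. A minimal sketch, assuming rsFreq is the sampling rate in Hz and the shape is (n_channels, n_samples); the helper duration_seconds is hypothetical.

```python
def duration_seconds(shape, rs_freq):
    """Recording length in seconds, or None if the sampling rate is unusable."""
    if rs_freq is None or rs_freq <= 0:
        return None
    n_samples = shape[1]  # shape is (n_channels, n_samples)
    return n_samples / float(rs_freq)


example_duration = duration_seconds((32, 15000), 250.0)
print(example_duration)  # 60.0
```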

Notes

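  • info is serialized with Python pickle, and unpickling can execute arbitrary code, so load only files from trusted sources. Unpickling it also requires the mne package to be installed.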

  • chOrder gives the channel order of the EEG array; keep it aligned with the array rows when you select or reorder channels for a model.

  • For large exports, process one group at a time and avoid loading all groups into memory at once; when only part of a recording is needed, slice the dataset instead of reading it whole.
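As a sketch of partial reads: h5py reads only the region selected by a slice, so a time window can be fetched without loading the whole recording. The helper read_window and the in-memory demo file are hypothetical, and rsFreq is assumed to be in Hz.

```python
import h5py
import numpy as np


def read_window(eeg_dset, start_s, stop_s, rs_freq):
    """Read all channels for the time window [start_s, stop_s) in seconds."""
    start = int(round(start_s * rs_freq))
    stop = int(round(stop_s * rs_freq))
    # Slicing the dataset reads only this region from the file.
    return eeg_dset[:, start:stop]


# In-memory demo file standing in for a real export (nothing is written to disk).
with h5py.File("demo_window.hdf5", "w", driver="core", backing_store=False) as f:
    data = np.arange(2 * 1000, dtype=np.float32).reshape(2, 1000)
    dset = f.create_group("rec_0").create_dataset("eeg", data=data)
    dset.attrs["rsFreq"] = 250.0

    window = read_window(dset, 1.0, 2.0, float(dset.attrs["rsFreq"]))
    print(window.shape)  # (2, 250)
```

The returned window is an ordinary NumPy array, so it stays usable after the file is closed.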