Reading EEGUnity HDF5 Files#

This tutorial explains how to read .hdf5 files exported by UnifiedDataset.eeg_batch.export_h5Dataset.

HDF5 Layout Used by EEGUnity#

For each EEG file, EEGUnity creates one HDF5 group at the root level:

<group_name>/eeg: EEG array (n_channels, n_samples)
<group_name>/info: pickled mne.Info bytes (uint8 array)
Attributes on <group_name>/eeg:
- rsFreq: sampling rate
- chOrder: channel order

Example Script#

import h5py
import pickle

file_path = r"path/to/EEGUnity_export.hdf5"

with h5py.File(file_path, "r") as f:
    group_names = list(f.keys())
    print("Number of groups:", len(group_names))

    for i, grp_name in enumerate(group_names):
        grp = f[grp_name]
        print(f"\n=== Group {i}: {grp_name} ===")

        if "eeg" not in grp:
            print("Missing dataset: eeg")
            continue

        eeg_dset = grp["eeg"]
        eeg_data = eeg_dset[:]
        rs_freq = eeg_dset.attrs.get("rsFreq", None)
        ch_order = eeg_dset.attrs.get("chOrder", None)

        print("EEG shape:", eeg_data.shape)
        print("Sampling rate:", rs_freq)
        print("Channel order:", ch_order)

        if "info" in grp:
            info_bytes = grp["info"][()].tobytes()
            mne_info = pickle.loads(info_bytes)
            print("MNE info keys:", list(mne_info.keys())[:10])
        else:
            print("Missing dataset: info")

Optional: Load Only Metadata#

If you only need metadata without loading full EEG arrays:

import h5py

file_path = r"path/to/EEGUnity_export.hdf5"

with h5py.File(file_path, "r") as f:
    for grp_name in f.keys():
        eeg_dset = f[grp_name]["eeg"]
        print(
            grp_name,
            eeg_dset.shape,
            eeg_dset.attrs.get("rsFreq", None),
        )

Notes#

info is serialized with Python pickle; load only files from trusted sources.
chOrder should be used together with EEG array rows when feeding models.
If you need random access at scale, iterate by group and avoid loading all groups into memory at once.