EEGUnity Kernel Tutorial: Rich Metadata, misc, stim, and Annotations#
This tutorial explains how to use EEGUnity kernels to inject dataset-specific metadata and channels in memory.
1. Design Principle#
EEGUnity keeps locator metadata as source of truth:
format_channel_names()standardizes locator channels aschannel_type:channel_name.get_data_row()uses locator metadata to overwrite raw metadata at load time.Kernels are applied after locator-driven metadata patching.
This allows online metadata maintenance without modifying source files.
2. What a Kernel Can Do#
A kernel can:
add or update
raw.info["description"]add or adjust multiple
miscchannelsadd or adjust multiple
stimchannelsadd/update annotations
Kernel interface:
class SomeKernel:
def apply(self, udataset, raw, row):
...
return raw
KERNEL = SomeKernel()
3. Annotation vs misc vs stim#
Use these three mechanisms for different semantics:
Annotations: text labels mapped to time segments (onset,duration,description).miscchannels: continuous values over time (for example probability density, reaction-time trajectory).stimchannels: integer event codes over time (for example class sequence 1/2/3).
For a single scalar value for one segment, fill the covered segment in a misc channel.
4. Example Kernel with Multiple misc and stim Channels#
from __future__ import annotations
from dataclasses import dataclass
import numpy as np
import mne
def add_channel(raw: mne.io.BaseRaw, ch_name: str, ch_type: str, values: np.ndarray) -> mne.io.BaseRaw:
"""Append one channel to raw with explicit MNE channel type."""
if values.ndim != 1:
raise ValueError("values must be a 1D array")
if values.shape[0] != raw.n_times:
raise ValueError("values length must equal raw.n_times")
info = mne.create_info([ch_name], sfreq=raw.info["sfreq"], ch_types=[ch_type])
ch_raw = mne.io.RawArray(values[np.newaxis, :], info, verbose=False)
raw.add_channels([ch_raw], force_update_info=True)
return raw
@dataclass
class ExampleKernel:
KERNEL_ID: str = "example_rich_meta"
def apply(self, udataset, raw: mne.io.BaseRaw, row):
n = raw.n_times
# misc channels (continuous signals)
prob_density = np.linspace(0.1, 0.9, n, dtype=float)
reaction_time = np.full(n, 0.42, dtype=float)
raw = add_channel(raw, "prob_density", "misc", prob_density)
raw = add_channel(raw, "reaction_time", "misc", reaction_time)
# stim channels (integer codes)
task_code = np.zeros(n, dtype=float)
task_code[n // 4: n // 2] = 1
task_code[n // 2: 3 * n // 4] = 2
task_code[3 * n // 4:] = 3
stage_code = np.zeros(n, dtype=float)
stage_code[n // 3: 2 * n // 3] = 7
raw = add_channel(raw, "task_code", "stim", task_code)
raw = add_channel(raw, "stage_code", "stim", stage_code)
# annotation segments (text semantics)
ann = mne.Annotations(
onset=[0.0, raw.times[n // 2]],
duration=[2.0, 2.0],
description=["trial_start", "feedback"],
)
raw.set_annotations(ann)
return raw
KERNEL = ExampleKernel()
5. Binding and Running#
from eegunity import UnifiedDataset
ud = UnifiedDataset(
dataset_path=r"path/to/dataset",
domain_tag="my_dataset",
kernel_spec=r"path/to/example_kernel.py",
)
# Parser path
raw0 = ud.eeg_parser.get_data(0)
# Batch path (kernel is also applied when loading row data in batch methods)
ud.eeg_batch.get_file_hashes(data_stream=True)
6. Channel Type Compatibility#
EEGUnity standard prefixes are lowercase MNE-style (eeg, eog, emg, ecg, meg, stim, misc, bio) and it also accepts explicit MNE channel type strings in locator entries, for example:
seeg:LA1ecog:G1dbs:DBS1fnirs_od:S1_D1_760pupil:pupil_leftmisc:prob_densitystim:task_code
Legacy uppercase prefixes (EEG, EOG, EMG, ECG, STIM, Unknown) are accepted for backward compatibility.
7. Recommended Practice#
Use annotations for semantic event intervals.
Use
stimfor integer-coded sequences.Use
miscfor continuous labels.Keep kernel logic dataset-specific and deterministic.