eegunity.modules.parser#
Submodules#
eegunity.modules.parser.eeg_parser module#
- class eegunity.modules.parser.eeg_parser.EEGParser(main_instance)[source]#
Bases: _UDatasetSharedAttributes
- check_locator(locator)[source]#
Validate the contents of the locator DataFrame.
- Parameters:
locator (pd.DataFrame) – A DataFrame containing file metadata, including data shape, channel names, file type, file path, number of channels, sampling rate, and duration.
- Returns:
The updated DataFrame with a ‘Completeness Check’ column indicating whether the validation was completed or if errors were found.
- Return type:
pd.DataFrame
- eegunity.modules.parser.eeg_parser.channel_name_parser(input_string)[source]#
Format and standardize a list of channel names based on predefined rules.
- Parameters:
input_string (str) – A comma-separated string containing channel names to be formatted.
- Returns:
A comma-separated string of formatted channel names. If duplicates are found, the original input string is returned.
- Return type:
str
Warning
Warns if an invalid channel name is detected or if a duplicate formatted channel name is found.
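The duplicate fallback described above can be sketched as follows. This is a minimal illustration of the documented contract, not the library's code: `format_channels` is a hypothetical name, and the formatting rule used here (trimming whitespace and upper-casing) is only a stand-in for the package's predefined rules, which this page does not list.

```python
import warnings


def format_channels(input_string: str) -> str:
    """Hypothetical stand-in for the documented contract of channel_name_parser."""
    # Assumed formatting rule (NOT the library's): trim whitespace and upper-case.
    formatted = [name.strip().upper() for name in input_string.split(",")]

    # Documented behaviour: on a duplicate formatted name, warn and
    # return the original input string unchanged.
    if len(set(formatted)) != len(formatted):
        warnings.warn("Duplicate formatted channel name found; input returned unchanged.")
        return input_string

    return ",".join(formatted)
```

With this stand-in rule, `format_channels("fp1, fp2")` yields a cleaned comma-separated string, while `format_channels("fp1, FP1")` collapses to a duplicate and therefore returns the input as given.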
- eegunity.modules.parser.eeg_parser.convert_unit(data: Raw, unit: str) Raw[source]#
Convert the units of EEG data in a MNE Raw object.
- Parameters:
data (mne.io.Raw) – The raw EEG data to be converted.
unit (str) – The target unit to convert the data to. Must be one of ‘V’, ‘mV’, or ‘uV’.
- Raises:
ValueError – If the provided unit is not valid.
- Returns:
The raw EEG data with converted units.
- Return type:
mne.io.Raw
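The arithmetic behind a conversion among the three accepted units can be sketched as below. `scale_factor` and `UNIT_IN_VOLTS` are hypothetical names introduced for illustration; how convert_unit applies the scaling to the Raw object is not shown on this page and is not assumed here.

```python
# Volts represented by one of each supported unit.
UNIT_IN_VOLTS = {"V": 1.0, "mV": 1e-3, "uV": 1e-6}


def scale_factor(source_unit: str, target_unit: str) -> float:
    """Return the multiplier that converts values from source_unit to target_unit."""
    for unit in (source_unit, target_unit):
        if unit not in UNIT_IN_VOLTS:
            # Mirrors the documented ValueError for an invalid unit.
            raise ValueError(f"Invalid unit {unit!r}; expected one of {sorted(UNIT_IN_VOLTS)}")
    return UNIT_IN_VOLTS[source_unit] / UNIT_IN_VOLTS[target_unit]


# Example: amplitudes stored in volts, expressed in microvolts.
values_in_volts = [1.5e-5, -2.0e-5]
factor = scale_factor("V", "uV")
values_in_microvolts = [v * factor for v in values_in_volts]
```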
- eegunity.modules.parser.eeg_parser.create_montage_from_json(json_file)[source]#
Create a montage from a JSON file containing channel positions.
- Parameters:
json_file (str) – The path to the JSON file containing channel names as keys and their positions as values.
- Returns:
A montage object created from the channel positions defined in the JSON file.
- Return type:
mne.channels.DigMontage
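A JSON file of the kind described above might look like the one written in this sketch. The layout of each position as an `[x, y, z]` list and the coordinate values themselves are assumptions made for illustration; the page only states that keys are channel names and values are positions.

```python
import json
import tempfile
from pathlib import Path

# Assumed layout: channel name -> [x, y, z]; the coordinates below are made up.
example_positions = {
    "Fp1": [-0.03, 0.09, 0.02],
    "Fp2": [0.03, 0.09, 0.02],
    "Cz": [0.0, 0.0, 0.1],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    json_file = Path(tmp_dir) / "montage.json"
    json_file.write_text(json.dumps(example_positions, indent=2))

    # Read the file back into a plain dict.
    ch_pos = json.loads(json_file.read_text())

# Basic sanity check: every channel has three coordinates.
for name, position in ch_pos.items():
    if len(position) != 3:
        raise ValueError(f"Channel {name!r} needs 3 coordinates, got {len(position)}")

# A dict mapping channel names to 3D positions is the form that
# mne.channels.make_dig_montage takes as its ch_pos argument.
```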
- eegunity.modules.parser.eeg_parser.extract_events(raw)[source]#
Extract events from an mne.io.Raw object.
Attempt to extract events using mne.events_from_annotations. If it fails, use mne.find_events to extract events without description.
- Parameters:
raw (mne.io.Raw) – The raw data object.
- Returns:
events (numpy.ndarray) – The events array, shaped (n_events, 3).
event_id (dict) – Dictionary of event IDs.
- eegunity.modules.parser.eeg_parser.get_data_row(row: dict, norm_type: str | None = None, is_set_channel_type: bool | None = None, is_set_montage: bool = False, pick_types_params: dict | None = None, unit_convert: str | None = None, read_raw_params: dict | None = None, handle_nonstandard_params: dict | None = None, preload: bool = True) BaseRaw[source]#
Process and return raw EEG data based on the input row information.
This function handles both standard and non-standard data, with options for setting channel types, montage, normalization, and unit conversion.
- Parameters:
row (dict) – Dictionary containing data attributes, such as file paths, file types, and channel names.
norm_type (str, optional) – Type of normalization to apply, if any. Defaults to None.
is_set_channel_type (bool or None, optional) – Determines whether to set channel types based on the provided information. If True, channel types are set explicitly. If None, channel types are set only when the Channel Names field in the locator follows the format “type:name” (see UnifiedDataset.EEGBatch.format_channel_names() for details). Defaults to None.
is_set_montage (bool, optional) – Whether to set montage (electrode coordinates). Defaults to False.
pick_types_params (dict, optional) – Dictionary specifying which channel types to include. The keys should match the parameters of raw.pick_types(). Defaults to None.
unit_convert (str, optional) – Conversion type for resetting channel units. Defaults to None.
read_raw_params (dict, optional) – Additional parameters to pass to mne.io.read_raw() for standard data loading.
handle_nonstandard_params (dict, optional) – Additional parameters to pass to handle_nonstandard_data() for non-standard data loading.
preload (bool, optional) – Whether to preload the data into memory. Defaults to True.
- Returns:
The processed raw EEG data object.
- Return type:
mne.io.BaseRaw
- Raises:
ValueError – If the number of channels in the locator file does not match the metadata.
Warning – If pick_types_params is not None but is_set_channel_type is False, a warning is issued advising the user to set is_set_channel_type=True.
- eegunity.modules.parser.eeg_parser.handle_nonstandard_data(row, verbose='CRITICAL')[source]#
Handles the loading of non-standard EEG data files into MNE Raw format.
This function processes EEG data from either .mat files or .csv/.txt files marked as ‘csvData’. It extracts channel names and sampling rates from the provided row, and creates an MNE RawArray object containing the EEG data.
- Parameters:
row (pd.Series) – A row from a DataFrame containing information about the file, including ‘File Path’, ‘Channel Names’, ‘Sampling Rate’, and ‘File Type’.
verbose (str, optional) – The verbosity level for MNE functions. Default is ‘CRITICAL’.
- Returns:
An MNE Raw object containing the EEG data.
- Return type:
mne.io.Raw
- Raises:
ValueError – If the number of channels in the DataFrame does not match the channel names specified in the row.
Exception – If the file type is unsupported or there is an error in loading the data.
- eegunity.modules.parser.eeg_parser.infer_channel_unit(ch_name, ch_data, ch_type)[source]#
Infer the unit type for a given channel based on its data and type.
- Parameters:
ch_name (str) – The name of the channel.
ch_data (array-like) – The data of the channel, typically an array of amplitude values.
ch_type (str) – The type of the channel, such as ‘eeg’, ‘emg’, etc.
- Returns:
The inferred unit type, such as “uV”, “mV”, or “V”, based on the channel data and type.
- Return type:
str
- eegunity.modules.parser.eeg_parser.normalize_data(raw_data, mean_std_str: str | Dict, norm_type: str)[source]#
Normalize EEG data based on provided mean and standard deviation values.
- Parameters:
raw_data (mne.io.Raw) – The raw EEG data to be normalized. The data should be in MNE Raw format.
mean_std_str (Union[str, Dict]) – A dictionary or string that contains mean and standard deviation values. If it’s a string, it will be evaluated into a dictionary. The dictionary keys should be channel names (for channel-wise normalization) or ‘all_eeg’ (for sample-wise normalization).
norm_type (str) – The type of normalization to perform. ‘channel-wise’ normalizes each channel individually using its own mean and standard deviation; ‘sample-wise’ normalizes all channels using a common mean and standard deviation.
- Returns:
The normalized raw EEG data.
- Return type:
mne.io.Raw
- Raises:
ValueError – If norm_type is not ‘channel-wise’ or ‘sample-wise’.
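The difference between the two modes can be sketched on plain Python lists. This is an illustration under stated assumptions, not the library's implementation: `normalize` is a hypothetical name, each dictionary value is assumed to be a `(mean, std)` pair, the formula is assumed to be the usual z-score, and `ast.literal_eval` is used here as a safe way to turn the string form into a dictionary.

```python
import ast
from typing import Dict, List, Union

Signals = Dict[str, List[float]]


def normalize(signals: Signals, mean_std: Union[str, dict], norm_type: str) -> Signals:
    """Z-score each channel as (x - mean) / std."""
    if norm_type not in ("channel-wise", "sample-wise"):
        raise ValueError(f"Invalid norm_type: {norm_type!r}")

    # A string is parsed into a dict of literals; a dict is used as-is.
    stats = ast.literal_eval(mean_std) if isinstance(mean_std, str) else mean_std

    normalized = {}
    for ch_name, samples in signals.items():
        # channel-wise: per-channel statistics; sample-wise: one shared 'all_eeg' pair.
        key = ch_name if norm_type == "channel-wise" else "all_eeg"
        mean, std = stats[key]  # assumed (mean, std) layout
        normalized[ch_name] = [(x - mean) / std for x in samples]
    return normalized


signals = {"Cz": [1.0, 3.0], "Pz": [2.0, 6.0]}
per_channel = normalize(signals, "{'Cz': (2.0, 1.0), 'Pz': (4.0, 2.0)}", "channel-wise")
shared = normalize(signals, {"all_eeg": (3.0, 2.0)}, "sample-wise")
```

In the channel-wise call both channels end up on the same scale because each uses its own statistics, whereas the sample-wise call preserves the amplitude difference between Cz and Pz.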
- eegunity.modules.parser.eeg_parser.process_mne_files(files_locator, verbose)[source]#
Process MNE files based on a locator DataFrame.
- Parameters:
files_locator (pandas.DataFrame) – DataFrame containing file paths and related metadata for processing.
verbose (str) – Verbosity level for MNE functions.
- Returns:
Updated DataFrame with metadata extracted from processed files.
- Return type:
pandas.DataFrame
- eegunity.modules.parser.eeg_parser.set_channel_type(raw_data, channel_str)[source]#
Set the channel types for the given raw data based on the specified channel string.
- Parameters:
raw_data (mne.io.Raw) – The raw data object containing the EEG, EMG, ECG, EOG, or other types of signals.
channel_str (str) – A string specifying the channel types and names in the format ‘type:name’, separated by commas. Each type must correspond to the desired signal type.
- Returns:
The updated raw data object with renamed channels and set channel types.
- Return type:
mne.io.Raw
- Raises:
ValueError – If the format of any channel in the channel string is invalid (not in ‘type:name’ format).
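Parsing the ‘type:name’ string can be sketched as below. `parse_channel_string` is a hypothetical helper, and lower-casing the type is an assumption made so the result matches MNE's lowercase channel-type names such as ‘eeg’ and ‘eog’; the renaming step mentioned in the return description is not reproduced.

```python
from typing import Dict


def parse_channel_string(channel_str: str) -> Dict[str, str]:
    """Map each channel name to its type from a comma-separated 'type:name' string."""
    channel_types = {}
    for item in channel_str.split(","):
        parts = [part.strip() for part in item.split(":")]
        # Exactly one colon with non-empty text on both sides is required.
        if len(parts) != 2 or not parts[0] or not parts[1]:
            raise ValueError(f"Invalid channel entry {item!r}; expected 'type:name'")
        ch_type, ch_name = parts
        channel_types[ch_name] = ch_type.lower()  # assumed normalization
    return channel_types


mapping = parse_channel_string("EEG:Fp1, EEG:Cz, EOG:HEOG")
```

A name-to-type dictionary like `mapping` is the form accepted by the `set_channel_types` method of an MNE Raw object.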
- eegunity.modules.parser.eeg_parser.set_infer_unit(raw_data, row)[source]#
Set the inferred unit for EEG channels in the raw data.
- Parameters:
raw_data (mne.io.Raw) – The raw data object containing the EEG channels.
row (pandas.Series) – A row from a DataFrame containing the ‘Infer Unit’ field, which should be a dictionary with channel names as keys and units as values.
- Returns:
The updated raw data object with the inferred units set for the specified channels.
- Return type:
mne.io.Raw
- Raises:
ValueError – If ‘Infer Unit’ is not a valid dictionary.
- eegunity.modules.parser.eeg_parser.set_montage_any(raw_data: Raw, verbose='CRITICAL')[source]#
Set the montage for the given raw data using a montage defined in a JSON file.
- Parameters:
raw_data (mne.io.Raw) – The raw data object to which the montage will be applied.
verbose (str, optional) – The verbosity level for warnings or messages, by default ‘CRITICAL’.
- Returns:
The updated raw data object with the applied montage.
- Return type:
mne.io.Raw
eegunity.modules.parser.eeg_parser_config module#
eegunity.modules.parser.eeg_parser_csv module#
- eegunity.modules.parser.eeg_parser_csv.calculate_interval(times)[source]#
Calculate the average interval between time points.
- Parameters:
times (pandas.Series) – A pandas Series object containing time points. The time points can either be timezone-aware DatetimeTZDtype or naive pd.Timestamp objects.
- Returns:
The average interval between consecutive time points in seconds. If the input series is empty or only has one time point, returns None.
- Return type:
float or None
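The computation described above can be sketched with the standard library alone. `average_interval` is a hypothetical name, and a list of `datetime` objects stands in for the pandas Series the function actually receives.

```python
from datetime import datetime
from typing import Optional, Sequence


def average_interval(times: Sequence[datetime]) -> Optional[float]:
    """Mean gap in seconds between consecutive time points, or None if fewer than two."""
    if len(times) < 2:
        return None
    deltas = [
        (later - earlier).total_seconds()
        for earlier, later in zip(times, times[1:])
    ]
    return sum(deltas) / len(deltas)


# Four samples spaced 4 ms apart.
stamps = [datetime(2024, 1, 1, 0, 0, 0, ms * 1000) for ms in (0, 4, 8, 12)]
interval = average_interval(stamps)
```

For these timestamps the interval is approximately 0.004 s, whose reciprocal corresponds to a sampling rate of about 250 Hz.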
- eegunity.modules.parser.eeg_parser_csv.identify_time_columns(df)[source]#
Identify potential time columns in a DataFrame.
- Parameters:
df (pandas.DataFrame) – The input DataFrame containing potential time columns.
- Returns:
If a single time column is identified, returns the column name and its sampling frequency as a float. If multiple time columns are found with the same sampling frequency, returns a list of column names and the common sampling frequency. Returns None if no valid time column is detected.
- Return type:
tuple of (str or list of str, float), or None
- eegunity.modules.parser.eeg_parser_csv.is_datetime_format(s)[source]#
Check if a string follows a datetime format.
- Parameters:
s (str) – The string to be evaluated for compatibility with the datetime format.
- Returns:
True if the string matches the datetime format “%Y-%m-%d %H:%M:%S.%f”; otherwise False.
- Return type:
bool
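A check of this kind can be written with `datetime.strptime`, as in the sketch below. `matches_datetime_format` is a hypothetical name, and whether the library uses `strptime` internally is an assumption.

```python
from datetime import datetime

DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def matches_datetime_format(s: str) -> bool:
    """Return True if s parses under DATETIME_FORMAT, otherwise False."""
    try:
        datetime.strptime(s, DATETIME_FORMAT)
    except (ValueError, TypeError):
        # ValueError: text does not match; TypeError: s is not a string.
        return False
    return True
```

Because the format includes `.%f`, a timestamp without a fractional-seconds part, such as "2024-01-01 12:30:45", does not match.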
- eegunity.modules.parser.eeg_parser_csv.process_csv_files(files_locator)[source]#
Process CSV files and update a DataFrame with file details.
- Parameters:
files_locator (pandas.DataFrame) – A DataFrame containing the metadata of files, including their file paths and other details. The column ‘File Path’ is expected to contain paths to the files.
- Returns:
Updated DataFrame with additional columns ‘File Type’, ‘Sampling Rate’, ‘Channel Names’, ‘Number of Channels’, and ‘Duration’ for each file. If a file cannot be processed, appropriate messages are printed.
- Return type:
pandas.DataFrame
eegunity.modules.parser.eeg_parser_mat module#
- eegunity.modules.parser.eeg_parser_mat.process_mat_files(files_locator)[source]#
Process MAT files and update a DataFrame with file details.
- Parameters:
files_locator (pandas.DataFrame) – A DataFrame containing the metadata of files, including their file paths and other details. The column ‘File Path’ is expected to contain paths to the MAT files.
- Returns:
Updated DataFrame with additional columns ‘File Type’, ‘Sampling Rate’, ‘Channel Names’, ‘Number of Channels’, and ‘Duration’ for each file. If a file cannot be processed, appropriate messages are printed.
- Return type:
pandas.DataFrame
- Raises:
FileNotFoundError – If the MAT file cannot be located.
Exception – General exception for unexpected errors during file processing.