eegunity.modules.llm_booster#

Submodules#

eegunity.modules.llm_booster.eeg_llm_booster module#

class eegunity.modules.llm_booster.eeg_llm_booster.EEGLLMBooster(main_instance)[source]#

Bases: _UDatasetSharedAttributes

This is a key module of the UnifiedDataset class, focused on large language model (LLM) boosting. The EEGLLMBooster class shares the attributes of the UnifiedDataset class and defines the functions related to LLM boosting.

eegunity.modules.llm_booster.eeg_llm_des_parser module#

eegunity.modules.llm_booster.eeg_llm_des_parser.llm_description_file_parser(directory: str, client_type: str, client_paras: dict, completion_para: dict)[source]#

Parse files in a specified directory to extract sampling rate and channel information using a Large Language Model (LLM) API.

This function traverses a directory and reads files in various formats. It extracts sampling rates and channel names from the files using an LLM API (e.g., GPT-4), then resolves conflicts in the extracted information based on user input. Contributor: Jingyi Ding (Jingyi.Ding21@student.xjtlu.edu.cn), on 2024-07-26. Modified by the EEGUnity Team on 2025-02-23.

Parameters:
  • directory (str) – Path to the directory containing the files to process. This is typically the root directory of the dataset.

  • client_type (str) – The type of LLM client to use (e.g., “AzureOpenAI”, “OpenAI”).

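  • client_paras (dict) – A dictionary containing the parameters needed to initialize the LLM API client. See the OpenAI documentation for the accepted keys.

  • completion_para (dict) – A dictionary containing the parameters for the LLM completion call.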
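As an illustration of how client_type and client_paras could fit together, here is a minimal sketch. It assumes the `openai` Python package (version 1 or later), which exposes the `OpenAI` and `AzureOpenAI` client classes. The helper name `make_llm_client` and the constant `SUPPORTED_CLIENTS` are hypothetical; EEGUnity's own client construction may differ.

```python
SUPPORTED_CLIENTS = ("OpenAI", "AzureOpenAI")


def make_llm_client(client_type: str, client_paras: dict):
    """Hypothetical helper: build an LLM client from its type name and kwargs."""
    if client_type not in SUPPORTED_CLIENTS:
        raise ValueError(
            f"Unsupported client_type {client_type!r}; "
            f"expected one of {SUPPORTED_CLIENTS}"
        )
    # Imported lazily so the validation above works without the package installed.
    import openai

    # Resolves to openai.OpenAI or openai.AzureOpenAI.
    client_cls = getattr(openai, client_type)
    return client_cls(**client_paras)
```

Validating the name before importing keeps a typo in client_type from surfacing as an obscure attribute error.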

Returns:

A dictionary containing the parsed sampling rate and channel information. An error message is returned if no files are selected or if all data is discarded due to conflicts.

Return type:

dict

Raises:

ValueError – If no files are selected for further analysis or if there are conflicts in the extracted data.
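The conflict handling described above can be pictured with a small sketch. It assumes each description file yields one sampling rate and that the user's choice arrives as a `preferred_rate` argument. Both function names and the input shape are hypothetical and do not reflect the library's internals.

```python
from collections import defaultdict


def find_conflicts(extracted: dict) -> dict:
    """Group source files by the sampling rate they report.

    `extracted` maps a file name to the sampling rate (Hz) parsed from it.
    Returns {rate: [files]}; more than one key means the files disagree.
    """
    by_rate = defaultdict(list)
    for file_name, rate in extracted.items():
        by_rate[rate].append(file_name)
    return dict(by_rate)


def resolve(extracted: dict, preferred_rate=None) -> float:
    """Return the agreed rate, or the caller's choice when files conflict."""
    groups = find_conflicts(extracted)
    if not groups:
        raise ValueError("No files were selected for analysis.")
    if len(groups) == 1:
        return next(iter(groups))
    if preferred_rate in groups:
        return preferred_rate
    raise ValueError(f"Conflicting sampling rates: {sorted(groups)}")
```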

Examples

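>>> import json
>>> from eegunity.modules.llm_booster.eeg_llm_des_parser import llm_description_file_parser
>>> directory = 'path/to/description/directory'
>>> client_type = "AzureOpenAI"
>>> client_paras = {"api_key": "your_api_key", "api_version": "2023-03-15-preview"}
>>> completion_para = {"model": "your_model_name"}  # placeholder; keys depend on your LLM client
>>> result = llm_description_file_parser(directory, client_type, client_paras, completion_para)
>>> print("The end result:", json.dumps(result, indent=4, ensure_ascii=False))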

eegunity.modules.llm_booster.eeg_llm_file_parser module#

eegunity.modules.llm_booster.eeg_llm_file_parser.llm_boost_parser(file_path: str, client_type: str, client_paras: dict, completion_para: dict, max_iterations: int = 5)[source]#

Parses and processes an EEG data file using an LLM client (e.g., Azure OpenAI) to generate a function that reads the data, calculates the sampling frequency, and extracts channel names.

This function interacts with the LLM to automatically generate and refine a Python function that reads EEG data from a CSV or TXT file, determines the sampling frequency from timestamp columns, and extracts the relevant channel names. It repeats the process up to max_iterations times, refining the generated code when it raises errors or produces unsatisfactory output. Contributor: Ziyi Jia (Ziyi.Jia21@student.xjtlu.edu.cn), on 2024-07-26. Modified by the EEGUnity Team on 2025-03-22.
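To make "determines the sampling frequency from timestamp columns" concrete, here is a minimal sketch of one common approach: take the median spacing between consecutive timestamps and invert it. The function name `estimate_sfreq` and the `unit_seconds` parameter are assumptions for illustration; the code the LLM actually generates may compute this differently.

```python
from statistics import median


def estimate_sfreq(timestamps, unit_seconds=1.0):
    """Estimate sampling frequency (Hz) from a sequence of timestamps.

    `unit_seconds` converts one timestamp unit to seconds (e.g. 1e-3 for ms).
    """
    if len(timestamps) < 2:
        raise ValueError("Need at least two timestamps.")
    diffs = [(b - a) * unit_seconds for a, b in zip(timestamps, timestamps[1:])]
    # The median tolerates a few dropped or jittered samples.
    step = median(diffs)
    if step <= 0:
        raise ValueError("Median timestamp step must be positive.")
    return 1.0 / step
```

For example, millisecond timestamps spaced 4 ms apart give an estimate of roughly 250 Hz when called with `unit_seconds=1e-3`.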

Parameters:
  • file_path (str) – Path to the CSV or TXT file.

  • client_type (str) – Type of LLM client to use (e.g., ‘AzureOpenAI’, ‘OpenAI’).

  • client_paras (dict) – Parameters for initializing the LLM client.

  • completion_para (dict) – Parameters for the LLM completion call. Note: the ‘messages’ parameter is generated by this function; do not specify it in ‘completion_para’.

  • max_iterations (int, optional) – Maximum number of iterations to refine the generated function code. Default is 5.
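The note about ‘messages’ can be illustrated with a short sketch of how user-supplied completion settings might be merged with internally generated messages. The helper `build_completion_kwargs` is hypothetical. Whether the library raises, overwrites, or ignores a user-supplied ‘messages’ key is not stated here, so the explicit check below is an assumption.

```python
def build_completion_kwargs(completion_para: dict, messages: list) -> dict:
    """Hypothetical helper: merge user settings with generated messages."""
    if "messages" in completion_para:
        raise ValueError(
            "'messages' is generated internally; remove it from completion_para."
        )
    # Copy rather than mutate, so the caller's dictionary stays unchanged.
    return {**completion_para, "messages": messages}
```

With the `openai` package, the resulting dictionary could then be passed along the lines of `client.chat.completions.create(**kwargs)`.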

Returns:

An MNE RawArray object containing the processed EEG data.

Return type:

mne.io.Raw

Raises:
  • ValueError – If the file extension is not supported.

  • FileNotFoundError – If the specified file is not found.

  • RuntimeError – If the function code cannot be generated within the maximum iteration limit.
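The bounded refinement behind max_iterations and the RuntimeError can be sketched as a generic generate-and-check loop. The names `refine_until_valid`, `generate`, and `validate` are hypothetical stand-ins: in practice `generate` would wrap an LLM call and `validate` would run the candidate code against the file.

```python
def refine_until_valid(generate, validate, max_iterations=5):
    """Hypothetical generate-and-check loop.

    `generate(feedback)` returns candidate source code;
    `validate(code)` returns None on success or an error message to feed back.
    """
    feedback = None
    for _ in range(max_iterations):
        code = generate(feedback)
        feedback = validate(code)
        if feedback is None:
            return code
    raise RuntimeError(
        f"No valid function code after {max_iterations} iterations."
    )
```

Feeding the previous error message back into the next attempt is what distinguishes refinement from simply retrying the same prompt.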

Example

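>>> from eegunity.modules.llm_booster.eeg_llm_file_parser import llm_boost_parser
>>> client_type = "AzureOpenAI"
>>> client_paras = {"api_key": "your_api_key", "azure_endpoint": "https://your_endpoint"}
>>> completion_para = {"model": "your_model_name"}  # placeholder; do not include 'messages'
>>> locator_path = "data_file.csv"
>>> raw_data = llm_boost_parser(locator_path, client_type, client_paras, completion_para)
>>> print("Extracted Data:", raw_data)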

Module contents#