neurobench.datasets

Google Speech Commands

The Google Speech Commands dataset (V2) is commonly used to assess the performance of keyword spotting algorithms. It consists of 105,829 one-second utterances of 35 different words from 2,618 distinct speakers. The audio is encoded as linear 16-bit, single-channel, pulse-code-modulated values at a 16 kHz sampling frequency.

class neurobench.datasets.speech_commands.SpeechCommands(*args: Any, **kwargs: Any)[source]

Bases: NeuroBenchDataset, SPEECHCOMMANDS

Speech commands dataset v0.02 with 35 keywords.

Wraps the torchaudio SPEECHCOMMANDS dataset.

__getitem__(idx)[source]

Getter method for dataset.

Parameters:

idx (int) – index of sample to return

Returns:
  • waveform (torch.Tensor) – waveform of audio sample

  • label (torch.Tensor) – label index of audio sample

Return type:

tuple(torch.Tensor, torch.Tensor)

__init__(path, subset: str | None = None, truncate_or_pad_to_1s=True)[source]

Initializes the SpeechCommands dataset.

Parameters:
  • path (str) – path to the root directory of the dataset

  • subset (str, optional) – one of “training”, “validation”, or “testing”. Defaults to None.

  • truncate_or_pad_to_1s (bool, optional) – whether to truncate or pad samples to 1s. Defaults to True.

label_to_index(label)[source]

Converts a label to an index.

Parameters:

label (str) – label of audio sample

Returns:

index of label

Return type:

torch.Tensor
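
A minimal usage sketch (the path argument is illustrative and should point to, or be the destination of, the dataset files):

    from torch.utils.data import DataLoader
    from neurobench.datasets.speech_commands import SpeechCommands

    # Load the test split; samples are truncated or padded to 1 s by default.
    test_set = SpeechCommands("data/speech_commands", subset="testing")

    waveform, label = test_set[0]  # waveform tensor and label index
    test_loader = DataLoader(test_set, batch_size=256, shuffle=False)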

DVS Gestures

The IBM Dynamic Vision Sensor (DVS) Gesture dataset is composed of recordings of 29 distinct individuals executing 10 different types of gestures, such as clapping and waving. Additionally, an 11th gesture class comprises gestures that cannot be categorized within the first 10 classes. The gestures are recorded under four distinct lighting conditions, and each gesture is associated with a label that indicates the lighting condition under which it was performed.

class neurobench.datasets.dvs_gesture.DVSGesture(*args: Any, **kwargs: Any)[source]

Bases: NeuroBenchDataset

Installs the DVS Gesture dataset with individual events in each file if it is not yet installed; otherwise, pass the path of the existing Tonic DVSGesture install.

Data information:
  • Event rate: 1 MHz -> dt = 1e-6 s

  • Sample length: 1.7 seconds

  • Default timestep for frames: 5 ms

For possible preprocessing functions, see: https://docs.prophesee.ai/stable/tutorials/ml/data_processing/event_preprocessing.html?highlight=metavision_ml%20preprocessing

__getitem__(idx)[source]

Getter method for data samples in the DataLoader.

Parameters:

idx (int) – Index of the sample.

Returns:
  • sample (tensor) – individual data sample, which can be a sequence of frames or raw data

  • target (tensor) – corresponding gesture label

Return type:

tuple(tensor, tensor)

__init__(path, split='testing', data_type='frames', preprocessing='stack')[source]

Initialization loads the data from path if possible; otherwise, the dataset is downloaded into path.

Parameters:
  • path (str) – Path of DVS Gesture dataset folder if applicable, else the destination of DVS Gesture dataset.

  • split (str) – Return testing or training data.

  • data_type (str) – If ‘frames’, returns frames with preprocessing applied; else returns raw events.

  • preprocessing (str) – Preprocessing to get frames from raw events.

set_sample_params(delta_t=5, length=1700, random_window=False)[source]

Sets sample parameters used if frames are created from events.

Parameters:
  • delta_t (int) – Time window over which events are stacked into a frame (in milliseconds).

  • length (int) – Length in milliseconds of each sample.

  • random_window (bool) – If True, the sample will be a random time window of length within the gesture.
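
A usage sketch following the parameters documented above (the path is illustrative):

    from neurobench.datasets.dvs_gesture import DVSGesture

    # Load the test split as preprocessed frames using stack preprocessing.
    test_set = DVSGesture("data/dvs_gesture", split="testing",
                          data_type="frames", preprocessing="stack")

    # 5 ms frames over 1.7 s samples, using a fixed (non-random) window.
    test_set.set_sample_params(delta_t=5, length=1700, random_window=False)

    sample, target = test_set[0]  # frame sequence and gesture label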

neurobench.datasets.dvs_gesture.histogram_difference_preprocessing(xypt, delta_t=5000, tbins=200, h_og=128, w_og=128, channels=3, display_frame=False)[source]

Applies histogram preprocessing to events. For every positive (pos) or negative (neg) event that occurred at (x, y) within delta_t, 1 is added to (x, y) in the corresponding channel (pos or neg).

Parameters:
  • delta_t (int) – Time window over which events are stacked into a frame (in microseconds).

  • tbins (int) – Number of frames required.

  • h_og (int) – Number of pixels in height.

  • w_og (int) – Number of pixels in width.

  • channels (int) – Number of channels in each frame (default 3 for plotting purposes).

  • display_frame (bool) – If True, will create an animation to visualize event frames.

neurobench.datasets.dvs_gesture.stack_preprocessing(xypt, delta_t=5000, tbins=200, h_og=128, w_og=128, channels=3, display_frame=False)[source]

Applies stack preprocessing to events. If at least one event occurred at (x, y) within delta_t, the corresponding channel (pos or neg) is set to 1, and to 0 otherwise.

Parameters:
  • delta_t (int) – Time window over which events are stacked into a frame (in microseconds).

  • tbins (int) – Number of frames required.

  • h_og (int) – Number of pixels in height.

  • w_og (int) – Number of pixels in width.

  • channels (int) – Number of channels in each frame (default 3 for plotting purposes).

  • display_frame (bool) – If True, will create an animation to visualize event frames.
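
To make the two preprocessing modes concrete, here is a minimal NumPy sketch of how a single frame could be built from the events in one delta_t window. It illustrates the semantics described above rather than the library's internal implementation, and uses 2 channels (pos/neg) for clarity where the functions above default to 3 for plotting; frame_from_events is a hypothetical helper name:

    import numpy as np

    # Hypothetical helper: build one 2-channel frame from the events
    # (x, y, polarity) falling inside a single delta_t window.
    def frame_from_events(x, y, p, h=128, w=128, mode="stack"):
        frame = np.zeros((2, h, w), dtype=np.float32)
        ch = np.where(p > 0, 0, 1)  # positive events -> channel 0, negative -> 1
        if mode == "stack":
            frame[ch, y, x] = 1.0  # binary: any event sets the pixel to 1
        else:  # "histogram"
            np.add.at(frame, (ch, y, x), 1.0)  # count events per pixel
        return frame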

neurobench.datasets.dvs_gesture.update(frame, frames)[source]

Helper function for animation.

Prophesee Megapixel Automotive

The Prophesee 1 Megapixel Automotive Detection Dataset was recorded with a high-resolution event camera with a 110-degree field of view mounted on a car windshield. The car was driven in various areas under different daytime weather conditions over several months. The dataset was labeled using the video stream of an additional RGB camera in a semi-automated way, resulting in over 25 million bounding boxes for seven different object classes: pedestrian, two-wheeler, car, truck, bus, traffic sign, and traffic light. The labels are provided at a rate of 60 Hz, and the 14.65 hours of recordings are split into 11.19, 2.21, and 2.25 hours for training, validation, and testing, respectively.

class neurobench.datasets.megapixel_automotive.Gen4DetectionDataLoader(*args: Any, **kwargs: Any)[source]

Bases: SequentialDataLoader

NeuroBench DataLoader for Gen4 pre-computed dataset.

The default parameters are set for the Gen4 Histograms dataset, which can be downloaded from https://docs.prophesee.ai/stable/datasets.html#precomputed-datasets. To use one of the other pre-computed datasets, download it and change the preprocess_function_name and channels parameters accordingly.

Once downloaded, extract the zip folder and set the dataset_path parameter to the path of the extracted folder.

__init__(dataset_path='data/Gen 4 Histograms', split='testing', batch_size: int = 4, num_tbins: int = 12, preprocess_function_name='histo', delta_t=50000, channels=2, height=360, width=640, max_incr_per_pixel=5, class_selection=['pedestrian', 'two wheeler', 'car'], num_workers=4)[source]

Initializes the Gen4DetectionDataLoader dataloader.

Parameters:
  • dataset_path – path to the dataset folder

  • split – split to use, can be ‘training’, ‘validation’ or ‘testing’

  • batch_size – batch size

  • num_tbins – number of time bins in a mini batch

  • preprocess_function_name – name of the preprocessing function to use, ‘histo’ by default. Can be any of the functions listed under https://docs.prophesee.ai/stable/api/python/ml/preprocessing.html

  • delta_t – time interval between two consecutive frames

  • channels – number of channels in the input data, 2 by default for histograms

  • height – height of the input data

  • width – width of the input data

  • max_incr_per_pixel – maximum number of events per pixel

  • class_selection – list of classes to use

  • num_workers – number of workers for the dataloader
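
A usage sketch with the documented defaults (dataset_path must point to an extracted pre-computed dataset folder):

    from neurobench.datasets.megapixel_automotive import Gen4DetectionDataLoader

    # The loader itself is a SequentialDataLoader; iterate over it directly.
    test_loader = Gen4DetectionDataLoader(
        dataset_path="data/Gen 4 Histograms",
        split="testing",
        batch_size=4,
        num_tbins=12,
        preprocess_function_name="histo",
        delta_t=50000,
        channels=2,
        height=360,
        width=640,
        class_selection=["pedestrian", "two wheeler", "car"],
        num_workers=4,
    )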

neurobench.datasets.megapixel_automotive.create_class_lookup(wanted_keys=[])[source]

Source code modified from metavision_ml.data.box_processing.create_class_lookup to avoid requiring an extraneous label-map JSON file.

Nonhuman Primate Reaching

The Nonhuman Primate Reaching dataset consists of multi-channel recordings obtained from the sensorimotor cortex of two non-human primates (NHPs) during self-paced reaching movements towards a grid of targets. The variable x is represented by threshold crossing times (or spike times) and sorted units for each of the recording channels. The target y is represented by the 2-dimensional position coordinates of the fingertip of the reaching hand, sampled at a frequency of 250 Hz. The complete dataset contains 37 sessions spanning 10 months for NHP-1 and 10 sessions spanning one month for NHP-2. For this study, three sessions from each NHP were selected, covering the entire recording duration, resulting in a total of 6774 seconds of data.

This file contains code from PyTorch Vision (https://github.com/pytorch/vision) which is licensed under BSD 3-Clause License. These snippets are the Copyright (c) of Soumith Chintala 2016. All other code is the Copyright (c) of the NeuroBench Developers 2023.

class neurobench.datasets.primate_reaching.PrimateReaching(*args: Any, **kwargs: Any)[source]

Bases: NeuroBenchDataset

Dataset for the Primate Reaching Task.

The Dataset can be downloaded from the following website: https://zenodo.org/record/583331

For this task, the following files are selected:
  1. indy_20170131_02.mat
  2. indy_20160630_01.mat
  3. indy_20160622_01.mat
  4. loco_20170301_05.mat
  5. loco_20170215_02.mat
  6. loco_20170210_03.mat

The description of the structure of the dataset can be found on the website in the section: Variable names.

Once these .mat files are downloaded, store them in the same directory.

__getitem__(idx)[source]

Getter method of the dataset.

__init__(file_path, filename, num_steps, train_ratio=0.8, label_series=False, biological_delay=0, spike_sorting=False, stride=0.004, bin_width=0.028, max_segment_length=2000, split_num=1, remove_segments_inactive=False, download=True)[source]

Initializes the Dataset for the Primate Reaching Task.

Parameters:
  • file_path (str) – The path to the directory storing the matlab files.

  • filename (str) – The name of the file that will be loaded.

  • num_steps (int) – Number of consecutive timesteps that are included per sample. In the real-time case, this should be 1.

  • train_ratio (float) – ratio for how the dataset will be split into training/(val+test) set. Default is 0.8 (80% of data is training).

  • label_series (bool) – Whether the labels are series or not. Useful for training with multiple timesteps. Default is False.

  • biological_delay (int) – How many steps of delay are applied to the dataset. Default is 0, i.e. no delay applied.

  • spike_sorting (bool) – Apply spike sorting for processing raw spike data. Default is False.

  • stride (float) – How many steps are taken when moving the bin_window. Default is 0.004 (4ms).

  • bin_width (float) – The size of the bin_window. Default is 0.028 (28ms).

  • max_segment_length – The upper limit on the length of a segment, in data points. Default is 2000 data points (8 s).

  • split_num (int) – The number of chunks to break the timeseries into. Default is 1 (no splits).

  • remove_segments_inactive (bool) – Whether to remove segments longer than max_segment_length, which represent subject inactivity. Default is False.

  • download (bool) – If True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it will not be downloaded again.
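
A usage sketch with one of the six session files listed above (file_path is illustrative):

    from neurobench.datasets.primate_reaching import PrimateReaching

    dataset = PrimateReaching(
        file_path="data/primate_reaching/",
        filename="indy_20160622_01.mat",
        num_steps=1,          # real-time case
        train_ratio=0.8,
        bin_width=0.028,      # 28 ms bins
        stride=0.004,         # 4 ms stride
        download=True,        # fetch the .mat file from Zenodo if missing
    )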

apply_delay()[source]

Shift the labels by the delay to account for the biological delay between spikes and movement onset.

download()[source]

Download the Primate Reaching data if it doesn’t exist already.

static get_flag_index(target_pos)[source]

Find where each segment begins and ends.

load_data()[source]

Load the data from the matlab file and spike data if spike data has been processed and stored already.

md5s = {'indy_20160622_01.mat': 'c33d5fff31320d709d23fe445561fb6e', 'indy_20160630_01.mat': '197413a5339630ea926cbd22b8b43338', 'indy_20170131_02.mat': '2790b1c869564afaa7772dbf9e42d784', 'loco_20170210_03.mat': '4cae63b58c4cb9c8abd44929216c703b', 'loco_20170215_02.mat': '739b70762d838f3a1f358733c426bb02', 'loco_20170301_05.mat': '47342da09f9c950050c9213c3df38ea3'}
remove_segments_by_length()[source]

Remove segments whose duration exceeds the limit set by max_segment_length.

split_data()[source]

Split segments into training/validation/test set.

static split_into_segments(indices)[source]

Combine the start and end index into a NumPy array.

url = 'https://zenodo.org/record/583331/files/'

Mackey-Glass

The Mackey-Glass dataset is synthetic: it consists of a one-dimensional time series generated by a non-linear time-delay differential equation, whose evolution can be altered by a number of different parameters. These parameters are defined in NeuroBench.
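
For reference, a standard form of the Mackey-Glass delay differential equation, written with the parameter names used by the class below, is:

    dx/dt = beta * x(t - tau) / (1 + x(t - tau)^nmg) - gamma * x(t)

With the default parameters (tau=17, nmg=10, beta=0.2, gamma=0.1), the solution lies in the classic chaotic regime.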

class neurobench.datasets.mackey_glass.MackeyGlass(*args: Any, **kwargs: Any)[source]

Bases: Dataset

Dataset for the Mackey-Glass task.

__getitem__(idx)[source]

Getter method for dataset.

Parameters:

idx (int or tensor) – index(s) of sample(s) to return

Returns:
  • sample (tensor) – individual data sample, shape = (timestamps, features) = (1, 1)

  • target (tensor) – corresponding next state of the system, shape = (label,) = (1,)

Return type:

tuple(tensor, tensor)

__init__(file_path=None, tau=17, lyaptime=197, constant_past=0.7206597, nmg=10, beta=0.2, gamma=0.1, pts_per_lyaptime=75, traintime=10.0, testtime=10.0, start_offset=0.0, seed_id=0, bin_window=1, download=True)[source]

Initializes the Mackey-Glass dataset.

Parameters:
  • file_path (str) – path to .npy file containing Mackey-Glass time-series. If this is provided, then tau, lyaptime, constant_past, nmg, beta, gamma are ignored.

  • tau (float) – parameter of the Mackey-Glass equation

  • lyaptime (float) – Lyapunov time of the time-series

  • constant_past (float) – initial condition for the solver

  • nmg (float) – parameter of the Mackey-Glass equation

  • beta (float) – parameter of the Mackey-Glass equation

  • gamma (float) – parameter of the Mackey-Glass equation

  • pts_per_lyaptime (int) – number of points to sample per one Lyapunov time

  • traintime (float) – number of Lyapunov times to be used for training a model

  • testtime (float) – number of Lyapunov times to be used for testing a model

  • start_offset (int) – offset, in number of points, by which the time series is shifted forward

  • seed_id (int) – seed for generating function solution

  • bin_window (int) – number of points forming lookback window for each prediction

  • download (bool) – If True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it will not be downloaded again.
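
A usage sketch with the documented defaults; the ind_train/ind_test attribute names used with Subset are an assumption based on the split_data() method described below:

    from torch.utils.data import Subset
    from neurobench.datasets.mackey_glass import MackeyGlass

    dataset = MackeyGlass(tau=17, lyaptime=197, constant_past=0.7206597,
                          download=True)

    sample, target = dataset[0]  # sample shape (1, 1), target shape (1,)

    # Assumed attribute names holding the indices produced by split_data():
    train_set = Subset(dataset, dataset.ind_train)
    test_set = Subset(dataset, dataset.ind_test)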

download()[source]

Download the Mackey Glass data if it doesn’t exist already.

generate_data()[source]

Generate time-series using the provided parameters of the equation.

load_data(file)[source]
split_data()[source]

Generate training and testing indices.

Multi-Lingual Spoken Word Corpus

MLCommons Multilingual Spoken Words Corpus is a large and growing audio dataset of spoken words in 50 languages for academic research and commercial applications in keyword spotting and spoken term search, licensed under CC-BY 4.0. The dataset contains more than 340,000 keywords, totaling 23.4 million 1-second spoken examples (over 6,000 hours).

The NeuroBench harness does not use the full MSWC dataset. For more information on the subset used, see the NeuroBench paper.

class neurobench.datasets.MSWC_dataset.MSWC(root: str | Path, subset: str | None = None, procedure: str | None = None, language: str | None = None, incremental: bool | None = False, download=True)[source]

Bases: Dataset

Subset version (https://huggingface.co/datasets/NeuroBench/mswc_fscil_subset) of the original MSWC dataset (https://mlcommons.org/en/multilingual-spoken-words/) for a few-shot class-incremental learning (FSCIL) task consisting of 200 voice commands keywords:

  • 100 base classes available for pre-training with:
    • 500 train samples

    • 100 validation samples

    • 100 test samples

  • 100 evaluation classes to do class-incremental learning on with 200 samples each.

The subset of data used for this task, as well as the supporting files for base class and incremental splits, is hosted on Huggingface at the first link above.

The data is given in 48 kHz opus format. Converted 16 kHz wav files are available to download at the link above.

__getitem__(index: int) → Tuple[Tensor, int][source]

Getter method to get waveform samples.

Parameters:

index (int) – Index of the sample.

Returns:
  • sample (tensor) – individual waveform sample, padded to always match dimension (48000, 1)

  • target (int) – corresponding keyword index based on FSCIL_KEYWORDS order (by decreasing number of samples in the original dataset)

Return type:

tuple(tensor, int)

__init__(root: str | Path, subset: str | None = None, procedure: str | None = None, language: str | None = None, incremental: bool | None = False, download=True)[source]

Initialization will create the new base and evaluation splits if needed.

Parameters:
  • root (str) – Path of the data root folder in which the MSWC/ folder containing the dataset is, or will be, located.

  • subset (str) – Return “base” or “evaluation” classes.

  • procedure (str) – For base subset, return “training”, “testing” or “validation” samples.

  • language (str) – Language to use for evaluation task.

  • download (bool) – If True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it will not be downloaded again.
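
A usage sketch loading the base-class training samples for FSCIL pre-training (the root path is illustrative):

    from neurobench.datasets.MSWC_dataset import MSWC

    base_train = MSWC("data", subset="base", procedure="training",
                      download=True)

    waveform, target = base_train[0]  # waveform padded to (48000, 1)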

download()[source]

Download the MSWC FSCIL data if it doesn’t exist already.

class neurobench.datasets.MSWC_dataset.MSWC_query(walker)[source]

Bases: Dataset

Simple Dataset object created for incremental queries.

__getitem__(index: int)[source]

Getter method to get waveform samples.

Parameters:

index (int) – Index of the sample.

Returns:
  • sample (tensor) – individual waveform sample, padded to always match dimension (1, 48000)

  • target (int) – corresponding keyword index based on FSCIL_KEYWORDS order (by decreasing number of samples in the original dataset)

Return type:

tuple(tensor, int)

__init__(walker)[source]

Initialization of the dataset.

Parameters:

walker (list) – List of tuples with data (filename, class_index, dirname)

neurobench.datasets.MSWC_dataset.get_mswc_item(item, dirname, return_path)[source]

Wireless Sensor Data Mining

The “WISDM Smartphone and Smartwatch Activity and Biometrics Dataset” includes data collected from 51 subjects, each of whom was asked to perform 18 tasks for 3 minutes each. Each subject wore a smartwatch on the dominant hand and carried a smartphone in a pocket. Data collection was controlled by a custom-made app running on the smartphone and smartwatch. The sensor data was collected from the accelerometer and gyroscope on both the smartphone and smartwatch, yielding four sensors in total, at a rate of 20 Hz (i.e., every 50 ms).

class neurobench.datasets.WISDM.WISDM(*args: Any, **kwargs: Any)[source]

Bases: LightningDataModule

__init__(path: str = 'path/to/file', batch_size: int = 256)[source]
static load_wisdm2_data(path)[source]
predict_dataloader()[source]
setup(stage: str)[source]
test_dataloader()[source]
train_dataloader()[source]
val_dataloader()[source]
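
Since WISDM is a LightningDataModule, it follows the standard PyTorch Lightning protocol; a minimal usage sketch (the path and stage value are illustrative):

    from neurobench.datasets.WISDM import WISDM

    data_module = WISDM(path="data/wisdm", batch_size=256)
    data_module.setup("test")  # prepare the requested split
    test_loader = data_module.test_dataloader()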
neurobench.datasets.WISDM.convert_to_tensor(x, y)[source]