neurobench.benchmarks

Benchmark

class neurobench.benchmarks.Benchmark(model, dataloader, preprocessors, postprocessors, metric_list)[source]

Bases: object

Top-level benchmark class for running benchmarks.

run(quiet=False, verbose: bool = False, dataloader=None, preprocessors=None, postprocessors=None, device=None)[source]

Runs batched evaluation of the benchmark.

Parameters:
  • dataloader (optional) – override DataLoader for this run.

  • preprocessors (optional) – override preprocessors for this run.

  • postprocessors (optional) – override postprocessors for this run.

  • quiet (bool, default=False) – If True, output is suppressed.

  • verbose (bool, default=False) – If True, metrics for each batch are printed. If False (default), metrics are accumulated and printed after all batches are processed.

  • device (optional) – use device for this run (e.g. ‘cuda’ or ‘cpu’).

Returns:

A dictionary of results.

Return type:

dict
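Conceptually, run() performs a batched evaluation loop. The following is an illustrative sketch of that flow, not the actual NeuroBench implementation; the metric functions are assumed to follow the (model, preds, data) signature used by the workload metrics below, and per-batch scores are simply averaged here.

```python
def run_sketch(model, dataloader, preprocessors, postprocessors, metric_fns):
    # Sketch of Benchmark.run: iterate batches, apply preprocessors,
    # run the model, apply postprocessors, accumulate metric scores.
    totals = {name: 0.0 for name in metric_fns}
    n_batches = 0
    for data, labels in dataloader:
        for pre in preprocessors:
            data = pre(data)
        preds = model(data)
        for post in postprocessors:
            preds = post(preds)
        for name, fn in metric_fns.items():
            totals[name] += fn(model, preds, (data, labels))
        n_batches += 1
    # Results dictionary, one entry per metric.
    return {name: total / n_batches for name, total in totals.items()}
```

Accumulated metrics (see AccumulatedMetric below) keep state across batches instead of being averaged this way.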

Workload Metrics

class neurobench.benchmarks.workload_metrics.AccumulatedMetric[source]

Bases: object

Abstract class for a metric which must save state between batches.

compute()[source]

Compute the metric score using all accumulated data.

Returns:

the final accumulated metric.

Return type:

result

reset()[source]

Reset the metric state.

This is called when the benchmark is run again, e.g. on the FSCIL task the benchmark is run at the end of each session.
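A minimal subclass sketch of the compute()/reset() contract. The __call__ hook and its (model, preds, data) signature are assumptions chosen to mirror the workload-metric functions, not the confirmed NeuroBench interface.

```python
class RunningMeanMetric:
    # Hypothetical AccumulatedMetric-style subclass: accumulates a sum
    # and a count across batches, then reports the mean in compute().
    def __init__(self):
        self.reset()

    def reset(self):
        # Clear accumulated state, e.g. between benchmark runs.
        self.total = 0.0
        self.count = 0

    def __call__(self, model, preds, data):
        # Per-batch accumulation; signature mirrors the workload
        # metrics (an assumption for illustration).
        self.total += sum(preds)
        self.count += len(preds)
        return self.compute()

    def compute(self):
        # Final metric over all accumulated data.
        return self.total / self.count
```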

class neurobench.benchmarks.workload_metrics.COCO_mAP[source]

Bases: AccumulatedMetric

COCO mean average precision.

Measured for event data based on Perot2020, Supplementary B (https://arxiv.org/abs/2009.13436)
  • Skips first 0.5s of each sequence

  • Bounding boxes with diagonal size smaller than 60 pixels are ignored

compute()[source]

Compute COCO mAP using accumulated data.

reset()[source]

Reset metric state.
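The two filtering rules from Perot2020, Supplementary B can be sketched as a per-detection predicate (function name and argument layout are illustrative, not the NeuroBench API):

```python
from math import hypot

def keep_box_sketch(t_seconds, x1, y1, x2, y2):
    # Drop detections in the first 0.5 s of a sequence.
    if t_seconds < 0.5:
        return False
    # Ignore bounding boxes with a diagonal under 60 pixels.
    return hypot(x2 - x1, y2 - y1) >= 60
```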

neurobench.benchmarks.workload_metrics.MSE(model, preds, data)[source]

Mean squared error of the model predictions.

Parameters:
  • model – A NeuroBenchModel.

  • preds – A tensor of model predictions.

  • data – A tuple of data and labels.

Returns:

Mean squared error.

Return type:

float
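The computation reduces to the usual mean of squared differences; a plain-Python sketch (the library implementation operates on tensors):

```python
def mse_sketch(preds, labels):
    # Mean squared error over paired predictions and labels.
    assert len(preds) == len(labels)
    return sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(preds)
```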

neurobench.benchmarks.workload_metrics.activation_sparsity(model, preds, data)[source]

Sparsity of model activations.

Calculated as the number of zero activations over the total number of activations, across all layers, timesteps, and samples in the data.

Parameters:
  • model – A NeuroBenchModel.

  • preds – A tensor of model predictions.

  • data – A tuple of data and labels.

Returns:

Activation sparsity.

Return type:

float
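A sketch of the ratio on a flattened view of the activations (the library gathers these via hooks; here they are passed in directly):

```python
def activation_sparsity_sketch(activations):
    # activations: all activation values flattened across layers,
    # timesteps, and samples. Sparsity = zero activations / total.
    zeros = sum(1 for a in activations if a == 0)
    return zeros / len(activations)
```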

neurobench.benchmarks.workload_metrics.classification_accuracy(model, preds, data)[source]

Classification accuracy of the model predictions.

Parameters:
  • model – A NeuroBenchModel.

  • preds – A tensor of model predictions.

  • data – A tuple of data and labels.

Returns:

Classification accuracy.

Return type:

float
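A sketch of the underlying computation on already-decoded class labels (postprocessors typically handle the decoding from raw predictions):

```python
def accuracy_sketch(pred_labels, true_labels):
    # Fraction of predicted labels that match the true labels.
    correct = sum(1 for p, y in zip(pred_labels, true_labels) if p == y)
    return correct / len(true_labels)
```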

neurobench.benchmarks.workload_metrics.detect_activations_connections(model)[source]

Register hooks or other operations that should be called before running a benchmark.

class neurobench.benchmarks.workload_metrics.membrane_updates[source]

Bases: AccumulatedMetric

Number of membrane potential updates.

This metric can only be used for spiking models implemented with SNNTorch.

compute()[source]

Compute membrane updates using accumulated data.

Returns:

The total number of updates to each neuron’s membrane potential within the model, aggregated across all neurons and normalized by the number of samples processed.

Return type:

float

reset()[source]

Reset metric state.

neurobench.benchmarks.workload_metrics.number_neuron_updates(model, preds, data)[source]

Number of times each neuron type is updated.

Parameters:
  • model – A NeuroBenchModel.

  • preds – A tensor of model predictions.

  • data – A tuple of data and labels.

Returns:

key is neuron type, value is number of updates.

Return type:

dict
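A sketch of the returned dictionary's shape, assuming update events have been gathered per neuron type (e.g. from forward hooks; the gathering mechanism here is an assumption):

```python
from collections import Counter

def neuron_updates_sketch(update_events):
    # update_events: one neuron-type name per observed update.
    # Returns {neuron type: number of updates}.
    return dict(Counter(update_events))
```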

class neurobench.benchmarks.workload_metrics.r2[source]

Bases: AccumulatedMetric

R2 Score of the model predictions.

Currently implemented for 2D output only.

compute()[source]

Compute r2 score using accumulated data.

reset()[source]

Reset metric state.
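For the 2D case, a sketch that computes R2 per output dimension and averages the two scores; the averaging convention is an assumption for illustration:

```python
def r2_sketch(preds, targets):
    # preds/targets: lists of (x, y) pairs.
    scores = []
    for d in range(2):
        y = [t[d] for t in targets]
        y_hat = [p[d] for p in preds]
        mean_y = sum(y) / len(y)
        ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
        ss_tot = sum((a - mean_y) ** 2 for a in y)
        scores.append(1.0 - ss_res / ss_tot)
    return sum(scores) / len(scores)
```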

neurobench.benchmarks.workload_metrics.sMAPE(model, preds, data)[source]

Symmetric mean absolute percentage error of the model predictions.

Parameters:
  • model – A NeuroBenchModel.

  • preds – A tensor of model predictions.

  • data – A tuple of data and labels.

Returns:

Symmetric mean absolute percentage error.

Return type:

float
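A sketch using the common sMAPE variant 100 · mean(2·|p − t| / (|p| + |t|)); this exact formula is an assumption, as the source does not state which variant is used:

```python
def smape_sketch(preds, targets):
    # Symmetric mean absolute percentage error, in percent.
    terms = [2 * abs(p - t) / (abs(p) + abs(t)) for p, t in zip(preds, targets)]
    return 100.0 * sum(terms) / len(terms)
```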

class neurobench.benchmarks.workload_metrics.synaptic_operations[source]

Bases: AccumulatedMetric

Number of synaptic operations.

Counted as multiply-accumulates (MACs) for ANNs and accumulates (ACs) for SNNs.

compute()[source]

Compute the metric score using all accumulated data.

Returns:

the final accumulated metric.

Return type:

result

reset()[source]

Reset the metric state.

This is called when the benchmark is run again, e.g. on the FSCIL task the benchmark is run at the end of each session.
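A sketch of the counting rule for a single fully connected layer (the library traces real layers via hooks; the function below is an illustration of the principle only):

```python
def synaptic_ops_sketch(inputs, fan_out, spiking):
    # Every nonzero input triggers one operation per outgoing
    # connection. With real-valued inputs (ANN) these are
    # multiply-accumulates (MACs); with binary spikes (SNN) the
    # multiply is trivial, so they count as accumulates (ACs).
    ops = sum(fan_out for x in inputs if x != 0)
    return ("AC" if spiking else "MAC", ops)
```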

Static Metrics

neurobench.benchmarks.static_metrics.connection_sparsity(model)[source]

Sparsity of model connections between layers, calculated as the number of zero weights over the total number of weights in supported layers. Other layer types are not taken into account in the computation. Supported layers:
  • Linear

  • Conv1d, Conv2d, Conv3d

  • RNN, RNNBase, RNNCell

  • LSTM, LSTMBase, LSTMCell

  • GRU, GRUBase, GRUCell

Parameters:

model – A NeuroBenchModel.

Returns:

Connection sparsity, rounded to 3 decimals.

Return type:

float
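A sketch of the ratio over the weights of the supported layers (the library reads these from the model; here they are passed in as nested lists):

```python
def connection_sparsity_sketch(weight_matrices):
    # weight_matrices: per-layer 2D weight lists.
    # Sparsity = zero weights / total weights, rounded to 3 decimals.
    total = sum(len(row) for w in weight_matrices for row in w)
    zeros = sum(1 for w in weight_matrices for row in w for v in row if v == 0)
    return round(zeros / total, 3)
```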

neurobench.benchmarks.static_metrics.footprint(model)[source]

Memory footprint of the model.

Parameters:

model – A NeuroBenchModel.

Returns:

Model size in bytes.

Return type:

float
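The size reduces to parameter count times bytes per element; a sketch assuming 4-byte (float32) storage, which is an assumption about the dtype rather than something the source states:

```python
from math import prod

def footprint_sketch(param_shapes, bytes_per_element=4):
    # Model size in bytes: total number of elements across all
    # parameter tensors times the element size.
    return sum(prod(shape) for shape in param_shapes) * bytes_per_element
```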

neurobench.benchmarks.static_metrics.parameter_count(model)[source]

Number of parameters in the model.

Parameters:

model – A NeuroBenchModel.

Returns:

Number of parameters.

Return type:

int