---
title: matching_series
tags:
  - evaluate
  - metric
description: Matching-based time-series generation metric
sdk: gradio
sdk_version: 3.5
app_file: app.py
pinned: false
---

# Metric Card for matching_series

## Metric Description

Matching Series is a metric for evaluating time-series generation models. It is based on the idea of matching generated time-series to the original (reference) time-series. The metric computes the Mean Squared Error (MSE) between matched pairs of generated and reference series. It outputs a score greater than or equal to 0, where 0 indicates a perfect generation.
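As a sketch of the underlying idea (this nearest-neighbour formulation is an assumption for illustration; the implementation's exact matching rule may differ), let $G$ be the set of generated series and $R$ the set of reference series:

$$
\mathrm{matching\_mse} = \frac{1}{|G|} \sum_{x \in G} \min_{y \in R} \mathrm{MSE}(x, y), \qquad
\mathrm{covered\_mse} = \frac{1}{|R|} \sum_{y \in R} \min_{x \in G} \mathrm{MSE}(x, y)
$$

The first score rewards fidelity (every generation lies close to some reference), the second rewards coverage (every reference is approximated by some generation), and their harmonic mean summarizes both.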

## How to Use

At minimum, the metric requires the original time-series and the generated time-series as input. The metric can be used to evaluate the performance of time-series generation models. For example:

```python
>>> import numpy as np
>>> import evaluate
>>> num_generation = 100
>>> num_reference = 10
>>> seq_len = 100
>>> num_features = 10
>>> references = np.random.rand(num_reference, seq_len, num_features)
>>> predictions = np.random.rand(num_generation, seq_len, num_features)
>>> metric = evaluate.load("bowdbeg/matching_series")
>>> results = metric.compute(references=references, predictions=predictions, batch_size=1000)
>>> print(results)
{'matching_mse': 0.15250070138019745, 'harmonic_mean': 0.15246672297315564, 'covered_mse': 0.15243275970407652, 'index_mse': 0.16772539808686357, 'matching_mse_features': [0.11976368411913872, 0.1238622735860897, 0.1235259257706047, 0.12385236248438022, 0.12241466736218365, 0.12328439290438079, 0.1232240061707885, 0.12342319803028035, 0.12235222572924524, 0.12437865819262514], 'harmonic_mean_features': [0.12010478503934609, 0.12379899085819131, 0.12321441761307182, 0.12273884163905005, 0.12256126537300535, 0.12323289686030311, 0.12323847434641247, 0.12333469339243568, 0.12273530480438972, 0.12390254295493403], 'covered_mse_features': [0.12044783449951382, 0.1237357727610885, 0.12290447662839017, 0.12164516506865233, 0.12270821492248948, 0.12318144381818667, 0.12325294591995689, 0.12324631559392285, 0.12312079021887229, 0.12343005890751833], 'index_mse_features': [0.16331894487549958, 0.1679797859239729, 0.16904075114728268, 0.16962427920551068, 0.16915910655024802, 0.16686197230602684, 0.17056311327206022, 0.1638796919248867, 0.16736730842643857, 0.16945902723670975], 'macro_matching_mse': 0.1230081394349717, 'macro_covered_mse': 0.12276730183385913, 'macro_harmonic_mean': 0.12288622128811397}
```

### Inputs

- **predictions** (`list` of `list` of `list` of `float` or `numpy.ndarray`): The generated time-series. The shape of the array should be `(num_generation, seq_len, num_features)`.
- **references** (`list` of `list` of `list` of `float` or `numpy.ndarray`): The original time-series. The shape of the array should be `(num_reference, seq_len, num_features)`.
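For instance, a minimal call with plain nested lists instead of NumPy arrays (the values are illustrative only; `batch_size` is kept from the example above):

```python
import evaluate

metric = evaluate.load("bowdbeg/matching_series")

# Three generated series and two reference series,
# each with seq_len=3 and num_features=2.
predictions = [
    [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]],
    [[0.2, 0.8], [0.3, 0.7], [0.4, 0.6]],
    [[0.5, 0.5], [0.6, 0.4], [0.7, 0.3]],
]
references = [
    [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]],
    [[0.6, 0.4], [0.7, 0.3], [0.8, 0.2]],
]
results = metric.compute(references=references, predictions=predictions, batch_size=1000)
```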

### Output Values

The metric returns a dictionary of scores. As the example above illustrates, it contains:

- **matching_mse**: the MSE between each generated series and its matched reference series, averaged over the generated set.
- **covered_mse**: the MSE between each reference series and its matched generated series, averaged over the reference set.
- **harmonic_mean**: the harmonic mean of `matching_mse` and `covered_mse`.
- **index_mse**: the MSE between generated and reference series paired by index.
- **matching_mse_features**, **covered_mse_features**, **harmonic_mean_features**, **index_mse_features**: the corresponding scores computed per feature, as a list with one value per feature.
- **macro_matching_mse**, **macro_covered_mse**, **macro_harmonic_mean**: the per-feature scores averaged over features.

All scores are greater than or equal to 0, and lower is better; 0 indicates a perfect generation.
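For example, to inspect the per-feature scores from the `results` dictionary above:

```python
# Each *_features entry is a list with one score per feature.
for i, mse in enumerate(results["matching_mse_features"]):
    print(f"feature {i}: matching MSE = {mse:.4f}")
```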

#### Values from Popular Papers

Give examples, preferably with links to leaderboards or publications, to papers that have reported this metric, along with the values they have reported.

### Examples
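A simple sanity check (a sketch, assuming the nearest-neighbour matching described above): evaluating a set of series against itself should drive the matching score to its optimum of 0.

```python
import evaluate
import numpy as np

metric = evaluate.load("bowdbeg/matching_series")

# When predictions and references are identical, every generated series
# has an exact counterpart among the references, so matching_mse should be 0.
series = np.random.rand(10, 50, 3)  # (num_series, seq_len, num_features)
results = metric.compute(references=series, predictions=series, batch_size=1000)
print(results["matching_mse"])  # expected: 0.0
```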

## Limitations and Bias

Note any known limitations or biases that the metric has, with links and references if possible.

## Citation

Cite the source where this metric was introduced.

## Further References

Add any useful further references.