Spaces:
Running
Running
| title: Wilcoxon | |
| emoji: π€ | |
| colorFrom: blue | |
| colorTo: green | |
| sdk: gradio | |
| sdk_version: 3.0.2 | |
| app_file: app.py | |
| pinned: false | |
| tags: | |
| - evaluate | |
| - comparison | |
| description: >- | |
| Wilcoxon's test is a signed-rank test for comparing paired samples. | |
| # Comparison Card for Wilcoxon | |
| ## Comparison description | |
| Wilcoxon's test is a non-parametric signed-rank test that tests whether the distribution of the differences is symmetric about zero. It can be used to compare the predictions of two models. | |
| ## How to use | |
| The Wilcoxon comparison is used to analyze paired ordinal data. | |
| ## Inputs | |
| Its arguments are: | |
| `predictions1`: a list of predictions from the first model. | |
| `predictions2`: a list of predictions from the second model. | |
| ## Output values | |
| The Wilcoxon comparison outputs two things: | |
| `stat`: The Wilcoxon statistic. | |
| `p`: The p value. | |
| ## Examples | |
| Example comparison: | |
| ```python | |
| wilcoxon = evaluate.load("wilcoxon") | |
| results = wilcoxon.compute(predictions1=[-7, 123.45, 43, 4.91, 5], predictions2=[1337.12, -9.74, 1, 2, 3.21]) | |
| print(results) | |
| {'stat': 5.0, 'p': 0.625} | |
| ``` | |
| ## Limitations and bias | |
| The Wilcoxon test is a non-parametric test, so it has relatively few assumptions (basically only that the observations are independent). It should be used to analyze paired ordinal data only. | |
| ## Citations | |
| ```bibtex | |
| @incollection{wilcoxon1992individual, | |
| title={Individual comparisons by ranking methods}, | |
| author={Wilcoxon, Frank}, | |
| booktitle={Breakthroughs in statistics}, | |
| pages={196--202}, | |
| year={1992}, | |
| publisher={Springer} | |
| } | |
| ``` | |