---
title: MLRC-BENCH
emoji: 📊
colorFrom: green
colorTo: blue
sdk: streamlit
sdk_version: 1.39.0
app_file: app.py
pinned: false
license: cc-by-4.0
---

## Installation & Setup

1. Clone the repository:

   ```bash
   git clone https://huggingface.co/spaces/launch/MLRC_Bench
   cd MLRC_Bench
   ```

2. Set up a virtual environment and install the required dependencies:

   ```bash
   python -m venv env
   source env/bin/activate
   pip install -r requirements.txt
   ```

3. Run the application:

   ```bash
   streamlit run app.py
   ```

## Updating Metrics

To update the table, edit the corresponding metric file in the `src/data/metrics/` directory.
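
For instance, here is a minimal sketch of such an edit, assuming the task/model/value JSON layout described under "Adding New Metrics" below. The file, task, and model names are hypothetical placeholders:

```python
import json

# Hypothetical file name; substitute the metric file you want to update.
path = "src/data/metrics/new_metric.json"

with open(path) as f:
    data = json.load(f)

# Add or overwrite one model's score for one task (placeholder names).
data.setdefault("task-name", {})["model-name-1"] = 42.0

with open(path, "w") as f:
    json.dump(data, f, indent=2)
```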

## Updating Text

To update the Benchmark Details tab, edit `src/components/tasks.py`. The metric definitions also live in `src/components/tasks.py`.
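
As a rough sketch only (the actual layout of `src/components/tasks.py` may differ, and the names below are hypothetical), such text typically lives in plain Python constants, so updating it means editing the relevant strings:

```python
# Hypothetical structure for src/components/tasks.py; the real file may differ.
TASK_DESCRIPTIONS = {
    "task-name": "Text shown on the Benchmark Details tab for this task.",
}

METRIC_DEFINITIONS = {
    "Margin to Human": "Definition text displayed for this metric.",
}
```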

## Adding New Metrics

To add a new metric:

1. Create a new JSON data file in the `src/data/metrics/` directory (e.g., `src/data/metrics/new_metric.json`).

2. Update `metrics_config` in `src/utils/config.py`:

   ```python
   metrics_config = {
       "Margin to Human": { ... },
       "New Metric Name": {
           "file": "src/data/metrics/new_metric.json",
           "description": "Description of the new metric",
           "min_value": 0,
           "max_value": 100,
           "color_map": "viridis"
       }
   }
   ```
    
3. Ensure your metric JSON file follows the same format as the existing metrics (a validation sketch follows this list):

   ```json
   {
     "task-name": {
       "model-name-1": value,
       "model-name-2": value
     },
     "another-task": {
       "model-name-1": value,
       "model-name-2": value
     }
   }
   ```
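
If you want to sanity-check a new file before committing it, here is a minimal sketch, assuming the task/model/value layout shown above. The file name is a hypothetical placeholder:

```python
import json

# Hypothetical file name; point this at the new metric file you created.
path = "src/data/metrics/new_metric.json"

with open(path) as f:
    data = json.load(f)

# Expected shape: {"task-name": {"model-name": numeric_score, ...}, ...}
for task, scores in data.items():
    assert isinstance(scores, dict), f"{task}: expected a dict of model scores"
    for model, value in scores.items():
        assert isinstance(value, (int, float)), f"{task}/{model}: score must be numeric"

print("Format matches the existing metric files.")
```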
    

## Adding New Agent Types

To add a new agent type:

1. Update `model_categories` in `src/utils/config.py` (see the grouping sketch after this step):

   ```python
   model_categories = {
       "Existing Model": "Category",
       "New Model Name": "New Category"
   }
   ```
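
For context, a minimal sketch of how such a mapping can be used to group models by category. This is illustrative only; the app's actual grouping logic lives in its own components:

```python
from collections import defaultdict

# The same mapping as in src/utils/config.py (placeholder names).
model_categories = {
    "Existing Model": "Category",
    "New Model Name": "New Category",
}

# Invert it: category -> models, which is how a leaderboard can group rows.
models_by_category = defaultdict(list)
for model, category in model_categories.items():
    models_by_category[category].append(model)

print(dict(models_by_category))
# {'Category': ['Existing Model'], 'New Category': ['New Model Name']}
```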
    

## License

MIT License