---
title: MLRC-BENCH
emoji: 📊
colorFrom: green
colorTo: blue
sdk: streamlit
sdk_version: 1.39.0
app_file: app.py
pinned: false
license: cc-by-4.0
---

## Installation & Setup

1. Clone the repository:

   ```bash
   git clone https://huggingface.co/spaces/launch/MLRC_Bench
   cd MLRC_Bench
   ```

2. Set up a virtual environment and install the required dependencies:

   ```bash
   python -m venv env
   source env/bin/activate
   pip install -r requirements.txt
   ```

3. Run the application:

   ```bash
   streamlit run app.py
   ```

## Updating Metrics

To update the table, edit the corresponding metric file in the `src/data/metrics/` directory.
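
For instance, here is a minimal sketch of such an edit, assuming the task/model/value JSON layout described under "Adding New Metrics" below. The file, task, and model names are hypothetical placeholders:

```python
import json

# Hypothetical file name; substitute the metric file you want to update.
path = "src/data/metrics/new_metric.json"

with open(path) as f:
    data = json.load(f)

# Add or overwrite one model's score for one task (placeholder names).
data.setdefault("task-name", {})["model-name-1"] = 42.0

with open(path, "w") as f:
    json.dump(data, f, indent=2)
```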

## Updating Text

To update the Benchmark Details tab, edit `src/components/tasks.py`. The metric definitions also live in `src/components/tasks.py`.
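
As a rough sketch only (the actual layout of `src/components/tasks.py` may differ, and the names below are hypothetical), such text typically lives in plain Python constants, so updating it means editing the relevant strings:

```python
# Hypothetical structure for src/components/tasks.py; the real file may differ.
TASK_DESCRIPTIONS = {
    "task-name": "Text shown on the Benchmark Details tab for this task.",
}

METRIC_DEFINITIONS = {
    "Margin to Human": "Definition text displayed for this metric.",
}
```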

## Adding New Metrics

To add a new metric:

1. Create a new JSON data file in the `src/data/metrics/` directory (e.g., `src/data/metrics/new_metric.json`).

2. Update `metrics_config` in `src/utils/config.py`:

   ```python
   metrics_config = {
       "Margin to Human": { ... },
       "New Metric Name": {
           "file": "src/data/metrics/new_metric.json",
           "description": "Description of the new metric",
           "min_value": 0,
           "max_value": 100,
           "color_map": "viridis"
       }
   }
   ```
    
3. Ensure your metric JSON file follows the same format as the existing metrics (a validation sketch follows this list):

   ```json
   {
     "task-name": {
       "model-name-1": value,
       "model-name-2": value
     },
     "another-task": {
       "model-name-1": value,
       "model-name-2": value
     }
   }
   ```
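
If you want to sanity-check a new file before committing it, here is a minimal sketch, assuming the task/model/value layout shown above. The file name is a hypothetical placeholder:

```python
import json

# Hypothetical file name; point this at the new metric file you created.
path = "src/data/metrics/new_metric.json"

with open(path) as f:
    data = json.load(f)

# Expected shape: {"task-name": {"model-name": numeric_score, ...}, ...}
for task, scores in data.items():
    assert isinstance(scores, dict), f"{task}: expected a dict of model scores"
    for model, value in scores.items():
        assert isinstance(value, (int, float)), f"{task}/{model}: score must be numeric"

print("Format matches the existing metric files.")
```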
    

## Adding New Agent Types

To add a new agent type:

1. Update `model_categories` in `src/utils/config.py` (see the grouping sketch after this step):

   ```python
   model_categories = {
       "Existing Model": "Category",
       "New Model Name": "New Category"
   }
   ```
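
For context, a minimal sketch of how such a mapping can be used to group models by category. This is illustrative only; the app's actual grouping logic lives in its own components:

```python
from collections import defaultdict

# The same mapping as in src/utils/config.py (placeholder names).
model_categories = {
    "Existing Model": "Category",
    "New Model Name": "New Category",
}

# Invert it: category -> models, which is how a leaderboard can group rows.
models_by_category = defaultdict(list)
for model, category in model_categories.items():
    models_by_category[category].append(model)

print(dict(models_by_category))
# {'Category': ['Existing Model'], 'New Category': ['New Model Name']}
```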
    

## License

MIT License