---
title: Bug Priority Multiclass
emoji: 💻
colorFrom: red
colorTo: gray
sdk: docker
pinned: false
short_description: Multiclass bug priority classification model
tags:
  - text-classification
  - accessibility
  - bug-triage
  - transformers
  - roberta
  - pytorch-lightning
license: apache-2.0
datasets:
  - custom
language:
  - en
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# RoBERTa Base Model for Accessibility Bug Priority Classification

This model fine-tunes `roberta-base` on a labeled dataset of accessibility-related bug descriptions to automatically classify their priority level. It helps automate the triage of bugs affecting users of screen readers and other assistive technologies.
## 🧠 Problem Statement
Modern applications often suffer from accessibility issues that impact users with disabilities, such as content not being read properly by screen readers like VoiceOver, NVDA, or JAWS. These bugs are often reported via issue trackers or user forums in the form of short text summaries.
Due to the unstructured and domain-specific nature of these reports, manual triage is:
- Time-consuming
- Inconsistent
- Often delayed, which postpones resolution
There is a critical need to prioritize accessibility bugs quickly and accurately to ensure inclusive user experiences.
## 🎯 Research Objective
This research project builds a machine learning model that can automatically assign a priority level to an accessibility bug report. The goal is to:
- Streamline accessibility QA workflows
- Accelerate high-impact fixes
- Empower developers and testers with ML-assisted tooling
## 📊 Dataset Statistics
The dataset used for training consists of real-world accessibility bug reports, each labeled with one of four priority levels. The distribution of labels is imbalanced, and label-aware preprocessing steps were taken to improve model performance.
| Label | Priority Level | Count |
|-------|----------------|-------|
| 0 | Blocker | 804 |
| 1 | Critical | 2,035 |
| 2 | Major | 1,465 |
| 3 | Minor | 756 |

**Total samples:** 5,060
## 🧹 Preprocessing
- Text normalization and cleanup
- Length filtering based on token count
- Label frequency normalization for class-weighted loss
To address class imbalance, class weights were computed as inverse label frequency and used in the cross-entropy loss during training.
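A minimal sketch of this weighting scheme, assuming labels are available as a list of integers (the counts below mirror the dataset table above; the exact normalization constant used in training is an assumption):

```python
import torch
from collections import Counter

# Illustrative label list reproducing the counts from the dataset table
# (0: Blocker, 1: Critical, 2: Major, 3: Minor)
labels = [0] * 804 + [1] * 2035 + [2] * 1465 + [3] * 756

# Inverse-frequency weights: rarer classes contribute more to the loss
counts = Counter(labels)
num_classes = len(counts)
total = len(labels)
weights = torch.tensor(
    [total / (num_classes * counts[c]) for c in range(num_classes)],
    dtype=torch.float,
)

# Weighted cross-entropy, as used during fine-tuning
loss_fn = torch.nn.CrossEntropyLoss(weight=weights)
```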
## 🧪 Dataset Description
The dataset consists of short bug report texts labeled with one of four priority levels:
| Label | Meaning |
|-------|----------|
| 0 | Blocker |
| 1 | Critical |
| 2 | Major |
| 3 | Minor |
### ✍️ Sample Entries

```csv
Text,Label
"mac voiceover screen reader",3
"Firefox crashes when interacting with some MathML content using Voiceover on Mac",0
"VoiceOver skips over text in paragraphs which contain <strong> or <em> tags",2
```
## 📊 Model Comparison
We fine-tuned and evaluated three transformer models under identical training conditions using PyTorch Lightning (multi-GPU, mixed precision, and weighted loss); a skeleton of this setup appears after the table. Validation accuracy and weighted F1 scores are as follows:
| Model | Base Architecture | Validation Accuracy | Weighted F1 Score |
|-------|-------------------|---------------------|-------------------|
| DeBERTa-v3 Base | `microsoft/deberta-v3-base` | 69% | 0.69 |
| ALBERT Base | `albert-base-v2` | 68% | 0.68 |
| RoBERTa Base | `roberta-base` | 66% | 0.67 |
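The training script itself is not included in this card; the skeleton below shows one way to express the setup described above (weighted loss, mixed precision, multi-GPU) in PyTorch Lightning 2.x. The class name, hyperparameters, and batch keys are illustrative assumptions:

```python
import pytorch_lightning as pl
import torch
from transformers import AutoModelForSequenceClassification

class PriorityClassifier(pl.LightningModule):
    """Wraps a Hugging Face encoder with a weighted cross-entropy objective."""

    def __init__(self, model_name: str, class_weights: torch.Tensor, lr: float = 2e-5):
        super().__init__()
        self.model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=4)
        self.loss_fn = torch.nn.CrossEntropyLoss(weight=class_weights)
        self.lr = lr

    def training_step(self, batch, batch_idx):
        outputs = self.model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"])
        loss = self.loss_fn(outputs.logits, batch["labels"])
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.lr)

# Mixed precision on all available GPUs (Lightning 2.x flags)
trainer = pl.Trainer(max_epochs=5, precision="16-mixed", accelerator="gpu", devices=-1)
# trainer.fit(PriorityClassifier(...), train_dataloader) would then launch training
```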
## 🔍 Observations
- DeBERTa delivered the best performance, likely due to its disentangled attention and enhanced positional encoding.
- ALBERT performed surprisingly well despite having fewer parameters, showcasing its efficiency.
- RoBERTa provided stable and reliable results but slightly underperformed compared to the others.
## Model Details

The released checkpoint fine-tunes `roberta-base` on the 4-class custom dataset described above, trained with PyTorch Lightning using mixed precision on multiple GPUs.
- Model: `roberta-base`
- Framework: PyTorch Lightning
- Labels: 0 (Blocker), 1 (Critical), 2 (Major), 3 (Minor)
- Validation F1: 0.71 (weighted)
## Usage
```python
from transformers import RobertaTokenizer, RobertaForSequenceClassification
import torch

model = RobertaForSequenceClassification.from_pretrained("shivamjadhav/roberta-priority-multiclass")
tokenizer = RobertaTokenizer.from_pretrained("shivamjadhav/roberta-priority-multiclass")

# Map class indices to the priority names used in this card
id2label = {0: "Blocker", 1: "Critical", 2: "Major", 3: "Minor"}

inputs = tokenizer("VoiceOver skips over text with <strong> tags", return_tensors="pt")
with torch.no_grad():  # inference only; no gradients needed
    outputs = model(**inputs)
prediction = torch.argmax(outputs.logits, dim=1).item()
print("Predicted Priority:", id2label[prediction])
```