shivamjadhav committed on
Commit a896d2d
·
1 Parent(s): 0fe35b7

created Bug Priority model and hugging face deployment read project

Files changed (1): README.md +125 −0
README.md CHANGED
@@ -9,3 +9,128 @@ short_description: This is a Multiclass Bug Priority Model
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
tags:
- text-classification
- accessibility
- bug-triage
- transformers
- roberta
- pytorch-lightning
license: apache-2.0
datasets:
- custom
language:
- en
# RoBERTa Base Model for Accessibility Bug Priority Classification

This model fine-tunes `roberta-base` on a labeled dataset of accessibility-related bug descriptions to automatically classify their **priority level**. It helps automate the triage of bugs affecting users of screen readers and other assistive technologies.
## 🧠 Problem Statement

Modern applications often suffer from accessibility issues that impact users with disabilities, such as content not being read properly by screen readers like **VoiceOver**, **NVDA**, or **JAWS**. These bugs are typically reported via issue trackers or user forums as short text summaries.

Because these reports are unstructured and domain-specific, manual triage is:
- Time-consuming
- Inconsistent
- Prone to delayed resolution

There is a critical need to **prioritize accessibility bugs quickly and accurately** to ensure inclusive user experiences.
## 🎯 Research Objective

This research project builds a machine learning model that can **automatically assign a priority level** to an accessibility bug report. The goals are to:

- Streamline accessibility QA workflows
- Accelerate high-impact fixes
- Empower developers and testers with ML-assisted tooling
## 📊 Dataset Statistics

The dataset used for training consists of real-world accessibility bug reports, each labeled with one of four priority levels. The label distribution is imbalanced, and label-aware preprocessing steps were taken to improve model performance.

| Label | Priority Level | Count |
|-------|----------------|-------|
| 1 | Medium | 2035 |
| 2 | High | 1465 |
| 0 | Low | 804 |
| 3 | Critical | 756 |

**Total Samples**: 5,060
### 🧹 Preprocessing

- Text normalization and cleanup
- Length filtering based on token count
- Label frequency normalization for class-weighted loss

To address class imbalance, class weights were computed as inverse label frequency and used in the cross-entropy loss during training.
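The inverse-frequency weighting described above can be sketched as follows. The exact normalization used in training isn't stated, so the `total / (num_classes * count)` form below is an assumption (a common choice); the counts come from the dataset statistics table.

```python
# Label counts from the dataset statistics table
counts = {0: 804, 1: 2035, 2: 1465, 3: 756}  # Low, Medium, High, Critical
total = sum(counts.values())  # 5060
num_classes = len(counts)

# Inverse-frequency class weights: rarer labels get larger weights.
# The total / (num_classes * count) normalization is an assumption;
# plain 1 / count behaves the same up to a constant factor.
weights = [total / (num_classes * counts[c]) for c in sorted(counts)]

for c, w in zip(sorted(counts), weights):
    print(f"class {c}: weight {w:.3f}")

# These weights would then be passed to the loss, e.g. with PyTorch:
# loss_fn = torch.nn.CrossEntropyLoss(weight=torch.tensor(weights))
```

Note that class 3 (Critical, the rarest label) receives the largest weight, so mistakes on it are penalized most heavily.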
## 🧪 Dataset Description

The dataset consists of short bug report texts labeled with one of four priority levels:

| Label | Meaning |
|-------|----------|
| 0 | Low |
| 1 | Medium |
| 2 | High |
| 3 | Critical |
### ✏️ Sample Entries

```csv
Text,Label
"mac voiceover screen reader",3
"Firefox crashes when interacting with some MathML content using Voiceover on Mac",0
"VoiceOver skips over text in paragraphs which contain <strong> or <em> tags",2
```
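For illustration, rows in this format can be parsed with Python's stdlib `csv` module and the numeric labels mapped back to the priority names from the table above (the inline string stands in for the actual dataset file, whose name isn't given here):

```python
import csv
import io

PRIORITY = {0: "Low", 1: "Medium", 2: "High", 3: "Critical"}

# Inline copy of the sample entries; in practice this would be the dataset CSV.
sample = '''Text,Label
"mac voiceover screen reader",3
"Firefox crashes when interacting with some MathML content using Voiceover on Mac",0
"VoiceOver skips over text in paragraphs which contain <strong> or <em> tags",2
'''

rows = list(csv.DictReader(io.StringIO(sample)))
for row in rows:
    label = int(row["Label"])
    print(f'{PRIORITY[label]:>8}: {row["Text"][:50]}')
```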
## 📊 Model Comparison

We fine-tuned and evaluated three transformer models under identical training conditions using PyTorch Lightning (multi-GPU, mixed precision, and weighted loss). The validation accuracy and F1 scores are as follows:

| Model | Base Architecture | Validation Accuracy | Weighted F1 Score |
|-----------------|----------------------------|---------------------|-------------------|
| DeBERTa-v3 Base | microsoft/deberta-v3-base | **69%** | **0.69** |
| ALBERT Base | albert-base-v2 | 68% | 0.68 |
| RoBERTa Base | roberta-base | 66% | 0.67 |
### 📝 Observations

- **DeBERTa** delivered the best performance, likely due to its *disentangled attention* and *enhanced positional encoding*.
- **ALBERT** performed surprisingly well despite having far fewer parameters, showcasing its efficiency.
- **RoBERTa** produced stable, reliable results but slightly underperformed the other two.
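The weighted F1 metric reported above averages per-class F1 scores, weighting each class by its support (the number of true examples of that class). A minimal pure-Python sketch on a toy label set:

```python
def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with class support as weights."""
    classes = sorted(set(y_true))
    total, score = len(y_true), 0.0
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        support = sum(1 for t in y_true if t == c)
        score += support / total * f1
    return score

# Toy example with three of the four priority classes
print(round(weighted_f1([0, 0, 1, 1, 1, 2], [0, 1, 1, 1, 2, 2]), 4))  # → 0.6667
```

In practice the same number comes from `sklearn.metrics.f1_score(..., average="weighted")`; the function above just makes the weighting explicit.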
# RoBERTa Base Model for Accessibility Priority Classification

This model fine-tunes `roberta-base` on a 4-class custom dataset to classify accessibility issues by priority. It was trained with PyTorch Lightning and optimized with mixed precision on multiple GPUs.

## Details

- **Model**: roberta-base
- **Framework**: PyTorch Lightning
- **Labels**: 0 (Low), 1 (Medium), 2 (High), 3 (Critical)
- **Validation F1**: 0.71 (weighted)
## Usage

```python
from transformers import RobertaTokenizer, RobertaForSequenceClassification
import torch

# Map model output indices back to priority names
LABELS = {0: "Low", 1: "Medium", 2: "High", 3: "Critical"}

model = RobertaForSequenceClassification.from_pretrained("your-username/roberta-priority-multiclass")
tokenizer = RobertaTokenizer.from_pretrained("your-username/roberta-priority-multiclass")
model.eval()  # disable dropout for inference

inputs = tokenizer("VoiceOver skips over text with <strong> tags", return_tensors="pt")
with torch.no_grad():  # no gradients needed at inference time
    outputs = model(**inputs)
prediction = torch.argmax(outputs.logits, dim=1).item()

print("Predicted Priority:", prediction, f"({LABELS[prediction]})")
```