shivamjadhav committed on
Commit a896d2d
·
1 Parent(s): 0fe35b7

created Bug Priority model and hugging face deployment read project

Files changed (1): README.md +125 −0
README.md CHANGED
@@ -9,3 +9,128 @@ short_description: This is a Multiclass Bug Priority Model
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
tags:
- text-classification
- accessibility
- bug-triage
- transformers
- roberta
- pytorch-lightning
license: apache-2.0
datasets:
- custom
language:
- en
# RoBERTa Base Model for Accessibility Bug Priority Classification

This model fine-tunes `roberta-base` on a labeled dataset of accessibility-related bug descriptions to automatically classify their **priority level**. It helps automate the triage of bugs affecting users of screen readers and other assistive technologies.
## 🧠 Problem Statement

Modern applications often suffer from accessibility issues that impact users with disabilities, such as content not being read properly by screen readers like **VoiceOver**, **NVDA**, or **JAWS**. These bugs are typically reported via issue trackers or user forums as short text summaries.

Because these reports are unstructured and domain-specific, manual triage is:
- Time-consuming
- Inconsistent
- Prone to delayed resolution

There is a critical need to **prioritize accessibility bugs quickly and accurately** to ensure inclusive user experiences.
## 🎯 Research Objective

This research project builds a machine learning model that can **automatically assign a priority level** to an accessibility bug report. The goals are to:

- Streamline accessibility QA workflows
- Accelerate high-impact fixes
- Empower developers and testers with ML-assisted tooling
## 📊 Dataset Statistics

The dataset used for training consists of real-world accessibility bug reports, each labeled with one of four priority levels. The label distribution is imbalanced, and label-aware preprocessing steps were taken to improve model performance.

| Label | Priority Level | Count |
|-------|----------------|-------|
| 1 | Medium | 2035 |
| 2 | High | 1465 |
| 0 | Low | 804 |
| 3 | Critical | 756 |

**Total Samples**: 5,060
### 🧹 Preprocessing

- Text normalization and cleanup
- Length filtering based on token count
- Label frequency normalization for class-weighted loss

To address class imbalance, class weights were computed as inverse label frequency and used in the cross-entropy loss during training.
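The inverse-frequency weighting described above can be sketched as follows. The exact normalization used in training isn't stated, so the `total / (num_classes * count)` form below is an assumption (a common choice); the counts come from the dataset statistics table.

```python
# Label counts from the dataset statistics table
counts = {0: 804, 1: 2035, 2: 1465, 3: 756}  # Low, Medium, High, Critical
total = sum(counts.values())  # 5060
num_classes = len(counts)

# Inverse-frequency class weights: rarer labels get larger weights.
# The total / (num_classes * count) normalization is an assumption;
# plain 1 / count behaves the same up to a constant factor.
weights = [total / (num_classes * counts[c]) for c in sorted(counts)]

for c, w in zip(sorted(counts), weights):
    print(f"class {c}: weight {w:.3f}")

# These weights would then be passed to the loss, e.g. with PyTorch:
# loss_fn = torch.nn.CrossEntropyLoss(weight=torch.tensor(weights))
```

Note that class 3 (Critical, the rarest label) receives the largest weight, so mistakes on it are penalized most heavily.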
## 🧪 Dataset Description

The dataset consists of short bug report texts labeled with one of four priority levels:

| Label | Meaning |
|-------|----------|
| 0 | Low |
| 1 | Medium |
| 2 | High |
| 3 | Critical |
### ✏️ Sample Entries

```csv
Text,Label
"mac voiceover screen reader",3
"Firefox crashes when interacting with some MathML content using Voiceover on Mac",0
"VoiceOver skips over text in paragraphs which contain <strong> or <em> tags",2
```
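For illustration, rows in this format can be parsed with Python's stdlib `csv` module and the numeric labels mapped back to the priority names from the table above (the inline string stands in for the actual dataset file, whose name isn't given here):

```python
import csv
import io

PRIORITY = {0: "Low", 1: "Medium", 2: "High", 3: "Critical"}

# Inline copy of the sample entries; in practice this would be the dataset CSV.
sample = '''Text,Label
"mac voiceover screen reader",3
"Firefox crashes when interacting with some MathML content using Voiceover on Mac",0
"VoiceOver skips over text in paragraphs which contain <strong> or <em> tags",2
'''

rows = list(csv.DictReader(io.StringIO(sample)))
for row in rows:
    label = int(row["Label"])
    print(f'{PRIORITY[label]:>8}: {row["Text"][:50]}')
```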
## 📊 Model Comparison

We fine-tuned and evaluated three transformer models under identical training conditions using PyTorch Lightning (multi-GPU, mixed precision, and weighted loss). The validation accuracy and F1 scores are as follows:

| Model | Base Architecture | Validation Accuracy | Weighted F1 Score |
|-----------------|----------------------------|---------------------|-------------------|
| DeBERTa-v3 Base | microsoft/deberta-v3-base | **69%** | **0.69** |
| ALBERT Base | albert-base-v2 | 68% | 0.68 |
| RoBERTa Base | roberta-base | 66% | 0.67 |
### 📝 Observations

- **DeBERTa** delivered the best performance, likely due to its *disentangled attention* and *enhanced positional encoding*.
- **ALBERT** performed surprisingly well despite having far fewer parameters, showcasing its efficiency.
- **RoBERTa** produced stable, reliable results but slightly underperformed the other two.
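The weighted F1 metric reported above averages per-class F1 scores, weighting each class by its support (the number of true examples of that class). A minimal pure-Python sketch on a toy label set:

```python
def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with class support as weights."""
    classes = sorted(set(y_true))
    total, score = len(y_true), 0.0
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        support = sum(1 for t in y_true if t == c)
        score += support / total * f1
    return score

# Toy example with three of the four priority classes
print(round(weighted_f1([0, 0, 1, 1, 1, 2], [0, 1, 1, 1, 2, 2]), 4))  # → 0.6667
```

In practice the same number comes from `sklearn.metrics.f1_score(..., average="weighted")`; the function above just makes the weighting explicit.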
# RoBERTa Base Model for Accessibility Priority Classification

This model fine-tunes `roberta-base` on a 4-class custom dataset to classify accessibility issues by priority. It was trained with PyTorch Lightning and optimized with mixed precision on multiple GPUs.

## Details

- **Model**: roberta-base
- **Framework**: PyTorch Lightning
- **Labels**: 0 (Low), 1 (Medium), 2 (High), 3 (Critical)
- **Validation F1**: 0.71 (weighted)
## Usage

```python
from transformers import RobertaTokenizer, RobertaForSequenceClassification
import torch

# Map model output indices back to priority names
LABELS = {0: "Low", 1: "Medium", 2: "High", 3: "Critical"}

model = RobertaForSequenceClassification.from_pretrained("your-username/roberta-priority-multiclass")
tokenizer = RobertaTokenizer.from_pretrained("your-username/roberta-priority-multiclass")
model.eval()  # disable dropout for inference

inputs = tokenizer("VoiceOver skips over text with <strong> tags", return_tensors="pt")
with torch.no_grad():  # no gradients needed at inference time
    outputs = model(**inputs)
prediction = torch.argmax(outputs.logits, dim=1).item()

print("Predicted Priority:", prediction, f"({LABELS[prediction]})")
```