Commit · a57714d
1 Parent(s): 470f102
UX and formatted text fix
app.py
CHANGED
@@ -35,7 +35,19 @@ label_mapping = {
     39: 'text-davinci-002', 40: 'text-davinci-003'
 }
 
+def clean_text(text):
+
+    text = text.replace("\r\n", "\n").replace("\r", "\n")
+
+    text = re.sub(r"\n\s*\n+", "\n\n", text)
+
+    text = re.sub(r"[ \t]+", " ", text)
+
+    text = text.strip()
+    return text
+
 def classify_text(text):
+    cleaned_text = clean_text(text)
     if not text.strip():
         result_message = (
             f"---- \n"
@@ -72,7 +84,7 @@ def classify_text(text):
     else:
         result_message = (
             f"**The text is** <span class='highlight-ai'>**{ai_total_prob:.2f}%** likely <b>AI generated</b>.</span>\n\n"
-            f"**Identified AI Model
+            f"**Identified AI Model: {ai_argmax_model}**"
         )
 
     return result_message
@@ -92,7 +104,7 @@ This tool uses the <b>ModernBERT</b> model to identify whether a given text was
 <div style="line-height: 1.8;">
 <b>Human Verification:</b> Human-written content is clearly marked.<br>
 <b>Model Detection:</b> Can identify content from over 40 AI models.<br>
-<b>Accuracy:</b> Works best with longer texts
+<b>Accuracy:</b> Works best with longer texts.<br>
 <b>Read more:</b> Our method is detailed in our paper:
 <a href="https://aclanthology.org/2025.genaidetect-1.15/" target="_blank" style="color: #007bff; text-decoration: none;"><b>LINK</b></a>.
 </div>
@@ -110,8 +122,8 @@ AI_texts = [
 ]
 
 Human_texts = [
-    "The present book is intended as a text in basic mathematics. As such, it can have multiple use: for a one-year course in the high schools during the third or fourth year (if possible the third, so that calculus can be taken during the fourth year); for a complementary reference in earlier high school grades (elementary algebra and geometry are covered); for a one-semester course at the college level, to review or to get a firm foundation in the basic mathematics necessary to go ahead in calculus, linear algebra, or other topics. Years ago, the colleges used to give courses in “college algebra” and other subjects which should have been covered in high school. More recently, such courses have been thought unnecessary, but some experiences I have had show that they are just as necessary as ever. What is happening is that thecolleges are getting a wide variety of students from high schools, ranging from exceedingly well-prepared ones who have had a good first course in calculus, down to very poorly prepared ones.
-    "Fats are rich in energy, build body cells, support brain development of infants, help body processes, and facilitate the absorption and use of fat-soluble vitamins A, D, E, and K. The major component of lipids is glycerol and fatty acids. According to chemical properties, fatty acids can be divided into saturated and unsaturated fatty acids. Generally lipids containing saturated fatty acids are solid at room temperature and include animal fats (butter, lard, tallow, ghee) and tropical oils (palm,coconut, palm kernel). Saturated fats increase the risk of heart disease."
+    "The present book is intended as a text in basic mathematics. As such, it can have multiple use: for a one-year course in the high schools during the third or fourth year (if possible the third, so that calculus can be taken during the fourth year); for a complementary reference in earlier high school grades (elementary algebra and geometry are covered); for a one-semester course at the college level, to review or to get a firm foundation in the basic mathematics necessary to go ahead in calculus, linear algebra, or other topics. Years ago, the colleges used to give courses in “college algebra” and other subjects which should have been covered in high school. More recently, such courses have been thought unnecessary, but some experiences I have had show that they are just as necessary as ever. What is happening is that thecolleges are getting a wide variety of students from high schools, ranging from exceedingly well-prepared ones who have had a good first course in calculus, down to very poorly prepared ones.",
+    "Fats are rich in energy, build body cells, support brain development of infants, help body processes, and facilitate the absorption and use of fat-soluble vitamins A, D, E, and K. The major component of lipids is glycerol and fatty acids. According to chemical properties, fatty acids can be divided into saturated and unsaturated fatty acids. Generally lipids containing saturated fatty acids are solid at room temperature and include animal fats (butter, lard, tallow, ghee) and tropical oils (palm,coconut, palm kernel). Saturated fats increase the risk of heart disease.",
     "BERT, which stands for Bidirectional Encoder Representations from Transformers, is a deep learning model introduced by Google in 2018 to help machines understand the complex nuances of human language. Thanks to its Transformer-based architecture, it can grasp the deeper meaning and context of words in the text. This makes BERT especially effective at tasks like text classification, translation, question answering, and language inference."]
 
 iface = gr.Blocks(css="""
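The `clean_text` helper added in this commit can be tried in isolation. The sketch below copies its three normalization steps and adds an `import re`, which the diff does not show and is assumed to exist elsewhere in `app.py`. The type hints, docstring, and `sample` string are illustrative additions, not part of the commit.

```python
import re


def clean_text(text: str) -> str:
    """Normalize line endings and whitespace, following the steps in the diff."""
    # Convert Windows (\r\n) and bare carriage-return (\r) line endings to \n.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Collapse runs of blank or whitespace-only lines into a single blank line.
    text = re.sub(r"\n\s*\n+", "\n\n", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Drop leading and trailing whitespace.
    return text.strip()


# Hypothetical input mixing CRLF line endings, extra blank lines, and tabs.
sample = "First  paragraph.\r\n\r\n\r\nSecond\tparagraph.  \r\n"
print(repr(clean_text(sample)))  # 'First paragraph.\n\nSecond paragraph.'
```

In the hunk shown, `classify_text` assigns `cleaned_text` but the emptiness check still reads `text.strip()`. Whether `cleaned_text` is used further down the function is not visible in this diff.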