garak-llm/attackgeneration-toxicity_gpt2
Safetensors
Anthropic/hh-rlhf
google/jigsaw_unintended_bias
English
gpt2
Not-For-All-Audiences
arxiv: 2204.05862
License: apache-2.0
leondz committed on Aug 27, 2024
Commit 31fcab3 (verified) · 1 parent: a10ec10
Create README.md
Files changed (1)
README.md (added, +7 -0)
@@ -0,0 +1,7 @@
+---
+language:
+- en
+base_model: openai-community/gpt2
+---
+
+See https://interhumanagreement.substack.com/p/faketoxicityprompts-automatic-red