---
library_name: transformers
license: apache-2.0
datasets:
- grascii/gregg-preanniversary-words
pipeline_tag: image-to-text
---

# Gregg Vision v0.2.1

Gregg Vision v0.2.1 generates a [Grascii](https://github.com/grascii/grascii) representation of a Gregg Shorthand form.

- **Model type:** Vision Encoder Text Decoder
- **License:** Apache 2.0
- **Repository:** [More Information Needed]
- **Demo:** [Grascii Search Space](https://huggingface.co/spaces/grascii/search)

## Uses

Given a grayscale image of a single shorthand form, Gregg Vision generates its
Grascii representation. When combined with [Grascii Search](https://github.com/grascii/grascii),
the generated Grascii string can be used to look up possible English interpretations
of the shorthand form, as sketched below.
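For example, a predicted Grascii string can be handed to Grascii Search to retrieve matching dictionary entries. The snippet below is a rough sketch: the `GrasciiSearcher` name and its `search(grascii=...)` signature are assumptions about the `grascii` package's Python API, so consult the Grascii documentation for the actual interface.

```python
# Assumed API, not documented usage; see the Grascii docs for the
# real search interface (a CLI is also available).
from grascii.searchers import GrasciiSearcher

searcher = GrasciiSearcher()

# Look up dictionary entries matching a predicted Grascii string.
for result in searcher.search(grascii="AB"):
    print(result)
```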

## How to Get Started with the Model

Use the code below to get started with the model.
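The exact inference code depends on how the checkpoint is published. The snippet below is a minimal sketch, assuming the model follows the standard `VisionEncoderDecoderModel` layout; the Hub id `grascii/gregg_vision_v0.2.1` is a placeholder for the actual repository.

```python
from PIL import Image
from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

# Placeholder Hub id; substitute the actual model repository.
checkpoint = "grascii/gregg_vision_v0.2.1"

model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
processor = ViTImageProcessor.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# The encoder expects a grayscale image of a single shorthand form.
image = Image.open("shorthand_form.png").convert("L")

pixel_values = processor(image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
grascii = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(grascii)
```

Since the model is tagged `image-to-text`, the higher-level `pipeline("image-to-text", model=checkpoint)` API should work as well.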

## Technical Details

### Model Architecture and Objective

Gregg Vision v0.2.1 is a transformer model with a ViT encoder and a RoBERTa decoder.

For training, the model was warm-started using
[vit-small-patch16-224-single-channel](https://huggingface.co/grascii/vit-small-patch16-224-single-channel)
for the encoder and a randomly initialized RoBERTa network for the decoder.
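A warm start of this kind can be reproduced with the `transformers` API roughly as follows; the decoder configuration shown is an illustrative assumption, not the released one.

```python
from transformers import (
    RobertaConfig,
    RobertaForCausalLM,
    ViTModel,
    VisionEncoderDecoderModel,
)

# Warm-start the encoder from the pretrained single-channel ViT.
encoder = ViTModel.from_pretrained("grascii/vit-small-patch16-224-single-channel")

# Randomly initialize a RoBERTa decoder with cross-attention enabled so it
# can attend to the encoder's image features. These hyperparameters are
# illustrative, not the released configuration.
decoder_config = RobertaConfig(is_decoder=True, add_cross_attention=True)
decoder = RobertaForCausalLM(decoder_config)

# Combine the two networks into a single encoder-decoder model.
model = VisionEncoderDecoderModel(encoder=encoder, decoder=decoder)
```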

### Training Data

Gregg Vision v0.2.1 was trained on the [gregg-preanniversary-words](https://huggingface.co/datasets/grascii/gregg-preanniversary-words) dataset.
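The dataset is hosted on the Hugging Face Hub and can be loaded with the `datasets` library:

```python
from datasets import load_dataset

# Pairs of shorthand form images and their Grascii representations.
dataset = load_dataset("grascii/gregg-preanniversary-words")
print(dataset)
```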

### Training Hardware

Gregg Vision v0.2.1 was trained on a single NVIDIA T4 GPU.