Commit 4d5cbc9
Parent(s): b5b0c27
better documentation

app.py CHANGED
@@ -43,7 +43,7 @@ if __name__ == "__main__":
     grammar_processor = GrammarConstrainedLogitsProcessor(grammar)
 
     outputs = model.generate(
-        **inputs, max_new_tokens=
+        **inputs, max_new_tokens=30, repetition_penalty=1, return_dict_in_generate=True, output_scores=True, logits_processor=[grammar_processor]
     )
     # Important: don't forget to set `normalize_logits=True` to obtain normalized probabilities (i.e. sum(p) = 1)
     transition_scores = model.compute_transition_scores(outputs.sequences, outputs.scores, normalize_logits=True)
@@ -72,11 +72,15 @@ if __name__ == "__main__":
     gr.Markdown(
         """
         # 👻 Transformers-CFG JSON Demo
-        This is a demo of how you can constrain the output of a GPT-2 model to be a valid JSON string.
+        This is a demo of how you can constrain the output of a GPT-2 model to be a valid JSON string (up to truncation).
         Here we use a simple JSON grammar to constrain the output of the model.
         The grammar is defined in `json_minimal.ebnf` and is written in the Extended Backus-Naur Form (EBNF).
+
         Internally, it relies on the library [`transformers-cfg`](https://github.com/epfl-dlab/transformers-CFG).
-        For demo purpose, gpt2 is used, but you can use much larger models for better performance.
+        For demo purposes, gpt2-large is used, but you can use much larger models for better performance.
+
+        Inference is a bit slow because it runs on **CPU** (~20s for 30 tokens).
+        The constraint itself **doesn't** introduce significant overhead to the inference.
         """
     )
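The first hunk wires the grammar processor into `model.generate` and then scores each sampled token. Below is a minimal, self-contained sketch of that flow, following the transformers-cfg README; the model id, prompt, and setup lines are assumptions, since app.py's surrounding code is not part of this diff.

```python
# A minimal sketch of the constrained-generation flow shown in the diff,
# following the transformers-cfg README. The model id, prompt, and grammar
# path are assumptions: app.py's setup lines are not part of this commit.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers_cfg.grammar_utils import IncrementalGrammarConstraint
from transformers_cfg.generation.logits_process import GrammarConstrainedLogitsProcessor

model_id = "gpt2-large"  # the demo text says gpt2-large is used
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Build a logits processor that, at every decoding step, masks out tokens
# that would take the output outside the JSON grammar.
with open("json_minimal.ebnf") as f:
    grammar_str = f.read()
grammar = IncrementalGrammarConstraint(grammar_str, "root", tokenizer)
grammar_processor = GrammarConstrainedLogitsProcessor(grammar)

inputs = tokenizer(["Generate a JSON object:"], return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=30,
    repetition_penalty=1.0,  # 1.0 = no penalty, matching the diff
    return_dict_in_generate=True,
    output_scores=True,
    logits_processor=[grammar_processor],
)

# With normalize_logits=True the scores are log-probabilities, so
# exponentiating yields per-token probabilities that sum to 1.
transition_scores = model.compute_transition_scores(
    outputs.sequences, outputs.scores, normalize_logits=True
)
token_probs = torch.exp(transition_scores)  # shape: (batch, new_tokens)
```

Exponentiating the normalized transition scores gives the per-token probabilities that a demo like this one can visualize alongside the generated JSON.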
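The commit also references `json_minimal.ebnf` without showing its contents. For illustration only, a minimal JSON grammar in the GBNF-style EBNF dialect that transformers-cfg parses could look like the sketch below, held in a Python string as a stand-in for the file; the real `json_minimal.ebnf` may differ.

```python
# Hypothetical sketch of a minimal JSON grammar in the EBNF dialect that
# transformers-cfg parses. The actual json_minimal.ebnf is not included in
# this commit and may differ; app.py reads the grammar from disk instead.
JSON_MINIMAL_EBNF = r"""
root   ::= object
object ::= "{" ws "}" | "{" ws pair (ws "," ws pair)* ws "}"
pair   ::= string ws ":" ws value
value  ::= object | array | string | number | "true" | "false" | "null"
array  ::= "[" ws "]" | "[" ws value (ws "," ws value)* ws "]"
string ::= "\"" [a-zA-Z0-9 ]* "\""
number ::= [0-9]+
ws     ::= [ \t\n]*
"""
```

A string like this is what `IncrementalGrammarConstraint` consumes, with `"root"` as the start symbol; every decoding step is then restricted to tokens that keep the partial output derivable from that grammar.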