Update README.md

README.md (changed):
```diff
@@ -69,6 +69,8 @@ widget:
 
 # flan-t5-small-instructiongen
 
+Instead of generating questions from text, generate instructions for LLMs!
+
 This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on the None dataset.
 It achieves the following results on the evaluation set:
 - Loss: 1.3401
@@ -78,17 +80,17 @@ It achieves the following results on the evaluation set:
 - Rougelsum: 50.338
 - Gen Len: 14.0450
 
-## Model description
-
-More information needed
-
 ## Intended uses & limitations
 
-More information needed
+This is just a **small** model/example. Larger models are likely to perform even better (e.g., [pszemraj/bart-base-instructiongen](https://huggingface.co/pszemraj/bart-base-instructiongen) generalizes better).
+
+Additionally, this was trained on a dataset of **only** instructions+outputs, with the `inputs` filtered out. This means that the text *1) cookies and cream 2) chocolate chip 3) mint chip 4) oreo* will **not** get you *"Rank the following ice cream flavors: oreo, mint chip, chocolate chip, cookies and cream"*.
 
 ## Training and evaluation data
 
-More information needed
+See the linked dataset `pszemraj/fleece2instructions`; it is a filtered/formatted version of `tatsu-lab/alpaca` for generating instructions from arbitrary text.
+
+- Some of the API examples are intentionally weird to demonstrate the generalizability of the model.
 
 ## Training procedure
```
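The intended use described in the card (feed the model raw text, get back an instruction that could have produced it) can be sketched with the `transformers` pipeline API. Note the repo id `pszemraj/flan-t5-small-instructiongen` is an assumption inferred from the card title and the author's other linked models; adjust it to the actual checkpoint.

```python
# Sketch of inference with this instruction-generation model.
# NOTE: the repo id below is an assumption inferred from the card title;
# substitute the real checkpoint id if it differs.
from transformers import pipeline

MODEL_ID = "pszemraj/flan-t5-small-instructiongen"  # assumed repo id


def generate_instruction(text: str, model_id: str = MODEL_ID) -> str:
    """Return a plausible instruction that could have produced `text`."""
    generator = pipeline("text2text-generation", model=model_id)
    out = generator(text, max_length=64, num_beams=4)
    return out[0]["generated_text"]


# Usage (downloads the checkpoint on first call):
# generate_instruction("1. Eat a balanced diet. 2. Exercise daily. 3. Sleep 8 hours.")
```

Beam search (`num_beams=4`) is a common default for short text2text outputs; greedy decoding also works for a quick check.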
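The preprocessing described under "Training and evaluation data" (keep only instruction+output pairs, dropping records that rely on an extra `input` field) can be sketched on Alpaca-style records. Field names follow `tatsu-lab/alpaca`; the actual `fleece2instructions` pipeline may differ in its details.

```python
# Minimal sketch of the described filtering: keep only Alpaca-style records
# whose optional "input" field is empty, then pair each record's "output"
# text (seq2seq source) with its "instruction" (seq2seq target).
# Field names follow tatsu-lab/alpaca; the real preprocessing may differ.

def to_instructiongen_pairs(records):
    pairs = []
    for rec in records:
        if rec.get("input", "").strip():
            continue  # drop examples that depend on an extra input context
        pairs.append({"text": rec["output"], "target": rec["instruction"]})
    return pairs


records = [
    {"instruction": "Give three tips for staying healthy.",
     "input": "",
     "output": "1. Eat a balanced diet. 2. Exercise regularly. 3. Sleep well."},
    {"instruction": "Rank the following ice cream flavors.",
     "input": "1) cookies and cream 2) chocolate chip 3) mint chip 4) oreo",
     "output": "oreo, mint chip, chocolate chip, cookies and cream"},
]

pairs = to_instructiongen_pairs(records)
# Only the first record survives; the second relies on an `input` and is
# dropped, which is exactly why the ice-cream ranking example in the card
# is out of scope for this model.
```

This also makes the card's limitation concrete: ranking-style prompts whose content lives in `input` never appear in training, so the model cannot reconstruct them.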