baseten/example-Meta-Llama-3-70B-InstructForSequenceClassification


michaelfeil committed on Feb 2

Commit 6b53f65 · verified · 1 Parent(s): bdf6773

Update README.md

Files changed (1)

README.md +35 -0

@@ -1,6 +1,41 @@
 ---
 {}
 ---
+Deployment:
+```
+build_commands: []
+external_package_dirs: []
+model_metadata: {}
+model_name: fp8-baseten/example-Meta-Llama-3-70B-InstructForSequenceClassification
+python_version: py39
+requirements: []
+resources:
+  accelerator: H100:1
+  cpu: "1"
+  memory: 64Gi
+  use_gpu: true
+secrets:
+  hf_access_token: set token in baseten workspace
+system_packages: []
+trt_llm:
+  build:
+    base_model: encoder
+    # automatically inferred from config[max_position_embeddings]
+    max_seq_len: 42
+    # max_batch_size per dynamic batch, recommended to stay at 32
+    max_batch_size: 32
+    # max num tokens per dynamic batch, strongly recommended to keep this number unchanged
+    max_num_tokens: 16384
+    checkpoint_repository:
+      source: HF
+      repo: "baseten/example-Meta-Llama-3-70B-InstructForSequenceClassification"
+      revision: "main" # hf revision hash
+    quantization_type: fp8
+    num_builder_gpus: 4
+```
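The three build limits in the config above interact, and a small sketch can make the arithmetic concrete. The helper names below (`infer_max_seq_len`, `summarize_batch_budget`) are hypothetical and not part of Baseten or TensorRT-LLM. The sketch assumes that `max_batch_size` and `max_num_tokens` both cap a single dynamic batch, and it uses 8192 as an illustrative `max_position_embeddings` value; check the checkpoint's `config.json` for the real one.

```python
# Hypothetical helpers for reasoning about the build limits; not a Baseten or TensorRT-LLM API.

def infer_max_seq_len(hf_config: dict) -> int:
    """Mirror the config comment: max_seq_len comes from max_position_embeddings."""
    return int(hf_config["max_position_embeddings"])


def summarize_batch_budget(max_batch_size: int, max_num_tokens: int, max_seq_len: int) -> dict:
    """Relate the per-batch sequence cap, the per-batch token cap, and the sequence length cap."""
    if min(max_batch_size, max_num_tokens, max_seq_len) <= 0:
        raise ValueError("all limits must be positive")
    return {
        # Average tokens available per sequence when the batch holds max_batch_size sequences.
        "avg_tokens_per_seq_at_full_batch": max_num_tokens // max_batch_size,
        # Number of maximum-length sequences that fit under both caps at once.
        "max_len_seqs_per_batch": min(max_batch_size, max_num_tokens // max_seq_len),
    }


# Assumed value for illustration only; read it from the checkpoint's config.json.
hf_config = {"max_position_embeddings": 8192}

summary = summarize_batch_budget(
    max_batch_size=32,      # from the config above
    max_num_tokens=16384,   # from the config above
    max_seq_len=infer_max_seq_len(hf_config),
)
print(summary)
```

Under these assumptions, a full batch of 32 sequences averages 512 tokens each, while only two sequences at the assumed maximum length fit in one batch.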
+Reproduce this model:
 ```python
 #!/usr/bin/env python
 import torch