How do I deploy the image embedding model on AWS using HuggingFace?

#6
by tararelan - opened

I want to deploy this model so I can get image embeddings, but I don't want to use AWS JumpStart or anything. I tried deploying it without specifying a task, but I received the following error:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400)
from primary with message "{
"code": 400,
"type": "InternalServerException",
"message": "Task couldn\u0027t be inferenced from NomicVisionModel.Inference Toolkit can only inference tasks
from architectures ending with [\u0027TapasForQuestionAnswering\u0027, \u0027ForQuestionAnswering\u0027,
\u0027ForTokenClassification\u0027, \u0027ForSequenceClassification\u0027, \u0027ForMultipleChoice\u0027,
\u0027ForMaskedLM\u0027, \u0027ForCausalLM\u0027, \u0027ForConditionalGeneration\u0027, \u0027MTModel\u0027,
\u0027EncoderDecoderModel\u0027, \u0027GPT2LMHeadModel\u0027, \u0027T5WithLMHeadModel\u0027].Use env HF_TASK to
define your task."
}
"

This is my code:
import sagemaker
import boto3

sess = sagemaker.Session()

# sagemaker session bucket -> used for uploading data, models and logs
# sagemaker will automatically create this bucket if it does not exist
sagemaker_session_bucket = None
if sagemaker_session_bucket is None and sess is not None:
    # set to default bucket if a bucket name is not given
    sagemaker_session_bucket = sess.default_bucket()

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

sess = sagemaker.Session(default_bucket=sagemaker_session_bucket)

print(f"sagemaker role arn: {role}")
print(f"sagemaker session region: {sess.boto_region_name}")

import json
from sagemaker.huggingface import HuggingFaceModel

# sagemaker config
instance_type = "ml.g5.xlarge"

# Define Model and Endpoint configuration parameters
config = {
    'HF_MODEL_ID': "nomic-ai/nomic-embed-vision-v1.5",  # model_id from hf.co/models
    'HF_TASK': "image-feature-extraction"
}

# create HuggingFaceModel with the image uri
emb_model = HuggingFaceModel(
    role=role,
    env=config,
    transformers_version='4.37.0',
    pytorch_version='2.1.0',
    py_version='py310',
)

# Deploy model to an endpoint
# https://sagemaker.readthedocs.io/en/stable/api/inference/model.html#sagemaker.model.Model.deploy
emb = emb_model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
)

# NOTE: a bare local filename won't resolve on the endpoint side; the
# deployed pipeline needs a publicly reachable URL (or raw image bytes)
data = {
    "inputs": [
        "20201229_141048.jpg",
        # "https://dummyimage.com/333/000/fff.jpg&text=embed+this"
    ]
}

res = emb.predict(data=data)

# print some results
print(f"length of embeddings: {len(res[0])}")
print(f"first 10 elements of embeddings: {res[0][:10]}")

emb.delete_model()
emb.delete_endpoint()

I just adapted the sample code for text embeddings. Any help would be appreciated!

We couldn't figure it out either, which is why we made the SageMaker offering with our own inference stack :)

Do share the solution in this thread if you find it, though - developers should have options!
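For anyone who lands on this thread later, one direction that might work (an untested sketch, not a confirmed solution): skip the toolkit's built-in pipelines entirely and ship a custom inference script. The loading and pooling below follow the usage example on the nomic-embed-vision-v1.5 model card (AutoModel with trust_remote_code=True, then L2-normalising the CLS token); the payload format and the runtime Hub download are assumptions.

# code/inference.py -- hedged sketch of a custom SageMaker handler
import io

import requests
import torch
import torch.nn.functional as F
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

MODEL_ID = "nomic-ai/nomic-embed-vision-v1.5"

def model_fn(model_dir):
    # trust_remote_code is needed because NomicVisionModel is a custom
    # architecture; this pulls the weights from the Hub at container start
    processor = AutoImageProcessor.from_pretrained(MODEL_ID)
    model = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True)
    model.eval()
    return model, processor

def predict_fn(data, model_and_processor):
    model, processor = model_and_processor
    # assumes "inputs" is a list of publicly reachable image URLs
    images = [
        Image.open(io.BytesIO(requests.get(url, timeout=10).content)).convert("RGB")
        for url in data["inputs"]
    ]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        img_emb = model(**inputs).last_hidden_state
    # CLS token + L2 normalisation, as in the model card
    embeddings = F.normalize(img_emb[:, 0], p=2, dim=1)
    return {"embeddings": embeddings.tolist()}

If I'm reading the Hugging Face SageMaker docs right, the script then goes into a model.tar.gz under a code/ directory, which is uploaded to S3 and passed as model_data instead of relying on HF_TASK (the S3 path below is hypothetical, and since model_fn pulls weights from the Hub, the archive only needs code/inference.py):

# hedged sketch: deploy with the custom handler bundled in the model archive
emb_model = HuggingFaceModel(
    model_data="s3://<your-bucket>/nomic-embed-vision/model.tar.gz",  # hypothetical path
    role=role,
    transformers_version='4.37.0',
    pytorch_version='2.1.0',
    py_version='py310',
)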

Thank you for your reply, for now I'm going to try another model. One day I'll come back to this and try again!

tararelan changed discussion status to closed
