Serverless API

#1
by alexanderniebuhr - opened

I'm trying to understand how to deploy this as a serverless API, so that I can submit an image as input and receive its embeddings as output. Could you help with that?

One option is to wrap the model in a function that you deploy via, say, FastAPI or some serverless setup. The snippet below just creates embeddings for the specified page of a PDF and returns them as a JSON string:

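    import json
    from pdf2image import convert_from_bytes
    from PIL import Image

    def embed_page(pdf_data: bytes, page_number: int) -> dict:
        # Render only the requested page (1-based) of the PDF to an image.
        images: list[Image.Image] = convert_from_bytes(pdf_data, dpi=300, first_page=page_number, last_page=page_number)
        image = images[0]

        # colqwen2_model/colqwen2_processor are loaded elsewhere; generate_embeddings, binary_quantize and to_text() are helpers not shown here.
        embeddings = generate_embeddings(colqwen2_model, colqwen2_processor.process_images([image]))[0]
        page_embeddings = [binary_quantize(e) for e in embeddings]
        embeddings_list = [e.to_text() for e in page_embeddings]
        return {"embeddings": json.dumps(embeddings_list)}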
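
In case it helps, here is a minimal, framework-agnostic sketch of the request handling and quantization steps, using only the standard library. Everything in it is an assumption for illustration: `handle_request`, the `{"image": "<base64>"}` request shape, and these versions of `binary_quantize` and `to_text` are hypothetical stand-ins, not the helpers used above and not part of any library. The model call is passed in as `embed`, so the same function could be called from a FastAPI route or a serverless entry point.

```python
import base64
import json
from typing import Callable, Sequence


def binary_quantize(vector: Sequence[float]) -> bytes:
    """Hypothetical helper: keep only the sign of each dimension, 8 per byte."""
    packed = bytearray((len(vector) + 7) // 8)
    for i, value in enumerate(vector):
        if value > 0:
            packed[i // 8] |= 1 << (7 - i % 8)  # most significant bit first
    return bytes(packed)


def to_text(packed: bytes) -> str:
    """Hypothetical helper: hex-encode the packed bits so they fit in JSON."""
    return packed.hex()


def handle_request(
    body: str,
    embed: Callable[[bytes], Sequence[Sequence[float]]],
) -> str:
    """JSON in, JSON out.

    Assumes `body` looks like {"image": "<base64-encoded image bytes>"} and
    that `embed` wraps the model, returning one float vector per image patch.
    """
    image_bytes = base64.b64decode(json.loads(body)["image"])
    vectors = embed(image_bytes)
    quantized = [to_text(binary_quantize(v)) for v in vectors]
    return json.dumps({"embeddings": quantized})
```

Unlike the snippet above, this serializes the response once instead of nesting a JSON string inside a dict, which saves the client a second decode. Binary quantization keeps one bit per dimension, so the vectors are much smaller on the wire but lose precision; whether that trade-off is acceptable depends on your retrieval setup.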
