Spaces:

akshit-g
/

SeeForMe-Live

Running

App Files Files Community

SeeForMe-Live / clients /python /README.md

akshit-g

add : files

d3cd5c1 4 months ago

preview code

raw

history blame

2.15 kB

	# Moondream Python Client Library

	Python client library for moondream. This library is an alpha preview -- it is
	in an early stage of development, and backward compatibility is not yet
	guaranteed. If you are using this in production, please pin the revision you
	are using.

	This library currently offers optimized CPU inference, but will be slower than
	the PyTorch implementation for CUDA and MPS backends. If you are running on a
	Mac with M1/M2/M3 etc. chips, or if you have a GPU available, this library is
	not recommended yet.

	## Setup

	Install the library using pip:

	```bash
	pip install moondream==0.0.2
	```

	Then download the model weights:

	```bash
	# int8 weights (recommended):
	wget "https://huggingface.co/vikhyatk/moondream2/resolve/client/moondream-latest-int8.bin.gz?download=true" -O - \| gunzip > moondream-latest-int8.bin
	# ...or, for fp16 weights (full precision):
	wget "https://huggingface.co/vikhyatk/moondream2/resolve/client/moondream-latest-f16.bin.gz?download=true" -O - \| gunzip > moondream-latest-f16.bin
	# ...or, for int4 weights (resource constrained environments):
	wget "https://huggingface.co/vikhyatk/moondream2/resolve/client/moondream-latest-int4.bin.gz?download=true" -O - \| gunzip > moondream-latest-int4.bin
	```

	## Usage

	```python
	import moondream as md
	from PIL import Image

	model = md.VL("moondream-latest-int8.bin")
	image = Image.open("path/to/image.jpg").convert("RGB")

	# Optional -- encode the image to efficiently run multiple queries on the same
	# image. This is not mandatory, since the model will automatically encode the
	# image if it is not already encoded.
	encoded_image = model.encode_image(image)

	# Caption the image.
	caption = model.caption(encoded_image)

	# ...or, if you want to stream the output:
	for t in model.caption(encoded_image, stream=True)["caption"]:
	print(t, end="", flush=True)

	# Ask a question about the image.
	question = "How many people are in this image?"
	answer = model.query(encoded_image, question)["answer"]

	# ...or again, if you want to stream the output:
	for t in model.query(encoded_image, question, stream=True)["answer"]:
	print(t, end="", flush=True)
	```