agentchat.contrib.captainagent.tools.information_retrieval.image_qa

image_qa

@with_requirements(["transformers", "torch"],
                   ["transformers", "torch", "PIL", "os"])
def image_qa(image, question, ckpt="Salesforce/blip-vqa-base")

Perform question answering on an image using a pre-trained VQA model.

Arguments:

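  • image Union[str, Image.Image] - The image to perform question answering on. It can be either a file path to the image or a PIL Image object.
  • question - The question to ask about the image.
  • ckpt str - The checkpoint of the pre-trained VQA model to load. Defaults to "Salesforce/blip-vqa-base".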

Returns:

  • dict - The generated answer text.
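
A minimal usage sketch, under stated assumptions: the import path is assumed to be the module path above prefixed with `autogen`, and `ask_about_image` and its `qa_fn` parameter are hypothetical names introduced here for illustration, not part of the library. The wrapper only strips the question and forwards the three documented arguments; the real model is imported lazily so the helper can be exercised with a stand-in function.

```python
def ask_about_image(image, question, qa_fn=None, ckpt="Salesforce/blip-vqa-base"):
    """Hypothetical convenience wrapper around image_qa (not part of the library).

    image    -- file path or PIL Image, passed through unchanged
    question -- question text; surrounding whitespace is stripped
    qa_fn    -- callable with the same signature as image_qa; if None,
                the real function is imported on first use
    ckpt     -- model checkpoint name, forwarded as-is
    """
    if not isinstance(question, str) or not question.strip():
        raise ValueError("question must be a non-empty string")

    if qa_fn is None:
        # Assumed import path: the documented module path prefixed with "autogen".
        # Requires transformers, torch, and PIL to be installed.
        from autogen.agentchat.contrib.captainagent.tools.information_retrieval.image_qa import (
            image_qa as qa_fn,
        )

    return qa_fn(image, question.strip(), ckpt=ckpt)


if __name__ == "__main__":
    # Stand-in for the real model so the sketch runs without downloading weights.
    def fake_image_qa(image, question, ckpt="Salesforce/blip-vqa-base"):
        return f"(answer to {question!r} about {image!r} from {ckpt})"

    print(ask_about_image("photo.jpg", "What animal is this?", qa_fn=fake_image_qa))
```

Omitting `qa_fn` makes the wrapper call the library function directly, in which case the first call may download the checkpoint named by `ckpt`.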