agentchat.contrib.captainagent.tools.information_retrieval.extract_pdf_text

extract_pdf_text

@with_requirements(["PyMuPDF"])
def extract_pdf_text(pdf_path, page_number=None)

Extracts text from a specified page or the entire PDF file.

Arguments:

  • pdf_path str - The path to the PDF file.
  • page_number int, optional - The zero-based index of the page to extract (the first page is 0). If omitted, text is extracted from the entire PDF file.

Returns:

  • str - The extracted text.
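
The page-selection behavior described above can be illustrated with a minimal sketch. This is not the library's actual implementation: `select_text` and `extract_pdf_text_sketch` are hypothetical names, and joining page text without a separator and raising `IndexError` for an out-of-range page are assumptions. The PyMuPDF calls used (`fitz.open`, `len(doc)`, `doc[i]`, `page.get_text()`) are part of its public API.

```python
def select_text(pages, page_number=None):
    """Return text from one zero-based page, or from all pages if page_number is None.

    `pages` is any sequence whose items expose a get_text() method,
    such as an open PyMuPDF document.
    """
    if page_number is None:
        # Assumption: page texts are concatenated with no extra separator.
        return "".join(page.get_text() for page in pages)
    if not 0 <= page_number < len(pages):
        # Assumption: an out-of-range index is reported as IndexError.
        raise IndexError(f"page_number {page_number} is outside 0..{len(pages) - 1}")
    return pages[page_number].get_text()


def extract_pdf_text_sketch(pdf_path, page_number=None):
    """Hypothetical stand-in for extract_pdf_text, built on PyMuPDF."""
    import fitz  # PyMuPDF; imported here so select_text works without it installed

    with fitz.open(pdf_path) as doc:
        return select_text(doc, page_number)
```

Because `page_number` is zero-based, a caller holding a page number as printed in a PDF viewer (which usually starts at 1) would pass `viewer_page - 1`.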