CambioML Howto
CambioML is an open-source machine learning infrastructure company that provides tools for accurate, private, and configurable document retrieval and data extraction using LLMs.
How to Use CambioML
Install CambioML: Install the CambioML open-source Python library, likely using pip: pip install cambioml
Import and initialize: Import the library and initialize the AnyParser with your API key: from any_parser import AnyParser; op = AnyParser(your_api_key)
Prepare your document: Have your PDF, HTML, or other document file ready for extraction
Extract content: Use the extract method to process your document: content_result = op.extract(your_file_path)
Configure output: Specify your desired output format (JSON, CSV, or Markdown) and schema mapping
Review and use extracted data: Examine the extracted content and use it for your desired purpose (e.g. LLM training, database input)
Redact if needed: If working with sensitive information, use CambioML's redaction features to remove confidential data during retrieval
Integrate with other tools: Use the extracted data with other CambioML tools like pykoi for model comparison or RLHF fine-tuning if needed
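The extract and output steps above can be sketched as a small helper. This is a minimal sketch, not CambioML's own API: extract_and_save and SupportsExtract are hypothetical names introduced here for illustration, and the only library call assumed is the extract(file_path) method shown in the steps. The source does not say what extract returns, so the helper treats the result as opaque and either wraps it in JSON or writes its string form as text (for example Markdown).

```python
import json
from pathlib import Path
from typing import Any, Protocol


class SupportsExtract(Protocol):
    """Any object exposing extract(file_path), such as the AnyParser instance above."""

    def extract(self, file_path: str) -> Any: ...


def extract_and_save(parser: SupportsExtract, file_path: str, out_path: str) -> Path:
    """Hypothetical helper: run parser.extract and save the result to out_path.

    Writes a JSON wrapper when out_path ends in .json; otherwise writes
    str(result) as plain text (e.g. a .md file).
    """
    # Assumption: the return type of extract is unknown, so it is kept opaque.
    result = parser.extract(file_path)
    out = Path(out_path)
    if out.suffix == ".json":
        payload = {"source": file_path, "content": result}
        # default=str keeps the dump from failing on non-JSON-native values.
        out.write_text(json.dumps(payload, default=str, indent=2), encoding="utf-8")
    else:
        out.write_text(str(result), encoding="utf-8")
    return out


# Usage with the real library (assumes it is installed and the key is valid):
#   from any_parser import AnyParser
#   op = AnyParser(your_api_key)
#   extract_and_save(op, "report.pdf", "report.md")
```

Taking the parser as an argument keeps the helper independent of how the parser is constructed, so it can be exercised with a stand-in object before an API key is available.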
CambioML FAQs
CambioML is a company that specializes in open-source machine learning infrastructure, providing tools for extracting and reconstructing text and data from PDFs, HTML files, and forms. It offers solutions for accurate document retrieval and data extraction using LLMs (Large Language Models).