Document Tools

Martha agents can interact with documents stored in tenant-scoped collections. Built-in platform functions give agents the ability to list, read, and search documents during conversations.

When documents are uploaded, they are automatically parsed, chunked, and embedded so that agents can search over structured chunks with keyword and semantic matching.

Agents can also formally cite document sources using the resolve_citation function, creating verifiable references from chat sessions or workflow executions to specific document chunks.

Prerequisites

  • A document collection with documents uploaded — see Document ingestion.
  • An agent that has been granted the document tools (list_docs, read_doc, search_docs). The tools are platform-provided and registered automatically; you only need to grant them to the agents that should use them. See Granting Agent Access below.

Available Tools

list_docs

List documents in a collection with optional filters.

| Property | Value |
| --- | --- |
| Category | document |
| Timeout | 10 seconds |

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| collection_id | string (UUID) | Yes | The collection to list documents from |
| filename_pattern | string | No | Glob pattern to filter filenames (e.g. `*.pdf`) |
| content_type | string | No | Filter by content type prefix (e.g. `text/`) |

Example response:

```json
{
  "documents": [
    {
      "id": "a1b2c3d4-...",
      "filename": "report.pdf",
      "content_type": "application/pdf",
      "size_bytes": 102400,
      "status": "ready",
      "created_at": "2026-02-14T10:30:00"
    }
  ],
  "count": 1
}
```
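
To make the two optional filters concrete, the sketch below mimics them on the client side against the response shape shown above. It assumes filename_pattern behaves like a shell-style glob and content_type is a plain prefix match. The function name `filter_documents` is hypothetical, and the platform's own matching rules may differ.

```python
from fnmatch import fnmatchcase


def filter_documents(documents, filename_pattern=None, content_type=None):
    """Return the documents that pass both optional filters.

    Assumption: glob semantics follow Python's fnmatch, and the content type
    filter is a simple string-prefix test.
    """
    matches = []
    for doc in documents:
        if filename_pattern and not fnmatchcase(doc["filename"], filename_pattern):
            continue
        if content_type and not doc["content_type"].startswith(content_type):
            continue
        matches.append(doc)
    return matches


documents = [
    {"filename": "report.pdf", "content_type": "application/pdf"},
    {"filename": "notes.txt", "content_type": "text/plain"},
]
pdf_only = filter_documents(documents, filename_pattern="*.pdf")
text_only = filter_documents(documents, content_type="text/")
```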

read_doc

Read the text content of a document. When the document has been ingested, returns the parsed revision text directly (no storage download). Otherwise falls back to reading the raw file.

| Property | Value |
| --- | --- |
| Category | document |
| Timeout | 30 seconds |

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| document_id | string (UUID) | Yes | The document to read |
| max_bytes | integer | No | Maximum bytes to return (default: 100 KB, hard cap: 10 MB) |

Example response (ingested document):

```json
{
  "content": "Parsed document text from the latest revision...",
  "filename": "report.pdf",
  "content_type": "application/pdf",
  "size_bytes": 204800,
  "truncated": false,
  "revision_id": "e5f6g7h8-...",
  "page_count": 12,
  "source": "revision"
}
```

Example response (non-ingested text file):

```json
{
  "content": "Raw file content truncated to max_bytes...",
  "filename": "notes.txt",
  "content_type": "text/plain",
  "size_bytes": 204800,
  "truncated": true,
  "source": "storage"
}
```

!!! tip
    After ingestion, read_doc works on all supported formats (PDF, DOCX, images, etc.) because it serves the parsed text, not the raw binary. Before ingestion, only text-based files can be read.
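
The interaction between max_bytes, its default, and the hard cap can be sketched as follows. This is an illustration only: it assumes 1 KB = 1024 bytes and that a request above the hard cap is clamped rather than rejected, neither of which is stated above. The helper names are hypothetical.

```python
DEFAULT_MAX_BYTES = 100 * 1024       # documented default, assuming 1 KB = 1024 bytes
HARD_CAP_BYTES = 10 * 1024 * 1024    # documented hard cap


def effective_max_bytes(requested=None):
    """Resolve the byte limit for a read_doc call."""
    if requested is None:
        return DEFAULT_MAX_BYTES
    if requested <= 0:
        raise ValueError("max_bytes must be positive")
    # Assumption: values above the hard cap are clamped, not rejected.
    return min(requested, HARD_CAP_BYTES)


def truncate_content(raw, requested=None):
    """Return (content, truncated), mirroring the response fields above."""
    limit = effective_max_bytes(requested)
    return raw[:limit], len(raw) > limit
```

Under these assumptions, a 204800-byte file read with the default limit comes back with truncated set to true, which matches the non-ingested example response.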

search_docs

Search across documents in a collection. Supports multiple search modes that leverage ingested chunks when available.

| Property | Value |
| --- | --- |
| Category | document |
| Timeout | 60 seconds |

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| collection_id | string (UUID) | Yes | The collection to search |
| query | string | Yes | Search query (natural language for chunk modes, regex for text mode) |
| search_mode | string | No | `hybrid` (default), `keyword`, `semantic`, or `text` |
| max_chunks | integer | No | Maximum chunk results (default: 10, max: 100). For hybrid/keyword/semantic modes. |
| max_results | integer | No | Maximum matches for text mode (default: 20) |
| case_sensitive | boolean | No | Case-sensitive matching for text mode (default: false) |
| context_lines | integer | No | Context lines around each match for text mode (default: 2) |
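
Because several parameters apply only to some modes, a caller may want to check arguments before invoking the tool. The sketch below shows one way to do that. `build_search_args` is a hypothetical client-side helper, and rejecting mismatched options is a design choice of this sketch; the platform may simply ignore them.

```python
CHUNK_MODES = {"hybrid", "keyword", "semantic"}
TEXT_ONLY_OPTIONS = {"max_results", "case_sensitive", "context_lines"}
MAX_CHUNKS_LIMIT = 100  # documented maximum for max_chunks


def build_search_args(collection_id, query, search_mode="hybrid", **options):
    """Assemble a search_docs argument object, checking mode-specific options."""
    if search_mode not in CHUNK_MODES and search_mode != "text":
        raise ValueError(f"unknown search_mode: {search_mode!r}")
    allowed = {"max_chunks"} if search_mode in CHUNK_MODES else TEXT_ONLY_OPTIONS
    unexpected = sorted(set(options) - allowed)
    if unexpected:
        raise ValueError(f"{unexpected} not used by {search_mode} mode")
    if options.get("max_chunks", 10) > MAX_CHUNKS_LIMIT:
        raise ValueError("max_chunks cannot exceed 100")
    return {
        "collection_id": collection_id,
        "query": query,
        "search_mode": search_mode,
        **options,
    }


# Text mode takes a regex, so a raw string keeps the backslash intact.
text_args = build_search_args("<collection-uuid>", r"revenue.*\d+", "text", context_lines=0)
```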

Search Modes

=== "Hybrid (default)"

    Combines keyword and semantic search, then merges results by chunk ID and averages scores. Best for general-purpose search.

    ```json
    {"collection_id": "...", "query": "quarterly revenue trends", "search_mode": "hybrid"}
    ```

=== "Keyword"

    PostgreSQL full-text search using `tsvector`/`tsquery`. Fast and works without embeddings. Good for exact term matching.

    ```json
    {"collection_id": "...", "query": "revenue", "search_mode": "keyword"}
    ```

=== "Semantic"

    Vector cosine similarity using pgvector. Finds conceptually similar content even when exact words differ. Requires embeddings.

    ```json
    {"collection_id": "...", "query": "how much money did we make", "search_mode": "semantic"}
    ```

=== "Text (regex)"

    Regex-based search over raw file content. Used as automatic fallback when no chunks exist yet. Also available explicitly.

    ```json
    {"collection_id": "...", "query": "revenue.*\\d+", "search_mode": "text"}
    ```

Example response (chunk-based modes):

```json
{
  "results": [
    {
      "chunk_id": "f1g2h3i4-...",
      "chunk_text": "Q3 revenue reached $4.2M, exceeding projections by 15%...",
      "chunk_index": 7,
      "document_id": "a1b2c3d4-...",
      "document_name": "annual-report.pdf",
      "revision_id": "e5f6g7h8-...",
      "page_start": 12,
      "page_end": 12,
      "section_heading": "Financial Results",
      "score": 0.87
    }
  ],
  "search_mode": "hybrid",
  "total_count": 1
}
```
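
The hybrid mode's merge step (combine by chunk ID, then average scores) can be illustrated with a short sketch. It is not the platform's implementation. In particular, it assumes that a chunk returned by only one of the two searches keeps that single score, which the description above does not specify. `merge_hybrid` is a hypothetical name.

```python
from collections import defaultdict


def merge_hybrid(keyword_hits, semantic_hits, max_chunks=10):
    """Merge two result lists by chunk_id and average each chunk's scores.

    Assumption: the average is taken over the searches that returned the
    chunk, so a chunk found by only one search keeps that single score.
    """
    scores = defaultdict(list)
    chunks = {}
    for hit in list(keyword_hits) + list(semantic_hits):
        scores[hit["chunk_id"]].append(hit["score"])
        chunks.setdefault(hit["chunk_id"], hit)
    merged = [
        {**chunks[chunk_id], "score": sum(values) / len(values)}
        for chunk_id, values in scores.items()
    ]
    merged.sort(key=lambda hit: hit["score"], reverse=True)
    return merged[:max_chunks]


keyword_hits = [{"chunk_id": "a", "score": 0.5}, {"chunk_id": "b", "score": 0.25}]
semantic_hits = [{"chunk_id": "a", "score": 1.0}, {"chunk_id": "c", "score": 0.5}]
merged = merge_hybrid(keyword_hits, semantic_hits)
```

Here chunk "a" appears in both lists and ends up first with a score of 0.75, followed by "c" and then "b".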

!!! note "Automatic fallback"
    When using hybrid, keyword, or semantic mode on a collection with no ingested chunks, search automatically falls back to text-mode regex and returns `"search_mode": "text_fallback"` in the response.

Text mode limits:

  • Binary content types are skipped automatically
  • Per-document cap: 1 MB
  • Total download cap across all documents: 5 MB
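
One way to picture how these limits combine is as a read budget that is spent document by document. The sketch below is hypothetical and makes several assumptions not stated above: only `text/` content types count as non-binary, 1 MB means 1024 × 1024 bytes, documents are visited in the order given, and oversized files are truncated rather than skipped.

```python
PER_DOCUMENT_CAP = 1 * 1024 * 1024   # 1 MB per document
TOTAL_CAP = 5 * 1024 * 1024          # 5 MB across all documents


def plan_text_search_reads(documents):
    """Decide how many bytes of each document a text-mode search would read."""
    remaining = TOTAL_CAP
    plan = []
    for doc in documents:
        if not doc["content_type"].startswith("text/"):
            continue  # binary content types are skipped
        if remaining <= 0:
            break  # total download budget is exhausted
        to_read = min(doc["size_bytes"], PER_DOCUMENT_CAP, remaining)
        plan.append((doc["filename"], to_read))
        remaining -= to_read
    return plan
```

With six 2 MB text files, this plan reads 1 MB from each of the first five and never reaches the sixth, so matches in later or larger files can be missed in text mode.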

get_page_image

Get a page image from an ingested document. Returns a presigned URL to the rendered PNG.

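| Property | Value |
| --- | --- |
| Category | document |
| Timeout | 15 seconds |

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| document_id | string (UUID) | Yes | The document |
| page_number | integer | Yes | 1-based page number |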

Example response:

```json
{
  "document_id": "c7fdca97-...",
  "page_number": 3,
  "image_url": "https://storage.example.com/...signed-url...",
  "document_name": "site-plan.pdf"
}
```

query_collection

High-level RAG search across an entire collection. Runs 3-way retrieval in parallel: keyword (tsvector), semantic (pgvector), and visual (ColPali via ColiVara). Results are merged, deduplicated, and score-normalized.

When ColPali finds a page with no matching text chunk, the page is described on-demand by a VLM with the question as context (up to 3 pages per query).

| Property | Value |
| --- | --- |
| Category | document |
| Timeout | 60 seconds |

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| collection_id | string (UUID or slug) | Yes | The collection to search |
| question | string | Yes | Natural language question |
| max_chunks | integer | No | Maximum results (default: 10) |

Example response:

```json
{
  "results": [
    {
      "chunk_text": "The BBU 6631 supports 3 sector configurations...",
      "document_name": "site-plan.pdf",
      "score": 0.91,
      "page_start": 5,
      "section_heading": "Equipment Specifications"
    }
  ],
  "total_count": 8
}
```
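
The merge, deduplicate, and score-normalize step described for query_collection is not specified in detail here. As a rough sketch of the idea, the example below min-max scales each retriever's scores into [0, 1] and keeps the best-scoring hit per page. Both the scaling method and the (document_name, page_start) deduplication key are assumptions, and the function names are hypothetical.

```python
def normalize(hits):
    """Min-max scale one retriever's scores into [0, 1]."""
    if not hits:
        return []
    values = [hit["score"] for hit in hits]
    low, span = min(values), max(values) - min(values)
    return [
        {**hit, "score": 1.0 if span == 0 else (hit["score"] - low) / span}
        for hit in hits
    ]


def merge_retrievers(*result_lists, max_chunks=10):
    """Merge normalized results, keeping the best hit per (document, page)."""
    best = {}
    for hits in result_lists:
        for hit in normalize(hits):
            key = (hit["document_name"], hit["page_start"])  # assumed dedup key
            if key not in best or hit["score"] > best[key]["score"]:
                best[key] = hit
    ranked = sorted(best.values(), key=lambda hit: hit["score"], reverse=True)
    return ranked[:max_chunks]


keyword = [
    {"document_name": "site-plan.pdf", "page_start": 5, "score": 0.5},
    {"document_name": "site-plan.pdf", "page_start": 6, "score": 0.25},
]
visual = [
    {"document_name": "site-plan.pdf", "page_start": 5, "score": 7.0},
    {"document_name": "site-plan.pdf", "page_start": 8, "score": 3.0},
]
merged = merge_retrievers(keyword, visual)
```

Normalizing before merging matters because the retrievers score on different scales; the visual score in this example is well above 1, while the keyword scores are not.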

list_collections

List all document collections available to the tenant.

| Property | Value |
| --- | --- |
| Category | document |
| Timeout | 10 seconds |

visual_search

Search for document pages by visual content using ColPali. Returns presigned image URLs for matching pages. Use when the user asks to see a specific drawing, diagram, or visual element.

| Property | Value |
| --- | --- |
| Category | document |
| Timeout | 15 seconds |

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| collection_id | string (UUID or slug) | Yes | The collection to search |
| query | string | Yes | What to search for visually |
| max_results | integer | No | Maximum pages (default: 5, max: 10) |

Example response:

```json
{
  "pages": [
    {
      "page_number": 8,
      "document_name": "site-plan.pdf",
      "document_id": "c7fdca97-...",
      "image_url": "https://storage.example.com/...signed-url...",
      "score": 7.2
    }
  ],
  "query": "rack elevation diagram"
}
```

!!! note "Requires vision indexing"
    visual_search requires INGESTION_VISION_RETRIEVAL_ENABLED=true and a valid COLIVARA_API_KEY. Documents must be re-ingested after enabling vision to index page images.


Tenant Isolation

All document tools enforce tenant isolation:

  • tenant_id is extracted from the agent's session_context — it is never accepted as a parameter
  • Every database query is scoped by tenant_id (enforced by DocumentCollectionService)
  • Storage keys are tenant-prefixed: {tenant_id}/{slug}/{uuid}_{filename}

An agent running under tenant A cannot list, read, or search documents belonging to tenant B.
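
The tenant-prefixed key layout above lends itself to a simple prefix check. The sketch below is illustrative only. Both helper names are hypothetical, and the platform enforces isolation in its own service layer rather than through code like this.

```python
def build_storage_key(tenant_id, slug, doc_uuid, filename):
    """Compose a key in the documented {tenant_id}/{slug}/{uuid}_{filename} layout."""
    return f"{tenant_id}/{slug}/{doc_uuid}_{filename}"


def belongs_to_tenant(storage_key, tenant_id):
    """True when the key sits under the tenant's prefix.

    The trailing slash is part of the check so that tenant "t1" does not
    match keys that belong to tenant "t10".
    """
    return storage_key.startswith(f"{tenant_id}/")


key = build_storage_key("t1", "reports", "a1b2c3d4", "report.pdf")
```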


Granting Agent Access

To allow an agent to use document tools, grant the agent access to the relevant functions (list_docs, read_doc, search_docs, get_page_image, query_collection, list_collections, visual_search) via the admin UI (Agents page > Capabilities tab) or the API:

```bash
# Grant access via API (prefix the path with your deployment's base URL)
curl -X POST "/api/admin/definitions/clients/{client_id}/functions" \
  -H "Content-Type: application/json" \
  -d '{"function_name": "list_docs"}'
```

In the admin UI, document tools appear under the Document category in the capabilities list.
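
To grant all seven document tools in one go, the same request can be prepared once per function. The sketch below only builds the requests and does not send them. The base URL and client ID are placeholders, authentication is omitted because the curl example shows none, and sending one function per request is an assumption based on that example.

```python
import json
from urllib.request import Request

DOCUMENT_TOOLS = [
    "list_docs", "read_doc", "search_docs", "get_page_image",
    "query_collection", "list_collections", "visual_search",
]


def build_grant_requests(base_url, client_id, functions=DOCUMENT_TOOLS):
    """Prepare one POST request per function for the grant endpoint."""
    url = f"{base_url.rstrip('/')}/api/admin/definitions/clients/{client_id}/functions"
    return [
        Request(
            url,
            data=json.dumps({"function_name": name}).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        for name in functions
    ]


# Hypothetical base URL and client ID, used only for illustration.
requests_to_send = build_grant_requests("https://martha.example.com/", "client-123")
```

Each prepared request could then be passed to `urllib.request.urlopen`, with whatever authentication headers your deployment requires added first.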


Managing Documents

Use the Documents page in the admin panel to:

  • Create and manage collections per tenant
  • Upload files via drag-and-drop or file picker (PDF, DOCX, PPTX, HTML, Markdown, CSV, images up to 50MB)
  • Monitor ingestion status with progress bars and stage indicators
  • Re-ingest documents to regenerate chunks and embeddings
  • Browse documents with status, size, and type information
  • Preview text and image content inline
  • Download documents via presigned URLs
  • Delete documents (soft delete)

Martha is built by aiaiai-pt.