OCR Agent

The OCR Agent processes PDF documents with OCR to extract text and tables, then answers natural-language questions about the processed documents.

Base URL

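/api/agents/ocr_agent

Endpoint paths below are relative to this base URL. The request examples use the host https://nextneural-api.superteams.ai.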

Authentication

All endpoints except the health check require authentication. Sign up at https://cloud.nextneural.ai to get your API key.
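
A minimal Python sketch of building an authenticated request. It assumes the key is sent as a Bearer token in the Authorization header, as the curl examples in this document do. The names build_request and API_ROOT are illustrative and not part of the API.

```python
import json
import urllib.request

# Assumption: host taken from the curl examples, path from the Base URL section.
API_ROOT = "https://nextneural-api.superteams.ai/api/agents/ocr_agent"


def build_request(path, token, body=None, method="POST"):
    """Build (but do not send) a JSON request carrying a Bearer token."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(API_ROOT + path, data=data, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req
```

The returned object can be passed to urllib.request.urlopen to send it.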

Endpoints

1. Health Check

Check if the OCR agent service is running.

Endpoint: GET /health

Authentication: None required

Response:

{
  "status": "healthy",
  "service": "ocr_agent"
}
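
As a rough sketch, a client could validate the health response like this. The function name is_healthy is hypothetical; the expected field values come from the sample response above.

```python
import json


def is_healthy(raw_body):
    """Return True only when the body matches the documented healthy response."""
    try:
        payload = json.loads(raw_body)
    except (TypeError, ValueError):
        # Not JSON at all (for example an HTML error page from a proxy).
        return False
    return (
        isinstance(payload, dict)
        and payload.get("status") == "healthy"
        and payload.get("service") == "ocr_agent"
    )
```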

2. Process PDF

Process a PDF document with OCR and store the extracted content in the database. The file is fetched securely using the document UUID; no file paths or URLs are exposed.

Endpoint: POST /process_pdf

Authentication: Required (scope: agent:ocr)

Request Body:

{
  "document_id": "550e8400-e29b-41d4-a716-446655440000"
}

Query Parameters:

ocr_choice (optional, string, default: "openai") - OCR engine to use
force_reparse (optional, boolean, default: false) - Force re-parsing even if already processed

Parameters:

document_id (required, string): UUID of the knowledge base document containing the PDF file

Request Example:

curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/process_pdf?force_reparse=false" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_id": "550e8400-e29b-41d4-a716-446655440000"
  }'

Response (New Document):

{
  "message": "PDF processed successfully.",
  "document_id": "550e8400-e29b-41d4-a716-446655440000",
  "already_processed": false
}

Response (Already Processed):

{
  "message": "Document already parsed",
  "document_id": "550e8400-e29b-41d4-a716-446655440000",
  "already_processed": true,
  "processed_at": "2025-01-14T10:30:00"
}

Notes:

The file path is automatically fetched from the knowledge base document using the document_id
The document_id must belong to the authenticated user (ownership is verified)
If force_reparse=true, existing OCR data is deleted and the document is re-processed
Company name and report year are automatically extracted using AI
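
The query parameters and the already_processed flag can be handled as in the sketch below. The helper names are hypothetical, and the lowercase "true" encoding is an assumption based on the force_reparse=true form used in the notes above.

```python
from urllib.parse import urlencode

# Assumption: host taken from the curl examples in this document.
API_ROOT = "https://nextneural-api.superteams.ai/api/agents/ocr_agent"


def process_pdf_url(ocr_choice=None, force_reparse=False):
    """Return the /process_pdf URL, adding only non-default query parameters."""
    params = {}
    if ocr_choice is not None:
        params["ocr_choice"] = ocr_choice
    if force_reparse:
        params["force_reparse"] = "true"
    query = urlencode(params)
    return f"{API_ROOT}/process_pdf" + (f"?{query}" if query else "")


def was_already_processed(response):
    """True when the service reports the document had been parsed before."""
    return bool(response.get("already_processed"))
```

Checking already_processed lets a caller tell a fresh parse from a cached one without comparing message strings.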

3. Ask Question (RAG)

Ask questions about a processed document using retrieval-augmented generation (RAG).

Endpoint: POST /ask_ocr

Authentication: Required

Request Body:

{
  "question": "What was the revenue growth in 2024?",
  "document_id": "550e8400-e29b-41d4-a716-446655440000"
}

Parameters:

question (required, string): Natural language question about the document
document_id (required, string): UUID of the knowledge base document to query

Request Example:

curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/ask_ocr" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What was the revenue growth in 2024?",
    "document_id": "550e8400-e29b-41d4-a716-446655440000"
  }'

Response:

{
  "answer": "According to the document, the revenue growth in 2024 was 15.3%, increasing from $10.2M in 2023 to $11.8M in 2024."
}

Notes:

The document must be processed first using /process_pdf
The document_id is required for security validation
The system retrieves the document content most relevant to the question and generates an answer based on it
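
Because the document must be processed before it can be queried, a client may want to process on demand. The sketch below assumes a hypothetical transport post(path, body) that returns a (status_code, json_dict) pair, and assumes success is reported as HTTP 200; neither is specified by this document.

```python
def ask_with_auto_process(post, document_id, question):
    """Ask a question; on 404, process the document and retry once.

    `post(path, body)` is a hypothetical transport returning (status, payload).
    """
    body = {"question": question, "document_id": document_id}
    status, payload = post("/ask_ocr", body)
    if status == 404:
        # Likely not processed yet, so run /process_pdf first.
        status, payload = post("/process_pdf", {"document_id": document_id})
        if status != 200:
            raise RuntimeError(payload.get("detail", "processing failed"))
        status, payload = post("/ask_ocr", body)
    if status != 200:
        raise RuntimeError(payload.get("detail", "request failed"))
    return payload["answer"]
```

Retrying only once keeps the helper from looping if the document cannot be processed.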

4. Get OCR Content

Retrieve the full OCR content organized by pages for a specific document.

Endpoint: POST /get_ocr_content

Authentication: Required (scope: agent:ocr)

Request Body:

{
  "document_id": "550e8400-e29b-41d4-a716-446655440000"
}

Parameters:

document_id (required, string): UUID of the knowledge base document

Request Example:

curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/get_ocr_content" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_id": "550e8400-e29b-41d4-a716-446655440000"
  }'

Response:

{
  "document_id": "550e8400-e29b-41d4-a716-446655440000",
  "company_name": "Acme Corporation",
  "report_year": 2024,
  "total_pages": 25,
  "pages": [
    {
      "page_number": 1,
      "text_chunks": [
        {
          "text": "Annual Report 2024..."
        }
      ],
      "tables": []
    },
    {
      "page_number": 2,
      "text_chunks": [
        {
          "text": "Financial highlights..."
        }
      ],
      "tables": [
        {
          "text": "Revenue $11.8M, Profit $2.3M...",
          "structure": "| Metric | 2023 | 2024 |\n|--------|------|------|...",
          "raw_data": ["Revenue metric 2023 $10.2M", "Revenue metric 2024 $11.8M"]
        }
      ]
    }
  ]
}

Notes:

Returns structured content organized by pages
Each table includes a structured representation (structure), a natural language representation (text), and raw_data
Raw markdown content is returned when available (from Mistral OCR)
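
One way to consume this response is to flatten each page into a single string. The sketch below relies only on the fields shown in the sample response; the function names are illustrative.

```python
def page_text(page):
    """Join a page's text chunks, then append each table's structure string."""
    parts = [chunk["text"] for chunk in page.get("text_chunks", [])]
    parts += [
        table["structure"]
        for table in page.get("tables", [])
        if table.get("structure")
    ]
    return "\n".join(parts)


def document_text(content):
    """Map page_number to flattened text for a /get_ocr_content response."""
    return {page["page_number"]: page_text(page) for page in content.get("pages", [])}
```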

Conversation Management

The OCR Agent supports multi-turn conversations with context retention.

5. Create Conversation

Create a new conversation session.

Endpoint: POST /conversations/create

Authentication: Required

Request Body:

{
  "kb_document_id": "550e8400-e29b-41d4-a716-446655440000",
  "title": "Financial Analysis Q&A"
}

Parameters:

kb_document_id (optional, string): UUID of the knowledge base document
title (optional, string): Custom title for the conversation

Request Example:

curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/conversations/create" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "kb_document_id": "550e8400-e29b-41d4-a716-446655440000",
    "title": "Financial Analysis Q&A"
  }'

Response:

{
  "id": 789,
  "user_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "document_id": "550e8400-e29b-41d4-a716-446655440000",
  "title": "Financial Analysis Q&A",
  "started_at": "2025-01-14T10:30:00",
  "last_message_at": "2025-01-14T10:30:00"
}

Notes:

If kb_document_id is provided, the system finds the corresponding OCR document
Document ownership is verified before creating the conversation

6. Add Message to Conversation

Add a message (user or assistant) to an existing conversation.

Endpoint: POST /conversations/{conversation_id}/messages

Authentication: Required

Path Parameters:

conversation_id (required, integer): ID of the conversation

Request Body:

{
  "conversation_id": 789,
  "role": "user",
  "content": "What was the revenue?"
}

Parameters:

conversation_id (required, integer): ID of the conversation
role (required, string): Either "user" or "assistant"
content (required, string): Message content

Request Example:

curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/conversations/789/messages" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "conversation_id": 789,
    "role": "user",
    "content": "What was the revenue?"
  }'

Response:

{
  "id": "a1b2c3d4-e5f6-5678-90ab-cdef12345678",
  "conversation_id": 789,
  "role": "user",
  "content": "What was the revenue?",
  "timestamp": "2025-01-14T10:31:00"
}

Notes:

role must be either "user" or "assistant"
The conversation must belong to the authenticated user
Message id is returned as a UUID for security
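
A small sketch of building this request with the role checked on the client before sending. The helper message_request is hypothetical; the body shape follows the request example above.

```python
VALID_ROLES = ("user", "assistant")


def message_request(conversation_id, role, content):
    """Return (path, body) for adding a message, rejecting unknown roles early."""
    if role not in VALID_ROLES:
        raise ValueError(f"role must be one of {VALID_ROLES}, got {role!r}")
    path = f"/conversations/{conversation_id}/messages"
    # The documented request repeats conversation_id in the body and the path.
    body = {"conversation_id": conversation_id, "role": role, "content": content}
    return path, body
```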

7. Get Conversation History

Retrieve all conversations for the authenticated user.

Endpoint: GET /conversations/history

Authentication: Required

Query Parameters:

limit (optional, integer, default: 100) - Maximum number of conversations to return

Request Example:

curl -X GET "https://nextneural-api.superteams.ai/api/agents/ocr_agent/conversations/history?limit=50" \
  -H "Authorization: Bearer YOUR_TOKEN"

Response:

[
  {
    "id": 789,
    "fileName": "Acme Corporation 2024",
    "analyzedAt": "2025-01-14T10:31:00",
    "duration": "5 messages",
    "documentId": "550e8400-e29b-41d4-a716-446655440000"
  },
  {
    "id": 788,
    "fileName": "Q4 Report",
    "analyzedAt": "2025-01-13T15:20:00",
    "duration": "3 messages",
    "documentId": "660e8400-e29b-41d4-a716-446655440001"
  }
]

Notes:

Returns conversations in reverse chronological order (newest first)
fileName is derived from KB document title, company name, or conversation title
Only the authenticated user's own conversations are returned
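
The history entries can be turned into typed values as sketched below. Parsing a message count out of duration is an assumption based on the "5 messages" form in the sample response; the function name is illustrative.

```python
from datetime import datetime


def summarize_history(entries):
    """Return (id, fileName, analyzed_at, message_count) tuples, newest first."""
    rows = []
    for entry in entries:
        analyzed_at = datetime.fromisoformat(entry["analyzedAt"])
        # Assumption: duration looks like "5 messages"; otherwise count is None.
        tokens = entry.get("duration", "").split()
        count = int(tokens[0]) if tokens and tokens[0].isdecimal() else None
        rows.append((entry["id"], entry["fileName"], analyzed_at, count))
    # The API documents newest-first order; sorting makes that explicit client-side.
    return sorted(rows, key=lambda row: row[2], reverse=True)
```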

8. Get Specific Conversation

Retrieve a specific conversation with all its messages.

Endpoint: GET /conversations/{conversation_id}

Authentication: Required

Path Parameters:

conversation_id (required, integer): ID of the conversation to retrieve

Request Example:

curl -X GET "https://nextneural-api.superteams.ai/api/agents/ocr_agent/conversations/789" \
  -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "id": 789,
  "user_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "document_id": "550e8400-e29b-41d4-a716-446655440000",
  "title": "Financial Analysis Q&A",
  "started_at": "2025-01-14T10:30:00",
  "last_message_at": "2025-01-14T10:35:00",
  "document": {
    "document_id": "550e8400-e29b-41d4-a716-446655440000",
    "company_name": "Acme Corporation",
    "report_year": 2024
  },
  "messages": [
    {
      "id": "a1b2c3d4-e5f6-5678-90ab-cdef12345678",
      "role": "user",
      "content": "What was the revenue?",
      "timestamp": "2025-01-14T10:31:00"
    },
    {
      "id": "b2c3d4e5-f6a7-6789-01bc-def234567890",
      "role": "assistant",
      "content": "The revenue in 2024 was $11.8M.",
      "timestamp": "2025-01-14T10:31:15"
    }
  ]
}

Notes:

Only the conversation owner can access it
Returns 404 if the conversation doesn't exist or access is denied
Messages are ordered chronologically
Message IDs are returned as UUIDs for security
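
For display or logging, the response can be rendered as a plain-text transcript. This sketch uses only fields from the sample response; format_transcript is a hypothetical helper.

```python
def format_transcript(conversation):
    """Render a conversation response as text, one line per message."""
    lines = [conversation.get("title") or f"Conversation {conversation['id']}"]
    # Messages are documented as chronological, so they are not re-sorted here.
    for message in conversation.get("messages", []):
        lines.append(
            f"[{message['timestamp']}] {message['role']}: {message['content']}"
        )
    return "\n".join(lines)
```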

9. Delete Conversation

Delete a conversation and all its messages.

Endpoint: DELETE /conversations/{conversation_id}

Authentication: Required

Path Parameters:

conversation_id (required, integer): ID of the conversation to delete

Request Example:

curl -X DELETE "https://nextneural-api.superteams.ai/api/agents/ocr_agent/conversations/789" \
  -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "success": true,
  "message": "Conversation deleted successfully"
}

Notes:

Only the conversation owner can delete it
All messages in the conversation are deleted with it (cascade delete)
Returns 404 if the conversation doesn't exist or access is denied

Error Responses

All endpoints may return the following error responses:

400 Bad Request:

{
  "detail": "document_id is required for security validation"
}

403 Forbidden:

{
  "detail": "Document does not belong to this user"
}

403 Forbidden (Invalid Document):

{
  "detail": "Document with UUID 550e8400-e29b-41d4-a716-446655440000 does not exist"
}

404 Not Found:

{
  "detail": "Document not found in OCR database. Please process it first."
}

404 Not Found (File):

{
  "detail": "Document file not found in knowledgebase"
}

500 Internal Server Error:

{
  "detail": "Error message describing the issue"
}
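
A client might map these responses to exceptions, as in the sketch below. The exception classes are hypothetical, and matching on the detail text to detect an unprocessed document is an assumption based on the sample 404 message above; the API does not document a machine-readable error code.

```python
class OcrAgentError(Exception):
    """Base error carrying the HTTP status and the `detail` string."""

    def __init__(self, status, detail):
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail


class NotProcessedError(OcrAgentError):
    """The document exists but has not been run through /process_pdf."""


def raise_for_error(status, payload):
    """Raise a typed exception for 4xx/5xx responses; do nothing otherwise."""
    if status < 400:
        return
    detail = payload.get("detail", "") if isinstance(payload, dict) else str(payload)
    # Assumption: the "process it first" wording identifies this case.
    if status == 404 and "process it first" in detail.lower():
        raise NotProcessedError(status, detail)
    raise OcrAgentError(status, detail)
```

Catching NotProcessedError separately lets a caller trigger processing, while other errors propagate unchanged.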

Workflow Example

# 1. Process a PDF document (file is fetched via document_id)
curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/process_pdf" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"document_id": "550e8400-e29b-41d4-a716-446655440000"}'

# 2. Create a conversation
curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/conversations/create" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"kb_document_id": "550e8400-e29b-41d4-a716-446655440000", "title": "Report Analysis"}'

# 3. Ask questions
curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/ask_ocr" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"question": "What are the key findings?", "document_id": "550e8400-e29b-41d4-a716-446655440000"}'

# 4. Get full OCR content if needed
curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/get_ocr_content" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"document_id": "550e8400-e29b-41d4-a716-446655440000"}'
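
The same four steps could be scripted in Python roughly as follows. This is a sketch using only the standard library; post_json and run_workflow are hypothetical names, and the transport is injectable so the flow can be exercised without network access.

```python
import json
import urllib.request

# Assumption: host taken from the curl examples in this document.
API_ROOT = "https://nextneural-api.superteams.ai/api/agents/ocr_agent"


def post_json(path, body, token):
    """POST a JSON body with a Bearer token and return the decoded response."""
    req = urllib.request.Request(
        API_ROOT + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def run_workflow(document_id, question, token, post=post_json):
    """Run the four workflow steps in order and collect the key results."""
    processed = post("/process_pdf", {"document_id": document_id}, token)
    conversation = post(
        "/conversations/create",
        {"kb_document_id": document_id, "title": "Report Analysis"},
        token,
    )
    answer = post("/ask_ocr", {"question": question, "document_id": document_id}, token)
    content = post("/get_ocr_content", {"document_id": document_id}, token)
    return {
        "already_processed": processed.get("already_processed"),
        "conversation_id": conversation["id"],
        "answer": answer["answer"],
        "total_pages": content["total_pages"],
    }
```

Calling run_workflow(document_id, "What are the key findings?", "YOUR_TOKEN") would perform the real requests; passing a stub as post makes the sequence testable offline.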

Security Features

User Isolation: All queries are private to your account
Document Ownership: Only you can access your documents
Authentication: All endpoints except the health check require a valid API token
UUID-based Identification: Secure, non-sequential identifiers for documents and messages
No Internal ID Exposure: Internal database IDs are never exposed in API responses
No File Path Exposure: File paths are never exposed in API requests or responses
