OCR Agent

The OCR Agent processes PDF documents and answers questions about them. It uses OCR to extract text and tables, stores the extracted content, and answers natural-language questions about your documents using retrieval-augmented generation (RAG).

Base URL

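/api/agents/ocr_agent

Endpoint paths below are relative to this base URL. The request examples use the host https://nextneural-api.superteams.ai.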

Authentication

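All endpoints except the health check require authentication. Sign up at https://cloud.nextneural.ai to get your API key, then send it as a bearer token in the Authorization header, as in the request examples.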
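
The same header can be attached from Python. The following is a minimal sketch using only the standard library: the host is the one that appears in the curl examples in this reference, `build_request` is a hypothetical helper name (not part of any official SDK), and the request is only constructed, not sent.

```python
import json
import urllib.request

# Host as used in the curl examples, followed by the documented base path.
BASE_URL = "https://nextneural-api.superteams.ai/api/agents/ocr_agent"


def build_request(path, token, body=None, method="POST"):
    """Build (but do not send) an authenticated request to an OCR Agent endpoint.

    Illustrative helper only; the name and signature are assumptions.
    """
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if body is not None:
        # JSON bodies are sent with the same Content-Type as the curl examples.
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


req = build_request(
    "/process_pdf",
    "YOUR_TOKEN",
    {"document_id": "550e8400-e29b-41d4-a716-446655440000"},
)
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen(req)` would send it.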

Endpoints

1. Health Check

Check if the OCR agent service is running.

Endpoint: GET /health

Authentication: None required

Response:

{
"status": "healthy",
"service": "ocr_agent"
}

2. Process PDF

Process a PDF document with OCR and store the extracted content in the database. The file is fetched securely using the document UUID - no file paths or URLs are exposed.

Endpoint: POST /process_pdf

Authentication: Required (scope: agent:ocr)

Request Body:

{
"document_id": "550e8400-e29b-41d4-a716-446655440000"
}

Query Parameters:

  • ocr_choice (optional, default: "openai"): OCR engine to use
  • force_reparse (optional, default: false): Force re-parsing even if the document was already processed

Parameters:

  • document_id (required, string): UUID of the knowledge base document containing the PDF file

Request Example:

curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/process_pdf?force_reparse=false" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"document_id": "550e8400-e29b-41d4-a716-446655440000"
}'

Response (New Document):

{
"message": "PDF processed successfully.",
"document_id": "550e8400-e29b-41d4-a716-446655440000",
"already_processed": false
}

Response (Already Processed):

{
"message": "Document already parsed",
"document_id": "550e8400-e29b-41d4-a716-446655440000",
"already_processed": true,
"processed_at": "2025-01-14T10:30:00"
}

Notes:

  • The file path is automatically fetched from the knowledge base document using the document_id
  • The document_id must belong to the authenticated user (ownership is verified)
  • If force_reparse=true, existing OCR data is deleted and the document is re-processed
  • Company name and report year are automatically extracted using AI
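
As a rough sketch of how the query parameters and the two response shapes fit together, the Python below builds the `/process_pdf` URL and summarizes a response. The function names are hypothetical, the host comes from the curl example, and the lowercase `true` literal is an assumption that mirrors the `force_reparse=false` spelling used there.

```python
from urllib.parse import urlencode

# Host as used in the curl examples, followed by the documented base path.
BASE_URL = "https://nextneural-api.superteams.ai/api/agents/ocr_agent"


def process_pdf_url(ocr_choice=None, force_reparse=False):
    """Return the /process_pdf URL with the optional query parameters attached."""
    params = {}
    if ocr_choice is not None:
        params["ocr_choice"] = ocr_choice
    if force_reparse:
        # Assumed spelling, mirroring force_reparse=false in the curl example.
        params["force_reparse"] = "true"
    query = urlencode(params)
    return f"{BASE_URL}/process_pdf" + (f"?{query}" if query else "")


def describe_result(response):
    """Summarize a /process_pdf response dict using the documented fields."""
    if response.get("already_processed"):
        return f"skipped, parsed at {response.get('processed_at')}"
    return "processed"
```

Omitting both arguments leaves the server defaults in place, so the URL carries no query string at all.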

3. Ask Question (RAG)

Ask questions about processed documents using RAG-based question answering.

Endpoint: POST /ask_ocr

Authentication: Required

Request Body:

{
"question": "What was the revenue growth in 2024?",
"document_id": "550e8400-e29b-41d4-a716-446655440000"
}

Parameters:

  • question (required, string): Natural language question about the document
  • document_id (required, string): UUID of the knowledge base document to query

Request Example:

curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/ask_ocr" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"question": "What was the revenue growth in 2024?",
"document_id": "550e8400-e29b-41d4-a716-446655440000"
}'

Response:

{
"answer": "According to the document, the revenue growth in 2024 was 15.3%, increasing from $10.2M in 2023 to $11.8M in 2024."
}

Notes:

  • The document must be processed first using /process_pdf
  • The document_id is required for security validation
  • The system finds relevant content and generates contextual answers
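
Because a document has to be processed before it can be queried, a client may want to fall back to `/process_pdf` when the service reports an unprocessed document. The sketch below assumes a hypothetical transport `post(path, body)` that returns `(status_code, json_dict)`, assumes success is signalled with status 200 (not stated in this reference), and assumes `/ask_ocr` answers with the 404 "Please process it first." detail listed under Error Responses.

```python
def ask_document(post, document_id, question):
    """Ask a question, processing the document first if it is reported as unprocessed.

    `post(path, body)` is a hypothetical transport returning (status_code, json_dict).
    """
    body = {"question": question, "document_id": document_id}
    status, payload = post("/ask_ocr", body)
    # Matching on the detail text is an assumption and would break if the wording changes.
    if status == 404 and "process it first" in payload.get("detail", ""):
        status, payload = post("/process_pdf", {"document_id": document_id})
        if status != 200:
            raise RuntimeError(f"processing failed: {payload.get('detail')}")
        status, payload = post("/ask_ocr", body)
    if status != 200:
        raise RuntimeError(f"ask failed ({status}): {payload.get('detail')}")
    return payload["answer"]
```

Injecting the transport keeps the retry logic independent of any particular HTTP library.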

4. Get OCR Content

Retrieve the full OCR content organized by pages for a specific document.

Endpoint: POST /get_ocr_content

Authentication: Required (scope: agent:ocr)

Request Body:

{
"document_id": "550e8400-e29b-41d4-a716-446655440000"
}

Parameters:

  • document_id (required, string): UUID of the knowledge base document

Request Example:

curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/get_ocr_content" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"document_id": "550e8400-e29b-41d4-a716-446655440000"
}'

Response:

{
"document_id": "550e8400-e29b-41d4-a716-446655440000",
"company_name": "Acme Corporation",
"report_year": 2024,
"total_pages": 25,
"pages": [
{
"page_number": 1,
"text_chunks": [
{
"text": "Annual Report 2024..."
}
],
"tables": []
},
{
"page_number": 2,
"text_chunks": [
{
"text": "Financial highlights..."
}
],
"tables": [
{
"text": "Revenue $11.8M, Profit $2.3M...",
"structure": "| Metric | 2023 | 2024 |\n|--------|------|------|...",
"raw_data": ["Revenue metric 2023 $10.2M", "Revenue metric 2024 $11.8M"]
}
]
}
]
}

Notes:

  • Returns structured content organized by pages
  • Tables include both structured format and natural language representations
  • Raw markdown content is returned when available (from Mistral OCR)
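
To illustrate the page-oriented structure, the following Python sketch walks a `/get_ocr_content` response and collects text and tables per page. The helper names are hypothetical, and `sample` is an abbreviated copy of the response shown above.

```python
def page_texts(ocr_content):
    """Map page_number -> concatenated text of that page's text_chunks."""
    return {
        page["page_number"]: "\n".join(chunk["text"] for chunk in page.get("text_chunks", []))
        for page in ocr_content.get("pages", [])
    }


def tables_by_page(ocr_content):
    """Yield (page_number, table) for every table in the document."""
    for page in ocr_content.get("pages", []):
        for table in page.get("tables", []):
            yield page["page_number"], table


# Abbreviated copy of the documented response.
sample = {
    "document_id": "550e8400-e29b-41d4-a716-446655440000",
    "total_pages": 25,
    "pages": [
        {"page_number": 1, "text_chunks": [{"text": "Annual Report 2024..."}], "tables": []},
        {
            "page_number": 2,
            "text_chunks": [{"text": "Financial highlights..."}],
            "tables": [
                {
                    "text": "Revenue $11.8M, Profit $2.3M...",
                    "raw_data": ["Revenue metric 2023 $10.2M", "Revenue metric 2024 $11.8M"],
                }
            ],
        },
    ],
}

for number, text in page_texts(sample).items():
    print(number, text)
```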

Conversation Management

The OCR Agent supports multi-turn conversations with context retention.

5. Create Conversation

Create a new conversation session.

Endpoint: POST /conversations/create

Authentication: Required

Request Body:

{
"kb_document_id": "550e8400-e29b-41d4-a716-446655440000",
"title": "Financial Analysis Q&A"
}

Parameters:

  • kb_document_id (optional, string): UUID of the knowledge base document
  • title (optional, string): Custom title for the conversation

Request Example:

curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/conversations/create" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"kb_document_id": "550e8400-e29b-41d4-a716-446655440000",
"title": "Financial Analysis Q&A"
}'

Response:

{
"id": 789,
"user_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"document_id": "550e8400-e29b-41d4-a716-446655440000",
"title": "Financial Analysis Q&A",
"started_at": "2025-01-14T10:30:00",
"last_message_at": "2025-01-14T10:30:00"
}

Notes:

  • If kb_document_id is provided, the system finds the corresponding OCR document
  • Document ownership is verified before creating the conversation

6. Add Message to Conversation

Add a message (user or assistant) to an existing conversation.

Endpoint: POST /conversations/{conversation_id}/messages

Authentication: Required

Path Parameters:

  • conversation_id (required, integer): ID of the conversation

Request Body:

{
"conversation_id": 789,
"role": "user",
"content": "What was the revenue?"
}

Parameters:

  • conversation_id (required, integer): ID of the conversation
  • role (required, string): Either "user" or "assistant"
  • content (required, string): Message content

Request Example:

curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/conversations/789/messages" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"conversation_id": 789,
"role": "user",
"content": "What was the revenue?"
}'

Response:

{
"id": "a1b2c3d4-e5f6-5678-90ab-cdef12345678",
"conversation_id": 789,
"role": "user",
"content": "What was the revenue?",
"timestamp": "2025-01-14T10:31:00"
}

Notes:

  • role must be either "user" or "assistant"
  • The conversation must belong to the authenticated user
  • Message id is returned as a UUID for security
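
The role constraint can be checked on the client before a request is made. A minimal sketch, with a hypothetical helper name, that returns the path and body for this endpoint:

```python
ALLOWED_ROLES = {"user", "assistant"}


def message_request(conversation_id, role, content):
    """Return (path, body) for adding a message to a conversation."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"role must be 'user' or 'assistant', got {role!r}")
    if not content:
        raise ValueError("content is required")
    path = f"/conversations/{conversation_id}/messages"
    # The documented example repeats conversation_id in the body as well as in the path.
    body = {"conversation_id": conversation_id, "role": role, "content": content}
    return path, body


path, body = message_request(789, "user", "What was the revenue?")
print(path, body)
```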

7. Get Conversation History

Retrieve all conversations for the authenticated user.

Endpoint: GET /conversations/history

Authentication: Required

Query Parameters:

  • limit (optional, integer, default: 100): Maximum number of conversations to return

Request Example:

curl -X GET "https://nextneural-api.superteams.ai/api/agents/ocr_agent/conversations/history?limit=50" \
-H "Authorization: Bearer YOUR_TOKEN"

Response:

[
{
"id": 789,
"fileName": "Acme Corporation 2024",
"analyzedAt": "2025-01-14T10:31:00",
"duration": "5 messages",
"documentId": "550e8400-e29b-41d4-a716-446655440000"
},
{
"id": 788,
"fileName": "Q4 Report",
"analyzedAt": "2025-01-13T15:20:00",
"duration": "3 messages",
"documentId": "660e8400-e29b-41d4-a716-446655440001"
}
]

Notes:

  • Returns conversations in reverse chronological order (newest first)
  • fileName is derived from KB document title, company name, or conversation title
  • Only the authenticated user's own conversations are returned
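
The history entries can be turned into typed rows on the client. The sketch below is illustrative: `summarize_history` is a hypothetical name, and reading the message count from the leading integer of `duration` is an assumption based on the "5 messages" format in the example response.

```python
from datetime import datetime


def summarize_history(entries):
    """Convert history entries into (id, fileName, analyzed_at, message_count) tuples."""
    rows = []
    for entry in entries:
        analyzed_at = datetime.fromisoformat(entry["analyzedAt"])
        # "duration" looks like "5 messages" in the example; take the leading integer.
        count = int(entry["duration"].split()[0])
        rows.append((entry["id"], entry["fileName"], analyzed_at, count))
    return rows
```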

8. Get Specific Conversation

Retrieve a specific conversation with all its messages.

Endpoint: GET /conversations/{conversation_id}

Authentication: Required

Path Parameters:

  • conversation_id (required, integer): ID of the conversation to retrieve

Request Example:

curl -X GET "https://nextneural-api.superteams.ai/api/agents/ocr_agent/conversations/789" \
-H "Authorization: Bearer YOUR_TOKEN"

Response:

{
"id": 789,
"user_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"document_id": "550e8400-e29b-41d4-a716-446655440000",
"title": "Financial Analysis Q&A",
"started_at": "2025-01-14T10:30:00",
"last_message_at": "2025-01-14T10:35:00",
"document": {
"document_id": "550e8400-e29b-41d4-a716-446655440000",
"company_name": "Acme Corporation",
"report_year": 2024
},
"messages": [
{
"id": "a1b2c3d4-e5f6-5678-90ab-cdef12345678",
"role": "user",
"content": "What was the revenue?",
"timestamp": "2025-01-14T10:31:00"
},
{
"id": "b2c3d4e5-f6a7-6789-01bc-def234567890",
"role": "assistant",
"content": "The revenue in 2024 was $11.8M.",
"timestamp": "2025-01-14T10:31:15"
}
]
}

Notes:

  • Only the conversation owner can access it
  • Returns 404 if the conversation doesn't exist or access is denied
  • Messages are ordered chronologically
  • Message IDs are returned as UUIDs for security
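
A conversation response can be rendered as a plain transcript. In this sketch the helper name is hypothetical, and the sort is purely defensive since the messages are documented as already chronological.

```python
def format_transcript(conversation):
    """Render a conversation's messages as 'role: content' lines, oldest first."""
    # Timestamps in this fixed ISO 8601 format sort correctly as plain strings.
    messages = sorted(conversation.get("messages", []), key=lambda m: m["timestamp"])
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)
```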

9. Delete Conversation

Delete a conversation and all its messages.

Endpoint: DELETE /conversations/{conversation_id}

Authentication: Required

Path Parameters:

  • conversation_id (required, integer): ID of the conversation to delete

Request Example:

curl -X DELETE "https://nextneural-api.superteams.ai/api/agents/ocr_agent/conversations/789" \
-H "Authorization: Bearer YOUR_TOKEN"

Response:

{
"success": true,
"message": "Conversation deleted successfully"
}

Notes:

  • Only the conversation owner can delete it
  • All messages are cascade deleted
  • Returns 404 if the conversation doesn't exist or access is denied

Error Responses

All endpoints may return the following error responses:

400 Bad Request:

{
"detail": "document_id is required for security validation"
}

403 Forbidden:

{
"detail": "Document does not belong to this user"
}

403 Forbidden (Invalid Document):

{
"detail": "Document with UUID 550e8400-e29b-41d4-a716-446655440000 does not exist"
}

404 Not Found:

{
"detail": "Document not found in OCR database. Please process it first."
}

404 Not Found (File):

{
"detail": "Document file not found in knowledgebase"
}

500 Internal Server Error:

{
"detail": "Error message describing the issue"
}
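
One way a client might surface these responses is to map them onto exceptions. The sketch below is an illustration under assumptions: the class and function names are hypothetical, and telling the "not yet processed" case apart by its `detail` text is brittle, since the wording is only known from the example above.

```python
class OcrAgentError(Exception):
    """Any error response from the OCR Agent, carrying its status and detail."""

    def __init__(self, status, detail):
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail


class NotProcessedError(OcrAgentError):
    """The document has not been through /process_pdf yet."""


def raise_for_error(status, payload):
    """Return payload for non-error statuses; raise a typed exception otherwise."""
    if status < 400:
        return payload
    detail = payload.get("detail", "") if isinstance(payload, dict) else str(payload)
    # Assumed marker text, copied from the documented 404 example.
    if status == 404 and "process it first" in detail:
        raise NotProcessedError(status, detail)
    raise OcrAgentError(status, detail)
```

Callers can then catch `NotProcessedError` to trigger processing and treat every other `OcrAgentError` as a failure.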

Workflow Example

# 1. Process a PDF document (file is fetched via document_id)
curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/process_pdf" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"document_id": "550e8400-e29b-41d4-a716-446655440000"}'

# 2. Create a conversation
curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/conversations/create" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"kb_document_id": "550e8400-e29b-41d4-a716-446655440000", "title": "Report Analysis"}'

# 3. Ask questions
curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/ask_ocr" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"question": "What are the key findings?", "document_id": "550e8400-e29b-41d4-a716-446655440000"}'

# 4. Get full OCR content if needed
curl -X POST "https://nextneural-api.superteams.ai/api/agents/ocr_agent/get_ocr_content" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"document_id": "550e8400-e29b-41d4-a716-446655440000"}'
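
The same sequence can be sketched in Python. Here `post(path, body)` is a hypothetical transport that returns the decoded JSON response. Whether `/ask_ocr` records messages in a conversation on its own is not stated in this reference, so the sketch assumes it does not and logs the exchange through the messages endpoint.

```python
def run_workflow(post, document_id, question, title="Report Analysis"):
    """Process a document, open a conversation, ask one question, and log the exchange.

    `post(path, body)` is a hypothetical transport returning the decoded JSON response.
    """
    post("/process_pdf", {"document_id": document_id})
    conversation = post("/conversations/create", {"kb_document_id": document_id, "title": title})
    answer = post("/ask_ocr", {"question": question, "document_id": document_id})["answer"]

    # Assumption: the question and answer are not stored automatically, so add both.
    messages_path = f"/conversations/{conversation['id']}/messages"
    for role, content in (("user", question), ("assistant", answer)):
        post(messages_path, {"conversation_id": conversation["id"], "role": role, "content": content})
    return conversation["id"], answer
```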

Security Features

  • User Isolation: All queries are private to your account
  • Document Ownership: Only you can access your documents
  • Authentication: All endpoints except the health check require a valid API token
  • UUID-based Identification: Secure, non-sequential identifiers for documents and messages
  • No Internal ID Exposure: Internal database IDs are never exposed in API responses
  • No File Path Exposure: File paths are never exposed in API requests or responses