Advanced Video KYC Agent
The Advanced Video KYC Agent provides comprehensive identity verification with built-in fraud detection capabilities. It processes video files containing Aadhaar card information, extracts and validates identity details from both audio and visual components, and performs advanced security checks including lip-sync analysis, speaker detection, and background noise profiling.
Base URL​
/api/agents/advance_video_kyc
Authentication​
All endpoints except the health check require authentication. Sign up at https://cloud.nextneural.ai to get your API key.
How It Works​
The Advanced Video KYC Agent performs multi-layered identity verification:
- Audio Processing: Extracts audio from video, transcribes speech, translates to English, and extracts Aadhaar details
- Visual Processing: Detects Aadhaar card in video frames and extracts text using OCR
- Fraud Detection: Runs three parallel security checks:
  - Background Noise Analysis: Detects loud or unstable audio environments
  - Speaker Count Detection: Identifies multiple speakers (red flag for fraud)
  - Lip Sync Verification: Detects deepfakes and dubbed audio
- Cross-Validation: Compares audio and visual data with fraud penalties to calculate final verification score
- Background Processing: Uses Celery workers for scalable, persistent task processing
Key Differences from Basic Video KYC​
| Feature | Basic Video KYC | Advanced Video KYC |
|---|---|---|
| Fraud Detection | None | Full (Lip Sync, Speaker Count, Noise) |
| Processing | Synchronous | Async (Celery Workers) |
| Score Breakdown | Simple | Detailed with categories |
| Deepfake Detection | No | Yes (Lip Sync Analysis) |
| Multi-Speaker Detection | No | Yes (Replicate AI) |
| Task Persistence | No | Yes (Redis Queue) |
Endpoints​
1. Health Check​
Check if the Advanced Video KYC agent service is running.
Endpoint: GET /health
Authentication: None required
Response:
{
"status": "healthy",
"service": "Advanced Video KYC Agent"
}
2. Process KYC Video​
Process a video file for KYC verification with fraud detection. Returns immediately with "PROCESSING" status and dispatches work to background Celery workers.
Endpoint: POST /advance_video_kyc
Authentication: Required (scope: agent:advance_video_kyc)
Request Body:
{
"document_id": "550e8400-e29b-41d4-a716-446655440000",
"force_reprocess": false
}
Parameters:
- document_id (required, string): UUID of the knowledge base document containing the video file
- force_reprocess (optional, default: false): Re-process video even if already analyzed
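A client may want to validate the identifier before sending the request. The following is a minimal Python sketch, not part of the API: `build_kyc_request` is a hypothetical helper name, and it assumes `document_id` must parse as a standard UUID, as the parameter description suggests.

```python
import json
import uuid


def build_kyc_request(document_id, force_reprocess=False):
    """Return the JSON body for POST /advance_video_kyc.

    Hypothetical client-side helper: rejects values that are not UUIDs
    before any request is sent.
    """
    # uuid.UUID raises ValueError for strings that are not valid UUIDs.
    parsed = uuid.UUID(document_id)
    payload = {
        "document_id": str(parsed),
        "force_reprocess": bool(force_reprocess),
    }
    return json.dumps(payload)


print(build_kyc_request("550e8400-e29b-41d4-a716-446655440000"))
```

Validating locally avoids a round trip for malformed identifiers; ownership and existence of the document are still checked by the server.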
Request Example:
curl -X POST "https://nextneural-api.superteams.ai/api/agents/advance_video_kyc/advance_video_kyc" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"document_id": "550e8400-e29b-41d4-a716-446655440000",
"force_reprocess": false
}'
Response (Processing Started):
{
"status": "PROCESSING",
"message": "Analysis queued in background worker",
"kbDocumentId": "550e8400-e29b-41d4-a716-446655440000",
"taskId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
Response (Cached Result):
{
"status": "COMPLETED",
"record": {
"kbDocumentId": "550e8400-e29b-41d4-a716-446655440000",
"audioTranscript": "My name is Rajesh Kumar Singh...",
"parsedAudioData": { ... },
"parsedVisualData": { ... },
"fraudDetection": { ... },
"verificationScore": 85.0,
"scoreBreakdown": [ ... ],
"alreadyEvaluated": true
}
}
Response (Already Processing):
{
"status": "PROCESSING",
"message": "Analysis is already in progress",
"kbDocumentId": "550e8400-e29b-41d4-a716-446655440000"
}
Notes:
- Returns immediately with PROCESSING status; use the polling endpoint to check completion
- Tasks persist in the Redis queue even if the backend restarts
- Automatic retry on failure (2 retries with exponential backoff)
- Worker pool manages concurrency limits
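The three response shapes above can be handled with one small dispatch on `status`. This is an illustrative Python sketch; the function name `next_action` and its return labels are hypothetical and chosen only for this example.

```python
def next_action(response):
    """Decide what a client should do after POST /advance_video_kyc.

    `response` is the parsed JSON body. The returned labels are
    illustrative names, not values defined by the API.
    """
    status = response.get("status")
    if status == "COMPLETED":
        # Cached result: the full record is already in the response body.
        return "use_record"
    if status == "PROCESSING":
        # Covers both "queued" and "already in progress" responses.
        return "poll_status"
    # Anything else (for example an error body with "detail") needs handling.
    return "handle_error"


print(next_action({"status": "PROCESSING", "taskId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"}))
```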
3. Check Processing Status​
Poll this endpoint to check if background analysis is complete.
Endpoint: GET /check_status/{document_id}
Authentication: Required (scope: agent:advance_video_kyc)
Path Parameters:
- document_id (required): The UUID of the knowledge base document
Request Example:
curl -X GET "https://nextneural-api.superteams.ai/api/agents/advance_video_kyc/check_status/550e8400-e29b-41d4-a716-446655440000" \
-H "Authorization: Bearer YOUR_TOKEN"
Response (Processing):
{
"status": "PROCESSING",
"updated_at": "2025-01-14T10:30:00"
}
Response (Completed):
{
"status": "COMPLETED",
"record": {
"kbDocumentId": "550e8400-e29b-41d4-a716-446655440000",
"audioTranscript": "My name is Rajesh Kumar Singh. My date of birth is 15th August 1990. My Aadhaar number is 1234 5678 9012.",
"parsedAudioData": {
"name": "Rajesh Kumar Singh",
"dob": "15/08/1990",
"aadharId": "1234 5678 9012"
},
"parsedVisualData": {
"name": "RAJESH KUMAR SINGH",
"dob": "15/08/1990",
"aadharId": "1234 5678 9012",
"gender": "Male",
"confidence": 92.0
},
"fraudDetection": {
"backgroundNoiseDetected": false,
"backgroundNoiseType": null,
"speakerCount": 1,
"lipSyncFraudDetected": false
},
"verificationScore": 100.0,
"scoreBreakdown": [
{
"category": "Name Match",
"description": "Audio and visual names match exactly",
"impact": "+40",
"type": "positive"
},
{
"category": "Date of Birth Match",
"description": "Audio and visual DOB match exactly",
"impact": "+20",
"type": "positive"
},
{
"category": "Aadhaar Number Match",
"description": "Audio and visual Aadhaar numbers match exactly",
"impact": "+40",
"type": "positive"
},
{
"category": "Background Noise",
"description": "Clean audio environment detected",
"impact": "0",
"type": "positive"
},
{
"category": "Speaker Count",
"description": "Single speaker detected (valid)",
"impact": "0",
"type": "positive"
},
{
"category": "Lip Sync Integrity",
"description": "Lip sync verification passed",
"impact": "0",
"type": "positive"
}
],
"detectionConfidence": 92.0,
"warning": null,
"alreadyEvaluated": false,
"evaluatedAt": "2025-01-14T10:30:00",
"isReevaluation": false,
"createdAt": "2025-01-14T10:30:00",
"updatedAt": "2025-01-14T10:30:00"
}
}
Response (Failed):
{
"status": "FAILED",
"error": "Error processing video: No face detected in video"
}
Response (Not Found):
{
"status": "NOT_FOUND"
}
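A polling loop over these four statuses might look like the Python sketch below. It is a sketch under assumptions: `wait_for_result` is a hypothetical helper, `fetch_status` stands for whatever function performs the GET request and returns the parsed JSON, NOT_FOUND is treated as final, and the default timeout of 600 seconds is a choice made for this example rather than an API limit.

```python
import time

# Statuses after which further polling is pointless (assumption: NOT_FOUND
# will not turn into another status by itself).
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "NOT_FOUND"}


def wait_for_result(fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_status() until a terminal status or the timeout is reached.

    fetch_status: zero-argument callable returning the parsed JSON body of
    GET /check_status/{document_id}. It is injected so the loop can be
    exercised without network access.
    """
    waited = 0.0
    while True:
        body = fetch_status()
        if body.get("status") in TERMINAL_STATUSES:
            return body
        if waited >= timeout:
            raise TimeoutError(
                "status still %r after %.0f s" % (body.get("status"), waited)
            )
        sleep(interval)
        waited += interval


# Usage with canned responses instead of real HTTP calls.
canned = iter([{"status": "PROCESSING"}, {"status": "COMPLETED", "record": {}}])
print(wait_for_result(lambda: next(canned), sleep=lambda seconds: None)["status"])
```

Passing the fetch and sleep functions as arguments keeps the loop independent of any particular HTTP library.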
4. Get My KYC Records​
Retrieve all KYC records for the authenticated user.
Endpoint: GET /my_kyc_records
Authentication: Required (scope: agent:advance_video_kyc)
Query Parameters:
- limit (optional, default: 100): Maximum number of records to return
Request Example:
curl -X GET "https://nextneural-api.superteams.ai/api/agents/advance_video_kyc/my_kyc_records?limit=50" \
-H "Authorization: Bearer YOUR_TOKEN"
Response:
{
"success": true,
"userId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"count": 3,
"records": [
{
"id": 789,
"analyzedAt": "2025-01-14T10:30:00",
"verificationScore": 100.0,
"name": "Rajesh Kumar Singh",
"status": "COMPLETED"
},
{
"id": 788,
"analyzedAt": "2025-01-13T15:20:00",
"verificationScore": 60.0,
"name": "Priya Sharma",
"status": "COMPLETED"
},
{
"id": 787,
"analyzedAt": "2025-01-12T09:45:00",
"verificationScore": 0,
"name": "Unknown",
"status": "PROCESSING"
}
]
}
Notes:
- Returns records in reverse chronological order (newest first)
- The status field indicates the current processing state (PENDING, PROCESSING, COMPLETED, FAILED)
- Use status to show appropriate UI (spinner for PROCESSING, checkmark for COMPLETED)
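The status-to-UI mapping suggested above could be expressed as a lookup table. In this Python sketch the spinner and checkmark entries come from the note above; the PENDING and FAILED entries and the fallback value are assumptions made for illustration.

```python
# Indicator per processing state. Only PROCESSING and COMPLETED are taken
# from the notes; the other entries are assumptions for this sketch.
UI_INDICATORS = {
    "PENDING": "spinner",
    "PROCESSING": "spinner",
    "COMPLETED": "checkmark",
    "FAILED": "error",
}


def indicator_for(record):
    """Return the UI indicator name for one entry of the `records` list."""
    return UI_INDICATORS.get(record.get("status"), "unknown")


records = [
    {"id": 789, "status": "COMPLETED"},
    {"id": 787, "status": "PROCESSING"},
]
for record in records:
    print(record["id"], indicator_for(record))
```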
5. Get Specific KYC Record​
Retrieve detailed information for a specific KYC record.
Endpoint: GET /kyc_record/{record_id}
Authentication: Required (scope: agent:advance_video_kyc)
Path Parameters:
- record_id (required): The ID of the KYC record to retrieve
Request Example:
curl -X GET "https://nextneural-api.superteams.ai/api/agents/advance_video_kyc/kyc_record/789" \
-H "Authorization: Bearer YOUR_TOKEN"
Response:
{
"success": true,
"record": {
"kbDocumentId": "550e8400-e29b-41d4-a716-446655440000",
"audioTranscript": "My name is Rajesh Kumar Singh...",
"parsedAudioData": {
"name": "Rajesh Kumar Singh",
"dob": "15/08/1990",
"aadharId": "1234 5678 9012"
},
"parsedVisualData": {
"name": "RAJESH KUMAR SINGH",
"dob": "15/08/1990",
"aadharId": "1234 5678 9012",
"gender": "Male",
"confidence": 92.0
},
"fraudDetection": {
"backgroundNoiseDetected": false,
"backgroundNoiseType": null,
"speakerCount": 1,
"lipSyncFraudDetected": false
},
"verificationScore": 100.0,
"scoreBreakdown": [ ... ],
"detectionConfidence": 92.0,
"warning": null,
"alreadyEvaluated": false,
"evaluatedAt": "2025-01-14T10:30:00",
"isReevaluation": false,
"createdAt": "2025-01-14T10:30:00",
"updatedAt": "2025-01-14T10:30:00"
}
}
Data Models​
Parsed Audio Data Structure​
{
"name": "Full Name",
"dob": "DD/MM/YYYY",
"aadharId": "XXXX XXXX XXXX"
}
Parsed Visual Data Structure​
{
"name": "FULL NAME",
"dob": "DD/MM/YYYY",
"aadharId": "XXXX XXXX XXXX",
"gender": "Male/Female",
"confidence": 92.0
}
Fraud Detection Structure​
{
"backgroundNoiseDetected": true,
"backgroundNoiseType": "Loud Background: Room noise is -35.5dB",
"speakerCount": 1,
"lipSyncFraudDetected": false
}
Score Breakdown Item Structure​
{
"category": "Name Match",
"description": "Audio and visual names match exactly",
"impact": "+40",
"type": "positive"
}
Type Values:
- positive: Check passed, contributes positively to score
- warning: Partial match or minor issue
- negative: Check failed, may reduce score
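Because `impact` is a signed string, a client that wants to total or filter the breakdown has to parse it first. The Python sketch below assumes every `impact` value is an integer string such as "+40", "-20", or "0", as in the examples in this document; the helper names are hypothetical.

```python
def total_impact(breakdown):
    """Sum the signed "impact" strings of a scoreBreakdown list."""
    # int() accepts an explicit sign, so "+40" parses to 40 and "-20" to -20.
    return sum(int(item["impact"]) for item in breakdown)


def items_of_type(breakdown, type_value):
    """Return the categories whose "type" equals type_value."""
    return [item["category"] for item in breakdown if item["type"] == type_value]


breakdown = [
    {"category": "Name Match", "impact": "+40", "type": "positive"},
    {"category": "Date of Birth Match", "impact": "+20", "type": "positive"},
    {"category": "Aadhaar Number Match", "impact": "+40", "type": "positive"},
    {"category": "Speaker Count", "impact": "-20", "type": "negative"},
]
print(total_impact(breakdown), items_of_type(breakdown, "negative"))
```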
Full Record Structure​
{
"kbDocumentId": "550e8400-e29b-41d4-a716-446655440000",
"audioTranscript": "Transcribed and translated speech...",
"parsedAudioData": {
"name": "Full Name",
"dob": "DD/MM/YYYY",
"aadharId": "XXXX XXXX XXXX"
},
"parsedVisualData": {
"name": "FULL NAME",
"dob": "DD/MM/YYYY",
"aadharId": "XXXX XXXX XXXX",
"gender": "Male/Female",
"confidence": 92.0
},
"fraudDetection": {
"backgroundNoiseDetected": false,
"backgroundNoiseType": null,
"speakerCount": 1,
"lipSyncFraudDetected": false
},
"verificationScore": 100.0,
"scoreBreakdown": [ ... ],
"detectionConfidence": 92.0,
"warning": null,
"alreadyEvaluated": false,
"evaluatedAt": "2025-01-14T10:30:00",
"isReevaluation": false,
"createdAt": "2025-01-14T10:30:00",
"updatedAt": "2025-01-14T10:30:00"
}
Verification Score Calculation​
The verification score (0-100) adds points for each field that matches between the audio and visual data, then subtracts penalties for each fraud indicator:
Positive Score Components (Max: 100 points)​
| Comparison | Points | Condition |
|---|---|---|
| Name - Exact Match | +40 | Audio name exactly matches visual name |
| Name - Partial Match | +20 | Some name parts match between audio and visual |
| DOB - Exact Match | +20 | Date of birth matches exactly |
| Aadhaar - Exact Match | +40 | Aadhaar number matches exactly |
Fraud Detection Penalties​
| Check | Penalty | Condition |
|---|---|---|
| Background Noise | -10 | High or unstable background noise detected |
| No Speakers | -40 | No speakers detected in audio |
| Multiple Speakers | -20 | More than 1 speaker detected |
| Lip Sync Fraud | -30 | Lip movements don't match audio (deepfake indicator) |
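The two tables can be combined into a single calculation. The Python sketch below uses the point values listed above; the function name, its parameters, and the clamping of the result to the 0-100 range are assumptions, since the document does not say how totals below zero are handled.

```python
def verification_score(name_match, dob_match, aadhaar_match,
                       noise_detected, speaker_count, lip_sync_fraud):
    """Combine match points and fraud penalties into one score.

    name_match is "exact", "partial", or anything else for no match.
    """
    # Positive components.
    score = {"exact": 40, "partial": 20}.get(name_match, 0)
    score += 20 if dob_match else 0
    score += 40 if aadhaar_match else 0

    # Fraud penalties.
    if noise_detected:
        score -= 10
    if speaker_count == 0:
        score -= 40
    elif speaker_count > 1:
        score -= 20
    if lip_sync_fraud:
        score -= 30

    # Assumption: the service keeps the result within 0-100.
    return float(max(0, min(100, score)))


# Partial name match, other fields exact, two speakers: 20 + 20 + 40 - 20.
print(verification_score("partial", True, True, False, 2, False))
```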
Score Interpretation​
| Score Range | Interpretation | Recommended Action |
|---|---|---|
| 90-100 | Excellent | Auto-approve |
| 70-89 | Good | Accept with minor review |
| 50-69 | Moderate | Manual review required |
| 30-49 | Poor | Request new video |
| 0-29 | Failed / Fraud Risk | Reject, investigate if fraud flags present |
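The interpretation table maps directly to a threshold lookup. In this Python sketch the function name is hypothetical, and fractional scores such as 89.5 are assigned to the lower band, which is an assumption because the table lists whole-number ranges only.

```python
# (lower bound, interpretation, recommended action), highest band first.
SCORE_BANDS = [
    (90, "Excellent", "Auto-approve"),
    (70, "Good", "Accept with minor review"),
    (50, "Moderate", "Manual review required"),
    (30, "Poor", "Request new video"),
    (0, "Failed / Fraud Risk", "Reject, investigate if fraud flags present"),
]


def interpret_score(score):
    """Return (interpretation, recommended_action) for a 0-100 score."""
    for lower_bound, interpretation, action in SCORE_BANDS:
        if score >= lower_bound:
            return interpretation, action
    raise ValueError("score must be between 0 and 100")


print(interpret_score(85.0))
```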
Fraud Detection Details​
1. Background Noise Analysis​
Analyzes audio quality using librosa for spectral analysis.
Detection Criteria:
- Noise Floor Limit: -42.0 dB (a noise floor below this level is considered clean)
- Stability Limit: 8.0 dB fluctuation (larger variation indicates an unstable environment)
Detected Issues:
- Loud background (room noise exceeds threshold)
- Unstable background (noise fluctuates significantly)
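The two thresholds can be read as a simple classification rule. The Python sketch below takes an already-measured noise floor and fluctuation as inputs; how the service derives those numbers with librosa is not described here, so no measurement code is shown. The strict comparisons and the wording of the "Unstable Background" message are assumptions; the "Loud Background" format follows the example in the Fraud Detection Structure.

```python
NOISE_FLOOR_LIMIT_DB = -42.0   # from the detection criteria above
STABILITY_LIMIT_DB = 8.0       # from the detection criteria above


def classify_noise(noise_floor_db, fluctuation_db):
    """Return (backgroundNoiseDetected, backgroundNoiseType)."""
    if noise_floor_db > NOISE_FLOOR_LIMIT_DB:
        return True, "Loud Background: Room noise is %.1fdB" % noise_floor_db
    if fluctuation_db > STABILITY_LIMIT_DB:
        # Hypothetical message text; the real wording is not documented.
        return True, "Unstable Background: Noise varies by %.1fdB" % fluctuation_db
    return False, None


print(classify_noise(-35.5, 2.0))
print(classify_noise(-50.0, 3.0))
```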
2. Speaker Count Detection​
Uses Replicate's whisper-diarization model to detect number of distinct speakers.
Red Flags:
- 0 speakers: No valid speech detected
- >1 speakers: Multiple people speaking (potential coaching/prompting)
Valid Scenario:
- Exactly 1 speaker detected
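Counting speakers from diarization output reduces to collecting distinct speaker labels. The Python sketch below assumes each segment is a dictionary with a `speaker` label and a `text` field; that shape is an assumption for illustration and may differ from what the diarization model actually returns.

```python
def count_speakers(segments):
    """Count distinct speaker labels among segments that contain speech."""
    return len({
        segment["speaker"]
        for segment in segments
        if segment.get("text", "").strip()
    })


def is_valid_speaker_count(count):
    """Exactly one speaker is the only valid scenario."""
    return count == 1


segments = [
    {"speaker": "SPEAKER_00", "text": "My name is Rajesh Kumar Singh."},
    {"speaker": "SPEAKER_01", "text": "Now say your number."},
    {"speaker": "SPEAKER_00", "text": "1234 5678 9012."},
]
count = count_speakers(segments)
print(count, is_valid_speaker_count(count))
```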
3. Lip Sync Verification (Deepfake Detection)​
Advanced analysis using MediaPipe face mesh and audio correlation.
Detection Methods:
- Ghost Speaker Detection: Audio is loud but mouth is closed (>35% mismatch)
- Correlation Analysis: Mouth movements don't correlate with audio (score < 0.20)
- Dubbing Detection: Audio lags the video by more than 100 ms
Detected Issues:
- Lip sync mismatch (potential deepfake)
- Dubbed audio (voice recorded separately)
- Ghost speaker (audio playing without corresponding mouth movement)
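The ghost-speaker and correlation checks can be illustrated on two per-frame series: how open the mouth is and how loud the audio is. The Python sketch below is a simplified illustration, not the service's implementation. It assumes both series are normalised to 0-1, reads ">35% mismatch" as the share of loud frames in which the mouth is closed, and uses hypothetical `loud` and `closed` cut-offs. Dubbing (delay) detection is left out.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is flat)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def ghost_speaker_ratio(mouth_open, audio_level, loud=0.5, closed=0.1):
    """Share of loud frames in which the mouth is closed."""
    loud_frames = [m for m, a in zip(mouth_open, audio_level) if a >= loud]
    if not loud_frames:
        return 0.0
    return sum(1 for m in loud_frames if m <= closed) / len(loud_frames)


def lip_sync_fraud(mouth_open, audio_level):
    """Flag fraud using the 35% mismatch and 0.20 correlation thresholds."""
    return (ghost_speaker_ratio(mouth_open, audio_level) > 0.35
            or pearson(mouth_open, audio_level) < 0.20)


mouth = [0.0, 0.8, 0.1, 0.9, 0.0, 0.7]
audio = [0.0, 0.9, 0.1, 0.8, 0.05, 0.9]
print(lip_sync_fraud(mouth, audio))
```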
Processing States​
| Status | Description |
|---|---|
| PENDING | Record created, waiting for processing to start |
| PROCESSING | Video is being analyzed by Celery worker |
| COMPLETED | Processing finished successfully |
| FAILED | Processing failed (see warning_message for details) |
Error Responses​
All endpoints may return the following error responses:
400 Bad Request:
{
"detail": "document_id is required for security validation"
}
403 Forbidden (Document Access):
{
"detail": "Access denied. Document 550e8400-e29b-41d4-a716-446655440000 does not belong to user a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
403 Forbidden (Invalid Document):
{
"detail": "Document with UUID 550e8400-e29b-41d4-a716-446655440000 does not exist in knowledgebase"
}
404 Not Found:
{
"detail": "File path not found for KB document."
}
404 Not Found (Record):
{
"detail": "KYC record not found or you don't have access to it"
}
500 Internal Server Error:
{
"detail": "Error initiating processing: [error message]"
}
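Client code often groups these responses by HTTP status before showing the `detail` text. The Python sketch below is one possible grouping; the category names and the choice to mark only 500 as retryable are assumptions made for this example.

```python
# HTTP status -> (category, retryable). Labels are illustrative only.
ERROR_CATEGORIES = {
    400: ("invalid_request", False),
    403: ("access_denied", False),
    404: ("not_found", False),
    500: ("server_error", True),
}


def classify_error(status_code, body):
    """Summarise an error response for logging or display."""
    category, retryable = ERROR_CATEGORIES.get(status_code, ("unexpected", False))
    return {
        "category": category,
        "retryable": retryable,
        "detail": body.get("detail", ""),
    }


print(classify_error(404, {"detail": "File path not found for KB document."}))
```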
Warning Messages​
The system returns prioritized warning messages:
Critical Warnings (Fraud Detected)
- "CRITICAL: Lip sync mismatch detected (Potential Deepfake/Dubbing)"
  - Highest priority, indicates possible video manipulation
Security Warnings
- "WARNING: Multiple speakers detected in audio"
  - Multiple people detected, possible coaching
- "WARNING: High background noise detected ({noise_type})"
  - Audio quality compromised
Data Quality Warnings
- "No Aadhaar card detected in video"
  - OCR couldn't find Aadhaar card in any frame
- "Incomplete data extracted from video. Please upload a clearer video"
  - Some fields (name, DOB, or Aadhaar) missing from extracted data
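The prioritisation can be pictured as an ordered chain of checks that returns the first message that applies. This Python sketch assumes a single highest-priority message is returned and that the order within each group follows the listing above; the function and parameter names are hypothetical.

```python
def select_warning(fraud, aadhaar_detected=True, data_complete=True):
    """Return the highest-priority warning message, or None.

    `fraud` uses the field names of the fraudDetection object.
    """
    if fraud.get("lipSyncFraudDetected"):
        return "CRITICAL: Lip sync mismatch detected (Potential Deepfake/Dubbing)"
    if fraud.get("speakerCount", 1) > 1:
        return "WARNING: Multiple speakers detected in audio"
    if fraud.get("backgroundNoiseDetected"):
        return "WARNING: High background noise detected (%s)" % fraud.get("backgroundNoiseType")
    if not aadhaar_detected:
        return "No Aadhaar card detected in video"
    if not data_complete:
        return "Incomplete data extracted from video. Please upload a clearer video"
    return None


print(select_warning({"lipSyncFraudDetected": False, "speakerCount": 2,
                      "backgroundNoiseDetected": True,
                      "backgroundNoiseType": "Loud Background: Room noise is -35.5dB"}))
```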
Best Practices​
Video Quality Requirements​
- Resolution: Minimum 720p recommended
- Duration: 10-30 seconds optimal
- Lighting: Good, even lighting on the Aadhaar card
- Focus: Card should be clearly visible and in focus
- Audio: Clear speech, minimal background noise
- Content: User should speak name, DOB, and Aadhaar number clearly
- Single Speaker: Only the applicant should speak in the video
Recording Guidelines​
- Record in a quiet environment
- Hold the Aadhaar card steady in frame for at least 3-5 seconds
- Ensure card is flat and not tilted
- Avoid glare or reflections on the card
- Speak clearly and at moderate pace
- Pronounce Aadhaar number digit by digit
- Use proper date format (day, month, year)
- Ensure only one person speaks throughout
Integration Workflow​
# 1. Upload video to knowledge base (via your file upload system)
# This creates a KB document with UUID
# 2. Start KYC processing (returns immediately)
curl -X POST "https://nextneural-api.superteams.ai/api/agents/advance_video_kyc/advance_video_kyc" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"document_id": "550e8400-e29b-41d4-a716-446655440000"
}'
# 3. Poll for status (recommended: every 5 seconds); give up after 120 attempts (10 minutes)
for attempt in $(seq 1 120); do
  STATUS=$(curl -s -X GET "https://nextneural-api.superteams.ai/api/agents/advance_video_kyc/check_status/550e8400-e29b-41d4-a716-446655440000" \
    -H "Authorization: Bearer YOUR_TOKEN" | jq -r '.status')
  # COMPLETED, FAILED and NOT_FOUND are final; PROCESSING means keep waiting
  if [ "$STATUS" = "COMPLETED" ] || [ "$STATUS" = "FAILED" ] || [ "$STATUS" = "NOT_FOUND" ]; then
    break
  fi
  sleep 5
done
# 4. Get full record details (the record ID, 789 here, is the "id" field returned by /my_kyc_records)
curl -X GET "https://nextneural-api.superteams.ai/api/agents/advance_video_kyc/kyc_record/789" \
-H "Authorization: Bearer YOUR_TOKEN"
# 5. Decision logic based on score and fraud flags (apply the first rule that matches):
# - Score >= 70 AND no fraud flags → Approve
# - Score >= 50 AND no critical fraud (lip sync) → Manual Review
# - Critical fraud flag present OR score < 50 → Reject/Investigate
# 6. Re-process if needed (e.g., improved AI model)
curl -X POST "https://nextneural-api.superteams.ai/api/agents/advance_video_kyc/advance_video_kyc" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"document_id": "550e8400-e29b-41d4-a716-446655440000",
"force_reprocess": true
}'
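The decision rules in step 5 can be written as a small function that applies them in order, so the first matching rule wins. This Python sketch treats lip sync fraud as the only critical flag, following the Critical Warnings category; that reading, the function name, and the returned labels are assumptions.

```python
def decide(score, fraud):
    """Map a verification score and fraudDetection object to a decision."""
    critical = bool(fraud.get("lipSyncFraudDetected"))
    any_flag = (
        critical
        or bool(fraud.get("backgroundNoiseDetected"))
        or fraud.get("speakerCount", 1) != 1
    )
    if score >= 70 and not any_flag:
        return "approve"
    if score >= 50 and not critical:
        return "manual_review"
    return "reject_or_investigate"


clean = {"backgroundNoiseDetected": False, "speakerCount": 1,
         "lipSyncFraudDetected": False}
print(decide(100.0, clean))
print(decide(60.0, dict(clean, speakerCount=2)))
```

Evaluating the rules in order sends a mid-range score with a non-critical flag, such as a second speaker, to manual review rather than to outright rejection.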
Performance Considerations​
- Processing Time: Typically 30-60 seconds per video (depends on video length)
- Fraud Checks: Run concurrently with video/audio processing
- Caching: Completed results are cached; use force_reprocess=true to re-analyze
- Concurrent Processing: Multiple videos processed by separate Celery workers
- Task Persistence: Tasks survive server restarts (persisted in Redis)
- Retry Logic: Automatic 2 retries with exponential backoff on failure
Security Features​
- User Isolation: All queries are private to your account
- Document Ownership: Only you can access your documents
- Authentication: All endpoints except the health check require valid API tokens with the agent:advance_video_kyc scope
- UUID-based Identification: Secure, non-sequential identifiers for documents
- No File Path Exposure: File paths are never exposed in API requests or responses
- Fraud Detection: Multi-layered security checks for deepfakes and manipulation
- Background Workers: Processing isolated in separate worker processes
Troubleshooting​
Low Verification Scores​
Possible Causes:
- Audio and visual data don't match
- Fraud detection penalties applied
- Poor video quality
Solutions:
- Ensure person speaks details matching their Aadhaar card
- Record in quiet environment (reduces noise penalty)
- Ensure face is clearly visible (for lip sync analysis)
Lip Sync Fraud Detected​
Possible Causes:
- Video was dubbed or audio replaced
- Face not clearly visible throughout video
- Poor lighting on face
- Video recorded without audio, dubbed later
Solutions:
- Record video with live audio in one take
- Ensure face is well-lit and clearly visible
- Don't use edited or spliced videos
Multiple Speakers Detected​
Possible Causes:
- Another person speaking in background
- TV/Radio playing in background
- Someone coaching the applicant
Solutions:
- Record in a private, quiet room
- Ensure only the applicant speaks
- Turn off all audio sources
Background Noise Issues​
Possible Causes:
- Recording in noisy environment
- Poor microphone quality
- Wind or fan noise
Solutions:
- Record in a quiet indoor environment
- Use device's primary microphone
- Reduce background noise sources
No Aadhaar Card Detected​
Possible Causes:
- Card not visible in any frame
- Card too small or far from camera
- Poor lighting or focus
Solutions:
- Hold card closer to camera
- Ensure entire card is visible for 3-5 seconds
- Improve lighting conditions
- Keep card flat and avoid angles
Processing Stuck in PROCESSING State​
Possible Causes:
- Celery worker overloaded
- Worker crashed during processing
- Redis connection issues
Solutions:
- Wait and poll again (workers auto-retry)
- Use force_reprocess=true to restart
- Contact support if the issue persists for more than 10 minutes