🧠 API Reference Documentation
MR AI RAG
Memory Chatbot API
Complete reference for embedding AI memory chatbots, managing conversations, and measuring voice interaction latency.
Version: v2.0
Base URL: https://test.3rdai.co
Auth: API Key or Client Token
Format: JSON (REST)
Table of Contents
1 Authentication
2 Memory Management
3 Chatbot Conversations
4 Source Uploads
5 Reels & Video Generation
6 Voice Latency Guide
7 Embed & QR Code
8 Error Codes

1 Authentication

All API calls require an API key in the X-API-Key header, or a client session token in X-Client-Token. Obtain API keys from Dashboard → Settings → API Keys.

Note: Client endpoints require X-Client-Token. Admin endpoints require X-Admin-Token.
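The header rules above can be captured in a small helper. This is a minimal sketch: the function name `auth_headers` is hypothetical, and sending `Content-Type: application/json` is an assumption based on the JSON (REST) format stated at the top of this reference.

```python
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None,
                 client_token: Optional[str] = None,
                 admin_token: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for whichever credentials are supplied.

    Header names come from this reference; the helper itself is illustrative.
    """
    headers = {"Content-Type": "application/json"}  # assumed for JSON bodies
    if api_key:
        headers["X-API-Key"] = api_key
    if client_token:
        headers["X-Client-Token"] = client_token
    if admin_token:
        headers["X-Admin-Token"] = admin_token
    if len(headers) == 1:
        # Only Content-Type is present, so no credential was given.
        raise ValueError("provide an API key, client token, or admin token")
    return headers
```

A client endpoint call would pass `auth_headers(client_token=token)`, while an admin endpoint call would pass `auth_headers(admin_token=token)`.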

POST /api/clients/login - Email + password login
Field     Type    Required  Description
email     string  Required  User email address
password  string  Required  Account password
Response 200
{ "client_id": "clt-...", "token": "clt-...", "name": "John Doe", "email": "user@example.com" }
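The email and password login above might be called as sketched below, using only the Python standard library. The function names are hypothetical, and sending the fields as a JSON body is an assumption drawn from the JSON (REST) format noted in the header.

```python
import json
import urllib.request

BASE_URL = "https://test.3rdai.co"  # base URL listed at the top of this reference


def build_login_request(email: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST /api/clients/login request."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/clients/login",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed JSON body
        method="POST",
    )


def extract_token(response_body: str) -> str:
    """Return the "token" field from a 200 response body."""
    return json.loads(response_body)["token"]


def login(email: str, password: str) -> str:
    """Send the request and return the client session token."""
    with urllib.request.urlopen(build_login_request(email, password)) as resp:
        return extract_token(resp.read().decode("utf-8"))
```

The returned token is the value to send in the X-Client-Token header on later client calls.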
POST /api/clients/google-login - Google OAuth login
Field     Type    Required  Description
id_token  string  Required  Google ID token from Google Sign-In flow

2 Memory Management

A Memory is an AI knowledge base that stores uploaded documents, websites, and videos. Each chatbot is backed by a distinct memory.

GET /api/memory - List all memories

Header: X-Client-Token: {token}

POST /api/memory - Create a new memory
Field        Type    Required  Description
name         string  Required  Chatbot memory name
description  string  Optional  Short description
provider     string  Optional  "gemini" (default) | "openai"
DELETE /api/memory/{memory_id} - Delete a memory (irreversible)

All sources and chat history will be permanently deleted.
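A create-memory body can be validated on the client before it is sent. The sketch below is illustrative: `create_memory_body` and `delete_memory_path` are hypothetical helper names, and rejecting unknown providers locally is a choice of this example, not documented server behavior.

```python
from typing import Dict, Optional

ALLOWED_PROVIDERS = ("gemini", "openai")  # values listed in the field table


def create_memory_body(name: str,
                       description: Optional[str] = None,
                       provider: str = "gemini") -> Dict[str, str]:
    """Build the JSON body for POST /api/memory."""
    if not name.strip():
        raise ValueError("name is required")
    if provider not in ALLOWED_PROVIDERS:
        raise ValueError(f"provider must be one of {ALLOWED_PROVIDERS}")
    body = {"name": name, "provider": provider}
    if description is not None:
        body["description"] = description
    return body


def delete_memory_path(memory_id: str, confirm: bool = False) -> str:
    """Return the DELETE path; require explicit confirmation because deletion is irreversible."""
    if not confirm:
        raise ValueError("pass confirm=True to delete a memory permanently")
    return f"/api/memory/{memory_id}"
```

The `confirm` flag exists only to make accidental deletion harder in client code.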

3 Chatbot Conversations

Send questions to a memory chatbot. The AI answers using only the knowledge stored in that memory.

POST /api/memory/{memory_id}/ask - Ask the chatbot a question
Field     Type     Required  Description
question  string   Required  User's question
top_k     integer  Optional  Number of chunks to retrieve (default: 5)
history   array    Optional  Chat history, e.g. [{"role": "user", "content": "..."}]
Response 200
{
  "answer": "Based on the knowledge base, ...",
  "sources": ["document.pdf", "website.com"],
  "latency_ms": 1240,
  "history_saved": true
}
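The request fields and response shape above can be combined into a simple multi-turn loop. This is a sketch under assumptions: the helper names are hypothetical, and the `"assistant"` role for the bot's turns is assumed, since the reference only shows a `"user"` entry in `history`.

```python
from typing import Dict, List, Optional

Turn = Dict[str, str]


def build_ask_body(question: str,
                   history: Optional[List[Turn]] = None,
                   top_k: int = 5) -> dict:
    """Build the JSON body for POST /api/memory/{memory_id}/ask."""
    if not question.strip():
        raise ValueError("question is required")
    body = {"question": question, "top_k": top_k}
    if history:
        body["history"] = history
    return body


def append_turn(history: List[Turn], question: str, answer: str) -> List[Turn]:
    """Return a new history list with one question/answer exchange added."""
    return history + [
        {"role": "user", "content": question},
        {"role": "assistant", "content": answer},  # role name is an assumption
    ]


def summarize(response: dict) -> str:
    """Format the documented response fields for logging."""
    sources = ", ".join(response["sources"])
    return f'{response["answer"]} [sources: {sources}; {response["latency_ms"]} ms]'
```

After each call, pass the question and the returned `answer` to `append_turn` and send the result as `history` with the next question.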
GET /api/memory/{memory_id}/chat-history - Get Q&A pairs
Query Param  Type     Default  Description
limit        integer  10       Max Q&A pairs to return
offset       integer  0        Pagination offset
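Paging through chat history with `limit` and `offset` could look like the sketch below. The response shape of this endpoint is not documented here, so the example assumes a caller-supplied `fetch_page(limit, offset)` function that returns a list of Q&A pairs; both helper names are hypothetical.

```python
from typing import Callable, Iterator, List
from urllib.parse import urlencode

BASE_URL = "https://test.3rdai.co"  # base URL listed at the top of this reference


def chat_history_url(memory_id: str, limit: int = 10, offset: int = 0) -> str:
    """Build the GET URL with the documented query parameters."""
    query = urlencode({"limit": limit, "offset": offset})
    return f"{BASE_URL}/api/memory/{memory_id}/chat-history?{query}"


def iter_history(fetch_page: Callable[[int, int], List[dict]],
                 limit: int = 10) -> Iterator[dict]:
    """Yield every Q&A pair, advancing the offset until a short page is returned."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:  # fewer items than requested means the last page
            break
        offset += limit
```

Stopping on a short page avoids needing a total count, which this reference does not describe.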

4 Source Uploads

Add knowledge to a memory by uploading PDFs or by ingesting web pages, YouTube videos, JSON URLs, or MongoDB collections.

POST /api/memory/{memory_id}/upload-pdf - Upload a PDF document

Content-Type: multipart/form-data | Form field: file
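A multipart/form-data body with a single `file` field can be assembled with the standard library alone. This is a minimal sketch: `encode_pdf_upload` is a hypothetical helper, and labeling the part as `application/pdf` is an assumption about what the server expects.

```python
import uuid
from typing import Tuple


def encode_pdf_upload(filename: str, pdf_bytes: bytes) -> Tuple[bytes, str]:
    """Return (body, content_type) for a one-file multipart/form-data upload.

    The filename is inserted as-is, so it should not contain double quotes.
    """
    boundary = uuid.uuid4().hex  # random separator unlikely to occur in the file
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    content_type = f"multipart/form-data; boundary={boundary}"
    return head + pdf_bytes + tail, content_type
```

Send the returned body with the returned value as the Content-Type header, alongside the usual X-Client-Token or X-API-Key header.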

POST /api/memory/{memory_id}/ingest-url - Add a web page to memory
Field  Type    Description
url    string  HTTP/HTTPS URL of the article/website
POST /api/memory/{memory_id}/ingest-youtube - Add a YouTube video to memory
Field  Type    Description
url    string  YouTube video link
POST /api/memory/{memory_id}/ingest-json-url - Fetch and index a JSON API/file from URL
Field  Type    Description
url    string  HTTP/HTTPS URL of the JSON data
title  string  Descriptive title for the source
POST /api/memory/{memory_id}/ingest-mongodb - Connect to MongoDB & index documents (Live Real-Time)
Field              Type    Required  Description
connection_string  string  Required  MongoDB URI (mongodb+srv://...)
database           string  Required  MongoDB Database Name
collection         string  Required  MongoDB Collection Name
title              string  Optional  Descriptive title for the source
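The ingest bodies above can be checked on the client before sending. The sketch below uses hypothetical helper names, and the local validation (URL scheme, MongoDB URI prefix) is a convenience of this example rather than documented server behavior.

```python
from typing import Dict, Optional
from urllib.parse import urlparse


def _require_http_url(url: str) -> str:
    """Accept only absolute http:// or https:// URLs."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an HTTP/HTTPS URL, got {url!r}")
    return url


def url_body(url: str) -> Dict[str, str]:
    """Body for ingest-url and ingest-youtube, which both take a single url field."""
    return {"url": _require_http_url(url)}


def json_url_body(url: str, title: str) -> Dict[str, str]:
    """Body for ingest-json-url."""
    return {"url": _require_http_url(url), "title": title}


def mongodb_body(connection_string: str, database: str, collection: str,
                 title: Optional[str] = None) -> Dict[str, str]:
    """Body for ingest-mongodb; the three required fields must be non-empty."""
    if not connection_string.startswith(("mongodb://", "mongodb+srv://")):
        raise ValueError("connection_string must be a MongoDB URI")
    if not database or not collection:
        raise ValueError("database and collection are required")
    body = {"connection_string": connection_string,
            "database": database, "collection": collection}
    if title is not None:
        body["title"] = title
    return body
```

Because the MongoDB connection string usually embeds credentials, avoid writing the resulting body to logs.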

5 Reels & Video Generation

Generate short video reels from memory content or a specific answer. Placeholder videos are returned; integrate D-ID, HeyGen, or Runway ML for production use.

POST /api/memory/{memory_id}/generate-reel - Generate a video reel
Field         Type    Required  Description
topic         string  Optional  Topic/title for the reel (default: memory name)
style         string  Optional  "cinematic" | "minimal" | "energetic"
content_text  string  Optional  Specific answer text to generate from (max 5000 chars)
Response 200
{ "success": true, "video_url": "https://...", "topic": "...", "style": "cinematic" }
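The style values and the 5000-character limit on `content_text` can be enforced before the request is made. This is an illustrative sketch; `build_reel_body` is a hypothetical name, and omitting unset fields so the server applies its own defaults is an assumption.

```python
from typing import Dict, Optional

ALLOWED_STYLES = ("cinematic", "minimal", "energetic")  # from the field table
MAX_CONTENT_CHARS = 5000                                 # from the field table


def build_reel_body(topic: Optional[str] = None,
                    style: Optional[str] = None,
                    content_text: Optional[str] = None) -> Dict[str, str]:
    """Build the JSON body for POST /api/memory/{memory_id}/generate-reel."""
    body: Dict[str, str] = {}
    if topic is not None:
        body["topic"] = topic
    if style is not None:
        if style not in ALLOWED_STYLES:
            raise ValueError(f"style must be one of {ALLOWED_STYLES}")
        body["style"] = style
    if content_text is not None:
        if len(content_text) > MAX_CONTENT_CHARS:
            raise ValueError(f"content_text exceeds {MAX_CONTENT_CHARS} characters")
        body["content_text"] = content_text
    return body
```

All three fields are optional, so an empty body is valid and leaves the topic at the memory name.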

6 Voice Latency Guide

Voice latency is the delay between a user speaking and the chatbot starting to respond. The total is the sum of several pipeline stages:

Component                        Typical       Range          Rating
Speech-to-Text (STT)             200–400 ms    100–800 ms     Fast
  Transcribing voice to text (e.g. Whisper, Google STT)
Vector Search (RAG)              50–150 ms     30–400 ms      Fast
  Finding relevant chunks in the memory
LLM Generation (Gemini)          800–1500 ms   500–3000 ms    Moderate
  Generating the AI answer (Google Gemini)
LLM Generation (GPT-4o)          600–1200 ms   400–2500 ms    Moderate
  Generating the AI answer (OpenAI)
Text-to-Speech (TTS)             300–600 ms    200–1200 ms    Moderate
  Converting answer text to audio
Network Round-trip               50–200 ms     20–500 ms      Fast
  API calls + server overhead
Total (non-streaming)            1.5–3.0 s     1.0–5.0 s      Acceptable
Total (streaming, first token)   300–700 ms    200 ms–1.5 s   Excellent
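As a rough check on the table above, the Typical figures for each stage can be added up per LLM. The constants below are copied from the table; the dictionary and function names are hypothetical.

```python
from typing import Dict, Tuple

# Typical (low, high) latency in milliseconds for the non-LLM stages.
STAGES_MS: Dict[str, Tuple[int, int]] = {
    "stt": (200, 400),
    "vector_search": (50, 150),
    "tts": (300, 600),
    "network": (50, 200),
}

# Typical (low, high) latency in milliseconds for each LLM option.
LLM_MS: Dict[str, Tuple[int, int]] = {
    "gemini": (800, 1500),
    "gpt-4o": (600, 1200),
}


def typical_total_ms(llm: str) -> Tuple[int, int]:
    """Sum the low and high typical figures across all stages for one LLM."""
    low = sum(lo for lo, _ in STAGES_MS.values()) + LLM_MS[llm][0]
    high = sum(hi for _, hi in STAGES_MS.values()) + LLM_MS[llm][1]
    return low, high
```

With these figures the Gemini pipeline sums to 1400–2850 ms and the GPT-4o pipeline to 1200–2550 ms, in line with the non-streaming total of roughly 1.5–3.0 s.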

Tip: Enable streaming ("stream": true in the chat request) to reduce perceived latency. The first token appears in ~300–700 ms, making the response feel instant to users.

Benchmarks by Scenario

Scenario                                     Total Latency   Notes
Simple Q&A (1–2 sources)                     ~800 ms         Streaming + small context
Complex multi-document query (10+ sources)   ~2.5 s          More chunks to process
Cold start / first message                   ~1.5–2 s        Model warm-up overhead
Subsequent messages (warm)                   ~600 ms–1.2 s   Embedded context cache
Full voice pipeline (STT + LLM + TTS)        ~2.0–4.0 s      End-to-end voice interaction
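To compare your own deployment against these benchmarks, wall-clock time can be measured around each call. The sketch below is illustrative: both function names are hypothetical, and treating the `latency_ms` field of the ask response as server-side processing time is an assumption, since this reference does not define what it covers.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn and return (result, elapsed wall-clock milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def client_overhead_ms(elapsed_ms: float, response: dict) -> float:
    """Client-observed time minus the latency_ms reported in the response."""
    return elapsed_ms - response["latency_ms"]
```

Wrap whatever function sends the ask request in `timed_call`, then pass the elapsed time and the parsed response to `client_overhead_ms` to estimate network and client-side cost.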

7 Embed & QR Code

Embed the chatbot on any website using an iframe, or share a QR code that opens the public chat interface.

Embed iframe

<!-- Add to any webpage -->
<iframe
  src="https://your-domain.com/memory-chat-public?id={memory_id}"
  width="400" height="600"
  style="border:none;border-radius:16px"
  allow="microphone"
></iframe>

Public QR URL

https://your-domain.com/memory-chat-public?id={memory_id}
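The public URL and the iframe markup can be generated per memory. This is a sketch with hypothetical function names; `your-domain.com` is the placeholder host from the examples above, and the attribute values mirror the iframe snippet shown earlier in this section.

```python
from html import escape
from urllib.parse import urlencode


def public_chat_url(domain: str, memory_id: str) -> str:
    """Build the public chat URL used for both the iframe and the QR code."""
    return f"https://{domain}/memory-chat-public?{urlencode({'id': memory_id})}"


def iframe_snippet(domain: str, memory_id: str,
                   width: int = 400, height: int = 600) -> str:
    """Return the embed markup with the URL escaped for use in an HTML attribute."""
    src = escape(public_chat_url(domain, memory_id), quote=True)
    return (
        f'<iframe src="{src}" width="{width}" height="{height}" '
        'style="border:none;border-radius:16px" allow="microphone"></iframe>'
    )
```

The string returned by `public_chat_url` is also what a QR code generator of your choice would encode.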

8 Error Codes

Code  Error             Cause                               Fix
400   Bad Request       Missing or invalid body fields      Check required parameters
401   Unauthorized      Missing or invalid API key / token  Check X-API-Key or X-Client-Token header
403   Forbidden         Accessing another user's resource   Use the correct client token
404   Not Found         Memory or resource doesn't exist    Verify the memory_id or resource path
422   Validation Error  Field type or constraint mismatch   Check field types (int vs string)
500   Server Error      Unexpected server-side failure      Check server logs; retry after a moment
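Client code can map these status codes to an action. The sketch below is one possible policy: the names are hypothetical, and the exponential backoff schedule is an assumption, since the table only says to retry a 500 after a moment.

```python
from typing import Dict, List

# Suggested fixes, copied from the error table.
FIXES: Dict[int, str] = {
    400: "Check required parameters",
    401: "Check X-API-Key or X-Client-Token header",
    403: "Use the correct client token",
    404: "Verify the memory_id or resource path",
    422: "Check field types (int vs string)",
    500: "Check server logs; retry after a moment",
}


def suggested_fix(status: int) -> str:
    """Return the documented fix, or a generic hint for unlisted codes."""
    return FIXES.get(status, "Unlisted status; inspect the response body")


def should_retry(status: int) -> bool:
    """Only 500 is worth retrying; 4xx responses need a corrected request."""
    return status == 500


def backoff_delays(attempts: int, base_seconds: float = 0.5) -> List[float]:
    """Exponential delays between retries: base, 2*base, 4*base, ..."""
    return [base_seconds * (2 ** i) for i in range(attempts)]
```

A caller would sleep for each value from `backoff_delays` between attempts while `should_retry` returns true, and surface `suggested_fix` to the developer otherwise.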