Best Practices & Optimization
Building a Knowledge Base is not just about uploading documents — it’s about optimizing how your data is understood, embedded, and retrieved by your Agents. This section provides advanced guidelines for improving accuracy, reducing noise, and ensuring consistent multi-Agent performance across Zaia Endless.
🧱 1. Structuring Your Knowledge
The quality of an Agent’s responses depends heavily on how the content inside your Knowledge Base is written and organized. Follow these principles to maximize retrieval precision:
✅ Do:
Use clear titles and headings. Structure long texts with # or ## (Markdown syntax) or bold section headers.
Keep topics focused. Each item should represent one domain or subject area.
Segment logically. If a document covers several unrelated subjects, split it into multiple text items.
Add context identifiers. Example:
[SECTION: Product Pricing] [SECTION: Warranty Policies]
❌ Avoid:
Overlapping or redundant information between items.
Long unstructured paragraphs (over 2000 characters without breaks).
Mixing unrelated concepts (e.g. pricing + onboarding + troubleshooting in one text).
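As an illustration of the segmentation and context-identifier advice above, here is a minimal sketch that splits a Markdown document into one text item per # or ## heading and prefixes each item with a [SECTION: ...] tag. The function name `split_by_heading` and the sample text are hypothetical and not part of Zaia Endless; the resulting items would still be added to the Knowledge Base through whatever upload flow you normally use.

```python
import re

# Matches level-1 and level-2 Markdown headings such as "# Title" or "## Title".
HEADING = re.compile(r"^#{1,2}\s+(.*)$")

def split_by_heading(text: str) -> list[str]:
    """Return one tagged text item per # or ## heading.

    Lines that appear before the first heading are ignored in this sketch.
    """
    sections: list[tuple[str, list[str]]] = []
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading starts a new section.
            sections.append((match.group(1).strip(), []))
        elif sections:
            sections[-1][1].append(line)

    items = []
    for title, body in sections:
        content = "\n".join(body).strip()
        if content:  # skip headings with no text under them
            items.append(f"[SECTION: {title}]\n{content}")
    return items

doc = (
    "# Product Pricing\n"
    "Plans are billed monthly.\n"
    "\n"
    "## Warranty Policies\n"
    "Hardware has a one-year warranty.\n"
)
for item in split_by_heading(doc):
    print(item)
    print("---")
```

Each returned string can be saved as its own text item, which keeps one subject per item and avoids mixing unrelated concepts.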
🧠 2. Optimizing for Embeddings
Each Knowledge Base undergoes an embedding process, where its text is transformed into high-dimensional vectors for semantic retrieval. Small changes in structure can significantly affect search quality.
🔍 Embedding Best Practices
Shorter segments embed better. Keep each paragraph or bullet list under ~1500 characters.
Avoid repeated keywords. Semantic models already infer meaning — keyword stuffing lowers quality.
Maintain consistent formatting. Avoid random line breaks, tabs, or inconsistent casing.
Include synonyms naturally. This helps the model connect variations of user queries.
Example: “pricing / cost / plan / subscription” within one sentence helps broaden recall.
Avoid excessive symbols. Special characters, emojis, or decorative punctuation can reduce precision.
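The formatting and segment-length guidelines above can be applied before upload with a small pre-processing step. The sketch below normalizes whitespace and then packs whole paragraphs into segments of at most 1500 characters. The names `normalize`, `pack_paragraphs`, and `MAX_CHARS` are hypothetical; how Zaia Endless chunks text internally is not described here, so treat this only as a way to prepare cleaner input.

```python
import re

MAX_CHARS = 1500  # soft limit taken from the guideline above

def normalize(text: str) -> str:
    """Collapse tabs and repeated spaces, and reduce 3+ newlines to one blank line."""
    text = text.replace("\r\n", "\n")
    text = re.sub(r"[ \t]+", " ", text)      # tabs and runs of spaces -> one space
    text = re.sub(r" ?\n ?", "\n", text)     # trim spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)   # random extra blank lines -> one
    return text.strip()

def pack_paragraphs(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Group whole paragraphs into segments no longer than max_chars.

    A single paragraph that is already longer than max_chars is kept whole
    rather than cut mid-sentence; shorten it by hand.
    """
    segments: list[str] = []
    current = ""
    for paragraph in normalize(text).split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            segments.append(current)
            current = paragraph
    if current:
        segments.append(current)
    return segments

if __name__ == "__main__":
    raw = "Pricing\t\toverview.\n\n\n\nPlans   vary by usage."
    print(pack_paragraphs(raw))
```

Packing by paragraph rather than by raw character count keeps each segment semantically complete, which is the point of the "shorter segments" advice.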
🌐 3. Bilingual and Multi-Language Knowledge
Zaia Endless supports multilingual Agents. When building bilingual Knowledge Bases (e.g., Portuguese + English), always separate languages clearly to prevent mixed-context embeddings.
🏗️ Recommended Structure
FAQ — English
Q: What is the price of Zaia?
A: Plans start at $X/month depending on usage.
FAQ — Português
P: Qual é o preço da Zaia?
R: Os planos começam em R$X/mês dependendo do uso.
Tips:
Keep both languages in the same item only if logically aligned.
Use language tags (EN) and (PT) to separate content blocks.
Do not interleave sentences from different languages.
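To check that a bilingual item is cleanly separated, a short script can split it on the (EN) and (PT) tags and show what ends up in each block. This is a sketch under the assumption that each tag sits on its own line; the function name `split_by_language` is hypothetical.

```python
import re

# Assumes a language tag such as (EN) or (PT) appears alone on its own line.
TAG = re.compile(r"^\((EN|PT)\)$")

def split_by_language(text: str) -> dict[str, list[str]]:
    """Return the non-empty lines that follow each language tag."""
    blocks: dict[str, list[str]] = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = TAG.match(line)
        if match:
            current = match.group(1)
            blocks.setdefault(current, [])
        elif current is not None and line:
            blocks[current].append(line)
    return blocks

sample = """(EN)
Q: What is the price of Zaia?
A: Plans start at $X/month depending on usage.

(PT)
P: Qual é o preço da Zaia?
R: Os planos começam em R$X/mês dependendo do uso.
"""

for language, lines in split_by_language(sample).items():
    print(language, lines)
```

If a block contains lines in the wrong language, the item is interleaved and should be reorganized before training.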
🔄 4. Managing Updates and Retraining
Whenever a Knowledge Base item is edited or replaced, it automatically re-enters the training pipeline. To ensure clean retraining cycles:
Best Practices
Batch updates. Edit multiple items before retraining to optimize resource usage.
Avoid duplicates. If you replace an item, delete the old one before retraining.
Monitor status. Wait for the Completed state before testing the Agent.
Periodic refresh. Retrain major bases every 30–60 days to ensure embeddings stay aligned with evolving models.
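One way to act on the "avoid duplicates" advice is to fingerprint each item's text before retraining and flag items whose content is identical after normalization. The sketch below works on a local dictionary of item names and texts; the names `fingerprint` and `find_duplicates` and the sample items are hypothetical, and it only catches exact duplicates (ignoring case and whitespace), not near-duplicates.

```python
import hashlib

def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def find_duplicates(items: dict[str, str]) -> list[tuple[str, str]]:
    """Return (first_seen_name, duplicate_name) pairs for identical content."""
    seen: dict[str, str] = {}
    duplicates: list[tuple[str, str]] = []
    for name, text in items.items():
        key = fingerprint(text)
        if key in seen:
            duplicates.append((seen[key], name))
        else:
            seen[key] = name
    return duplicates

items = {
    "pricing-v1": "Plans start at  $X/month.",
    "pricing-v2": "plans start at $X/month.",
    "warranty": "One-year warranty.",
}
print(find_duplicates(items))
```

Any pair reported here is a candidate for deleting the older item before the next retraining cycle.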
🧩 5. Multi-Agent and Squad Optimization
Since a single Knowledge Base can be shared across multiple Agents or an entire Squad, consider how each entity interacts with the same data context.
Guidelines
Centralize shared knowledge. Keep universal data (e.g., company policies) in one shared base.
Isolate specialized data. Create smaller, domain-specific bases (e.g., “Technical Docs”, “Sales FAQs”).
Avoid conflicting sources. If two bases contain similar topics, Agents may retrieve mixed or inconsistent results.
Audit link usage. Regularly check which Agents and Squads are linked to each base through the Knowledge Tool.
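The audit described above can be partly automated once you have a list of which Agents are linked to which bases. The sketch below assumes you have copied that information by hand into two dictionaries (base to Agents, base to topics); these structures, the function name `audit_links`, and the sample names are hypothetical and do not reflect an actual Zaia Endless export format.

```python
from collections import defaultdict
from itertools import combinations

def audit_links(
    base_agents: dict[str, list[str]],
    base_topics: dict[str, set[str]],
) -> list[tuple[str, str, str, list[str]]]:
    """Flag Agents linked to two bases that cover the same topic.

    Returns (agent, base_a, base_b, shared_topics) tuples.
    """
    # Invert the mapping: which bases does each Agent read from?
    agent_bases: dict[str, set[str]] = defaultdict(set)
    for base, agents in base_agents.items():
        for agent in agents:
            agent_bases[agent].add(base)

    conflicts = []
    for agent, bases in sorted(agent_bases.items()):
        for base_a, base_b in combinations(sorted(bases), 2):
            shared = base_topics.get(base_a, set()) & base_topics.get(base_b, set())
            if shared:
                conflicts.append((agent, base_a, base_b, sorted(shared)))
    return conflicts

links = {
    "Company Policies": ["Support Bot", "Sales Bot"],
    "Sales FAQs": ["Sales Bot"],
    "Technical Docs": ["Support Bot"],
}
topics = {
    "Company Policies": {"refunds", "privacy"},
    "Sales FAQs": {"pricing", "refunds"},
    "Technical Docs": {"setup"},
}
print(audit_links(links, topics))
```

A flagged pair does not prove the content conflicts; it only points to where two bases linked to the same Agent should be compared.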
⚙️ 6. Retrieval Optimization (for Developers)
For technical teams customizing Agents via the API or SDK, consider fine-tuning retrieval settings:
Top-k. Number of results returned per query. Recommended: 3–5 for precision, 8–10 for broader recall.
Similarity threshold. Minimum cosine similarity for match relevance. Recommended: 0.75–0.85 for most business contexts.
Context window. How much text is passed to the LLM. Recommended: keep under 4000 tokens for efficiency.
Cache policy. Determines when embeddings are refreshed. Recommended: refresh after major updates only.
Fine-tuning these parameters helps balance speed, accuracy, and token cost.
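To make the interaction between top-k and the similarity threshold concrete, here is a minimal, self-contained sketch of threshold-filtered top-k retrieval over cosine similarity. It is a generic illustration, not the Zaia Endless API or SDK: the function names `cosine_similarity` and `retrieve`, the parameter names, and the toy two-dimensional vectors are all assumptions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(
    query_vec: list[float],
    chunks: list[tuple[str, list[float]]],
    top_k: int = 5,
    threshold: float = 0.75,
) -> list[tuple[float, str]]:
    """Return up to top_k (score, text) pairs whose score is >= threshold."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)  # best match first
    return kept[:top_k]

# Toy embeddings for illustration only; real embeddings have many more dimensions.
chunks = [
    ("pricing", [1.0, 0.0]),
    ("warranty", [0.0, 1.0]),
    ("plans", [0.9, 0.1]),
]
for score, text in retrieve([1.0, 0.0], chunks, top_k=2, threshold=0.75):
    print(f"{score:.3f} {text}")
```

Raising the threshold removes weak matches before top-k is applied, so a high threshold with a large top-k can still return only a few results; lowering it lets top-k become the effective limit.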
🔐 7. Data Quality and Security
Zaia Endless processes all Knowledge Base content securely, ensuring that:
Files and texts are encrypted at rest and during transfer.
Only authorized workspace members can view or modify data.
Data used for embeddings is never shared or exposed to other tenants.
Still, follow these best practices:
Avoid uploading sensitive credentials or personally identifiable information (PII).
Sanitize internal notes before including them in Knowledge Bases used by customer-facing Agents.
Use versioned exports for compliance (e.g., ISO, GDPR, LGPD contexts).
🧭 8. Performance and Testing
After training or linking Knowledge Bases:
Test queries directly through the Agent Playground or CRM Inbox.
Ask questions that closely match the Knowledge Base wording, and others that express the same intent in different words.
Review how the Agent retrieves and synthesizes context.
Adjust the base content or retrain if answers are incomplete or redundant.
A well-structured base should produce confident, consistent, and concise answers with minimal hallucination.
🧩 9. Advanced Techniques (Optional)
For advanced teams:
Chunk tuning: Split documents into smaller logical pieces manually for greater control.
Metadata tagging: Prefix sections with tags (e.g., [PRICING], [ONBOARDING]) for scoped retrieval.
Hybrid models: Combine Knowledge Bases with workflow-based tools to pre-filter sources.
Evaluation metrics: Track retrieval accuracy (e.g., recall at k, written R@k) and response satisfaction from real conversations.
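For teams that want to track R@k, the sketch below computes recall at k for a single query and the mean over a small test set. It assumes you maintain, for each test question, a hand-labeled set of chunk IDs that should be retrieved; the function names, chunk IDs, and test cases are hypothetical.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant chunk IDs that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)

def mean_recall_at_k(cases: list[tuple[list[str], set[str]]], k: int) -> float:
    """Average recall at k over (retrieved_ids, relevant_ids) test cases."""
    return sum(recall_at_k(retrieved, relevant, k) for retrieved, relevant in cases) / len(cases)

# Each case pairs the ranked IDs an Agent retrieved with the IDs a reviewer marked as relevant.
cases = [
    (["pricing-1", "plans-2", "warranty-1"], {"pricing-1", "warranty-1"}),
    (["onboarding-3", "pricing-1"], {"onboarding-3"}),
]
print(mean_recall_at_k(cases, k=2))
```

Comparing this number before and after a content or parameter change gives a simple signal of whether retrieval improved for the questions you care about.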
✅ Summary
Improve accuracy: Write structured, focused content.
Support multilingual retrieval: Separate and label languages.
Maintain freshness: Retrain regularly.
Reduce conflicts: Centralize or modularize knowledge.
Optimize performance: Tune retrieval parameters per use case.
🚀 Final Insight
A well-built Knowledge Base is not just a data repository — it’s a strategic foundation for scalable, intelligent, and explainable AI operations. In Zaia Endless, every great Agent starts with great knowledge — organized, optimized, and continuously improved.
