Best Practices & Optimization

Building a Knowledge Base is not just about uploading documents — it’s about optimizing how your data is understood, embedded, and retrieved by your Agents. This section provides advanced guidelines for improving accuracy, reducing noise, and ensuring consistent multi-Agent performance across Zaia Endless.


🧱 1. Structuring Your Knowledge

The quality of an Agent’s responses depends heavily on how the content inside your Knowledge Base is written and organized. Follow these principles to maximize retrieval precision:

✅ Do:

  • Use clear titles and headings. Structure long texts with # or ## (Markdown syntax) or bold section headers.

  • Keep topics focused. Each item should represent one domain or subject area.

  • Segment logically. If a document covers several unrelated subjects, split it into multiple text items.

  • Add context identifiers. Example:

    [SECTION: Product Pricing]
    [SECTION: Warranty Policies]

❌ Avoid:

  • Overlapping or redundant information between items.

  • Long unstructured paragraphs (over 2000 characters without breaks).

  • Mixing unrelated concepts (e.g. pricing + onboarding + troubleshooting in one text).
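
The structuring rules above can be checked before upload. The following is a minimal sketch, assuming the `[SECTION: ...]` marker sits on its own line as in the example; the function names are hypothetical and not part of any Zaia tooling.

```python
import re

# Marker format taken from the "[SECTION: ...]" example above (assumed to sit
# on its own line). The helper names below are illustrative only.
SECTION_RE = re.compile(r"^\[SECTION:\s*(.+?)\]\s*$", re.MULTILINE)
MAX_PARAGRAPH_CHARS = 2000  # threshold from the "Avoid" list above


def split_by_section(text: str) -> dict[str, str]:
    """Split a document into {section title: body} using [SECTION: ...] markers."""
    matches = list(SECTION_RE.finditer(text))
    sections = {}
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[m.group(1)] = text[m.end():end].strip()
    return sections


def long_paragraphs(body: str, limit: int = MAX_PARAGRAPH_CHARS) -> list[int]:
    """Return indexes of blank-line-separated paragraphs longer than `limit`."""
    paragraphs = [p for p in re.split(r"\n\s*\n", body) if p.strip()]
    return [i for i, p in enumerate(paragraphs) if len(p) > limit]


doc = """[SECTION: Product Pricing]
Plans are billed monthly.

[SECTION: Warranty Policies]
Hardware is covered for one year.
"""

items = split_by_section(doc)  # one text item per section
```

Each value in `items` can then be uploaded as its own text item, and `long_paragraphs` flags bodies that still need breaking up.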


🧠 2. Optimizing for Embeddings

Each Knowledge Base undergoes an embedding process, where its text is transformed into high-dimensional vectors for semantic retrieval. Small changes in structure can significantly affect search quality.

🔍 Embedding Best Practices

  • Shorter segments embed better. Keep each paragraph or bullet list under ~1500 characters.

  • Avoid repeated keywords. Semantic models already infer meaning — keyword stuffing lowers quality.

  • Maintain consistent formatting. Avoid random line breaks, tabs, or inconsistent casing.

  • Include synonyms naturally. This helps the model connect variations of user queries.

    Example: a sentence that mentions “pricing”, “cost”, “plan”, and “subscription” together helps broaden recall.

  • Avoid excessive symbols. Special characters, emojis, or decorative punctuation can reduce precision.
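
As an illustration of the length and formatting guidelines above, here is a rough sketch that cleans whitespace and packs whole paragraphs into segments of at most 1500 characters. It is a pre-processing idea under stated assumptions, not a description of how Zaia Endless chunks text internally.

```python
import re

MAX_SEGMENT_CHARS = 1500  # the ~1500-character guideline above


def normalize(text: str) -> str:
    """Replace tabs, collapse repeated spaces, and cap blank lines at one."""
    text = text.replace("\t", " ")
    text = re.sub(r" {2,}", " ", text)
    return re.sub(r"\n\s*\n+", "\n\n", text).strip()


def segment(text: str, limit: int = MAX_SEGMENT_CHARS) -> list[str]:
    """Greedily pack whole paragraphs into segments no longer than `limit`.

    A single paragraph that already exceeds `limit` is kept whole rather
    than cut mid-sentence; it should be rewritten by hand.
    """
    segments, current = [], ""
    for para in normalize(text).split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= limit or not current:
            current = candidate
        else:
            segments.append(current)
            current = para
    if current:
        segments.append(current)
    return segments
```

Packing on paragraph boundaries keeps each segment about one idea, which is the point of the "shorter segments" guideline.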


🌐 3. Bilingual and Multi-Language Knowledge

Zaia Endless supports multilingual Agents. When building bilingual Knowledge Bases (e.g., Portuguese + English), always separate languages clearly to prevent mixed-context embeddings.

FAQ — English
Q: What is the price of Zaia?
A: Plans start at $X/month depending on usage.

FAQ — Português
P: Qual é o preço da Zaia?
R: Os planos começam em R$X/mês dependendo do uso.

Tips:

  • Keep both languages in the same item only if the content is logically aligned (for example, the same FAQ in each language).

  • Use language tags (EN) and (PT) to separate content blocks.

  • Do not interleave sentences from different languages.
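
One way to apply the tagging tip is to split a tagged item into per-language blocks before upload. This sketch assumes each block begins with a line containing only `(EN)` or `(PT)`; the tag set and function name are illustrative.

```python
import re

# Assumes each block starts with a line consisting only of "(EN)" or "(PT)",
# as suggested in the tips above. Extend the alternation for other languages.
TAG_RE = re.compile(r"^\((EN|PT)\)\s*$", re.MULTILINE)


def split_by_language(text: str) -> dict[str, str]:
    """Return {language tag: text} for a tagged bilingual item."""
    blocks: dict[str, list[str]] = {}
    matches = list(TAG_RE.finditer(text))
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        blocks.setdefault(m.group(1), []).append(text[m.end():end].strip())
    return {lang: "\n\n".join(parts) for lang, parts in blocks.items()}


item = """(EN)
Q: What is the price of Zaia?
A: Plans start at $X/month depending on usage.

(PT)
P: Qual é o preço da Zaia?
R: Os planos começam em R$X/mês dependendo do uso.
"""

by_lang = split_by_language(item)
```

`by_lang["EN"]` and `by_lang["PT"]` can then be stored as separate items, or kept together when the content is aligned.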


🔄 4. Managing Updates and Retraining

Whenever a Knowledge Base item is edited or replaced, it automatically re-enters the training pipeline. To ensure clean retraining cycles:

Best Practices

  • Batch updates. Edit multiple items before retraining to optimize resource usage.

  • Avoid duplicates. If you replace an item, delete the old one before retraining.

  • Monitor status. Wait for the Completed state before testing the Agent.

  • Periodic refresh. Retrain major bases every 30–60 days to ensure embeddings stay aligned with evolving models.
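
To support the "avoid duplicates" practice, items can be compared locally before retraining. The sketch below groups items whose text is identical once case and whitespace are ignored; the item names are made up, and it only catches exact duplicates, not paraphrases.

```python
import hashlib
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash of the text with case and whitespace differences removed."""
    canonical = " ".join(text.lower().split())
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_duplicates(items: dict[str, str]) -> list[list[str]]:
    """Group item names whose normalized content is identical."""
    groups = defaultdict(list)
    for name, text in items.items():
        groups[fingerprint(text)].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical item names and contents, for illustration only.
items = {
    "pricing-v1": "Plans are billed monthly.",
    "pricing-v2": "Plans  are billed\nmonthly.",
    "warranty": "Hardware is covered for one year.",
}
duplicates = find_duplicates(items)
```

Any group returned in `duplicates` is a candidate for deleting the older copy before the next retraining cycle.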


🧩 5. Multi-Agent and Squad Optimization

Since a single Knowledge Base can be shared across multiple Agents or an entire Squad, consider how each entity interacts with the same data context.

Guidelines

  • Centralize shared knowledge. Keep universal data (e.g., company policies) in one shared base.

  • Isolate specialized data. Create smaller, domain-specific bases (e.g., “Technical Docs”, “Sales FAQs”).

  • Avoid conflicting sources. If two bases contain similar topics, Agents may retrieve mixed or inconsistent results.

  • Audit link usage. Regularly check which Agents and Squads are linked to each base through the Knowledge Tool.
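
A link audit can be as simple as inverting the Agent-to-base mapping. The sketch below assumes you have exported that mapping by hand into a dictionary; the Agent and base names are invented, and no Zaia API is implied.

```python
from collections import defaultdict

# Hypothetical export of which bases each Agent is linked to. The names are
# made up for illustration and do not come from a Zaia API.
agent_links = {
    "support-agent": ["Company Policies", "Technical Docs"],
    "sales-agent": ["Company Policies", "Sales FAQs"],
}
all_bases = ["Company Policies", "Technical Docs", "Sales FAQs", "Old Catalog"]


def audit_links(links: dict[str, list[str]], bases: list[str]):
    """Return (base -> agents using it, bases that no agent uses)."""
    usage = defaultdict(list)
    for agent, linked in links.items():
        for base in linked:
            usage[base].append(agent)
    unused = [b for b in bases if b not in usage]
    return dict(usage), unused


usage, unused = audit_links(agent_links, all_bases)
```

Bases used by many Agents are the ones to keep free of conflicting content, and unused bases are candidates for archiving.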


⚙️ 6. Retrieval Optimization (for Developers)

For technical teams customizing Agents via the API or SDK, consider tuning these retrieval settings:

| Parameter | Description | Recommendation |
| --- | --- | --- |
| Top-k | Number of results returned per query | 3–5 for precision, 8–10 for broader recall |
| Similarity threshold | Minimum cosine similarity for match relevance | 0.75–0.85 for most business contexts |
| Context window | How much text is passed to the LLM | Keep under 4000 tokens for efficiency |
| Cache policy | Determines when embeddings are refreshed | Refresh after major updates only |

Tuning these parameters helps balance speed, accuracy, and token cost.
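
To make the interaction between top-k and the similarity threshold concrete, here is a small self-contained sketch. It uses toy 2-D vectors in place of real embeddings and hypothetical function names; it does not reflect the Zaia API or SDK.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, chunks, top_k=5, threshold=0.75):
    """Return up to `top_k` (score, text) pairs scoring at least `threshold`."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)[:top_k]


# Toy 2-D vectors standing in for real embeddings.
chunks = [
    ("Plans start at a monthly fee.", [1.0, 0.0]),
    ("Pricing depends on usage.", [0.9, 0.1]),
    ("Warranty lasts one year.", [0.0, 1.0]),
]
results = retrieve([1.0, 0.0], chunks, top_k=3, threshold=0.75)
```

The threshold removes the unrelated warranty chunk even though `top_k` would allow three results; lowering `top_k` to 1 would instead keep only the best match.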


🔐 7. Data Quality and Security

Zaia Endless processes all Knowledge Base content securely, ensuring that:

  • Files and texts are encrypted at rest and during transfer.

  • Only authorized workspace members can view or modify data.

  • Data used for embeddings is never shared or exposed to other tenants.

Still, follow these best practices:

  • Avoid uploading sensitive credentials or personally identifiable information (PII).

  • Sanitize internal notes before including them in Knowledge Bases used by customer-facing Agents.

  • Use versioned exports for compliance (e.g., ISO, GDPR, LGPD contexts).
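
A quick pre-upload scan can catch the most obvious problems. The patterns below are illustrative assumptions only; they cover email addresses and simple `key = value` secrets and are no substitute for a proper PII or secret-detection review.

```python
import re

# Illustrative patterns only. Real PII and secret detection needs a broader,
# locale-aware rule set.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "secret_assignment": re.compile(
        r"\b(?:api[_-]?key|password|token)\b\s*[:=]\s*\S+", re.IGNORECASE
    ),
}


def scan(text: str) -> list[tuple[str, str]]:
    """Return (pattern name, matched text) for every suspicious span found."""
    return [
        (name, m.group(0))
        for name, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]


findings = scan("Contact ana@example.com. Internal note: api_key = abc123")
```

An empty result does not prove the text is clean; it only means none of the listed patterns matched.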


🧭 8. Performance and Testing

After training or linking Knowledge Bases:

  1. Test queries directly through the Agent Playground or CRM Inbox.

  2. Ask questions that closely match the wording in the base, and others that express the same intent in different words.

  3. Review how the Agent retrieves and synthesizes context.

  4. If answers are incomplete or redundant, adjust the base content and retrain.

A well-structured base should produce confident, consistent, and concise answers with minimal hallucination.


🧩 9. Advanced Techniques (Optional)

For advanced teams:

  • Chunk tuning: Split documents into smaller logical pieces manually for greater control.

  • Metadata tagging: Prefix sections with tags (e.g., [PRICING], [ONBOARDING]) for scoped retrieval.

  • Hybrid models: Combine Knowledge Bases with workflow-based tools to pre-filter sources.

  • Evaluation metrics: Track retrieval accuracy, such as recall at k (R@k), and response satisfaction from real conversations.
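
For the retrieval metric mentioned above, R@k is commonly computed as the share of relevant chunks that appear among the top k results. The sketch below assumes you have a hand-labelled set of test queries with known relevant chunk ids; the ids shown are invented.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant chunk ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 0.0
    hits = relevant.intersection(retrieved[:k])
    return len(hits) / len(relevant)


def mean_recall_at_k(cases, k: int) -> float:
    """Average recall@k over (retrieved ids, relevant ids) test cases."""
    return sum(recall_at_k(r, rel, k) for r, rel in cases) / len(cases)


# Hypothetical labelled test cases: ranked results and the ids that should appear.
cases = [
    (["pricing-1", "warranty-2", "pricing-3"], {"pricing-1", "pricing-3"}),
    (["onboarding-1", "pricing-1", "warranty-1"], {"warranty-1"}),
]
score = mean_recall_at_k(cases, k=2)
```

Tracking this number before and after a content change or retraining gives a simple signal of whether retrieval improved.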


✅ Summary

| Goal | Action |
| --- | --- |
| Improve accuracy | Write structured, focused content |
| Support multilingual retrieval | Separate and label languages |
| Maintain freshness | Retrain regularly |
| Reduce conflicts | Centralize or modularize knowledge |
| Optimize performance | Tune retrieval parameters per use case |


🚀 Final Insight

A well-built Knowledge Base is not just a data repository — it’s a strategic foundation for scalable, intelligent, and explainable AI operations. In Zaia Endless, every great Agent starts with great knowledge — organized, optimized, and continuously improved.
