Rylvo Prompts: Version, Optimize, and Perfect Your AI Agent Instructions
If you have ever built an AI bot, you know the feeling. You write a system prompt, test it in a playground, think it is perfect, deploy it to production, and within days customers are complaining that the bot is too formal, too casual, too verbose, too terse, or just plain wrong. You tweak the prompt, redeploy, and the cycle repeats. Before long, you have twenty different versions of the prompt scattered across notes, emails, chat messages, and deployment scripts — and no idea which one is actually running in production right now.
This is the prompt management problem, and it is one of the most underrated challenges in AI development. A great prompt is the difference between a bot that delights customers and one that frustrates them. But great prompts do not emerge fully formed. They evolve through iteration, testing, measurement, and refinement. And to evolve effectively, you need a system that treats prompts as first-class software artifacts — versioned, tested, optimized, and governed with the same rigor as your application code.
Rylvo Prompts is that system. It is a full-featured prompt management platform built directly into Rylvo, designed for teams who take prompt engineering seriously. Every prompt is versioned with immutable snapshots. Every edit creates a new version you can inspect, compare, promote, or roll back. Curated templates give you a head start for common use cases. Placeholder variables make prompts reusable across bots and contexts. Parent-child inheritance lets you build modular prompt hierarchies. Self-improvement optimization automatically generates and evaluates better variants using real performance data. A/B testing lets you compare versions side by side. Analytics show you exactly how each version performs. And everything integrates natively with the bot runtime, so the prompt your bot uses is always the one you intended.
In this guide, we will walk through the complete Rylvo Prompts system — from creating your first prompt to running automated optimization to analyzing performance trends. Whether you are a solo developer building your first bot or a platform team managing dozens of agents, this is everything you need to know.
What Is the Rylvo Prompts System?
At its core, the Rylvo Prompts system is a prompt lifecycle management platform. It covers the full journey of a prompt from creation to retirement:
Creation — Start from a blank slate or choose from curated templates tailored to common bot roles. Define placeholder variables, assign the prompt to a bot or agent group, set a parent prompt for inheritance, and tag it for organization.
Versioning — Every content edit creates an immutable version snapshot. Versions include the full prompt text, placeholder definitions, a change note, performance metrics, and the source of the change (manual edit, auto-optimization, or AI architect).
Editing — Update prompt content and placeholders through an inline editor. Save creates a new version automatically. No more guessing whether you are editing the production prompt or a draft.
Promotion — Any version can be promoted to production with a single click. The previous version remains intact, so you can roll back instantly if the new version underperforms.
Optimization — Enable self-improvement and the system automatically generates candidate variants, scores them against real chat traces, and promotes the winner if it beats the baseline by your configured threshold.
Testing — Run A/B tests between two prompt versions to see which performs better with real traffic or simulated conversations.
Analytics — Track per-prompt performance metrics including sample size, success rate, verification pass rate, and latency trends over time.
Governance — Archive old prompts, restrict editing with allowed-editor lists, and audit every change through the audit log.
All of this lives in a single dashboard at /dashboard/prompts with a clean list view and a rich detail page for every prompt.
The Eight Agent Categories: Prompts with Purpose
Every prompt in Rylvo is tagged with an agent category. This is not just a label — it determines how the bot runtime assembles the prompt into the final system prompt. Think of categories as roles in a play. Each actor has a specific job, and the director knows which lines to give them first.
Response Composer is the star of the show. It carries the highest runtime priority and is injected as the primary instructions section. This is where you define the bot's personality, tone, task instructions, and overall behavior. Every bot should have exactly one response composer prompt, and it is the most important prompt you will write.
Stage Classifier handles workflow routing. It tells the bot how to classify the current stage of a conversation — greeting, information gathering, problem resolution, closing — and what to do at each stage. This is essential for bots with multi-step workflows.
Action Selector governs tool-calling decisions. It tells the bot when to use tools, which tools to consider, and how to interpret tool results. If your bot integrates with external APIs or databases, the action selector prompt controls those decisions.
Escalation Classifier decides when to hand off to a human. It defines the conditions, keywords, and contexts that trigger escalation. A good escalation classifier prevents the bot from overreaching while ensuring real emergencies reach human agents quickly.
Session Summarizer provides summarization guidelines. When the bot needs to summarize a conversation, ticket, or meeting transcript, this prompt controls the structure, length, and style of the summary.
Verifier enforces output validation rules. It tells the bot what to check before delivering a response — policy compliance, fact accuracy, tone consistency, format requirements. The verifier prompt is your quality gate.
Retrieval controls knowledge base behavior. It tells the bot how to retrieve information, how to cite sources, what to do when no relevant information is found, and how to format retrieved content into responses.
Custom is the catch-all for domain-specific instructions that do not fit the standard categories. Use this for experimental prompts, specialized workflows, or niche use cases.
Each category has a color-coded badge, a description, and a curated starter template. When you create a prompt, selecting a category automatically fills in a template appropriate for that role. This accelerates setup and ensures category-appropriate structure.
Curated Templates: Start with Best Practices
Writing a great prompt from scratch is hard. Rylvo includes eight curated templates covering the most common bot use cases, each designed by prompt engineering experts and refined for real-world deployment.
Support Triage is a stage classifier template for customer support bots. It classifies incoming messages by urgency and routes them to the right team. The template includes rules for one-word responses, no follow-up questions, and context-aware classification.
Sales Qualifier is an action selector template for lead qualification bots. It gathers company size, pain points, budget range, and timeline through consultative questioning. The template includes scoring logic and next-step recommendations.
Technical Debugger is a response composer template for technical support bots. It structures troubleshooting into empathy, clarifying questions, suggested fixes, and escalation paths. The template includes constraints against guessing and requirements for exact step-by-step instructions.
Compliance Checker is a verifier template for content moderation and policy enforcement bots. It reviews messages for PII exposure, financial data leakage, off-topic content, and abusive language, returning structured JSON with pass/fail verdicts.
Onboarding Guide is a response composer template for new user onboarding bots. It structures the first five minutes into greeting, goal confirmation, first-step guidance, celebration, and next-step offers. The template targets an eighth-grade reading level and includes constraints against overwhelming the user.
Escalation Router is an escalation classifier template for support bots. It defines explicit escalation triggers — human requests, frustration signals, high-severity issues, repeated failures, and legal mentions — with a binary output format.
Knowledge Base Retrieval is a retrieval template for FAQ and documentation bots. It structures answers into direct response, detailed explanation, source citation, and related topics. The template includes a hard constraint against hallucinating when no relevant information is found.
Session Summarizer is a session summarizer template for support and sales bots. It produces structured summaries with issue description, resolution status, action items, sentiment, and tags. The template enforces objectivity and language consistency.
Each template includes placeholder variables for company name, product name, tone, and other configurable parameters. You fill in the placeholders and the template adapts to your brand and context. Templates are tagged by difficulty — beginner, intermediate, or advanced — so you can choose the right complexity level for your team.
Placeholder Variables: Reusable, Configurable Prompts
Static prompts are brittle. Every time your company name, product name, or tone changes, you have to hunt through every prompt and update it manually. Rylvo's placeholder system solves this by making prompts dynamic.
Placeholders use the {{key}} syntax. You declare them in the placeholder editor, where you specify the key name, a human-readable label, a description, whether the placeholder is required or optional, and a default value. At runtime, the bot runtime substitutes known placeholders with actual values — bot metadata, user context, or configured defaults.
For example, a support prompt might include {{company_name}} and {{product_name}}. Instead of hardcoding "Acme Corp" and "Acme Platform" into every prompt, you set these once in the placeholder editor. If your company rebrands, you change the default value in one place and every prompt updates automatically.
The placeholder editor checks the declared placeholders against the prompt content and warns about any that are declared but never used. Required placeholders are flagged with an asterisk, so you never accidentally deploy a prompt with missing critical variables.
Placeholders also enable runtime personalization. A bot can substitute {{user_name}} with the actual user's name, making responses feel personal and engaging. A sales bot can substitute {{tone}} with "consultative and warm" or "direct and professional" depending on the customer segment.
The live preview panel shows the fully substituted prompt before you save it, so you can verify that placeholders resolve correctly and the resulting text reads naturally.
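To make the substitution rules concrete, here is a minimal Python sketch of how {{key}} placeholders with required flags and default values could be resolved. It is not Rylvo's implementation: the Placeholder class, the render and check_declarations helpers, and the identifier-style key format are all assumptions made for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Optional

# Matches {{key}}; the identifier-style key format is an assumption.
PLACEHOLDER_RE = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


@dataclass(frozen=True)
class Placeholder:  # hypothetical shape of a declared placeholder
    key: str
    label: str
    required: bool = False
    default: Optional[str] = None


def check_declarations(content: str, declared: list[Placeholder]) -> dict[str, set[str]]:
    """Compare the keys used in the content with the declared keys."""
    used = set(PLACEHOLDER_RE.findall(content))
    keys = {p.key for p in declared}
    return {"undeclared": used - keys, "unused": keys - used}


def render(content: str, declared: list[Placeholder], values: dict[str, str]) -> str:
    """Substitute declared placeholders, falling back to their defaults."""
    resolved: dict[str, str] = {}
    for p in declared:
        if p.key in values:
            resolved[p.key] = values[p.key]
        elif p.default is not None:
            resolved[p.key] = p.default
        elif p.required:
            raise ValueError(f"missing required placeholder: {p.key}")
    # Keys with no resolved value are left in place rather than blanked out.
    return PLACEHOLDER_RE.sub(lambda m: resolved.get(m.group(1), m.group(0)), content)


declared = [
    Placeholder("company_name", "Company name", required=True),
    Placeholder("tone", "Tone", default="friendly"),
]
template = "You are the assistant for {{company_name}}. Keep a {{tone}} tone."
print(render(template, declared, {"company_name": "Acme Corp"}))
```

Failing loudly on a missing required value, instead of sending a literal {{company_name}} to the model, is one reasonable design choice. A real runtime might prefer to log a warning and fall back.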
Parent-Child Inheritance: Build Modular Prompt Hierarchies
As your bot ecosystem grows, you will find that many bots share common instructions — brand voice, tone guidelines, policy constraints, safety rules. Copying these into every prompt is error-prone and hard to maintain. Rylvo's parent-child inheritance solves this.
Any prompt can declare a parentPromptId, linking it to another prompt as its parent. At runtime, the child prompt can reference {{parent}}, which resolves to the parent's current content. This means the child prompt includes the parent's text inline, as if it were part of the child's own content.
The typical pattern is to create a unified master prompt with shared rules — "Always be polite. Never promise refunds. Use simple language." This prompt is assigned as the parent of all specialist prompts. Each specialist prompt then focuses on its domain-specific instructions while inheriting the shared rules from the master.
For example, a customer support team might have:
- Master Prompt — brand voice, tone, safety rules, escalation policy
- Billing Specialist Prompt — child of master, with billing-specific instructions
- Technical Specialist Prompt — child of master, with technical troubleshooting logic
- Sales Specialist Prompt — child of master, with lead qualification guidelines
If the brand voice changes, you update the master prompt once. All children inherit the change automatically. This is the DRY principle applied to prompt engineering.
The dependency graph tab on the prompt detail page visualizes the parent-child relationships, group sharing links, and bot assignments as a directed acyclic graph. You can see at a glance which prompts depend on which, making it easy to understand the impact of changing a parent prompt.
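The inheritance mechanism described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the platform's code: the Prompt class and resolve function are hypothetical, parent_prompt_id mirrors the parentPromptId field, and the choice to resolve {{parent}} to an empty string when no parent is set is a guess.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

PARENT_TOKEN = "{{parent}}"


@dataclass
class Prompt:  # hypothetical, simplified prompt record
    id: str
    content: str
    parent_prompt_id: Optional[str] = None  # mirrors parentPromptId


def resolve(prompt_id: str, prompts: dict[str, Prompt], _seen: tuple[str, ...] = ()) -> str:
    """Return the prompt content with {{parent}} expanded recursively."""
    if prompt_id in _seen:
        raise ValueError("inheritance cycle: " + " -> ".join(_seen + (prompt_id,)))
    prompt = prompts[prompt_id]
    if PARENT_TOKEN not in prompt.content:
        return prompt.content
    if prompt.parent_prompt_id is None:
        # Assumption: a {{parent}} token without a parent resolves to nothing.
        parent_text = ""
    else:
        parent_text = resolve(prompt.parent_prompt_id, prompts, _seen + (prompt_id,))
    return prompt.content.replace(PARENT_TOKEN, parent_text)


prompts = {
    "master": Prompt("master", "Always be polite. Never promise refunds."),
    "billing": Prompt("billing", "{{parent}}\nHandle billing questions.", "master"),
}
print(resolve("billing", prompts))
```

Because the parent is looked up at resolve time, editing the master's content changes what every child resolves to on the next call, which is the behavior the section describes. The cycle check is there because a parent chain that loops back on itself would otherwise recurse forever.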
Shared Prompts and Agent Groups
Rylvo supports multi-agent groups where multiple specialist bots collaborate under an orchestrator. In these setups, shared prompts ensure consistency across all agents in the group.
A prompt with sharedAcrossGroup: true and an agentGroupId is available to every bot in that group. This is the standard pattern for the unified master prompt — one shared prompt with common rules, inherited by every specialist in the group.
You can also scope a prompt to a single bot with botId, making it private to that bot. Or you can leave both botId and agentGroupId empty, making the prompt organization-wide and available to any bot that loads it.
This three-level scoping — org-wide, group-shared, or bot-private — gives you precise control over prompt visibility and reuse.
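As a rough illustration of the three scoping levels, the Python sketch below decides whether a prompt is visible to a given bot. The PromptScope class and both functions are hypothetical. The snake_case fields mirror botId, agentGroupId, and sharedAcrossGroup. The handling of a prompt that has a group but is not shared is an assumption, since the article does not cover that case.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PromptScope:  # hypothetical record holding only the scoping fields
    name: str
    bot_id: Optional[str] = None          # mirrors botId
    agent_group_id: Optional[str] = None  # mirrors agentGroupId
    shared_across_group: bool = False     # mirrors sharedAcrossGroup


def scope_level(p: PromptScope) -> str:
    if p.bot_id is not None:
        return "bot-private"  # assumption: a bot assignment wins over a group
    if p.agent_group_id is None:
        return "org-wide"
    if p.shared_across_group:
        return "group-shared"
    return "unshared-group"  # not described in the article; treated as hidden


def visible_to(p: PromptScope, bot_id: str, bot_group_id: Optional[str]) -> bool:
    level = scope_level(p)
    if level == "bot-private":
        return p.bot_id == bot_id
    if level == "group-shared":
        return p.agent_group_id == bot_group_id
    return level == "org-wide"


master = PromptScope("Master", agent_group_id="support", shared_across_group=True)
print(scope_level(master), visible_to(master, "billing-bot", "support"))
```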
Full Version History: Never Lose a Good Prompt
Every content edit in Rylvo creates a new immutable version. This is not a simple save-overwrite — it is a complete snapshot of the prompt at that point in time, stored as a separate document with full metadata.
Each version includes:
- A sequential version number (1, 2, 3, and so on)
- The full prompt content
- The placeholder definitions
- A change note describing what changed
- The source of the change — manual edit, auto-optimization, or AI architect
- Performance metrics — sample size, success rate, verification pass rate, average latency
- The timestamp when this version was promoted to production, if ever
- The identity of the actor who created the version
Versions are never deleted or overwritten. Even if you roll back to version 3 after promoting version 5, version 5 remains in the history. You can always promote it again later.
The version list shows all versions with their number, source badge, change note, metrics, and promotion status. Click any version to see its full content. Click "Promote" to make it the live version instantly. The bot will use the promoted version on its next conversation.
The diff viewer compares any two versions side by side, highlighting added and removed text, placeholder changes, and structural differences. This makes reviewing a prompt change as straightforward as reviewing a code change.
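The versioning model is easy to picture as an append-only list plus a pointer to the live version. The Python sketch below shows that idea, with promotion, rollback, and a text diff built on the standard difflib module. The PromptVersion and PromptHistory classes are hypothetical. Only the "auto_optimized" source label comes from this article, and the other labels are assumed.

```python
from __future__ import annotations

import difflib
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)  # frozen: a saved version cannot be modified
class PromptVersion:
    number: int
    content: str
    change_note: str
    source: str  # "auto_optimized" appears in the article; other labels are assumed
    created_by: str


@dataclass
class PromptHistory:
    versions: list[PromptVersion] = field(default_factory=list)
    live_number: Optional[int] = None

    def save(self, content: str, change_note: str,
             source: str = "manual", created_by: str = "unknown") -> PromptVersion:
        """Append a new snapshot; earlier versions are never touched."""
        version = PromptVersion(len(self.versions) + 1, content, change_note, source, created_by)
        self.versions.append(version)
        return version

    def promote(self, number: int) -> None:
        """Point the live marker at any existing version (rollback is the same call)."""
        if not 1 <= number <= len(self.versions):
            raise ValueError(f"no such version: {number}")
        self.live_number = number

    def live_content(self) -> Optional[str]:
        if self.live_number is None:
            return None
        return self.versions[self.live_number - 1].content

    def diff(self, old: int, new: int) -> list[str]:
        """Unified diff between two versions, line by line."""
        a = self.versions[old - 1].content.splitlines()
        b = self.versions[new - 1].content.splitlines()
        return list(difflib.unified_diff(a, b, f"v{old}", f"v{new}", lineterm=""))


history = PromptHistory()
history.save("Be polite.\nBe brief.", "initial version")
history.save("Be polite.\nBe thorough.", "allow longer answers")
history.promote(2)
history.promote(1)  # rolling back does not remove version 2
print("\n".join(history.diff(1, 2)))
```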
Self-Improvement Optimization: Let AI Improve Your Prompts
Manually iterating prompts is time-consuming and subjective. You think a change makes the prompt better, but without data, you are just guessing. Rylvo's self-improvement optimization replaces guesswork with evidence.
When you enable optimization on a prompt, the system periodically loads recent chat traces as evaluation samples, generates candidate prompt variants using proven optimization strategies, scores each variant against the samples, and promotes the winner if it beats the baseline by your configured improvement threshold.
Optimization Strategies
Rylvo supports four optimization strategies, each with different strengths:
OPRO (Optimization by Prompting) generates new prompt variants by asking an LLM to improve the current prompt based on observed failures. It is effective for fixing specific issues like tone problems, missing constraints, or unclear instructions.
APE (Automatic Prompt Engineer) uses few-shot prompting to generate high-quality instruction variants. It is particularly effective for classification and decision-making prompts where small wording changes have large impact.
DSPy Bootstrap uses the DSPy framework's bootstrapping optimizers (formerly called teleprompters) to tune prompt structure and examples. It is ideal for complex prompts with multiple components that need balanced weighting.
PromptBreeder evolves prompts through genetic mutation and crossover, testing hundreds of variants and selecting the fittest. It is the most thorough strategy, suitable for prompts where you want exhaustive exploration of the optimization space.
How Optimization Works
The optimization pipeline follows a clear sequence:
- The scheduler triggers based on your configured frequency — hourly, daily, or weekly.
- The system loads the current prompt, its version history, and recent chat traces as evaluation samples.
- The selected strategy generates candidate variants — typically 5 to 20 different versions of the prompt with varied wording, structure, or constraints.
- Each variant is scored against the evaluation samples using LLM-as-a-judge or signal-based scoring.
- The highest-scoring variant is compared to the current baseline.
- If the winner beats the baseline by your improvement threshold (for example, 10% better success rate), a new version is created with source "auto_optimized."
- If auto-promote is enabled, the new version becomes the live version immediately.
- If auto-promote is disabled, the new version appears in the version list for your manual review.
- Performance metrics are updated, and the optimization run is recorded in the run history.
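The scoring and threshold steps in the sequence above reduce to a small comparison. Here is a hedged Python sketch of that core decision. The run_once function, the RunResult class, and the toy keyword_scorer are all hypothetical. The threshold is treated as an absolute difference in mean score, which is one possible reading of "beats the baseline by your configured threshold". A real scorer would be an LLM judge or a signal-based metric, not a keyword check.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Scorer = Callable[[str, str], float]  # (prompt_text, sample) -> score in [0, 1]


@dataclass(frozen=True)
class RunResult:
    baseline_score: float
    best_score: float
    winner: Optional[str]  # None when no candidate clears the threshold


def mean_score(prompt: str, samples: Sequence[str], score: Scorer) -> float:
    return sum(score(prompt, s) for s in samples) / len(samples)


def run_once(baseline: str, candidates: Sequence[str], samples: Sequence[str],
             score: Scorer, threshold: float) -> RunResult:
    """Score every candidate and keep the best one only if it clears the threshold."""
    if not samples or not candidates:
        raise ValueError("need at least one sample and one candidate")
    baseline_score = mean_score(baseline, samples, score)
    scored = [(mean_score(c, samples, score), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    # Assumption: threshold is an absolute difference in mean score.
    improved = best_score - baseline_score >= threshold
    return RunResult(baseline_score, best_score, best if improved else None)


def keyword_scorer(prompt: str, sample: str) -> float:
    """Toy stand-in for a judge: does the prompt mention the sampled concern?"""
    return 1.0 if sample in prompt.lower() else 0.0


samples = ["refund", "polite", "cite", "brief"]
baseline = "Be polite and brief."
candidates = ["Be polite and brief. Always cite sources.", "Be nice."]
print(run_once(baseline, candidates, samples, keyword_scorer, threshold=0.1))
```

Returning no winner when the margin is too small keeps noisy, marginal gains from creating new versions on every run.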
You can also trigger optimization manually by clicking "Run Now" on the optimization panel. This is useful when you want immediate feedback after a significant change or when you are setting up a new prompt.
The optimization panel lets you configure:
- Enabled toggle — turn optimization on or off
- Strategy selector — choose OPRO, APE, DSPy, or PromptBreeder
- Evaluation metric — success rate, verification pass rate, latency, or custom
- Improvement threshold — the minimum score improvement required to promote a variant
- Auto-promote — whether winning variants go live automatically or await approval
- Run frequency — how often the scheduler triggers
The run history panel shows every optimization run with its strategy, sample size, baseline score, best variant score, whether an improvement was found, and the resulting version number.
A/B Testing: Compare Versions with Real Data
Sometimes you have two prompt versions that both seem good, and you need real-world data to decide which is better. Rylvo's A/B testing framework lets you run controlled experiments between prompt versions.
Create an A/B test from the Versions tab, selecting two versions to compare. Configure the traffic split, evaluation metric, and minimum sample size. The system routes conversations to each version according to the split, collects performance data, and reports the results.
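One common way to implement a traffic split is deterministic hashing, so that a given conversation always lands on the same version. The Python sketch below shows that approach. It is an assumption about how such routing could work, not a description of Rylvo's router, and the assign_variant function and its parameters are hypothetical.

```python
import hashlib


def assign_variant(conversation_id: str, test_id: str, share_a: float) -> str:
    """Return "A" or "B" for a conversation, stable across calls.

    share_a is the fraction of traffic for version A, between 0.0 and 1.0.
    """
    if not 0.0 <= share_a <= 1.0:
        raise ValueError("share_a must be between 0 and 1")
    digest = hashlib.sha256(f"{test_id}:{conversation_id}".encode("utf-8")).digest()
    # Map the hash to one of 10,000 buckets and compare in integers.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return "A" if bucket < round(share_a * 10_000) else "B"


print(assign_variant("conv-123", "tone-test", 0.5))
```

Including the test identifier in the hash means the same conversation can fall into different arms in different experiments, which avoids correlated assignments across tests.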
When a test completes, you see a statistical comparison — success rate difference, confidence interval, p-value, and a recommendation. If one version wins decisively, you can promote it directly from the test results. If the test is inconclusive, you can extend it with more samples or try different versions.
A/B testing is the gold standard for prompt optimization because it measures performance in the only environment that matters: production conversations with real users.
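For readers who want to see what sits behind a success rate difference, a confidence interval, and a p-value, here is a minimal Python sketch of a two-proportion z-test. It is a textbook method chosen for illustration. The article does not say which statistical test Rylvo uses, and the compare_success_rates function is hypothetical.

```python
import math


def compare_success_rates(successes_a: int, n_a: int, successes_b: int, n_b: int,
                          z_crit: float = 1.96) -> dict:
    """Two-proportion z-test for B versus A, plus an interval for the difference.

    z_crit=1.96 corresponds to an approximate 95% confidence interval.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both arms need at least one conversation")
    p_a, p_b = successes_a / n_a, successes_b / n_b
    diff = p_b - p_a

    # Pooled standard error for the test of "no difference".
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled if se_pooled > 0 else 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided

    # Unpooled standard error for the interval around the difference.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return {
        "difference": diff,
        "z": z,
        "p_value": p_value,
        "ci": (diff - z_crit * se, diff + z_crit * se),
    }


result = compare_success_rates(700, 1000, 750, 1000)
print(result)
```

The normal approximation behind this test is only reasonable with enough conversations in each arm, which is one reason a minimum sample size setting matters.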
Analytics: Understand How Your Prompts Perform
The Analytics tab shows per-prompt performance over time. Key metrics include:
- Sample size — how many conversations used this prompt
- Success rate — percentage of conversations that achieved their goal
- Verification pass rate — percentage of responses that passed quality verification
- Average latency — response time trend across versions
- Version performance comparison — how each version scored on the same metrics
These metrics help you correlate prompt changes with performance changes. If you updated the response composer on Monday and the success rate dropped on Tuesday, the analytics make the connection obvious. If an auto-optimized version improved verification pass rate by 15%, the analytics confirm the improvement with hard numbers.
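The per-version metrics listed above are simple aggregates over chat traces. The Python sketch below shows one way to compute them. The Trace fields and the summarize function are hypothetical and much simpler than a real trace record would be.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Trace:  # hypothetical, reduced to the fields the metrics need
    version: int
    succeeded: bool
    verified: bool
    latency_ms: float


def summarize(traces: Iterable[Trace]) -> dict[int, dict[str, float]]:
    """Group traces by prompt version and compute the headline metrics."""
    grouped: dict[int, list[Trace]] = defaultdict(list)
    for t in traces:
        grouped[t.version].append(t)

    summary: dict[int, dict[str, float]] = {}
    for version, items in sorted(grouped.items()):
        n = len(items)
        summary[version] = {
            "sample_size": n,
            "success_rate": sum(t.succeeded for t in items) / n,
            "verification_pass_rate": sum(t.verified for t in items) / n,
            "avg_latency_ms": sum(t.latency_ms for t in items) / n,
        }
    return summary


traces = [
    Trace(version=1, succeeded=True, verified=True, latency_ms=100.0),
    Trace(version=1, succeeded=False, verified=True, latency_ms=300.0),
    Trace(version=2, succeeded=True, verified=False, latency_ms=200.0),
]
for version, metrics in summarize(traces).items():
    print(version, metrics)
```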
How Prompts Integrate with Bot Runtime
Prompts are not standalone documents — they are the foundation of every bot conversation. Understanding how they integrate with the runtime helps you write better prompts.
When a user opens a bot chat, the runtime executes this sequence:
- Load Bot Context: The system fetches the bot document, all active prompts linked to this bot, active guardrails, active connectors, and active MCP servers in parallel.
- Assemble System Prompt: The buildSystemPrompt function composes the final system prompt in a specific order:
  - Bot identity line with name and role
  - Primary instructions section containing the response_composer prompt
  - Additional agent instructions section containing all other category prompts, prefixed with their category name
  - Guardrail behavioral rules
  - Connector awareness block
  - MCP tool specifications
- Layer Runtime Augmentations: On each turn, the system dynamically appends:
  - Learned rules from the Self-Evolving Agent
  - Matched skills from the Skills Engine
  - Episodic recall from similar past conversations
  - User context from the per-conversation user model
  - Retrieved knowledge base chunks
  - Mission Control operator whisper instructions with highest priority
- LLM Call: The assembled prompt is sent to the configured LLM.
- Output Evaluation: The LLM response is evaluated by output guardrails before delivery.
- Persistence: The conversation is saved to Firestore, and a chat trace is created with a snapshot of which prompts were used.
This architecture ensures that prompts are the stable foundation while runtime augmentations provide dynamic context. Changing a prompt affects the bot's core behavior. Changing a skill or knowledge base connection affects the bot's capabilities without modifying the prompt.
The key rule is that the response_composer prompt is the only one elevated to primary status. All other prompts are additive context. This prevents prompt competition — a common problem where multiple prompts contradict each other and the bot becomes confused about which instruction to follow.
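To show how the assembly order and the primary-versus-additive rule fit together, here is a Python sketch of a function in the spirit of buildSystemPrompt. It is not the real function. The name build_system_prompt_sketch, the section headings, the identity line wording, and every category identifier other than response_composer are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CategoryPrompt:
    category: str  # "response_composer" is from the article; others are assumed
    content: str


def build_system_prompt_sketch(
    bot_name: str,
    bot_role: str,
    prompts: Sequence[CategoryPrompt],
    guardrail_rules: Sequence[str] = (),
    connectors: Sequence[str] = (),
    mcp_tools: Sequence[str] = (),
) -> str:
    """Compose sections in the order the article lists; headings are illustrative."""
    sections = [f"You are {bot_name}, {bot_role}."]

    # Only the response composer is elevated to the primary section.
    primary = [p for p in prompts if p.category == "response_composer"]
    if primary:
        sections.append("PRIMARY INSTRUCTIONS\n" + primary[0].content)

    # Every other category is additive and prefixed with its category name.
    others = [p for p in prompts if p.category != "response_composer"]
    if others:
        body = "\n".join(f"[{p.category}] {p.content}" for p in others)
        sections.append("ADDITIONAL AGENT INSTRUCTIONS\n" + body)

    if guardrail_rules:
        sections.append("GUARDRAILS\n" + "\n".join(f"- {r}" for r in guardrail_rules))
    if connectors:
        sections.append("CONNECTORS\n" + ", ".join(connectors))
    if mcp_tools:
        sections.append("MCP TOOLS\n" + "\n".join(mcp_tools))
    return "\n\n".join(sections)


system_prompt = build_system_prompt_sketch(
    "Ava",
    "a customer support assistant",
    [
        CategoryPrompt("verifier", "Check every answer against the refund policy."),
        CategoryPrompt("response_composer", "Be warm, concise, and accurate."),
    ],
    guardrail_rules=["Never share account numbers."],
)
print(system_prompt)
```

Note that the response composer lands in the primary section regardless of the order in which prompts are passed in, which is what keeps the other categories from competing with it.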
Creating Your First Prompt: A Step-by-Step Guide
Step 1: Open the Prompts Dashboard
Navigate to /dashboard/prompts. You see the prompt library with cards showing all your prompts, their version counts, optimization status, improvement counts, and metadata.
Step 2: Click "Create New Prompt"
The create modal opens. Start by selecting an agent category. The category auto-fills a starter template. For your first prompt, choose "Response Composer" — this is the main personality and instruction prompt.
Step 3: Fill in the Basics
Give your prompt a clear name like "Customer Support Personality." Add a brief description. The name and description are for your team's reference and appear in the prompt list and detail pages.
Step 4: Customize the Content
The content editor shows the template text. Customize it for your brand and use case. Replace placeholder values like {{company_name}} with your actual company name, or leave them as placeholders and configure the values in the placeholder editor.
Step 5: Define Placeholders
Open the placeholder editor. Review the placeholders from the template. Add, remove, or modify them. Set required placeholders for values the bot cannot function without. Set default values for optional placeholders.
Step 6: Assign to a Bot
Select the bot that will use this prompt from the dropdown. If you want the prompt available to multiple bots, leave the bot assignment empty and optionally set an agent group with shared access.
Step 7: Set a Parent Prompt (Optional)
If you have a master prompt with shared rules, select it as the parent. The child prompt will include the parent's content at runtime via {{parent}}.
Step 8: Save
Click "Create Prompt." The system creates the prompt document, the first version snapshot, and links it to the selected bot. The bot will use this prompt on its next conversation.
Step 9: Test in the Chat Playground
Go to /dashboard/bots/{botId}/chat and test the bot. The resource summary shows which prompts are loaded. If something feels off, go back to the prompt, edit the content, and save a new version. The cycle of edit, save, test, repeat is now as fast as writing code.
Real-World Use Cases
Solo Developer: From Blank Page to Working Bot in 30 Minutes
Alex wants to build a FAQ bot for a SaaS product. They open the Prompts dashboard, select the "Knowledge Base Retrieval" template, fill in the company name and max length placeholders, assign it to their bot, and save. The bot immediately uses the prompt. Alex tests in the chat playground, notices the responses are too long, reduces the max_length placeholder from 150 to 100, saves a new version, and tests again. Total time: 25 minutes. The bot is live with a well-structured, evidence-citing prompt.
QA Team: Version-Controlled Prompt Reviews
A mid-sized company has five people editing prompts. Before Rylvo, they used a shared Google Doc with no version control, and multiple people overwrote each other's changes. With Rylvo Prompts, every edit creates a version with the editor's identity and change note. The QA lead reviews the version diff every morning, approving good changes and rolling back problematic ones. The team has not had a single production regression caused by a prompt conflict since adopting versioned prompts.
Platform Team: Auto-Optimization at Scale
A large enterprise runs 30 bots across customer support, sales, technical documentation, and internal IT. The platform team enables self-improvement on all response_composer prompts with daily optimization runs. Over three months, the system auto-generates and evaluates 1,200 prompt variants, promoting 47 improvements across the fleet. Average verification pass rate improves from 71% to 89%. The platform team spends their time on strategic prompt architecture instead of manual wording tweaks.
Comparison: Rylvo Prompts vs. Basic Prompt Management
| Capability | Basic Prompt Storage | Rylvo Prompts |
|---|---|---|
| Versioning | None or manual copies | Immutable versions with full metadata |
| Rollback | Copy-paste from memory | One-click promotion of any version |
| Diff viewing | Manual comparison | Side-by-side diff with highlighted changes |
| Auto-optimization | None | OPRO, APE, DSPy, PromptBreeder strategies |
| A/B testing | Manual bot cloning | Built-in traffic split and statistical comparison |
| Templates | Copy from documentation | 8 curated templates with placeholders |
| Inheritance | Copy-paste duplication | Parent-child with runtime resolution |
| Placeholders | String replacement in code | Declared variables with validation and defaults |
| Analytics | Guesswork | Per-prompt success rate, latency, verification trends |
| Bot integration | Manual paste into bot config | Automatic assembly into system prompt |
| Audit trail | None | Full audit log with actor and timestamp |
| Group sharing | Manual distribution | sharedAcrossGroup flag with agent groups |
FAQ
What is a prompt in Rylvo? A prompt is a reusable instruction document that defines part of a bot's behavior. Prompts are categorized by agent role, versioned with immutable snapshots, and assembled into the bot's system prompt at runtime.
How many prompts can a bot have? A bot can have multiple prompts, typically one per category. The response_composer is the primary prompt. Additional prompts in other categories provide supplementary instructions. There is no hard limit on the total, and the runtime assembles them in priority order.
What happens when I edit a prompt? Saving the content creates a new version snapshot. The new version is automatically promoted to production unless you have disabled auto-promotion. The old version remains in the version history.
Can I undo a prompt change? Yes. Go to the Versions tab, find the previous version, and click "Promote." The bot will use the previous version immediately. No data is lost — versions are never deleted.
What are placeholders? Placeholders are dynamic variables using the {{key}} syntax. They let you reuse the same prompt across different bots or contexts by substituting different values at runtime. You declare placeholders in the editor with labels, descriptions, and default values.
How does parent-child inheritance work? A child prompt links to a parent prompt via parentPromptId. At runtime, {{parent}} in the child resolves to the parent's current content. This lets you maintain shared rules in one master prompt and domain-specific instructions in child prompts.
What is self-improvement optimization? When enabled, the system automatically generates prompt variants, scores them against real chat traces, and promotes the winner if it beats the baseline. Strategies include OPRO, APE, DSPy Bootstrap, and PromptBreeder.
Is auto-promotion safe? Auto-promotion promotes winning variants automatically. If you prefer manual review, disable auto-promotion. Winning variants will appear as new versions for your approval.
Can I run A/B tests? Yes. The Versions tab includes A/B testing. Select two versions, configure the traffic split and evaluation metric, and run the test. Results include statistical comparison with confidence intervals.
How do curated templates work? When creating a prompt, selecting an agent category auto-fills a starter template appropriate for that role. You customize the template content and placeholders for your specific use case.
What plans include prompt optimization? Prompt creation and versioning are available on all plans. Self-improvement optimization and A/B testing are available on Team and higher plans. Check your plan details in Settings.
Do prompts work with multi-agent groups? Yes. Prompts can be shared across an agent group, inherited via parent-child links, or scoped to individual bots. The dependency graph visualizes all relationships.
Ready to Build Better Prompts?
Prompts are the most important part of your AI bot. A well-written, well-versioned, well-optimized prompt is the difference between a bot that customers love and one they abandon. Rylvo Prompts gives you the complete toolkit — templates to get started, placeholders for flexibility, inheritance for maintainability, versioning for safety, optimization for improvement, A/B testing for validation, and analytics for insight.
Open the Prompts Dashboard today and create your first prompt. In thirty minutes, you will have a versioned, bot-integrated prompt that is ready to evolve.
Build better bots, one prompt at a time.
