Documentation
What is Trimpt?
Trimpt compresses AI agent system prompts by ~46% while guaranteeing quality with an AI Judge. Your system prompt is sent with every single API call — if you make 1,000 calls/day with a 300-word prompt, that's 300,000 words/day of repeated tokens you're paying for.
Trimpt uses evolutionary DNA compression with 30 learned genes and 6 mutation strategies, then verifies quality with an AI Judge that scores each result from 1 to 10 and requires a minimum of 7/10 to approve.
How It Works
Trimpt works like evolution in nature: multiple compression strategies compete, and the best ones survive and improve. An AI quality check scores every result, and anything below 7/10 is rejected — if quality drops below the threshold, the original prompt is kept unchanged.
How you save money
Your system prompt goes from 200 words to 108 words. Fewer words = fewer tokens = lower API bill. Just copy the compressed prompt into your code. Every API call is now 46% cheaper on input tokens.
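The arithmetic behind that claim can be sketched in a few lines. The tokens-per-word ratio and the input price below are illustrative assumptions for this example, not Trimpt or provider figures:

```python
# Back-of-the-envelope savings for the 200-word → 108-word example above.
# Assumptions (not from Trimpt): ~1.33 tokens per English word and an
# illustrative input price of $3.00 per million tokens.
TOKENS_PER_WORD = 1.33
PRICE_PER_MTOK = 3.00  # USD per 1M input tokens (example price)

def monthly_prompt_cost(words: int, calls_per_day: int, days: int = 30) -> float:
    """Cost of sending a system prompt of `words` words on every call."""
    tokens = words * TOKENS_PER_WORD * calls_per_day * days
    return tokens / 1_000_000 * PRICE_PER_MTOK

before = monthly_prompt_cost(200, calls_per_day=1000)
after = monthly_prompt_cost(108, calls_per_day=1000)
print(f"before=${before:.2f}/mo after=${after:.2f}/mo saved={1 - after/before:.0%}")
```

With these assumed prices, the 1,000-calls/day example drops from about $24/month to about $13/month in system-prompt input tokens — the 46% reduction carries straight through, since input cost scales linearly with prompt length.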
When you optimize multiple agents at once, Trimpt puts shared rules (like GDPR, tone, language support) at the beginning of every prompt. AI providers (Anthropic, OpenAI) automatically cache the beginning of your prompt. If the first 100 words are identical across all your agents, the provider charges only 10% for those words after the first request. You don't need to configure anything — just use the optimized prompts.
- Paste your prompt(s) in Trimpt
- Copy the optimized result
- Use it in your code
- Save money on every API call — automatically
- No SDK to install (unless you want to)
- No infrastructure changes
- No provider configuration
- No code changes beyond replacing the prompt text
Fleet Optimization (Metabolism)
When you have 5-100 AI agents, many share common rules (tone, compliance, language support). Paste all agents at once using --- Agent: Name --- separators. Trimpt automatically detects shared rules, compresses once, and optimizes each agent separately.
--- Agent: Customer Support ---
You are a friendly support agent for CloudMetrics...
--- Agent: Billing ---
You are a billing specialist for CloudMetrics...

Example result with 5 agents:
Shared rules: GDPR, tone, language support → compressed once (39 words)
Each unique: agent-specific rules → compressed separately
Result: -34% reduction, -57% with cache alignment
Cache alignment: Shared prefix is placed first in every agent's prompt. LLM providers cache this prefix at 90% discount, so you only pay full price for the unique part.
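If your agent prompts already live in code, the separator format above can be assembled programmatically. A minimal sketch — `join_fleet` is a hypothetical helper for this doc, not a Trimpt API:

```python
# Build the fleet input Trimpt expects: one block per agent,
# delimited by "--- Agent: Name ---" separators.
def join_fleet(agents: dict[str, str]) -> str:
    """Join named agent prompts into Trimpt's fleet separator format."""
    blocks = [f"--- Agent: {name} ---\n{prompt}" for name, prompt in agents.items()]
    return "\n".join(blocks)

fleet = join_fleet({
    "Customer Support": "You are a friendly support agent for CloudMetrics...",
    "Billing": "You are a billing specialist for CloudMetrics...",
})
print(fleet)
```

Paste the resulting string into the fleet optimizer (or send it via the fleet API below).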
Quick Start
- Try on the homepage — paste any prompt, click Optimize. No signup needed.
- Sign up free — 3 optimizations/month, no credit card.
- Copy the result — use the compressed prompt in your code.
- Use the API — automate with Python/JavaScript. Get your key in Settings.
API Reference
POST /api/v1/optimize

curl -X POST https://trimpt.com/api/v1/optimize \
  -H "Authorization: Bearer trpt_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Your system prompt...", "test_queries": ["test1"]}'

POST /api/v1/optimize-fleet (Team plan and above)

curl -X POST https://trimpt.com/api/v1/optimize-fleet \
  -H "Authorization: Bearer trpt_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"agents": [{"name":"support","prompt":"..."},{"name":"billing","prompt":"..."}]}'

Rate limits: Free 2/h · Builder 50/h · Team 200/h · Scale 1000/h
Response statuses: OK (success) · ROLLED_BACK (quality too low) · BLOCKED (already optimized)
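A sketch of handling the three statuses, falling back to the original prompt on anything but OK. The `status` and `optimized_prompt` field names are assumed from the examples in this doc:

```python
import requests

def pick_prompt(data: dict, original: str) -> str:
    """Choose which prompt to use based on the API response status."""
    if data.get("status") == "OK":
        return data.get("optimized_prompt", original)
    return original  # ROLLED_BACK or BLOCKED: keep the original

def optimize(prompt: str, api_key: str) -> str:
    """Return the optimized prompt, or the original on any failure."""
    try:
        r = requests.post(
            "https://trimpt.com/api/v1/optimize",
            headers={"Authorization": f"Bearer {api_key}"},
            json={"prompt": prompt},
            timeout=60,
        )
        r.raise_for_status()
        return pick_prompt(r.json(), prompt)
    except requests.RequestException:
        return prompt  # network/HTTP error (incl. rate limits): keep the original
```

Because ROLLED_BACK and BLOCKED both degrade gracefully to the original prompt, this wrapper is safe to call unconditionally at startup.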
Integrations
Python
import requests

r = requests.post(
    "https://trimpt.com/api/v1/optimize",
    headers={"Authorization": "Bearer trpt_KEY"},
    json={"prompt": "Your system prompt..."},
)
print(r.json()["optimized_prompt"])

JavaScript
const res = await fetch("https://trimpt.com/api/v1/optimize", {
method: "POST",
headers: {"Authorization": "Bearer trpt_KEY", "Content-Type": "application/json"},
body: JSON.stringify({prompt: "Your system prompt..."})
});
const data = await res.json();

CrewAI
from crewai import Agent
import requests
def trimpt(p):
    return requests.post(
        "https://trimpt.com/api/v1/optimize",
        headers={"Authorization": "Bearer trpt_KEY"},
        json={"prompt": p},
    ).json().get("optimized_prompt", p)

agent = Agent(role="Support", backstory=trimpt("You are a support agent..."))

LangChain
# LangChain
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o")
system_prompt = trimpt("You are a helpful research assistant...")
# Use system_prompt in your chain

LangGraph
# LangGraph
from langgraph.graph import StateGraph
system_prompt = trimpt("You are a planning agent that...")
# Use system_prompt in your graph nodes

OpenAI
# OpenAI
import openai
response = openai.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "system", "content": trimpt("Your system prompt...")},
{"role": "user", "content": user_question}
]
)

FAQ
Does compression reduce quality?
AI Judge guarantees minimum 7/10. If quality drops below threshold, the original prompt is preserved (ROLLED_BACK). You never get a worse prompt.
What prompts work best?
System prompts 50+ words with instructions, rules, compliance requirements. Short prompts (< 50 words) may already be optimal.
Do you store my prompts?
Prompts are processed and discarded after optimization. Only metadata (word counts, scores, timestamps) is saved for your dashboard.
What LLM do you use?
DeepSeek for compression and judging. Your agents continue using whatever LLM you want (GPT-4o, Claude, Gemini, etc).
How does cache alignment work?
When you optimize a fleet, Trimpt places shared rules (GDPR, tone, language) at the beginning of every agent's prompt. AI providers like Anthropic and OpenAI automatically cache repeated prompt prefixes at 90% discount. You don't need to configure anything — just use the optimized prompts and the savings happen automatically.
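As a rough illustration of that math — the word counts below are assumptions for this sketch, applying the 90% prefix discount described above:

```python
# Billable cost of one agent's prompt once its shared prefix is cached.
# Assumption: the provider charges 10% of the input price for cached
# prefix tokens (the 90% discount described above).
def effective_words(shared: int, unique: int, cache_discount: float = 0.9) -> float:
    """Billable 'word equivalents' per call after the prefix is cached."""
    return shared * (1 - cache_discount) + unique

# e.g. a 100-word shared prefix + 60 unique words per agent:
billable = effective_words(100, 60)
print(billable)  # ≈ 70 word-equivalents instead of 160
```

For that example agent, input cost drops by roughly 56% per call — in the same range as the -57% fleet result quoted above, and the saving grows with each additional agent that shares the prefix.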
How is this different from PromptPerfect?
Trimpt uses DNA evolution rather than one-shot rewriting, verifies every result with an AI Judge rather than asking you to trust the output, offers Fleet metabolism with cache alignment (unique to Trimpt), and applies adaptive floor protection.