Documentation

What is Trimpt?

Trimpt compresses AI agent system prompts by ~46% while guaranteeing quality with an AI Judge. Your system prompt is sent with every single API call — if you make 1,000 calls/day with a 300-word prompt, that's 300,000 words of repeated input per day that you pay for in tokens.
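The arithmetic above can be sketched directly. The token ratio and price below are illustrative assumptions, not Trimpt measurements:

```python
# Rough estimate of what a repeated system prompt costs per day.
# All numbers here are illustrative assumptions.
calls_per_day = 1_000
prompt_words = 300
words_per_day = calls_per_day * prompt_words  # 300,000 repeated words/day

# Common rule of thumb: roughly 1.33 tokens per English word.
tokens_per_day = int(words_per_day * 1.33)

# Assumed input price of $2.50 per million tokens (varies by provider/model).
price_per_million = 2.50
daily_cost = tokens_per_day / 1_000_000 * price_per_million
print(f"{words_per_day} words/day, about {tokens_per_day} tokens, about ${daily_cost:.2f}/day")
```

A 46% compression would cut that input cost proportionally on every call.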

Trimpt uses evolutionary DNA compression with 30 learned genes and 6 mutation strategies, then verifies quality with an AI Judge that scores 1-10 and requires minimum 7/10 to approve.

How It Works

Trimpt works like evolution in nature: multiple compression strategies compete, and the best ones survive and improve. An AI quality check verifies every result and requires a quality score of at least 7/10 — if quality drops below that threshold, the original prompt is kept unchanged.

DNA Evolution Engine: 30 compression genes (learned rules like "Replace verbose explanations with key:value pairs"). 5 protected genes that never die (compliance, numbers, error handling). Genes with high win rate are selected more often.
6 Mutation Strategies: Shortener, Essentialist, Reformulator, Structurator, Instructor, ContextKiller. Each applies a different compression technique, combined with 2 randomly selected DNA genes.
Quality Verification: Auto-generates edge case queries, tests both original and compressed prompts, compares responses. Requires quality score 7+/10 to pass. If quality drops, original is preserved (ROLLED_BACK).
Adaptive Floor: Complex prompts (compliance, multi-language, escalation flows) get higher minimum word count (40-55% of original). Prevents over-compression.
Already-Optimized Detector: Detects prompts that are already compressed (key:value format, low filler words, compression patterns). Returns BLOCKED instead of wasting time.
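The win-rate-weighted gene selection described above can be sketched as follows. The gene names and win rates are invented for illustration; Trimpt's actual 30 genes are internal:

```python
import random

# Hypothetical gene pool: name -> win rate (both made up for illustration).
genes = {
    "key_value_rewrite": 0.82,    # e.g. "Replace verbose explanations with key:value pairs"
    "drop_filler_words": 0.74,
    "merge_duplicate_rules": 0.61,
}

def pick_genes(pool: dict[str, float], k: int = 2) -> list[str]:
    """Pick k genes weighted by win rate, since each mutation strategy
    applies 2 random DNA genes. Samples with replacement for simplicity."""
    names = list(pool)
    weights = [pool[n] for n in names]
    return random.choices(names, weights=weights, k=k)

print(pick_genes(genes))
```

Genes with a higher win rate are simply more likely to be drawn, which is the selection pressure the docs describe.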

How you save money

1. Compression (automatic)
Your system prompt goes from 200 words to 108 words. Fewer words = fewer tokens = lower API bill. Just copy the compressed prompt into your code. Every API call is now 46% cheaper on input tokens.
2. Cache alignment (fleet only, automatic)
When you optimize multiple agents at once, Trimpt puts shared rules (like GDPR, tone, language support) at the beginning of every prompt. AI providers (Anthropic, OpenAI) automatically cache the beginning of your prompt. If the first 100 words are identical across all your agents, the provider charges only 10% for those words after the first request. You don't need to configure anything — just use the optimized prompts.
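The cache math above can be estimated with a small sketch. It assumes cached input tokens are billed at 10% of the normal rate, as described above; real prices and cache rules vary by provider:

```python
# Illustrative input-cost estimate with a cached shared prefix.
# Assumption: after the first request, prefix tokens cost 10% of normal.
def input_cost(prefix_tokens: int, unique_tokens: int, calls: int,
               price_per_token: float, cache_discount: float = 0.10) -> float:
    first_call = (prefix_tokens + unique_tokens) * price_per_token
    later_calls = (calls - 1) * (prefix_tokens * price_per_token * cache_discount
                                 + unique_tokens * price_per_token)
    return first_call + later_calls

full = input_cost(0, 250, 1_000, 1e-6)      # no shared prefix: everything full price
cached = input_cost(150, 100, 1_000, 1e-6)  # 150-token shared prefix cached after call 1
print(f"${full:.4f} vs ${cached:.4f}")
```

The larger the shared prefix relative to each agent's unique part, the bigger the discount.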
What you do:
  1. Paste your prompt(s) in Trimpt
  2. Copy the optimized result
  3. Use it in your code
  4. Save money on every API call — automatically
What you DON'T need to do:
  • No SDK to install (unless you want to)
  • No infrastructure changes
  • No provider configuration
  • No code changes beyond replacing the prompt text

Fleet Optimization (Metabolism)

When you have 5-100 AI agents, many share common rules (tone, compliance, language support). Paste all agents at once using --- Agent: Name --- separators. Trimpt automatically detects shared rules, compresses once, and optimizes each agent separately.

--- Agent: Customer Support ---
You are a friendly support agent for CloudMetrics...

--- Agent: Billing ---
You are a billing specialist for CloudMetrics...
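The separator format above is easy to assemble programmatically. The separator string follows the documented `--- Agent: Name ---` convention; the helper itself is just a sketch:

```python
def build_fleet_input(agents: dict[str, str]) -> str:
    """Join multiple agent prompts using Trimpt's documented separator format."""
    sections = [f"--- Agent: {name} ---\n{prompt.strip()}"
                for name, prompt in agents.items()]
    return "\n\n".join(sections)

text = build_fleet_input({
    "Customer Support": "You are a friendly support agent for CloudMetrics...",
    "Billing": "You are a billing specialist for CloudMetrics...",
})
print(text)
```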

Example result with 5 agents:

Shared rules: GDPR, tone, language support → compressed once (39 words)

Each unique: agent-specific rules → compressed separately

Result: 34% reduction, 57% with cache alignment

Cache alignment: Shared prefix is placed first in every agent's prompt. LLM providers cache this prefix at 90% discount, so you only pay full price for the unique part.

Quick Start

  1. Try on the homepage — paste any prompt, click Optimize. No signup needed.
  2. Sign up free — 3 optimizations/month, no credit card.
  3. Copy the result — use the compressed prompt in your code.
  4. Use the API — automate with Python/JavaScript. Get your key in Settings.

API Reference

POST /api/v1/optimize
curl -X POST https://trimpt.com/api/v1/optimize \
  -H "Authorization: Bearer trpt_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Your system prompt...", "test_queries": ["test1"]}'
POST /api/v1/optimize-fleet (Team plan and above)
curl -X POST https://trimpt.com/api/v1/optimize-fleet \
  -H "Authorization: Bearer trpt_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"agents": [{"name":"support","prompt":"..."},{"name":"billing","prompt":"..."}]}'

Rate limits: Free 2/h · Builder 50/h · Team 200/h · Scale 1000/h

Response statuses: OK (success) · ROLLED_BACK (quality too low) · BLOCKED (already optimized)
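The three statuses can be handled in client code like this. The `optimized_prompt` field appears in the examples below; the `status` field name is an assumption based on the statuses listed above:

```python
def handle_response(data: dict, original_prompt: str) -> str:
    """Pick which prompt to use based on the documented status values.
    Field names ("status", "optimized_prompt") are assumed from this page."""
    if data.get("status") == "OK":
        return data.get("optimized_prompt", original_prompt)
    # ROLLED_BACK (quality too low) or BLOCKED (already optimized):
    # keep using the original prompt unchanged.
    return original_prompt

print(handle_response({"status": "OK", "optimized_prompt": "short"}, "long"))
```

Treating ROLLED_BACK and BLOCKED as "keep the original" means you never ship a worse prompt by accident.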

Integrations

Python

import requests
r = requests.post("https://trimpt.com/api/v1/optimize",
    headers={"Authorization": "Bearer trpt_KEY"},
    json={"prompt": "Your system prompt..."})
print(r.json()["optimized_prompt"])

JavaScript

const res = await fetch("https://trimpt.com/api/v1/optimize", {
  method: "POST",
  headers: {"Authorization": "Bearer trpt_KEY", "Content-Type": "application/json"},
  body: JSON.stringify({prompt: "Your system prompt..."})
});
const data = await res.json();

CrewAI

from crewai import Agent
import requests

def trimpt(p):
    # Fall back to the original prompt if no optimized one is returned
    return requests.post("https://trimpt.com/api/v1/optimize",
        headers={"Authorization": "Bearer trpt_KEY"},
        json={"prompt": p}).json().get("optimized_prompt", p)

agent = Agent(
    role="Support",
    goal="Resolve customer questions",  # CrewAI agents require a goal
    backstory=trimpt("You are a support agent..."),
)

LangChain

# LangChain (uses the trimpt() helper from the CrewAI example)
from langchain_openai import ChatOpenAI  # modern import; pip install langchain-openai

llm = ChatOpenAI(model="gpt-4o")
system_prompt = trimpt("You are a helpful research assistant...")
# Use system_prompt as the system message in your chain

LangGraph

# LangGraph (uses the trimpt() helper from the CrewAI example)
from langgraph.graph import StateGraph

system_prompt = trimpt("You are a planning agent that...")
# Use system_prompt in your graph nodes

OpenAI

# OpenAI (uses the trimpt() helper from the CrewAI example)
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": trimpt("Your system prompt...")},
        {"role": "user", "content": user_question},
    ],
)

FAQ

Does compression reduce quality?

AI Judge guarantees minimum 7/10. If quality drops below threshold, the original prompt is preserved (ROLLED_BACK). You never get a worse prompt.

What prompts work best?

System prompts 50+ words with instructions, rules, compliance requirements. Short prompts (< 50 words) may already be optimal.

Do you store my prompts?

Prompts are processed and discarded after optimization. Only metadata (word counts, scores, timestamps) is saved for your dashboard.

What LLM do you use?

DeepSeek for compression and judging. Your agents continue using whatever LLM you want (GPT-4o, Claude, Gemini, etc).

How does cache alignment work?

When you optimize a fleet, Trimpt places shared rules (GDPR, tone, language) at the beginning of every agent's prompt. AI providers like Anthropic and OpenAI automatically cache repeated prompt prefixes at 90% discount. You don't need to configure anything — just use the optimized prompts and the savings happen automatically.

How is this different from PromptPerfect?

Trimpt uses DNA evolution rather than one-shot rewriting, verifies quality with an AI Judge rather than asking you to trust the output, offers Fleet metabolism with cache alignment (unique to Trimpt), and protects prompts with an adaptive floor.