# PromptEval Documentation

Professional semantic testing framework for LLM applications.

v1.0.0 · Python 3.10+ · API Key Auth
- **Semantic Validation**: ML-powered matching using sentence transformers (85%+ accuracy)
- **API Key Authentication**: secure access with `pe_`-prefixed keys
- **Usage Tracking**: monitor test consumption against monthly limits
- **CI/CD Ready**: integrates with GitHub Actions, GitLab CI, and more
## Quick Start

Get started with PromptEval in under 5 minutes:
### 1. Install PromptEval

```bash
pip install prompteval-core
```

### 2. Set your API key

```bash
# Option A: environment variable (recommended)
export PROMPTEVAL_API_KEY=pe_your_api_key_here

# Option B: pass the key directly to commands
prompteval run adapter.yml --api-key pe_xxxxx
```

### 3. Check your account status

```bash
prompteval status
# Output:
# ============================================================
#  PROMPTEVAL ACCOUNT STATUS
# ============================================================
#
# License: abc123def456...
# Plan: PRO
# ✅ Status: Active
#
# Usage This Month:
#   [████████░░░░░░░░░░░░] 40.0%
#   Tests used:      200
#   Tests remaining: 300
#   Monthly limit:   500
#
# API Keys: 2 active
# ============================================================
```

### 4. Create test files
`adapter.yml` - your LLM endpoint configuration:

```yaml
name: my-llm-api
description: My LLM Testing
endpoint:
  url: https://api.example.com/v1/chat
  method: POST
  timeout: 30
request:
  headers:
    Content-Type: application/json
  template:
    prompt: "{{PROMPT}}"
    type: "{{TYPE}}"
    max_tokens: 150
# Response extraction
response:
  type: json
  path: choices.0.message.content
# Validation
validation:
  ml_threshold: 0.75
  use_semantic: true
# Test execution (these override config.ini if present)
execution:
  parallel_limit: 10
  batch_delay: 0.3
  output_dir: ./output
```
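The `response.path` setting walks the JSON body one dot-separated segment at a time, with numeric segments indexing into arrays. A minimal sketch of that lookup (the `extract_path` helper is illustrative, not part of PromptEval's public API):

```python
import json

def extract_path(payload, path: str):
    """Walk a dot-separated path; numeric segments index into lists."""
    node = payload
    for segment in path.split("."):
        if isinstance(node, list):
            node = node[int(segment)]
        else:
            node = node[segment]
    return node

# A typical chat-completion style response body
body = json.loads('{"choices": [{"message": {"content": "Hello!"}}]}')
print(extract_path(body, "choices.0.message.content"))  # -> Hello!
```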
`tests.yml` - your test cases:

```yaml
tests:
  - name: duration_basic
    id: FIT-001
    description: Question about training duration
    prompt: "Ask about training duration"
    context:
      PROMPT: "Ask about training duration"
      TYPE: "duration"
    expected: "Desde cuando estas entrenando este ejercicio o rutina"
    variants:
      - "Hace cuanto tiempo empezaste con este entrenamiento"
      - "Cuanto tiempo llevas entrenando este ejercicio"
      - "Por favor, dime cuándo empezaste con este entrenamiento."
      - "Cuéntame, ¿desde cuándo entrenas así?"
    threshold: 0.70
    tags:
      - duration
```

### 5. Run tests
```bash
prompteval run adapter.yml --tests tests.yml
# Output:
# Running evaluation...
#   Adapter: adapter.yml
#
# ==================================================
#  EVALUATION RESULTS
# ==================================================
# Total tests: 2
# ✅ Passed: 2
# ❌ Failed: 0
# ⚠️ Errors: 0
# Success rate: 100.0%
# Duration: 1523.45ms
# ==================================================
#
# Results saved: results.json
```

### 6. Generate HTML report

```bash
prompteval report results.json --output report.html
```

## Installation
### Requirements

- Python 3.10 or higher
- A valid PromptEval API key

### Install via pip

```bash
pip install prompteval-core
```

### Verify the installation

```bash
prompteval --help
# Output shows the available commands:
# run, validate, report, status, licenses
```

## CLI Reference
### Commands Overview

| Command | Description |
|---|---|
| `prompteval run` | Execute tests against your LLM |
| `prompteval validate` | Validate YAML syntax (no API call) |
| `prompteval report` | Generate an HTML report from results |
| `prompteval status` | Show account status and quota |
| `prompteval licenses` | List your licenses |
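Taken together, the commands compose into a simple local workflow (assuming `adapter.yml` and `tests.yml` exist and `PROMPTEVAL_API_KEY` is set):

```shell
# Syntax check first - validate does not consume quota
prompteval validate tests.yml

# Execute the tests and write results.json
prompteval run adapter.yml --tests tests.yml -o results.json

# Turn the results into a shareable HTML report
prompteval report results.json --output report.html
```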
### Run Tests

```bash
# Basic usage
prompteval run adapter.yml --tests tests.yml --api-key pe_xxxxx

# With an environment variable
export PROMPTEVAL_API_KEY=pe_xxxxx
prompteval run adapter.yml --tests tests.yml

# Custom output file
prompteval run adapter.yml -t tests.yml -o my_results.json

# Against a local/staging API
prompteval run adapter.yml -t tests.yml -u http://localhost:8000
```

### Validate YAML

```bash
# Validate a test file (doesn't consume quota)
prompteval validate tests.yml
# Output:
# Validating: tests.yml
# ✅ Valid YAML - 5 test cases found
```

### Generate Report

```bash
# Basic report
prompteval report results.json

# Custom output file
prompteval report results.json --output my-report.html
```

### Account Status

```bash
# Check quota and usage
prompteval status --api-key pe_xxxxx

# Or with the environment variable set
prompteval status
```

## Python SDK
### Basic Usage

```python
from prompteval import PromptEval

# Initialize the client
client = PromptEval(api_key="pe_xxxxx")

# Run tests from files
result = client.run_from_files(
    adapter_path="adapter.yml",
    tests_path="tests.yml",
)

# Check results
print(f"Success rate: {result.success_rate}%")
print(f"Passed: {result.total_passed}/{result.total_tests}")

if result.success:
    print("✅ All tests passed!")
else:
    print("❌ Some tests failed:")
    for test in result.failed_tests:
        print(f"  - {test.id_test}: {test.similarity:.1%} similarity")
```

### Account Methods
```python
from prompteval import PromptEval

client = PromptEval(api_key="pe_xxxxx")

# List licenses
licenses = client.get_licenses()
for lic in licenses:
    print(f"Plan: {lic.plan}")
    print(f"Tests remaining: {lic.tests_remaining}")

# Get usage details
usage = client.get_usage()
print(f"Tests this month: {usage['tests_this_month']}")
```

### Error Handling
```python
from prompteval import PromptEval
from prompteval.exceptions import (
    AuthenticationError,
    QuotaExceededError,
    RateLimitError,
    APIError,
)

client = PromptEval(api_key="pe_xxxxx")

try:
    result = client.run_from_files("adapter.yml", "tests.yml")
except AuthenticationError:
    print("Invalid API key")
except QuotaExceededError:
    print("Monthly quota exceeded - upgrade your plan")
except RateLimitError:
    print("Too many requests - slow down")
except APIError as e:
    print(f"API error: {e}")
```

## CI/CD Integration
### GitHub Actions

```yaml
name: LLM Tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install PromptEval
        run: pip install prompteval-core

      - name: Run LLM Tests
        env:
          PROMPTEVAL_API_KEY: ${{ secrets.PROMPTEVAL_API_KEY }}
        run: |
          prompteval run adapter.yml --tests tests.yml

      - name: Generate Report
        if: always()
        run: prompteval report results.json --output report.html

      - name: Upload Report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: test-report
          path: report.html
```

### GitLab CI
```yaml
llm-tests:
  image: python:3.11
  script:
    - pip install prompteval-core
    - prompteval run adapter.yml --tests tests.yml
    - prompteval report results.json --output report.html
  artifacts:
    paths:
      - report.html
    when: always
  variables:
    PROMPTEVAL_API_KEY: $PROMPTEVAL_API_KEY
```

## Pricing Plans
| Feature | Free | Pro | Team | Enterprise |
|---|---|---|---|---|
| Price | $0/mo | $99/mo | $299/mo | Custom |
| Tests/Month | 30 | 1,000 | 5,000 | On demand |
| API Keys | 1 | 1 | 5 | On demand |
| Semantic Validation | ✅ | ✅ | ✅ | ✅ |
| HTML Reports | ✅ | ✅ | ✅ | ✅ |
| CI/CD Integration | ✅ | ✅ | ✅ | ✅ |
| Priority Support | ❌ | ✅ | ✅ | ✅ |
| SLA | ❌ | ❌ | ❌ | ✅ |
## Semantic Validation

PromptEval uses sentence transformers to compute semantic similarity between expected and actual LLM outputs.

### How It Works

1. The expected and actual responses are converted to embeddings.
2. Cosine similarity is calculated between the embeddings.
3. If similarity ≥ threshold, the test passes.
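The three steps can be sketched in plain Python; the short vectors below are stand-ins for real sentence-transformer embeddings, which typically have hundreds of dimensions:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" for the expected and actual responses
expected_vec = [0.9, 0.1, 0.3]
actual_vec = [0.8, 0.2, 0.35]

similarity = cosine_similarity(expected_vec, actual_vec)
threshold = 0.75
print("PASS" if similarity >= threshold else "FAIL")
```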
### Example

```text
Expected:   "The capital of France is Paris"
Actual:     "Paris is the capital city of France"
Similarity: 94% ✅ (threshold: 75%)
```

### Threshold Guidelines
| Threshold | Use Case |
|---|---|
| 0.90+ | Exact facts, numbers, specific answers |
| 0.75-0.89 | General responses, similar meaning |
| 0.60-0.74 | Loose matching, related topics |
| < 0.60 | Very flexible matching |
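Note that thresholds appear in two places in the examples above: `validation.ml_threshold` in `adapter.yml` (0.75) and a per-test `threshold` in `tests.yml` (0.70). A plausible resolution rule, assuming the per-test value overrides the adapter-level default (the `effective_threshold` helper is hypothetical, for illustration only):

```python
def effective_threshold(test: dict, adapter_default: float = 0.75) -> float:
    # Assumption: a per-test `threshold` key overrides the adapter-wide ml_threshold
    return test.get("threshold", adapter_default)

print(effective_threshold({"name": "duration_basic", "threshold": 0.70}))  # -> 0.7
print(effective_threshold({"name": "no_override"}))                        # -> 0.75
```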
## Ready to Get Started?

Get your API key today and start testing your LLM applications professionally.