Claude Skill

LLMeval

by spenceryonce

Evaluate and compare large language models (LLMs) for chatbot applications, using various LLMs as evaluators, and manage prompt templates and binary preferences.

11 stars · 1 fork · Maintained · Python
Overall score: 62 (Good)

📊 Score Breakdown

🛡️ Security (30%): 3.0/5
Utility (30%): 2.0/5
🔄 Maintenance (25%): 4.0/5
💎 Uniqueness (15%): 4.0/5

Overall = 30% × Security + 30% × Utility + 25% × Maintenance + 15% × Uniqueness.
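
As a check on the formula above, the listed category scores can be combined in a few lines of Python. This is a minimal sketch, not the index's published method: rescaling from the 0-5 category scale to a 0-100 overall score is an assumption (it is not stated on this page), and the names `SCORES`, `WEIGHTS`, and `overall_score` are hypothetical.

```python
# Category scores (out of 5) and weights, copied from the score breakdown.
SCORES = {"security": 3.0, "utility": 2.0, "maintenance": 4.0, "uniqueness": 4.0}
WEIGHTS = {"security": 0.30, "utility": 0.30, "maintenance": 0.25, "uniqueness": 0.15}


def overall_score(scores, weights, max_per_category=5.0):
    """Weighted sum of category scores, rescaled to 0-100.

    The 0-100 rescaling is an assumption made for this sketch.
    """
    weighted = sum(scores[name] * weights[name] for name in weights)
    return round(weighted / max_per_category * 100)


print(overall_score(SCORES, WEIGHTS))
```

Under that assumption the weighted sum is 0.9 + 0.6 + 1.0 + 0.6 = 3.1 out of 5, which rescales to the overall score of 62 shown for this listing.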

ℹ️ Details

Category: 🤖 AI & LLM Tools
Ecosystem: Claude Skill
Language: Python
Pricing: Free
License: (not listed)
Status: Maintained
Platforms: claude

📈 GitHub Signals

Stars: 11
Forks: 1
Commits (30d): 0
Open Issues: 0

Last commit: 5 months ago

anthropic, chatgpt, claude, cohere, evaluation, evaluator, llm, openai

Data last verified: 3 weeks ago.