Claude Skill

LLMeval

by spenceryonce

Evaluate and compare large language models (LLMs) for chatbot applications, using various LLMs as evaluators, and manage prompt templates and binary preferences.

11 stars · 1 fork · Stale · Python
Overall score: 62 (Good)

📊 Score Breakdown

🛡️ Security (30%): 3.0/5
Utility (30%): 2.0/5
🔄 Maintenance (25%): 4.0/5
💎 Uniqueness (15%): 4.0/5

Overall is a weighted sum of the four sub-scores: Security (30%) + Utility (30%) + Maintenance (25%) + Uniqueness (15%).
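The arithmetic behind the overall figure can be sketched as follows. This is a minimal illustration, not the site's actual scoring code: it assumes the overall score is the weighted average of the four sub-scores (each out of 5) rescaled to a 0-100 range. The names `WEIGHTS`, `SCORES`, and `overall_score` are hypothetical. Under that assumption, the listed sub-scores reproduce the 62 shown for this skill.

```python
# Weights and sub-scores copied from the score breakdown on this page.
WEIGHTS = {"security": 0.30, "utility": 0.30, "maintenance": 0.25, "uniqueness": 0.15}
SCORES = {"security": 3.0, "utility": 2.0, "maintenance": 4.0, "uniqueness": 4.0}


def overall_score(scores, weights, max_sub_score=5.0):
    """Weighted average of the sub-scores, rescaled to 0-100.

    The rescaling step is an assumption; the page only states the weights.
    """
    weighted = sum(scores[name] * weight for name, weight in weights.items())
    return round(weighted / max_sub_score * 100)


# 3.0*0.30 + 2.0*0.30 + 4.0*0.25 + 4.0*0.15 = 3.1 out of 5, i.e. 62 out of 100.
print(overall_score(SCORES, WEIGHTS))  # prints 62
```

Because Security and Utility together carry 60% of the weight, the low Utility score (2.0/5) pulls the total down more than the two 4.0/5 scores lift it.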

ℹ️ Details

Category: 🤖 AI & LLM Tools
Ecosystem: Claude Skill
Language: Python
Pricing: Free
License: (not listed)
Status: Stale
Platforms: claude

📈 GitHub Signals

Stars: 11
Forks: 1
Commits (30d): 0
Open Issues: 0

Last commit: 7 months ago

Tags: anthropic, chatgpt, claude, cohere, evaluation, evaluator, llm, openai


Data last verified: 1 month ago.