LLMeval

Author: spenceryonce

Evaluate and compare large language models (LLMs) for chatbot applications, using various LLMs as evaluators, and manage prompt templates and binary preferences.

★ 11 stars · 1 fork · Stale · Python

Rating: Good

📊 Score Breakdown

🛡️ Security (30%): 3.0/5

⚡ Utility (30%): 2.0/5

🔄 Maintenance (25%): 4.0/5

💎 Uniqueness (15%): 4.0/5

Overall = Security (30%) + Utility (30%) + Maintenance (25%) + Uniqueness (15%).
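
As a worked check of the formula above, here is a minimal sketch. It assumes each sub-score is on a 0-5 scale and that the weighted sum is rescaled linearly to 0-100; the page does not state the rescaling, so that step is an assumption. The names `WEIGHTS`, `SCORES` and `overall_score` are illustrative, not part of SkillsIndex or LLMeval.

```python
# Weights and sub-scores copied from the score breakdown on this page.
WEIGHTS = {"security": 0.30, "utility": 0.30, "maintenance": 0.25, "uniqueness": 0.15}
SCORES = {"security": 3.0, "utility": 2.0, "maintenance": 4.0, "uniqueness": 4.0}


def overall_score(scores, weights, scale_max=5.0):
    """Weighted sum of sub-scores, rescaled from 0-scale_max to 0-100.

    The linear rescaling is an assumption; the listing only gives the weights.
    """
    weighted = sum(scores[name] * weights[name] for name in weights)
    return weighted / scale_max * 100


# 0.9 + 0.6 + 1.0 + 0.6 = 3.1 out of 5, i.e. 62 out of 100.
print(round(overall_score(SCORES, WEIGHTS)))
```

Under these assumptions the result agrees with the 62/100 reported on this page.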

ℹ️ Details

📈 GitHub Signals

Last commit: 9 months ago

anthropic, chatgpt, claude, cohere, evaluation, evaluator, llm, openai

Similar Tools

Claude Skill · ★ Pick

100

lobehub

by lobehub

The ultimate space for work and life — to find, build, and collaborate with agent teammates that grow with you. We are taking agent harness to the next level —

★ 80.6K · TypeScript · Active

Claude Skill · ★ Pick

100

litellm

by BerriAI

Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, A

★ 54.1K · Python · Active

Claude Skill

100

ai

by vercel

The AI Toolkit for TypeScript. From the creators of Next.js, the AI SDK is a free open-source library for building AI-powered applications and agents

★ 25.7K · TypeScript · Active · Official

MCP Server · ★ Pick

100

context7

by upstash

Context7 MCP Server -- Up-to-date code documentation for LLMs and AI code editors

★ 59.5K · TypeScript · Active

🏅 Show your score

Scored 62/100 across security, utility, maintenance and uniqueness. Add the badge below to your README or site to display the score, as verified by an independent directory.

Markdown

[![LLMeval scored 62/100 on SkillsIndex](https://skillsindex.dev/api/badge/spenceryonce-llmeval)](https://skillsindex.dev/tools/spenceryonce-llmeval/)

HTML

<a href="https://skillsindex.dev/tools/spenceryonce-llmeval/"><img src="https://skillsindex.dev/api/badge/spenceryonce-llmeval" alt="LLMeval scored 62/100 on SkillsIndex" height="20"></a>

Data last verified: 3 months ago.