OpenClaw Skill

ml-model-eval-benchmark

Compare model candidates using weighted metrics and deterministic ranking outputs. Use for benchmark leaderboards and model promotion decisions.

Install

$npx clawhub@latest install ml-model-eval-benchmark

All-time installs3

Active installs3

Stars0

ML Model Eval Benchmark

Produce consistent model ranking outputs from metric-weighted evaluation inputs.

Created by

Persistent memory

Give your OpenClaw agent a memory layer

Mem0 remembers users and context across sessions so you send fewer tokens and get better answers.