Benchmark and compare LLMs on Hebrew reasoning, comprehension, sentiment, translation, and Israeli cultural knowledge. Wraps the HuggingFace Open Hebrew LLM Leaderboard tasks (HeQ, HebrewSentiment, Hebrew Winograd, translation) plus DictaLM 3.0 benchmark tasks (Summarization, Nikud, Israeli Trivia) into a reproducible evaluation harness. Runs evals against Claude, GPT, Gemini, AI21 Jamba, DictaLM, Llama, and local HuggingFace models. Produces comparison scorecards in JSON and markdown. Use when choosing an LLM for a Hebrew product, answering procurement questions about Hebrew performance, validating a fine-tuned Hebrew model, or tracking Hebrew regressions after a model upgrade. Do NOT use for Arabic NLP, ASR benchmarking, or general English benchmarks.
Trust score 84/100 (Trusted) · 10+ installs · 3 GitHub contributors · MIT license
Israeli product teams pick LLMs blind. There is no standardized Hebrew benchmark that a PM can run in an afternoon to compare Claude against GPT against DictaLM against AI21 Jamba on their actual use case. The HuggingFace Open Hebrew LLM Leaderboard is built for base models and few-shot prompts, not for API-hosted chat models. DictaLM publishes benchmark results but only for its own suite. Teams end up guessing, testing informally, or trusting marketing claims.
npx skills-il add skills-il/developer-tools@v1.1.0-hebrew-llm-eval-suite --skill hebrew-llm-eval-suite -a claude-code
We are building a Hebrew news summarization feature and need to pick between Claude Sonnet, GPT-5, and DictaLM-3.0-24B. Run the relevant benchmarks (HeQ, DictaLM Summarization, Winograd) with 1000 samples and 3 runs, and recommend a model with reasoning.
Anthropic released a new version of claude-sonnet. Run the hebrew-core suite on the new and previous versions and tell me if there was any regression over 2 points on any benchmark.
I am building a Hebrew chatbot and deciding between Claude Haiku and AI21 Jamba 1.5 Mini. Compare them on HeQ, HebrewSentiment, and HebNLI with 500 samples and 3 runs, and provide a scorecard with a recommendation.
We have a data residency constraint requiring a local model. Run Hebrew benchmarks on DictaLM-3.0-Nemotron-12B-Instruct and compare to Claude Sonnet quality. How much quality am I giving up?
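The example prompts above ask for several runs per benchmark and for any regression over 2 points to be flagged. A minimal sketch of that comparison follows, assuming a hypothetical scorecard shape (benchmark name mapped to a list of per-run scores on a 0-100 scale). The skill's actual JSON schema is not shown here, so the function names, the dictionary layout, and the numbers in the usage block are illustrative placeholders, not real results.

```python
from statistics import mean

# Assumed (hypothetical) input shape: {benchmark_name: [score_run1, score_run2, ...]},
# with scores on a 0-100 scale. The real scorecard schema may differ.


def mean_scores(runs):
    """Average the per-run scores for each benchmark."""
    return {bench: mean(scores) for bench, scores in runs.items()}


def find_regressions(previous, current, threshold=2.0):
    """Return benchmarks whose mean score dropped by more than `threshold` points."""
    prev_mean = mean_scores(previous)
    curr_mean = mean_scores(current)
    regressions = {}
    # Only compare benchmarks present in both scorecards.
    for bench in sorted(prev_mean.keys() & curr_mean.keys()):
        drop = prev_mean[bench] - curr_mean[bench]
        if drop > threshold:
            regressions[bench] = round(drop, 2)
    return regressions


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not measured results.
    previous = {"HeQ": [81.0, 80.5, 81.5], "HebrewSentiment": [77.0, 78.0, 76.0]}
    current = {"HeQ": [78.0, 78.5, 77.5], "HebrewSentiment": [76.5, 77.5, 77.0]}
    print(find_regressions(previous, current))
```

Averaging across runs before comparing keeps a single noisy run from being reported as a regression. A stricter check might also look at the spread between runs, which this sketch leaves out.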
HEBREW-MMLU, lm-evaluation-harness + inspect_ai cross-refs, verified DictaLM 2.0/3.0, Aya/Hebrew-Mistral/Hebrew-Gemma comparators, claude-opus-4-7, fixed HE table row, tokenizer fairness section.
Apr 25, 2026