Hugging Face is looking for reasoning datasets beyond math, science and coding

Apr 16, 2025 - 17:47

Reasoning Datasets Competition

TL;DR: Bespoke Labs, Hugging Face, and Together.ai are launching a competition to find the most innovative reasoning datasets. Create a great proof-of-concept reasoning dataset and win prizes to help you scale your work!

The DeepSeek moment for datasets

Since the launch of DeepSeek-R1 in January 2025, we've seen remarkable growth in reasoning-focused datasets on the Hugging Face Hub, such as OpenThoughts-114k, OpenCodeReasoning, and codeforces-cot. These primarily cover math, coding, and science: domains with clearly verifiable answers.

Now, reasoning is expanding into financial analysis, medical reasoning, and multi-domain reasoning.

OpenThoughts-114k alone has helped train over 230 models! We believe future breakthroughs won’t come from architecture alone but from better data: datasets that reflect real-world complexity, uncertainty, and richness.

To accelerate progress, we're launching the Reasoning Datasets Competition.

How the competition works

The goal: create impactful proof-of-concept reasoning datasets and share them on the Hugging Face Hub. The best submissions will win prizes to help scale these datasets and train models using them.
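
As a rough illustration of what a proof-of-concept dataset might look like before it is shared, the sketch below validates a few reasoning examples and writes them to a JSON Lines file. The field names (`domain`, `question`, `reasoning`, `answer`) and the sample row are illustrative assumptions, not a schema required by the competition.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical schema: these field names are assumptions for illustration,
# not a format prescribed by the competition.
REQUIRED_FIELDS = ("domain", "question", "reasoning", "answer")

# A made-up toy row in a non-math, non-coding domain.
EXAMPLE = {
    "domain": "financial analysis",
    "question": (
        "Revenue grew from 100 to 120 while costs rose from 80 to 90. "
        "Did the profit margin improve?"
    ),
    "reasoning": (
        "Margin before: (100 - 80) / 100 = 20%. "
        "Margin after: (120 - 90) / 120 = 25%."
    ),
    "answer": "Yes, the margin rose from 20% to 25%.",
}


def validate(example):
    """Return the example if every required field is a non-empty string."""
    for field in REQUIRED_FIELDS:
        value = example.get(field)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"missing or empty field: {field}")
    return example


def write_jsonl(examples, path):
    """Validate each example, write one JSON object per line, return the row count."""
    with Path(path).open("w", encoding="utf-8") as f:
        for example in examples:
            f.write(json.dumps(validate(example), ensure_ascii=False) + "\n")
    return len(examples)


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dicts."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Write to a temporary directory so the demo leaves no files behind.
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "train.jsonl"
        rows = write_jsonl([EXAMPLE], out)
        print(f"wrote {rows} row(s) to {out.name}")
```

A file like this can then be loaded with the `datasets` library (`load_dataset("json", data_files="train.jsonl")`) and uploaded with `push_to_hub`, which requires being logged in to the Hub; the repository name passed to it would be your own.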