Latest News
Leaderboard Ready for Task 3 - Financial Decision Making
The Task 3 leaderboard is live. Deploy your endpoint and receive daily evaluations on BTC and TSLA decision-making.
Dataset: TheFinAI/CLEF_Task3_Trading
New awards for top submissions
We are recognizing the best systems with Best Paper, Outstanding Paper, and Merit / Encouragement awards.
See details in the Awards section.
Ready-to-use splits for all tasks
Get calibrated training splits for exam-style Q&A, multilingual financial reasoning, and trading decision-making. Each dataset card includes format, licensing, and citation guidance.
About the Lab
FinMMEval Lab integrates financial reasoning, multilingual understanding, and decision-making into a unified evaluation suite designed to promote robust, transparent, and globally competent financial AI. The 2026 edition introduces three interconnected tasks spanning five languages.
Multi-modal inputs: news, filings, macro indicators, and exam-style tests.
Multiple languages with low-resource representations: English, Chinese, Arabic, Hindi, Greek, Japanese, Spanish.
Tasks spanning Q&A and decision-making.
Metrics centered on Accuracy, ROUGE-1, BLEURT, and quantitative trading-performance metrics (e.g., Cumulative Return (CR), Sharpe Ratio (SR), Max Drawdown (MD)).
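The trading metrics above can be sketched from a series of daily returns. This is a minimal illustration using the standard textbook definitions; the lab's exact conventions (annualization, risk-free rate, sampling) are assumptions and may differ.

```python
import math

def trading_metrics(daily_returns):
    """Compute Cumulative Return (CR), Sharpe Ratio (SR), and Max Drawdown (MD)
    from a list of simple daily returns. Standard definitions only; the lab's
    official evaluation may annualize or subtract a risk-free rate."""
    # Cumulative Return: compounded growth of the equity curve minus 1.
    equity = [1.0]
    for r in daily_returns:
        equity.append(equity[-1] * (1.0 + r))
    cr = equity[-1] - 1.0

    # Sharpe Ratio: mean return over standard deviation (risk-free rate = 0).
    n = len(daily_returns)
    mean = sum(daily_returns) / n
    var = sum((r - mean) ** 2 for r in daily_returns) / n
    sr = mean / math.sqrt(var) if var > 0 else 0.0

    # Max Drawdown: largest peak-to-trough decline of the equity curve.
    peak, md = equity[0], 0.0
    for v in equity:
        peak = max(peak, v)
        md = max(md, (peak - v) / peak)
    return cr, sr, md
```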
"How can I tailor my setup to make an LLM exceptionally good at finance?"
Tasks
Choose one or more tasks. Each submission must provide calibrated confidence scores and an evidence trace.
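A per-question result record with a confidence score and an evidence trace could look like the sketch below. The field names (`question_id`, `prediction`, `confidence`, `evidence`) are illustrative assumptions, not the official schema; consult the task guidelines for the required format.

```python
import json

# Hypothetical result record -- field names are illustrative only.
record = {
    "question_id": "cfa-0042",          # assumed ID format
    "prediction": "A2",                  # chosen option label
    "confidence": 0.87,                  # calibrated probability of the choice
    "evidence": [                        # trace of passages/steps used
        "Duration measures price sensitivity to yield changes.",
    ],
}

# One JSON object per line = JSONL.
line = json.dumps(record)
with open("task1_results.jsonl", "a") as f:
    f.write(line + "\n")
```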
Task 1 - Financial Exam Q&A
Given a stand-alone multiple-choice question Q with four candidate options { A1, A2, A3, A4 }, the system must select the correct answer among them. Questions cover valuation, accounting, ethics, corporate finance, and regulatory knowledge.
Motivation
Professional financial qualification exams (e.g., CFA, EFPA) require the integration of theoretical and regulatory knowledge with applied reasoning. Existing LLMs often rely on factual recall without demonstrating the analytical rigor expected from human candidates.
Data
- EFPA (Spanish): 50 exam-style financial questions on investment and regulation.
- GRFinQA (Greek): 225 multiple-choice finance questions from university-level exams.
- CFA (English): 600 exam-style multiple-choice questions covering nine core domains.
- CPA (Chinese): 300 exam-style financial questions focusing on major modules.
- BBF (Hindi): 500-1000 exam-style financial multiple-choice questions covering over 30 domains.
Evaluation
Models are required to output the correct answer label. Performance is measured by accuracy, defined as the proportion of correctly identified options in the test set.
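The accuracy metric described above reduces to a one-line comparison of predicted and gold option labels. A minimal sketch, assuming predictions and gold keys are dictionaries mapping question IDs to labels:

```python
def accuracy(predictions, gold):
    """Proportion of questions whose predicted option label matches the
    answer key. Both arguments map question IDs to labels like 'A1'..'A4';
    a missing prediction counts as incorrect."""
    correct = sum(
        1 for qid, label in gold.items() if predictions.get(qid) == label
    )
    return correct / len(gold)
```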
Important Dates
Specific dates will be announced once they are fixed.
- Lab registration opens: 17 November 2025
- Training data released: 15 December 2025 • Available now via the Hugging Face collection
- Lab registration closes: 23 April 2026
- Task 3 submission deadline: 28 April 2026
- Beginning of the evaluation cycle (test sets release): May 2026
- End of the evaluation cycle (run submission): 07 May 2026
- Deadline for the submission of working notes [CEUR-WS]: 28 May 2026
- Review process of participant papers: 28 May – 30 June 2026
- Submission of Condensed Lab Overviews [LNCS]: 08 June 2026
- Notification of Acceptance for Condensed Lab Overviews [LNCS]: 15 June 2026
- Camera-Ready Copy of Condensed Lab Overviews [LNCS] due: 22 June 2026
- Notification of Acceptance for Participant Papers [CEUR-WS]: 30 June 2026
- Camera-Ready Copy of Participant Papers and Extended Lab Overviews [CEUR-WS] due: 06 July 2026
- CLEF 2026 Conference: 21–24 September 2026 • Jena, Germany
Awards
Top submissions will be recognized with monetary awards.
- Best Paper: a single award for the top-ranked submission and paper.
- Outstanding Paper: three awards recognizing strong submissions and writing.
- Merit / Encouragement: two awards celebrating promising approaches.
How to Participate
Engage with the challenges in whatever way suits you, from a quick one-off experiment to a full research project. While we invite you to share your findings in the working notes, you are also free to develop promising results into a full paper for an archival journal.
The workshop itself is a perfect opportunity to refine your ideas through discussion with peers.
Ready to join?
Sign up via the CLEF registration form (FinMMEval section)
Packaging Checklist
- Results JSONL (per task)
- System Card (architecture, data usage, risks)
- Reproducibility (seed, versions, hardware)
- License compliance acknowledgements (if applicable)
Organizers
Organizing committee and partner institutions.
Frequently Asked Questions
Who can participate?
Researchers and practitioners from academia and industry. Student teams are particularly welcome.
How is data licensed?
Research-only license; redistribution of raw sources may be restricted.
Can we submit to multiple tasks?
Yes. Submit independent result bundles per task.
Are ensembles allowed?
Yes, but disclose all components in the system card.
Contact
zhuohan.xie@mbzuai.ac.ae