The FinMMEval Hugging Face collection is the official public data release for Task 1. Participants may use the released datasets for training and may reorganize or re-split them as needed.
The language-specific dev sets on this page are separate organizer-held evaluation data. They are intended for validation on the live leaderboard and should not be added back into training. Final-test questions are released through the test portals, while final-test labels remain private for official evaluation.
English, Chinese, Arabic, and Hindi dev leaderboards are live.
At present, we do not enforce a hard submission cap on the Task 1 dev leaderboards. Participants may submit as many times as needed, but should avoid unnecessary rapid resubmissions.
Rows marked as Baseline are organizer sanity checks: Random, Always A, Round Robin, and Qwen2.5-0.5B-Instruct zero-shot.
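For intuition about what the three non-model baselines measure, here is a minimal sketch in Python. It is not the organizers' implementation. It assumes multiple-choice questions with four option labels A to D and accuracy as the metric, neither of which is specified on this page, and all function names are hypothetical.

```python
import random
from typing import List

# Assumption: four answer options per question; the real tasks may differ.
OPTIONS = ["A", "B", "C", "D"]


def random_baseline(n_questions: int, seed: int = 0) -> List[str]:
    """Pick one option uniformly at random for each question."""
    rng = random.Random(seed)  # seeded so repeated runs give the same answers
    return [rng.choice(OPTIONS) for _ in range(n_questions)]


def always_a_baseline(n_questions: int) -> List[str]:
    """Answer 'A' for every question."""
    return ["A"] * n_questions


def round_robin_baseline(n_questions: int) -> List[str]:
    """Cycle through the options in order: A, B, C, D, A, ..."""
    return [OPTIONS[i % len(OPTIONS)] for i in range(n_questions)]


def accuracy(predictions: List[str], gold: List[str]) -> float:
    """Fraction of predictions that match the gold labels (assumed metric)."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)
```

With four balanced options, each of these baselines should land near 25% accuracy. If Always A or Round Robin scores clearly above or below that, the answer key is likely skewed toward certain positions, and a submission that does not clearly beat these rows has probably not learned anything beyond position bias.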