<!doctype html>
<html>
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width" />
<title>Iqra’Eval Shared Task</title>
<link rel="stylesheet" href="style.css" />
</head>
<body>
<div class="card">

<h1>Iqra’Eval Shared Task</h1>
<h2>Overview</h2>
<p>
<strong>Iqra’Eval</strong> is a shared task aimed at advancing <strong>automatic assessment of Qur’anic recitation pronunciation</strong> by leveraging computational methods to detect and diagnose pronunciation errors. The focus on Qur’anic recitation provides a standardized and well-defined context for evaluating Modern Standard Arabic (MSA) pronunciation, where precise articulation is not only valued but essential for correctness according to established Tajweed rules.
</p>
<p>
Participants will develop systems capable of:
</p>
<ul>
<li>Detecting whether a segment of Qur’anic recitation contains pronunciation errors.</li>
<li>Diagnosing the nature of the error (e.g., substitution, deletion, or insertion of phonemes).</li>
</ul>
<h2>Timeline</h2>
<ul>
<li><strong>June 1, 2025</strong>: Official announcement of the shared task</li>
<li><strong>June 5, 2025</strong>: Release of training data, development set (QuranMB), phonetizer script, and baseline systems</li>
<li><strong>July 24, 2025</strong>: Registration deadline and release of test data</li>
<li><strong>July 27, 2025</strong>: End of evaluation cycle (test set submission closes)</li>
<li><strong>July 30, 2025</strong>: Final results released</li>
<li><strong>August 15, 2025</strong>: System description paper submissions due</li>
<li><strong>August 22, 2025</strong>: Notification of acceptance</li>
<li><strong>September 5, 2025</strong>: Camera-ready versions due</li>
</ul>
<h2>🔊 Task Description</h2>
<p>
The Iqra’Eval task focuses on <strong>automatic pronunciation assessment</strong> in the Qur’anic context.
Given a spoken audio clip of a verse and its fully vowelized reference text, your system should predict
the <strong>phoneme sequence actually spoken</strong> by the reciter.
</p>
<p>
By comparing the predicted sequence against the gold phoneme annotation of the reference text, we can automatically detect pronunciation issues such as:
</p>
<ul>
<li><strong>Substitutions</strong>: e.g., saying /k/ instead of /q/</li>
<li><strong>Insertions</strong>: adding a sound not present in the reference</li>
<li><strong>Deletions</strong>: skipping a required phoneme</li>
</ul>
<p>
This comparison helps diagnose and localize pronunciation errors, enabling educational feedback in applications such as Qur’anic tutoring and speech evaluation tools.
</p>
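<p>
To make the comparison concrete, the sketch below aligns a reference phoneme sequence with a predicted one using Python’s <code>difflib</code> and labels the differences. It is illustrative only, not the official scoring code; the <code>diagnose</code> helper and the toy phoneme strings are our own assumptions.
</p>
<pre>
# A minimal sketch: classify phoneme-level differences between a reference
# and a predicted sequence. difflib stands in for the task's alignment step.
from difflib import SequenceMatcher

def diagnose(reference: str, predicted: str):
    ref, hyp = reference.split(), predicted.split()
    errors = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, ref, hyp).get_opcodes():
        if tag == "replace":
            errors.append(("substitution", ref[i1:i2], hyp[j1:j2]))
        elif tag == "delete":
            errors.append(("deletion", ref[i1:i2], []))
        elif tag == "insert":
            errors.append(("insertion", [], hyp[j1:j2]))
    return errors

# Toy example: /q/ pronounced as /k/, plus two dropped phonemes.
print(diagnose("q u l h u w a", "k u l h u"))
# [('substitution', ['q'], ['k']), ('deletion', ['w', 'a'], [])]
</pre>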
<h2>Dataset Description</h2>
<p>
All data are hosted on Hugging Face. Two main splits are provided:
</p>
<ul>
<li>
<strong>Training set:</strong> 79 hours of Modern Standard Arabic (MSA) speech, augmented with multiple Qur’anic recitations.
<br />
<code>df = load_dataset("IqraEval/Iqra_train", split="train")</code>
</li>
<li>
<strong>Development set:</strong> 3.4 hours reserved for tuning and validation.
<br />
<code>df = load_dataset("IqraEval/Iqra_train", split="dev")</code>
</li>
</ul>
<p>
<strong>Column Definitions:</strong>
</p>
<ul>
<li><code>audio</code>: Speech waveform array.</li>
<li><code>sentence</code>: Original sentence text (may be partially diacritized or non-diacritized).</li>
<li><code>index</code>: If the sentence is a Qur’anic verse, its verse index (0–6265, including the Basmalah); otherwise <code>-1</code>.</li>
<li><code>tashkeel_sentence</code>: Fully diacritized sentence (auto-generated with a diacritization tool).</li>
<li><code>phoneme</code>: Phoneme sequence corresponding to the diacritized sentence (generated with the Nawar Halabi phonetizer).</li>
</ul>
<p>
<strong>Data Splits:</strong>
<br />
• Training (train): 79 hours total<br />
• Development (dev): 3.4 hours total
</p>
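<p>
As a quick start, the sketch below loads the training split and inspects one row. It assumes the <code>audio</code> column decodes to a dict with <code>array</code> and <code>sampling_rate</code> fields, as Hugging Face audio features typically do; check the dataset card if your version differs.
</p>
<pre>
# A minimal sketch: load the training split and look at one example.
# Requires: pip install datasets
from datasets import load_dataset

df = load_dataset("IqraEval/Iqra_train", split="train")

row = df[0]
print(row["sentence"])           # original (possibly undiacritized) text
print(row["tashkeel_sentence"])  # fully diacritized text
print(row["phoneme"])            # phoneme sequence from the phonetizer
print(row["index"])              # verse index, or -1 for non-Quranic text

# Audio is assumed to decode to a waveform array plus sampling rate.
audio = row["audio"]
print(audio["sampling_rate"], len(audio["array"]))
</pre>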
<h2>TTS Data (Optional Use)</h2>
<p>
We also provide a high-quality TTS corpus for auxiliary experiments (e.g., data augmentation, synthetic pronunciation error simulation). This TTS set can be loaded via:
</p>
<ul>
<li><code>df_tts = load_dataset("IqraEval/Iqra_TTS")</code></li>
</ul>
<h2>Resources</h2>
<ul>
<li>
<a href="https://huggingface.co/datasets/IqraEval/Iqra_train" target="_blank">
Training &amp; Development Data on Hugging Face
</a>
</li>
<li>
<a href="https://huggingface.co/datasets/IqraEval/Iqra_TTS" target="_blank">
IqraEval TTS Data on Hugging Face
</a>
</li>
<li>
<a href="https://github.com/Iqra-Eval/interspeech_IqraEval" target="_blank">
Baseline systems &amp; training scripts (GitHub)
</a>
</li>
</ul>
<p>
<em>
For detailed instructions on data access, phonetizer installation, and baseline usage, please refer to the GitHub README.
</em>
</p>
<h2>Evaluation Criteria</h2>
<p>
Systems will be scored on their ability to detect and correctly classify phoneme-level errors:
</p>
<ul>
<li><strong>Detection accuracy:</strong> Did the system spot that a phoneme-level error occurred in the segment?</li>
<li><strong>Mispronunciation detection F1-score:</strong> Precision and recall of the flagged errors, combined into an F1-score.</li>
</ul>
<p>
<em>(Detailed evaluation weights and scripts will be made available on June 5, 2025.)</em>
</p>
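<p>
Until the official scripts are released, the sketch below shows one plausible reading of these metrics: per-phoneme binary labels (1 = mispronounced) scored for accuracy and F1. The label encoding and the <code>detection_scores</code> helper are our assumptions, not the task definition.
</p>
<pre>
# A rough sketch of phoneme-level detection scoring, assuming per-phoneme
# binary labels (1 = mispronounced) from an alignment step. The official
# scripts (due June 5, 2025) may define and weight the metrics differently.
def detection_scores(gold, pred):
    tp = sum(g == 1 and p == 1 for g, p in zip(gold, pred))
    fp = sum(g == 0 and p == 1 for g, p in zip(gold, pred))
    fn = sum(g == 1 and p == 0 for g, p in zip(gold, pred))
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

print(detection_scores([1, 0, 0, 1], [1, 0, 1, 1]))
# accuracy 0.75, precision ~0.667, recall 1.0, F1 0.8
</pre>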
<h2>Submission Details (Draft)</h2>
<p>
Participants are required to submit a CSV file named <code>submission.csv</code> containing the predicted phoneme sequences for each audio sample. The file must have exactly two columns:
</p>
<ul>
<li><strong>ID:</strong> Unique identifier of the audio sample.</li>
<li><strong>Labels:</strong> The predicted phoneme sequence, with each phoneme separated by a single space.</li>
</ul>
<p>
Below is a minimal example illustrating the required format:
</p>
<pre>
ID,Labels
0000_0001, i n n a m a a y a k h a l l a h a m i n ʕ i b a a d i h u l ʕ u l a m
0000_0002, m a a n a n s a k h u m i n i ʕ a a y a t i n
0000_0003, y u k h i k u m u n n u ʔ a u ʔ a m a n a t a n m m i n h u
…
</pre>
<p>
The first column (ID) must exactly match the audio filename (without extension); the second column (Labels) is the predicted phoneme string.
</p>
<p>
<strong>Important:</strong>
</p>
<ul>
<li>Use UTF-8 encoding.</li>
<li>Do not include extra spaces at the start or end of each line.</li>
<li>Submit a single CSV file (no archives). The filename must be <code>submission.csv</code>.</li>
</ul>
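<p>
For reference, here is a minimal sketch of writing <code>submission.csv</code> with Python’s <code>csv</code> module; the <code>predictions</code> dict and its values are placeholders for your system’s output.
</p>
<pre>
# A minimal sketch: write predictions in the required two-column format.
# `predictions` maps sample IDs (audio filenames without extension) to
# space-separated phoneme strings; the values here are placeholders.
import csv

predictions = {
    "0000_0001": "i n n a m a a",
    "0000_0002": "m a a n a n s a",
}

with open("submission.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["ID", "Labels"])
    for sample_id, phonemes in predictions.items():
        # strip() guards against the extra leading/trailing spaces
        # the format rules above disallow
        writer.writerow([sample_id, phonemes.strip()])
</pre>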
<h2>Future Updates</h2>
<p>
Further details on <strong>evaluation criteria</strong> (exact scoring weights), <strong>submission templates</strong>, and any clarifications will be posted on the shared task website when the training data and baseline systems are released (June 5, 2025). Stay tuned!
</p>
</div>
</body>
</html>