<!doctype html>
<html>
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width" />
<title>Iqra’Eval Shared Task</title>
<link rel="stylesheet" href="style.css" />
</head>
<body>
<div class="card">
<h1>Iqra’Eval Shared Task</h1>
<!-- Overview Section -->
<h2>Overview</h2>
<p>
<strong>Iqra’Eval</strong> is a shared task aimed at advancing <strong>automatic assessment of Qur’anic recitation pronunciation</strong> by leveraging computational methods to detect and diagnose pronunciation errors. The focus on Qur’anic recitation provides a standardized and well-defined context for evaluating Modern Standard Arabic (MSA) pronunciation, where precise articulation is not only valued but essential for correctness according to established Tajweed rules.
</p>
<p>
Participants will develop systems capable of:
</p>
<ul>
<li>Detecting whether a segment of Qur’anic recitation contains pronunciation errors.</li>
<li>Diagnosing the nature of the error (e.g., substitution, deletion, or insertion of phonemes).</li>
</ul>
<!-- Timeline Section -->
<h2>Timeline</h2>
<ul>
<li><strong>June 1, 2025</strong>: Official announcement of the shared task</li>
<li><strong>June 5, 2025</strong>: Release of training data, development set (QuranMB), phonetizer script, and baseline systems</li>
<li><strong>July 24, 2025</strong>: Registration deadline and release of test data</li>
<li><strong>July 27, 2025</strong>: End of evaluation cycle (test set submission closes)</li>
<li><strong>July 30, 2025</strong>: Final results released</li>
<li><strong>August 15, 2025</strong>: System description paper submissions due</li>
<li><strong>August 22, 2025</strong>: Notification of acceptance</li>
<li><strong>September 5, 2025</strong>: Camera-ready versions due</li>
</ul>
<!-- Task Description -->
<h2>🔊 Task Description</h2>
<p>
The Iqra’Eval task focuses on <strong>automatic pronunciation assessment</strong> in a Qur’anic context.
Given a spoken audio clip of a verse and its fully vowelized reference text, your system should predict
the <strong>phoneme sequence actually produced</strong> by the reciter.
</p>
<p>
By comparing the predicted sequence against the gold phoneme sequence derived from the reference text, pronunciation issues can be detected automatically, such as:
</p>
<ul>
<li><strong>Substitutions</strong>: e.g., saying /k/ instead of /q/</li>
<li><strong>Insertions</strong>: adding a sound not present in the reference</li>
<li><strong>Deletions</strong>: skipping a required phoneme</li>
</ul>
<p>
This task helps diagnose and localize pronunciation errors, enabling educational feedback in applications like Qur’anic tutoring or speech evaluation tools.
</p>
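<p>
As a rough illustration only (not the official scoring code), the sketch below aligns a reference phoneme sequence against a predicted one using Python’s standard <code>difflib</code> and labels the differences; the phoneme strings are hypothetical:
</p>
<pre>
# Illustrative only: align reference vs. predicted phonemes and label the differences.
from difflib import SequenceMatcher

reference = "q a d i i r u n".split()   # gold phonemes (hypothetical example)
predicted = "k a d i i r u n".split()   # system output with a /q/ -> /k/ substitution

matcher = SequenceMatcher(None, reference, predicted)
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag == "replace":
        print("substitution at", i1, reference[i1:i2], "->", predicted[j1:j2])
    elif tag == "delete":
        print("deletion at", i1, reference[i1:i2])
    elif tag == "insert":
        print("insertion at", i1, predicted[j1:j2])
</pre>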
<!-- <h2>Task Description</h2>
<p>
The Iqra’Eval shared task focuses on automatic mispronunciation detection and diagnosis in Qur’anic recitation. Given:
</p>
<ol>
<li>A speech segment (an audio clip of a Qur’anic verse recitation), and</li>
<li>A fully vowelized reference transcript (the corresponding Qur’anic text, fully diacritized),</li>
</ol>
<p>
the goal is to identify any pronunciation errors, localize them within the phoneme sequence, and classify the type of error based on Tajweed rules.
</p>
<p>
Each participant’s system must predict the sequence of phonemes that the reciter actually produced. A standardized phonemizer (Nawar Halabi’s phonetizer) will be used to generate the “gold” phoneme sequence from the reference transcript for comparison.
</p>
<p>
<strong>Key subtasks:</strong>
</p>
<ul>
<li>Compare predicted phoneme sequence vs. gold reference.</li>
<li>Detect substitutions (e.g., pronouncing /q/ as /k/), deletions (e.g., dropping a hamza), or insertions (e.g., adding an extra vowel) of phonemes.</li>
<li>Localize the error to a specific phoneme index in the utterance.</li>
<li>Classify what type of mistake occurred based on Tajweed (e.g., madd errors, ikhfa, idgham, etc.).</li>
</ul> -->
<!-- Example & Illustration -->
<!-- <h2>Example</h2>
<p>
Suppose the reference verse (fully vowelized) is:
</p>
<blockquote>
<p>
إِنَّ اللَّهَ عَلَىٰ كُلِّ شَيْءٍ قَدِيرٌ
<br />
(inna l-lāha ʿalā kulli shay’in qadīrun)
</p>
</blockquote>
<p>
The gold phoneme sequence (using the standard phonemizer) might be:
</p>
<pre>
inna l l aa h a ʕ a l a k u l l i ʃ a y ’ i n q a d i r u n
</pre>
<p>
If a reciter mispronounces “قَدِيرٌ” (qadīrun) as “كَدِيرٌ” (kadīrun), that corresponds to a substitution at the very start of that word: phoneme /q/ → /k/.
</p>
<p>
A well-trained system should:
</p>
<ol>
<li>Flag the pronunciation of “قَدِيرٌ” as erroneous,</li>
<li>Identify that the first phoneme in that word was substituted (“/q/” → “/k/”), and</li>
<li>Classify it under the Tajweed error category “Ghunnah/Qaf vs. Kaf error.”</li>
</ol>
<div style="text-align: center; margin: 1em 0;">
<img src="images/pronunciation_assessment_arabic.png" alt="Pronunciation Assessment in Arabic" style="max-width: 100%; height: auto;" />
<p style="font-size: 0.9em; color: #555;">
<em>Figure: Example of a phoneme-level comparison between reference vs. predicted for an Arabic Qur’anic recitation.</em>
</p>
</div> -->
<!-- Evaluation Criteria -->
<!-- Dataset Description -->
<h2>Dataset Description</h2>
<p>
All data are hosted on Hugging Face. Two main splits are provided:
</p>
<ul>
<li>
<strong>Training set:</strong> 79 hours of Modern Standard Arabic (MSA) speech, augmented with multiple Qur’anic recitations.
<br />
<code>df = load_dataset("IqraEval/Iqra_train", split="train")</code>
</li>
<li>
<strong>Development set:</strong> 3.4 hours reserved for tuning and validation.
<br />
<code>df = load_dataset("IqraEval/Iqra_train", split="dev")</code>
</li>
</ul>
<p>
<strong>Column Definitions:</strong>
</p>
<ul>
<li><code>audio</code>: Speech Array.</li>
<li><code>sentence</code>: Original sentence text (may be partially diacritized or non-diacritized).</li>
<li><code>index</code>: If the utterance is from the Qur’an, the verse index (0–6265, including the Basmalah); otherwise <code>-1</code>.</li>
<li><code>tashkeel_sentence</code>: Fully diacritized sentence (auto-generated via a diacritization tool).</li>
<li><code>phoneme</code>: Phoneme sequence corresponding to the diacritized sentence (Nawar Halabi phonetizer).</li>
</ul>
<p>
<strong>Data Splits:</strong>
<br />
• Training (train): 79 hours total<br />
• Development (dev): 3.4 hours total
</p>
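<p>
For convenience, the following is a minimal loading sketch assuming the Hugging Face <code>datasets</code> library; the field names follow the column definitions above, and the structure of the decoded <code>audio</code> field is an assumption based on the standard <code>datasets</code> audio feature:
</p>
<pre>
# Minimal sketch: load the training and development splits and inspect one sample.
from datasets import load_dataset

train = load_dataset("IqraEval/Iqra_train", split="train")
dev = load_dataset("IqraEval/Iqra_train", split="dev")

sample = train[0]
print(sample["sentence"])            # original (possibly undiacritized) text
print(sample["tashkeel_sentence"])   # fully diacritized text
print(sample["phoneme"])             # gold phoneme sequence
# Assumes the standard `datasets` audio feature (dict with "array" and "sampling_rate").
print(sample["audio"]["sampling_rate"], len(sample["audio"]["array"]))
</pre>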
<!-- Additional TTS Data -->
<h2>TTS Data (Optional Use)</h2>
<p>
We also provide a high-quality TTS corpus for auxiliary experiments (e.g., data augmentation, synthetic pronunciation error simulation). This TTS set can be loaded via:
</p>
<ul>
<li><code>df_tts = load_dataset("IqraEval/Iqra_TTS")</code></li>
</ul>
<!-- Resources & Links -->
<h2>Resources</h2>
<ul>
<li>
<a href="https://huggingface.co/datasets/IqraEval/Iqra_train" target="_blank">
Training &amp; Development Data on Hugging Face
</a>
</li>
<li>
<a href="https://huggingface.co/datasets/IqraEval/Iqra_TTS" target="_blank">
IqraEval TTS Data on Hugging Face
</a>
</li>
<li>
<a href="https://github.com/Iqra-Eval/interspeech_IqraEval" target="_blank">
Baseline systems &amp; training scripts (GitHub)
</a>
</li>
</ul>
<p>
<em>
For detailed instructions on data access, phonetizer installation, and baseline usage, please refer to the GitHub README.
</em>
</p>
<h2>Evaluation Criteria</h2>
<p>
Systems will be scored on their ability to detect and correctly classify phoneme-level errors:
</p>
<ul>
<li><strong>Detection accuracy:</strong> Did the system spot that a phoneme-level error occurred in the segment?</li>
<li><strong>Mispronunciation detection F1-score:</strong> Precision and recall of the errors flagged at the phoneme level, combined into an F1-score.</li>
</ul>
<p>
<em>(Detailed evaluation weights and scripts will be made available on June 5, 2025.)</em>
</p>
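<p>
For intuition only (the official metric definitions and scripts are forthcoming), one common way to compute a phoneme-level mispronunciation detection F1 compares, at each aligned position, whether the gold annotation records an error and whether the system flags one. The sketch below assumes the three sequences have already been aligned to equal length:
</p>
<pre>
# Illustrative only: phoneme-level mispronunciation detection F1.
# canonical: phonemes of the reference text; gold: phonemes actually uttered
# (annotation); predicted: system output. All three are assumed pre-aligned.
def detection_f1(canonical, gold, predicted):
    tp = fp = fn = 0
    for c, g, p in zip(canonical, gold, predicted):
        error_exists = g != c       # annotation says a mispronunciation occurred here
        error_flagged = p != c      # system output differs from the canonical phoneme
        if error_exists and error_flagged:
            tp += 1
        elif error_flagged:
            fp += 1
        elif error_exists:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
</pre>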
<!-- Submission Details -->
<h2>Submission Details (Draft)</h2>
<p>
Participants are required to submit a CSV file named <code>submission.csv</code> containing the predicted phoneme sequences for each audio sample. The file must have exactly two columns:
</p>
<ul>
<li><strong>ID:</strong> Unique identifier of the audio sample.</li>
<li><strong>Labels:</strong> The predicted phoneme sequence, with each phoneme separated by a single space.</li>
</ul>
<p>
Below is a minimal example illustrating the required format:
</p>
<pre>
ID,Labels
0000_0001, i n n a m a a y a k h a l l a h a m i n ʕ i b a a d i h u l ʕ u l a m
0000_0002, m a a n a n s a k h u m i n i ʕ a a y a t i n
0000_0003, y u k h i k u m u n n u ʔ a u ʔ a m a n a t a n m m i n h u
</pre>
<p>
The first column (ID) must exactly match the audio filename (without extension). The second column (Labels) contains the predicted phoneme string.
</p>
<p>
<strong>Important:</strong>
</p>
<ul>
<li>Use UTF-8 encoding.</li>
<li>Do not include extra spaces at the start or end of each line.</li>
<li>Submit a single CSV file (no archives). The filename must be <code>submission.csv</code>.</li>
</ul>
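<p>
As a minimal sketch of producing the required file (the <code>predictions</code> dictionary and its contents are hypothetical system output):
</p>
<pre>
# Minimal sketch: write predictions to submission.csv in the required two-column format.
import csv

# Maps each audio ID (filename without extension) to its predicted phoneme list.
predictions = {
    "0000_0001": ["i", "n", "n", "a", "m", "a", "a"],
    "0000_0002": ["m", "a", "a", "n", "a", "n", "s", "a"],
}

with open("submission.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["ID", "Labels"])
    for audio_id, phonemes in predictions.items():
        writer.writerow([audio_id, " ".join(phonemes)])
</pre>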
<!-- Placeholder for Future Details -->
<h2>Future Updates</h2>
<p>
Further details on <strong>evaluation criteria</strong> (exact scoring weights), <strong>submission templates</strong>, and any clarifications will be posted on the shared task website when the training data and baseline systems are released (June 5, 2025). Stay tuned!
</p>
</div>
</body>
</html>