This model is fine-tuned on the Tamil dataset from Common Voice 16.1. The transcripts were preprocessed with Epitran, which transliterates text into IPA, using the 'tam-Taml' language code. That mapping yields the following phoneme list for Tamil:

  • Vowels:

    • Monophthongs: 'a', 'aː', 'e', 'eː', 'i', 'iː', 'o', 'oː', 'u', 'uː'
    • Diphthongs: 'aj', 'aʋ'
  • Consonants:

    • Nasals: 'm', 'n̪', 'n', 'ɳ', 'ɲ', 'ŋ'
    • Stops: 'p', 't̪', 'ʈ', 'k'
    • Affricates: 't͡ʃ', 'd͡ʒ'
    • Fricatives: 's', 'ʂ', 'ʃ', 'h'
    • Tap: 'ɾ'
    • Trill: 'r'
    • Approximants: 'ʋ', 'ɻ', 'j', 'l', 'ɭ'
    • Consonant cluster: 'kʂ'
  • Special Symbols: '்' (the Tamil pulli, which denotes the absence of the inherent vowel)
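
As a rough illustration of how the inventory above could be applied, the sketch below splits an IPA string into phonemes from this list using greedy longest-match. It is not taken from the model's training code: the names `PHONEMES`, `PULLI`, and `tokenize_ipa` are hypothetical, and the input string is assumed to already be in IPA (for example, as produced by Epitran with the 'tam-Taml' code).

```python
# Phoneme inventory copied from the list above.
PHONEMES = [
    # Vowels: monophthongs and diphthongs
    "a", "aː", "e", "eː", "i", "iː", "o", "oː", "u", "uː",
    "aj", "aʋ",
    # Consonants: nasals, stops, affricates, fricatives
    "m", "n̪", "n", "ɳ", "ɲ", "ŋ",
    "p", "t̪", "ʈ", "k",
    "t͡ʃ", "d͡ʒ",
    "s", "ʂ", "ʃ", "h",
    # Tap, trill, approximants, consonant cluster
    "ɾ", "r",
    "ʋ", "ɻ", "j", "l", "ɭ",
    "kʂ",
]

# Tamil pulli (U+0BCD), listed above as a special symbol.
PULLI = "\u0bcd"

# Try longer symbols first so that 'aː' wins over 'a', 't͡ʃ' over 't', etc.
_BY_LENGTH = sorted(set(PHONEMES + [PULLI]), key=len, reverse=True)


def tokenize_ipa(text):
    """Split an IPA string into inventory symbols (greedy longest match).

    Whitespace is skipped. A character that starts no known symbol
    raises ValueError rather than being silently dropped.
    """
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        for symbol in _BY_LENGTH:
            if text.startswith(symbol, i):
                tokens.append(symbol)
                i += len(symbol)
                break
        else:
            raise ValueError(f"unknown symbol {text[i]!r} at index {i}")
    return tokens


if __name__ == "__main__":
    print(tokenize_ipa("t̪amiɻ"))
```

Greedy matching is a simplifying assumption: a sequence such as 'a' followed by 'j' is always read as the diphthong 'aj', and 'k' followed by 'ʂ' as the cluster 'kʂ'. Whether that matches the segmentation used when preparing the training data is not stated in this card.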


Dataset used to train speech31/XLS-R-tamil-phoneme