# Changelog # 📢 Release v1.0.3 - 🚨 The `IndicProcessor` class has been re-written in [Cython](https://github.com/cython/cython) for faster implementation. This gives us atleast `+10 lines/s`. - A new `visualize` argument as been added to `preprocess_batch` to track the processing with a `tqdm` bar. # 📢 Release v1.0.2 - The repository has been renamed to `IndicTransToolkit`. - 🚨 The custom tokenizer is now **removed** from the repository. Please revert to a previous commit ([v1.0.1](https://github.com/VarunGumma/IndicTransToolkit/tree/0e68fb5872f4d821578a5252f90ad43c9649370f)) to use it **(strongly discouraged)**. The official _(and only tokenizer)_ is available on HF along with the models. # 📢 Release v1.0.0 - The [PreTrainedTokenizer](https://huggingface.co/docs/transformers/main_classes/tokenizer) for IndicTrans2 is now available on HF 🎉🎉 Note that, you still need the `IndicProcessor` to pre-process the sentences before tokenization. - 🚨 **In favor of the standard PreTrainedTokenizer, we deprecated the custom tokenizer. However, this custom tokenizer will still be available here for backward compatibility, but no further updates/bug-fixes will be provided.** - The `indic_evaluate` function is now consolidated into a concrete `IndicEvaluator` class. - The data collation function for training is consolidated into a concrete `IndicDataCollator` class. - A simple batching method is now available in the `IndicProcessor`.