---
license: unknown
language:
- ar
tags:
- Wikipedia
- Wikipedia_Categories
---

# enwiki_to_arwiki_categories Dataset

This dataset contains mappings between English Wikipedia categories and their corresponding Arabic Wikipedia categories.

## Files

1. **[langlinks.json](langlinks.json) (818,354)**
   * This file contains all category links from enwiki to arwiki.
   * Dataset at: [Ibrahemqasim/categories_en2ar](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar)

2. **[filtered_data.json](filtered_data.json) (231,314)**
   * This file contains the mappings after filtering out those that do not contain a 4-digit year.
   * Dataset at: [Ibrahemqasim/categories_en2ar_with_years](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar_with_years)

3. **[cats_2000.json](cats_2000.json) (21,170)**
   * This file contains the mappings from `filtered_data.json` with these changes:
     1. Replacing all 4-digit years with the year `2000`.
   * Dataset at: [Ibrahemqasim/categories_en2ar-cats_2000](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar-cats_2000)

4. **[cats_2000_country.json](cats_2000_country.json) (1,234)**
   * This file contains the mappings from `filtered_data.json` with these changes:
     1. Replacing all 4-digit years with the year `2000`.
     2. Replacing country names with the word `country`.
   * It contains 1,234 mappings.
   * Dataset at: [Ibrahemqasim/categories_en2ar-cats_2000_contry](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar-cats_2000_contry)
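
## Example: year filtering and normalization

The year-based steps described for `filtered_data.json` and `cats_2000.json` can be sketched roughly as follows. This is a minimal illustration, not the script used to build the dataset. It assumes the mappings are held as a plain dictionary from English category title to Arabic category title, that the year filter is applied to the English title, and that duplicates produced by normalization are collapsed by keeping the first one seen. The names `has_year`, `normalize_years`, and `build_cats_2000` are hypothetical, and the sample titles are illustrative only.

```python
import re

# Assumption: a "4-digit year" is any run of exactly four ASCII digits
# that is not part of a longer number.
YEAR_RE = re.compile(r"(?<![0-9])[0-9]{4}(?![0-9])")


def has_year(title: str) -> bool:
    """Return True if the title contains a standalone 4-digit number."""
    return YEAR_RE.search(title) is not None


def normalize_years(title: str, placeholder: str = "2000") -> str:
    """Replace every standalone 4-digit number with the placeholder year."""
    return YEAR_RE.sub(placeholder, title)


def build_cats_2000(mappings: dict[str, str]) -> dict[str, str]:
    """Keep only mappings whose English title has a year, then normalize both sides.

    Many year-specific categories collapse to the same normalized key,
    so the first normalized pair seen is kept (an assumption).
    """
    result: dict[str, str] = {}
    for en_title, ar_title in mappings.items():
        if not has_year(en_title):
            continue  # the filtering step: drop titles without a year
        result.setdefault(normalize_years(en_title), normalize_years(ar_title))
    return result


# Illustrative sample, not taken from the dataset files.
sample = {
    "Category:1990 births": "تصنيف:مواليد 1990",
    "Category:1991 births": "تصنيف:مواليد 1991",
    "Category:Living people": "تصنيف:أشخاص على قيد الحياة",
}

cats_2000 = build_cats_2000(sample)
```

Collapsing year-specific titles into a single template is consistent with `cats_2000.json` holding far fewer entries than `filtered_data.json`. The country replacement used for `cats_2000_country.json` is not shown here because it depends on a list of country names in both languages that this card does not specify.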