---
license: unknown
language:
- ar
tags:
- Wikipedia
- Wikipedia_Categories
---
# enwiki_to_arwiki_categories Dataset
This dataset contains mappings between English Wikipedia categories and their corresponding Arabic Wikipedia categories.
## Files
1. **[langlinks.json](langlinks.json) (818,354 mappings)**
    * This file contains all category links from enwiki to arwiki.
    * Dataset at: [Ibrahemqasim/categories_en2ar](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar)
2. **[filtered_data.json](filtered_data.json) (231,314 mappings)**
    * This file keeps only the mappings that contain a 4-digit year; all other mappings are filtered out.
    * Dataset at: [Ibrahemqasim/categories_en2ar_with_years](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar_with_years)
3. **[cats_2000.json](cats_2000.json) (21,170 mappings)**
    * This file contains the mappings from `filtered_data.json` with this change:
        1. Replacing all 4-digit years with the year `2000`.
    * Dataset at: [Ibrahemqasim/categories_en2ar-cats_2000](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar-cats_2000)
4. **[cats_2000_country.json](cats_2000_country.json) (1,234 mappings)**
    * This file contains the mappings from `filtered_data.json` with these changes:
        1. Replacing all 4-digit years with the year `2000`.
        2. Replacing country names with the word `country`.
    * Dataset at: [Ibrahemqasim/categories_en2ar-cats_2000_contry](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar-cats_2000_contry)
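
The filtering and replacement steps described above can be sketched roughly as follows. This is a minimal illustration, not the script used to build the dataset. It assumes each file is a flat JSON object mapping an English category title to its Arabic counterpart, that a "year" is any standalone run of exactly four digits, and that replacements are applied to both sides of a mapping. The helper names, the sample mappings, and the two short country lists are hypothetical.

```python
import re

# Assumption: a standalone run of exactly four digits counts as a year.
YEAR_RE = re.compile(r"(?<!\d)\d{4}(?!\d)")

# Hypothetical, deliberately tiny country lists for illustration only.
COUNTRIES_EN = ["France", "Egypt"]
COUNTRIES_AR = ["فرنسا", "مصر"]


def has_year(title: str) -> bool:
    """Return True if the title contains a standalone 4-digit number."""
    return YEAR_RE.search(title) is not None


def normalize_years(title: str) -> str:
    """Replace every standalone 4-digit number with the literal 2000."""
    return YEAR_RE.sub("2000", title)


def replace_countries(title: str, countries: list[str]) -> str:
    """Replace each listed country name with the placeholder word 'country'."""
    for name in countries:
        title = title.replace(name, "country")
    return title


def build_templates(mappings: dict[str, str]):
    """Derive the three smaller mappings from the full en -> ar mapping."""
    # Step 1: keep only pairs whose English title contains a year.
    filtered = {en: ar for en, ar in mappings.items() if has_year(en)}
    # Step 2: collapse all years to 2000; identical results merge into one key.
    cats_2000 = {
        normalize_years(en): normalize_years(ar) for en, ar in filtered.items()
    }
    # Step 3: additionally replace country names with 'country'.
    cats_2000_country = {
        replace_countries(en, COUNTRIES_EN): replace_countries(ar, COUNTRIES_AR)
        for en, ar in cats_2000.items()
    }
    return filtered, cats_2000, cats_2000_country


# Hypothetical sample input in the assumed {english: arabic} shape.
sample = {
    "Category:1990 in France": "تصنيف:1990 في فرنسا",
    "Category:1991 in France": "تصنيف:1991 في فرنسا",
    "Category:French painters": "تصنيف:رسامون فرنسيون",
}

filtered, cats_2000, cats_2000_country = build_templates(sample)
print(len(filtered), len(cats_2000), len(cats_2000_country))
```

In this sketch the two year-specific entries collapse into a single template once their years are normalized, which is one plausible reason each derived file is smaller than the one before it.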