---
license: unknown
language:
- ar
tags:
- Wikipedia
- Wikipedia_Categories
---

# enwiki_to_arwiki_categories Dataset

This dataset contains mappings between English Wikipedia categories and their corresponding Arabic Wikipedia categories.

## Files

1.  **[langlinks.json](langlinks.json) (818,354 mappings)**
    * This file contains all category language links from English Wikipedia (enwiki) to Arabic Wikipedia (arwiki).
    * Dataset at: [Ibrahemqasim/categories_en2ar](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar)
    
2.  **[filtered_data.json](filtered_data.json) (231,314 mappings)**
    * This file contains only the mappings that include a 4-digit year; all mappings without one are filtered out.
    * Dataset at: [Ibrahemqasim/categories_en2ar_with_years](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar_with_years)

3.  **[cats_2000.json](cats_2000.json) (21,170 mappings)**
    * This file contains the mappings from `filtered_data.json` with one change:
        1.  Every 4-digit year is replaced with the year `2000`.
    * Dataset at: [Ibrahemqasim/categories_en2ar-cats_2000](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar-cats_2000)

4.  **[cats_2000_country.json](cats_2000_country.json) (1,234 mappings)**
    * This file contains the mappings from `filtered_data.json` with these changes:
        1.  Every 4-digit year is replaced with the year `2000`.
        2.  Every country name is replaced with the word `country`.
    * Dataset at: [Ibrahemqasim/categories_en2ar-cats_2000_contry](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar-cats_2000_contry)
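
The derivation of the three smaller files can be sketched in Python as follows. This is a minimal illustration, not the script used to build the dataset, and it rests on several assumptions: each file is treated as a flat mapping from an English category title to an Arabic one, a "4-digit year" is taken to be any standalone run of exactly four digits in the English title, and the country list (`SAMPLE_COUNTRIES`) and the helper names are hypothetical. How the Arabic side is handled when country names are replaced is not stated above; the sketch simply substitutes the same placeholder on both sides.

```python
import re

# Assumption: a year is a run of exactly four digits not adjacent to other digits.
YEAR_RE = re.compile(r"(?<!\d)\d{4}(?!\d)")


def has_year(title: str) -> bool:
    """Return True if the title contains a standalone 4-digit number."""
    return YEAR_RE.search(title) is not None


def normalize_years(title: str) -> str:
    """Replace every standalone 4-digit number with the placeholder year 2000."""
    return YEAR_RE.sub("2000", title)


def normalize_countries(title: str, names) -> str:
    """Replace each known country name with the word 'country'.

    Longer names are replaced first so that a short name cannot
    clobber part of a longer one.
    """
    for name in sorted(names, key=len, reverse=True):
        title = title.replace(name, "country")
    return title


def build_derived(mappings: dict, countries: dict):
    """Return (filtered, cats_2000, cats_2000_country) from an en -> ar mapping.

    `countries` is a hypothetical {english_name: arabic_name} lookup.
    """
    # Step 1: keep only mappings whose English title contains a year.
    filtered = {en: ar for en, ar in mappings.items() if has_year(en)}

    # Step 2: normalize years on both sides. Titles that differed only by
    # year collapse onto the same key, so the result is smaller.
    cats_2000 = {normalize_years(en): normalize_years(ar) for en, ar in filtered.items()}

    # Step 3: additionally replace country names on both sides.
    cats_2000_country = {
        normalize_countries(en, countries.keys()): normalize_countries(ar, countries.values())
        for en, ar in cats_2000.items()
    }
    return filtered, cats_2000, cats_2000_country


# Made-up sample entries, for illustration only.
SAMPLE = {
    "Category:1999 films": "تصنيف:أفلام 1999",
    "Category:2005 films": "تصنيف:أفلام 2005",
    "Category:1990 in France": "تصنيف:1990 في فرنسا",
    "Category:Physics": "تصنيف:فيزياء",
}
SAMPLE_COUNTRIES = {"France": "فرنسا"}

if __name__ == "__main__":
    filtered, cats_2000, cats_2000_country = build_derived(SAMPLE, SAMPLE_COUNTRIES)
    print(len(filtered), len(cats_2000), len(cats_2000_country))
```

In this sketch the two `films` entries merge into a single `Category:2000 films` key once the years are normalized. Collapsing of this kind may be why each derived file is much smaller than the one before it, although the filtering and deduplication rules actually used are not documented here.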