Ibrahemqasim committed on
Commit eb90bad · verified · 1 Parent(s): 7828cff

Update README.md

Files changed (1)
  1. README.md +17 -10
README.md CHANGED
@@ -1,6 +1,13 @@
- ---
- license: unknown
- ---
+ ---
+ license: unknown
+ language:
+ - ar
+ tags:
+ - wikipedia
+ datasets:
+ - Ibrahemqasim/categories_en2ar
+ - Ibrahemqasim/categories_en2ar_with_years
+ ---

  # enwiki_to_arwiki_categories Dataset

@@ -8,19 +15,19 @@ This dataset contains mappings between English Wikipedia categories and their co

  ## Files

- 1. **langlinks.json (818,354)**
+ 1. **[langlinks.json](langlinks.json) (818,354)**
  * This file contains all category links from enwiki to arwiki.
- * Dataset at: [Ibrahemqasim/categories_en2ar_with_years](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar_with_years)
-
- 2. **filtered_data.json (231,314)**
- * This file contains the mappings after filtering out those that do not contain a 4-digit year.
  * Dataset at: [Ibrahemqasim/categories_en2ar](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar)
+
+ 2. **[filtered_data.json](filtered_data.json) (231,314)**
+ * This file contains the mappings after filtering out those that do not contain a 4-digit year.
+ * Dataset at: [Ibrahemqasim/categories_en2ar_with_years](https://huggingface.co/datasets/Ibrahemqasim/categories_en2ar_with_years)

- 3. **cats_2000.json (231,314)**
+ 3. **[cats_2000.json](cats_2000.json) (231,314)**
  * This file contains the mappings from `filtered_data.json` with these changes:
  1. Replacing all 4-digit years with the year `2000`.

- 4. **cats_2000_country.json (1,234)**
+ 4. **[cats_2000_country.json](cats_2000_country.json) (1,234)**
  * This file contains the mappings from `filtered_data.json` with these changes:
  1. Replacing all 4-digit years with the year `2000`.
  2. Replacing country names with the word `country`. It contains 1,234 mappings.
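
The three derivation steps the README describes (keep only mappings with a 4-digit year, replace every year with `2000`, replace country names with `country`) could be sketched roughly as below. This is an illustration under stated assumptions, not the author's actual script: the README does not show the JSON layout, so the sample pairs, the `COUNTRIES` table, the function names, the choice to test the English title for a year, and the choice to apply replacements to both the English and Arabic titles are all hypothetical.

```python
import re

# Assumption: a "4-digit year" is a run of exactly four ASCII digits.
YEAR_RE = re.compile(r"(?<![0-9])[0-9]{4}(?![0-9])")

# Hypothetical sample standing in for langlinks.json:
# (English category title, Arabic category title) pairs.
langlinks = [
    ("Category:1990 in France", "تصنيف:فرنسا في 1990"),
    ("Category:Rivers of Egypt", "تصنيف:أنهار مصر"),
    ("Category:2015 in Egypt", "تصنيف:مصر في 2015"),
]

# Hypothetical country table (English name -> Arabic name).
COUNTRIES = {"France": "فرنسا", "Egypt": "مصر"}


def filter_with_years(pairs):
    """Keep pairs whose English title contains a 4-digit year (filtered_data.json step)."""
    return [(en, ar) for en, ar in pairs if YEAR_RE.search(en)]


def normalize_years(pairs):
    """Replace every 4-digit year on both sides with 2000 (cats_2000.json step)."""
    return [(YEAR_RE.sub("2000", en), YEAR_RE.sub("2000", ar)) for en, ar in pairs]


def replace_countries(pairs):
    """Replace known country names with the word 'country' (cats_2000_country.json step)."""
    result = []
    for en, ar in pairs:
        for en_name, ar_name in COUNTRIES.items():
            if en_name in en:
                en = en.replace(en_name, "country")
                ar = ar.replace(ar_name, "country")
        result.append((en, ar))
    return result


filtered = filter_with_years(langlinks)
cats_2000 = normalize_years(filtered)
cats_2000_country = replace_countries(cats_2000)
```

The sketch does not deduplicate the resulting pairs, because the README does not say how the listed counts were obtained.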