metadata
license: unknown
language:
  - ar
tags:
  - Wikipedia
  - Wikipedia_Categories

enwiki_to_arwiki_categories Dataset

This dataset contains mappings between English Wikipedia categories and their corresponding Arabic Wikipedia categories.

Files

  1. langlinks.json (818,354 mappings)

  2. filtered_data.json (231,314 mappings)

  3. cats_2000.json (21,170 mappings)

  4. cats_2000_country.json (1,234 mappings)

    • This file contains the mappings from filtered_data.json with these changes:
      1. All 4-digit years are replaced with the year 2000.
      2. Country names are replaced with the word "country".
    • It contains 1,234 mappings.
    • Dataset at: Ibrahemqasim/categories_en2ar-cats_2000_contry
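
The two replacements described for cats_2000_country.json can be sketched roughly as follows. This is only an illustration, not the script used to build the dataset: the `COUNTRIES` list and the `normalize_category` helper are hypothetical, only English titles are handled, and the real country list (and how Arabic country names are matched) is not documented here.

```python
import re

# Hypothetical, tiny country list for illustration only;
# the list actually used to build the dataset is not documented here.
COUNTRIES = ["United States", "Yemen", "France"]

# A run of exactly four digits that is not part of a longer number.
YEAR_RE = re.compile(r"(?<!\d)\d{4}(?!\d)")

# Longest names first so multi-word names win over shorter overlaps;
# \b keeps "Yemen" from matching inside "Yemeni".
COUNTRY_RE = re.compile(
    r"\b(?:"
    + "|".join(re.escape(c) for c in sorted(COUNTRIES, key=len, reverse=True))
    + r")\b"
)


def normalize_category(title: str) -> str:
    """Replace 4-digit years with 2000 and known country names with 'country'."""
    title = YEAR_RE.sub("2000", title)
    return COUNTRY_RE.sub("country", title)


print(normalize_category("Category:1995 establishments in Yemen"))
```

Under these assumptions, many distinct categories collapse to the same pattern, which would be consistent with the file holding far fewer mappings than filtered_data.json.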