# Preeti to Unicode Translator

This Streamlit app converts text typed in the Preeti font into readable Unicode (Devanagari) text. It accepts both plain text input and uploaded PDF files.

## Features

- Translate Preeti font text to Unicode
- Support for both plain text input and PDF file uploads
- Easy-to-use web interface powered by Streamlit

## Installation

1. Clone this repository:
   ```
   git clone https://huggingface.co/rockerritesh/preeti-unicode
   cd preeti-unicode
   ```

2. Install the required dependencies:
   ```
   pip install -r requirements.txt
   ```

## Usage

1. Run the Streamlit app:
   ```
   streamlit run app.py
   ```

2. Open your web browser and navigate to the URL displayed in the terminal (usually `http://localhost:8501`).

3. Use the app:
   - For text input: Enter or paste your Preeti font text in the text area and click "Translate".
   - For PDF input: Upload your PDF file containing Preeti font text and click "Translate PDF".

4. View the translated Unicode text in the output area.
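
The PDF path in the steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the helper names `pages_to_text` and `read_pdf_text` are hypothetical, and it assumes a recent PyPDF2 release that provides `PdfReader`.

```python
from typing import Iterable, Protocol


class PageLike(Protocol):
    """The one method of a PyPDF2 page object that this sketch relies on."""

    def extract_text(self) -> str: ...


def pages_to_text(pages: Iterable[PageLike]) -> str:
    """Join the extracted text of every page, separated by newlines."""
    # Image-only pages may yield nothing, so fall back to an empty string.
    return "\n".join((page.extract_text() or "") for page in pages)


def read_pdf_text(path: str) -> str:
    """Hypothetical helper: read all text from a PDF file on disk."""
    from PyPDF2 import PdfReader  # imported lazily; needs PyPDF2 installed

    reader = PdfReader(path)
    return pages_to_text(reader.pages)
```

The text returned here is still Preeti-encoded ASCII; it only becomes readable after the character conversion step. In a Streamlit app, the uploaded file object could likely be passed to `PdfReader` in place of a path, since `PdfReader` also accepts file-like streams.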

## Dependencies

- streamlit
- PyPDF2

## How it works

This app uses a mapping of Preeti font characters to their Unicode equivalents. When text is input or a PDF is uploaded, the app processes the content, replacing Preeti characters with their Unicode counterparts.
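
The replacement step can be sketched as a simple per-character lookup. The table below is a small illustrative subset chosen for this example, and `preeti_to_unicode` is a hypothetical name; the app's actual mapping is larger and may be organized differently.

```python
# Illustrative subset only (an assumption for this sketch);
# the app's real table covers many more characters.
PREETI_TO_UNICODE = {
    "g": "न",
    "]": "े",
    "k": "प",
    "f": "ा",
    "n": "ल",
}


def preeti_to_unicode(text: str) -> str:
    """Replace each known Preeti character; leave unknown ones unchanged."""
    return "".join(PREETI_TO_UNICODE.get(ch, ch) for ch in text)


print(preeti_to_unicode("g]kfn"))  # नेपाल
```

A complete converter typically needs more than a one-to-one lookup: some Preeti sequences span several characters, and the short-i vowel sign is typed before its consonant in Preeti but stored after it in Unicode. This sketch handles neither case.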

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License. See the [MIT License](https://mit-license.org/) text for details.

## Understanding the challenge

- Character overlap: Preeti is a legacy font that reuses ASCII characters (A-Z, a-z, 0-9, and symbols) to display Nepali letters, so English text and Preeti-encoded Nepali text are stored as the same characters.
- Difficulty in differentiation: Because both share one character set, it is hard to tell automatically whether a given word is English or Preeti-encoded Nepali.
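
The overlap is easy to reproduce. In the sketch below, the three-entry table is an illustrative subset assumed for this example and `convert` is a hypothetical helper; it shows that a context-free lookup has no way to know that an input was meant as English.

```python
# Illustrative subset of a Preeti table (an assumption for this sketch).
PREETI_TO_UNICODE = {"b": "द", "a": "ब", "d": "म"}


def convert(text: str) -> str:
    """Blindly map every known character, with no language detection."""
    return "".join(PREETI_TO_UNICODE.get(ch, ch) for ch in text)


# "bad" is an ordinary English word, yet each letter is also a Preeti key,
# so the converter turns it into Devanagari letters.
print(convert("bad"))  # दबम
```

Any mixed English and Nepali input therefore needs some extra signal, such as the user marking which parts to convert, before the mapping is applied.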

## Acknowledgements

- Streamlit for the web app framework
- PyPDF2 for PDF processing

## Contact

If you have any questions or feedback, please open an issue on this repository.