S-Dreamer/LinguaCanvas

S-Dreamer committed on Mar 16

Commit b98a046 (verified) · 1 parent: 1afd40d

Upload 12 files

Files changed (13)

.gitattributes +1 -0
.gitignore +66 -0
.replit +38 -0
CONTRIBUTING.md +75 -0
README.md +94 -12
css.py +80 -0
cultural_utils.py +43 -0
generated-icon.png +3 -0
pyproject.toml +24 -0
replit.nix +10 -0
styles.css +81 -0
utils.py +94 -0
uv.lock +0 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+generated-icon.png filter=lfs diff=lfs merge=lfs -text

.gitignore ADDED

	@@ -0,0 +1,66 @@

+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+# Virtual Environment
+venv/
+ENV/
+env/
+.env
+# IDE
+.idea/
+.vscode/
+*.swp
+*.swo
+.project
+.pydevproject
+# Logs
+*.log
+logs/
+log/
+# Testing
+.coverage
+htmlcov/
+.pytest_cache/
+.tox/
+# Distribution
+*.tar.gz
+*.zip
+# Replit specific
+.replit
+replit.nix
+.breakpoints
+.upm/
+# Model files
+*.pt
+*.pth
+*.bin
+*.onnx
+# Other
+.DS_Store
+Thumbs.db

.replit ADDED

	@@ -0,0 +1,38 @@

+modules = ["python-3.11", "python3"]
+[nix]
+channel = "stable-24_05"
+[workflows]
+runButton = "Project"
+[[workflows.workflow]]
+name = "Project"
+mode = "parallel"
+author = "agent"
+[[workflows.workflow.tasks]]
+task = "workflow.run"
+args = "Translation App"
+[[workflows.workflow]]
+name = "Translation App"
+author = "agent"
+[workflows.workflow.metadata]
+agentRequireRestartOnSave = false
+[[workflows.workflow.tasks]]
+task = "packager.installForAll"
+[[workflows.workflow.tasks]]
+task = "shell.exec"
+args = "python app.py"
+waitForPort = 8000
+[deployment]
+run = ["sh", "-c", "python app.py"]
+[[ports]]
+localPort = 8000
+externalPort = 80

CONTRIBUTING.md ADDED

	@@ -0,0 +1,75 @@

+# Contributing to English-Farsi Translation Interface
+Thank you for your interest in contributing to our project! This document provides guidelines and best practices for contributions.
+## Code of Conduct
+By participating in this project, you agree to maintain a respectful and inclusive environment for all contributors.
+## Getting Started
+1. Fork the repository
+2. Create a new branch for your feature/fix
+3. Write clean, documented code
+4. Submit a pull request
+## Development Guidelines
+### Code Style
+- Follow PEP 8 style guide for Python code
+- Use meaningful variable and function names
+- Add docstrings to functions and classes
+- Keep functions focused and single-purpose
+- Include type hints where applicable
+### Testing
+- Write unit tests for new features
+- Ensure all tests pass before submitting PR
+- Add integration tests for complex features
+### Documentation
+- Update README.md if adding new features
+- Document API changes
+- Include docstrings for new functions/classes
+- Add comments for complex logic
+### Commit Messages
+- Use clear, descriptive commit messages
+- Start with a verb (Add, Fix, Update, etc.)
+- Keep messages concise but informative
+Example:
+```
+Add text preprocessing for special characters
+```
+### Pull Request Process
+1. Update documentation
+2. Add/update tests
+3. Ensure CI passes
+4. Request review from maintainers
+5. Address review feedback
+## Feature Requests
+- Use issue tracker for feature requests
+- Clearly describe the feature and its benefits
+- Include use cases where applicable
+## Bug Reports
+Include:
+- Clear description of the issue
+- Steps to reproduce
+- Expected vs actual behavior
+- System information
+- Screenshots if applicable
+## Questions?
+Feel free to open an issue for any questions about contributing!

README.md CHANGED

@@ -1,12 +1,94 @@
----
-title: LinguaCanvas
-emoji: 🏃
-colorFrom: indigo
-colorTo: gray
-sdk: gradio
-sdk_version: 5.21.0
-app_file: app.py
-pinned: false
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# English-Farsi Translation Interface 🌐
+A sophisticated translation interface that provides culturally sensitive translations between English and Farsi, powered by machine learning and enhanced with cultural context annotations.
+## ✨ Features
+- **Bidirectional Translation**: Seamless translation between English and Farsi
+- **Cultural Context**: Provides explanations for idioms and cultural expressions
+- **User-Friendly Interface**: Clean, intuitive Gradio-based web interface
+- **Real-time Translation**: Instant translation with cultural annotations
+- **RTL Support**: Full support for right-to-left text in Farsi
+## 🚀 Quick Start
+1. Clone the repository:
+```bash
+git clone https://github.com/yourusername/english-farsi-translator.git
+cd english-farsi-translator
+```
+2. Install dependencies:
+```bash
+pip install -r requirements.txt
+```
+3. Run the application:
+```bash
+python app.py
+```
+The application listens on port 8000; when running locally, open `http://localhost:8000` in a browser.
+## 🛠️ Project Structure
+```
+├── app.py              # Main application file
+├── utils.py            # Utility functions
+├── cultural_utils.py   # Cultural context handling
+├── css.py             # CSS styles for Gradio interface
+├── styles.css         # Additional CSS styles
+├── docs/              # Documentation
+├── tests/             # Test files
+└── requirements.txt   # Project dependencies
+```
+## 💡 Usage
+1. Select source and target languages from the dropdown menus
+2. Enter text in the input box
+3. Click "Translate" to get the translation
+4. View cultural context annotations below the translation
+## 🔍 Features in Detail
+- **Text Preprocessing**: Handles special characters and formatting
+- **Cultural Context Detection**: Identifies and explains cultural idioms
+- **Language Detection**: Automatic detection of input language
+- **Error Handling**: Robust error management with helpful messages
+- **Responsive Design**: Works on both desktop and mobile devices
+## 🤝 Contributing
+We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details on:
+- Code style
+- Development process
+- Pull request process
+- Testing requirements
+## 📝 Documentation
+For detailed information about the API and installation process, check:
+- [API Documentation](docs/API.md)
+- [Installation Guide](docs/INSTALLATION.md)
+## ⚙️ Technical Requirements
+- Python 3.11+
+- Required packages:
+  - transformers
+  - gradio
+  - torch
+  - sentencepiece
+  - protobuf
+## 🔒 License
+This project is licensed under the MIT License.
+## 🙏 Acknowledgments
+- Persian NLP community for the translation model
+- Contributors and maintainers
+- Gradio team for the interface framework

css.py ADDED

	@@ -0,0 +1,80 @@

+# Custom CSS for the Gradio interface
+custom_css = """
+/* Font imports (must precede all other rules, or browsers ignore them) */
+@import url('https://fonts.googleapis.com/css2?family=Noto+Sans:wght@400;700&display=swap');
+@import url('https://cdn.jsdelivr.net/gh/rastikerdar/[email protected]/Vazirmatn-font-face.css');
+.gradio-container {
+    font-family: 'Noto Sans', 'Vazirmatn', sans-serif;
+    background-color: #F7F7F7;
+    color: #333333;
+}
+.primary-btn {
+    background-color: #2D8EFF !important;
+    color: white !important;
+    border: none !important;
+    padding: 10px 20px !important;
+    border-radius: 5px !important;
+}
+.secondary-btn {
+    background-color: #34B233 !important;
+    color: white !important;
+}
+.error-text {
+    color: #FF6B6B !important;
+}
+/* Input/Output containers */
+.input-container, .output-container {
+    padding: 20px;
+    background: white;
+    border-radius: 8px;
+    box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+}
+/* RTL Support */
+[dir="rtl"] {
+    text-align: right;
+}
+/* Responsive Design */
+@media (max-width: 768px) {
+    .gradio-container {
+        padding: 10px;
+    }
+    .input-container, .output-container {
+        padding: 15px;
+    }
+}
+/* Loading State */
+.loading {
+    border: 2px solid #2D8EFF;
+    border-radius: 50%;
+    border-top: 2px solid transparent;
+    animation: spin 1s linear infinite;
+}
+@keyframes spin {
+    0% { transform: rotate(0deg); }
+    100% { transform: rotate(360deg); }
+}
+/* Custom Scrollbar */
+::-webkit-scrollbar {
+    width: 8px;
+}
+::-webkit-scrollbar-track {
+    background: #f1f1f1;
+}
+::-webkit-scrollbar-thumb {
+    background: #2D8EFF;
+    border-radius: 4px;
+}
+"""
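
Review note on `css.py`: CSS only honors `@import` rules that appear before any style rule, so their position inside a stylesheet string matters. The sketch below is a hypothetical guard, not part of this commit; `imports_come_first` is an illustrative name, and the check is a rough text heuristic rather than a real CSS parser.

```python
import re


def imports_come_first(css: str) -> bool:
    """Return True if every @import appears before the first rule block.

    Heuristic only: strips /* ... */ comments, then compares the position
    of the last "@import" with the position of the first "{".
    """
    without_comments = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)
    first_block = without_comments.find("{")
    last_import = without_comments.rfind("@import")
    return last_import == -1 or first_block == -1 or last_import < first_block


# Two tiny illustrative stylesheets (the URL is a placeholder).
good_css = "@import url('fonts.css');\n.box { color: red; }"
bad_css = ".box { color: red; }\n@import url('fonts.css');"

print(imports_come_first(good_css))
print(imports_come_first(bad_css))
```

A check like this could run in a unit test against any CSS string that is passed to the interface.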

cultural_utils.py ADDED

	@@ -0,0 +1,43 @@

+"""Utility module for managing cultural context annotations."""
+from typing import Dict, Tuple, List
+# Initial database of idioms and their cultural context
+# Format: {idiom: (literal_translation, cultural_explanation)}
+ENGLISH_IDIOMS: Dict[str, Tuple[str, str]] = {
+    "break the ice": ("شکستن یخ", "To initiate social interaction and reduce tension. In Persian culture, this concept is similar to 'گرم گرفتن' (warm taking) which emphasizes creating a warm, friendly atmosphere."),
+    "costs an arm and a leg": ("به قیمت یک دست و پا", "Very expensive. In Persian, a similar expression is 'سر به فلک کشیدن' (reaching the sky) to describe extremely high prices."),
+    "piece of cake": ("تکه کیک", "Something very easy to do. In Persian culture, the equivalent idiom is 'آب خوردن' (like drinking water) to describe a task that's very simple.")
+}
+PERSIAN_IDIOMS: Dict[str, Tuple[str, str]] = {
+    "آب خوردن": ("drinking water", "Used to describe something very easy, similar to the English 'piece of cake'."),
+    "دست و پنجه نرم کردن": ("softening hand and fingers", "To struggle or deal with something difficult, similar to 'wrestling with' in English."),
+    "دیوار موش داره موش هم گوش داره": ("the wall has mice and mice have ears", "Be careful what you say as others might be listening, similar to 'walls have ears' in English.")
+}
+def detect_idioms(text: str, source_lang: str) -> List[Tuple[str, str, str]]:
+    """
+    Detect idioms in the input text and return their cultural context.
+    Returns:
+        List of tuples (idiom, literal_translation, cultural_explanation)
+    """
+    idioms_db = ENGLISH_IDIOMS if source_lang == "en" else PERSIAN_IDIOMS
+    found_idioms = []
+    for idiom in idioms_db:
+        if idiom.lower() in text.lower():
+            found_idioms.append((idiom, *idioms_db[idiom]))
+    return found_idioms
+def get_cultural_context(text: str, source_lang: str) -> Dict[str, List[Tuple[str, str, str]]]:
+    """
+    Get cultural context annotations for a given text.
+    Returns:
+        Dictionary with 'idioms' key containing list of detected idioms and their context
+    """
+    return {
+        'idioms': detect_idioms(text, source_lang)
+    }
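
Review note on `cultural_utils.py`: a minimal sketch of how the detection result might be rendered for display. The two-entry `SAMPLE_IDIOMS` table and the `format_annotations` helper are hypothetical and exist only for illustration; `detect_idioms` here takes the table as an argument but otherwise mirrors the case-insensitive substring matching in the diff.

```python
from typing import Dict, List, Tuple

# Trimmed, illustrative copy of the idiom table (not the full database).
SAMPLE_IDIOMS: Dict[str, Tuple[str, str]] = {
    "break the ice": ("شکستن یخ", "To initiate social interaction and reduce tension."),
    "piece of cake": ("تکه کیک", "Something very easy to do."),
}


def detect_idioms(
    text: str, idioms_db: Dict[str, Tuple[str, str]]
) -> List[Tuple[str, str, str]]:
    """Return (idiom, literal_translation, explanation) for each idiom found."""
    lowered = text.lower()
    return [
        (idiom, *entry)
        for idiom, entry in idioms_db.items()
        if idiom.lower() in lowered
    ]


def format_annotations(found: List[Tuple[str, str, str]]) -> str:
    """Hypothetical helper: render each detected idiom on its own line."""
    return "\n".join(
        f"{idiom} -> {literal}: {note}" for idiom, literal, note in found
    )


found = detect_idioms("That exam was a Piece of Cake.", SAMPLE_IDIOMS)
print(format_annotations(found))
```

Because matching is by substring, a phrase such as "masterpiece of cakes" would also match "piece of cake"; word-boundary matching would be stricter if that matters.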

generated-icon.png ADDED

Git LFS Details

SHA256: 766d677fdba3af260a7038945f8b7a0680c3147e7225698cb8c349d0d6c4391b
Pointer size: 132 Bytes
Size of remote file: 1.2 MB

pyproject.toml ADDED

	@@ -0,0 +1,24 @@

+[project]
+name = "repl-nix-workspace"
+version = "0.1.0"
+description = "Add your description here"
+requires-python = ">=3.11"
+dependencies = [
+    "blobfile>=3.0.0",
+    "css>=0.1",
+    "gradio>=5.15.0",
+    "protobuf>=5.29.3",
+    "sentencepiece>=0.2.0",
+    "tiktoken>=0.8.0",
+    "torch>=2.6.0",
+    "transformers>=4.48.3",
+]
+[[tool.uv.index]]
+explicit = true
+name = "pytorch-cpu"
+url = "https://download.pytorch.org/whl/cpu"
+[tool.uv.sources]
+torch = [{ index = "pytorch-cpu", marker = "platform_system == 'Linux'" }]
+torchvision = [{ index = "pytorch-cpu", marker = "platform_system == 'Linux'" }]

replit.nix ADDED

	@@ -0,0 +1,10 @@

+{pkgs}: {
+  deps = [
+    pkgs.pkg-config
+    pkgs.rustc
+    pkgs.libiconv
+    pkgs.cargo
+    pkgs.protobuf
+    pkgs.ffmpeg-full
+  ];
+}

styles.css ADDED

	@@ -0,0 +1,81 @@

+/* Font imports */
+@import url('https://fonts.googleapis.com/css2?family=Noto+Sans:wght@400;700&display=swap');
+@import url('https://cdn.jsdelivr.net/gh/rastikerdar/[email protected]/Vazirmatn-font-face.css');
+/* Base styles */
+.gradio-container {
+    font-family: 'Noto Sans', 'Vazirmatn', sans-serif;
+    background-color: #F7F7F7;
+    color: #333333;
+}
+/* Button styles */
+.primary-btn {
+    background-color: #2D8EFF !important;
+    color: white !important;
+    border: none !important;
+    padding: 10px 20px !important;
+    border-radius: 5px !important;
+}
+.secondary-btn {
+    background-color: #34B233 !important;
+    color: white !important;
+}
+/* Container styles */
+.input-container, .output-container {
+    padding: 20px;
+    background: white;
+    border-radius: 8px;
+    box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+}
+/* RTL Support */
+[dir="rtl"] {
+    text-align: right;
+}
+/* Status indicators */
+.error-text {
+    color: #FF6B6B !important;
+}
+.loading {
+    border: 2px solid #2D8EFF;
+    border-radius: 50%;
+    border-top: 2px solid transparent;
+    animation: spin 1s linear infinite;
+}
+/* Animations */
+@keyframes spin {
+    0% { transform: rotate(0deg); }
+    100% { transform: rotate(360deg); }
+}
+/* Responsive Design */
+@media (max-width: 768px) {
+    .gradio-container {
+        padding: 10px;
+    }
+    .input-container, .output-container {
+        padding: 15px;
+    }
+}
+/* Custom Scrollbar */
+::-webkit-scrollbar {
+    width: 8px;
+}
+::-webkit-scrollbar-track {
+    background: #f1f1f1;
+}
+::-webkit-scrollbar-thumb {
+    background: #2D8EFF;
+    border-radius: 4px;
+}

utils.py ADDED

	@@ -0,0 +1,94 @@

+import re
+class TextProcessor:
+    """Handles text processing operations for translation."""
+    MAX_LENGTH = 512
+    PERSIAN_NUMBERS = {
+        '0': '۰', '1': '۱', '2': '۲', '3': '۳', '4': '۴',
+        '5': '۵', '6': '۶', '7': '۷', '8': '۸', '9': '۹'
+    }
+    @staticmethod
+    def preprocess_text(text: str) -> str:
+        """
+        Clean and prepare text for translation.
+        Args:
+            text: Input text to process
+        Returns:
+            Processed text ready for translation
+        """
+        if not text:
+            return ""
+        # Normalize whitespace and remove special characters
+        text = ' '.join(text.split())
+        text = re.sub(r'[^\w\s.,!?-]', '', text)
+        return text[:TextProcessor.MAX_LENGTH]
+    @staticmethod
+    def postprocess_translation(text: str) -> str:
+        """
+        Clean up translated text and normalize numbers.
+        Args:
+            text: Translated text to process
+        Returns:
+            Cleaned and normalized text
+        """
+        if not text:
+            return ""
+        # Clean up model artifacts
+        text = text.replace("<pad>", "").replace("</s>", "").replace("<s>", "")
+        text = re.sub(r'\s+([.,!?])', r'\1', text)
+        text = ' '.join(text.split())
+        # Convert to Persian numbers
+        for en, fa in TextProcessor.PERSIAN_NUMBERS.items():
+            text = text.replace(en, fa)
+        return text.strip()
+    @staticmethod
+    def detect_language(text: str) -> str:
+        """
+        Detect if text is primarily English or Farsi.
+        Args:
+            text: Input text to analyze
+        Returns:
+            'Farsi' or 'English' based on character frequency
+        """
+        farsi_chars = len(re.findall(r'[\u0600-\u06FF]', text))
+        english_chars = len(re.findall(r'[a-zA-Z]', text))
+        return "Farsi" if farsi_chars > english_chars else "English"
+    @staticmethod
+    def validate_input(text: str) -> tuple[bool, str]:
+        """
+        Validate input text length and content.
+        Args:
+            text: Input text to validate
+        Returns:
+            Tuple of (is_valid, error_message)
+        """
+        if not text or len(text.strip()) < 1:
+            return False, "Please enter text to translate"
+        if len(text) > TextProcessor.MAX_LENGTH:
+            return False, f"Input text is too long (maximum {TextProcessor.MAX_LENGTH} characters)"
+        return True, ""
+# Expose static methods for backward compatibility
+preprocess_text = TextProcessor.preprocess_text
+postprocess_translation = TextProcessor.postprocess_translation
+detect_language = TextProcessor.detect_language
+validate_input = TextProcessor.validate_input
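
Review note on `utils.py`: the committed `postprocess_translation` converts ASCII digits to Persian digits unconditionally, which would also affect Farsi-to-English output. The standalone sketch below shows one possible alternative, assuming digits should be converted only when the target language is Farsi. `clean_model_output`, `to_persian_digits`, and the `target_lang` parameter are hypothetical names, not part of this commit; `detect_language` repeats the heuristic from the diff.

```python
import re

# ASCII digits mapped to Persian digits U+06F0..U+06F9,
# the same pairs as PERSIAN_NUMBERS in the diff.
PERSIAN_DIGITS = str.maketrans(
    "0123456789", "".join(chr(0x06F0 + i) for i in range(10))
)


def detect_language(text: str) -> str:
    """Same heuristic as the diff: compare Arabic-block and Latin letter counts."""
    farsi_chars = len(re.findall(r"[\u0600-\u06FF]", text))
    english_chars = len(re.findall(r"[a-zA-Z]", text))
    return "Farsi" if farsi_chars > english_chars else "English"


def clean_model_output(text: str) -> str:
    """Strip special tokens and tidy spacing before punctuation."""
    for token in ("<pad>", "</s>", "<s>"):
        text = text.replace(token, "")
    text = re.sub(r"\s+([.,!?])", r"\1", text)
    return " ".join(text.split())


def to_persian_digits(text: str, target_lang: str) -> str:
    """Hypothetical variant: convert digits only when the output is Farsi."""
    return text.translate(PERSIAN_DIGITS) if target_lang == "Farsi" else text


raw = "<s> I have 3 cats . </s>"
cleaned = clean_model_output(raw)
print(cleaned)
print(to_persian_digits(cleaned, "English"))
print(to_persian_digits(cleaned, "Farsi"))
print(detect_language("سلام دنیا"))
```

Keeping the digit conversion separate from token cleanup lets a caller decide per direction, at the cost of one extra argument.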

uv.lock ADDED

The diff for this file is too large to render. See raw diff