<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Audio Input with Tabs and Features</title>
<style>
.loading {
display: none !important;
}
</style>
</head>
<body style="font-family: 'Source Sans Pro', sans-serif; background-color: #f9fafb; color: #333; display: flex; flex-direction: column; align-items: center; height: 100vh; margin: 0;">
<div style="width: 100%; max-width: 900px; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); flex: 1;">
<!-- Tabs -->
<div style="display: flex; border-bottom: 2px solid #e9ecef; margin-bottom: 1rem;">
<button id="singleAudioTab" class="tab active" style="flex: 1; text-align: center; padding: 1rem; background: transparent; border: none; border-bottom: 3px solid #007bff; font-weight: bold; cursor: pointer; color: #007bff;">Single Audio Stream</button>
<button id="multistreamTab" class="tab" style="flex: 1; text-align: center; padding: 1rem; background: transparent; border: none; border-bottom: 3px solid transparent; font-weight: bold; cursor: pointer; color: #6c757d;">Multistream Demo</button>
</div>
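<!--
The two tab buttons above toggle between the singleAudioContent and multistreamContent panels. How app-asr.js actually wires this is not shown in this file, so the following is only a minimal sketch under stated assumptions: TAB_TO_PANEL and panelDisplayFor are hypothetical names, and the panels are assumed to be switched through their inline display style. Note that the .loading class forces display: none with !important, so neither panel becomes visible until that class is removed elsewhere.

```javascript
// Hypothetical mapping (not taken from app-asr.js): tab button id to panel id.
const TAB_TO_PANEL = {
  singleAudioTab: 'singleAudioContent',
  multistreamTab: 'multistreamContent',
};

// Returns the CSS display value each panel should get when a tab is active.
function panelDisplayFor(activeTabId) {
  if (!Object.prototype.hasOwnProperty.call(TAB_TO_PANEL, activeTabId)) {
    throw new Error('Unknown tab id: ' + activeTabId);
  }
  const result = {};
  for (const [tabId, panelId] of Object.entries(TAB_TO_PANEL)) {
    result[panelId] = tabId === activeTabId ? 'block' : 'none';
  }
  return result;
}

// Browser-only wiring; skipped when no DOM is available (for example under Node).
if (typeof document !== 'undefined') {
  for (const tabId of Object.keys(TAB_TO_PANEL)) {
    document.getElementById(tabId).addEventListener('click', () => {
      const display = panelDisplayFor(tabId);
      for (const [panelId, value] of Object.entries(display)) {
        document.getElementById(panelId).style.display = value;
      }
    });
  }
}
```

Keeping the mapping in a pure function makes the toggle logic checkable without a browser; the click handlers only apply its result.
-->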
<!-- Language Selection -->
<div style="margin-bottom: 1rem; text-align: center; display: flex; justify-content: center; gap: 1rem; flex-wrap: wrap;">
<label style="display: inline-flex; align-items: center; gap: 0.5rem; padding: 0.5rem 1rem; border: 1px solid #ced4da; border-radius: 4px; cursor: pointer;">
<input type="radio" name="language" value="en" checked style="margin: 0;" />
<img src="https://flagcdn.com/us.svg" alt="US Flag" style="width: 20px; height: 14px;" /> English
</label>
<label style="display: inline-flex; align-items: center; gap: 0.5rem; padding: 0.5rem 1rem; border: 1px solid #ced4da; border-radius: 4px; cursor: pointer;">
<input type="radio" name="language" value="de" style="margin: 0;" />
<img src="https://flagcdn.com/de.svg" alt="Germany Flag" style="width: 20px; height: 14px;" /> German
</label>
<label style="display: inline-flex; align-items: center; gap: 0.5rem; padding: 0.5rem 1rem; border: 1px solid #ced4da; border-radius: 4px; cursor: pointer;">
<input type="radio" name="language" value="fr" style="margin: 0;" />
<img src="https://flagcdn.com/fr.svg" alt="France Flag" style="width: 20px; height: 14px;" /> French
</label>
</div>
<!-- Single Audio Stream Content -->
<div id="singleAudioContent" class="tab-content loading">
<div style="display: flex; gap: 1.5rem;">
<!-- Input Section -->
<div style="flex: 1; display: flex; flex-direction: column; gap: 1rem;">
<div style="font-size: 1rem; font-weight: bold; padding: 0.5rem 1rem; background-color: #f8f9fa; border-radius: 8px; display: flex; align-items: center; gap: 0.5rem; color: #6c757d;">
<span style="line-height: 1;">🎵</span> Input
</div>
<!-- Drag and Drop / File Upload -->
<div id="dropzone" style="border: 2px dashed #ced4da; border-radius: 8px; padding: 2rem; text-align: center; color: #6c757d; cursor: pointer; background-color: #f8f9fa; transition: background-color 0.3s, border-color 0.3s; position: relative;">
<input type="file" id="fileInput" accept="audio/*" style="position: absolute; top: 0; left: 0; opacity: 0; width: 100%; height: 100%; cursor: pointer;" />
<p style="margin: 0;">Drop Audio Here<br>- or -<br>Click to Upload</p>
</div>
<!-- Record Microphone Button -->
<button id="recordBtn" style="padding: 0.5rem 1rem; border: 1px solid #e9ecef; border-radius: 4px; background-color: #fff; color: #d9534f; cursor: pointer; font-size: 1rem;">
<span style="font-size: 0.8rem; border-radius: 50%; background-color: #d9534f; width: 10px; height: 10px; display: inline-block;"></span>
Use Microphone
</button>
</div>
<!-- Output Section -->
<div style="flex: 1; display: flex; flex-direction: column; gap: 1rem;">
<div style="font-size: 1rem; font-weight: bold; padding: 0.5rem 1rem; background-color: #f8f9fa; border-radius: 8px; color: #6c757d;">Transcript</div>
<textarea id="results" placeholder="Output will appear here..." readonly style="flex: 1; padding: 0.75rem; font-size: 1rem; border: 1px solid #ced4da; border-radius: 8px; resize: none; background-color: #f8f9fa;"></textarea>
<audio id="audioPlayback" controls style="display: none; margin-top: 1rem; width: 100%;"></audio>
</div>
</div>
</div>
<!-- Multistream Demo Content -->
<div id="multistreamContent" class="tab-content loading" style="display: none;">
<div style="text-align: center; padding: 1rem;">
<button id="playAllBtn" style="padding: 0.75rem 1.5rem; background-color: #007bff; color: #fff; border: none; border-radius: 4px; cursor: pointer; font-size: 1rem;">Play All Streams</button>
</div>
<div style="display: flex; flex-wrap: wrap; gap: 1rem;">
<div class="audio-container" style="flex: 1; min-width: 250px;">
<audio id="audio1" controls style="width: 100%;"></audio>
<textarea id="transcript1" readonly placeholder="Transcript for Audio 1"
style="width: 100%; height: 4rem; margin-top: 0.5rem; font-size: 0.9rem; padding: 0.5rem;
border: 1px solid #ced4da; border-radius: 4px; resize: none;"></textarea>
</div>
<div class="audio-container" style="flex: 1; min-width: 250px;">
<audio id="audio2" controls style="width: 100%;"></audio>
<textarea id="transcript2" readonly placeholder="Transcript for Audio 2"
style="width: 100%; height: 4rem; margin-top: 0.5rem; font-size: 0.9rem; padding: 0.5rem;
border: 1px solid #ced4da; border-radius: 4px; resize: none;"></textarea>
</div>
<div class="audio-container" style="flex: 1; min-width: 250px;">
<audio id="audio3" controls style="width: 100%;"></audio>
<textarea id="transcript3" readonly placeholder="Transcript for Audio 3"
style="width: 100%; height: 4rem; margin-top: 0.5rem; font-size: 0.9rem; padding: 0.5rem;
border: 1px solid #ced4da; border-radius: 4px; resize: none;"></textarea>
</div>
<div class="audio-container" style="flex: 1; min-width: 250px;">
<audio id="audio4" controls style="width: 100%;"></audio>
<textarea id="transcript4" readonly placeholder="Transcript for Audio 4"
style="width: 100%; height: 4rem; margin-top: 0.5rem; font-size: 0.9rem; padding: 0.5rem;
border: 1px solid #ced4da; border-radius: 4px; resize: none;"></textarea>
</div>
<div class="audio-container" style="flex: 1; min-width: 250px;">
<audio id="audio5" controls style="width: 100%;"></audio>
<textarea id="transcript5" readonly placeholder="Transcript for Audio 5"
style="width: 100%; height: 4rem; margin-top: 0.5rem; font-size: 0.9rem; padding: 0.5rem;
border: 1px solid #ced4da; border-radius: 4px; resize: none;"></textarea>
</div>
</div>
</div>
<div id="status">Loading...</div>
</div>
<!-- Footer Section -->
<div style="width: 100%; max-width: 900px; margin-top: 1.5rem; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); text-align: left; font-size: 0.9rem; color: #6c757d;">
<h3>Model Licensing</h3>
<ul>
<li><strong>The models in this Space are dual-licensed (license under consideration, probably Coqui).</strong></li>
<li>Model download: <a href="https://huggingface.co/Banafo/Kroko-ASR" target="_blank">Banafo/Kroko-ASR</a></li>
<li>Commercial model demo with half the latency and a lower error rate: <a href="https://banafo.ai/en/realtime-demo/live-transcript" target="_blank">banafo.ai</a></li>
<li>For commercial licensing, please contact [email protected]</li>
<li>Pricing examples:
<ul>
<li>$25 per year for a single user.</li>
<li>$99 per year for a PBX with up to 10 users or extensions.</li>
<li>Contact us for your use case.</li>
</ul>
</li>
</ul>
<h3>About This Demo</h3>
<ul>
<li><strong>Private and Secure:</strong> All processing runs locally on your device's CPU, inside your browser. No audio is sent to a server.</li>
<li><strong>Efficient Resource Usage:</strong> No GPU is required, which leaves the GPU free for other work such as WebLLM analysis.</li>
<li><strong>Versatile Audio Handling:</strong> The demo can handle microphone and speaker inputs simultaneously, making it well suited to WebRTC-based call centers or web meetings.</li>
</ul>
<h3>Latest Update</h3>
<ul>
<li>Added support for <strong>German (de-DE)</strong>.</li>
</ul>
</div>
<script src="./sherpa-onnx-asr.js"></script>
<script src="./app-asr.js"></script>
</body>
</html>