Doron Adler

Norod78

AI & ML interests

Fooling around with generative machine learning models.

Recent Activity

liked a model about 12 hours ago
ivrit-ai/whisper-large-v3-turbo-ggml
liked a dataset about 14 hours ago
teknium/OpenHermes-2.5
liked a Space 1 day ago
ginigen/text3d-r1

Organizations

Spaces-explorers · Gradio-Blocks-Party · Yam Peleg · ZeroGPU Explorers · Mixtiles · Social Post Explorers · Hugging Face Discord Community · Endless Technologies Ltd.

Norod78's activity

reacted to schuler's post with 👍 1 day ago
📢 New Research Alert: Making Language Models Smaller & Smarter!

Thrilled to share the latest technical report demonstrating how to reduce language model parameters by 77% while maintaining performance.

The secret? Grouped pointwise convolutions. Yes, we brought a method from computer vision into the transformer arena (a minimal sketch of the idea follows below).

🔑 Key Findings:
• 77% parameter reduction.
• Maintained model capabilities.
• Improved generalization.

Paper: https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT
Code: https://github.com/joaopauloschuler/less-parameters-llm
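For readers who want to see the mechanism, here is a minimal PyTorch sketch of a grouped pointwise convolution standing in for a dense projection. The dimensions and the `groups` value are illustrative, not the report's configuration.

```python
import torch
import torch.nn as nn

# A pointwise (kernel_size=1) convolution is equivalent to a per-position
# linear projection. Setting groups > 1 splits the channels into independent
# slices, cutting the weight count by roughly a factor of `groups`.
d_model, groups = 768, 4  # illustrative sizes, not the report's config

dense = nn.Conv1d(d_model, d_model, kernel_size=1)
grouped = nn.Conv1d(d_model, d_model, kernel_size=1, groups=groups)

def n_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

print(n_params(dense))    # 590,592 weights + biases
print(n_params(grouped))  # 148,224, roughly 1/groups of the dense layer

x = torch.randn(2, d_model, 16)  # (batch, channels, sequence)
assert dense(x).shape == grouped(x).shape  # same output shape either way
```

The trade-off is that channels in different groups no longer mix, which is why grouped layers are typically paired with some form of channel interleaving or shuffling between blocks.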
upvoted an article 3 days ago
Hugging Face partners with Wiz Research to Improve AI Security

reacted to grimjim's post with 👍 3 days ago
I've made yet another merge of reasoning models with incremental gains on the current Open LLM leaderboard.
open-llm-leaderboard/open_llm_leaderboard

Merging the DeepSeek R1 distillation of Llama 3.1 8B into a prior best merge (at 10% task arithmetic weight, using the Llama 3.1 8B base model rather than the instruct model as the reference; sketched below) resulted in a slightly lower IFEval but higher results on every other benchmark, save for MMLU-PRO, which went down only marginally. MATH Lvl 5 and GPQA went up palpably.
grimjim/DeepSauerHuatuoSkywork-R1-o1-Llama-3.1-8B

This is my best Llama 3.1 8B merge result to date. The R1 distillation itself scored quite badly on its own, which suggests another case of unexpected output formatting (reflected in IFEval) dragging down evaluation results and obscuring the underlying strength of a model.

The model can also be used to generate roleplay completions; based on informal testing, its bias toward problem-solving will subtly color narration.
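For context, here is a minimal sketch of the task arithmetic step described above. The checkpoint paths are placeholders, and merges of this kind are usually performed with dedicated tooling (e.g. mergekit) rather than raw state dicts.

```python
import torch

# Placeholder paths; these are not the actual checkpoint files.
base = torch.load("llama-3.1-8b-base.pt")       # reference: base, not instruct
distill = torch.load("r1-distill-llama-8b.pt")  # the R1 distillation
prior = torch.load("prior-best-merge.pt")       # the earlier best merge

alpha = 0.10  # the 10% task arithmetic weight from the post

# Task vector = fine-tuned weights minus the reference weights; the merge
# adds a scaled copy of that vector to the prior merge. Keys are assumed
# to line up, since all three models share the Llama 3.1 8B architecture.
merged = {
    name: w + alpha * (distill[name] - base[name])
    for name, w in prior.items()
}

torch.save(merged, "merged-model.pt")
```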
New activity in apple/coreml-mobileclip 3 days ago
reacted to merve's post with 🚀 5 days ago