
Lira Mirui

LiraMirui

AI & ML interests

AGI waifu when


Organizations

None yet

LiraMirui's activity

reacted to sequelbox's post with 🧠 about 22 hours ago
reacted to etemiz's post with πŸ˜” about 22 hours ago
Some things are simple
reacted to KnutJaegersberg's post with πŸ‘€ 4 days ago
A Brief Survey of Associations Between Meta-Learning and General AI

The paper titled "A Brief Survey of Associations Between Meta-Learning and General AI" explores how meta-learning techniques can contribute to the development of Artificial General Intelligence (AGI). Here are the key points summarized:

1. General AI (AGI) and Meta-Learning:
- AGI aims to develop algorithms that can handle a wide variety of tasks, similar to human intelligence. Current AI systems excel at specific tasks but struggle with generalization to unseen tasks.
- Meta-learning, or "learning to learn," improves model adaptation and generalization, allowing AI systems to tackle new tasks efficiently using prior experience (a minimal code sketch follows this list).

2. Neural Network Design in Meta-Learning:
- Techniques like Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks enable self-improvement and adaptability for deep models, supporting generalization across tasks.
- Highway networks and ResNet-style models use shortcuts for efficient backpropagation, allowing deeper models that can be used in meta-learning frameworks.

3. Coevolution:
- Coevolution involves the mutual evolution of multiple components, such as learners or task-solvers, to improve overall performance.
- Coevolution between learners enhances collaboration and competition within AI systems, while coevolution between tasks and solvers (e.g., POWERPLAY and AI-GA frameworks) pushes solvers to adapt to increasingly complex tasks.

4. Curiosity in Meta-Learning:
- Curiosity-based exploration encourages AI systems to discover new, diverse features of the environment, avoiding local optima.
- Curiosity-based objectives can be combined with performance-based objectives to ensure efficient exploration and adaptation in complex tasks (see the second sketch, after the paper link).

5. Forgetting Mechanisms:
- Forgetting is crucial to avoid memory overload in AI systems.
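
The survey discusses these ideas at a conceptual level; as a concrete illustration of point 1, here is a minimal first-order MAML-style sketch in PyTorch on a toy sine-regression task. Everything in it (the task family, `sample_task`, network sizes, learning rates) is illustrative and not taken from the paper:

```python
import copy
import torch
import torch.nn as nn

def sample_task():
    """Toy task family: sine waves with random amplitude and phase."""
    amp = torch.rand(1).item() * 4.9 + 0.1
    phase = torch.rand(1).item() * 3.14
    xs = torch.rand(10, 1) * 10 - 5   # support inputs
    xq = torch.rand(10, 1) * 10 - 5   # query inputs
    return xs, amp * torch.sin(xs + phase), xq, amp * torch.sin(xq + phase)

model = nn.Sequential(nn.Linear(1, 40), nn.ReLU(), nn.Linear(40, 1))
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(1000):                      # outer loop: meta-update
    meta_opt.zero_grad()
    for _ in range(4):                        # small batch of tasks per meta-step
        xs, ys, xq, yq = sample_task()
        fast = copy.deepcopy(model)           # task-specific copy of the meta-model
        inner_opt = torch.optim.SGD(fast.parameters(), lr=0.01)
        inner_opt.zero_grad()
        loss_fn(fast(xs), ys).backward()      # inner loop: one adaptation step
        inner_opt.step()
        fast.zero_grad()
        loss_fn(fast(xq), yq).backward()      # evaluate the adapted copy on the query set
        # First-order approximation: accumulate the adapted copy's gradients onto
        # the meta-parameters instead of differentiating through the inner step.
        for p, fp in zip(model.parameters(), fast.parameters()):
            p.grad = fp.grad.clone() if p.grad is None else p.grad + fp.grad
    meta_opt.step()
```

After meta-training, one or two gradient steps on a handful of support points from a new sine task should already fit it well; that fast adaptation is the "learning to learn" the survey refers to.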

https://arxiv.org/abs/2101.04283
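
Point 4's curiosity objective is commonly realized as a prediction-error bonus from a learned forward model: transitions the model predicts poorly are treated as novel and rewarded. A minimal sketch under that assumption (the dimensions, `intrinsic_reward`, and the `beta` coefficient are made up for illustration):

```python
import torch
import torch.nn as nn

state_dim, action_dim = 8, 2   # hypothetical environment sizes

# Forward model: predict the next state from (state, action).
forward_model = nn.Sequential(
    nn.Linear(state_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, state_dim)
)
fm_opt = torch.optim.Adam(forward_model.parameters(), lr=1e-3)

def intrinsic_reward(state, action, next_state, beta=0.1):
    """Curiosity bonus: scaled prediction error of the forward model."""
    pred = forward_model(torch.cat([state, action], dim=-1))
    error = (pred - next_state).pow(2).mean(dim=-1)
    # Train the forward model on the same transition, so familiar regions
    # stop paying out and the agent is pushed past local optima.
    fm_opt.zero_grad()
    error.mean().backward()
    fm_opt.step()
    return beta * error.detach()

# Usage: add the bonus to the task (extrinsic) reward for a batch of transitions.
s = torch.randn(4, state_dim)
a = torch.randn(4, action_dim)
s2 = torch.randn(4, state_dim)
r_bonus = intrinsic_reward(s, a, s2)
```

Combining this bonus with the extrinsic reward gives exactly the mixed curiosity-plus-performance objective described in point 4.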
reacted to schuler's post with πŸ‘ 4 days ago
πŸ“’ New Research Alert: Making Language Models Smaller & Smarter!

Thrilled to share the latest technical report demonstrating how to reduce language model parameters by 77% while maintaining performance.

The secret? Grouped pointwise convolutions. Yes. We brought a method from computer vision to the transformer arena. (A rough sketch of the parameter arithmetic follows the links below.)

πŸ”‘ Key Findings:
β€’ 77% parameter reduction.
β€’ Maintained model capabilities.
β€’ Improved generalization.

Paper: https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT
Code: https://github.com/joaopauloschuler/less-parameters-llm
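
For intuition on where such savings can come from (a sketch of my own, not code from the linked report or repo): a pointwise, i.e. 1x1, convolution with `groups=G` splits the channels into G independent blocks, cutting the weight count by a factor of G:

```python
import torch
import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

c_in, c_out, groups = 512, 512, 16   # illustrative sizes, not the report's

dense = nn.Conv1d(c_in, c_out, kernel_size=1)                  # ordinary pointwise conv
grouped = nn.Conv1d(c_in, c_out, kernel_size=1, groups=groups)

print(n_params(dense))    # 512*512 + 512    = 262,656
print(n_params(grouped))  # 512*512/16 + 512 =  16,896 -> ~94% fewer weights here

x = torch.randn(1, c_in, 128)   # (batch, channels, sequence)
assert grouped(x).shape == dense(x).shape
```

The exact saving (the 77% in the report) depends on which layers are grouped and on the group count, and grouped layers are typically paired with some channel interleaving between blocks so the groups don't stay informationally isolated.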
reacted to reedmayhew's post with πŸ‘€ 17 days ago
reacted to AlexBodner's post with πŸ‘€ 17 days ago
reacted to m-ric's post with πŸ”₯ 2 months ago
Last week was crazy in open-source AI, with important model and dataset releases every day.

Here are the most important ones I've pinned:

🌎 Cohere released Global-MMLU, a multilingual version of MMLU, to evaluate AI models' world knowledge in many languages!

πŸ¦™ Meta released Llama-3.3-70B-Instruct, a 70B model that's on par with Llama-3.1-405B-Instruct, GPT-4o and Claude. Probably my new go-to for agentic workflows.

πŸ”‰ FishAudio released fish-speech-1.5, a multilingual text-to-speech model

🎨 Microsoft Research released TRELLIS, an extremely impressive image-to-3D model, which you can try here: JeffreyXiang/TRELLIS

πŸ“š Yesterday, Hugging Face released FineWeb 2, a new version that extends the previous FineWeb to over 1,000 languages, with extended coverage of Russian, Mandarin, German, Japanese, Spanish, and French: a huge, high-quality dataset of over 3 trillion words! HuggingFaceFW/fineweb-2

Now let's go build to make this week as productive as the last one!