wi-lab committed on
Commit 1cb31c1 · verified · 1 parent: c853e0e

Update README.md

Files changed (1): README.md (+33 −30)
README.md CHANGED
@@ -16,21 +16,21 @@ base_model:

**[🚀 Click here to try the Interactive Demo Based on LWM-v1.0!](https://huggingface.co/spaces/wi-lab/lwm-interactive-demo)**

- LWM-v1.1 is a powerful **pre-trained** model developed as a **universal feature extractor** for wireless channels. Building on the foundation of LWM-v1.0, this enhanced version incorporates key advancements to handle **diverse channel configurations**, improve **generalization**, and process **larger, more complex datasets**. As a state-of-the-art foundation model, LWM-v1.1 leverages transformers to extract refined representations from simulated datasets like DeepMIMO and real-world wireless data.

- ### **How is LWM-v1.1 built?**

- The LWM-v1.1 architecture is built on transformers, designed to capture **dependencies** in wireless channel data. The model employs an updated version of **Masked Channel Modeling (MCM)**, increasing the masking ratio to make pretraining more challenging and effective. With **2D patch segmentation**, the model learns intricate relationships across both antennas and subcarriers, while **bucket-based batching** ensures efficient processing of variable-sized inputs. These enhancements make LWM-v1.1 highly scalable and adaptable, offering robust embeddings for diverse scenarios.

- ### **What does LWM-v1.1 offer?**

- LWM-v1.1 provides a versatile feature extraction framework for wireless communication and sensing tasks. Pretrained on a larger and more diverse dataset, it generalizes well across environments—from dense urban cities to synthetic setups—capturing channel characteristics that facilitate reliable performance. With increased capacity and optimized pretraining, LWM-v1.1 embeddings are even more refined, enabling improved results across downstream applications.

- ### **How is LWM-v1.1 used?**

- LWM-v1.1 is designed to be seamlessly integrated into downstream tasks as a source of high-quality **embeddings**. By feeding raw wireless channel data into the model, users obtain contextualized representations that capture critical spatial relationships and dependencies. These embeddings enable efficient and accurate performance with limited labeled data.

- ### **Advantages of Using LWM-v1.1**

- **Enhanced Flexibility**: Handles diverse channel configurations with no size limitations.
- **Refined Embeddings**: Improved feature extraction through advanced pretraining and increased model capacity.
@@ -38,7 +38,7 @@ LWM-v1.1 is designed to be seamlessly integrated into downstream tasks as a sour
- **Broad Generalization**: Trained on a larger, more diverse dataset for reliable performance across environments.
- **Task Adaptability**: Fine-tuning options enable seamless integration into a wide range of applications.

- For example, the following figure demonstrates the advantages of using **LWM-v1.1-based highly compact CLS embeddings** and **high-dimensional channel embeddings** over raw channels for the LoS/NLoS classification task. The raw dataset is derived from channels of size (128, 32) between BS 3 and 8,299 users in the densified Denver scenario of the DeepMIMO dataset.

<p align="center">
<img src="https://huggingface.co/wi-lab/lwm-v1.1/resolve/main/images/los_perf.png" alt="LoS/NLoS Classification Performance" width="600"/>
@@ -50,33 +50,36 @@ For example, the following figure demonstrates the advantages of using **LWM-v1.

---

- # **🧩 Puzzle Pieces that Redefine LWM-v1.0**

- #### **1️⃣ Breaking Barriers**
- 🔓 No Channel Size Limitation
- 📏 Support for Larger Input Sizes

- #### **2️⃣ Smarter Foundations**
- 🌍 A More Diverse Dataset
- 🎭 Tougher Masking Challenges with 40% MCM Ratio

- #### **3️⃣ Amplified Power**
- 🔢 Expanded Capacity: 2.5M Parameters
- 📐 Realistic 2D Patch Segmentation

- #### **4️⃣ Efficiency Engineered**
- ⚙️ Optimized Training with AdamW + Cosine Decay
- Faster Computation with Streamlined Attention Heads

- ### **🌀 See the Difference at a Glance**

- | Feature | LWM-v1.0 | **LWM-v1.1** |
- |-----------------------------|-------------------------|-----------------------|
- | Channel Size Limitation | Fixed at (32, 32) | **Dynamic** |
- | Pre-training Samples | 820K | **1.05M** |
- | Pre-training Scenarios | 15 | **140** |
- | Masking Ratio | 15% | **40%** |
- | Parameters | 600K | **2.5M** |
| Sequence Length Support | 128 | **512** |

# **Detailed Changes in LWM-v1.1**
 
**[🚀 Click here to try the Interactive Demo Based on LWM-v1.0!](https://huggingface.co/spaces/wi-lab/lwm-interactive-demo)**

+ LWM-v1.1 is an **updated pre-trained model** designed for **feature extraction** in wireless channels. Extending LWM-v1.0, this version introduces key modifications to improve **scalability**, **generalization**, and **efficiency** across diverse channel configurations. The model is pre-trained on an expanded dataset covering multiple **(N, SC)** (antenna, subcarrier) pairs, ensuring robustness to varying antenna and subcarrier configurations. LWM-v1.1 retains the transformer-based architecture and **Masked Channel Modeling (MCM)** pretraining approach, enabling it to learn structured representations from both **simulated (e.g., DeepMIMO) and real-world** wireless channels. The model supports variable-length inputs, incorporates **bucket-based batching** for memory efficiency, and enables fine-tuning for task-specific adaptation.

+ ### **How is LWM-v1.1 built?**

+ LWM-v1.1 is a **transformer-based architecture** designed to model **spatial and frequency dependencies** in wireless channel data. It uses an enhanced **Masked Channel Modeling (MCM)** pretraining approach, with an increased masking ratio to improve feature learning and generalization. **2D patch segmentation** allows the model to jointly process spatial (antenna) and frequency (subcarrier) relationships, providing a more structured representation of the channel. In addition, **bucket-based batching** handles variable-sized inputs without excessive padding, keeping training and inference memory-efficient. These modifications enable LWM-v1.1 to extract meaningful embeddings from a wide range of wireless scenarios, improving its applicability across different system configurations.
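The bucket-based batching idea can be sketched in a few lines: channels are grouped by their (N, SC) shape, so every batch stacks identically sized inputs and no padding is needed. This is a hypothetical illustration, not the released training code; the function name and batch size are assumptions.

```python
import numpy as np
from collections import defaultdict

def bucket_batches(channels, batch_size):
    """Group variable-sized channel matrices by (N, SC) shape and yield
    padding-free batches. Illustrative sketch, not the actual LWM code."""
    buckets = defaultdict(list)
    for ch in channels:
        buckets[ch.shape].append(ch)                 # one bucket per (N, SC) pair
    for group in buckets.values():
        for i in range(0, len(group), batch_size):
            yield np.stack(group[i:i + batch_size])  # shape (B, N, SC)

# Example: a mix of (32, 32) and (128, 32) channels
channels = [np.zeros((32, 32))] * 5 + [np.zeros((128, 32))] * 3
batches = list(bucket_batches(channels, batch_size=4))  # shapes (4,32,32), (1,32,32), (3,128,32)
```

Within a batch every channel shares one patch grid, which is what makes variable-sized pretraining memory-efficient.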
+ ### **What does LWM-v1.1 offer?**

+ LWM-v1.1 serves as a **general-purpose feature extractor** for wireless communication and sensing tasks. Pretrained on an expanded and more diverse dataset, it captures channel characteristics across various environments, including **dense urban areas, simulated settings, and real-world deployments**. The model's increased capacity and optimized pretraining strategy improve the quality of the extracted representations, enhancing their usefulness for downstream tasks.

+ ### **How is LWM-v1.1 used?**

+ LWM-v1.1 is designed for seamless integration into **wireless communication pipelines** as a pre-trained **embedding extractor**. By processing raw channel data, the model generates structured representations that encode **spatial, frequency, and propagation characteristics**. These embeddings can be used directly in downstream tasks, reducing the need for extensive labeled data while improving efficiency and generalization across different system configurations.
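At the shape level, this extraction flow can be sketched as follows. Everything here is a stand-in: the token layout, the 64-dimensional embedding width, and the random linear map playing the role of the pre-trained transformer are illustrative assumptions, not the released model or its API.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM = 64                                   # assumed embedding width

def extract_embeddings(channel, emb_dim=EMB_DIM):
    """Map one complex (N, SC) channel to per-token embeddings plus a single
    summary (CLS-style) vector. Shape-level sketch only."""
    tokens = channel.reshape(-1, 32)           # illustrative tokenization
    feats = np.concatenate([tokens.real, tokens.imag], axis=1)  # real-valued features
    proj = rng.standard_normal((feats.shape[1], emb_dim))       # stand-in "encoder"
    emb = feats @ proj                         # (num_tokens, emb_dim)
    cls = emb.mean(axis=0)                     # stand-in for the compact CLS embedding
    return emb, cls

h = rng.standard_normal((128, 32)) + 1j * rng.standard_normal((128, 32))
emb, cls = extract_embeddings(h)               # emb: (128, 64), cls: (64,)
```

A downstream classifier (e.g., for LoS/NLoS) would then train on `cls` or `emb` instead of the raw channel.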
+ ### **Advantages of Using LWM-v1.1**

- **Enhanced Flexibility**: Handles diverse channel configurations with no size limitations.
- **Refined Embeddings**: Improved feature extraction through advanced pretraining and increased model capacity.
- **Broad Generalization**: Trained on a larger, more diverse dataset for reliable performance across environments.
- **Task Adaptability**: Fine-tuning options enable seamless integration into a wide range of applications.

+ For example, the following figure demonstrates the advantages of using **LWM-v1.1-based highly compact CLS embeddings** and **high-dimensional channel embeddings** over raw channels for the LoS/NLoS classification task. The raw dataset is derived from channels of size (128, 32) between BS 3 and 8,299 users in the densified Denver scenario of the DeepMIMO dataset.

<p align="center">
<img src="https://huggingface.co/wi-lab/lwm-v1.1/resolve/main/images/los_perf.png" alt="LoS/NLoS Classification Performance" width="600"/>

---

+ # **Key Improvements in LWM-v1.1**

+ ### **1️⃣ Expanded Input Flexibility**
+ - **Removed Fixed Channel Size Constraints**: Supports multiple **(N, SC)** configurations instead of being restricted to (32, 32).
+ - **Increased Sequence Length**: Extended from **128 to 512**, allowing the model to process larger input dimensions efficiently.

+ ### **2️⃣ Enhanced Dataset and Pretraining**
+ - **Broader Dataset Coverage**: Increased the number of training scenarios from **15 to 140**, improving generalization across environments.
+ - **Higher Masking Ratio in MCM**: Increased from **15% to 40%**, making the **Masked Channel Modeling (MCM)** task more challenging and more effective for feature learning.
+ - **Larger Pretraining Dataset**: Expanded from **820K to 1.05M** samples for more robust representation learning.
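The 40% masking step at the heart of MCM can be sketched as below; the uniform-random selection and the helper name are assumptions for illustration, not the released pretraining code.

```python
import numpy as np

def sample_mcm_mask(num_patches, mask_ratio=0.40, seed=0):
    """Return the indices of patches to mask for Masked Channel Modeling.
    Illustrative sketch: uniform sampling without replacement."""
    rng = np.random.default_rng(seed)
    n_mask = int(round(mask_ratio * num_patches))
    return rng.choice(num_patches, size=n_mask, replace=False)

masked = sample_mcm_mask(num_patches=256)      # 102 of 256 patches masked
```

The model is then trained to reconstruct the masked patches from the visible ones; a higher ratio leaves less context per prediction, which is what makes the pretext task harder.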
+ ### **3️⃣ Improved Model Architecture**
+ - **Increased Model Capacity**: Parameter count expanded from **600K to 2.5M**, enhancing representational power.
+ - **2D Patch Segmentation**: Instead of segmenting channels along a single dimension (antennas or subcarriers), patches now span **both antennas and subcarriers**, improving spatial-frequency feature learning.
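2D patch segmentation amounts to a block reshape: each patch covers a small rectangle of antennas × subcarriers rather than a full row or column. The 4×4 patch size below is an illustrative assumption, not necessarily the model's actual patch dimensions.

```python
import numpy as np

def patchify_2d(channel, pa=4, pf=4):
    """Split an (N, SC) channel into flattened pa x pf patches that span
    both the antenna and subcarrier dimensions. Sketch only."""
    n, sc = channel.shape
    assert n % pa == 0 and sc % pf == 0, "patch size must divide the channel"
    return (channel.reshape(n // pa, pa, sc // pf, pf)
                   .transpose(0, 2, 1, 3)      # (n/pa, sc/pf, pa, pf)
                   .reshape(-1, pa * pf))      # one row per 2D patch

h = np.arange(128 * 32).reshape(128, 32)
patches = patchify_2d(h)                       # (256, 16): a 32 x 8 grid of 4x4 patches
```

Each resulting token mixes antenna and subcarrier samples, so the transformer sees spatial-frequency structure inside every patch.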
+ ### **4️⃣ Optimized Training and Efficiency**
+ - **Adaptive Learning Rate Schedule**: Implemented **AdamW with Cosine Decay**, improving convergence stability.
+ - **Computational Efficiency**: Reduced the number of attention heads per layer from **12 to 8**, balancing computational cost with feature extraction capability.
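The cosine-decay schedule typically paired with AdamW can be written down directly; the base rate, warmup length, and step count below are illustrative values, not LWM-v1.1's actual hyperparameters.

```python
import math

def cosine_decay_lr(step, total_steps, base_lr=1e-3, min_lr=0.0, warmup=0):
    """Cosine-decay learning rate with optional linear warmup.
    Illustrative hyperparameters, not the released training config."""
    if warmup and step < warmup:
        return base_lr * (step + 1) / warmup   # linear warmup phase
    t = (step - warmup) / max(1, total_steps - warmup)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * t))

lrs = [cosine_decay_lr(s, total_steps=100) for s in range(100)]  # smooth decay from 1e-3 toward 0
```

In PyTorch this corresponds to pairing `torch.optim.AdamW` with a cosine annealing scheduler.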
71
 
72
+ ---
73
+
74
+ ### **Comparison of LWM Versions**
75
 
76
+ | Feature | LWM-v1.0 | **LWM-v1.1** |
77
+ |-----------------------------|-------------------------|-----------------------|
78
+ | Channel Size Limitation | Fixed at (32, 32) | **Supports multiple (N, SC) pairs** |
79
+ | Pre-training Samples | 820K | **1.05M** |
80
+ | Pre-training Scenarios | 15 | **140** |
81
+ | Masking Ratio | 15% | **40%** |
82
+ | Parameters | 600K | **2.5M** |
83
  | Sequence Length Support | 128 | **512** |
84
 
85
  # **Detailed Changes in LWM-v1.1**