	Upload README.md
README.md CHANGED

@@ -1,3 +1,14 @@

| Old | New | Change | Content |
|---|---|---|---|
| | 1 | + | title: Mamba Encoder Swarm |
| | 2 | + | emoji: 🐍 |
| | 3 | + | colorFrom: orange |
| | 4 | + | colorTo: yellow |
| | 5 | + | sdk: gradio |
| | 6 | + | sdk_version: "4.0.0" |
| | 7 | + | app_file: app.py |
| | 8 | + | pinned: false |
| | 9 | + | license: mit |
| | 10 | + | |
| | 11 | + | |
| 1 | 12 | | # What is M E S ? |
| 2 | 13 | | M E S (short for MAMBA ENCODER SWARM) is a novel architecture built on MAMBA's structured state space. It is configured as a swarm of multiple encoders that are dynamically and sparsely routed, spreading the computational intensity of the heavy QxKxV matrix multiplication across multiple MAMBA encoders (between 5 and 1000). The output is sparsely aggregated by a MAMBA decoder, thereby bypassing the high cost of inference without sacrificing response generation quality. |
| 3 | 14 | | |
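
The README paragraph describes sparse routing across a swarm of encoders, followed by sparse aggregation. The README does not say how routing or aggregation is implemented, so the sketch below is only an illustration of the general idea. It assumes a mixture-of-experts-style top-k gate, which is a swapped-in technique and not the project's confirmed method. The names `route_top_k` and `swarm_forward`, the gate scores, and the toy encoders (plain scaling functions standing in for Mamba encoder blocks) are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Pick the k highest-scoring encoders and normalise their weights.

    Hypothetical helper: returns a list of (encoder_index, weight) pairs.
    """
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    top = ranked[:k]
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))


def swarm_forward(x, encoders, gate_scores, k=2):
    """Run only the selected encoders and sum their weighted outputs.

    Encoders that are not selected are never called, which is where the
    compute saving of sparse routing would come from.
    """
    selected = route_top_k(gate_scores, k)
    out = [0.0] * len(x)
    for idx, weight in selected:
        y = encoders[idx](x)
        for j, value in enumerate(y):
            out[j] += weight * value
    return out, [idx for idx, _ in selected]


# Toy stand-ins for encoder blocks (assumption): each scales its input.
encoders = [lambda v, s=s: [s * t for t in v]
            for s in (1.0, 2.0, 3.0, 4.0, 5.0)]

# Hypothetical gate scores; in practice a learned router would produce these.
gate_scores = [0.1, 2.0, -1.0, 1.5, 0.5]

output, used = swarm_forward([1.0, -1.0], encoders, gate_scores, k=2)
print("encoders used:", used)
print("aggregated output:", output)
```

With `k=2` only two of the five toy encoders run per input. A real swarm would replace the scaling functions with trained encoder modules and the fixed scores with a learned router.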
