|
--- |
|
license: llama2 |
|
language: |
|
- en |
|
tags: |
|
- code |
|
- blockchain |
|
- solidity |
|
- smart contract |
|
- transformers |
|
--- |
|
# Smart-Solidity-beta Overview |
|
|
|
Smart-Solidity-beta is a fine-tuned version of CodeLlama-7B-Instruct, tailored for generating and understanding Solidity smart contract code. The model specializes in producing high-quality Solidity code from user-provided instructions and use cases, and was fine-tuned for robust, precise output in the blockchain and smart-contract domain.
|
|
|
# Key Features |
|
**Language**: Trained exclusively for Solidity and related blockchain development.<br> |
|
**Purpose**: Tailored for creating, debugging, and understanding Solidity smart contracts.<br> |
|
**Ease of Use**: Provides concise and accurate responses to Solidity-specific queries. |
|
|
|
# Use Cases |
|
**Code Generation**: Generate boilerplate or advanced Solidity code snippets for smart contracts.<br> |
|
**Code Explanation**: Understand complex Solidity logic by receiving step-by-step explanations.<br> |
|
**Debugging**: Identify and suggest fixes for potential bugs or inefficiencies in smart contract code.<br> |
|
**Optimization**: Propose refactored versions of Solidity code for gas efficiency and maintainability.<br> |
|
**Learning**: Assist blockchain developers in learning Solidity through practical examples. |
|
|
|
# Model Details |
|
**Base Model**: Meta’s CodeLlama-7B-Instruct.<br>

**Fine-tuning Data**: A processed dataset of GPT-generated instruction and Solidity source-code pairs (an illustrative pair is sketched below).<br>
|
**Model Size**: 7 billion parameters, balancing high-quality output with reasonable computational requirements. |
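
For illustration, a single instruction/code training pair might look like the following. This is a hypothetical example for readability only; the exact dataset schema is not published with this card.

```python
# Hypothetical illustration of one instruction/code training pair.
# The actual dataset schema used for fine-tuning is not specified in this card.
example_pair = {
    "instruction": "Write a Solidity function that transfers contract ownership to a new address.",
    "output": (
        "function transferOwnership(address newOwner) public onlyOwner {\n"
        "    require(newOwner != address(0), \"New owner is the zero address\");\n"
        "    owner = newOwner;\n"
        "}"
    ),
}
```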
|
|
|
# Technical Specifications |
|
**Input Format**: Accepts Solidity code snippets, prompts, or questions in natural language.<br> |
|
**Output Format**: Provides Solidity code, recommendations, or explanations in plain text. |
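
As an example of a mixed input, a debugging-style query can combine a natural-language request with a Solidity snippet in a single prompt. This is a sketch; the exact prompt template the model was trained on is not documented here.

```python
# Sketch of a debugging-style prompt mixing natural language and Solidity code.
# The prompt wording is an assumption; adapt it to your own query.
solidity_snippet = """
function withdraw(uint256 amount) public {
    require(balances[msg.sender] >= amount);
    (bool ok, ) = msg.sender.call{value: amount}("");
    balances[msg.sender] -= amount;
}
"""

prompt = (
    "Review the following Solidity function, point out any bugs or security "
    "issues, and suggest a fix:\n" + solidity_snippet
)
```

The resulting `prompt` string can then be tokenized and passed to `model.generate` exactly as in the Example Usage section below.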
|
|
|
# Training Loss Table |
|
| Step | Training Loss |
|------|---------------|
| 100  | 0.3309 |
| 1000 | 0.2995 |
| 2000 | 0.2275 |
| 3000 | 0.2695 |
| 4000 | 0.2514 |
| 5000 | 0.2405 |
|
|
|
|
|
# Hardware

**GPU**: 1x NVIDIA GeForce GTX 1080Ti<br>

**Training time**: ~48 hours
|
|
|
## Example Usage |
|
|
|
The following Python code demonstrates how to use the **EclipseNomad/Smart-Solidity-beta** model to generate Solidity smart contract code: |
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("EclipseNomad/Smart-Solidity-beta")

# Load the model with 4-bit quantization (requires the bitsandbytes package)
model = AutoModelForCausalLM.from_pretrained(
    "EclipseNomad/Smart-Solidity-beta",
    load_in_4bit=True,
    device_map="auto",
)

# Define the instruction for the model
instruction = (
    "Create a smart contract to manage a whitelist of wallet addresses. "
    "Include functionality for a DAO to approve/revoke wallets and a mechanism to set a validator address."
)

# Tokenize the instruction and move it to the GPU
input_ids = tokenizer(instruction, return_tensors="pt").input_ids.to("cuda")

# Generate Solidity code; sampling is enabled so temperature/top_p take effect
outputs = model.generate(
    input_ids=input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    # Fall back to the EOS token if the tokenizer defines no pad token
    pad_token_id=tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id,
)

# Decode and print the generated Solidity code
solidity_code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(solidity_code)
```
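
If your `transformers` version flags `load_in_4bit` as deprecated, the same 4-bit loading can be expressed with an explicit `BitsAndBytesConfig`. This is a minimal sketch and still assumes the `bitsandbytes` package and a CUDA-capable GPU:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Equivalent 4-bit loading via an explicit quantization config
bnb_config = BitsAndBytesConfig(load_in_4bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "EclipseNomad/Smart-Solidity-beta",
    quantization_config=bnb_config,
    device_map="auto",
)
```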
|
|
|
|
|
|
|
# Limitations |
|
The model may occasionally produce incorrect or suboptimal Solidity code; thorough human review is recommended before deploying generated contracts to production.

Its training data may not cover the most recent Solidity releases or EVM standards, so verify generated code against the compiler and EVM version you target.
|
# Future Work
|
The model is open to further fine-tuning and community contributions to enhance its accuracy and support for emerging Solidity standards and advanced blockchain use cases. |
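
As one illustration of such follow-up work, the checkpoint can be adapted with parameter-efficient LoRA fine-tuning via the `peft` library. This is a minimal sketch under assumed hyperparameters, not the recipe used to train this model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Load the published checkpoint as the starting point for further fine-tuning
tokenizer = AutoTokenizer.from_pretrained("EclipseNomad/Smart-Solidity-beta")
model = AutoModelForCausalLM.from_pretrained("EclipseNomad/Smart-Solidity-beta")

# Attach LoRA adapters to the attention projections; these hyperparameters
# are illustrative assumptions, not the values used for the original fine-tune.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# The wrapped model can then be trained with the standard transformers Trainer
# or a custom training loop on new instruction/Solidity pairs.
```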
|
|