Xylaria-1.8

Model Description

Xylaria-1.8 is a large language model (LLM) based on a Transformer architecture and trained on a large dataset of text and code. It is provided as-is for research and educational exploration.

Intended Use & Limitations

  • Intended Use: This model is primarily intended for research into large language models, natural language processing, and related fields. It can be used for experimentation, analysis, and educational purposes.
  • Limitations:
    • Performance: Because the model has been modified, its behavior and accuracy may differ from other publicly available models with similar parameter counts.
    • Not for Production: This model is not intended for deployment in production environments or commercial applications. It may exhibit unexpected behavior.
    • Research Purposes Only: This model is provided for research and should not be used for any commercial purpose.
    • May not follow instructions: The model may not follow instructions as carefully as other, unmodified models.
    • May not acknowledge its author: The model may not state that it was made by its author (Sk Md Saad Amin).

Model Details

  • Architecture: Transformer-based language model (custom_transformer)
  • Parameters: Approximately 32.8 billion
  • Precision: float32 (a rough memory estimate follows this list)
  • Tokenizer: xylaria_tokenizer
  • License: Apache-2.0
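
At this size and precision, the weights alone dominate memory. A back-of-the-envelope estimate from the figures above (weights only; activations and the KV cache add more):

# Rough weight footprint: parameter count x bytes per parameter.
params = 32.8e9  # parameter count from this card
for name, nbytes in {"float32": 4, "bfloat16": 2, "float16": 2}.items():
    print(f"{name}: ~{params * nbytes / 1e9:.0f} GB")
# float32: ~131 GB; bfloat16/float16: ~66 GB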

How to Use

This model is hosted on the Hugging Face Hub and can be loaded using the transformers library:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Lap1official/Xylaria-1.8"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float32,  # use torch.bfloat16 on GPUs that support it
    device_map="auto",          # requires the accelerate package; remove for plain CPU loading
)

# Example usage (generation):
prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)  # move inputs to the model's device
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
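
Continuing from the snippet above, sampling parameters can be passed to generate for more varied output. The values below are illustrative defaults, not settings tuned for this model:

# Sampled generation (illustrative settings):
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,    # sample instead of greedy decoding
    temperature=0.7,   # lower values make output more deterministic
    top_p=0.9,         # nucleus-sampling cutoff
    pad_token_id=tokenizer.eos_token_id,  # assumes the tokenizer defines an EOS token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))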

Important:

  • You must pass trust_remote_code=True when loading both the model and the tokenizer, since the repository ships custom model and tokenizer code (custom_transformer, xylaria_tokenizer).
  • If you have a GPU, loading in bfloat16 (or float16 if bfloat16 is not supported) with device_map="auto" will significantly improve performance; a dtype-selection sketch follows this list. If you do not have a GPU, you can drop those arguments, but inference will be much slower.
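
A minimal sketch of that dtype selection, assuming a CUDA build of PyTorch (torch.cuda.is_bf16_supported() reports hardware bfloat16 support):

import torch
from transformers import AutoModelForCausalLM

# Prefer bfloat16 where the GPU supports it, fall back to float16,
# and use float32 on CPU. device_map="auto" requires accelerate.
if torch.cuda.is_available():
    dtype = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
else:
    dtype = torch.float32

model = AutoModelForCausalLM.from_pretrained(
    "Lap1official/Xylaria-1.8",
    trust_remote_code=True,
    torch_dtype=dtype,
    device_map="auto" if torch.cuda.is_available() else None,
)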

Ethical Considerations

This model is intended for research purposes only. It should not be used for any malicious, harmful, or unethical activities.

Disclaimer

This model is provided "as is", without any warranty of any kind, express or implied. The authors and contributors are not responsible for any consequences resulting from the use of this model.

Please note:

This model has not yet been benchmarked.

