请使用以下代码运行我们的模型: ```python from transformers import AutoModelForCausalLM, AutoTokenizer model_path = 'QiYuan-tech/QiHealth-6B-Beta' tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False) model = AutoModelForCausalLM.from_pretrained( model_path, device_map="auto", torch_dtype='auto' ).eval() messages = [ {"role": "user", "content": "hi"} ] input_ids = tokenizer.apply_chat_template(conversation=messages, tokenize=True, add_generation_prompt=True, return_tensors='pt') output_ids = model.generate(input_ids.to('cuda')) response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True) print(response) ```