User Simulator Model
Overview
Source code and additional resources are available in our GitHub repository: https://github.com/wangkevin02/USP
Our User Simulator with Implicit Profiles (USP) replicates human-like conversational behavior in interactions with large language models (LLMs). Built on the LLaMA-3-8B base model and trained with Conditional Supervised Fine-Tuning (SFT) and Reinforcement Learning with Cycle Consistency (RLCC), it emulates diverse user dynamics based on predefined profiles to reconstruct realistic user-LLM dialogues. For detailed methodology and insights, refer to our paper: https://arxiv.org/abs/2502.18968.
Note: Our model is subject to the following constraints:
Maximum Context Length: Supports up to 4,096 tokens. Exceeding this may degrade performance; keep inputs within this limit for best results.
Language Limitation: Optimized for English. Non-English performance may vary due to limited training data.
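Since performance can degrade beyond the 4,096-token window, it is worth checking prompt length before generation. The sketch below is a minimal illustration, not part of the model's API; the helper names and the 256-token reply reserve are our own assumptions. It uses the tokenizer's chat template to measure how many tokens a prompt will occupy.

MAX_CONTEXT = 4096

def count_prompt_tokens(msgs, tokenizer):
    # Render the conversation with the chat template, then count tokens.
    text = tokenizer.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
    return len(tokenizer(text)["input_ids"])

def fits_in_context(msgs, tokenizer, reserve_for_reply=256):
    # Leave headroom for the simulated user utterance itself.
    return count_prompt_tokens(msgs, tokenizer) + reserve_for_reply <= MAX_CONTEXT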
Sample Implementation
Below is a concise example of using the User Simulator to generate user utterances, ideal for recreating scenarios or evaluating multi-turn dialogues based on specific profiles.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
# Define the model path
model_path = "/your/path/to/USP"
# Initialize tokenizer with specific configurations
tokenizer = AutoTokenizer.from_pretrained(model_path, padding_side="left", trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
# Define model configuration
model_config = {
    "pretrained_model_name_or_path": model_path,
    "trust_remote_code": True,
    "torch_dtype": torch.bfloat16,
    "device_map": "cuda",
    "attn_implementation": "flash_attention_2",
}
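# Note (assumption): "flash_attention_2" requires the separate flash-attn
# package. If it is not installed in your environment, "sdpa" (PyTorch's
# built-in scaled dot-product attention) is a drop-in alternative for the
# "attn_implementation" field above.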
# Load the pre-trained model with specified configurations
model = AutoModelForCausalLM.from_pretrained(**model_config)
model.eval()
# Define conversation messages
user_profile = """
You have a knack for planning exciting adventures, particularly when it comes to exploring new destinations on a budget. Your current focus is on organizing a cost-effective vacation to a warm, beach-filled location. You're actively seeking recommendations for such a getaway, with a particular interest in places like Phuket, Thailand. You're keen on discovering the must-visit spots in Phuket without breaking the bank, and you're looking for advice on how to make the most of your trip within your budget constraints. Your love for travel is evident in your habit of meticulously planning vacations in advance, ensuring you maximize both the experience and the value for money.
Your personality shines through in your conscientious approach to planning, where every detail is considered and nothing is left to chance. You're open-minded and adventurous, always eager to dive into new experiences and embrace what each destination has to offer. Your inquisitive nature means you're always asking questions, seeking out the best advice to enhance your journeys. You communicate with an informal and friendly style, making it easy for others to share their knowledge and insights with you. This combination of traits makes you not only a savvy traveler but also a delightful companion on any adventure.
""".strip()
system_prompt = (
    "You are engaging in a conversation with an AI assistant. Your profile is:\n"
    f"{user_profile}\n"
    "You can say anything you want, either based on the profile or something brand new.\n"
)
messages = [
    # Case 1: no dialogue history, so the simulator opens the conversation.
    [
        {"role": "system", "content": system_prompt},
    ],
    # Case 2: an ongoing dialogue, so the simulator produces the next user turn.
    [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "I want to go on vacation to a warm place. Do you have any recommendations?"},
        {"role": "assistant", "content": "Sure! If you like beaches, Maldives or Bali are great options. If you're into culture, consider Tuscany in Italy or Santorini in Greece. Which type of destination do you prefer?"},
    ],
]
def generate_user_utt(msgs, model, tokenizer):
    """Generate the next user utterance for each conversation in `msgs`."""
    with torch.no_grad():
        # Render each conversation with the chat template, then batch-encode.
        input_text = tokenizer.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
        inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding=True, max_length=4096).to(model.device)
        input_ids = inputs["input_ids"]
        outputs = model.generate(
            **inputs,
            pad_token_id=tokenizer.eos_token_id,
            eos_token_id=tokenizer.eos_token_id,
            max_new_tokens=4096,
            repetition_penalty=1.2,
        )
        # Keep only the newly generated tokens, dropping the echoed prompt.
        generated_ids = [output[len(input_):] for input_, output in zip(input_ids, outputs)]
        response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
    return response
# Generate responses for the provided messages
responses = generate_user_utt(messages, model, tokenizer)
print(f"Generated responses: {responses}")
# >>> Expected Output:
# >>> ['I am thinking about going away for vacation', 'How about phucket thailand']
# >>> The first output initiates a topic based on the system-defined user profile.
# >>> The second output extends the conversation based on the given profile and context to mimic the user's behavior.
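In practice, the simulator is usually paired with an assistant model to roll out full multi-turn dialogues. Below is a minimal sketch of such a loop; `assistant_reply` is a hypothetical placeholder for whichever assistant LLM or API you pair the simulator with, and the five-turn limit is arbitrary.

def assistant_reply(history):
    # Hypothetical placeholder: call your assistant LLM or API here and
    # return its response string given the dialogue so far.
    raise NotImplementedError

dialogue = [{"role": "system", "content": system_prompt}]
for _ in range(5):  # arbitrary turn limit for this sketch
    # The simulator produces the next user utterance...
    user_utt = generate_user_utt([dialogue], model, tokenizer)[0]
    dialogue.append({"role": "user", "content": user_utt})
    # ...and the assistant under evaluation responds.
    dialogue.append({"role": "assistant", "content": assistant_reply(dialogue)})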
Citation
If you find this model useful, please cite:
@misc{wang2025knowbettermodelinghumanlike,
  title={Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles},
  author={Kuang Wang and Xianfei Li and Shenghao Yang and Li Zhou and Feng Jiang and Haizhou Li},
  year={2025},
  eprint={2502.18968},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2502.18968},
}
Base model: meta-llama/Meta-Llama-3-8B