wangkevin02 committed
Commit 1e9ee59 · verified · 1 Parent(s): 4b8a342

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +10 -2
README.md CHANGED
@@ -13,7 +13,7 @@ base_model:
 
 > **GitHub repository** for exploring the source code and additional resources: https://github.com/wangkevin02/USP
 
-The **Profile Generator** is a model designed to extract and generate detailed user profiles from given dialogues, particularly those simulated by our User Simulator for Reinforcement Learning with Cycle Consistency (RLCC) as described in [our paper](https://tongyi.aliyun.com/qianwen/?sessionId=ea3bbcf36a2346a0a7819b06fcb36a1c#). Built upon the **LLaMA-3-Instruct** architecture, this model has been fine-tuned through knowledge distillation of the user profile generation capabilities of **GPT-4o**. As demonstrated in the table below, the distilled Profile Generator achieves dialogue profile consistency (DPC) nearly equivalent to GPT-4o.
+The **Profile Generator** is a model designed to extract and generate detailed user profiles from given dialogues, particularly those simulated by our User Simulator for Reinforcement Learning with Cycle Consistency (RLCC) as described in [our paper](https://arxiv.org/pdf/2502.18968). Built upon the **LLaMA-3-Instruct** architecture, this model has been fine-tuned through knowledge distillation of the user profile generation capabilities of **GPT-4o**. As demonstrated in the table below, the distilled Profile Generator achieves dialogue profile consistency (DPC) nearly equivalent to GPT-4o.
 
 | Dataset | Profile Source | DP.P | Avg DP.P # Fact | DPR | Avg DPR # Fact | DPC | SC Val.Score |
 | --------- | -------------- | ----- | --------------- | ----- | -------------- | ----- | ------------ |
@@ -173,5 +173,13 @@ print(f"profile:{profile}")
 If you find this model useful, please cite:
 
 ```plaintext
-[Authors], "[Paper Title]," [Venue], [Year], [URL or DOI].
+@misc{wang2025knowbettermodelinghumanlike,
+      title={Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles},
+      author={Kuang Wang and Xianfei Li and Shenghao Yang and Li Zhou and Feng Jiang and Haizhou Li},
+      year={2025},
+      eprint={2502.18968},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2502.18968},
+}
 ```