pkoloveas committed on
Commit
03ec37a
·
verified ·
1 Parent(s): 79c16d4

Update README.md

Files changed (1)
  1. README.md +70 -1
README.md CHANGED
@@ -33,7 +33,76 @@ A fine-tuned model for Citation Intent Classification, based on [Qwen 2.5 14B In
  ## Quickstart

  ```python
- # TODO
  ```

  Details about the system prompts and query templates can be found in the paper.
 
  ## Quickstart

  ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "Qwen/Qwen2.5-14B-Instruct"  # base model ID from the original snippet; swap in this fine-tuned model's repo ID
+
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ system_prompt = """
+ # CONTEXT #
+ You are an expert researcher tasked with classifying the intent of a citation in a scientific publication.
+
+ ########
+
+ # OBJECTIVE #
+ You will be given a sentence containing a citation. You must classify the intent of the citation by assigning it to one of three predefined classes.
+
+ ########
+
+ # CLASS DEFINITIONS #
+ The three (3) possible classes are the following: "background information", "method", "results comparison."
+
+ 1 - background information: The citation states, mentions, or points to the background information giving more context about a problem, concept, approach, topic, or importance of the problem in the field.
+ 2 - method: Making use of a method, tool, approach, or dataset.
+ 3 - results comparison: Comparison of the paper’s results/findings with the results/findings of other work.
+
+ ########
+
+ # RESPONSE RULES #
+ - Analyze only the citation marked with the @@CITATION tag.
+ - Assign exactly one class to each citation.
+ - Respond only with the exact name of one of the following classes: "background information", "method", or "results comparison".
+ - Do not provide any explanation or elaboration.
+ """
+
+ test_citing_sentence = "Activated PBMC are the basis of the standard PBMC blast assay for HIV-1 neutralization, whereas the various GHOST and HeLa cell lines have all been used in neutralization assays @@CITATION@@."
+
+ user_prompt = f"""
+ {test_citing_sentence}
+ ### Question: Which is the most likely intent for this citation?
+ a) background information
+ b) method
+ c) results comparison
+ ### Answer:
+ """
+
+ messages = [
+     {"role": "system", "content": system_prompt},
+     {"role": "user", "content": user_prompt}
+ ]
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=512
+ )
+ generated_ids = [
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+ ]
+
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ # Response: method
  ```

  Details about the system prompts and query templates can be found in the paper.
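
The Quickstart hard-codes one citing sentence and reads the label straight from the decoded output. To classify other sentences, the query template can be wrapped in a function and the raw response normalized to one of the three class names. The following is a minimal sketch, not part of the committed README: `LABELS`, `build_user_prompt`, and `parse_label` are hypothetical helper names, and the template text is copied from the snippet above on the assumption that it should be reused unchanged.

```python
# Hypothetical helpers (not in the original README); names are illustrative.
LABELS = ("background information", "method", "results comparison")


def build_user_prompt(citing_sentence: str) -> str:
    """Wrap a citing sentence in the multiple-choice query template from the Quickstart."""
    return (
        f"\n{citing_sentence}\n"
        "### Question: Which is the most likely intent for this citation?\n"
        "a) background information\n"
        "b) method\n"
        "c) results comparison\n"
        "### Answer:\n"
    )


def parse_label(response: str):
    """Map raw model output to one of LABELS, or None if it is ambiguous."""
    # Drop surrounding whitespace, quotes, and trailing periods, then lowercase.
    cleaned = response.strip().strip('".').lower()
    if cleaned in LABELS:
        return cleaned
    # Fall back to a substring match, e.g. for an answer like "b) method".
    matches = [label for label in LABELS if label in cleaned]
    return matches[0] if len(matches) == 1 else None
```

With these in place, `user_prompt = build_user_prompt(test_citing_sentence)` would stand in for the f-string in the Quickstart, and `parse_label(response)` would return a class name, or `None` when the output matches none or more than one of the classes.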