Memory usage increases with the number of input samples
#18 opened by storm2008
I have many long texts (about 3000 words each), and I use this model to generate their embeddings. The code is as follows:
import csv
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

# load data
datapath = "data/"
filename = "data.csv"
with open(datapath + filename, "r", encoding="gbk") as file:
    csvfile = csv.reader(file)
    data = [tmp[5] for tmp in csvfile]

# load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained('Qwen/gte-Qwen2-1.5B-instruct')
LLMmodel = AutoModel.from_pretrained('Qwen/gte-Qwen2-1.5B-instruct')

# get the embedding of each input
res = torch.Tensor()
for index in range(len(data)):
    currdata = data[index]
    batch_dict = tokenizer(currdata, max_length=8192,
                           padding=True, truncation=True, return_tensors='pt')
    # The problem occurs in the following code:
    # as the index increases, the used memory keeps growing until it runs out.
    outputs = LLMmodel(**batch_dict)
    torch.cuda.empty_cache()  # this call does not solve the problem
    embeddings = outputs.last_hidden_state[:, -1]
    embeddings = F.normalize(embeddings, p=2, dim=1)
    # res saves the result
    if len(res) == 0:
        res = embeddings
    else:
        res = torch.cat((res, embeddings), dim=0)
How can I solve this problem?
Solved. The problem was in res = torch.cat((res, embeddings), dim=0): the embeddings produced by the forward pass still carry the autograd graph, so concatenating them into res keeps every iteration's activations alive and memory grows with the number of samples.
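A minimal sketch of the kind of fix that works here, assuming the same model, tokenizer, and loop as above: run the forward pass under torch.no_grad() (or detach the embeddings before concatenation) so no computation graph is retained across iterations.

# get the embedding of each input without accumulating the autograd graph
res = torch.Tensor()
for index in range(len(data)):
    currdata = data[index]
    batch_dict = tokenizer(currdata, max_length=8192,
                           padding=True, truncation=True, return_tensors='pt')
    with torch.no_grad():  # no graph is built, so memory stays flat across iterations
        outputs = LLMmodel(**batch_dict)
    embeddings = outputs.last_hidden_state[:, -1]
    embeddings = F.normalize(embeddings, p=2, dim=1)
    # embeddings.detach() would also work if gradients were enabled during the forward pass
    if len(res) == 0:
        res = embeddings
    else:
        res = torch.cat((res, embeddings), dim=0)

Appending each embedding to a Python list and calling torch.cat once after the loop is also cheaper than repeated concatenation, though that is a speed issue rather than the cause of the memory growth.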
storm2008 changed discussion status to closed