Spaces:

Ahmadzei
/

RAG

Runtime error

update 1

57bdca5 over 1 year ago

356 Bytes

	With our sliding window approach, however, there is overlap in
	the tokens we pass to the model at each iteration. We don't want the log-likelihood for the tokens we're just treating
	as context to be included in our loss, so we can set these targets to -100 so that they are ignored. The following
	is an example of how we could do this with a stride of 512.