- Yes, it should be language-agnostic.
- You would need to repeat the fine-tuning of your model, this time in a way that preserves timestamps. If your target data contains timestamps, you can simply keep training on them. If it doesn't, you can try training with LoRA: LoRA reduces catastrophic forgetting, so even without timestamps in the fine-tuning data, the model retains its ability to make timestamped predictions (see the sketch below). You can find a guide on LoRA fine-tuning with the PEFT library here. Note that you want to run inference in half/full precision (not 8-bit), as outlined here.
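As a rough illustration, here is a minimal sketch of what that LoRA setup could look like with PEFT. The checkpoint name, adapter rank, and target modules are placeholder assumptions, not values taken from the linked guide:

```python
# Minimal sketch of a LoRA fine-tuning setup for Whisper with PEFT.
# Checkpoint, rank, and target modules are illustrative assumptions.
import torch
from transformers import WhisperForConditionalGeneration
from peft import LoraConfig, get_peft_model

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# Wrap the frozen base model with low-rank adapters: only the adapter
# weights are updated, which is why capabilities learned in pre-training
# (like timestamp prediction) are largely preserved.
lora_config = LoraConfig(
    r=32,                                 # assumed adapter rank
    lora_alpha=64,
    target_modules=["q_proj", "v_proj"],  # attention projections, a common choice
    lora_dropout=0.05,
    bias="none",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train

# At inference time, run in half precision rather than 8-bit:
model = model.half().to("cuda") if torch.cuda.is_available() else model
```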
Note that the original post is a hypothesis for why timestamps reduce hallucinations. It would need to be tested and evaluated to confirm whether these findings hold more generally!