This means the _process_model_before_weight_loading method takes care of manipulating the model skeleton to replace some modules (e.g., nn.Linear) with the target modules (quantization modules). You can define a module replacement logic or any other utility method by creating a new file in transformers/src/integrations/ and exposing the relevant methods in that folder's __init__.py file. |