To deep copy a model in PyTorch, we can either use copy.deepcopy() or create a new instance of the model and copy its parameters with state_dict() and load_state_dict(). Python's 'copy' module provides the deepcopy() function, which creates a deep copy of any Python object, so it is not restricted to PyTorch. The state_dict()/load_state_dict() approach, in contrast, is specific to PyTorch.
Let's understand these two approaches in detail.
Prerequisite/ setup
We need to install PyTorch.
pip install torch
For more details, please visit the PyTorch page to install locally.
Approach 1: Using the copy.deepcopy()
In this approach, we use the deepcopy() function to deep copy a PyTorch model. deepcopy() is available in the Python module 'copy'.
Syntax
copy.deepcopy(model)
Here model is the PyTorch model/ neural network that we want to deep copy.
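To illustrate what "deep" means before applying it to a model, here is a minimal pure-Python sketch. The dictionary and the variable names (original, shallow, deep) are made up for illustration; only copy.copy() and copy.deepcopy() come from the standard library.

```python
import copy

# A nested object: the dict holds a reference to an inner list.
original = {"weights": [1.0, 2.0], "name": "layer"}

shallow = copy.copy(original)    # new dict, but it shares the inner list
deep = copy.deepcopy(original)   # new dict and a new inner list

# Mutate the inner list through the original object.
original["weights"][0] = 99.0

print(shallow["weights"])  # [99.0, 2.0] (the change is visible)
print(deep["weights"])     # [1.0, 2.0]  (the deep copy is unaffected)
```

A deep copy of a PyTorch model behaves like the 'deep' case: the copied model gets its own parameter tensors.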
Steps
1. Import required libraries
2. Define a model/ neural network
3. Create a deep copy using copy.deepcopy()
4. Print both models
1. Import required libraries/ module
The first step is to import the necessary libraries/ modules. Here we use the torch library and the copy module.
import torch
import copy
2. Define a model/ neural network
Now we define a simple model/ neural network. Here we define a linear model with 'in_features' = 2 and 'out_features' = 2.
model = torch.nn.Linear(2, 2)
3. Create a deep copy using copy.deepcopy()
Now use deepcopy() to deep copy the PyTorch model defined above. The deep copy is assigned to 'model_copy'.
model_copy = copy.deepcopy(model)
4. Print both models
In the last step, we print the weights of both models.
print(model.weight)
print(model_copy.weight)
Now look at the complete Python program with all the steps discussed above.
Example 1
# import required lib/module
import torch
import copy
# define a simple model
model = torch.nn.Linear(2, 2)
# create a deepcopy of the above model
model_copy = copy.deepcopy(model)
# Print both models
print(model.weight)
print(model_copy.weight)
Output
Parameter containing:
tensor([[-0.3875,  0.1497],
        [-0.1765, -0.2011]], requires_grad=True)
Parameter containing:
tensor([[-0.3875,  0.1497],
        [-0.1765, -0.2011]], requires_grad=True)
Look at the output: both models have the same weight values. The copy is nevertheless a separate object with its own tensors.
Note: You may get different values of the above tensors as the weights are initialized randomly.
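Printing identical weights does not by itself show that the copy is independent. The following sketch, which is an addition to the example above and not part of it, checks this by modifying the copy in place and comparing again. The variable names same_values, shares_memory and still_equal are chosen only for illustration.

```python
import copy
import torch

model = torch.nn.Linear(2, 2)
model_copy = copy.deepcopy(model)

# Right after copying, the values are identical.
same_values = torch.equal(model.weight, model_copy.weight)

# The two weight tensors live at different memory addresses.
shares_memory = model.weight.data_ptr() == model_copy.weight.data_ptr()

# Change the copy in place (no_grad avoids tracking this as a training op).
with torch.no_grad():
    model_copy.weight.add_(1.0)

# The original model is unaffected, so the weights now differ.
still_equal = torch.equal(model.weight, model_copy.weight)

print(same_values)    # True
print(shares_memory)  # False
print(still_equal)    # False
```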
Approach 2: Using the load_state_dict() and state_dict()
The second, PyTorch-specific approach is to create a new instance of the model and then copy the parameters (weights and biases) into it using state_dict() and load_state_dict(). To create the new instance we use Python's built-in type() function, which returns the class of the model. See the syntax below:
Syntax
model_copy = type(model)(args)
model_copy.load_state_dict(model.state_dict())
Here model is our model and args are the constructor arguments of the model (here 'in_features' and 'out_features'). We must pass these arguments, otherwise Python raises a TypeError (see Example 3).
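The same syntax also applies to user-defined models. The sketch below uses a hypothetical two-layer class, TinyNet, that exists only for this illustration; its constructor arguments (2, 4, 1) are arbitrary. It re-creates the model with the same arguments, loads the state dict, and checks that both models produce the same output for the same input.

```python
import torch

class TinyNet(torch.nn.Module):
    """Hypothetical two-layer model, used only for illustration."""

    def __init__(self, in_features, hidden, out_features):
        super().__init__()
        self.fc1 = torch.nn.Linear(in_features, hidden)
        self.fc2 = torch.nn.Linear(hidden, out_features)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = TinyNet(2, 4, 1)

# Build a fresh instance with the same constructor arguments,
# then copy all parameters from the original model into it.
model_copy = type(model)(2, 4, 1)
model_copy.load_state_dict(model.state_dict())

# With identical parameters, both models map the same input to the same output.
x = torch.randn(3, 2)
outputs_match = torch.allclose(model(x), model_copy(x))
print(outputs_match)  # True
```

Note that PyTorch does not record the constructor arguments for us, so with this approach we have to know them or store them ourselves.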
Steps
1. Import required libraries
2. Define a model/ neural network
3. Get a new instance of the model
4. Copy weights and biases
5. Print both models
1. Import required libraries
We need only torch to perform our task. So import torch.
import torch
2. Define a model/ neural network
Define a simple model as in the first approach:
model = torch.nn.Linear(2, 2)
3. Get a new instance of the model
Now create a new instance of the above model:
model_copy = type(model)(2, 2)
4. Copy weights and biases
Now we copy the parameters (weights and biases) of the model to the above created instance 'model_copy'. To get the parameters of the model, we use model.state_dict().
model_copy.load_state_dict(model.state_dict())
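To make this step more concrete, the following sketch lists what state_dict() contains for the linear model and confirms that load_state_dict() copies the values into the new instance's own tensors. The variable names state, names and independent are assumptions made for this illustration.

```python
import torch

model = torch.nn.Linear(2, 2)

# state_dict() maps parameter names to tensors.
state = model.state_dict()
names = list(state.keys())
for name, tensor in state.items():
    print(name, tuple(tensor.shape))
# weight (2, 2)
# bias (2,)

# load_state_dict() copies the values into the new model's existing tensors,
# so the two models do not share memory afterwards.
model_copy = torch.nn.Linear(2, 2)
model_copy.load_state_dict(state)
independent = model.weight.data_ptr() != model_copy.weight.data_ptr()
print(independent)  # True
```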
5. Print both models
Finally, print the weights of both models:
print(model.weight)
print(model_copy.weight)
Example 2
import torch
# define a simple model
model = torch.nn.Linear(2, 2)
# get a new instance of the model
model_copy = type(model)(2,2)
# copy weights and biases
model_copy.load_state_dict(model.state_dict())
# Print both models
print(model.weight)
print(model_copy.weight)
Output
Parameter containing:
tensor([[-0.0303, -0.6644],
        [ 0.1111,  0.5059]], requires_grad=True)
Parameter containing:
tensor([[-0.0303, -0.6644],
        [ 0.1111,  0.5059]], requires_grad=True)
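Example 3
In this example we omit the constructor arguments when creating the new instance, which raises an error.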
# import required lib/module
import torch
# define a simple model
model = torch.nn.Linear(2, 2)
# get a new instance of the model
model_copy = type(model)()
# copy weights and biases
model_copy.load_state_dict(model.state_dict())
# Print both models
print(model.weight)
print(model_copy.weight)
Output
TypeError: __init__() missing 2 required positional arguments: 'in_features' and 'out_features'
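One possible way to avoid this error for torch.nn.Linear, shown here as a sketch and not as the only option, is to read the constructor arguments back from the existing layer. nn.Linear keeps its sizes in the attributes in_features and out_features. Other model classes may not expose their arguments this way, in which case they have to be stored separately.

```python
import torch

model = torch.nn.Linear(2, 2)

# Reuse the sizes stored on the existing layer instead of hard-coding them.
model_copy = type(model)(
    model.in_features,
    model.out_features,
    bias=model.bias is not None,  # keep the bias setting of the original
)

# Copy weights and biases into the new instance.
model_copy.load_state_dict(model.state_dict())

print(torch.equal(model.weight, model_copy.weight))  # True
print(torch.equal(model.bias, model_copy.bias))      # True
```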
Conclusion
In this post we discussed two approaches to deep copy a model in PyTorch. The first approach is to use the copy.deepcopy() function. The other approach is to first create a new instance of the model and then copy the model parameters (weights and biases) to the created instance using state_dict() and load_state_dict().