To convert a NumPy array to a PyTorch tensor you can:
- Use the `from_numpy()` function, for example, `tensor_x = torch.from_numpy(numpy_array)`
- Pass the NumPy array to the `torch.Tensor()` constructor or use the `torch.tensor()` function, for example, `tensor_x = torch.Tensor(numpy_array)` and `tensor_x = torch.tensor(numpy_array)`.
This tutorial will go through the differences between the NumPy array and the PyTorch tensor and how to convert between the two with code examples.
What is a NumPy Array?
A NumPy array is a grid of values containing information about the raw data, how to locate an element, and how to interpret an element. We can access the grid of elements using indexing, slicing and iterating, like ordinary Python lists. The elements of an array must be of the same type, referred to as the array dtype.
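As a quick sketch of these properties, the following minimal example (with illustrative variable names) creates a two-dimensional array and accesses its elements by indexing and slicing:

```python
import numpy as np

# A 2-D grid of values; every element shares one dtype
a = np.array([[1, 2, 3], [4, 5, 6]])

print(a.dtype)   # the shared element type, e.g. int64 on most 64-bit platforms
print(a[0, 1])   # indexing a single element -> 2
print(a[:, 2])   # slicing the third column -> [3 6]
```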
What is a PyTorch Tensor?
In mathematical terms, a scalar has zero dimensions, a vector has one dimension, a matrix has two dimensions, and tensors have three or more dimensions.
Generally, a tensor can be any n-dimensional array.
Specifically, a `torch.Tensor` is a multi-dimensional matrix containing elements of a single data type. We can access the elements of a tensor using indexing, slicing, and iterating.
What is the Difference Between a NumPy Array and a PyTorch Tensor?
NumPy's array is the core data structure of the library and is designed to support fast and scalable mathematical operations. PyTorch tensors are similar to arrays, but we can also operate on tensors using GPUs. PyTorch tensors are better suited to deep learning, which requires matrix multiplication and derivative computations. When creating a PyTorch tensor, the factory function accepts two additional arguments:
- `device`: whether the computation happens on the CPU or the GPU
- `requires_grad`: if `True`, record the operations performed on the tensor
The PyTorch tensor has an API very similar to that of a NumPy array.
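As a minimal sketch of these two arguments (choosing the device at runtime, since a CUDA GPU may not be available):

```python
import torch

# requires_grad=True records operations so gradients can be computed
t = torch.tensor([1.0, 2.0], requires_grad=True)
loss = (t * t).sum()   # a simple scalar function of t
loss.backward()        # compute d(loss)/dt = 2 * t
print(t.grad)          # tensor([2., 4.])

# device selects where the tensor lives; fall back to CPU if no GPU
device = "cuda" if torch.cuda.is_available() else "cpu"
t2 = torch.tensor([1.0, 2.0], device=device)
print(t2.device)
```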
Convert NumPy Array to PyTorch Tensor
Let's look at how to convert a NumPy array to a PyTorch tensor using the `from_numpy()` function, the `Tensor()` constructor, and the `tensor()` function:

```python
import torch
import numpy as np

np_array = np.array([2, 4, 6, 8, 10, 12])

tensor_x = torch.from_numpy(np_array)
tensor_y = torch.Tensor(np_array)
tensor_z = torch.tensor(np_array)

print(tensor_x)
print(tensor_y)
print(tensor_z)
```

```
tensor([ 2,  4,  6,  8, 10, 12])
tensor([ 2.,  4.,  6.,  8., 10., 12.])
tensor([ 2,  4,  6,  8, 10, 12])
```
The `from_numpy()` and `tensor()` functions preserve the dtype of the original NumPy array. For example, starting with an array of integers, the dtype will be `int64`:

```python
print(np_array.dtype)
```

```
int64
```
If we print the dtype of all three tensors, we will find that `tensor_x` and `tensor_z` retain the dtype of the NumPy array, cast to PyTorch's variant `torch.int64`, whereas `tensor_y` casts the values in the array to floats (the `Tensor()` constructor returns a `torch.float32` tensor by default).

```python
print(tensor_x.dtype)
print(tensor_y.dtype)
print(tensor_z.dtype)
```

```
torch.int64
torch.float32
torch.int64
```
Casting PyTorch Tensor to a Different dtype
We can specify the dtype using the `tensor()` function, but not with `from_numpy()` or the `Tensor()` constructor:

```python
tensor_z = torch.tensor(np_array, dtype=torch.float64)
print(tensor_z)
```

```
tensor([ 2.,  4.,  6.,  8., 10., 12.], dtype=torch.float64)
```
Once you create the tensor, you can cast it to a specific data type regardless of the conversion method. For example, we can convert the tensor made using `from_numpy()` to float using the built-in `float()` method:

```python
tensor_x = torch.from_numpy(np_array)
print(tensor_x.dtype)

tensor_x = tensor_x.float()
print(tensor_x.dtype)
```

```
torch.int64
torch.float32
```
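More generally, any tensor can be cast with the `to()` method, which accepts a target dtype; a minimal sketch:

```python
import torch

t = torch.tensor([2, 4, 6])
print(t.dtype)                 # torch.int64

t_float = t.to(torch.float32)  # general-purpose cast to any dtype
print(t_float.dtype)           # torch.float32
```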
Convert PyTorch Tensor to NumPy Array
A CPU tensor can share its underlying memory buffer with a NumPy array. We can convert a PyTorch tensor to a NumPy array by exposing that underlying data with the `numpy()` method. If the tensor is on the CPU and does not track gradients, we can call `numpy()` directly, for example:

```python
tensor_a = torch.tensor([1, 3, 5, 7, 9])
array_a = tensor_a.numpy()
print(array_a)
```

```
[1 3 5 7 9]
```
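Note that `from_numpy()` and `numpy()` share memory between the tensor and the array, so an in-place change to one is visible in the other; a small sketch:

```python
import numpy as np
import torch

arr = np.array([1, 2, 3])
t = torch.from_numpy(arr)   # t shares arr's memory buffer

arr[0] = 99                 # modify the array in place
print(t)                    # the tensor sees the change: tensor([99,  2,  3])
```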
Convert PyTorch Tensor with Gradients to NumPy Array
If you have set `requires_grad` to `True` when creating the tensor, you cannot just call the `numpy()` method. The tensor has a record of the calculated gradients, and you have to detach the tensor from the gradient computation using the `detach()` method. Let's see what happens if we call `numpy()` on a tensor that requires gradients:

```python
tensor_a = torch.tensor([1, 3, 5, 7, 9], dtype=torch.float32, requires_grad=True)
array_a = tensor_a.numpy()
print(array_a)
```

```
RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.
```
Now let's correctly use `detach()` before calling `numpy()`:

```python
tensor_a = torch.tensor([1, 3, 5, 7, 9], dtype=torch.float32, requires_grad=True)
array_a = tensor_a.detach().numpy()
print(array_a)
```

```
[1. 3. 5. 7. 9.]
```
Convert PyTorch Tensor on GPU with Gradients to NumPy Array
If you have a tensor on the GPU, you cannot access the underlying NumPy array directly, because NumPy arrays reside in CPU memory, not on the GPU. We have to detach the tensor from the gradients, transfer it to the CPU, and then call the `numpy()` method. Let's look at an example:

```python
tensor_a = torch.tensor([1, 3, 5, 7, 9], dtype=torch.float32, requires_grad=True).cuda()
array_a = tensor_a.detach().cpu().numpy()
print(array_a)
```

```
[1. 3. 5. 7. 9.]
```
Note that you need to have PyTorch installed with CUDA enabled in order to create a tensor on the GPU.
Summary
Congratulations on reading to the end of this tutorial! We have gone through the differences between a NumPy array and a PyTorch tensor and how to convert between the two.
For further reading on NumPy, go to the article: Python How to Replace Negative Value with Zero in Numpy Array
For further reading on PyTorch, go to the article: PyTorch Cat Vs Stack Explained.
To learn more about Python for data science and machine learning, go to the online courses page on Python for the most comprehensive courses available.
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.