
How to Convert NumPy Array to PyTorch Tensor


To convert a NumPy array to a PyTorch tensor you can:

  • Use the from_numpy() function, for example, tensor_x = torch.from_numpy(numpy_array)
  • Pass the NumPy array to the torch.Tensor() constructor or to the torch.tensor() function, for example, tensor_x = torch.Tensor(numpy_array) or tensor_x = torch.tensor(numpy_array).

This tutorial will go through the differences between the NumPy array and the PyTorch tensor and how to convert between the two with code examples.


What is a NumPy Array?

A NumPy array is a grid of values containing information about the raw data, how to locate an element, and how to interpret an element. We can access the grid of elements using indexing, slicing and iterating, like ordinary Python lists. The elements of an array must be of the same type, referred to as the array dtype.
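
As a minimal sketch of the points above, the following example builds a small two-dimensional array and accesses it by indexing, slicing, and iterating. The variable names are illustrative only, and the dtype is set explicitly so the result does not depend on the platform default.

```python
import numpy as np

# A 2-D array; every element shares one dtype (set explicitly to int64 here).
grid = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print(grid.dtype)   # int64
print(grid.shape)   # (2, 3)

first_row = grid[0]         # indexing: the row 1, 2, 3
last_column = grid[:, -1]   # slicing: the column 3, 6

# Iterating over a 2-D array yields its rows one at a time.
row_sums = [int(row.sum()) for row in grid]
print(row_sums)     # [6, 15]
```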

What is a PyTorch Tensor?

In mathematical terms, a scalar has zero dimensions, a vector has one dimension, a matrix has two dimensions and tensors have three or more dimensions.

Generally, a tensor can be any n-dimensional array.

Specifically, a torch.Tensor is a multi-dimensional matrix containing elements of a single data type. We can access the elements of a tensor using indexing, slicing, and iterating.
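
To make the dimension terminology concrete, here is a short sketch that creates tensors with zero to three dimensions and inspects them. The names scalar, vector, matrix, and cube are chosen for illustration.

```python
import torch

scalar = torch.tensor(3.0)                        # 0 dimensions
vector = torch.tensor([1.0, 2.0, 3.0])            # 1 dimension
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # 2 dimensions
cube = torch.zeros(2, 3, 4)                       # 3 dimensions

# ndim reports the number of dimensions; every element shares one dtype.
for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, t.ndim, tuple(t.shape), t.dtype)

# Indexing and slicing work much as they do for NumPy arrays.
second_row = matrix[1]
first_column = matrix[:, 0]
```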

What is the Difference Between a NumPy Array and a PyTorch Tensor?

NumPy’s array is the core data structure of the library and is designed to support fast and scalable mathematical operations. PyTorch tensors are similar to arrays, but we can also operate on tensors using GPUs. PyTorch tensors are better suited to deep learning, which relies heavily on matrix multiplication and derivative computations. When creating a PyTorch tensor, we can pass two additional arguments:

  • device: whether the tensor is stored, and its computations run, on the CPU or a GPU
  • requires_grad: if True, autograd records the operations performed on the tensor so that gradients can be computed

The PyTorch tensor has an API very similar to NumPy array.
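
The two arguments can be sketched as follows. This example assumes a GPU may or may not be present, so it falls back to the CPU when CUDA is unavailable. The function y = sum(x ** 2) is chosen only for illustration.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# requires_grad=True asks autograd to record operations on x.
x = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)

y = (x ** 2).sum()   # a scalar computed from x
y.backward()         # computes dy/dx and stores it in x.grad

print(x.device)
print(x.grad)        # dy/dx = 2 * x, i.e. the values 2., 4., 6.
```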

Convert NumPy Array to PyTorch Tensor

Let’s look at how to convert a NumPy array to a PyTorch tensor using the from_numpy() function, the Tensor() constructor, and the tensor() function:

import torch
import numpy as np

np_array = np.array([2, 4, 6, 8, 10, 12])

tensor_x = torch.from_numpy(np_array)

tensor_y = torch.Tensor(np_array)

tensor_z = torch.tensor(np_array)

print(tensor_x)

print(tensor_y)

print(tensor_z)
tensor([ 2,  4,  6,  8, 10, 12])
tensor([ 2.,  4.,  6.,  8., 10., 12.])
tensor([ 2,  4,  6,  8, 10, 12])

The from_numpy() and tensor() functions preserve the dtype of the original NumPy array. In this example, the array of integers has the dtype int64, which is NumPy's default integer type on most 64-bit platforms:

print(np_array.dtype)
int64

If we print the dtype of all three tensors, we find that tensor_x and tensor_z retain the dtype of the NumPy array, expressed as PyTorch’s equivalent torch.int64, whereas tensor_y converts the values to floats, because the Tensor() constructor creates a tensor of the default dtype (torch.float32 unless it has been changed).

print(tensor_x.dtype)

print(tensor_y.dtype)

print(tensor_z.dtype)
torch.int64
torch.float32
torch.int64
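
Besides dtype handling, the conversion methods differ in whether they copy the data. According to the PyTorch documentation, from_numpy() returns a tensor that shares memory with the array, while tensor() always copies. The following sketch, with illustrative variable names, shows the practical effect:

```python
import numpy as np
import torch

source = np.array([2, 4, 6], dtype=np.int64)

shared = torch.from_numpy(source)   # shares memory with source
copied = torch.tensor(source)       # makes an independent copy

# Modify the NumPy array in place after the conversion.
source[0] = 100

print(shared)   # tensor([100,   4,   6]): the change is visible
print(copied)   # tensor([2, 4, 6]): the copy is unaffected
```

Sharing memory avoids a copy for large arrays, but it means that in-place changes on either side affect both objects. Use tensor() when an independent copy is needed.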

Casting PyTorch Tensor to a Different dtype

We can specify the dtype using the tensor() function, but not from_numpy() or Tensor():

tensor_z = torch.tensor(np_array, dtype=torch.float64)

print(tensor_z)
tensor([ 2.,  4.,  6.,  8., 10., 12.], dtype=torch.float64)

Once you create the tensor, you can cast it to a specific data type regardless of the conversion method. For example, we can convert the tensor made using from_numpy() to float32 using the tensor's float() method:

tensor_x = torch.from_numpy(np_array)
print(tensor_x.dtype)
tensor_x = tensor_x.float()
print(tensor_x.dtype)
torch.int64
torch.float32
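
The float() method always targets float32. As a sketch of two alternatives, the to() method accepts any target dtype, and the array can also be cast on the NumPy side with astype() before conversion. The variable names below are illustrative.

```python
import numpy as np
import torch

np_array = np.array([2, 4, 6, 8, 10, 12], dtype=np.int64)

# Cast after conversion with to(), which accepts any torch dtype.
as_double = torch.from_numpy(np_array).to(torch.float64)
as_int32 = torch.from_numpy(np_array).to(torch.int32)

# Cast before conversion on the NumPy side with astype().
as_float = torch.from_numpy(np_array.astype(np.float32))

print(as_double.dtype)   # torch.float64
print(as_int32.dtype)    # torch.int32
print(as_float.dtype)    # torch.float32
```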

Convert PyTorch Tensor to NumPy Array

PyTorch tensors are not built on top of NumPy arrays, but a tensor on the CPU can expose its data as a NumPy array through the numpy() method. The returned array shares memory with the tensor. If your tensor is on the CPU and does not require gradients, you can call numpy() directly, for example:

tensor_a = torch.tensor([1, 3, 5, 7, 9])

array_a = tensor_a.numpy()

print(array_a)
[1 3 5 7 9]
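
The array returned by numpy() shares memory with the CPU tensor, so in-place changes to one are visible in the other. A minimal sketch, with illustrative names, that also shows how to take an independent copy:

```python
import torch

tensor_b = torch.tensor([1, 3, 5, 7, 9])

array_view = tensor_b.numpy()          # shares memory with tensor_b
array_copy = tensor_b.numpy().copy()   # independent NumPy copy

# Modify the tensor in place after the conversion.
tensor_b[0] = -1

print(array_view)   # first element is now -1
print(array_copy)   # still holds the original values 1, 3, 5, 7, 9
```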

Convert PyTorch Tensor with Gradients to NumPy Array

If you set requires_grad to True when creating the tensor, you cannot call numpy() on it directly. The tensor is tracked by autograd, so you first have to detach it from the computation graph using the detach() method, which returns a tensor that shares the same data but does not require gradients. Let’s see what happens if you call numpy() directly on a tensor that requires gradients:

tensor_a = torch.tensor([1, 3, 5, 7, 9], dtype=torch.float32, requires_grad=True)

array_a = tensor_a.numpy()

print(array_a)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-14-ffe330eca90f> in <module>
      1 tensor_a = torch.tensor([1, 3, 5, 7, 9], dtype=torch.float32, requires_grad=True)
      2 
----> 3 array_a = tensor_a.numpy()
      4 
      5 print(array_a)

RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.

Now let’s correctly use detach() before calling numpy():

tensor_a = torch.tensor([1, 3, 5, 7, 9], dtype=torch.float32, requires_grad=True)

array_a = tensor_a.detach().numpy()

print(array_a)
[1. 3. 5. 7. 9.]
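
Calling detach() does not change the original tensor, which keeps requires_grad=True and any gradients already computed. The sketch below illustrates this with a hypothetical weights tensor and a toy loss:

```python
import torch

weights = torch.tensor([1.0, 3.0, 5.0], requires_grad=True)

loss = (weights * 2).sum()   # toy loss, for illustration only
loss.backward()              # populates weights.grad

# detach() returns a new tensor; weights itself is still tracked.
snapshot = weights.detach().numpy()

# The stored gradient does not require grad, so numpy() works directly.
grads = weights.grad.numpy()

print(weights.requires_grad)   # True
print(snapshot)                # [1. 3. 5.]
print(grads)                   # [2. 2. 2.]
```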

Convert PyTorch Tensor on GPU with Gradients to NumPy Array

If you have a tensor on the GPU, you cannot convert it directly, because NumPy arrays reside in CPU memory, not on the GPU. We have to detach the tensor from the computation graph, copy it to the CPU with cpu(), and then call numpy(). Let’s look at an example:

tensor_a = torch.tensor([1, 3, 5, 7, 9], dtype=torch.float32, requires_grad=True).cuda()

array_a = tensor_a.detach().cpu().numpy()

print(array_a)
[1. 3. 5. 7. 9.]

Note that you need to have PyTorch installed with CUDA enabled in order to create a tensor on the GPU.
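
Because the chain detach().cpu().numpy() also works for tensors that are already on the CPU, it can be wrapped in a small helper that runs with or without a GPU. The function name to_numpy is hypothetical and not part of PyTorch. This is a sketch that assumes CUDA may be unavailable:

```python
import numpy as np
import torch


def to_numpy(tensor: torch.Tensor) -> np.ndarray:
    """Hypothetical helper: convert any tensor to a NumPy array."""
    # detach() drops autograd tracking; cpu() is a no-op for CPU tensors.
    return tensor.detach().cpu().numpy()


# Pick the GPU when available so the example runs on any machine.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tensor_a = torch.tensor([1.0, 3.0, 5.0], requires_grad=True, device=device)
array_a = to_numpy(tensor_a)

print(type(array_a))
print(array_a)
```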

Summary

Congratulations on reading to the end of this tutorial! We have gone through the differences between a NumPy array and a PyTorch tensor and how to convert between the two.

For further reading on NumPy, go to the article: Python How to Replace Negative Value with Zero in Numpy Array.

For further reading on PyTorch, go to the article: PyTorch Cat Vs Stack Explained.

To learn more about Python for data science and machine learning, go to the online courses page on Python for the most comprehensive courses available.

Have fun and happy researching!