Tensors in PyTorch: Part-1


Overview of Tensors in PyTorch

In our first blog, we introduced the exciting field of computer vision and discussed why PyTorch is an excellent choice for developing machine learning models. In the second blog, we walked through the steps of setting up a development environment, ensuring your system is ready for hands-on experimentation. Now that we have our system set up, we are ready to dive into the fundamental data structure in PyTorch: Tensors.

What are Tensors?

Tensors are the core data structure used in PyTorch and many other deep learning frameworks. They are generalizations of matrices that can represent data in any number of dimensions. In simpler terms, a tensor is a container that stores data as an n-dimensional array:

  • Scalar (0-D tensor): A single number. For example, 5

  • Vector (1-D tensor): An array of numbers. For example, [1, 2, 3]

  • Matrix (2-D tensor): A 2D array of numbers. For example, [[1, 2], [3, 4]]

  • 3-D tensor and higher dimensions: Arrays of numbers with more than two dimensions. For example, a 3-D tensor might look like [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]


Why Tensors?

Tensors are designed to efficiently store and manipulate large amounts of data, making them ideal for the complex calculations involved in deep learning. Here are some key reasons why tensors are crucial:

  • Flexibility and Efficiency: Tensors can represent various types of data, from images and text to time series and more. They are optimized for performance, leveraging hardware acceleration (like GPUs) for faster computations.

  • Interoperability: Tensors can easily be converted between different deep learning frameworks and other numerical libraries, ensuring smooth workflow transitions.

  • Support for Automatic Differentiation: PyTorch’s autograd system uses tensors to automatically compute gradients, which are essential for training neural networks.


Creating Tensors in PyTorch:

The first step is to import the PyTorch module. Additionally, we'll import the math module to facilitate some of the examples:

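A minimal sketch of these imports:

```python
import torch  # the core PyTorch package
import math   # standard library; handy for constants like math.pi in later examples
```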

1. Create empty Tensor:

The simplest way to create a tensor is with torch.empty():

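A sketch of that call (the variable name is illustrative):

```python
import torch

# Allocate a 2-dimensional tensor with 3 rows and 4 columns.
# The memory is reserved but NOT initialized to any particular values.
x = torch.empty(3, 4)
print(type(x))  # <class 'torch.Tensor'>
print(x)        # contents are whatever happened to be in memory
```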

Let’s break down what we just did:

  • We created a tensor using one of the many factory methods provided by the torch module.

  • The tensor is 2-dimensional, with 3 rows and 4 columns.

  • The returned object is of type torch.Tensor, which is an alias for torch.FloatTensor.

  • By default, PyTorch tensors are populated with 32-bit floating point numbers.

  • When you print your tensor, you may see random-looking values.


The torch.empty() call allocates memory for the tensor but does not initialize it with any values, so what you are seeing is whatever was in memory at the time of allocation.

2. Initialize Tensor with some value:

Often you will want to initialize your tensor with specific values. Common cases are:

  • all zeros

  • all ones

  • all random values

The torch module provides functions for all of these:

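These factory functions can be used like this (the 2x3 shape is illustrative):

```python
import torch

zeros = torch.zeros(2, 3)  # every element is 0.0
ones = torch.ones(2, 3)    # every element is 1.0

torch.manual_seed(1729)    # seed the RNG so the random values are reproducible
random = torch.rand(2, 3)  # uniform random values in [0, 1)

print(zeros)
print(ones)
print(random)
```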

When working with random tensors, you might have noticed the use of torch.manual_seed(). Initializing tensors with random values is a common practice, especially when setting a model's learning weights. However, in research and other scenarios, reproducibility is crucial. Manually setting the random number generator’s (RNG) seed ensures that your results can be replicated. Here's a closer look:

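A sketch of the experiment: draw two random tensors, reset the seed, and draw two more.

```python
import torch

torch.manual_seed(1729)
random1 = torch.rand(2, 3)
random2 = torch.rand(2, 3)

torch.manual_seed(1729)     # resetting the seed replays the same random sequence
random3 = torch.rand(2, 3)
random4 = torch.rand(2, 3)

print(random1)
print(random2)
print(random3)
print(random4)
```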

As you can see, random1 and random3 contain identical values, and so do random2 and random4. This happens because manually setting the RNG’s seed with
torch.manual_seed(1729) resets it, ensuring that subsequent computations dependent on random numbers will yield identical results, thus ensuring reproducibility.

Tensors Shape in PyTorch:

When performing operations on two or more tensors, it's often necessary for them to have the same shape—that is, the same number of dimensions and the same number of elements in each dimension. PyTorch provides the torch.*_like() methods to help with this:

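A sketch of these shape-matching helpers (the 2x2x3 shape follows the discussion below):

```python
import torch

x = torch.empty(2, 2, 3)
print(x.shape)  # torch.Size([2, 2, 3])

# Each *_like() call returns a new tensor with the same shape as x,
# differing only in how its values are initialized.
empty_like_x = torch.empty_like(x)
zeros_like_x = torch.zeros_like(x)
ones_like_x = torch.ones_like(x)
rand_like_x = torch.rand_like(x)
print(rand_like_x.shape)  # torch.Size([2, 2, 3])
```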

In the code above, the torch.empty(2, 2, 3) call creates a three-dimensional tensor with a shape of 2x2x3. The .shape property of a tensor contains a list representing the size of each dimension. This is useful for verifying the dimensions of tensors.

The methods .empty_like(), .zeros_like(), .ones_like(), and .rand_like() create tensors with the same shape as the original tensor x. Each method returns a tensor of identical dimensionality and extent, but with different initial values.

Another way to create a tensor is to directly use data from a Python collection:

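For example (the variable names and values here are illustrative):

```python
import torch

# From a nested list of floats
some_constants = torch.tensor([[3.14159, 2.71828], [1.61803, 0.00729]])

# From a tuple of ints -- integer data defaults to torch.int64
some_integers = torch.tensor((2, 3, 5, 7, 11, 13, 17, 19))

# Nested collections (tuples and lists can be mixed) give a multi-dimensional tensor
more_integers = torch.tensor(((2, 4, 6), [3, 6, 9]))

print(some_constants)
print(some_integers)
print(more_integers)
```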

Using torch.tensor() is the most straightforward way to create a tensor if you already have data in a Python tuple or list. Nesting these collections results in a multi-dimensional tensor.

Note: torch.tensor() creates a copy of the data, ensuring that changes to the original data do not affect the tensor and vice versa.

Tensors Data Type:

One of the simplest ways to set the data type of a tensor is by using an optional argument during its creation. For instance, in the first line of the example below, we set dtype=torch.int16 for the tensor a. When we print a, we see it contains 1 instead of 1.0, indicating that it’s an integer type rather than a floating point.

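A minimal sketch of that first line:

```python
import torch

# dtype is set at creation time; the shape is passed as a tuple
a = torch.ones((2, 3), dtype=torch.int16)
print(a)  # elements print as 1 (not 1.0), and the dtype is shown explicitly
```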

Notice that when printing a, the output explicitly specifies its dtype, unlike when the dtype is left as the default (32-bit floating point).

Another detail to observe is how we specify the tensor’s shape. Instead of listing the dimensions as separate integer arguments, we grouped them in a tuple. While not strictly necessary—PyTorch can infer the shape from a series of integers—using a tuple can make your code more readable, especially when adding optional arguments.

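A sketch of the conversion (scaling by 20 is an assumption, so the floats have a nonzero integer part):

```python
import torch

b = torch.rand((2, 3), dtype=torch.float64) * 20.0  # random floats in [0, 20)
c = b.to(torch.int32)                               # same values, truncated to integers
print(b)
print(c)
```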

Another way to set the data type is with the .to() method. In the example above, we first create a random floating point tensor b. We then convert b to a 32-bit integer tensor c using .to(). As you can see, c contains the same values as b, but truncated to integers.

By setting the data type either during tensor creation or later with the .to() method, you can ensure your tensors have the appropriate data types for your specific needs. This flexibility is crucial for optimizing performance and compatibility within your deep learning models.

PyTorch has twelve different data types:

  • torch.bool

  • torch.int8

  • torch.uint8

  • torch.int16

  • torch.int32

  • torch.int64

  • torch.float16 (torch.half)

  • torch.float32 (torch.float)

  • torch.float64 (torch.double)

  • torch.bfloat16

  • torch.complex64

  • torch.complex128

Explore Tensors in PyTorch:

When you have a tensor and want to know more about it, you can simply print it using print(). However, for large tensors, it's often more practical to check its dimensions and properties.

To check a tensor's shape, you can use the .shape property or the .size() function.

If you want to know the number of dimensions a tensor has, use the .ndim property.

Using the len() function on a tensor will only give you the size of the first dimension.

Another important property of a tensor is its data type. In deep learning, floating-point numbers are commonly used, but sometimes you might need integers (e.g., for image pixel values). To check a tensor's data type, use the .dtype property.

If you need to change the data type, you can recreate the tensor with a new type.

For example:

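The inspection tools above can be sketched as follows (the tensor itself is illustrative):

```python
import torch

x = torch.rand(3, 4)

print(x.shape)   # torch.Size([3, 4])
print(x.size())  # same information as .shape
print(x.ndim)    # 2 -- the number of dimensions
print(len(x))    # 3 -- only the size of the first dimension
print(x.dtype)   # torch.float32, the default

# Recreate the tensor with a new data type
y = torch.tensor(x.tolist(), dtype=torch.float64)
print(y.dtype)   # torch.float64
```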

By using these properties and functions, you can easily inspect and manipulate tensors to fit your needs in deep learning projects.

Conclusion

Tensors are the fundamental building blocks in PyTorch, offering a versatile way to handle and manipulate data. From understanding basic operations to managing tensor shapes, mastering tensors is crucial for efficient and effective use of PyTorch in your deep learning projects.

Stay tuned for the next installment, where we will begin a practical guide to setting up your first computer vision project with PyTorch in "Understanding Tensors in PyTorch: Part-2".


Written By

Impetus Ai Solutions

Impetus is a pioneer in AI and ML, specializing in developing cutting-edge solutions that drive innovation and efficiency. Our expertise extends to product engineering, warranty management, and building robust cloud infrastructures. We leverage advanced AI and ML techniques to provide state-of-the-art technological and IT-related services, ensuring our clients stay ahead in the digital era.
