Data Visualization in Google Colaboratory

TIJ Tech Private Limited
7 min read · Mar 9, 2021

For more details and information, please refer to the YouTube video.

Video URL: https://youtu.be/UrroNzXshNw

Introduction

In this tutorial, we’re going to learn more about the Matplotlib library and how to customize its configuration and stylesheets.

Matplotlib is a multi-platform data visualization library built on NumPy arrays. We will import matplotlib.pyplot and NumPy to get started.

Then, we will use the plt.style directive to choose appropriate aesthetic styles for the plots. Here, we will set the classic style, which ensures that the plots we create use the classic Matplotlib style.

By default, a plt plotting command opens a figure window. The %matplotlib inline magic instead sets the backend of Matplotlib to the inline backend, which embeds static images of the plots in the notebook.
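The setup described so far might look like the following minimal sketch. The %matplotlib inline line is kept as a comment because it is a notebook magic rather than Python syntax; in Colab it would go in a notebook cell. The variable name available_styles is only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# In Colab or Jupyter, run this magic in a notebook cell so that plots are
# embedded as static images (it is not valid in a plain .py script):
# %matplotlib inline

# Apply the classic stylesheet to every plot created from here on.
plt.style.use("classic")

# Names of the stylesheets that ship with the installed Matplotlib version.
available_styles = plt.style.available
print(available_styles)
```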

Now we’re going to import cycler. A single-entry cycler object can be used to cycle over a single style property. The cycler() function links a key to a series of values, and the key must be hashable. Here, the values are hexadecimal color codes, each of which represents a color.
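As a rough illustration of what a single-entry cycler looks like, here is a small sketch. The six hex codes and the name palette are assumptions chosen for the example (the first is a red and the second a blue); they are not necessarily the values used in the video.

```python
from cycler import cycler

# Hypothetical palette: the key "color" is linked to a series of hex codes.
palette = cycler("color", ["#EE6666", "#3388BB", "#9988DD",
                           "#EECC55", "#88BB44", "#FFBBBB"])

# Iterating over a cycler yields one dictionary of properties per step.
for step in palette:
    print(step)

# by_key() gives the values back as a plain list per key.
first_two = palette.by_key()["color"][:2]
```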

We do this because each time Matplotlib loads, it defines a runtime configuration (rc) containing the default styles for every plot element you create. The plt.rc() method changes these defaults one group at a time. The ‘axes’ group defines the overall design of the axes of your plot. The group argument can also be a list or tuple of group names; ‘xtick’ and ‘ytick’ control the design of the ticks on the x-axis and y-axis respectively. The ‘patch’ group controls how the edges of filled shapes in the plot look, and the ‘lines’ group sets the properties of plotted lines, such as their width. You can use all of these to customize the style of your plots.
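The rc groups described above could be set roughly as follows. This is a sketch under assumptions: every concrete value (the face color, tick direction, line width, and the palette) is illustrative and not taken from the video.

```python
import matplotlib.pyplot as plt
from cycler import cycler

# Hypothetical palette used as the color cycle for new plots.
colors = cycler("color", ["#EE6666", "#3388BB", "#9988DD",
                          "#EECC55", "#88BB44", "#FFBBBB"])

# 'axes': overall design of the axes, including the color cycle.
plt.rc("axes", facecolor="#E6E6E6", edgecolor="none",
       axisbelow=True, grid=True, prop_cycle=colors)

# A tuple of group names applies the same settings to both tick groups.
plt.rc(("xtick", "ytick"), direction="out", color="gray")

# 'patch': edges of filled shapes such as histogram bars.
plt.rc("patch", edgecolor="#E6E6E6")

# 'lines': properties of plotted lines.
plt.rc("lines", linewidth=2)
```

Each keyword maps to an rcParams entry named group.keyword, so the last call sets plt.rcParams["lines.linewidth"].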

Here, we initialize the variable ‘x’ with samples from a normal distribution with a mean of 0 and a variance of 1, which has no upper or lower limit. We then plot the histogram using the hist() function. The bars use the first color defined in the cycler() function, and the overall layout of the plot is what we customized earlier using the plt.rc() method.
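A minimal sketch of this histogram step follows. It assumes np.random.randn as the sampling function and a sample size of 1000; the text does not name either.

```python
import matplotlib.pyplot as plt
import numpy as np

# Assumption: 1000 samples from the standard normal distribution.
x = np.random.randn(1000)

fig, ax = plt.subplots()
# hist() returns the bar heights, the bin edges, and the bar patches.
counts, bin_edges, patches = ax.hist(x)
```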

Next, we will plot the line graph using random numbers sampled from a uniform distribution over the interval [0, 1), so every value is bounded between 0 and 1. We can see 5 different lines with varying colors that were defined in the cycler() function.

These colors are the first 5 colors that were defined within the function.
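The line graph could be produced along these lines. The use of np.random.rand and 10 points per line are assumptions; the colors come from whichever color cycle is active.

```python
import matplotlib.pyplot as plt
import numpy as np

# Assumption: 5 lines of 10 uniform samples each, all in [0, 1).
samples = np.random.rand(5, 10)

fig, ax = plt.subplots()
for row in samples:
    # Each plot() call takes the next color from the active color cycle.
    ax.plot(row)

line_colors = [line.get_color() for line in ax.lines]
```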

Now we will initialize the variable ‘y’ with the function np.linspace(), which takes 3 parameters: the starting number, the stopping number, and the number of samples. We then plot two waves over these values: the first is drawn in red and the second in blue.

As you can see, these two colors are the first two colors defined in the cycler() function, and the layout is also the same.
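Here is a small sketch of the two-wave plot. The range 0 to 10, the 100 samples, and the choice of sine and cosine are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# np.linspace(start, stop, num): 100 evenly spaced values from 0 to 10.
y = np.linspace(0, 10, 100)

fig, ax = plt.subplots()
ax.plot(y, np.sin(y))  # drawn in the first color of the active cycle
ax.plot(y, np.cos(y))  # drawn in the second color of the active cycle
```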

Next, we will plot a simple bar graph with simple data. The variable ‘axes_part’ holds equidistant positions for the bars along the x-axis. The heights set the values on the y-axis, and the bar names label those positions on the x-axis. The resulting figure looks similar to the histogram.
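A bar graph of this kind might be written as follows. The data values and the names height and bars are hypothetical; only axes_part is named in the text.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data for illustration.
height = [10, 20, 30, 25, 15]
bars = ("A", "B", "C", "D", "E")

# Equidistant positions 0, 1, 2, ... for the bars on the x-axis.
axes_part = np.arange(len(bars))

fig, ax = plt.subplots()
rects = ax.bar(axes_part, height)   # bar heights on the y-axis
ax.set_xticks(axes_part)            # one tick per bar
ax.set_xticklabels(bars)            # label each tick with the bar name
```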

The next figure is a pie diagram. We will use the same simple data that we used for the bar graph.

We will also initialize the variable ‘explode’ and then plot the pie diagram. The plotting function accepts 6 parameters, including the following. The first supplies the values of the variable ‘height’; explode makes a particular slice protrude from the pie diagram; labels provides the label for each slice, matching the bars of the bar graph; autopct automatically converts the values of height into percentages; and startangle starts laying out the slices from an angle of 90°. The axe.axis() method sets an equal aspect ratio, which ensures the pie diagram is drawn as a circle. We can see in the figure that different slices of the pie diagram have different colors. That’s how we can customize the style of different plots using runtime configuration (rc) parameters.
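The pie diagram could be sketched like this, using only the five parameters named above. The data values and the explode offsets are hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical data for illustration.
height = [10, 20, 30, 25, 15]
bars = ("A", "B", "C", "D", "E")

# One offset per slice; only the second slice protrudes.
explode = (0, 0.1, 0, 0, 0)

fig, axe = plt.subplots()
wedges, texts, autotexts = axe.pie(
    height,
    explode=explode,
    labels=bars,
    autopct="%1.1f%%",   # show each slice's share as a percentage
    startangle=90,       # start laying out slices at 90 degrees
)
axe.axis("equal")        # equal aspect ratio so the pie is a circle
```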

Now we’re going to define a function that creates a histogram and a line graph. It first sets the random seed to 0 so that the pseudo-random numbers always start from the same point and the plots are reproducible.

The function creates a figure that is 11 inches wide and 4 inches high. It then draws a histogram of samples generated from the normal distribution, and a line graph with 4 lines of different colors.

So when we call this function, we get two different plots: a histogram and a line graph. We now know that we can customize the styles to our liking. But even if you don’t create your own style, the stylesheets include very useful styles by default.
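A function of this shape might look like the sketch below. The name hist_and_lines, the sample sizes, and the return value are assumptions added for the example.

```python
import matplotlib.pyplot as plt
import numpy as np


def hist_and_lines():
    """Draw a histogram and a four-line graph side by side."""
    # Fixed seed so the pseudo-random data is the same on every call.
    np.random.seed(0)
    # One row, two columns; figsize is (width, height) in inches.
    fig, ax = plt.subplots(1, 2, figsize=(11, 4))
    ax[0].hist(np.random.randn(1000))
    for _ in range(4):
        ax[1].plot(np.random.rand(10))
    return fig, ax


fig, ax = hist_and_lines()
```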

Now we will show some of the available styles and how to use them. First, we have the FiveThirtyEight style, which mimics the graphics found on the popular FiveThirtyEight website.
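One way to try a built-in style without changing the rest of the notebook is the style context manager, sketched below with the fivethirtyeight stylesheet. The plotted data is arbitrary.

```python
import matplotlib.pyplot as plt
import numpy as np

before = plt.rcParams["lines.linewidth"]

# The style applies only inside the with block.
with plt.style.context("fivethirtyeight"):
    fig, ax = plt.subplots()
    ax.plot(np.random.rand(10))

# Outside the block, the previous settings are restored.
after = plt.rcParams["lines.linewidth"]
```

Calling plt.style.use("fivethirtyeight") instead would keep the style active for all later plots in the session.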

Secondly, we have ggplot. Matplotlib’s ggplot style mimics the default styles from the ggplot package in the R language, which is a very popular visualization tool.

This is how it looks.

Thirdly, we have the Bayesian Methods for Hackers style, also known as bmh, which is named after the online book of that title. It gives figures a consistent and visually appealing style.

And this is how it looks.

Next, we have the dark background style. It is useful for figures that will be used in presentations.

This style gives the following look.

Also, we have a grayscale style. If you want figures for a print publication that does not accept color figures, you might find this very useful.

This is how it looks.
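To compare the styles named so far, one option is to loop over their stylesheet names and draw the same data under each. This is a sketch; the figure size, the sample data, and the figures dictionary are assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

# Stylesheet names for the five styles described above.
style_names = ["fivethirtyeight", "ggplot", "bmh",
               "dark_background", "grayscale"]

np.random.seed(0)
data = np.random.randn(200)

figures = {}
for name in style_names:
    # Apply each style temporarily and draw the same histogram.
    with plt.style.context(name):
        fig, ax = plt.subplots(figsize=(4, 3))
        ax.hist(data)
        ax.set_title(name)
    figures[name] = fig
```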

There is also a Seaborn style, which is inspired by the Seaborn library. Older versions of Seaborn applied these styles automatically on import; since Seaborn 0.8, they have to be set explicitly.

This is how the plot looks.
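The name of the Seaborn-inspired stylesheet depends on the Matplotlib version. The sketch below assumes that recent releases use a seaborn-v0_8 prefix while older ones use names such as seaborn-whitegrid, and picks whichever is installed. The whitegrid variant is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import numpy as np

# Candidate names, newest naming scheme first (assumed; check
# plt.style.available on your installation).
candidates = ["seaborn-v0_8-whitegrid", "seaborn-whitegrid"]
style_name = next(
    (name for name in candidates if name in plt.style.available), None
)

if style_name is not None:
    with plt.style.context(style_name):
        fig, ax = plt.subplots()
        ax.plot(np.random.rand(10))
```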

With all of these built-in options for various plot styles, Matplotlib becomes much more useful for both interactive visualization and understanding the data.
