The ability to generate random numbers is extremely useful for all sorts of programming tasks, from a simple simulation of a dice roll to selecting data for data analysis activities. In Python, the random module contains several functions for generating random numbers or sequences of numbers.
Generate Random Numbers
First, import the random module. Its name in Python is... random:
import random
The basic function for generating random numbers is also called... random() (how original!). It generates a random float between 0 and 1 (0 is a possible result, 1 is not). Let's try a simple example by displaying three random numbers:
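The original listing is not reproduced here, so what follows is only a minimal sketch of what such an example could look like. The variable name values is our own choice, not something taken from the course.

```python
import random

# Draw three independent floats; each one satisfies 0.0 <= x < 1.0.
values = [random.random() for _ in range(3)]

for value in values:
    print(value)
```

No expected output is shown, since each run prints three different numbers.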
Of course, if you run the same code at home, you will get different results (that’s the nature of randomness)!
But what if you need a number outside the range from 0 to 1? Fortunately, the people who created the random module thought of everything. Other functions let you generate a random number in a given range:
uniform(a, b): generates a random float between a and b.
randint(a, b): as its name suggests, this one is similar to uniform, except that the random number generated is an integer this time (a and b are both possible results)!
You can use one or the other according to your needs!
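As a rough illustration of the two functions, the sketch below draws one value from each. The bounds and the variable names temperature and dice_roll are hypothetical, chosen only for the example.

```python
import random

# uniform(a, b): a float somewhere between a and b.
temperature = random.uniform(15, 25)

# randint(a, b): an integer N with a <= N <= b, so both 1 and 6 can come up.
dice_roll = random.randint(1, 6)

print(temperature)
print(dice_roll)
```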
Generate a Random Number According to a Given Distribution
The random module can also generate a random number according to a distribution. One of the best known is the Gaussian (or normal) distribution. If you don't know it already, let me introduce you to it!
The normal distribution is one of the probability distributions best suited to modeling natural phenomena that result from several random events. These are all phenomena where the majority of individuals are close to an average, with decreasing proportions below and above this average. A very telling example is the distribution of the population by IQ, which is centered on an average of 100:
The random module lets you generate random numbers according to this distribution: i.e., you are much more likely to get values close to the average (with the example above, between 85 and 115) than extreme values (close to 70 or 130). The corresponding function is called gauss(mean, standard_deviation).
Here is an example with a distribution centered at 0 and with a standard deviation of 1 (which is the “standard” normal distribution):
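The original listing is missing here, so this is a hedged reconstruction of the idea rather than the course's own code: ten draws from gauss with a mean of 0 and a standard deviation of 1. The name samples is ours.

```python
import random

# Ten draws from a normal distribution with mean 0 and standard deviation 1.
samples = [random.gauss(0, 1) for _ in range(10)]

for value in samples:
    print(value)
```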
We can see here, with 10 values, that the majority of the values are close to 0.
Choose Randomly From a List: Subsampling
As you already know, to select an item in a list, you have to select it via its index. If you want to select an item randomly from a list, a somewhat naive solution might be to draw the index randomly, and then use the random index to select the item. The random module goes a step further by offering a function that lets you make the selection directly from the list: the choice function.
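The two approaches can be compared side by side. This is a small sketch under our own assumptions: the list numbers and the variable names are hypothetical, picked for illustration.

```python
import random

numbers = ["one", "two", "three", "four", "five"]

# Naive approach: draw a random valid index, then look the item up.
index = random.randint(0, len(numbers) - 1)
picked_by_index = numbers[index]

# Direct approach: let choice() pick an item from the list.
picked = random.choice(numbers)

print(picked_by_index)
print(picked)
```

Both lines print one item of the list; the two picks are independent, so they may or may not match.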
Its extension is the choices function, which makes it possible to select a sample of several items from the initial list, with replacement:
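A minimal sketch of a draw with replacement, again using a hypothetical list named numbers. The k argument sets the sample size; because every item is put back after being drawn, k may even exceed the length of the list, in which case at least one item is bound to repeat.

```python
import random

numbers = ["one", "two", "three", "four", "five"]

# Three draws with replacement: the same item may appear more than once.
drawn = random.choices(numbers, k=3)
print(drawn)

# Eight draws from five items: a repeat is unavoidable.
larger_draw = random.choices(numbers, k=8)
print(larger_draw)
```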
Note how, in the second line, we get “two” returned twice, because the first “two” was effectively put back into the list once it was initially drawn.
This is called subsampling. The corresponding function, for a sample without replacement, is sample:
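As an illustrative sketch (the list numbers and the name subsample are our own), drawing without replacement means that each item of the list can be picked at most once:

```python
import random

numbers = ["one", "two", "three", "four", "five"]

# Three draws without replacement: each list item is used at most once.
subsample = random.sample(numbers, k=3)
print(subsample)

# Asking for more items than the list contains is an error.
try:
    random.sample(numbers, k=8)
except ValueError as error:
    print("Cannot draw 8 items from 5:", error)
```

Since the items of this list are all distinct, the subsample never contains a duplicate.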
In data analysis, this concept of subsampling is essential, as it lets you select a sample from an initial population. In statistics, a sample is a set of individuals representative of a population. Using a subsample is generally a response to a practical constraint (lack of time, space, financial cost, etc.) that does not allow an exhaustive study of the entire population.
Further Reading
The random module offers more functions than those presented in this course, even though you have seen the ones that are most commonly used in practice. If you want to go further, you can consult the official documentation of the random module, which lists all the possibilities offered by it.
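You should also know that the numpy package, which we briefly mentioned in the previous chapter, includes its own random module. There, you will find counterparts to most of the functions seen above, sometimes under different names or with slightly different behavior (for example, numpy's randint excludes the upper bound). You can access it with a line such as: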
import numpy.random as random
Let’s Recap
You have seen the main features available through the random module. You can now:
generate a random number, integer or decimal, in a given range.
generate a random number according to a given distribution.
randomly select one or more items from a list, with or without replacement.