In this exercise we will use a decision tree, and we will search for the best model by tuning its hyperparameters. We will use the train and test datasets we prepared earlier.
First, let's import the relevant modules and data:
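A minimal sketch of what this step might look like, assuming the prepared datasets were saved as train.csv and test.csv with a label column named target (the file names and the column name are assumptions, since the preparation step is not shown here):

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load the train and test sets prepared earlier
# (file names and the 'target' column are assumed for illustration)
train = pd.read_csv('train.csv')
test = pd.read_csv('test.csv')

X_train = train.drop(columns='target')
y_train = train['target']
X_test = test.drop(columns='target')
y_test = test['target']
```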

Let's remember the names of the features:
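Assuming the features are the columns of the X_train DataFrame from the sketch above, this amounts to:

```python
# Print the feature names of the prepared training set
print(list(X_train.columns))
```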

Let's focus on two hyperparameters of the DecisionTreeClassifier estimator: max_depth and min_samples_split. The version of scikit-learn we are using is:
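One way to check it:

```python
import sklearn
print(sklearn.__version__)
```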

The scikit-learn documentation corresponding to our sklearn version describes max_depth as "the maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain less than min_samples_split samples."
min_samples_split, in turn, is defined as "the minimum number of samples required to split an internal node"; its default value is 2. Let's set it to 100 and, as an example, try the following values of max_depth:
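The exact list of depths used in the original run is not shown; as a plausible stand-in, consistent with the results discussed below, we can scan the integers from 1 to 10:

```python
# Candidate depths to try; the original list is not shown, but it must
# cover at least depths 1 through 7 given the discussion that follows
max_depth_values = range(1, 11)
```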

Now let's build a loop that trains a model for each of these values of max_depth:
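A sketch of such a loop, reusing the data and imports from the first snippet (random_state is an assumption added for reproducibility):

```python
# Train one tree per candidate depth, with min_samples_split fixed at 100,
# and record the accuracy on both the train and the test set
train_acc, test_acc = [], []
for depth in max_depth_values:
    clf = DecisionTreeClassifier(max_depth=depth,
                                 min_samples_split=100,
                                 random_state=0)  # assumed, for reproducibility
    clf.fit(X_train, y_train)
    train_acc.append(accuracy_score(y_train, clf.predict(X_train)))
    test_acc.append(accuracy_score(y_test, clf.predict(X_test)))
```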

Let's see how the accuracy of the model varies with max_depth:
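For example, plotting both accuracy curves against max_depth (using matplotlib, which is an assumption about the original setup):

```python
import matplotlib.pyplot as plt

# Compare train and test accuracy as max_depth grows
plt.plot(max_depth_values, train_acc, marker='o', label='train accuracy')
plt.plot(max_depth_values, test_acc, marker='o', label='test accuracy')
plt.xlabel('max_depth')
plt.ylabel('accuracy')
plt.legend()
plt.show()
```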

Interestingly, the accuracy on the train set keeps increasing with max_depth, but the accuracy on the test set starts decreasing when max_depth reaches 7. This indicates that the model overfits when max_depth is greater than 6.