Deploying the model

In this section we will put the tuned model to work on data from new customers. First, we will prepare the new data loaded into MongoDB and make predictions on them. In real-world projects, the real outcomes arrive some time after the predictions, and they can then be used to validate the performance of the model. In our case, we will simulate a situation in which the real values of the Exited feature are provided some time after we have made the predictions, and we will compute a score with the real data to find out how well the model predicted the behaviour of the customers.

Preparing the data

In this section we will prepare the new data loaded into MongoDB so that we can make predictions with the tuned model. We have already started the MongoDB daemon. Now let's connect to the database, retrieve the relevant data and store them in a data frame.
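Below is a minimal sketch of this step with pymongo, assuming the new customers live in a database called bank and a collection called new_customers (adjust both names to your own setup):

```python
import pandas as pd
from pymongo import MongoClient

# Connect to the local MongoDB daemon (default port 27017)
client = MongoClient("mongodb://localhost:27017/")
collection = client["bank"]["new_customers"]

# Load all documents into a data frame, dropping MongoDB's internal _id field
df = pd.DataFrame(list(collection.find({}, {"_id": 0})))
```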

Now let's close the connection:
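With the client object created above, this is simply:

```python
# Close the connection to the MongoDB server
client.close()
```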

To start with, have a look at the data:
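For example:

```python
# Inspect the first rows and the column types / non-null counts
print(df.head())
df.info()
```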

Some records refer to legal entities. As we know, our model focuses on natural persons, so let's filter the data frame:
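The exact filter depends on how legal entities are flagged in the collection; here is a hypothetical sketch assuming a CustomerType column that takes the value "natural person" or "legal entity":

```python
# Keep only natural persons and drop the flag column afterwards
# (CustomerType is a hypothetical column name)
df = df[df["CustomerType"] == "natural person"].copy()
df = df.drop(columns=["CustomerType"])
```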

As we did earlier to train the model, we will now create "dummy" (i.e. binary) variables from the categorical features. However, this time we will not drop any feature in this step (except for the original categorical variables we extracted the dummies from). In fact, we will keep the id and the names of the clients, because we will use them to validate our predictions when we get the real outcomes.
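A sketch of the dummy encoding, assuming Geography and Gender are the categorical features, as in the training data; use the same get_dummies options you used for training so that the resulting columns match:

```python
# Replace the categorical columns with their dummy (binary) counterparts
df = pd.get_dummies(df, columns=["Geography", "Gender"])
```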

Now let's store Customerid and Name in another data frame:
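For instance, assuming the columns are called Customerid and Name:

```python
# Keep the identifiers aside for the validation step,
# then drop them from the feature matrix
ids = df[["Customerid", "Name"]].copy()
df = df.drop(columns=["Customerid", "Name"])
```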

Before exporting the data, we have to make sure that the features are in the same order as in the training data set. To do that, let's recall the list of feature names we stored earlier:
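A sketch of this step, assuming the feature names were pickled to a file called feature_names.pkl during training (the file name is an assumption):

```python
import pickle

# Load the list of feature names saved during training
with open("feature_names.pkl", "rb") as f:
    feature_names = pickle.load(f)

# Reorder the columns to match the training set; dummy categories that do not
# appear in the new data are added as columns of zeros
df = df.reindex(columns=feature_names, fill_value=0)
```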

Making predictions

Now we will load the tuned model and make predictions on the new data.
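A minimal sketch, assuming the tuned model was saved with joblib to a file called tuned_model.pkl (both the tool and the file name are assumptions; adapt them to how you saved the model):

```python
import joblib

# Load the tuned model saved after hyperparameter tuning
model = joblib.load("tuned_model.pkl")

# Predict the Exited outcome for the new customers and attach the
# predictions to the identifiers we kept aside earlier
predictions = model.predict(df)
results = ids.assign(PredictedExited=predictions)
```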

Validating the predictions

Now let's validate the model by comparing the predictions with the real outcomes.

To validate the predictions of the model, we will compute the score of correct predictions. To this end, we first join the real outcomes with the predicted values, and then compute the percentage of correct predictions.
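A hypothetical sketch of the validation, assuming the real outcomes arrive later as a CSV file with the customer id and the actual value of Exited:

```python
# Real outcomes provided some time after the predictions
# (file name and columns are assumptions: Customerid, Exited)
real_outcomes = pd.read_csv("real_outcomes.csv")

# Join the real outcomes with the predicted values on the customer id
validation = results.merge(real_outcomes, on="Customerid")

# Percentage of correct predictions
score = (validation["PredictedExited"] == validation["Exited"]).mean()
print(f"Correct predictions: {score:.1%}")
```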

Next step: deployment process flow

Now that we have validated the model on new data, we can build a pipeline to run the whole prediction-validation process as it would be done in real-world projects, where predictive models are deployed as "business as usual". To this end, we will first prepare scripts to run the data preparation, the prediction step and the validation step separately. This will enable us to run the whole process from a terminal.