We will explore the database to get to know the bank’s customer data. There are two types of customers: legal entities and natural persons. Our project focuses on natural persons, so we have to filter the data. Let’s start.

Open a terminal and type

mongod

to run the MongoDB daemon.

Now open a new terminal and type

mongo

to run the MongoDB shell.

Find the collection customers_data in the database customers:

show dbs

  admin      0.000GB
  config     0.000GB
  customers  0.002GB
  local      0.000GB
  test       0.000GB

use customers

  switched to db customers

show collections

  customers_data
  legal_entity
  natural_person

Find out how many documents are in the collection:

db.customers_data.count()

  10098

See the first few documents and prepare the data:

db.customers_data.find().limit(2).pretty()

  {
    "_id" : ObjectId("60152a83127b7b1168e813d1"),
    "CustomerId" : "E15654535",
    "Name" : "E8000X",
    "CreditScore" : 785,
    "Geography" : "France",
    "Gender" : "0",
    "Age" : 44,
    "Tenure" : 6,
    "Balance" : 9999,
    "NumOfProducts" : 1,
    "HasCrCard" : 1,
    "IsActiveMember" : 1,
    "EstimatedIncome" : 1002280.29,
    "Exited" : 1,
    "person" : "entity"
}
{
    "_id" : ObjectId("60152a83127b7b1168e813d2"),
    "CustomerId" : "15683544",
    "Name" : "Buccho",
    "CreditScore" : 626,
    "Geography" : "Spain",
    "Gender" : "Male",
    "Age" : 62,
    "Tenure" : 3,
    "Balance" : 0,
    "NumOfProducts" : 1,
    "HasCrCard" : 1,
    "IsActiveMember" : 1,
    "EstimatedIncome" : 65010.74,
    "Exited" : 0,
    "person" : "natural"
}

We are interested in the field person. We have to filter on this field to keep only natural persons:

db.customers_data.distinct("person")
  
  [ "entity", "natural" ]

db.customers_data.find({"person":"natural"}).count()

  10000
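The distinct values and their counts can also be obtained in a single aggregation. The sketch below is one possible way to do it; the helper name countByField is hypothetical and not part of MongoDB, while $group, $sum and $sort are standard aggregation operators.

```javascript
// Hypothetical helper (not part of MongoDB): builds an aggregation pipeline
// that counts the documents for each distinct value of the given field.
function countByField(field) {
  return [
    // Group by the field value and add 1 to the counter for every document.
    { $group: { _id: "$" + field, count: { $sum: 1 } } },
    // Sort the groups by field value so the output order is stable.
    { $sort: { _id: 1 } }
  ];
}

// Usage in the mongo shell (needs a running database, so it is not run here):
//   db.customers_data.aggregate(countByField("person"))
```

Given the counts shown in this walkthrough, this should return one document per person type, with 10000 for natural and the remaining documents for entity.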

There are 10000 documents for natural persons. Let’s create a collection with this data:

db.createCollection("natural_person")
nat_pers = db.customers_data.find({"person":"natural"}).toArray()
db.natural_person.insertMany(nat_pers)
db.natural_person.count()

  10000
  
db.natural_person.find().limit(2).pretty()

  {
    "_id" : ObjectId("60152a83127b7b1168e813d2"),
    "CustomerId" : "15683544",
    "Name" : "Buccho",
    "CreditScore" : 626,
    "Geography" : "Spain",
    "Gender" : "Male",
    "Age" : 62,
    "Tenure" : 3,
    "Balance" : 0,
    "NumOfProducts" : 1,
    "HasCrCard" : 1,
    "IsActiveMember" : 1,
    "EstimatedIncome" : 65010.74,
    "Exited" : 0,
    "person" : "natural"
}
{
    "_id" : ObjectId("60152a83127b7b1168e813d3"),
    "CustomerId" : "15737489",
    "Name" : "Ramsden",
    "CreditScore" : 610,
    "Geography" : "Spain",
    "Gender" : "Female",
    "Age" : 46,
    "Tenure" : 5,
    "Balance" : 116886.59,
    "NumOfProducts" : 1,
    "HasCrCard" : 0,
    "IsActiveMember" : 0,
    "EstimatedIncome" : 107973.44,
    "Exited" : 0,
    "person" : "natural"
}

This is the data we will use for our model.
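The copy above loads every matching document into the shell with toArray() before inserting it. As an alternative, the filter and the copy can run entirely on the server with an aggregation that ends in an $out stage. This is a minimal sketch, not the method used in this walkthrough; the helper name copyMatching is hypothetical. Note that $out replaces the target collection if it already exists.

```javascript
// Hypothetical helper (not part of MongoDB): builds a pipeline that selects
// the documents where `field` equals `value` and writes them to `target`.
function copyMatching(field, value, target) {
  var filter = {};
  filter[field] = value;
  return [
    // Keep only the documents that match the filter.
    { $match: filter },
    // Write the result to the target collection ($out must be the last stage
    // and replaces the target collection if it already exists).
    { $out: target }
  ];
}

// Usage in the mongo shell (needs a running database, so it is not run here):
//   db.customers_data.aggregate(copyMatching("person", "natural", "natural_person"))
```

With this approach the documents never pass through the shell, which may matter for collections much larger than this one.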

We can store the remaining data in another collection to export and explore it later:

leg_ent = db.customers_data.find({"person":"entity"}).toArray()
db.legal_entity.insertMany(leg_ent)
db.legal_entity.count()

  98
  
db.legal_entity.find().limit(2).pretty()

  {
    "_id" : ObjectId("60152a83127b7b1168e813d1"),
    "CustomerId" : "E15654535",
    "Name" : "E8000X",
    "CreditScore" : 785,
    "Geography" : "France",
    "Gender" : "0",
    "Age" : 44,
    "Tenure" : 6,
    "Balance" : 9999,
    "NumOfProducts" : 1,
    "HasCrCard" : 1,
    "IsActiveMember" : 1,
    "EstimatedIncome" : 1002280.29,
    "Exited" : 1,
    "person" : "entity"
}
{
    "_id" : ObjectId("60152a83127b7b1168e813dd"),
    "CustomerId" : "E15660842",
    "Name" : "E8973X",
    "CreditScore" : 629,
    "Geography" : "Spain",
    "Gender" : "0",
    "Age" : 34,
    "Tenure" : 8,
    "Balance" : 99086.89,
    "NumOfProducts" : 2,
    "HasCrCard" : 0,
    "IsActiveMember" : 1,
    "EstimatedIncome" : 1026310.39,
    "Exited" : 0,
    "person" : "entity"
}
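A quick sanity check after the split is that the two new collections together contain every document of the original one: 10000 + 98 = 10098. The sketch below expresses that check as a small function; the name isCompleteSplit is hypothetical.

```javascript
// Hypothetical helper: returns true when the counts of the parts add up
// to the total, i.e. no document was lost or duplicated by the split.
function isCompleteSplit(total, partCounts) {
  var sum = partCounts.reduce(function (acc, n) { return acc + n; }, 0);
  return sum === total;
}

// With the counts reported in this walkthrough:
//   isCompleteSplit(10098, [10000, 98])   // true
//
// In the mongo shell the counts can be read directly (not run here):
//   isCompleteSplit(db.customers_data.count(),
//                   [db.natural_person.count(), db.legal_entity.count()])
```

This only compares counts; it assumes every document has a person value of either natural or entity, which the distinct query above confirms.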

In the next steps of the project, we will connect to MongoDB and query the data. But you may want to export the data instead. To do so, exit the mongo shell:

exit

then go to your working directory and export a collection (let’s take legal_entity as an example) by typing:

mongoexport --db customers --collection legal_entity --out legal_entity.json

This writes the collection to the JSON file legal_entity.json in your working directory.