We will explore the database to get to know the bank's customer data. There are two types of customers: legal entities and natural persons. Our project focuses on natural persons, so we have to filter the data. Let’s start.
Open a terminal and type
mongod
to run the MongoDB daemon.
Now open a new terminal and type
mongo
to run the MongoDB shell.
Find the collection customers_data in the database customers:
show dbs
admin 0.000GB
config 0.000GB
customers 0.002GB
local 0.000GB
test 0.000GB
use customers
switched to db customers
show collections
customers_data
legal_entity
natural_person
Find out how many documents are in the collection:
db.customers_data.count()
10098
Look at the first few documents before preparing the data:
db.customers_data.find().limit(2).pretty()
{
"_id" : ObjectId("60152a83127b7b1168e813d1"),
"CustomerId" : "E15654535",
"Name" : "E8000X",
"CreditScore" : 785,
"Geography" : "France",
"Gender" : "0",
"Age" : 44,
"Tenure" : 6,
"Balance" : 9999,
"NumOfProducts" : 1,
"HasCrCard" : 1,
"IsActiveMember" : 1,
"EstimatedIncome" : 1002280.29,
"Exited" : 1,
"person" : "entity"
}
{
"_id" : ObjectId("60152a83127b7b1168e813d2"),
"CustomerId" : "15683544",
"Name" : "Buccho",
"CreditScore" : 626,
"Geography" : "Spain",
"Gender" : "Male",
"Age" : 62,
"Tenure" : 3,
"Balance" : 0,
"NumOfProducts" : 1,
"HasCrCard" : 1,
"IsActiveMember" : 1,
"EstimatedIncome" : 65010.74,
"Exited" : 0,
"person" : "natural"
}
We are interested in the field person. We have to filter on this field to keep only natural persons:
db.customers_data.distinct("person")
[ "entity", "natural" ]
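distinct only lists the values of the field. To also see how many documents carry each value in a single query, one option is a $group aggregation. The following is a minimal sketch; the helper name countByFieldPipeline is hypothetical and not part of MongoDB, it only builds the pipeline array that aggregate() expects.

```javascript
// Hypothetical helper: builds an aggregation pipeline that counts
// the documents for each distinct value of `field`.
function countByFieldPipeline(field) {
  return [
    // Group on the field's value and add 1 for every document in the group.
    { $group: { _id: "$" + field, count: { $sum: 1 } } },
    // Sort by the grouped value so the output order is stable.
    { $sort: { _id: 1 } }
  ];
}

// In the mongo shell, inside the customers database:
//   db.customers_data.aggregate(countByFieldPipeline("person"))
```

Each result document holds one value of person in _id and the number of matching documents in count.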
db.customers_data.find({"person":"natural"}).count()
10000
There are 10000 documents for natural persons. Let’s create a collection with this data:
db.createCollection("natural_person")
nat_pers = db.customers_data.find({"person":"natural"}).toArray()
db.natural_person.insertMany(nat_pers)
db.natural_person.count()
10000
db.natural_person.find().limit(2).pretty()
{
"_id" : ObjectId("60152a83127b7b1168e813d2"),
"CustomerId" : "15683544",
"Name" : "Buccho",
"CreditScore" : 626,
"Geography" : "Spain",
"Gender" : "Male",
"Age" : 62,
"Tenure" : 3,
"Balance" : 0,
"NumOfProducts" : 1,
"HasCrCard" : 1,
"IsActiveMember" : 1,
"EstimatedIncome" : 65010.74,
"Exited" : 0,
"person" : "natural"
}
{
"_id" : ObjectId("60152a83127b7b1168e813d3"),
"CustomerId" : "15737489",
"Name" : "Ramsden",
"CreditScore" : 610,
"Geography" : "Spain",
"Gender" : "Female",
"Age" : 46,
"Tenure" : 5,
"Balance" : 116886.59,
"NumOfProducts" : 1,
"HasCrCard" : 0,
"IsActiveMember" : 0,
"EstimatedIncome" : 107973.44,
"Exited" : 0,
"person" : "natural"
}
This is the data we will finally use for our model.
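The copy above pulls every matching document into the shell with toArray() and sends it back with insertMany(). For larger collections, a possible alternative is to let the server do the copy with an aggregation that ends in $out. This is only a sketch of that alternative, not the approach used in this post; the helper name copyByPersonPipeline is hypothetical. Note that $out replaces the target collection if it already exists.

```javascript
// Hypothetical helper: builds a pipeline that selects the documents of one
// person type and writes them to the collection `target` on the server.
function copyByPersonPipeline(personType, target) {
  return [
    // Keep only the documents whose person field equals personType.
    { $match: { person: personType } },
    // $out must be the last stage; it creates `target` or replaces its contents.
    { $out: target }
  ];
}

// In the mongo shell, inside the customers database:
//   db.customers_data.aggregate(copyByPersonPipeline("natural", "natural_person"))
```

With this variant, the documents never pass through the shell, and no separate createCollection call is needed.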
We can store the remaining data in another collection to export and explore it later:
leg_ent = db.customers_data.find({"person":"entity"}).toArray()
db.legal_entity.insertMany(leg_ent)
db.legal_entity.count()
98
db.legal_entity.find().limit(2).pretty()
{
"_id" : ObjectId("60152a83127b7b1168e813d1"),
"CustomerId" : "E15654535",
"Name" : "E8000X",
"CreditScore" : 785,
"Geography" : "France",
"Gender" : "0",
"Age" : 44,
"Tenure" : 6,
"Balance" : 9999,
"NumOfProducts" : 1,
"HasCrCard" : 1,
"IsActiveMember" : 1,
"EstimatedIncome" : 1002280.29,
"Exited" : 1,
"person" : "entity"
}
{
"_id" : ObjectId("60152a83127b7b1168e813dd"),
"CustomerId" : "E15660842",
"Name" : "E8973X",
"CreditScore" : 629,
"Geography" : "Spain",
"Gender" : "0",
"Age" : 34,
"Tenure" : 8,
"Balance" : 99086.89,
"NumOfProducts" : 2,
"HasCrCard" : 0,
"IsActiveMember" : 1,
"EstimatedIncome" : 1026310.39,
"Exited" : 0,
"person" : "entity"
}
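As a quick sanity check, the two new collections together should account for every document in customers_data: 10000 + 98 = 10098. The sketch below expresses that check as a small function; the name splitIsComplete is hypothetical. Matching totals are a necessary condition for a clean split, not a proof of one.

```javascript
// Hypothetical helper: returns true when the sizes of the parts add up
// to the size of the original collection.
function splitIsComplete(total, partCounts) {
  const sum = partCounts.reduce((acc, n) => acc + n, 0);
  return sum === total;
}

// In the mongo shell, inside the customers database:
//   splitIsComplete(db.customers_data.count(),
//                   [db.natural_person.count(), db.legal_entity.count()])
```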
In the next steps of the project, we will connect to MongoDB and query the data. But you may want to export the data instead. To do so, exit the mongo shell:
exit
Then go to your working directory and export a collection (let’s take legal_entity as an example) by typing:
mongoexport --db customers --collection legal_entity --out legal_entity.json
This will export the collection to your working directory as a JSON file. By default, mongoexport writes one document per line.
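Because the default export holds one JSON document per line rather than a single JSON array, the file has to be parsed line by line. Here is a minimal sketch in plain JavaScript, assuming the file was exported without the --jsonArray option; the function name parseExport is hypothetical.

```javascript
// Hypothetical helper: parses mongoexport's default output,
// which contains one JSON document per line.
function parseExport(text) {
  return text
    .split("\n")
    // Skip blank lines, such as the one after the final newline.
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Node.js usage, assuming legal_entity.json is in the current directory:
//   const fs = require("fs");
//   const docs = parseExport(fs.readFileSync("legal_entity.json", "utf8"));
//   console.log(docs.length);
```

In the exported file, _id values appear in Extended JSON form, for example {"$oid": "60152a83127b7b1168e813d1"}, so after parsing they are plain objects rather than ObjectId instances.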