Introduction to MongoDB

NoSQL and object-oriented databases are among the newer approaches to storing data. They do not follow the table/row/column approach of an RDBMS. There are four main types of NoSQL databases:

  • Document Database
  • Graph Database
  • Key-value Database
  • Wide-column Database

This blog is about MongoDB, a NoSQL database that stores data in the form of documents.
MongoDB is an open-source, cross-platform, document-oriented database that stores data as collections of documents. It is written in C++. It can run over multiple servers, balancing the load and/or duplicating data to keep the system up and running in case of hardware failure. MongoDB is easy to deploy, and new machines can be added to a running database.

MongoDB can be used as a file system, taking advantage of load balancing and data replication features over multiple machines for storing files.

Indexing in MongoDB

MongoDB adds a uniquely indexed “_id” field to every document inserted into a collection. By default its value is an ObjectId. An ObjectId is a 12-byte value, usually shown as a 24-character hexadecimal string, which consists of:

  • a 4-byte value representing the seconds since the Unix epoch,
  • a 3-byte machine identifier,
  • a 2-byte process id,
  • a 3-byte counter, starting with a random value.
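
As a rough illustration of the layout listed above, the following plain JavaScript sketch splits an ObjectId hex string into its four parts and turns the first part into a date. The function name parseObjectId is made up for this example, and the sample value is the insertedId that appears in the insert example later in this post. It assumes the 4/3/2/3-byte layout described here.

```javascript
// Split a 24-character ObjectId hex string into the four parts listed above.
// Assumes the 4/3/2/3-byte layout described in this post.
// parseObjectId is a hypothetical helper, not part of MongoDB.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal characters");
  }
  return {
    timestamp: parseInt(hex.slice(0, 8), 16),   // 4 bytes: seconds since the Unix epoch
    machineId: hex.slice(8, 14),                // 3 bytes: machine identifier
    processId: parseInt(hex.slice(14, 18), 16), // 2 bytes: process id
    counter: parseInt(hex.slice(18, 24), 16),   // 3 bytes: counter
  };
}

const parts = parseObjectId("5742045ecacf0ba0c3fa82b0");
const createdAt = new Date(parts.timestamp * 1000); // Date expects milliseconds
console.log(parts, createdAt.toISOString());        // date part: 2016-05-22T19:11:26.000Z
```

Because the timestamp comes first, ObjectIds generated over time sort roughly in creation order.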

Indexes in MongoDB support the efficient execution of queries. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.
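
The difference between a collection scan and an index lookup can be sketched with an in-memory analogy in plain JavaScript. This is only an illustration of the idea, not how MongoDB implements indexes internally. The names collectionScan, indexLookup and rollIndex are invented for the example.

```javascript
// In-memory analogy only; this is not MongoDB's internal index implementation.
const students = [];
for (let roll = 1; roll <= 1000; roll++) {
  students.push({ rollNumber: roll, name: "student" + roll });
}

// Collection scan: inspect every document and keep the ones that match.
function collectionScan(docs, rollNumber) {
  let inspected = 0;
  const matches = docs.filter((doc) => {
    inspected++;
    return doc.rollNumber === rollNumber;
  });
  return { matches, inspected };
}

// "Index": a Map from a field value to the documents that hold that value.
const rollIndex = new Map();
for (const doc of students) {
  if (!rollIndex.has(doc.rollNumber)) rollIndex.set(doc.rollNumber, []);
  rollIndex.get(doc.rollNumber).push(doc);
}

// Index lookup: only the documents listed under the key are inspected.
function indexLookup(index, rollNumber) {
  const matches = index.get(rollNumber) || [];
  return { matches, inspected: matches.length };
}

const scanResult = collectionScan(students, 201);
const indexResult = indexLookup(rollIndex, 201);
console.log(scanResult.inspected, indexResult.inspected); // 1000 versus 1
```

In the mongo shell, an index on such a field would be created with db.student.createIndex({ rollNumber: 1 }).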

We can also supply our own values for the “_id” field, but each value must be unique within the collection; otherwise MongoDB raises a duplicate key error.

Data Modeling in MongoDB

MongoDB is schemaless, or more precisely, it has a flexible schema. Relationships between data can be stored using two different data models:

  • Reference Data Model – This data model stores the relationship between data by including references from one document to another. These are normalized data models.
  • Embedded Data Model – This data model captures relationships between data by storing related data in a single document structure. These are denormalized data models, which allow applications to retrieve and manipulate related data in a single database operation.
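
To make the two models concrete, here is a small sketch using plain JavaScript objects in place of stored documents. The numeric _id values and the addressId field are assumptions made for the example, not something MongoDB prescribes.

```javascript
// Embedded (denormalized) model: the address lives inside the student document.
const embeddedStudent = {
  _id: 1,
  name: "xyz",
  address: { permanentAddress: "New Delhi-110011" },
};

// Reference (normalized) model: the student stores only the address's _id.
// addressId is a hypothetical field name chosen for this sketch.
const addresses = [{ _id: 10, permanentAddress: "New Delhi-110011" }];
const referencedStudent = { _id: 1, name: "xyz", addressId: 10 };

// Embedded: reading the student already gives us the related data.
const fromEmbedded = embeddedStudent.address.permanentAddress;

// Referenced: a second lookup is needed to resolve the reference.
const resolved = addresses.find((a) => a._id === referencedStudent.addressId);
const fromReferenced = resolved.permanentAddress;

console.log(fromEmbedded === fromReferenced);
```

The embedded form needs one read, while the referenced form avoids duplicating the address if many documents share it.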

Here we discuss the structure of the documents stored in MongoDB, the advantages and installation process of MongoDB, and how CRUD operations are performed in it.

First, let’s understand the basic terminology of MongoDB in relation to RDBMS terms.

RDBMS          MongoDB
Table          Collection
Row            Document
Column         Field
Table Join     Embedded Document

A document is stored as a set of key-value pairs in MongoDB. MongoDB saves documents in BSON (Binary JSON) format, and the maximum size of a document is 16 megabytes.
Let’s take an example of a student document:

{
    "_id" : ObjectId("5742045ecacf0ba0c3fa82b0"),
    "name" : "xyz",
    "emailId" : "xyz@abc.com",
    "phoneNo" : NumberLong(9999999999),
    "address" : {
        "permanentAddress" : "New Delhi-110011",
        "correspondenceAddress" : "New Delhi-110011"
    },
    "class" : "8th",
    "rollNumber" : 201,
    "subjects" : ["Hindi", "English", "Maths", "Science", "Arts"]
}

In the above document, the first field is “_id”, a 12-byte ObjectId generated by MongoDB itself, which makes every document unique.

Advantages of MongoDB

  • MongoDB stores data as JSON-like documents, which are very flexible. There is no need to define a schema before storing documents, and a single document can hold multiple types of data, as the example above shows.
  • MongoDB provides an embedded document structure, which reduces the need for complex join queries.
  • It provides high scalability.
  • MongoDB provides automatic sharding and replication, which spread the load across machines and keep data available if hardware fails.
  • It supports multiple storage engines.

Sharding in MongoDB

Sharding is a method for storing data across multiple machines. With sharding, you add more machines to support data growth and the demands of read and write operations.

MongoDB supports sharding through the configuration of sharded clusters. A sharded cluster has the following components: shards, query routers and config servers.

  • Shards – A shard contains a subset of the data for the sharded cluster. Typically each shard is a replica set, which provides high availability and data consistency for the data in that shard.
  • Query Routers – MongoDB query routers (mongos instances) route queries and write operations to the shards in a sharded cluster. They provide the interface through which applications communicate with the shards.
  • Config servers – Config servers store the cluster’s metadata. This data contains a mapping of the cluster’s data set to the shards. The query router uses this metadata to target operations to specific shards.
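
How the three components cooperate can be sketched with a toy model in plain JavaScript. The names chunkMap, routeToShard, shardA, shardB and shardC are hypothetical, and the ranges are invented. A real cluster manages this metadata itself, so treat this only as a picture of the idea that the router consults a mapping from shard-key ranges to shards.

```javascript
// Toy model; all names and ranges here are hypothetical.
// Config-server role: metadata mapping shard-key ranges to shards.
const chunkMap = [
  { min: 0, max: 100, shard: "shardA" },        // rollNumber in [0, 100)
  { min: 100, max: 200, shard: "shardB" },      // rollNumber in [100, 200)
  { min: 200, max: Infinity, shard: "shardC" }, // rollNumber 200 and above
];

// Query-router role: use the metadata to pick the shard for a shard-key value.
function routeToShard(metadata, rollNumber) {
  const chunk = metadata.find((c) => rollNumber >= c.min && rollNumber < c.max);
  if (!chunk) throw new Error("no chunk covers shard key " + rollNumber);
  return chunk.shard;
}

console.log(routeToShard(chunkMap, 201)); // shardC
```

A query that filters on the shard key can be sent to a single shard this way, while a query without it would have to be sent to every shard.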

 

MongoDB Storage Engines

The storage engine is the component that is responsible for managing how data is stored, both in memory and on disk. MongoDB supports multiple storage engines, as different engines perform better for specific workloads.

The storage engines supported by MongoDB are:

  • WiredTiger Storage Engine – It is the default storage engine starting in MongoDB 3.2. WiredTiger uses document-level concurrency control for write operations. As a result, multiple clients can modify different documents of a collection at the same time.
  • MMAPv1 Storage Engine – It is the default storage engine for MongoDB versions before 3.2. MMAPv1 is MongoDB’s original storage engine, based on memory-mapped files. It excels at workloads with high-volume inserts, reads, and in-place updates.
  • In-Memory Storage Engine – The In-Memory Storage Engine is available in MongoDB Enterprise. Rather than storing documents on disk, it retains them in memory for more predictable data latencies. Other than some metadata and diagnostic data, the in-memory storage engine does not maintain any on-disk data, including configuration data, indexes, user credentials, etc.

Installation of MongoDB

MongoDB is best used on 64-bit operating systems. Starting with MongoDB 3.2, 32-bit binaries are deprecated and will be unavailable in future releases.

For Windows
Download the latest release of MongoDB from http://www.mongodb.org/downloads and extract it to your C drive, or to any other drive where you want to keep it.
Now open the command prompt and run the following command:

C:\>move mongodb-win64-* mongodb

MongoDB requires a data directory to store all data. On Windows, MongoDB’s default data directory path is \data\db on the drive from which you start MongoDB. Create it:

md \data\db

You can specify an alternate path for data files using the --dbpath option, for example:

C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data

Now, to start MongoDB, run mongod.exe:

C:\mongodb\bin\mongod.exe

Add C:\mongodb\bin to the PATH environment variable of your system so that you can run these commands from anywhere.

For Linux
Run the following command to import the MongoDB public GPG key:

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10

Create the /etc/apt/sources.list.d/mongodb.list file using the following command:

echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list

Now run the following command to update the package list:

sudo apt-get update

Now install MongoDB using the following command:

sudo apt-get install mongodb-10gen=3.2

Now start MongoDB:

sudo service mongodb start


 
CRUD Operations

To make things concrete, we perform the CRUD operations on the student collection of a school database.

Let’s start the mongo shell first by entering the following command in your terminal:

~ $ mongo
MongoDB shell version: 3.2.5
connecting to: test

View all databases

> show dbs

Create or use a database

This command switches the current database to school. If the database does not exist yet, MongoDB creates it when you first store data in it.

> use school
switched to db school

View all collections in the current database

> show collections

Insert a document into the collection

> db.student.insertOne({
    "name" : "xyz",
    "emailId" : "xyz@abc.com",
    "phoneNo" : NumberLong(9999999999),
    "address" : {
        "permanentAddress" : "New Delhi-110011",
        "correspondenceAddress" : "New Delhi-110011"
    },
    "class" : "8th",
    "rollNumber" : 201,
    "subjects" : ["Hindi", "English", "Maths", "Science", "Arts"]
})

This command creates the student collection if it is not already present in the database and inserts a document into it. To insert multiple documents at once, use db.student.insertMany().
The method returns the status of the operation as a document that includes the auto-generated “_id” of the inserted document.

{
"acknowledged" : true,
"insertedId" : ObjectId("5742045ecacf0ba0c3fa82b0")
}

Read Documents

This operation returns all the documents in the student collection.

> db.student.find({})

This operation returns the documents in the student collection whose “name” field has the value “xyz”.

> db.student.find({"name" : "xyz"})

Update Documents

By default, update() modifies a single document that matches the specified filter. To update multiple documents, pass the {multi: true} option.
The first pair of braces holds the filter that selects the documents to operate on, and the second pair holds the update to apply.

> db.student.update({},{})

Update a single document in the collection that matches a specified filter.

> db.student.updateOne({},{})

Update all documents in the collection that match a specified filter.

> db.student.updateMany({},{})
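
As a sketch of what the two pairs of braces might contain, the example below builds a filter document and an update document as plain JavaScript objects. The field values are chosen for illustration, and applySet is a hypothetical local helper that only mimics the effect of the $set operator on top-level fields so the result can be seen without a running server.

```javascript
// Filter document: selects which student(s) to update (first pair of braces).
const filter = { rollNumber: 201 };
// Update document: $set changes only the listed fields (second pair of braces).
const update = { $set: { class: "9th" } };

// In the mongo shell these would be passed as:
//   db.student.updateOne(filter, update)
//   db.student.updateMany(filter, update)

// Local stand-in (hypothetical helper) showing the effect of $set on the
// top-level fields of one matching document; MongoDB does this server-side.
function applySet(doc, updateDoc) {
  return { ...doc, ...updateDoc.$set };
}

const before = { name: "xyz", class: "8th", rollNumber: 201 };
const after = applySet(before, update);
console.log(after);
```

Fields that are not named under $set, such as name and rollNumber here, are left as they were.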

Replace Documents

Replace at most one document that matches the specified filter, even if multiple documents match it.

> db.student.replaceOne({},{})

Delete Documents

Delete a single document or all documents that match a specified filter.

>db.student.remove({})

Delete at most one document that matches the specified filter, even if multiple documents match it.

> db.student.deleteOne({})

Delete all documents that match a specified filter.

>db.student.deleteMany({})

 
References
https://docs.mongodb.com/manual

8 thoughts on “Introduction to MongoDB”

  • June 21, 2016 at 1:09 pm

    Nice blog.
    Can you please elaborate on what an in-place update is in the MMAPv1 storage engine?

    • June 25, 2016 at 6:12 pm

      The MMAPv1 storage engine is based on memory-mapped files. It does not load documents from disk itself; it accesses the memory page where the document is stored.
      When an update operation is performed, it updates the document directly in memory without fetching it from disk. This is called an in-place update.

      • June 29, 2016 at 11:44 am

        Very well.

        • July 6, 2016 at 4:48 am

          Could you please explain WiredTiger in more detail?

  • June 21, 2016 at 2:09 pm

    Good feature for MongoDB.

  • June 22, 2016 at 6:49 am

    Nicely written, concise and compact info.
