MongoDB - Databases, documents and collections
Description
In this tutorial, we will walk you through the concepts and key facts of databases, documents and collections in MongoDB.
Databases
A number of databases can be run on a single MongoDB server. The default database of the mongo shell is 'test'; the data files of every database are stored within the data folder.
MongoDB can create databases on the fly. It is not required to create a database before you start working with it.
The "show dbs" command provides you with a list of all the databases.
Run the 'db' command to display the current database object or connection.
To connect to a particular database, run the use command:
use student
In the above command, 'student' is the database we want to select.
w3resource MongoDB tutorial has a separate page dedicated to commands related to creation and management of database.
A database name can contain almost any character in the ASCII range. But it can't be an empty string, and it can't contain a dot (i.e. ".") or a space (i.e. " ").
Since it is reserved, "system" can't be used as a database name.
In current MongoDB versions, a database name can't contain "$" either.
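The naming rules above can be collected into a small check. The following is a minimal sketch in plain JavaScript, not part of MongoDB itself; the function name isValidDatabaseName is hypothetical, and it covers only the empty-string, dot, space and "system" rules listed here.

```javascript
// Hypothetical helper: checks a proposed database name against the
// rules listed in this section. It is an illustration, not a MongoDB API.
function isValidDatabaseName(name) {
  if (typeof name !== "string" || name.length === 0) {
    return false; // an empty string is not a valid name
  }
  if (name.includes(".") || name.includes(" ")) {
    return false; // dots and spaces are not allowed
  }
  if (name === "system") {
    return false; // "system" is reserved
  }
  return true;
}

console.log(isValidDatabaseName("student"));    // true
console.log(isValidDatabaseName("my.student")); // false
console.log(isValidDatabaseName(""));           // false
```

A check like this could run in application code before a database name is passed to the use command.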
Documents
A document is the unit of data storage in a MongoDB database.
Documents use a JSON style for storing data. JSON (JavaScript Object Notation) is a lightweight, human-readable format used to interchange data between various applications.
A simple example of a JSON document is as follows :
{ site : "w3resource.com" }
Often, the term "object" is used to refer to a document.
Documents are analogous to the records of an RDBMS. Insert, update and delete operations can be performed on the documents of a collection. The following table will help you to understand the concept more easily :
RDBMS | MongoDB |
Table | Collection |
Column | Key |
Value | Value |
Records / Rows | Document / Object |
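To make the mapping concrete, the sketch below converts table-style rows into document-style objects in plain JavaScript. The helper name rowToDocument and the sample student data are made up for illustration; nothing here talks to a MongoDB server.

```javascript
// Hypothetical helper: turns one RDBMS-style row (column names plus
// values) into a document-style object (keys plus values).
function rowToDocument(columns, row) {
  const doc = {};
  columns.forEach((column, i) => {
    doc[column] = row[i]; // each column name becomes a key
  });
  return doc;
}

// Made-up sample data: a "student" table with two rows.
const columns = ["name", "class", "roll_no"];
const rows = [
  ["Asha", "V", 12],
  ["Ravi", "VI", 7],
];

// The table becomes a collection: one document per row.
const collection = rows.map((row) => rowToDocument(columns, row));

console.log(collection.length);  // 2
console.log(collection[0].name); // Asha
```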
The following table shows the various datatypes which may be used in MongoDB.
Data Types | Description |
string | May be an empty string or a combination of characters. |
integer | A whole number. |
boolean | Logical values true or false. |
double | A 64-bit floating point number. |
null | Represents a missing value. It is not the same as zero or an empty string. |
array | A list of values. |
object | An embedded document, i.e. a set of key-value pairs nested inside another document. |
timestamp | A 64-bit value referring to a time, unique on a single "mongod" instance. The first 32 bits of this value hold the seconds since January 1, 1970 (UTC), and the last 32 bits hold an incrementing ordinal for operations within a given second. |
Internationalized Strings | Strings are stored as UTF-8. |
Object IDs | Every MongoDB document must have a unique "_id" field; if none is supplied, an Object ID is generated for it. This is a BSON (Binary JSON, a binary encoding of JSON-like documents) object id, a 12-byte binary value which has a very small chance of being duplicated. In the layout described here, the id consists of a 4-byte timestamp (seconds since epoch), a 3-byte machine id, a 2-byte process id, and a 3-byte counter; newer MongoDB versions replace the machine id and process id with a 5-byte random value. |
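The byte layouts of the Object ID and timestamp types can be explored without a database. The sketch below is plain JavaScript under a few assumptions: the helper names parseObjectId and splitTimestamp are hypothetical, the Object ID is given as a 24-character hexadecimal string, and "first 32 bits" of a timestamp is read as the most significant half of the 64-bit value.

```javascript
// Hypothetical helpers for illustration; they work on plain values and
// do not need a MongoDB driver.

// An Object ID is 12 bytes, usually shown as 24 hexadecimal characters.
// The first 4 bytes hold the creation time in seconds since the epoch,
// and the last 3 bytes hold the counter.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal characters");
  }
  return {
    seconds: parseInt(hex.slice(0, 8), 16),   // bytes 0-3: timestamp
    counter: parseInt(hex.slice(18, 24), 16), // bytes 9-11: counter
  };
}

// A timestamp is a 64-bit value: the high 32 bits are seconds since the
// epoch, the low 32 bits are the ordinal within that second.
function splitTimestamp(value) {
  const v = BigInt.asUintN(64, BigInt(value));
  return {
    seconds: Number(v >> 32n),
    ordinal: Number(v & 0xffffffffn),
  };
}

const id = parseObjectId("507f1f77bcf86cd799439011"); // sample value
console.log(new Date(id.seconds * 1000).toISOString());

const ts = splitTimestamp((1350508407n << 32n) | 5n);
console.log(ts.seconds, ts.ordinal); // 1350508407 5
```

Only the timestamp and counter fields are decoded, because the meaning of the middle bytes depends on the MongoDB version.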
Collections
A collection may store any number of documents. A collection is analogous to a table of an RDBMS.
A collection may store documents that do not share the same structure. This is possible because MongoDB is a schema-free database. In a relational database like MySQL, a schema defines the organization and structure of the data in the database. MongoDB does not require such a set of rules defining the structure of the data, so it is quite possible to store documents of varying structures in one collection. In practice, unlike in an RDBMS, you don't need to define a column and its data type while working with MongoDB.
The following code shows two MongoDB documents that belong to the same collection but store data of different structures.
{"tutorial" : "NoSQL"}
{"topic_id" : 7}
A collection is created when the first document is inserted.
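Both ideas, mixed document structures and creation on first insert, can be modelled in a few lines. The sketch below uses a plain JavaScript Map as a stand-in for a database; the names database and insert are hypothetical, and this is not how MongoDB stores data internally.

```javascript
// A tiny in-memory stand-in for a database: a Map from collection name
// to an array of documents. Illustration only.
const database = new Map();

// Hypothetical helper: the collection is created on the first insert.
function insert(collectionName, doc) {
  if (!database.has(collectionName)) {
    database.set(collectionName, []); // created on first insert
  }
  database.get(collectionName).push(doc);
}

console.log(database.has("tutorials")); // false

// Two documents with different keys go into the same collection.
insert("tutorials", { tutorial: "NoSQL" });
insert("tutorials", { topic_id: 7 });

console.log(database.has("tutorials"));        // true
console.log(database.get("tutorials").length); // 2
```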
Pictorial Presentation : Collections and Documents
Valid collection names
Collection names must begin with letters or an underscore.
A collection name may contain numbers.
You can't use the "$" character within the name of a collection. "$" is reserved.
A collection name must not exceed 128 characters. It is advisable to keep it within 80 to 90 characters.
Using "." (dot) notation, collections can be organized in named groups. For example, tutorials.php and tutorials.javascript both belong to tutorials. This mechanism is called a collection namespace. It is primarily a convenience for the user; the database itself does not treat the grouped collections specially.
The following shows how to use it programmatically :
db.tutorials.php.findOne()
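The collection naming rules and the dot-based grouping can be sketched in plain JavaScript. The helper names isValidCollectionName and collectionGroup are hypothetical, and the checks cover only the rules listed in this section.

```javascript
// Hypothetical helper: checks a collection name against the rules
// listed in this section.
function isValidCollectionName(name) {
  if (typeof name !== "string" || name.length === 0) {
    return false;
  }
  if (name.length > 128) {
    return false; // must not exceed 128 characters
  }
  if (!/^[A-Za-z_]/.test(name)) {
    return false; // must begin with a letter or an underscore
  }
  if (name.includes("$")) {
    return false; // "$" is reserved
  }
  return true;
}

// Hypothetical helper: returns the group a dotted name belongs to.
function collectionGroup(name) {
  const dot = name.indexOf(".");
  return dot === -1 ? name : name.slice(0, dot);
}

console.log(isValidCollectionName("tutorials.php")); // true
console.log(isValidCollectionName("1tutorials"));    // false
console.log(collectionGroup("tutorials.javascript")); // tutorials
```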
Capped collections
Imagine that you want to log the activities happening within an application, and you want to store the data in the same order in which it is inserted. MongoDB offers capped collections for doing so.
Capped collections are collections which store data in the same order in which it is inserted.
A capped collection has a fixed size, offers high performance and ages out data automatically in FIFO (first in, first out) order. That is, when the allotted space is fully utilized, newly added objects (documents) replace the oldest ones, in the order in which they were inserted.
Since data is stored in the natural order, that is, the order in which it is inserted, no sorting is required while retrieving data unless you want to reverse the order.
New objects can be inserted into a capped collection.
Existing objects can be updated.
But you can't remove an individual object from a capped collection. To remove documents, you have to remove all of them with the drop command. After the drop, you have to recreate the capped collection.
Presently, the maximum size for a capped collection is 1e9 (i.e. 1 x 10^9) bytes on 32-bit machines. On 64-bit machines, there is no theoretical limit; in practice, it can be extended as far as your system resources permit.
Capped collections can be used for logging, caching and auto archiving.
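The age-out behaviour of a capped collection can be modelled with a small in-memory class. This is a simplified sketch in plain JavaScript, not MongoDB's implementation: the class name CappedCollection is hypothetical, and it limits the number of documents, whereas a real capped collection is sized in bytes (in the mongo shell it is created with db.createCollection(name, { capped: true, size: <bytes> })).

```javascript
// Illustrative model of a capped collection, limited by document count
// for simplicity.
class CappedCollection {
  constructor(maxDocuments) {
    if (!Number.isInteger(maxDocuments) || maxDocuments < 1) {
      throw new RangeError("maxDocuments must be a positive integer");
    }
    this.maxDocuments = maxDocuments;
    this.documents = [];
  }

  // New documents go to the end; when full, the oldest one ages out.
  insert(doc) {
    this.documents.push(doc);
    if (this.documents.length > this.maxDocuments) {
      this.documents.shift();
    }
  }

  // Natural order is insertion order, so no sorting is needed.
  find() {
    return [...this.documents];
  }

  // Reverse natural order, newest first.
  findReversed() {
    return [...this.documents].reverse();
  }
}

const log = new CappedCollection(3);
for (const event of ["login", "view", "edit", "logout"]) {
  log.insert({ event });
}

console.log(log.find().map((d) => d.event));         // [ 'view', 'edit', 'logout' ]
console.log(log.findReversed().map((d) => d.event)); // [ 'logout', 'edit', 'view' ]
```

The class deliberately has no method for removing a single document, mirroring the restriction described above.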
Use a number of collections instead of one
This removes the need to create an index on that data, since you are not storing some repeating data in each object.
If applied in a suitable situation, it can enhance performance.
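One way to picture this: a field that repeats in every document can move into the collection name instead. The sketch below is plain JavaScript with a hypothetical helper splitByField and made-up sample documents; it only illustrates the idea and does not touch a database.

```javascript
// Hypothetical helper: moves a repeated field out of each document and
// into the collection name.
function splitByField(docs, field, prefix) {
  const collections = {};
  for (const doc of docs) {
    const { [field]: value, ...rest } = doc; // drop the repeated field
    const name = `${prefix}.${value}`;
    if (!collections[name]) {
      collections[name] = [];
    }
    collections[name].push(rest);
  }
  return collections;
}

// Made-up sample data: every document repeats a "language" field.
const docs = [
  { language: "php", title: "Arrays" },
  { language: "javascript", title: "Closures" },
  { language: "php", title: "Sessions" },
];

const split = splitByField(docs, "language", "tutorials");
console.log(Object.keys(split));                // [ 'tutorials.php', 'tutorials.javascript' ]
console.log(split["tutorials.php"].length);     // 2
```

Reading all PHP tutorials then means reading one whole collection, rather than filtering a single large collection on the language field.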
Metadata
Information about a database is stored in certain collections. They are grouped in the system namespace, as
dbname.system.*
The following table shows the collections and what they store
Collections with namespace | Description |
dbname.system.namespaces | list of all namespaces |
dbname.system.indexes | list of all indexes |
dbname.system.profile | stores database profiling information |
dbname.system.users | list of users who may access the database |
dbname.local.sources | stores replica slave configuration data and state |
There are two more options to store metadata :
database.ns files store additional namespace / index metadata, if any exists.
Information on the structure of a stored object is stored within the object itself.