MongoDB Schema Validation Rules | by Panos Zafeiropoulos | Mar, 2022

How to apply schema validation rules in a collection


MongoDB is a very popular free and open source cross-platform document-oriented database. It is a NoSQL database and it is based on JSON-like documents. Document-based databases are either schema-less or they provide a certain level of flexibility defining schemas using schema validation rules.

For those coming from the RDBMS world, where a table structure is characterized by columns with strictly defined properties (type, size, etc.), the ability to define schemas can prove quite useful.

Generally, we can think that a MongoDB database object is similar to an RDBMS schema containing tables, views, and other RDBMS objects. Respectively, a MongoDB collection is analogous to a table, and a MongoDB document can be considered as a table row.

A MongoDB database groups collections together, a collection holds documents, and a document consists of key-value pairs whose values can themselves be other (nested) documents.

The purpose of this post is to demonstrate how we can apply some schema validation rules in a collection. For that, it is necessary to create an example MongoDB database with a MongoDB collection.

Here, you can get an overview.

It is presumed that you have a running, accessible MongoDB instance. If you don’t, you can easily get one by using Docker and the official MongoDB Docker image to run a MongoDB container. Read more at https://www.mongodb.com/compatibility/docker.

For convenience, we are also going to use MongoDB Compass, which is the official GUI for MongoDB.

You can create and run a Docker container named ‘mongodb’ by running the following command:

docker run --name mongodb -p 27017:27017 -d mongo

After you have created the container, you can stop and start it using the following commands, respectively:

docker stop mongodb
docker start mongodb

Also, you can always check the running containers, via the following:

docker ps

You can download the GUI Compass at the following link: https://www.mongodb.com/try/download/compass

After you install it, run it. Ensure that the mongodb container is up and running, and create a new connection using a connection string, which in our case can be the following:

mongodb://localhost:27017/?readPreference=primary&appname=MongoDB%20Compass&directConnection=true&ssl=false
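To see what this connection string actually carries, here is a minimal sketch that splits it into host, port, and options using Node's built-in URL class. The variable names are ours, and the generic URL parser is only an approximation: it handles this single-host string, but it is not a full MongoDB connection string parser (for example, it would not cope with multi-host replica set URIs).

```javascript
// Break the Compass connection string into its parts with Node's built-in URL class.
const uri =
  "mongodb://localhost:27017/?readPreference=primary&appname=MongoDB%20Compass&directConnection=true&ssl=false";

const parsed = new URL(uri);

const connection = {
  host: parsed.hostname, // the machine where the container's port is published
  port: Number(parsed.port), // 27017, published by `docker run -p 27017:27017`
  options: Object.fromEntries(parsed.searchParams), // query-string options, URL-decoded
};

console.log(connection);
```

Note that every option value comes back as a string (for example, ssl is the string "false"), so a real client has to interpret them itself.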

Then, after you have successfully connected to the MongoDB Docker instance, you can create a new database and a new collection. Name them ‘ticket-management’ and ‘users’, respectively.

The ‘users’ collection will store user documents and the documents should be validated by our validation rules.

Define the properties of a MongoDB document

As we have said before, a MongoDB document is an ordered set of key-value pairs. A key difference from an RDBMS is that the documents in a collection do not have to share a structure: each one can hold any number of key-value pairs, as well as nested documents.

However, in our case, we want to force the ‘users’ collection to hold documents with strictly the same properties (keys). This is analogous to the fields (columns) of a table in an RDBMS. In other words, we want each document to have exactly the same properties/fields.

The mongo _id

Before we identify the fields of our ‘users’ collection, it’s worth mentioning that MongoDB automatically generates a special _id property/field each time a new document is inserted into a collection without one.

By default, the _id value is an ObjectId, a special BSON type that is 12 bytes long. In current MongoDB versions, the 12 bytes consist of the following:

  • 4 bytes representing the seconds since the Unix epoch (the creation timestamp)
  • 5 bytes of a random value generated once per process, unique to the machine and process (older versions split these into a 3-byte machine identifier and a 2-byte process id), and
  • 3 bytes representing an incrementing counter, initialized to a random value

Even though an auto-generated _id is not a standard UUID, _id values can be considered unique. They are roughly ordered by creation time, and they can be used as the ‘primary key’ of our collection.
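To make the byte layout more tangible, the sketch below decodes the hexadecimal form of an ObjectId in plain JavaScript. It assumes the layout that MongoDB currently documents (a 4-byte timestamp, a 5-byte random value, and a 3-byte counter). The function name decodeObjectId and the sample value are ours, chosen only for illustration.

```javascript
// Decode the three parts of a 24-character hexadecimal ObjectId string.
function decodeObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hexadecimal string");
  }
  return {
    // First 4 bytes (8 hex chars): creation time in seconds since the Unix epoch.
    createdAt: new Date(parseInt(hex.slice(0, 8), 16) * 1000),
    // Next 5 bytes (10 hex chars): random value generated once per process.
    random: hex.slice(8, 18),
    // Last 3 bytes (6 hex chars): incrementing counter.
    counter: parseInt(hex.slice(18, 24), 16),
  };
}

// Sample value for illustration; any 24-character hex string can be decoded.
const sampleId = "6229dc064130345cc3d542bf";
const decoded = decodeObjectId(sampleId);
console.log(decoded.createdAt.toISOString(), decoded.random, decoded.counter);
```

Because the timestamp comes first, sorting by _id roughly sorts documents by creation time, which is why the values behave like an ordered key.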

With that covered, here is the example list of fields for the ‘users’ collection:

_id,
username,
password,
email,
registrationdate,
confirmed,
cancelled,
typeid,
countryid

The goal is to ensure (well, as much as we can) that every document inserted into the ‘users’ collection consists of those fields.

To make all documents comply with the above fields, we will use a MongoDB schema. You can think of a MongoDB schema as nothing more than a set of rules for document properties (keys) and values. Those rules apply on a per-collection basis, and they are checked (validated) on each insertion or update of a document in that collection.

Such a set of rules is defined as a JSON Schema document (via the $jsonSchema operator), with the field types expressed as BSON types.

We are not going to go into more detail here, but you can read more about MongoDB schema validation and how it works in the official documentation.

After the short intro given above, now it’s time to define our MongoDB validation schema. The summary of what we actually want to define is given below:

  • The fields username, email, and password should be present in each document (they are mandatory).
  • The fields username, email, and password should be of type string, and the length of each string should be between a minimum and a maximum limit.
  • The field email should comply with a specific regex pattern.
  • The field registrationdate should be of type date.
  • The fields confirmed and cancelled should be of type bool (Boolean: true or false).
  • The fields typeid and countryid should be of type int (integer), and their values should be between a minimum and a maximum number.
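The rules listed above could be expressed with the $jsonSchema operator roughly as in the sketch below. The keywords (bsonType, required, minLength, maxLength, pattern, minimum, maximum) are standard $jsonSchema keywords, but every concrete limit and the email pattern are our own illustrative assumptions, not necessarily the values used in this post.

```javascript
// A sketch of a $jsonSchema validator for the 'users' collection.
// ASSUMPTION: all numeric limits and the email pattern are illustrative only.
const usersValidator = {
  $jsonSchema: {
    bsonType: "object",
    // Mandatory fields.
    required: ["username", "email", "password"],
    properties: {
      username: { bsonType: "string", minLength: 3, maxLength: 30 },
      password: { bsonType: "string", minLength: 8, maxLength: 128 },
      email: {
        bsonType: "string",
        minLength: 6,
        maxLength: 254,
        // Deliberately simple pattern: something@something.something
        pattern: "^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$",
      },
      registrationdate: { bsonType: "date" },
      confirmed: { bsonType: "bool" },
      cancelled: { bsonType: "bool" },
      typeid: { bsonType: "int", minimum: 1, maximum: 10 },
      countryid: { bsonType: "int", minimum: 1, maximum: 250 },
    },
  },
};

// Print the validator as JSON text, ready to be adapted and pasted elsewhere.
console.log(JSON.stringify(usersValidator, null, 2));
```

Only the three fields under required are mandatory here. The remaining properties are checked only when they are present in a document.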

We can define our rules in various ways, for example via the mongo shell CLI or the mongosh CLI. However, since we have already created our ‘users’ collection in Compass, using the Compass GUI seems to be the most convenient way.
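For comparison, the CLI route could look like the following sketch, which builds a collMod command document that attaches a validator to an existing collection. The collMod command and its validator, validationLevel, and validationAction options are part of MongoDB. The helper name buildCollModCommand and the tiny placeholder schema are hypothetical and exist only for illustration.

```javascript
// Build the collMod command document that attaches a validator to a collection.
function buildCollModCommand(collectionName, jsonSchema) {
  return {
    collMod: collectionName, // the command name must be the first key
    validator: { $jsonSchema: jsonSchema },
    validationLevel: "strict", // validate all inserts and updates
    validationAction: "error", // reject documents that fail validation
  };
}

// Hypothetical, deliberately tiny schema used only to show the command's shape.
const placeholderSchema = {
  bsonType: "object",
  required: ["username", "email", "password"],
};

const collModCommand = buildCollModCommand("users", placeholderSchema);
console.log(JSON.stringify(collModCommand, null, 2));

// In mongosh, after switching to the right database, the command could be run as:
//   db.runCommand(collModCommand)
```

The two validation options correspond to the STRICT and ERROR settings that Compass shows on its Validation tab.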

So, select the ‘users’ collection, click the Validation tab, and enter your JSON schema (leave the Validation Action and Validation Level set to ERROR and STRICT, respectively). Below is the example set of validation rules that we will use:

mongodb (compass) collection validation schema

After we have saved our validation rules in Compass, we can use mongosh to get a taste of what they look like. Compass provides us with an embedded version of the mongosh CLI.

By default, mongosh is connected to the ‘test’ database, as you can see above. So, switch to the ticket-management database and inspect the validation rules using the db.getCollectionInfos() function:

It seems that we cannot go deeper and see/check the “properties” objects using the embedded mongosh.
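A likely reason is that the shell abbreviates deeply nested values when it prints a result. One possible workaround, sketched below, is to pick the validator out of the result and turn it into JSON text ourselves. The sketch assumes the documented shape of the db.getCollectionInfos() result, where each entry carries the validator under options.validator. The helper name validatorAsJson and the sample data are hypothetical.

```javascript
// Pull one collection's validator out of a getCollectionInfos()-shaped array
// and render it as fully expanded JSON text.
function validatorAsJson(collectionInfos, collectionName) {
  const info = collectionInfos.find((c) => c.name === collectionName);
  if (!info || !info.options || !info.options.validator) {
    return null; // collection not found, or no validator attached
  }
  return JSON.stringify(info.options.validator, null, 2);
}

// Hypothetical stand-in for the array that db.getCollectionInfos() returns.
const sampleInfos = [
  {
    name: "users",
    type: "collection",
    options: {
      validator: {
        $jsonSchema: {
          bsonType: "object",
          required: ["username", "email", "password"],
          properties: { username: { bsonType: "string" } },
        },
      },
      validationLevel: "strict",
      validationAction: "error",
    },
  },
];

console.log(validatorAsJson(sampleInfos, "users"));

// In mongosh, the real data would come from the database instead:
//   validatorAsJson(db.getCollectionInfos({ name: "users" }), "users")
```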

However, we can jump into the container shell:

docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
11b9a599c13e mongodb "docker-entrypoint.s…" 3 months ago Up 4 hours 0.0.0.0:27017->27017/tcp mongodb
. . .
docker exec -it mongodb bash
root@11b9a599c13e:/#

And run the mongosh from within it using these commands:

root@11b9a599c13e:/# 
root@11b9a599c13e:/# mongosh
Current Mongosh Log ID: 6229dc064130345cc3d542bf
Connecting to: mongodb://127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000
Using MongoDB: 5.0.5
Using Mongosh: 1.1.6
For mongosh info see: https://docs.mongodb.com/mongodb-shell/
To help improve our products, anonymous usage data is collected and sent to MongoDB periodically (https://www.mongodb.com/legal/privacy-policy).
You can opt-out by running the disableTelemetry() command.
------
The server generated these startup warnings when booting:
2022-03-10T05:41:44.202+00:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
2022-03-10T05:41:45.856+00:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
------
Warning: Found ~/.mongorc.js, but not ~/.mongoshrc.js. ~/.mongorc.js will not be loaded.
You may want to copy or rename ~/.mongorc.js to ~/.mongoshrc.js.
test>

Then, we can switch to the ticket-management database and execute db.getCollectionInfos() to obtain the validation rules information for the ‘users’ collection, as you can see below:

This time our validation rules are clearly presented.

Alternatively, we can run just the mongo CLI (not the mongosh), shown below:

root@11b9a599c13e:/# mongo
MongoDB shell version v5.0.5
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("bb023c59-6160-461d-b022-4d88658fb890") }
MongoDB server version: 5.0.5
================
Warning: the "mongo" shell has been superseded by "mongosh",
which delivers improved usability and compatibility.The "mongo" shell has been deprecated and will be removed in
an upcoming release.
For installation instructions, see
https://docs.mongodb.com/mongodb-shell/install/
================
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
https://docs.mongodb.com/
Questions? Try the MongoDB Developer Community Forums
https://community.mongodb.com
---
The server generated these startup warnings when booting:
2022-03-10T05:41:44.202+00:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
2022-03-10T05:41:45.856+00:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
---
---
Enable MongoDB's free cloud-based monitoring service, which will then receive and display
metrics about your deployment (disk utilization, CPU, operation statistics, etc).
The monitoring data will be available on a MongoDB website with a unique URL accessible to you
and anyone you share the URL with. MongoDB may use this information to make product
improvements and to suggest MongoDB products and deployment options to you.
To enable free monitoring, run the following command: db.enableFreeMonitoring()
To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
>

Note that the mongo shell is deprecated, and it has been superseded by mongosh.

Then again, we can switch to the ticket-management database and execute db.getCollectionInfos() to obtain such information for the ‘users’ collection:

As you can see above, the result is pretty much the same.

After we have defined our validation rules, we can use any of the available tools (Compass GUI, mongosh CLI, mongo CLI) to test whether they work correctly. For that, we can try to insert some documents that do not meet the validation rules and confirm that the insertions fail. Below are some examples that you can also try yourself.
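Such a test session can be sketched as a small helper that attempts an insert and reports whether document validation rejected it. The server reports a validation failure with error code 121 (DocumentValidationFailure). The helper name tryInsert, the sample documents, and the stand-in collection are hypothetical. The helper also assumes the synchronous calling style of mongosh, so with the Node.js driver the insertOne call would need to be awaited.

```javascript
// Attempt an insert and report whether document validation rejected it.
// `collection` is anything with an insertOne(doc) method, e.g. db.users in mongosh.
function tryInsert(collection, doc) {
  try {
    collection.insertOne(doc);
    return { inserted: true };
  } catch (err) {
    // Error code 121 is DocumentValidationFailure.
    return { inserted: false, validationFailure: err.code === 121, message: err.message };
  }
}

// Hypothetical documents that rules like ours are expected to reject.
const badDocuments = [
  {}, // missing the mandatory fields
  { username: "john", password: "s3cretPass", email: "not-an-email" }, // invalid email
  { username: "john", password: "s3cretPass", email: "john@example.com", typeid: 0 }, // typeid too low
];

// Stand-in collection so the sketch runs outside mongosh; it rejects every document.
const fakeUsers = {
  insertOne() {
    const err = new Error("Document failed validation");
    err.code = 121;
    throw err;
  },
};

const insertResults = badDocuments.map((doc) => tryInsert(fakeUsers, doc));
console.log(insertResults);

// In mongosh, the same documents could be tried against the real collection:
//   badDocuments.forEach((doc) => printjson(tryInsert(db.users, doc)))
```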

Using mongosh

Let’s try to insert an empty document:

Now let’s try again with a document with an invalid email:

You can continue trying to insert documents with invalid values, e.g., the value 0 for the typeid field, when it should be at least:

Using Compass

Similarly, when you try to insert documents that do not comply with our validation rules, you will keep getting failure errors, such as the following:

Using MongoDB validation rules is quite useful and saves us a lot of headaches. However, it is not a panacea. One drawback is that validation rules cannot enforce uniqueness of field values, e.g., they cannot prevent the insertion (or update) of a document with a username value that already exists in another document.

Another example is that our rules also cannot prevent the insertion of documents that lack some of the fields (apart from the required ones). And so on.

However, as MongoDB suggests, such challenges can be solved in our business logic in middleware, but this is the subject of another post. So, stay tuned!

That’s it!

Thank you for reading, and happy coding!
