How to Use MongoDB 2.4 with Python 3.3 from Red Hat Software Collections

This article is focused on MongoDB 2.4 packaged as software collections. Knowledge of MongoDB basics is recommended, but not required. If you are not familiar with MongoDB and you’d like to learn more, try MongoDB’s online courses. These courses cover MongoDB concepts, configuration, and deployment, as well as how to program applications for MongoDB.

It concentrates on what is different with the Red Hat Software Collections (RHSCL) packages. These packages are available in RHSCL 1.1, and the RPM packages are prefixed with `mongodb24`, which is also the name of the MongoDB24 collection.

What is in the MongoDB24 software collection?

For those who are not yet familiar with software collections, let’s summarize the concept quickly. In traditional Linux environments based on RPMs, you can only have one version of an application installed at one time because different versions of the same package usually conflict with each other. Software collections give you the power to build, install, and use multiple versions of software on the same system. What is more, they do so without affecting system-wide installed packages. The software collections concept is based on RPM and is a general concept available on Red Hat Enterprise Linux 5 and later, where the packages install their files into /opt/rh.

The software collection MongoDB24, as its name suggests, includes MongoDB 2.4.x (the third number in the version marks only a bugfix release, which does not influence compatibility, so it is not included in the name). MongoDB is a document database that works with JSON documents and belongs to the set of NoSQL databases. The most important package, which is also installed by `yum install mongodb24`, is `mongodb24-mongodb-server`. It includes the `mongod` and `mongos` binaries, that is, the database daemon itself and the sharding server. If you want to use the MongoDB shell, you need to install the `mongodb24-mongodb` package. To use the MongoDB client interface in C, you will need to install `mongodb24-libmongodb`, and to build such a client, also `mongodb24-libmongodb-devel`.

Another important thing to know about software collections is that to use a command from a collection, you need to enable the environment for that collection. As an example, instead of running `mongo --help`, you need to run `scl enable mongodb24 'mongo --help'`.
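The same wrapping can be done from a program. The following is only a minimal sketch in Python; the helper names `scl_argv` and `run_in_collection` are made up for this illustration and are not part of any package.

```python
import subprocess


def scl_argv(collection, command):
    """Build the argument list for running `command` inside a collection."""
    # The whole command is passed to scl as one argument, exactly like the
    # quoted string on the command line, so it is not split here.
    return ["scl", "enable", collection, command]


def run_in_collection(collection, command):
    """Run `command` with the collection enabled and return its exit status."""
    return subprocess.call(scl_argv(collection, command))


print(scl_argv("mongodb24", "mongo --help"))
```

Calling `run_in_collection("mongodb24", "mongo --help")` on a machine with the collection installed should behave like the `scl` command shown above.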

To demonstrate how software collections packages can work together, we’ll create a simple application in Python (specifically Python 3.3, which is also available in the Red Hat Software Collections). That collection includes all packages prefixed with `python33`, and to install a basic stack, you need to run `yum install python33`.

Where do the connectors for dynamic languages live?

In many cases, you need to connect to a MongoDB server from a language such as Ruby, Python, or Java. The MongoDB community provides native clients that are not necessarily distributed together with the server, but rather as modules in the language environments. For connecting to the MongoDB server from the Python 3.3 software collection, you will need to install the `pymongo` module. From Ruby, it will be the `rubygem-mongo` module (part of the ror40 collection), and so on. As said before, the packages are prefixed, so you can install such a module using `yum install python33-python-pymongo`.

MongoDB client drivers are not tied to a particular MongoDB server version, so you may also use the `pymongo` module from your base system Python stack. In other words, you do not have to use Python from software collections if you do not want to. However, we will do so in the following scenario in order to demonstrate using the Python33 and MongoDB24 collections together.

How do I start the server?

Let’s finally get to starting the MongoDB server. If you have already installed the MongoDB24 collection, you have installed the `mongodb24-mongodb-server` package as well. This package includes the SysV init script or the `systemd` service file, depending on whether you run RHEL 6 or RHEL 7. The name of the service is prefixed with the collection name `mongodb24`, so it will not conflict with any of the names provided by a package from the base system. Otherwise, the service behaves as usual. To start the daemon, run the command `service mongodb24-mongodb start`.

If you are already familiar with the software collections concept, you might notice that we do not use `scl enable mongodb24` here. Software collections work by setting environment variables, and a process from a collection needs those variables to be set correctly. However, when a service is started, the daemon is launched in a clean environment. Thus, running `scl enable mongodb24 'service mongodb24-mongodb start'` works, but the environment changes made by the `scl` command are discarded by the `service` call. In order to link properly, the `mongod` daemon still needs to run under the correct environment. This is ensured by running the `scl` command implicitly in the SysV init script or in the `systemd` service file. In other words, when starting services from software collections, the user does not need to bother with the `scl` command.

By default, only the MongoDB24 software collection environment is enabled for the `mongod` and `mongos` daemons. It is usually not necessary to change the list of software collections enabled for these processes, but if you need to add another one, just edit the MONGODB24_SCLS_ENABLED environment variable in /opt/rh/mongodb24/service-environment.
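To check which collections are currently listed there, a short Python sketch like the following could be used. It assumes the file contains shell-style `NAME="value"` assignments with a space-separated list; the sample text is illustrative and not copied from the package, and the function name is hypothetical.

```python
def enabled_collections(text, variable="MONGODB24_SCLS_ENABLED"):
    """Return the collections listed in a shell-style VARIABLE="a b" line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == variable:
            # drop surrounding quotes, then split the space-separated list
            return value.strip().strip("\"'").split()
    return []


# Illustrative content only (assumed format, not the packaged file).
sample = '# collections enabled for the daemon\nMONGODB24_SCLS_ENABLED="mongodb24 python33"\n'
print(enabled_collections(sample))
```

On a real system, the text would come from reading /opt/rh/mongodb24/service-environment instead of the sample string.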

A started server reads configurations from /opt/rh/mongodb24/root/etc/mongodb.conf and opens a port configured there, which is by default 27017. The database files are stored into /opt/rh/mongodb24/root/var/lib/mongodb.
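MongoDB 2.4 configuration files use plain `key = value` lines, so the effective port and data directory can be read with a few lines of Python. This is a minimal sketch: the function name is made up, and the sample fragment only repeats the defaults mentioned above rather than the full packaged file.

```python
def parse_mongodb_conf(text):
    """Parse `key = value` lines into a dict, skipping comments and blanks."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


# Illustrative fragment, not the packaged file.
sample = """
# where the daemon listens and stores its data
port = 27017
dbpath = /opt/rh/mongodb24/root/var/lib/mongodb
"""
conf = parse_mongodb_conf(sample)
print(int(conf.get("port", 27017)), conf["dbpath"])
```

The fallback in `conf.get("port", 27017)` mirrors the default port used when the option is not set.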

As database admins like to treat log files together, the `mongodb24` daemon stores logs to /var/log/mongodb24-mongodb/mongodb.log. If you see an error, look in that log file for details.

It does not seem consistent to store configuration and data files in /opt and log files in /var. We are currently considering moving all variable data into /var/opt/ and configuration files into /etc/opt, which would correspond more closely to the Filesystem Hierarchy Standard (FHS). Feel free to let us know what you think about this approach by leaving comments on this article.

How do I make /opt read-only?

If you want to make the /opt directory read-only, it is possible to change the location of the database files and the configuration file. As /opt/rh/mongodb24/root/etc/mongodb.conf is hard-coded in the daemon’s source code, moving the configuration file to /etc/opt/rh/mongodb24/mongodb.conf will need to be followed by creating a symlink /opt/rh/mongodb24/root/etc/mongodb.conf, which should point to the new configuration file location.

To move the database files, you need to create a directory for them with the proper owner and privileges, such as /var/opt/rh/mongodb24/lib/mongodb, and change the `dbpath` configuration option in the configuration file mentioned above to point to it.

Please keep in mind that this configuration is not currently supported and you should use it at your own risk only.
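The move-and-symlink step for the configuration file could be scripted roughly as follows. This is only a sketch of the idea with a hypothetical helper name; it is demonstrated on a temporary directory instead of the real /opt and /etc/opt paths, and it does not deal with file ownership or with the database directory.

```python
import os
import shutil
import tempfile


def relocate_with_symlink(old_path, new_path):
    """Move a file to new_path and leave a symlink at old_path pointing to it."""
    new_dir = os.path.dirname(new_path)
    if not os.path.isdir(new_dir):
        os.makedirs(new_dir)
    shutil.move(old_path, new_path)
    # the daemon keeps opening old_path, which now resolves to new_path
    os.symlink(new_path, old_path)


# Demonstration in a scratch directory; the layout only imitates the real one.
scratch = tempfile.mkdtemp()
old = os.path.join(scratch, "opt", "mongodb.conf")
new = os.path.join(scratch, "etc-opt", "mongodb.conf")
os.makedirs(os.path.dirname(old))
with open(old, "w") as handle:
    handle.write("port = 27017\n")

relocate_with_symlink(old, new)
print(os.path.islink(old), os.path.realpath(old) == os.path.realpath(new))
```

The symlink has to be created while /opt is still writable; only afterwards can the directory be made read-only.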

What about the MongoDB shell?

The MongoDB shell is used either to manipulate data on a server quickly or to change the daemon’s settings. As already mentioned, the shell is not installed with the MongoDB24 collection by default, so the `mongodb24-mongodb` package has to be installed manually.

The mongo binary, unlike the server, must be run in the software collection environment. That means you need to run `scl enable mongodb24 'mongo --help'` to see the mongo help, for example. Since the data objects are created on demand, the following command set will create a database called `space` with a collection called `planets` (ensure the server has been started already):

#> service mongodb24-mongodb start
 $> scl enable mongodb24 mongo <<EOF
 > use space
 > db.planets.insert({'name': 'Mercury'})
 > EOF

After that, you may check which files have been created in the configured dbpath (which is by default /opt/rh/mongodb24/root/var/lib/mongodb). You should also see some results after querying the data using the mongo shell:

$> scl enable mongodb24 'mongo' <<EOF
 > use space
 > db.planets.findOne()
 > EOF
 { "_id" : ObjectId("5386ff4cb0f217c50926d9fd"), "name" : "Mercury" }

Simple application using python33 and mongodb24 software collections

As mentioned above, connectors for MongoDB do not need any library from the MongoDB collection, and the drivers themselves are included in the languages’ collections. In the case of the python33 software collection, it is `python33-python-pymongo`. From the Python driver’s point of view, it makes no difference whether it communicates with MongoDB packaged as a software collection or not. What matters is whether the port (27017 by default) is ready to accept connections and responds properly. Thus, we do not need to enable the mongodb24 collection when we run the client application in Python, but we must enable the python33 software collection in our scenario.
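Whether the port accepts connections can be checked without any MongoDB library at all. The sketch below uses only the Python standard library; the function name is hypothetical, and it tests just the TCP connection, not whether the server answers MongoDB queries properly.

```python
import socket


def port_is_open(host="localhost", port=27017, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # refused, unreachable, or timed out
        return False
    sock.close()
    return True


print("MongoDB port reachable:", port_is_open())
```

If this prints `False` although the service has been started, the log file mentioned earlier is a good place to look next.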

Now, let’s create a simple Python 3.3 script that fetches a specified planet from the MongoDB database and adds a diameter to that planet. Create a Python script `add_diameter.py` with the following content:

#!/usr/bin/env python
import pymongo
import sys

try:
    planet_name = sys.argv[1]
    planet_diameter = int(sys.argv[2])
except (IndexError, ValueError):
    print("Usage: add_diameter.py name diameter")
    sys.exit(1)

# connect to the local server on the default port
connection = pymongo.MongoClient("mongodb://localhost")
db = connection.space

# get the object from DB or create a new one
planet = db.planets.find_one({"name": planet_name})
if not planet:
    planet = {"name": planet_name, "diameter": planet_diameter}
else:
    planet["diameter"] = planet_diameter

# store the object; save() inserts a new document or updates an existing one
if db.planets.save(planet):
    print("Planet %s with diameter %s saved." % (planet_name, planet_diameter))
    sys.exit(0)
else:
    print("Saving did not work.")
    sys.exit(1)

We used `#!/usr/bin/env python` here, which may be good for testing the script in various environments. For production, you should either let the shebang be generated by the `setuptools` module or use the full path to the appropriate binary; in the case of python33, it would be `#!/opt/rh/python33/root/usr/bin/python`.
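With the `env` shebang, the interpreter that actually runs the script depends on the current PATH. A small sketch like the following can report it; the helper name is made up, and it relies only on the fact stated earlier that collections install under /opt/rh.

```python
import sys


def interpreter_collection(executable, prefix="/opt/rh/"):
    """Return the collection an interpreter path belongs to, or None."""
    if not executable.startswith(prefix):
        return None
    # /opt/rh/<collection>/root/usr/bin/python -> <collection>
    return executable[len(prefix):].split("/", 1)[0] or None


print(sys.executable, interpreter_collection(sys.executable or ""))
```

Run under `scl enable python33`, the second value would be expected to be `python33`; run with the base system Python, it is `None`.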

Note that when we run the script without the `scl` command, it does not work:

$> chmod a+x ./add_diameter.py
$> ./add_diameter.py
Traceback (most recent call last):
  File "./add_diameter.py", line 2, in <module>
    import pymongo
ImportError: No module named pymongo

But as soon as we properly enable the python33 collection, it works as we want:

$> scl enable python33 './add_diameter.py Mercury 1300'

And we can check the database to see that the diameter has really been stored:

$> scl enable mongodb24 'mongo' <<EOF
 > use space
 > db.planets.findOne()
 > EOF
 { "_id" : ObjectId("5386ff4cb0f217c50926d9fd"), "diameter" : 1300, "name" : "Mercury" }

That’s all for now about the MongoDB24 and Python33 software collections. Please do not be afraid to try them, and stay tuned for more articles about software collections! Also, any feedback is welcome in the comments here.
