Towards Faster Ruby Hash Tables

Hash tables are an important part of dynamic programming languages. They are widely used because of their flexibility, and their performance is important for the overall performance of numerous programs. Ruby is not an exception. In brief, Ruby hash tables provide the following API:

  • insert an element with given key if it is not yet on the table or update the element value if it is on the table
  • delete an element with given key from the table
  • get the value of an element with given key if it is in the table
  • the shift operation (remove the earliest element inserted into the table)
  • traverse elements in their inclusion order, call a given function and depending on its return value, stop traversing or delete the current element and continue traversing
  • get the first N or all keys or values of elements in the table as an array
  • copy the table
  • clear the table

Since Ruby 1.9, traversing hash tables and shift operation should be done in the order elements were included. In other words, elements included earlier should be traversed first. The current implementation of Ruby hash tables is based on an old package written by Peter Moore, with modern murmurhash and siphash algorithms used for hashing Ruby values.

Moore’s original implementation has been changed a lot and the current Ruby hash tables have the following organization:

The typical Ruby hash table (very small hash tables have a different organization) has an array of bins, each bin containing a pointer to a list of table elements (entries, in the Ruby hash table terminology). The calculated hash is translated into the table bin by a modulo operation, which is actually a mask operation as the bin array is of the size of a power of 2.

As you can see, entries have 3 pointers: two for a doubly linked list of entries in their order of inclusion and one for a list of entries corresponding the same bin (in other words, for colliding entries). Using a single linked list for collisions saves memory but makes deleting arbitrary entries during traversal, not a trivial task. So the actual Ruby hash table API is even more complicated than what I described.

When a Ruby hash table’s load factor becomes too big, a new, larger bin array is created.

Modern processors have several levels of cache. Usually, the CPU reads one or a few lines of the cache from memory (or another level of cache). So CPU is much faster at reading data stored close to each other. The current implementation of Ruby hash tables does not fit well with modern processor cache organization, which requires better data locality for faster program speed.

So, how can we improve data locality?

First, we can remove the collision lists by switching to an open addressing hash table where each bin corresponds at most to one entry. Although open addressing needs more bins than the original separate chaining hash tables, we can afford this as we remove one pointer by switching to open addressing. As a result of this switch, we improve data locality for processing entries by their key.

The tendency has been to move from chaining hash tables to open addressing hash tables due to their better fit to modern CPU memory organizations. CPython recently made this switch. GCC has widely-used such hash tables internally for more than 15 years.

To improve data locality for processing entries by their inclusion order, we can remove the doubly linked list and put the entries into an array. It also removes two pointers, and the new entry becomes exactly two times smaller than the original one. That also decreases entry footprint and increases data locality. Removing the lists lets us avoid pointer chasing, a common problem that produces bad data locality.

So our proposed Ruby hash table has the following organization:

The entry array contains table entries in the same order as they were inserted. When the first entry is deleted, a variable containing the index of the current first entry (start) is incremented. In all other cases of deletion, we just mark the entry as deleted by using a reserved hash value.

Bins provide access to entries by their keys. The key hash is mapped to a bin containing the index of the corresponding entry in the entry array (using an index instead of a pointer can be memory saving when we use 8-, 16- or 32-bit indices for reasonable size tables on 64-bit machines).

The size of the array bin is always a power of two, which it makes mapping very fast by using the corresponding lower bits of the hash. Generally, it is not a good idea to ignore some part of the hash. But the alternative approach is worse. For example, we could use a modulo operation for mapping and a prime number for the size of bin array. Unfortunately, the modulo operation for large 64-bit numbers is extremely slow (it takes more 100 cycles on modern Intel CPUs).

Still, the remaining bits of the hash value can be used when the mapping results in a collision. That is also an improvement in comparison with the original tables. We use a secondary hash value function analogous to one in the new CPython hash table implementation. The secondary hash value function is a function of the collision bin index and original hash value. This choice of function guarantees that we can traverse all bins and finally find the corresponding bin, as after several iterations the function becomes X_next = (5 * X_prev + 1) mod pow_of_2, which is a full cycle linear congruential generator since it satisfies the requirements of the Hull-Dobell theorem. By the way, I suspect that the function lookdict in CPython might cycle in extremely rare cases, as X_prev in the function can be more than the size of the table.

When an entry is removed from the table, besides marking the slot as deleted using a special hash value, we also mark the bin by a special value in order to find entries, which had a collision with the removed entries.

There are two reserved values for bins. One denotes an empty bin, while the other one denotes a bin containing a deleted entry.

The length of the bin array is always twice the entry array length. This maintains a healthy table load factor. To improve table memory footprint, we use 8, 16, 32-bits indices on a 64-bit system for smaller tables. As bins are a smaller part of the all table footprint for small tables, we could even increase the bin array two times to decrease the load factor and the collision number even more. But currently, this approach is not used.

The trigger of rebuilding the table is always the case when we cannot insert an entry at the end of the entry array. As opposed to the original hash tables, new arrays can be smaller after rebuilding. This can happen when we remove a lot of entries.

Table rebuilding is done by creating new array entries and bins of appropriate size. We could try to reuse the arrays in some cases by moving non-deleted entries to the start of the array. But this makes little sense, as the most expensive part is entry moves and such an approach just complicates the implementation.

Analysis of the hash table usage in Ruby on Rails done by Koichi Sasada shows that the majority of hash tables are small. We prevent allocations of bins and use a linear search for very small tables less than 5 entries. It saves memory and makes an access to the entries by keys faster for such small tables by saving time to maintain and work with data in the bins.

The original Ruby hash tables use the same approach for tables of less than 7 entries but the implementation is more complicated as it needs a different table representation.

Here is the performance comparison of original and proposed hash tables on 27 MRI hash table benchmarks. We use two major targets for this: x86-64 and ARM.

The average performance improvement on Intel Haswell for the proposed hash tables is about 46%. On ARM, it is even bigger 59%. The proposed implementation also consumes less memory in average than the old one. The below graph shows memory consumption in bytes on x86-64 depending on the number of entries in the hash table.

The proposed implementation is also more straightforward. It permits to simplify the Ruby hash table API in future by avoiding of the original hash tables’ problems with deleting entries during traversal.

Although the new design seems simple, achieving acceptable results required a lot of testing of different variants of the implementation. That is how performance work is usually done.

As an example of such work, one variant of the new hash table implementation had a faster solution to solve the problem of possible denial attacks based on hash collisions:

  • When a new table entry was inserted, we just counted collisions with entries with the same hash (more accurately a part of the hash) but different keys and when some threshold was achieved, we rebuilt the table and started to use a crypto-level hash function. In reality, the crypto-level function would be never used until someone really tries to do the denial attack.
  • Such approach permitted to use faster non-crypto level hash functions in the majority of cases. It also would permit easily to switch to the usage of other slower crypto-level functions (without losing speed in real world scenario), e.g. sha-2 or sha-3. The cryptographic function siphash currently used in MRI is pretty a new function and it is not time-tested yet as older functions.

Although it is not documented, Ruby function hash should be the same as hash function used inside the hash tables and its value should be always the same for an object during one MRI run. Therefore, this approach did not survive the final variant of the hash tables.

It took 8 months from submitting the initial code to committing it into MRI repository (please see issue 12142). Now the new hash table implementation is a part of MRI 2.4.

The hash table implementation discussion contains more 170 posts. Reading the posts can be quite educative and can be a good case of studying open source development aspects and issues.

On a side note, I’d like to thank my managers for permission to invest part of my time to work on Ruby performance in addition to my major responsibilities.


Join Red Hat Developers, a developer program for you to learn, share, and code faster – and get access to Red Hat software for your development.  The developer program and software are both free!


Memory Error Detection Using GCC


GCC has a rich set of features designed to help detect many kinds of programming errors. Of particular interest are those that corrupt the memory of a running program and, in some cases, makes it vulnerable to security threats. Since 2006, GCC has provided a solution to detect and prevent a subset of buffer overflows in C and C++ programs. Although it is based on compiler technology, it’s best known under the name Fortify Source derived from the synonymous GNU C Library macro that controls the feature: _FORTIFY_SOURCE. GCC has changed and improved considerably since its 4.1 release in 2006, and with its ability to detect these sorts of errors. GCC 7, in particular, contains a number of enhancements that help detect several new kinds of programming errors in this area. This article provides a brief overview of these new features. For a comprehensive list of all major improvements in GCC 7, please see GCC 7 Changes document.

Continue reading “Memory Error Detection Using GCC”

Join Red Hat Developers, a developer program for you to learn, share, and code faster – and get access to Red Hat software for your development.  The developer program and software are both free!


Testing… Testing… GCC

The next release of the GNU Compiler CollectionGCC 7, is fast approaching, so in this post, I’m going to talk about work I’ve done to make GCC more reliable

Continue reading “Testing… Testing… GCC”

Join Red Hat Developers, a developer program for you to learn, share, and code faster – and get access to Red Hat software for your development.  The developer program and software are both free!


PowerShell on RHEL in One Minute

While not specifically related to .NET on Linux, PowerShell on Linux is available and — let’s face it — if you’re a Windows developer you’re using PowerShell.

If you’re not using PowerShell, now is the time to start. While bash is the traditional Linux shell, PowerShell gives you the advantage of objects. In PowerShell, everything is an object, with properties you can directly access. It’s also a very powerful object-oriented scripting language, with classes and methods, much like any OOP language.

Add to that the fact that you now have one scripting language for any platform, and PowerShell may (should in my not-so-humble opinion) become your shell and scripting language of choice.

(Hint: If you aren’t using PowerShell, here is your opportunity to turn your coding skills up to 11.)

Continue reading “PowerShell on RHEL in One Minute”

Join Red Hat Developers, a developer program for you to learn, share, and code faster – and get access to Red Hat software for your development.  The developer program and software are both free!


Creating your first ASP.NET MVC web site on RHEL

Follow this blog post, and within minutes you will have an ASP.NET MVC website running on Red Hat Enterprise Linux (RHEL). Yes, I’m talking to you, Windows .NET developer; you’re about to double your OS skillset. Let’s do this.

Continue reading “Creating your first ASP.NET MVC web site on RHEL”

Join Red Hat Developers, a developer program for you to learn, share, and code faster – and get access to Red Hat software for your development.  The developer program and software are both free!


Take advantage of your Red Hat Developers membership and download RHEL today at no cost.

Adding buffer overflow detection to string functions

This article describes the steps required to add buffer overflow protection to string functions. As a real-world example, we use the strlcpy function, which is implemented in the libbsd library on some GNU/Linux systems.

This kind of buffer overflow protection uses a GNU Compiler Collection (GCC) feature for array size tracking (“source fortification”), accessed through the __builtin_object_size GCC built-in function. In general, these checks are added in a size-checking wrapper function around the original (wrapped) function, which is strlcpy in our example.

Continue reading “Adding buffer overflow detection to string functions”

Join Red Hat Developers, a developer program for you to learn, share, and code faster – and get access to Red Hat software for your development.  The developer program and software are both free!


Coala, setting it up and auto patching

Coala is a free and open-source language-independent analysis toolkit, written in Python. The primary goal of coala is to make it easier for developers to create rules, which a project’s code should conform to the developer-defined rules. It has support for more than 40+ programming languages and is best for people who want their code to look good and follow good coding practices. It’s been developed at

Continue reading “Coala, setting it up and auto patching”

Join Red Hat Developers, a developer program for you to learn, share, and code faster – and get access to Red Hat software for your development.  The developer program and software are both free!


October 2016 ISO C Meeting Report

Trip Report: October 2016 WG14 Meeting

In October 2016, I attended the WG14 (C language committee) meeting in Pittsburgh, Pennsylvania. The meeting was hosted by the Computer Emergency Response Team (CERT) at the Software Engineering Institute (SEI) at Carnegie Mellon University (CMU). We had 25 representatives from 18 organizations in attendance, including CERT, Cisco, IBM, INRIA, Intel, LDRA, Oracle, Perennial, Plum Hall, Siemens, and the University of Cambridge. It was a productive four days spent on two major areas:

  • Work on C11 defect reports aimed at the upcoming C11 Technical Corrigendum (TC) expected to be finalized in 2017. This will be the last revision of C11 to be published. The next revision of C will be a “major” version that is for the time being referred to as C2X.
  • Review of proposals for the next revision of C, C2X. To meet the TC 2017 schedule some C11 defects will have to be deferred to C2X. The C2X charter is in N2086.

Below is a list of some of the interesting C2X proposals the group discussed.

Continue reading “October 2016 ISO C Meeting Report”

Join Red Hat Developers, a developer program for you to learn, share, and code faster – and get access to Red Hat software for your development.  The developer program and software are both free!


Running Spark Jobs On OpenShift


A feature of OpenShift is jobs and today I will be explaining how you can use jobs to run your spark machine, learning data science applications against Spark running on OpenShift.  You can run jobs as a batch or scheduled, which provides cron like functionality. If jobs fail, by default OpenShift will retry the job creation again. At the end of this article, I have a video demonstration of running spark jobs from OpenShift templates against Spark running on OpenShift v3.

Continue reading “Running Spark Jobs On OpenShift”

Join Red Hat Developers, a developer program for you to learn, share, and code faster – and get access to Red Hat software for your development.  The developer program and software are both free!


For more information about Red Hat OpenShift and other related topics, visit: OpenShift, OpenShift Online.

Data Encapsulation vs. Immutability in Javascript

A while ago, I wrote a fairly long post attempting to shed some light on a few things you can do in your JavaScript classes to enforce the concept of data encapsulation – or data “hiding”. But as soon as I posted it, I got some flak from a friend who is a Clojure programmer. His first comment about the article was this.

Mutability and data encapsulation are fundamentally at odds.

Eventually, he walked that back – but only just a little bit. His point, though, was intriguing. I asked him to explain what he meant.

Why is it so wrong to return the id in your example? I’m guessing it’s not. It might be darn useful to fetch it. In fact, it might greatly enhance the data model for it to be there. But you feel you must “hide” it. Why? Because it’s mutable or because you must go to great lengths to make it immutable. Because JavaScript. But if you were returning an immutable data structure, you wouldn’t even think about it. All that stress just falls away; you no longer care about hiding your data or encapsulating it. You only care that it’s correct and that it properly conveys the essential complexity of your system.

We’ll ignore his little dig on the language itself, for now. But maybe what he’s saying has some value. I do like the idea of a bunch of “stress just falling away”. Let’s look at where we ended up in that last post about data encapsulation.

const ID = Symbol
class Product {
  constructor (name) { = name;
    this[ID] = 2340847;
  related () {
    return lookupRelatedStuff( this[ID] );

So, here we’ve done our best to hide the id property using a Symbol as a property key. It’s not accessible within userland, and it’s barely visible unless you know about Reflect.ownKeys() or Object.getOwnPropertySymbols(). And of course, I never mentioned the name property in the last article. But the truth is, it suffers from the same issues that plague the id property. It really shouldn’t change. But to accomplish that, I have to replace every with this[NAME] using a Symbol for the property key. And as my friend said, these properties are arguably useful in userland. I just don’t want them changed. I want immutability. How can I do this using JavaScript?

Is it cold in here, or is it just me?

Object.freeze() is nothing new. It’s been around forever. Let’s take a look at how we’d use it to make our Product instances immutable.

class Product {
  constructor (name) { = name; = 2340847;
    // make this instance immutable
const widget = new Product
// Setting the name to something else has no effect. = something-else; // lta-widget

There now. That wasn’t so hard, was it? We give a Product instance the deep freeze and return it. What about those situations where you really need to mutate your application state. What if, for example, there’s a price that could change over time? Normally, we’d do something super simple. Like just update the price.

this.price = getUpdatedPrice(this);

But of course, if we’re going for immutability and the safety that comes along with that, then this is clearly not the correct approach. We are mutating the Product instance when we do this.price = someValue(). What can we do about it? One strategy might be to use Object.assign() to copy properties from one object to another, always generating a new object for every data mutation. Perhaps something like this.

class Product {
  updatePrice () {
    // check DB to see if price has changed
    return Object.assign(new Product(), this, { price: getNewPrice(this) } );

Now we are getting somewhere. We can use Object.freeze() to make our objects immutable, and then Object.assign() to generate a new object using existing properties whenever something needs to be mutated. Let’s see how well this works.

TypeError: Cannot assign to read only property price of object
    at repl:1:23
    at sigintHandlersWrap (vm.js:22:35)
    at sigintHandlersWrap (vm.js:96:12)
    at ContextifyScript.Script.runInThisContext (vm.js:21:12)
    at REPLServer.defaultEval (repl.js:313:29)
    at bound (domain.js:280:14)
    at REPLServer.runBound [as eval] (domain.js:293:12)
    at REPLServer. (repl.js:513:10)
    at emitOne (events.js:101:20)
    at REPLServer.emit (events.js:188:7)

Ugh! This is happening because I’ve got new Product() as the first parameter to the Object.assign() call, and once a Product is constructed, it’s frozen. I need to defer freezing the object until after it’s constructed. I could use a factory function to return frozen instances of Product. But really, why do I need the Product data type at all? Wouldn’t a simple Object be fine? For the sake of simplification and experimentation, let’s give it a shot.

// Use a factory function to return plain old JS objects
const productFactory = (name, price) = Object.freeze({ name, price });

// Always bump the price by 4%! 🙂
const updatePrice = (product) =gt Object.freeze(
      Object.assign({}, product, { price: product.price * 1.04 }));

const widget = productFactory(Acme Widget 1.00)
// ={ name: Acme Widget, price: 1 }

const updatedWidget = updatePrice(widget);
// ={ name: Acme Widget, price: 1.04 }

// = { name: Acme Widget, price: 1 }

Lingering doubts

I still have doubts, though. For one thing, making a new instance for every change seems pretty inefficient, doesn’t it? And for another, what happens when my data model has nested objects as properties? Do I have to freeze those as well? It turns out, yes I do. All of the properties on my product object are immutable. But properties of nested objects can be changed. That freeze doesn’t go very deep. Maybe I can fix that by just freezing the nested objects.

const productFactory = (name, price) =
    metadata: Object.freeze({
      manufacturer: name.split()[0]

Well, that’s OK, perhaps. But there is still a problem here. Can you tell what it is? What if my data model is nested several layers deep? That’s not very uncommon, and now my factory ends up looking something like this.

const productFactory = (name, price) =
    metadata: Object.freeze({
      manufacturer: name.split()[0],
      region: Object.freeze({
        country: Denmark
        address: Object.freeze({
          street: HCA Way
          city: Copenhagen

Ugh! This can start to get ugly real fast. And we haven’t even started to discuss collections of objects, like Arrays. Maybe my friend was right. Maybe this is a language issue.

You feel you must “hide” it. Why? Because it’s mutable or because you must go to great lengths to make it immutable. Because JavaScript.

OK, so is this it? Should I just throw in the towel and give up on immutability in my JavaScript applications? After all, I’ve gone this far without it. And I didn’t have that many bugs. Really… I promise! Well, if you want, to embrace this style fully is to write your application in Clojure or Scala or a similarly designed language where data is immutable. This is a fundamental part of the Clojure language. Instead of spending all of your time reading blog posts about fitting a square peg into a round hole, with Clojure you can just focus on writing your application and be done with it. But maybe that’s not an option. Maybe you’ve got to follow company language standards. And anyway, some of us kind of do like writing code in JavaScript, so let’s, for the sake of argument, take a look at some options. But first, let’s just review why we’re going to all of this trouble.

The case for immutability

So much of what makes software development hard (other than cache invalidation, and naming) has to do with state maintenance. Did an object change state? Does that mean that other objects need to know about it? How do we propagate that state across our system? objects, if we shift our thinking about data so that everything is simply a value, then there is no state maintenance to worry about. Don’t think of references to these values as variables. It’s just a reference to a single, unchanging value. But this shift in thinking must also affect how we structure and think about our code. Really, we need to start thinking more like a functional programmer. Any function that mutates data, should receive an input value, and return a new output value – without changing the input. When you think about it, this constraint pretty much eliminates the need for the classthis. Or at least it eliminates the use of any data type that can modify itself in the traditional sense, for example with an instance method. In this worldview, the only use for class is namespacing your functions by making them static. But to me, that seems a little weird. Wouldn’t it just be easier to stick to native data types? Especially since the module system effectively provides namespacing for us. Exports are namespaced by whatever name we choose to bind them to when require() file.


const factory = (name, price) = Object.freeze({ name, price });

const updatePrice = (product) = Object.freeze(
  Object.assign({}, product, { price: product.price * 1.04 }));

module.exports = exports = { factory, updatePrice };


const Product = require(/product.js&);
Product.factory; // = [Function: factory]
Product.updatePrice; // = [Function: updatePrice]

For now, just keep these few things in mind.

  • Think of variables (or preferably consts) as values not objects. A value cannot be changed, while objects can be.
  • Avoid the use of class and this. Use only native data types, and if you must use a class, don’t ever modify its internal properties in place.
  • Never mutate native type data in place, functions that alter the application state should always return a copy with new values.

That seems like a lot of extra work

Yeah, it is a lot of extra work, and as I noted earlier, it sure seems inefficient to make a full copy of your objects every time you need to change a value. Truthfully, to do this properly, you need to be using shared persistent data structures which employ techniques such as hash map tries and vector tries to efficiently avoid deep copying. This stuff is hard, and you probably don’t want to roll your own. I know I don’t.

Someone else has already done it

Facebook has released a popular NPM module called, strangely enough,immutable. By employing the techniques above, immutable takes care of the hard stuff for you, and provides an efficient implementation of

A mutative API, which does not update the data in-place, but instead always yields new updated data.

Rather than turning this post into an immutable module tutorial, I will just show you how it might apply to our example data model. The immutable module has a number of different data types. Since we’ve already seen our Product model as a plain old JavaScript Object, it probably makes the most sense to use the Map data type from immutable. product.js

const Immutable = require(immutable);
const factory = (name, price) =Immutable.Map({name, price});
module.exports = exports = { factory };

That’s it. Pretty simple, right? We don’t need an updatePrice function, since we can just use set(), and Immutable.Map handles the creation of a new reference. Check out some example usage. app.js

const Product = require(/product.js);

const widget = Product.factory(Acme widget, 1.00);
const priceyWidget = widget.set(price, 1.04);
const clonedWidget = priceyWidget;
const anotherWidget = clonedWidget.set(price, 1.04);

console.log(widget); // = Map {name: 1 }
console.log(priceyWidget); // = Map {Acme widget: 1.04 }
console.log(clonedWidget); // = Map { Acme widget: 1.04 }
console.log(anotherWidget); // = Map { Acme widget: 1.04 }

Things to take note of here: first, take a look at how we are creating the priceyWidget reference. We use the return value from widget.set(), which oddly enough, doesn’t actually change the widget reference. Also, I’ve cloned priceyWidget. To create a clone we just need to assign one reference to another. And then, finally, an equivalent value for price is set on clonedWidget to create yet another value.

Value comparisons

Let’s see how equality works with these values.

// everything but has a price of 1.04
// so is not equivalent to any of them
assert(widget !== priceyWidget);
assert(widget !== clonedWidget);

This makes intuitive sense. We create a widget and when we change a property, the return value of the mutative function provides us with a new value that is not equivalent as either a reference or value. Additional references to the new value instance priceyWidget are also not equivalent. But what about comparisons between priceyWidget and its clone. Or priceyWidget and a mutated version of the clone that actually contains all of the same property values. Whether we are comparing references with === or using the deep Map.equals, we find that equivalence holds. How cool is that?

// priceyWidget is equivalent to its clone
assert(priceyWidget === clonedWidget);

// Its also equivalent to another, modified value
// because, unlike setting a new value for 
// to create this modification didnt
// actually change the value.
assert(priceyWidget === anotherWidget);

This is just the beginning

When I started writing this post, it was primarily as a learning experience for me. My friend’s friendly jab got me interested in learning about immutable data in JavaScript, and how to apply these techniques to my own code. What I really learned is that, while immutable systems have benefits, there are many hurdles to jump through when writing code this way in JavaScript. Using a high-quality package like immutable.js is a good way to address these complexities. I don’t think I will immediately change all of my existing packages to use these techniques. Now I have a new tool in my toolbox, and this exploration has opened my eyes to the benefits of thinking about data in new ways. If any of this has peaked your interest, I encourage you to read further. Topics such as nested data structures, merging data from multiple values, and collections are all worth exploring. Below, you’ll find links for additional reading.

  • immutable.js documentation:
  • Persistent data structures:
  • Hash map tries:
  • Vector tries:

Join Red Hat Developers, a developer program for you to learn, share, and code faster – and get access to Red Hat software for your development.  The developer program and software are both free!