
Apache ActiveMQ and Apache Artemis (or ActiveMQ Artemis) are open source message brokers with similar functionality. Both implementations are venerable, with histories that go back to the early 2000s. However, Artemis is in some senses a more modern implementation because, ironically, it has a smaller feature set. This makes Artemis easier to maintain, which is important if you're basing a commercial product on it. The smaller feature set means a smaller overall implementation, which fits well with developing microservices.

Early versions of Red Hat AMQ were based on ActiveMQ, but attention has shifted to Artemis in AMQ 7. ActiveMQ is not maintained as vigorously as it once was by the open source community, but at the time of writing, Amazon is still offering a message broker service based on ActiveMQ. Whether it has a long-term future, at Amazon or elsewhere, remains to be seen.

Leaving aside ActiveMQ's complex niche features (such as message routing based on Apache Camel rules), ActiveMQ and Artemis look similar to the integrator and, in most practical applications, provide comparable throughput. However, they differ in important areas. Message distribution in the presence of multiple active brokers causes particular problems for integrators who want to move from ActiveMQ to Artemis.

This article describes subtleties that can lead to lost messages in an Artemis active-active mesh. That architecture consists of multiple message brokers interconnected in a mesh, each broker with its own message storage, where all are simultaneously accepting messages from publishers and distributing them to subscribers. ActiveMQ and Artemis use different policies for message distribution. I will explain the differences and show a few ways to make Artemis work more like ActiveMQ in an active-active scenario.

Configuring the Artemis broker mesh

For simplicity, I'm assuming that the brokers in the mesh have network names like broker1, broker2, etc., and that each listens for all messaging protocols on port 61616 (this is the default for Artemis as well as ActiveMQ). The setup I describe below is for broker1, but there is a high degree of symmetry between the brokers, so it isn't hard to work out the other broker settings.

When creating a new broker, the usual approach is to run artemis create brokerXXX to create an outline configuration. I'm assuming that you have done this initial configuration, and so only mesh-related configuration has to be added to etc/broker.xml.
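For example, a typical invocation to create an instance named broker1 might look like this. The install path and credentials are placeholders; adjust them for your environment:

```shell
# Create an outline broker instance named broker1.
# $ARTEMIS_HOME, the user name, and the password are illustrative.
$ARTEMIS_HOME/bin/artemis create broker1 \
    --user admin --password admin \
    --allow-anonymous
```

This generates the instance directory, including the etc/broker.xml file that the rest of this article modifies.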

The acceptor definition

Every Artemis broker has at least one acceptor definition that defines the TCP port and the protocols that will be accepted on that port. There's probably nothing different about this definition in a broker mesh, compared to a standalone broker. Here's an example, for a broker that accepts all wire protocols on port 61616:

<acceptor name="artemis">tcp://0.0.0.0:61616?
   protocols=CORE,AMQP,STOMP,HORNETQ,MQTT,OPENWIRE</acceptor>

In practice, an acceptor that handles multiple protocols will probably have a lot of additional configuration, but that's not really relevant here. In any case, the instance-creation step will already have created an outline entry. You'll need to change it only if you want a specific configuration, such as using different network interfaces for client and interbroker communication.
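For example, to carry interbroker traffic on a separate internal interface, you could add a second acceptor alongside the default one. Cluster connections between Artemis brokers use the CORE protocol, so that is all the interbroker acceptor needs to accept. The address and port below are illustrative:

```xml
<!-- Interbroker acceptor bound to an internal interface.
     The address and port are illustrative; cluster connections
     between Artemis brokers use the CORE protocol. -->
<acceptor name="cluster">tcp://10.0.0.1:61617?protocols=CORE</acceptor>
```

With this arrangement, the self-referential connector described in the next section (and the corresponding connectors on the other brokers) would reference this internal interface and port rather than the client-facing one.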

The connectors

Next, we need to define connectors. These are equivalent to the network connector definitions in ActiveMQ, but there is one significant difference: With Artemis, we usually define the broker itself as a connector. Here is an example:

  <connectors>
    <connector name="myself">tcp://broker1:61616</connector>
    <connector name="broker2">tcp://broker2:61616</connector>
    <connector name="broker3">tcp://broker3:61616</connector>
  </connectors>

The first entry, myself, denotes the current broker with its hostname and port. Subsequent entries define the other brokers in the mesh. For symmetry, I could have given the self-referential connector the name broker1, to match the other brokers that follow. This naming approach may be useful if you have a large mesh and you want to cut and paste your configuration from one broker to another. However, sometimes it is clearer to make the self-referential connector stand out in some way. In any case, the important point is to define connectors for every broker in the mesh, including this one.

The broker mesh

The final vital piece of configuration assembles the various broker connectors into a mesh. Artemis provides various discovery mechanisms by which brokers can find one another on the network. However, if you're more familiar with ActiveMQ, you're probably used to specifying the mesh members explicitly. The following example shows how to do that, for the connectors listed in the configuration just shown. Note that I'm referring to this broker itself as myself, to match the previous connector definition. It would be a mistake to list the current broker among the static connectors, which is why I prefer to give the self-referential connector a distinctive name.

  <cluster-connections>
    <cluster-connection name="my_mesh">
      <connector-ref>myself</connector-ref>
      <message-load-balancing>ON_DEMAND</message-load-balancing>
      <static-connectors>
        <connector-ref>broker2</connector-ref>
        <connector-ref>broker3</connector-ref>
      </static-connectors>
    </cluster-connection>
  </cluster-connections>

Note: I'll have more to say about message-load-balancing later.

You'll probably want to configure your clients to know about the mesh, as well. Again, Artemis provides a number of discovery mechanisms, allowing clients to determine the network topology without additional configuration. These don't work with all wire protocols (notably, there is no discovery mechanism for AMQP, the Advanced Message Queuing Protocol), and ActiveMQ users are probably more familiar with configuring the client's connection targets explicitly. The usual mechanism is to list all the brokers in the mesh in the client's connection URI.
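For the three brokers in this example, the connection URIs might look like the following. The parameter names shown are the commonly used ones for each client; check your client version's documentation before relying on them:

```
# Artemis Core JMS client: comma-separated broker list in parentheses
(tcp://broker1:61616,tcp://broker2:61616,tcp://broker3:61616)?reconnectAttempts=-1

# ActiveMQ (OpenWire) JMS client: failover transport
failover:(tcp://broker1:61616,tcp://broker2:61616,tcp://broker3:61616)
```

In both cases, the client tries the listed brokers in turn until a connection succeeds, which is the explicit-configuration equivalent of topology discovery.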

Why the Artemis configuration isn't (yet) like ActiveMQ

With the configuration in the previous section, you should have a working mesh. That is, you should be able to connect consumers to all the nodes, produce messages to any node, and have them routed to the appropriate consumer. However, this mesh won't behave exactly like ActiveMQ, because Artemis mesh operation is not governed by client demand.

Forwarding behavior

In ActiveMQ, network connectors are described as "demand forwarding." This means that messages are accepted on a particular broker and remain there until a particular client requests them. If there are no clients for a particular queue, messages remain on the original broker until that situation changes.

On Artemis, forwarding behavior is controlled by the brokers, and is only loosely associated with client load. In the previous section's configuration, I set message-load-balancing=ON_DEMAND. This instructs the brokers not to forward messages for specific queues to brokers where there are, at present, no consumers for those queues. So if there are no consumers connected at all, the routing behavior is similar to that of ActiveMQ: Messages will accumulate on the broker that originally received them. If I had set message-load-balancing=STRICT, the receiving broker would have divided the messages evenly among the brokers that define that queue. With this configuration, the presence or absence of clients should be irrelevant ... except it isn't quite that simple, and the complications are sometimes important.

How the message queue is defined

Even with STRICT load balancing, brokers won't forward messages to other brokers that don't know about the queue. If queues are administratively defined, all brokers know about all queues and accept messages for them in STRICT mode. If the queues are auto-created by clients, and there are no clients for a specific queue, a producer on broker1 could send a message for a queue that was not known on broker2. As a result, messages would never be forwarded. In short: It makes a difference whether a queue is defined administratively or auto-created. There is no such difference in message distribution in ActiveMQ, because it is driven by client demand.
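To define a queue administratively, so that every broker knows about it regardless of client activity, add it to the addresses section of each broker's etc/broker.xml. A minimal anycast (point-to-point) definition looks like this; the queue name is illustrative:

```xml
<!-- An administratively defined point-to-point queue.
     Defining it on every broker in the mesh ensures each broker
     will accept and forward messages for it, even before any
     client has connected. The name "orders" is an example. -->
<addresses>
  <address name="orders">
    <anycast>
      <queue name="orders"/>
    </anycast>
  </address>
</addresses>
```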

Even with ON_DEMAND load balancing, Artemis's behavior is not the same as ActiveMQ's. A particular difference is that message distribution decisions are made when the message arrives. It is at that point that the broker sees what clients are connected and routes the message as it deems appropriate. If there are no clients for a specific queue at that time, the message will not be routed.

What this means is that if a client that is connected to broker1 goes down for some reason, and then reconnects, it will not receive any of the messages that came in the meantime. Even if there are no other clients for that queue on any other broker, the message will not be routed from its original location. It's too late—the routing decision has already been made.

This is a particular problem for broker installations that are behind a load balancer or similar proxy. There's usually no way of knowing which broker a client will ultimately connect to because the load balancer will make that decision. But if a client has the bad fortune to get connected to a broker that has never hosted that client before, no messages that came earlier will be routed to it, even if it subscribes to a queue that has messages on some other broker. To fix this problem, we need message redistribution.

Message redistribution in Artemis

ActiveMQ has no need for a message redistribution mechanism, because all message flows over the network connectors are coordinated by client demand. As we've seen, this is not the case for Artemis, where all message distribution is controlled by the brokers. In the usual run of events, distribution decisions are made when messages arrive, and they are irrevocable.

Artemis does have a way to redistribute messages after that point, but it is not enabled by default. The relevant setting is made on a specific address, or group of addresses, like this:

  <address-setting match="#">
    <redistribution-delay>1000</redistribution-delay>
    ...
  </address-setting>

The value supplied for redistribution-delay is in units of milliseconds. This value is the length of time for which a broker will leave messages on a specific address that has no consumer, before sending them somewhere else. The default value is -1, meaning "do not redistribute."

A redistribution delay of seconds or minutes, rather than milliseconds, probably creates less load on the broker and the network. In short, if you set an ON_DEMAND load-balancing policy, and enable message redistribution with a relatively short delay, the broker mesh will largely look to clients like an ActiveMQ mesh.
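Putting the two settings together, a configuration that approximates ActiveMQ's demand-forwarding behavior combines ON_DEMAND load balancing with a modest redistribution delay. The 30-second value below is a starting point for tuning, not a recommendation:

```xml
<!-- ON_DEMAND load balancing in the cluster connection... -->
<cluster-connection name="my_mesh">
  ...
  <message-load-balancing>ON_DEMAND</message-load-balancing>
  ...
</cluster-connection>

<!-- ...plus redistribution for queues left without consumers.
     30000 ms (30 s) is an illustrative starting point. -->
<address-settings>
  <address-setting match="#">
    <redistribution-delay>30000</redistribution-delay>
  </address-setting>
</address-settings>
```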

Why the Artemis configuration still isn't (exactly) like ActiveMQ's

We have started to solve the problem of lost messages on Artemis. There are a number of subtle differences between Artemis and ActiveMQ, however, and it's impossible to get exactly the same behavior that ActiveMQ implements.

Message selectors

A particular problem involves message selectors. If a client subscribes to a queue using a selector, it expects to receive only messages that match the selector. But what happens if different clients subscribe to the same queue, with different selectors, on different brokers? This is a rather specific problem, but it does come up. Artemis forwards messages according to whether there are consumers, not according to whether there are selectors. So there's every chance that messages will get forwarded to a broker whose consumers will not match the selector. These messages will never be consumed.

This isn't specifically an Artemis problem: Using selectors is somewhat problematic with a broker mesh, regardless of the implementation. Using selectors with a mesh isn't entirely robust on ActiveMQ, either: The broker has to maintain a "selector cache" to keep track of which selectors are active on which queues. Because it's impossible for the broker to know when clients come and go, the selector cache has to maintain tracking data for an extended period of time—perhaps indefinitely. This creates a memory burden, and as a result, there are different selector cache implementations available with different properties.

Artemis does not use selector caches, because it side-steps the issue of selector handling altogether. Unless your clients are configured to consume from all brokers concurrently (which isn't a bad idea in many applications), it's just not safe to use selectors.

Message grouping

There are a number of other broker features that don't work properly in a mesh, and don't work properly with ActiveMQ, either. The most troublesome is message grouping, which doesn't work at all in an Artemis mesh. It works partially with ActiveMQ, but isn't robust in the event of a client or broker outage. "Exclusive consumers" are also problematic on both brokers.

Recognizing the limitations described in this section, Red Hat is working on enhancements to Artemis that will allow brokers to re-route client connections to the brokers that are best placed to handle them. The work required is extensive because each of the various wire protocols that Artemis supports has its own way of dealing with dynamic load balancing.

Summary

In a broker mesh, Artemis uses a completely different strategy for message distribution than ActiveMQ does. Understanding how Artemis works in this respect should go a long way toward determining what changes need to be made to move from ActiveMQ to Artemis.

In particular, use the ON_DEMAND load-balancing policy, and be sure to enable message redistribution. Some tuning may be needed to find the best redistribution delay for a particular application.
