Comparing Beanstalkd, IronMQ and Amazon SQS

Bashkim Isai

Introduction

This article introduces the concept of message queues and discusses the strengths and weaknesses of three specific message queue services: Beanstalkd, IronMQ and Amazon SQS.

Any information described in this article is correct at the time of writing and is subject to change.

What are Message Queues?

Queues allow you to store metadata about jobs so that they can be processed later. They can aid in the development of SOA (service-oriented architecture) by providing the flexibility to defer tasks to separate processes. When applied correctly, queues can dramatically improve the user experience of a web site by reducing load times.

Advantages of message queues:

  • Asynchronous: Queue it now, run it later.
  • Decoupling: Separates application logic.
  • Resilience: Won't take down your whole application if part of it fails.
  • Redundancy: Can retry jobs if they fail.
  • Guarantees: Makes sure that jobs will be processed.
  • Scalable: Many workers can process individual jobs in a queue.
  • Profiling: Can aid in identifying performance issues.

Disadvantages of message queues:

  • Asynchronous: you have to wait until a job is complete before its result is available.
  • Load: each job in the queue must wait its turn before it can be processed. If one job overruns, it affects each subsequent job.
  • Architecture: the application needs to be designed with queues in mind.

Use cases of message queues:

Any time-consuming process can be placed in a queue:

  • Sending/receiving data from third-party APIs
  • Sending an e-mail
  • Generating reports
  • Running labour-intensive processes

You can also use queues in creative ways, for example locking jobs so that only one user can access information at a time.

Services

There are many services that you can use to implement message queues. This article outlines the differences between Beanstalkd, IronMQ and Amazon SQS.

Beanstalkd

Beanstalkd is "… a simple, fast work queue". It is released as open source under the MIT license. It's well documented, unit tested and can be downloaded for free to run on your own server. The architecture is borrowed from memcached and it is designed specifically to be a message queue.

An article on SitePoint by author Dave Kennedy called Giant Killing with Beanstalkd contains information on how to start using Beanstalkd with Ruby.

IronMQ

IronMQ is a hosted RESTful web service. There is a free tier for developers and many other subscription tiers for commercial applications.

SQS

Amazon SQS is an inexpensive hosted solution for implementing message queues. It comes as part of Amazon Web Services (AWS). Amazon offers a Free Tier for evaluating their web services which includes SQS.

Server setup

Beanstalkd | IronMQ | Amazon SQS
Self-hosted | Remotely hosted | Remotely hosted

Beanstalkd

Runs on Linux and Mac OS X. Read the installation instructions from the Beanstalkd website for details on how to get it working on your system. The Beanstalkd server does not work on Windows.

IronMQ and SQS

IronMQ and Amazon SQS are cloud-based web services. No applications need to be set up on your server; you simply need to sign up for an account and set up a queue.

Service Level Agreements (SLAs)

Beanstalkd | IronMQ | Amazon SQS
None | 99.95% per month | None

Beanstalkd

As Beanstalkd is a server you host, you are responsible for ensuring its availability.

IronMQ

Iron.IO has a Service Level Agreement with an uptime percentage of at least 99.95% during any monthly billing cycle. Their Pro Platinum package ($2450/month) has custom contract terms which include Service Level Agreements. They provide refunds in Service Credits.

SQS

Amazon does not have a specific Service Level Agreement for SQS. They do have Support Services available which can cover SQS at an extra cost.

Architecture

Beanstalkd | IronMQ | Amazon SQS
PUSH (sockets) | HTTP Web Service | HTTP Web Service

Beanstalkd

Communicates via PUSH sockets providing instant communication between providers and workers.

When a provider enqueues a job, a worker can reserve it immediately if it is connected and ready. Jobs are reserved until a worker has sent a response (delete, bury, etc.).

IronMQ

IronMQ is a hosted RESTful web service.

There is push-like support for IronMQ. A subscriber can be called whenever a provider enqueues a job to the queue. Generally you will want to use the standard RESTful service to enqueue and dequeue jobs instead of the push approach.

SQS

SQS is a hosted web service.

There is no push support for SQS. You must poll at regular intervals to check if there are jobs in the queue.

SQS supports long polling through the Receive Message Wait Time setting (default: 0 seconds, max: 20 seconds), which keeps a connection open while the worker waits for a job. This can mean fewer requests, at the cost of each connection being held open for longer.
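A minimal sketch of requesting a long poll, assuming the AWS SDK for PHP client ($sqs) from the SQS code sample in this article. The helper name long_poll_params is hypothetical. WaitTimeSeconds is the per-request parameter of the SQS ReceiveMessage call, and the helper simply clamps it to the 0 to 20 second range described above.

```php
<?php

// Hypothetical helper: builds the arguments for a long-polling receive call.
// The wait time is clamped to the range SQS accepts (0 to 20 seconds).
function long_poll_params($queue_url, $wait_seconds)
{
    $wait_seconds = max(0, min(20, (int) $wait_seconds));

    return array(
        'QueueUrl'        => $queue_url,
        'WaitTimeSeconds' => $wait_seconds,
    );
}

// Asking for 30 seconds is reduced to the 20 second maximum.
$params = long_poll_params('https://sqs.{region}.amazonaws.com/{id}/{queue_name}', 30);

// With a configured client this would be passed as:
// $result = $sqs->receiveMessage($params);
echo $params['WaitTimeSeconds'] . PHP_EOL; // prints 20
```

Setting the wait time per request like this leaves the queue's own default untouched, which is convenient when only some workers should long poll.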

Client libraries

Beanstalkd | IronMQ | Amazon SQS
Open source | Official | Official

Beanstalkd

There are many open source Beanstalkd client libraries available in a myriad of programming languages. These are all independent projects from Beanstalkd.

IronMQ

The IronMQ client libraries are provided by Iron.IO and can be downloaded from the Dev Center.

You can also use a Beanstalkd client library with IronMQ if you'd like the flexibility of switching between the two services; however some commands (e.g.: kick, bury) are not supported. You also may need to implement the oauth command manually to connect to the service.

SQS

The AWS client libraries include the SQS client libraries. These are provided by Amazon and are available in many programming languages.

Management interface

Beanstalkd | IronMQ | Amazon SQS
Open source | Dashboard | Console

Beanstalkd

No graphical management interface is distributed by default. There are some open source projects to help with debugging and administration which can be found on the Beanstalkd tools page.

IronMQ

The IronMQ dashboard manages queues. It contains a helpful tutorial describing how to set up queues and shows you how to add jobs (IronMQ: messages) to a queue via cURL.

The interface allows you to manage your queues in an AJAX-driven website. You can create, read and delete jobs, view historical information and manage queue configuration from the dashboard view.

SQS

The AWS Management Console allows you to manage SQS. The interface is built on top of a stateless protocol so you need to press the refresh button to get up-to-date information.

You can create, read and delete jobs (SQS: messages) and manage queue configuration.

Redundancy

Beanstalkd | IronMQ | Amazon SQS
Client-side | Cloud-based | Cloud-based

Beanstalkd

Redundancy is handled on the client side and if a server goes down you will lose jobs.

Beanstalkd does include an option to store jobs in a binary log, which is enabled by launching Beanstalkd with the -b option. However, restoring the queue is a manual task and requires access to the server disks.

IronMQ

IronMQ is a cloud-based service with high persistence, availability and redundancy.

SQS

Jobs are stored on multiple servers in a hosted zone. This approach ensures the availability of the service and jobs should never be lost.

Security

Beanstalkd | IronMQ | Amazon SQS
None | Token | Key & secret

Beanstalkd

No authentication is required to connect to Beanstalkd. Providers are able to enqueue jobs and workers are able to reserve jobs without passing through a security model. For this reason it is highly recommended to create a firewall blocking external connections to the port that Beanstalkd is running on.

IronMQ

You can invite collaborators via the project settings to use your message queues. Authentication to the application is done via an Iron.IO token and a project ID.

SQS

Authentication to SQS is realised through the Amazon API key and secret. Permissions can be granted and revoked for other AWS accounts to access your queues via the AWS Management Console.

Speed

Beanstalkd | IronMQ | Amazon SQS
Fast | Internet Latency | Internet Latency

Beanstalkd

Beanstalkd is very fast as it should be on the same network as its providers and workers. Beanstalkd can sometimes be so fast that if a provider puts a job in a queue and follows it with a call to MySQL, a worker may pick up your job before MySQL has finished executing.
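One way to guard against this race is to finish writing the record before enqueuing the job, and to have the worker ask for a retry if the record is not there yet. The sketch below illustrates the idea with in-memory arrays standing in for MySQL and the queue. The function names create_report and handle_job are hypothetical.

```php
<?php

// In-memory stand-ins for a database table and a queue (assumptions for
// illustration only; a real application would use MySQL and Beanstalkd).
$database = array();
$queue    = array();

// Provider: store the record first, then enqueue the job that refers to it.
function create_report(array &$database, array &$queue, $id, $title)
{
    $database[$id] = array('title' => $title);        // 1. write the record
    $queue[]       = json_encode(array('id' => $id)); // 2. enqueue the job
}

// Worker: returns 'done' when the record exists, 'retry' when it does not yet.
function handle_job(array $database, $job_data)
{
    $job = json_decode($job_data, true);

    if (!isset($database[$job['id']])) {
        return 'retry'; // e.g. release the job back to the queue with a delay
    }

    return 'done';
}

create_report($database, $queue, 100, 'Monthly sales');

echo handle_job($database, $queue[0]) . PHP_EOL;                       // prints done
echo handle_job($database, json_encode(array('id' => 101))) . PHP_EOL; // prints retry
```

The retry branch matters even when the provider writes first, because a transaction that has not yet been committed is still invisible to the worker.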

IronMQ

Requests have an increased latency as they are sent to the IronMQ RESTful web service via HTTP.

SQS

Requests have an increased latency as they are sent to the SQS web service via HTTP.

Jobs may not be picked up straight away as they need to be distributed across different servers and data centres. This latency should be negligible if the application, a provider or a worker is hosted on an EC2 instance.

When you enqueue a job to SQS, it might not be immediately available. Jobs must be propagated to other servers. There is generally a one second wait at most.

Fidelity

Beanstalkd | IronMQ | Amazon SQS
FIFO | FIFO | No guarantees
Prioritisable | No priority | No priority

Beanstalkd

Queues are FIFO (first in, first out). Jobs with higher importance can be prioritised which will affect the order in which jobs are dequeued.
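The following sketch illustrates how priority and FIFO order combine. It sorts an in-memory list rather than talking to a server, so the array layout and the seq field are assumptions for illustration. In the Beanstalkd protocol a smaller priority number is more urgent, with 0 being the most urgent.

```php
<?php

// Each job records its priority and the order in which it was enqueued.
// In Beanstalkd a smaller priority number is more urgent (0 is the most urgent).
$jobs = array(
    array('seq' => 1, 'priority' => 1024, 'data' => 'weekly report'),
    array('seq' => 2, 'priority' => 0,    'data' => 'password reset e-mail'),
    array('seq' => 3, 'priority' => 1024, 'data' => 'monthly report'),
);

// Order by priority first, then by arrival order (FIFO) within a priority.
usort($jobs, function ($a, $b) {
    if ($a['priority'] !== $b['priority']) {
        return $a['priority'] < $b['priority'] ? -1 : 1;
    }
    return $a['seq'] < $b['seq'] ? -1 : 1;
});

$order = array();
foreach ($jobs as $job) {
    $order[] = $job['data'];
}

// prints: password reset e-mail, weekly report, monthly report
echo implode(', ', $order) . PHP_EOL;
```

In Pheanstalk the priority is normally passed as the second argument to put(), but check the signature in the version you have installed.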

IronMQ

Queues are FIFO (first in, first out). Jobs cannot be prioritised.

SQS

Jobs are not guaranteed to come out in the same order that they entered the queue. Because SQS is a distributed service, jobs become available on each server at different times. This is something to be acutely aware of when designing for SQS.
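If the order of updates matters, one common approach is to put a version or sequence number in the job body and let the worker ignore anything older than what it has already applied. This is a sketch of that idea only. The function apply_if_newer, the version field and the in-memory store are all hypothetical.

```php
<?php

// Hypothetical store of the latest version applied per record.
$applied_versions = array();

// Applies an update only if it is newer than what has already been applied.
// Returns true when the job was applied and false when it was stale.
function apply_if_newer(array &$applied_versions, $job_data)
{
    $job = json_decode($job_data, true);
    $id  = $job['id'];

    if (isset($applied_versions[$id]) && $applied_versions[$id] >= $job['version']) {
        return false; // an equal or newer update has already been processed
    }

    $applied_versions[$id] = $job['version'];
    return true;
}

// Jobs arrive out of order: version 2 is received before version 1.
$received = array(
    json_encode(array('id' => 100, 'version' => 2)),
    json_encode(array('id' => 100, 'version' => 1)),
);

foreach ($received as $job_data) {
    // prints "applied" for version 2, then "skipped" for version 1
    echo (apply_if_newer($applied_versions, $job_data) ? 'applied' : 'skipped') . PHP_EOL;
}
```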

One-time pickup

Beanstalkd | IronMQ | Amazon SQS
Guaranteed | Guaranteed | Not guaranteed

One-time pickup describes the restriction that unless a worker has timed out, two or more workers will never run the same job in parallel.

Beanstalkd

The socket-based architecture of Beanstalkd ensures one-time pickup.

IronMQ

IronMQ guarantees one-time pickup.

SQS

Because SQS is a distributed service, there is no guarantee of one-time pickup, although it is unlikely that two workers will receive the same job.

Fail-safe

Beanstalkd | IronMQ | Amazon SQS
Zombie socket | Timeout | Timeout

Beanstalkd

Jobs are automatically returned to the queue if a worker doesn't respond to Beanstalkd within a set amount of time or if the socket closes without responding to the job.

It's then ready for immediate pick-up by the next requesting worker (it doesn't need to be kicked).

IronMQ & SQS

Workers connect to a queue and reserve a job. From this moment, the worker has a set amount of time to delete the job from the queue before it is released and becomes available for workers to reserve again.
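The reserve-then-timeout behaviour can be modelled in a few lines. The class below is a toy in-memory model for illustration only, not how IronMQ or SQS are implemented. The class name TimeoutQueue and its methods are hypothetical, and time is passed in explicitly so the example is deterministic.

```php
<?php

// A tiny in-memory model of reserve-with-timeout (an illustration only).
class TimeoutQueue
{
    private $jobs   = array(); // id => job data
    private $hidden = array(); // id => time at which the reservation expires

    public function enqueue($id, $data)
    {
        $this->jobs[$id] = $data;
    }

    // Returns the id of the first job that is not currently reserved, or null.
    public function reserve($now, $timeout)
    {
        foreach ($this->jobs as $id => $data) {
            if (!isset($this->hidden[$id]) || $this->hidden[$id] <= $now) {
                $this->hidden[$id] = $now + $timeout;
                return $id;
            }
        }
        return null;
    }

    // Deleting a job removes it for good, so it can never be reserved again.
    public function delete($id)
    {
        unset($this->jobs[$id], $this->hidden[$id]);
    }
}

$queue = new TimeoutQueue();
$queue->enqueue('a', 'lorem ipsum');

$first  = $queue->reserve(0, 60);  // a worker reserves the job at t=0
$second = $queue->reserve(30, 60); // still reserved at t=30, so nothing is returned
$third  = $queue->reserve(61, 60); // the timeout has passed, the job is available again

var_dump($first, $second, $third);
```

The third call shows why a worker that needs longer than the timeout risks having its job handed to someone else.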

Creating new queues

Beanstalkd | IronMQ | Amazon SQS
Automatic | Auto & manual | Manual

Beanstalkd

Queues (Beanstalkd: tubes) are automatically created when jobs are enqueued. They do not need to be created manually.

IronMQ

Requires you to create a project in the dashboard. One project contains many queues. Queues can either be created automatically when jobs are enqueued or manually created with configuration from the dashboard.

SQS

Queues must be manually set up from the AWS Management Console for SQS. Each queue will generate a unique URL which acts as the queue name.

Note the region (e.g.: us-west-1, eu-west-1, etc.) that the queue belongs to as it's required to connect to SQS.

Framework integration

Laravel

The Laravel framework has an excellent built-in wrapper which encapsulates message queues for Beanstalkd, IronMQ and Amazon SQS. You can change servers through configuration without altering any of your application.

PHP code samples

These code examples show you how you can connect to a server, enqueue, reserve and dequeue a job from a queue. If an exception is thrown, it will bury the job (if the server supports it).

Try stopping the execution after a job has been enqueued and using a management tool to debug your queue.

Beanstalkd

composer.json

{
    "require": {
        "pda/pheanstalk": "dev-master"
    }
}

beanstalkd.php

<?php

/* 1. Setup & connect */

// Include composer libraries
require 'vendor/autoload.php';

$queue_name = 'default';
$job_data   = 'lorem ipsum';

$beanstalkd = new Pheanstalk_Pheanstalk('127.0.0.1:11300');

/* 2. Provider */

// Enqueue a job
$beanstalkd
  ->useTube($queue_name)
  ->put($job_data);

/* 3. Worker */

// Loop through all jobs
while ($job = $beanstalkd->watch($queue_name)->reserve(5)) {
    try {
        $job_data = $job->getData();

        echo $job_data . PHP_EOL;

        // Dequeue a job
        $beanstalkd->delete($job);
    } catch (Exception $e) {
        // Bury a job
        $beanstalkd->bury($job);
        echo $e->getMessage();
    }
}

IronMQ

composer.json

{
    "require": {
        "iron-io/iron_mq": "dev-master"
    }
}

iron_mq.php

<?php

/* 1. Setup & connect */

// Include composer libraries
require 'vendor/autoload.php';

$queue_name = 'default';
$job_data   = 'lorem ipsum';

$iron_mq = new IronMQ(array(
    'token'      => '{token}',
    'project_id' => '{project_id}'
));

/* 2. Provider */

// Enqueue a job
$iron_mq->postMessage($queue_name, $job_data);

/* 3. Worker */

// Loop through all jobs
while ($job = $iron_mq->getMessage($queue_name)) {
    try {
        $job_data = $job->body;

        echo $job_data . PHP_EOL;

        // Dequeue a job
        $iron_mq->deleteMessage($queue_name, $job->id);
    } catch (Exception $e) {
        // Bury a job
        // There is no bury in IronMQ
        echo $e->getMessage();
    }
}

SQS

composer.json

{
    "require": {
        "aws/aws-sdk-php": "2.4.*@dev"
    }
}

sqs.php

<?php

/* 1. Setup & connect */

// Include composer libraries
require 'vendor/autoload.php';

$queue_name = 'https://sqs.{region}.amazonaws.com/{id}/{queue_name}';
$job_data   = 'lorem ipsum';

$aws = \Aws\Common\Aws::factory(array(
    'key'    => '{key}',
    'secret' => '{secret}',
    'region' => '{region}'
));

$sqs = $aws->get('sqs');

/* 2. Provider */

// Enqueue a job
$sqs->sendMessage(array(
    'QueueUrl'    => $queue_name,
    'MessageBody' => $job_data
));

/* 3. Worker */

// Handle one job
$result = $sqs->receiveMessage(array(
    'QueueUrl' => $queue_name
));

if (!$result) {
    // No jobs
    exit;
}

$messages = $result->getPath('Messages');
if (!$messages) {
    // No jobs
    exit;
}

foreach ($messages as $message) {
    try {
        $job_data = $message['Body'];

        echo $job_data . PHP_EOL;

        // Dequeue a job
        $sqs->deleteMessage(array(
            'QueueUrl'      => $queue_name,
            'ReceiptHandle' => $message['ReceiptHandle']
        ));
    } catch (Exception $e) {
        // Bury a job
        // There is no bury in SQS
        echo $e->getMessage();
    }
}

Tips for message queues

Regardless of which service you select, here are some tips for keeping your queues robust:

Metadata serialisation

Your job can contain whatever data you like, provided it's within the limit of the server's job data size. Use JSON in your job body to make metadata easy to transmit.

// Encode for enqueuing:
$raw_data = (object) array('id' => 100);
$job_data = json_encode($raw_data);

// Decode after dequeuing:
$job_data = '{"id":100}';
$raw_data = json_decode($job_data);
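Job bodies arrive from outside the worker, so it can help to validate them after decoding. The sketch below assumes every job is a JSON object with an integer id, as in the example above. The helper name decode_job is hypothetical.

```php
<?php

// Hypothetical helper: decodes a job body and rejects anything that is not
// a JSON object with an integer "id".
function decode_job($job_data)
{
    $decoded = json_decode($job_data, true);

    if (!is_array($decoded) || !isset($decoded['id']) || !is_int($decoded['id'])) {
        return null; // malformed job: bury it or log it rather than process it
    }

    return $decoded;
}

var_dump(decode_job('{"id":100}') !== null);  // bool(true): well-formed body
var_dump(decode_job('lorem ipsum') !== null); // bool(false): not JSON
```

Rejecting malformed bodies early keeps one bad job from crashing a worker halfway through its task.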

Limit your job data size

Try not to crowd jobs with too much metadata. If you can store some information in a database and only queue an ID for later processing, your queue will be more robust and easier to debug.

// Good:
$raw_data = (object) array('id' => 100);

// Not as good ...
// But sometimes necessary when there is
// no database access on a worker:
$raw_data = (object) array(
    'id' => 100,
    'color' => 'black',
    'background' => 'white'
);

Keep track of job states

If for some reason an item which has already been processed re-enters a queue, you probably don't want it to be reprocessed. Job data is not forced to be unique, so it's important that you keep track of the state of a job in a database.

This can be as simple as having a column on your jobs table to mark an item as processed. You can then delete the job from the queue if it has already been handled.
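A minimal sketch of that check, with an in-memory array standing in for the jobs table. The function process_once, the processed flag and the e-mail counter are assumptions for illustration. A real worker would read and update the state in its database.

```php
<?php

// Stand-in for a jobs table with a "processed" column.
$jobs_table = array(
    100 => array('processed' => false),
);

$emails_sent = 0;

// Processes a job once; repeat deliveries are acknowledged without redoing work.
function process_once(array &$jobs_table, &$emails_sent, $job_data)
{
    $job = json_decode($job_data, true);
    $id  = $job['id'];

    if (!empty($jobs_table[$id]['processed'])) {
        return 'skipped'; // already handled: just delete it from the queue
    }

    $emails_sent++;                       // the actual work
    $jobs_table[$id]['processed'] = true; // record the new state

    return 'processed';
}

$job_data = json_encode(array('id' => 100));

echo process_once($jobs_table, $emails_sent, $job_data) . PHP_EOL; // prints processed
echo process_once($jobs_table, $emails_sent, $job_data) . PHP_EOL; // prints skipped
echo $emails_sent . PHP_EOL;                                       // prints 1
```

With several workers, the check and the update need to happen atomically in the database (for example in a single conditional UPDATE), otherwise two workers can both pass the check.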

Terminology

Some terms differ between Beanstalkd, Amazon SQS and IronMQ. Here's a quick list of translations:

Beanstalkd | Amazon SQS | IronMQ
Tube | Queue | Queue
Job | Message | Message
Job data | Message body | Message body
Put | Send message | POST
Reserve | Receive message | GET
Delete | Delete message | DELETE
TTR (time-to-run) | Visibility timeout | Timeout
Delay | Delivery delay | Delay
(no equivalent) | Retention Period | Expires in

Glossary

When working with queues you may come across these terms:

Bury (a job) – puts a job in a failed state. The job cannot be reprocessed until it is manually kicked back into the queue. Not supported by IronMQ and SQS.

Consumer – see Worker.

Delay – defer a job from being sent to a worker for a predetermined amount of time.

Delete (a job) – see Dequeue.

Dequeue – marks a job as completed and removes it from the queue.

Enqueue – adds a job to a queue ready for a worker.

FIFO – describes the way jobs are handled in a queue as First In, First Out. This is the most common type of message queue.

FILO – describes the way jobs are handled in a queue as First In, Last Out.

Job – a deferred task in a queue containing metadata to identify what task is waiting to be processed. Akin to database rows.

Kick (a job) – returns a previously buried job to the queue ready for workers to pick up. Not supported by IronMQ and SQS.

Provider – a client which connects to the message server to create jobs.

Queue – a way to group similar jobs into a queue. Akin to database tables.

Reserve (a job) – delivers a job to a worker and locks it from being delivered to another worker.

Worker – a client which connects to the message server to reserve, delete and bury jobs. These perform the labour-intensive part of the processing.

Conclusion

There is no silver bullet for message queue services. Beanstalkd, IronMQ and Amazon SQS all have their strengths and weaknesses which can be used to your advantage. This article should provide you with enough information to help you make an informed decision as to which service is best for your skill level and project needs.

Which message queue service will you be using? If you currently use queues, will you consider switching? Have you used message queues in an unconventional way that could help others? Leave a comment and let everyone know.