Writing an Asterisk security statistics module for statsd

Houston, we have a problem

Yeah… that’s not cool.

When Josh told me how many failed authentication attempts his public Asterisk 12 server was getting, I wouldn’t say I was surprised: it is, after all, something of the wild west still on ye olde internet. I enjoy the hilarity of having fraud-bots break themselves on a tide of 401s. Seeing the total number of perturbed electrons is a laugh. But you know what’s more fun? Real-time stats.

I wondered: would it be possible to get information about the number of failed authentication attempts in real-time? If we included the source IP address, would that yield any interesting information?

Statistics are fun, folks. And, plus, it is for security. I can justify spending time on that. These people are clearly bad. Our statistics will serve a higher purpose. This is for a good cause.

For great justice!


Stasis: Enter the message bus

We’re going to start off using our sample module. In Asterisk 12 and later versions, the Stasis message bus will publish notifications about security events to subscribed listeners. We can use the message bus to get notified any time an authentication fails.

Getting Stasis moving

I’m making the assumption that we’ve copied our res_sample_module.c file and named it res_auth_stats.c. Again, feel free to name it whatever you like.

  1. First, let’s get the right headers pulled into our module. We’ll want the main header for Stasis, asterisk/stasis.h. Stasis has a concept called a message router, which eliminates a lot of the boilerplate code that grows when you have to handle multiple message types published on a single topic. In our case, however, we only have a single message type, ast_security_event_type, that will be published to the topic we’ll subscribe to. A router really is overkill for this application, so we’ll just manage the subscription ourselves. Said Stasis topic for security events is defined in asterisk/security_events.h, so let’s add that as well. Finally, the payload in the message that will be delivered to us is encoded in JSON, so we’ll need to add asterisk/json.h too.
    #include "asterisk/module.h"
    #include "asterisk/stasis.h"
    #include "asterisk/security_events.h"
    #include "asterisk/json.h"
    
  2. In load_module, we’ll subscribe to the ast_security_topic and tell it to call handle_security_event when the topic receives a message. The third parameter to the subscription function lets us pass an object to the callback function whenever it is called; we don’t need it, so we’ll just pass NULL. We’ll want to keep that subscription around, so at the top of our module, we’ll declare a static struct stasis_subscription *.
    /*! Our Stasis subscription to the security topic */
    static struct stasis_subscription *sub;
    
    ...
    
    static int load_module(void)
    {
    	sub = stasis_subscribe(ast_security_topic(), handle_security_event, NULL);
    	if (!sub) {
    		return AST_MODULE_LOAD_FAILURE;
    	}
    
    	return AST_MODULE_LOAD_SUCCESS;
    }
    
  3. Since we’re subscribing to a Stasis topic when the module is loaded, we also need to unsubscribe when the module is unloaded. To do that, we can call stasis_unsubscribe_and_join – the join implying that the unsubscribe will block until all messages currently being published to our subscription have been delivered. This is important, as unsubscribing does not prevent in-flight messages from being delivered; since our module is unloading, those messages are likely to have “unhappy” effects.
    static int unload_module(void)
    {
    	stasis_unsubscribe_and_join(sub);
    	sub = NULL;
    
    	return 0;
    }
    
  4. Now, we’re ready to implement the handler. Let’s get the method defined:
    static void handle_security_event(void *data, struct stasis_subscription *sub,
    	struct stasis_message *message)
    {
    
    }
    

    A Stasis message handler takes in three parameters:

    1. A void * to a piece of data. If we had passed an ao2 object when we called stasis_subscribe, this would be pointing to our object. Since we passed NULL, this will be NULL every time our function is invoked.
    2. A pointer to the stasis_subscription that caused this function to be called. You can have a single handler handle multiple subscriptions, and you can also cancel your subscription in the callback. For our purposes, we’re (a) going to always be subscribed to the topic as long as the module is loaded, and (b) we are only subscribed to a single topic. So we won’t worry about this parameter.
    3. A pointer to our message. All messages published over Stasis are an instance of struct stasis_message, which is an opaque object. It’s up to us to determine if we want to handle that message or not.
  5. Let’s add some basic defensive checking in here. A topic can have many message types published to it; of these, we know we only care about ast_security_event_type. Let’s ignore all the others:
    static void handle_security_event(void *data, struct stasis_subscription *sub,
    	struct stasis_message *message)
    {
    	if (stasis_message_type(message) != ast_security_event_type()) {
    		return;
    	}
    
    }
    
  6. Now that we know that our message type is ast_security_event_type, we can safely extract the message payload. The stasis_message_data function extracts whatever payload was passed along with the struct stasis_message as a void *. It is incredibly important to note that, by convention, message payloads passed with Stasis messages are immutable. You must not change them. Why is this the case?

    A Stasis message that is published to a topic is delivered to all of that topic’s subscribers. There could be many modules that are interested in security information. When designing Stasis, we had two options:

    1. Do a deep copy of the message payload for each message that is delivered. This would incur a pretty steep penalty on all consumers of Stasis, even if they did not need to modify the message data. Publishers would also have to implement a copy callback for each message payload.
    2. Pretend that the message payload is immutable and can’t be modified (this is C after all, if you want to shoot yourself in the foot, you’re more than welcome to). If a subscriber needs to modify the message data, it has to copy the payload itself.

    For performance reasons, we chose option #2. In practice, this has worked out well: many subscribers don’t need to change the message payloads; they merely need to know that something occurred.

    Anyway, the code:

    	struct ast_json_payload *payload;
    
    	if (stasis_message_type(message) != ast_security_event_type()) {
    		return;
    	}
    
    	payload = stasis_message_data(message);
    	if (!payload || !payload->json) {
    		return;
    	}
    

    Note that our payload is of type struct ast_json_payload, which is a thin ao2 wrapper around a struct ast_json object. Just for safety’s sake, we make sure that both the wrapper and the underlying JSON object aren’t NULL before manipulating them.

  7. Now that we have our payload, let’s print it out. The JSON wrapper API provides a handy way of doing this via ast_json_dump_string_format. This will give us an idea of what exactly is in the payload of a security event:
    static void handle_security_event(void *data, struct stasis_subscription *sub,
    	struct stasis_message *message)
    {
    	struct ast_json_payload *payload;
    	char *str_json;
    
    	if (stasis_message_type(message) != ast_security_event_type()) {
    		return;
    	}
    
    	payload = stasis_message_data(message);
    	if (!payload || !payload->json) {
    		return;
    	}
    
    	str_json = ast_json_dump_string_format(payload->json, AST_JSON_PRETTY);
    	if (str_json) {
    		ast_log(LOG_NOTICE, "Security! %s\n", str_json);
    	}
    	ast_json_free(str_json);
    }
    

Stasis in action

Let’s make a security event. While there are plenty of ways to generate a security event in Asterisk, one of the easiest is to just fail an AMI login. You’re more than welcome to use any failed (or even successful) login to test out the module; just make sure it goes through a channel driver/module that actually emits security events!

AMI Demo

Build and install the module, then:

$ telnet 127.0.0.1 5038
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
Asterisk Call Manager/2.4.0
Action: Login
Username: i_am_not_a_user
Secret: nope

Response: Error
Message: Authentication failed

Connection closed by foreign host.

In Asterisk, we should see something like the following:

*CLI> [Aug  3 16:00:51] NOTICE[12019]: manager.c:2959 authenticate: 127.0.0.1 tried to authenticate with nonexistent user 'i_am_not_a_user'
[Aug  3 16:00:51] NOTICE[12019]: manager.c:2996 authenticate: 127.0.0.1 failed to authenticate as 'i_am_not_a_user'
[Aug  3 16:00:51] NOTICE[12008]: res_auth_stats.c:60 handle_security_event: Security! {
  "SecurityEvent": 1,
  "EventVersion": "1",
  "EventTV": "2014-08-03T16:00:51.052-0500",
  "Service": "AMI",
  "Severity": "Error",
  "AccountID": "i_am_not_a_user",
  "SessionID": "0x7f50ae8aebd0",
  "LocalAddress": "IPV4/TCP/0.0.0.0/5038",
  "RemoteAddress": "IPV4/TCP/127.0.0.1/57546",
  "SessionTV": "1969-12-31T18:00:00.000-0600"
}
  == Connect attempt from '127.0.0.1' unable to authenticate

Great success!

Dissecting the event

There are a few interesting fields in the security event that we could build statistics from, and a few that we should consider carefully when writing the next portion of our module. In no particular order, here are some thoughts to guide the next part of the development:

  • There are a number of different types of security events, which are conveyed by the SecurityEvent field. This integer value corresponds to the enum ast_security_event_type. There are a fair number of security events that we may not care about (at least not for this incarnation of the module). Since what we want to track are failed authentication attempts, we will need to filter out events based on this value.
  • The Service field tells us who raised the security event. If we felt like it, we could use that to only look at SIP attacks, or failed AMI logins, or whatnot. For now, I’m going to opt not to care about where the security event was raised from: the fact that we get one is sufficient.
  • The RemoteAddress is interesting: it tells us where the security issue came from. While we’re concerned with statistics – and I think keeping track of how many failed logins a particular source had is pretty interesting – for people using fail2ban, iptables, or other tools to restrict access, this is a pretty useful field. Consume, update; rinse, repeat.

STATS!

Let’s get rid of the log message and start doing something interesting. In Asterisk 12, we added a module, res_statsd, that does much of what its name implies: it allows Asterisk to send stats to a statsd server. statsd is really cool: if you have a statistic you want to track, it has a way to consume it. With a number of pluggable backends, there’s also (usually) a way to display it. And it’s open source!


In the interest of full disclosure, installing statsd on my laptop hit a few … snags. Libraries and what-not. I’ll post again with how well this module works in practice. For now, let’s just hope the theory is sound.

To use this module, we’ll want to pull in Asterisk’s integration library with statsd, statsd.h.

...
#include "asterisk/json.h"
#include "asterisk/statsd.h"

...

And we should go ahead and declare that our module depends on res_statsd:

/*** MODULEINFO
	<support_level>extended</support_level>
	<depend>res_statsd</depend>
 ***/

Since we’re not going to print out the Stasis message any more, go ahead and delete the char *str_json. Now that we know we’re getting messages, let’s filter out the ones we don’t care about:

static void handle_security_event(void *data, struct stasis_subscription *sub,
	struct stasis_message *message)
{
	struct ast_json_payload *payload;
	int event_type;

	if (stasis_message_type(message) != ast_security_event_type()) {
		return;
	}

	payload = stasis_message_data(message);
	if (!payload || !payload->json) {
		return;
	}

	event_type = ast_json_integer_get(ast_json_object_get(payload->json, "SecurityEvent"));
	switch (event_type) {
	case AST_SECURITY_EVENT_INVAL_ACCT_ID:
	case AST_SECURITY_EVENT_INVAL_PASSWORD:
	case AST_SECURITY_EVENT_CHAL_RESP_FAILED:
		break;
	default:
		return;
	}
}

Here, after pulling out the payload from the Stasis message, we get the SecurityEvent field out and assign it to an integer, event_type. Note that we know that the value will be one of the AST_SECURITY_EVENT_* values. In my case, I only care when someone:

  • Fails to provide a valid account
  • Fails to provide a valid password
  • Fails a challenge check (rather important for SIP)

So we bail on any of the event types that aren’t one of those.

The first stat I’ll send to statsd is the number of times a particular address trips one of those three security events. statsd uses a period-delimited message format, where each period denotes a category of statistics. The API provided by Asterisk’s statsd module lets us send any statistic using ast_statsd_log. In this case, we simply want to bump a count every time we get a failed message, so we’ll use the statistic type AST_STATSD_METER.

Using a struct ast_str to build up the message we send to statsd, we’d have something that looks like this:

static void handle_security_event(void *data, struct stasis_subscription *sub,
	struct stasis_message *message)
{
	struct ast_str *remote_msg;
	struct ast_json_payload *payload;
	int event_type;

	if (stasis_message_type(message) != ast_security_event_type()) {
		return;
	}

	payload = stasis_message_data(message);
	if (!payload || !payload->json) {
		return;
	}

	event_type = ast_json_integer_get(ast_json_object_get(payload->json, "SecurityEvent"));
	switch (event_type) {
	case AST_SECURITY_EVENT_INVAL_ACCT_ID:
	case AST_SECURITY_EVENT_INVAL_PASSWORD:
	case AST_SECURITY_EVENT_CHAL_RESP_FAILED:
		break;
	default:
		return;
	}

	remote_msg = ast_str_create(64);
	if (!remote_msg) {
		return;
	}

	ast_str_set(&remote_msg, 0, "security.failed_auth.%s.%s",
		ast_json_string_get(ast_json_object_get(payload->json, "Service")),
		ast_json_string_get(ast_json_object_get(payload->json, "RemoteAddress")));
	ast_statsd_log(ast_str_buffer(remote_msg), AST_STATSD_METER, 1);

	ast_free(remote_msg);
}

Cool! If we get an attack from, say, 192.168.0.1 over UDP via SIP, we’d send the following message to statsd:

security.failed_auth.SIP.UDP/192.168.0.1/5060

Except, we have one tiny problem… IP addresses are period-delimited. Whoops.

(cleaner) STATS!

We really want the address we receive from the security event to be its own “ID”, identifying what initiated the security event. That means we really need to mark the octets in an IPv4 address with something other than a ‘.’. We also need to lose the port: if the connection is TCP based, that port is going to bounce all over the map (and we probably don’t care which port it originated from either). Since the address is delimited with a ‘/’ character, we can just drop the last bit of information that’s returned to us in the RemoteAddress field. Let’s write a helper function to do that:

static char *sanitize_address(char *buffer)
{
	char *current = buffer;

	while ((current = strchr(current, '.'))) {
		*current = '_';
	}

	current = strrchr(buffer, '/');
	if (current) {
		*current = '\0';
	}

	return buffer;
}

Note that we don’t need to return anything here, as this modifies buffer in place, but I find those semantics to be nice. Using the return value makes the modification of the buffer parameter obvious.

That should turn this:

IPV4/TCP/127.0.0.1/57546

Into this:

IPV4/TCP/127_0_0_1

Nifty.

The RemoteAddress value we get back from the JSON payload is a const string owned by the message; to manipulate it, we’ll need to pull it out into our own char * buffer.

REMEMBER: STASIS MESSAGES ARE IMMUTABLE.

We’re about to mutate a field in it; you cannot just muck around with this value in the message. Let’s do it safely:

	char *remote_address;

...

	remote_address = ast_strdupa(ast_json_string_get(ast_json_object_get(payload->json, "RemoteAddress")));
	remote_address = sanitize_address(remote_address);

Better. With the rest of the code, this now looks like:

static void handle_security_event(void *data, struct stasis_subscription *sub,
	struct stasis_message *message)
{
	struct ast_str *remote_msg;
	struct ast_json_payload *payload;
	const char *service;
	char *remote_address;
	int event_type;

	if (stasis_message_type(message) != ast_security_event_type()) {
		return;
	}

	payload = stasis_message_data(message);
	if (!payload || !payload->json) {
		return;
	}

	event_type = ast_json_integer_get(ast_json_object_get(payload->json, "SecurityEvent"));
	switch (event_type) {
	case AST_SECURITY_EVENT_INVAL_ACCT_ID:
	case AST_SECURITY_EVENT_INVAL_PASSWORD:
	case AST_SECURITY_EVENT_CHAL_RESP_FAILED:
		break;
	default:
		return;
	}

	remote_msg = ast_str_create(64);
	if (!remote_msg) {
		return;
	}

	service = ast_json_string_get(ast_json_object_get(payload->json, "Service"));
	remote_address = ast_strdupa(ast_json_string_get(ast_json_object_get(payload->json, "RemoteAddress")));
	remote_address = sanitize_address(remote_address);

	ast_str_set(&remote_msg, 0, "security.failed_auth.%s.%s", service, remote_address);
	ast_statsd_log(ast_str_buffer(remote_msg), AST_STATSD_METER, 1);

	ast_free(remote_msg);
}

Yay! But what else can we do with this?

MOAR (cleaner) STATS!

So, right now, we’re keeping track of each individual remote address that fails authentication. That may be a bit aggressive for some scenarios – sometimes, we may just want to know how many SIP authentication requests have failed. So let’s track that. We’ll use a new string buffer (dual-purposing buffers just fills me with ewww), and populate it with a new stat:

	service = ast_json_string_get(ast_json_object_get(payload->json, "Service"));

	ast_str_set(&count_msg, 0, "security.failed_auth.%s.count", service);
	ast_statsd_log(ast_str_buffer(count_msg), AST_STATSD_METER, 1);

Since we’re unlikely to get a remote address of count, this should work out okay for us. With the rest of the code, this looks like the following:

static void handle_security_event(void *data, struct stasis_subscription *sub,
	struct stasis_message *message)
{
	struct ast_str *remote_msg;
	struct ast_str *count_msg;
	struct ast_json_payload *payload;
	const char *service;
	char *remote_address;
	int event_type;

	if (stasis_message_type(message) != ast_security_event_type()) {
		return;
	}

	payload = stasis_message_data(message);
	if (!payload || !payload->json) {
		return;
	}

	event_type = ast_json_integer_get(ast_json_object_get(payload->json, "SecurityEvent"));
	switch (event_type) {
	case AST_SECURITY_EVENT_INVAL_ACCT_ID:
	case AST_SECURITY_EVENT_INVAL_PASSWORD:
	case AST_SECURITY_EVENT_CHAL_RESP_FAILED:
		break;
	default:
		return;
	}

	remote_msg = ast_str_create(64);
	count_msg = ast_str_create(64);
	if (!remote_msg || !count_msg) {
		ast_free(remote_msg);
		ast_free(count_msg);
		return;
	}

	service = ast_json_string_get(ast_json_object_get(payload->json, "Service"));

	ast_str_set(&count_msg, 0, "security.failed_auth.%s.count", service);
	ast_statsd_log(ast_str_buffer(count_msg), AST_STATSD_METER, 1);

	remote_address = ast_strdupa(ast_json_string_get(ast_json_object_get(payload->json, "RemoteAddress")));
	remote_address = sanitize_address(remote_address);

	ast_str_set(&remote_msg, 0, "security.failed_auth.%s.%s", service, remote_address);
	ast_statsd_log(ast_str_buffer(remote_msg), AST_STATSD_METER, 1);

	ast_free(remote_msg);
	ast_free(count_msg);
}

In Conclusion

And there we go! Statistics of those trying to h4x0r your PBX, delivered to your statsd server. In this particular case, getting this information off of Stasis was probably the most direct method, since we want to programmatically pass this information off to statsd. On the other hand, since security events are now passed over AMI, we could do this in another language as well. If I wanted to update iptables or fail2ban, I’d probably use the AMI route – it’s generally easier to do things in Python or JavaScript than C (sorry, C aficionados). Then again: this also makes for a much more interesting blog post and an Asterisk module!

Later: actually displaying this data!

Writing an Asterisk Module

A lot of modules for Asterisk were written a long time ago. For the most part, these modules targeted Asterisk 1.4. I tend to suspect this was for a variety of reasons: the Asterisk project really took off during that time frame, and a lot of capabilities were being added at that time. Unfortunately (or fortunately, if you happen to be one of those who develop for Asterisk often), the Asterisk architecture has changed a lot since then. While some of those modules may still work just fine (with maybe a few find-and-replaces), there are a lot of tools in the toolbox that didn’t exist then.

In fact, if I could point to two things that have resulted in Asterisk evolving into the stable, robust platform that it is today, it would be:

  1. LOTS of testing.
  2. Frameworks, frameworks, frameworks!

Not that there is a framework for everything, mind you. Or that everything is “perfect”. Software is hard, after all. But if there’s something you want to do, there’s probably something available to help you do that, and if you use that, then you stand a good chance of using something that just. Plain. Works.

So, let’s start at the beginning.

When Asterisk starts, it looks for modules to load in the modules directory (specified in asterisk.conf in the astmoddir setting). If modules.conf says to load a module in that directory – either explicitly or via autoload – then it loads that module into memory. But that doesn’t actually start the module – it just gets it into memory. However, due to some fancy usage of the constructor attribute, it also gets it registered in a few places as a “module” that can be manipulated. That registration really occurs due to an instance of a certain struct that all modules must have, and which provides a few different things:

  1. Notification that the module is GPLv2. Unless your module includes the disclaimer that the module is GPLv2, it can’t be loaded. So, yes, you have to distribute your module per the conditions of the GPLv2.
  2. Various bits of metadata – the name of the module, a description of it, etc.
  3. The module’s load order priority. Asterisk’s module loader is a tad … simple… which means things have a rough integer priority for loading. If you have lots of dependencies, make sure you load after them – and if things depend on your module, make sure they load after you.
  4. Most importantly (for this blog post), it defines a set of virtual functions that will be called at key times by Asterisk. Namely:
    • load_module: Called during module load. This should process the configuration, set up containers, and initialize module data. If your module can’t be loaded, you can tell the core not to load your module.
    • unload_module: The analogue of load_module; called when Asterisk is shutting down. Your module should shut down whatever it is using, unsubscribe from things, dispose of memory, etc.
    • reload_module: Called when a module is reloaded through an interface, e.g., the CLI’s module reload command or the AMI ModuleReload action. You should reprocess your configuration when this is called. More on that in a future post; for now, we’ll just focus on load and unload.

Writing the basics

For now, let’s just get something that will load. We’ll add interesting stuff later. I’m going to assume that this module is named res_sample_module.c, stored in the res directory of Asterisk. If you’d rather write an application, or a function, or whatever, feel free to rename it and place it where it suits you best. The Asterisk build system should find it, so long as it is in one of the canonical directories. I’m also going to assume that this is done on Asterisk 13 – if you’d like to use another version of Asterisk, that’s fine, but your mileage may vary.

  1. Add a MODULEINFO comment block at the top of your file. When Asterisk compiles, it will parse out this comment block and feed it into menuselect, which allows users to control which modules they want to build. It also will prevent a module from compiling if its dependencies aren’t met. For now, we’ll simply set its support level:
    /*** MODULEINFO
        <support_level>extended</support_level>
    ***/
    
  2. Include the main asterisk header file. This pulls in the build options from the configure script, throws in some forward declarations for common items, and gives us the ability to register the version of this file. Immediately after this, we should go ahead and add the call that will register the file version, using the macro ASTERISK_FILE_VERSION:
    #include "asterisk.h"
    
    ASTERISK_FILE_VERSION(__FILE__, "$Revision: $")
    
  3. While we’re here, after the ASTERISK_FILE_VERSION, let’s include the header module.h. That header contains the definition for the module struct with its handy virtual table of load/unload/reload functions.
    ASTERISK_FILE_VERSION(__FILE__, "$Revision: $")
    
    #include "asterisk/module.h"
    

    At the bottom of the file, declare that this is a “standard” Asterisk module. This will make a few assumptions:

    1. That you have static load_module and unload_module functions conforming to the correct signatures, and no reload_module function (which is okay; we’ll worry about that one another time).
    2. That you’re okay with loading your module at the “default” time, i.e., AST_MODPRI_DEFAULT. For now, we are.
    3. That your module generally doesn’t get used by other things, e.g., it doesn’t export global symbols, or have any really complex dependencies. Again, if we need that, we’ll worry about that later.
    AST_MODULE_INFO_STANDARD(ASTERISK_GPL_KEY, "Sample module");
    
  4. Now that we have our sample module defined, we need to provide implementations of the load_module and unload_module functions. Each of these functions takes in no parameters, and returns an int:
    static int unload_module(void)
    {
    
    }
    
    static int load_module(void)
    {
    
    }
    
  5. These two functions should probably do something. unload_module is pretty straightforward; we’ll just return 0 for success. Note that, generally, a module unload routine should return 0 unless the module can’t be unloaded for some reason. If you return non-zero while the CLI is attempting to unload your module, it will give up; if Asterisk is shutting down, it won’t care – it will eventually skip on by your module and terminate (with extreme prejudice, no less).
    static int unload_module(void)
    {
        return 0;
    }
    

    Now we need something for load_module. Unlike the unload routine, the load routine can return a number of different things to instruct the core on how to treat the load attempt. In our case, we just want it to load up. For now, we’ll just return success:

    static int load_module(void)
    {
        return AST_MODULE_LOAD_SUCCESS;
    }
    

Putting it all together

The whole module would look something like this:

/*** MODULEINFO
    <support_level>extended</support_level>
***/

#include "asterisk.h"

ASTERISK_FILE_VERSION(__FILE__, "$Revision: $")

#include "asterisk/module.h"

static int unload_module(void)
{
    return 0;
}

static int load_module(void)
{
    return AST_MODULE_LOAD_SUCCESS;
}

AST_MODULE_INFO_STANDARD(ASTERISK_GPL_KEY, "Sample module");

Note that this module is available in a GitHub repo:

https://github.com/matt-jordan/asterisk-modules

Let’s make a module

Let’s get this thing compiled and running. Starting from a clean checkout:

  1. Configure Asterisk. Note that when you’re writing code, always configure Asterisk with --enable-dev-mode. This turns gcc warnings into errors, and allows you to enable a number of internal niceties that help you test your code (such as the unit test framework).
    $ ./configure --enable-dev-mode
    
  2. Look at menuselect and make sure your module is there:
    $ make menuselect
    

    If you don’t see it, something went horribly wrong. Make sure your module has the MODULEINFO section.

  3. Build!
    $ make
    
  4. Install!
    $ sudo make install
    
  5. Start up Asterisk. We should see our module go scrolling on by:
    $ sudo asterisk -cvvg
    ...
    Loading res_sample_module.so.
     == res_sample_module.so => (Sample module)
    
  6. We can unload our module:
    *CLI> module unload res_sample_module.so 
     Unloading res_sample_module.so
    Unloaded res_sample_module.so
    
  7. And we can load it back into memory:
    *CLI> module load res_sample_module.so
    Loaded res_sample_module.so
      Loaded res_sample_module.so => (Sample module)
    *CLI>
    

Granted, this module doesn’t do much, but it does let us form the basis for much more interesting modules in the future.

Running: 10 Miles with gadgets

I have to be careful. I’ve now written two blog posts in two days, after ignoring this blog for a good six months. Seriously: if I write too much, I’ll get annoyed and just stop again.

Maybe every two weeks is a good goal?

Maybe.

Anyway, I wrote last time on maintaining a work/life balance because I felt I needed context for this post. Namely, that I run. That’s traditionally been one of those things I set aside from work, and don’t violate.

Except that hasn’t been true for some time. I have to qualify my “I run” statement: I’m working my way back to becoming a runner. Two years ago, I ran quite a lot, all of which culminated in the Rocket City Marathon. Unfortunately, I torqued my knee about a month before the marathon. Undeterred, I ran it anyway, and at about mile 18, my knee fell apart. “Fell apart” is a nice way of saying it felt like someone took a chisel to my kneecap. I willed myself through the next 8 miles, but I went from running a good 9:00 minute mile (which was, at the time, my goal marathon pace) to 12:00 minute miles. Or worse. I’m not sure.

It sucked.

Either way, I did finish the marathon (4:09 and change) – but the marathon ultimately did me in. I haven’t run in the two years since.

Having running disappear from my life sucked a lot more than the knee pain ever did.

Getting the run back on

Running is a weird thing. The vast majority of the time you’re running, it sucks. Running in the South doesn’t help either; we have the worst conditions for running. Hot and humid just doesn’t go with slogging through a long distance run.

Now, after you’ve done it awhile, it does get “better”. But it’s not like your legs suddenly stop hurting, or your muscles stop cramping, or your head doesn’t spin when you run too hard up a hill and it’s 100 degrees out with 100% humidity. You can just hurt longer as you run farther, and that’s about all.

A funny thing does happen, however: eventually, you start to kind of like it. And when you’ve been out there for quite awhile, you eventually do get the glorious Runner’s High.

So… hours upon hours upon hours of pain to get a few brief moments of pleasure? Sure, why not!

Of course, I’m underselling this a bit. The Oatmeal got the joy of this correct when he called it “The Void”: time seems to stand still. Your legs are moving, things hurt, but it’s all good. Everything is awesome (cue theme music). And, when the world is stressful, when your thoughts are plaguing you, when all is chaotic and crazy… the void is a great place to reach.

It’s all worth the pain to get there. I decided to get back there.

Doing it better

My biggest fear is getting injured again. That knee: it took the wind right out of my void-happy sails. It’s not that injuries won’t happen again – when you’re pounding pavement over and over, sometimes things give – but I’d like to do better at preventing it this time around.

Number 1 problem: old shoes.

This one is easy to fix: buy new shoes and stop being a cheapskate. Compared to other hobbies, running is pretty cheap.

Number 2 problem: running too damn fast.

I tend to do pretty well at keeping myself to a 10% mileage increase week after week, but I get competitive. I attack hills; I try to beat my last pace. I do this even on long runs: if I ran a 6 mile run at 9:30, I can do an 8 mile run at the same pace. Or at 9:25. 9:20? Sure.

FYI: This is dumb.

To try and prevent my nature from overriding my brain – particularly when I want to run harder – I bought a Garmin Forerunner 220. Yay, new toys!


The Garmin Forerunner 220 has got all sorts of crazy features. It’s amazing how much functionality you can fit onto a chip (go-go Moore’s Law). The two features that I wanted the most were probably the most obvious:

  1. A heart rate monitor. Heart rate doesn’t lie.
  2. GPS. I want to actually know my pace per mile, and how far I’ve actually run.

10 miles at 143

Before going out for my 10 mile long run this Saturday, I calculated my aerobic heart rate at 143 (using the somewhat arbitrary calculation of 180 – 32 (age) – 5 (less than 6 months of running)). Last Saturday, I strapped on the heart rate monitor, drove up to Monte Sano, parked at the Elementary school, and started out.
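That calculation, for what it’s worth, is trivially mechanical – it’s a variant of the common 180-minus-age aerobic target, and the exact adjustment is a judgment call more than a science. A quick sketch (my own reading of the method, so treat the adjustment values as assumptions):

```python
def aerobic_target(age, adjustment=0):
    """Rough aerobic heart-rate target: 180 minus age, plus an
    adjustment for training history (e.g. -5 for less than six
    months of consistent running)."""
    return 180 - age + adjustment

# My numbers: 32 years old, less than 6 months back into running.
print(aerobic_target(32, adjustment=-5))  # → 143
```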

It was a great day for running: overcast, a bit drizzly, and cool for July. The mountain tends to get wrapped in fog and clouds when the weather is like this, and Saturday was no exception. There’s a great 10 mile route on the mountain – start at the school, run the Panorama loop, head up to the state park, run to the overlook, run down the old bankhead/toll gate road until you hit five miles, then turn around and run the thing in reverse.

The first thing I noticed was that I hit 143 pretty easily, and at a much slower pace than I thought. I had previously been running my long runs at about a 10 minute per mile pace; I found that this really was too fast. Disappointing, but not unexpected. As it turns out, keeping myself at a 143 heart rate was closer to an 11:20 minute per mile pace. Oh well.

The second thing I learned: I attack hills. Not even a little, a lot. I actually speed up on the inclines, and my heart rate – not surprisingly – goes up a lot. (It’s surprising how fast heart rate goes up when you start going uphill.) It took a bit, but I learned to slow down – a lot – when running up a hill. I ended up having to walk a few of them, particularly later in the 10 miles.

The final thing: I liked running with a heart monitor. It was kind of a game: how close could I keep myself to my target heart rate? It felt like I was a bit more engaged at times with my running. While I have always loved my long runs – far more than the weekly grind – sometimes, they can get a bit dull as you wait for that void to kick in. The heart rate game kept me interested during those first six or eight miles.

When the ten miles were done, and I was back at my car, I was surprised how much better I felt. Often, when I finish a long run, I’m a bit winded and my legs are a bit “jelly” like. Keeping myself a bit slower and in that aerobic zone helped a lot – I felt like I could have easily run another couple of miles.

Supposedly, keeping myself in an aerobic zone will slowly improve speed. I get to test that out over the next few months, as I’ll be wearing this during my weekly training runs.

Hopefully this will get me ready for the new marathon route this winter… and get me in shape to run it better this time around.

Finding a work/life balance

In which I am Young and Dumb

My first job out of college was as a systems engineer, doing software implementations for large public safety projects. It was a great first job. I mean that sincerely: it was a lot of fun. I travelled all over the country, some places illustrious (what, you mean I have to spend one week a month in Laguna Beach for a year?); some not (why hello there Central Texas. 100 degrees + 100% humidity? Sure!). I learned a lot from the experience. I think everyone who aspires to be a software engineer benefits from seeing their software used. As an implementation monkey, I often yearned to drag the software engineers responsible for the buggy piece of shit I was attempting to deploy out of their comfy air conditioned offices to my not as comfy (yet still air conditioned) server room to show them just how often their buggy piece of shit crashed. I later got lucky enough to write my own buggy pieces of shit, at which point, my dreams of berating those who were – quite frankly – far better than me – diminished by a large measure.

Nothing tempers enthusiasm like experience.

I worked a lot. A lot. A compatriot of mine and I once decided to see how many 100+ hour weeks we could work in a row before collapsing in something like dementia. I think I hit 3 (I’m pretty sure he hit 4; I was the weak one). I wrote a lot of really bad software during this time as well, mostly in hotel rooms and on planes. I’d like to believe that sleeplessness had nothing to do with this, but I’m probably whitewashing history. One piece of software I am both equally proud of and embarrassed of was written predominately on two plane trips between Orange County and Atlanta (one of which was a red-eye); it coordinated dispatches between all the fire departments in Orange County. Hilarity ensued when I accidentally cleared every single fire engine in the entire county off their calls.

Oh, I’m sorry, were you going to that structure fire? Not any more!

Whoops. (As an aside, there are some incredibly professional people working in your dispatch centre. Thankfully, they are used to morons like me writing buggy software that their county chooses to buy based on who has the lowest bid. If you meet a dispatcher, thank them. They’re nice people. Fire fighters, for the most part (and I’m making a gender based stereotype here, but I have to go with my experience and the sheer lack of many female fire fighters that I personally met), are frat boys who never grew up. They just find it fun to drive large trucks down interstates at breakneck speeds. Getting to U-Turn the sucker twice in one day is just good fun.)

The value of test driven development was something I learned that day.

Life versus Work

These are all anecdotes to say that I worked too much. I knew I had a problem when the stress of my demanding contracts got to the point that I couldn’t sleep at night. I’d wake up every hour, dreading tomorrow’s e-mails from customers angry that I hadn’t provided the software they wanted. Every day was this existential horror of working until 2 AM, sleeping in 30 minute batches with periodic, hallucinatory jolts of “wakefulness”, only to know that by 6 AM, my work would be insufficient to stave off the devouring beast that were “my customers”.

We were spread a little thin. Not surprisingly, I was getting pretty burned out.

You may think that my overworking was proof that there was something wrong with management. In retrospect… nope, still don’t think so. They actually told me – quite often – to stop working. To go on vacation. To close the damn laptop. For the most part, I ignored them. I just figured, “this is what you do”. I had a problem: I hadn’t yet learned how to say “no”. I liked being “the hero”. Even now, I like a good amount of pressure. There is, of course, a balance, and unfortunately, I hadn’t yet learned that heroes, often as not, get eaten by the dragons they’re attempting to slay.

Naivety is fun.

Listen to advice

My vice president at the time was Alice. She’s amazing, and I still have the utmost respect for her. She’s a legend for showing up at the office before everyone else and going home long after everyone else had turned off the lights. She’s a machine; she always knew every customer, what they wanted, what they didn’t really need, who was good at doing what, who to trust, who not to trust. She had the “pulse” of the company, and she knew how to keep the machine moving. Those of us who were lucky to get to know her wanted to be as big of a bad-ass as she was.

I remember hearing some advice she once gave someone who asked – with some incredulity – how she managed to do all that she did while still maintaining healthy relationships and an active personal life. To paraphrase:

Figure out what you need to be sane. Figure out what you need to be happy. When you know what it is, set that aside. And don’t violate it.

Pretty sure this is in the Agile Certification Test

It’s no secret that I love Agile development. A lot of people associate a maintainable, predictable velocity with Agile development – which, while not part of the manifesto – kind of goes hand in hand with that whole “valuing of individuals” statement. Having a predictable development velocity means you do not do high burns on anything approximating a regular basis. 50 hour weeks are not cool. 100+ hour weeks are so far out of bounds that the stadium in our analogy of “yeah, this is acceptable” is probably in another galaxy. Doing work that exceeds your normal average limit means that there is something wrong. If you can’t maintain a particular velocity, then you can’t predict what you can and cannot do.

As human workers, we’ve gravitated towards 40 hour weeks. I suspect (but am too lazy to look up) that there is a large body of research that says that 8 hours is the most we can expect out of someone in any given day, and giving someone 2 days out of a week as a break is a good idea. I’m sure that works for most people. Others can do more; others can do less. (Personally, I think most people would benefit from a 4 day/9 hour schedule, but that’s just me. I’d take a 10% hit on salary if it meant a 3 day weekend; then again, I’m pretty blessed to have a wife that works and no kids. Your mileage may vary.) Myself, I tend to lean towards the “more” side of that scale, but that’s just a personal preference.

The point is: find a work/life balance. Find what works for you. If 40 hours a week is the magic point, then do that. And don’t exceed it.

If it’s more: that’s fine too. For me, I find myself at about 10 hours a day with a smattering here and there on the weekend. Again: I’m lucky; I have an understanding wife and a dog. People with kids have other priorities, and that’s understandable.

Where I’ve been; where I’m going

The past month, I’d say I was in a “high burn” situation. I worked a lot, especially on the weekends. I’m not sure I was hitting that magic 100 hour week, but I was certainly close (ew). At the same time, I do recognize that I pay for these situations. After these high burns, I’m tired, I’m cranky, and I’m pretty sure I care a lot less about whatever software I was writing. That’s not good: I like to care about what I’m doing. Work is important to me – what I do, what I build, is part of my identity. I find that my life is better that way; devaluing that is no bueno.

At the same time, sometimes, you look at the schedule, look at your list of desired features, close your eyes, hit the afterburners, and pray.

All of this is a long way of saying that I’ve had Alice’s advice in the back of my head this week, chiding me for violating that piece of sanity that I carve out for myself. I’ve already carved it back into my life, which is a good thing. I think there’s a limited number of times you can do high burn situations, and the period of time that you can devote to such a burn decreases as you get older. Witness the start-ups founded by today’s young “hot” entrepreneurs. It’s not surprising that by the time they’re 35, they aren’t doing what they did when they were 25. And that’s not a bad thing.

They probably carved out their piece of happiness as well. At least, I hope they have.

Otherwise, what the hell is the point of all this?

Why picking a license isn’t the first step in going open source

I was at a conference in December last year where I was giving a talk on ARI in Asterisk 12. After my talk, one of the attendees came up, and asked if he could talk to me about running an open source project. I don’t consider myself an expert on the subject by any stretch of the imagination – yes, I’m the project lead for Asterisk, but I learn something new about open source every day! But, being daring, I said yes. His question was approximately thus:

“Once upon a time, I wrote a tool that I sold for a small amount of money to a large number of companies, some of whom are rather large. Over the past many years, I’ve re-written said tool such that it’s a full on project, and I’ve realized that to get it done and available, there are a number of things I could use help with. I’d like to take it open source so that I can finally get this software out there. What open source license should I pick?”

I have to admit that I had a quick, off-the-cuff reply: “don’t pick the GPL”. This certainly surprised him – Asterisk, after all, is dual licensed under the GPLv2, such that Digium can choose to commercially license Asterisk. Our business model certainly benefits from this licensing model. He was, in fact, seriously considering GPLv2 for his project as well – specifically because he wanted to dual license his project. It was, I admitted immediately to him, a glib answer, but one that stemmed from my experience over the past couple of years dealing with the GPL. While a popular and powerful license, the dual licensing aspect is tricky: you have to have a good understanding of what can and cannot be used with a GPL licensed application. This can get particularly sticky when you consider all of the various software distribution models that are out there, some of which were not as prevalent when the GPL was written. While licensing your project under the GPL requires a good amount of knowledge, providing a commercial license compounds this; you now have to manage exceptions to the GPL, which has both public perception issues as well as real legal issues. None of this is fun (at least, I don’t find it fun), and it certainly would be a lot of work for him. But I realized I spoke too quickly, because there was a much more fundamental issue at play here: what, exactly, did he want to do with his project?

In other words: what is his motivation in open sourcing his project?

I’m going to start with the assumption that this is not a moral question. If it is a moral question, then everything that follows here is moot. If you morally feel that all software should be free – as in libre – then there is no question, no dilemma. You open source the bloody thing: in fact, it should have been free when you started! But I assumed that this wasn’t his case – because if so, then the answer of which license you pick is more of a personal preference (although I’m sure some morally inclined people would disagree; it’s still easier than where I’m going with this next).

If his goal was to have people use his software, then open sourcing his project makes sense – and it’s fine to discuss which license to pick.

If his goal is to make money, then open sourcing his project at this time is crazy.

Why is that?

It is not easy to make money on open source software. Yes, lots of companies do it: Red Hat, Digium (yay!), Canonical, etc. However, if you look closely at companies that make money on open source software – producing it, not just using it – you’ll find that they rely on more than open source. They have some other mechanism to generate revenue:

  • Many companies have something that is not free – both in the libre and the beer sense. Think licensing costs, commercial add-on modules which are neither open nor free, etc. If your product is software, you want to find some way to monetize that software. While most open source licenses don’t care if you charge for your software, the end result is that unless you restrict your software in some fashion or provide some added value outside of the open source software, you won’t make any money off your software. You have to hold something back or restrict the software in some way in order to generate revenue strictly from your software.
  • Paid support and/or training on the software. However, as a primary means for a business model, this is not a recipe for large-scale success. This is for two reasons: (1) it does not scale. You may make money yourself by providing support for your software, but as the popularity grows, you will lose ground. There’s only one of you, and the support you provide will eat into your actual development effort. This can, ironically, cause your project to lose popularity. If you find someone to help you, you can hire them; but, assuming they are equally as smart and talented and dedicated as you, you’ve only prolonged the problem. You can’t hire every customer to support your project. (2) Relying only on support implies your software is impenetrable, difficult, and buggy. As you fix that, people will no longer need to pay you (or pay you as much) for support. The more your software matures, the less necessary you are. That is a good thing, but it does undermine this business model.
  • Providing said software as a service, if the project allows for it. However, this may not be enough, as it is difficult to maintain a competitive edge unless the software is substantially complex. If you build a service on top of open source software, someone else will build one too. And another one. And the race to zero begins. Since all modifications have to be made and distributed freely (although SaaS has circumvented many of the older licenses, much to the consternation of the FSF), there is little competitive advantage from one service provider to another (unless, of course, you don’t have to distribute your changes: but that’s hardly open and libre). Note that some open source licenses are much more permissive than the FSF licenses in this regard however – but creating services from your software has a high cost associated with it as well (infrastructure isn’t free). It is, however, one of the more attractive business models: witness Rackspace, hosted VoIP providers, etc.
  • Sell hardware. However, hardware, in general, is not a growth market. This also has a prohibitively high infrastructure/startup cost, which makes it even less palatable.

There are pros and cons to any mechanism used to make money off of open source software. Regardless, the long and short of it is: you need a business plan. In fact, if your primary product is open source software, you need an even more solid business plan than your average software company. And you definitely need a business plan before you choose how to license your software, if you want to make a living from your project.

Do you have a complex project with a targeted niche where your expertise will be required? GPL may be appropriate. Are you planning on using your project as part of a larger whole? A more permissive license may be fine. But all of these are subject to the much more important question of: how do you want to make your money? Because once you ring the bell on the license, you’re going to be stuck with that business plan, and it is very hard to change it.

All of this only matters if you want to make money from your project. That is not the only reason to write software. For developers, it often isn’t even the most important one – and if what you want is to see your work get used, then by all means – license however feels best to you. Permissive licenses will get your software used the most and the fastest; but there’s something to be said for protecting the freedom of your software as well.

But always know your motivations before you release your software. Otherwise, you may end up with a lot of unnecessary regrets.

As an epilogue to this whole post: the individual wanted to make money. He was sure that since he made a little bit of money with the tool, that he’d make a lot with the project – and he viewed open sourcing it as a way of getting free help. There’s a whole host of problems with viewing open source development in that light as well – not the least of which is that you don’t control what your open source contributors work on. I’m not sure he appreciated the sentiment of needing a concrete business plan either; we in the programming/engineering field often don’t like to think through such things. Still, I wished him the best of luck – he has a tough road ahead.

Oh Discordia

I found out today that the man who hired me out of college and mentored me at my first job died in his sleep on New Years Eve. He was only 44.

They believe it was complications from diabetes. He was, however, relatively healthy overall. Finding out that he is dead is so completely unexpected that there aren’t really words to describe it. There aren’t words to describe the entire situation; it is horrible.

I owe a lot to this man. When I find myself in a difficult spot with co-workers, I often think, how would he defuse the situation? When I find myself dreading a phone call or dealing with a situation, I remember him sighing and picking up the phone time and time again to pull my bacon out of the fire with some irate customer. When I wonder what I should say to someone who is frustrated, I remember his advice to me while I was on the road and working late at night. He shaped my understanding of business, and what it means to have a career you can be proud of.

More than what he gave me, he was, quite frankly, just an awesome guy. A lover of dogs, a talented woodworking craftsman, a die-hard Philadelphia sports fan, a tenacious worker with a savvy understanding of technology, a man who could throw an amazing party, a loving husband, a great man and a great friend.

The world is darker with him gone.

Asterisk 12: Now Available (with PJSIP)

We released Asterisk 12 on Friday. So, that’s a thing.

There’s a lot to talk about in Asterisk 12 – as mentioned when we released the alpha, there’s a lot in Asterisk 12 that makes it unlike just about any prior release of Asterisk. A new SIP channel driver, all new bridging architecture (which is pretty much the whole “make a call” part of Asterisk), redone CDR engine (borne out of necessity, not choice), CEL refactoring, a new internal publish/subscribe message bus (Stasis), AMI v2.0.0 (built on Stasis), the Asterisk REST Interface (ARI, also built on Stasis), not to mention the death of chan_agent – replaced by AppAgentPool, completely refactored Parking, adhoc multi-party bridging, rebuilt features, the list goes on and on. Phew. It’s a lot of new code.

Lots of blog posts to write on all that content.

I’ve been thinking a lot about how we got here. The SIP channel driver is easiest: I’ll start there.

Ever since I was fortunate enough to find myself working on Asterisk – now two years, five months ago (where does the time go!) – it was clear chan_sip was a problem. Some people would probably have called it “the problem”. You can see why – output of sloccount below:

  • 1.4 – 15,596
  • 1.6.2 – 19,804
  • 1.8 – 23,730
  • 11 – 25,823
  • 12 – 25,674

Now, I’m not one for measuring much by SLOC. Say what you will about Bill Gates, but he got it right when he compared measuring software progress by SLOC to measuring aircraft building progress by weight. That aside, no matter who you are, I think you can make two statements from those numbers:

  1. The numbers go up – which means we just kept on piling more crap into chan_sip
  2. The SLOC is too damned high

I don’t care who you are, 15000 lines of code in a single file is hard to wrap your head around. 25000 is just silly. To be honest, it’s laughable. (As an aside: when I first started working on Asterisk and saw chan_sip (and app_voicemail, but that’s another story), I thought – “oh hell, what did I get myself into”. And that wasn’t a good “oh hell”.)

It isn’t like people haven’t tried to kill it off. Numerous folks in the Asterisk community have taken a shot at it – sometimes starting from scratch, sometimes trying to refactor it. For various reasons, the efforts failed. That isn’t an insult to anyone who attempted it: writing a SIP channel driver from scratch or refactoring chan_sip is an enormous task. It’s flat out daunting. It isn’t just writing a SIP stack – it’s integrating it all into Asterisk. And, giving credit to chan_sip, there’s a lot of functionality bundled into that 15000/25000 lines of code. Forget calling – everything from a SIP registrar, a SIP presence agent (kinda, anyway), events, and T.38 fax state to a bajillion configuration parameters is implemented in there. The 10% rule applies especially to this situation – you need to implement 90% of the functionality in chan_sip to have a useful SIP stack, and the 10% that you don’t implement is going to be the 10% everyone cares about. And none of them will agree on what that 10% is.

As early as Asterisk 11, a number of folks that I worked with had started thinking about what it would take to rewrite chan_sip. At the time, I was a sceptic. I’m a firm believer in not writing things from scratch: 90% of new software projects fail (thanks Richard Kulisz for the link to that – I knew the statistic from somewhere, but couldn’t find it.) (And there’s a lot of these 90/10 rules, aren’t there? I wonder if there’s a reason for that?). Since so many new software projects fail – and writing a new SIP channel driver would certainly be one – I figured refactoring the existing chan_sip would be a better course. But I was wrong.

Refactoring chan_sip would be the better course if it had more structure, but the truth is: it doesn’t. Messages come in and get roughly routed to SIP message type handlers; that’s about it. There’s code that gets re-used for different types of messages, and changes its behaviour based on the type of message. But a lot of that is just bad implementation and bad organizational design; worse is the bad end user design. You can’t fix users.conf; you can’t fix peers/friends/users in sip.conf. (And no, the rules for peers versus friends aren’t consistent.) Those are things that even if you had a perfect implementation you’d still have to live with, and that’s not something I think any one wants to support forever.

But, I don’t believe that not liking something is justification for replacing it. After all, chan_sip makes a lot of phone calls. And that’s not nothing. So, why write a new SIP channel driver?

As someone who has to act as the Scrum master during sprint planning and our daily scrums, I know we spend a lot of time on chan_sip. Running the statistics, about 25% of the bugs in Asterisk are against chan_sip; anecdotally, half of the time we spend is on chan_sip. When we’re in full out bug fix mode – which is a lot of the time – that’s half of an eight-man development team doing nothing but fixing bugs in one file. Given its structure, even with a lot of testing, we still find ourselves introducing bugs as we fix new ones. Each regression means we spend more than twice the cost of the original bug: the cost to fix it, the cost to deal with triaging and diagnosing the resulting regression report, the cost to actually fix the regression, the impact to whatever issue doesn’t get fixed because we’re now fixing the regression, the cost to write the test to ensure we don’t regress again, etc. What’s more, all of the time spent patching those bugs is time that isn’t spent on new features, new patches from the community, improving the Asterisk architecture, and generally moving the project forward.

Trying to maintain chan_sip is like running in place: you’re doing a lot, but you aren’t going anywhere.

A few weeks before AstriDevCon in 2012, we were convinced that we should do something. There wasn’t any one meeting, but over the months leading up to it, the thought of rewriting chan_sip crossed a number of people’s minds. There were a few factors converging that motivated us prior to the developer conference:

  1. In my mind, the knowledge that we were spending half of the team’s energy on merely maintaining a thing was motivation enough to do something about it.
  2. The shelving of Asterisk SCF. Regardless of the how and the why that occurred, the result was that we had learned a lot about how to write a SIP stack that was not chan_sip. Knowing that it could be done was a powerful motivator to do it in Asterisk.
  3. Asterisk 11. We had spent a lot of time and energy making that as good of an LTS as we thought we could: so if we were going to do something major in Asterisk, the time was definitely now.

As we went into AstriDevCon, the foremost question was, “would the developer community be behind it? Would they want to go along with us?”

As it turned out: yes. In fact, I wasn’t the first one to bring it up at AstriDevCon – Jared Smith (of BlueHost) was. And once the ball got rolling, there wasn’t any stopping it.

The rest, as they say, is history. And so, on Friday, the successor to chan_sip was released to the world.

There’s a long ways to go still. Say what you will about chan_sip (and you can say a lot), it is interoperable with a lot of devices, equipment, ITSPs, and all sorts of other random SIP stacks. It does “just” work. And chan_pjsip – despite all of our testing and the knowledge that it is built on a proven SIP stack, PJSIP – has not been deployed. It will take time to make it as interoperable as chan_sip is. But we’ll get there, and the first biggest step has been taken.

The best part: all of the above was but one project in Asterisk 12! Two equally large projects were undertaken at the same time – the core refactoring and Stasis/ARI – because if you’re going to re-architect your project, you might as well do as much as you can. But more on that next time.
