A Distributed Communication System for Modular Applications
(Published on 2012-02-15)
I have a vision. A vision in which rigid point-to-point IPC is replaced with a far more flexible and distributed communication system. A vision in which different components in the same program can interact with each other without having to worry about each other's internal state. A vision where programs can be designed in a modular way, without even worrying about whether to use threads or an event-based model. A vision where every component communicates with others, and where you can communicate with every component. And more importantly, a vision in which each component can be implemented in a different programming language, without the need for specific code to glue everything together.
If that sounds interesting to you, then please read on. As a small research project of mine, I've been looking into ways to realize the above vision, and I believe I have found an answer. In this article I'll try to explain my ideas and how they may be used to realize this vision.
My ideas have been heavily inspired by Linda. If you're already familiar with that, then what I present here probably won't be very revolutionary. Still, there are several aspects in which my ideas differ significantly from Linda, so you won't be bored reading this. :-)
In this section I'll try to introduce the overall concept and some terminology. This is going to be somewhat abstract and technical, but please bear with me. I promise that things will get more interesting in the later sections.
Let me first define an abstract communications framework. We have a network and a bunch of sessions connected to that network. Sessions can communicate with each other through this network (that's usually what a network is for, after all). These sessions do not have to be static: they may come and go. Keep in mind that, for the purpose of explaining this concept, these terms are very abstract: a session can be anything. A process, a thread, a single function, an object, or even your mobile phone. Anything. In the same way, the network is nothing more than an abstract way to connect these sessions. It could be sockets, pipes, an HTTP server, a broadcast network or just shared memory between threads. If it allows sessions to communicate, I'll call it a network.
Unlike many communication systems, this network does not have the concept of addresses. There is no direct way for one session to identify another, and indeed there is no need to do so for the purposes of communication. Instead, the primary means of communication is by using tuples and patterns.
A tuple is an ordered sequence (list, array, whatever terminology you prefer) of zero or more elements. Each element may have a different type, so it can hold booleans, integers, floating point numbers, strings and even more complex data structures such as arrays or maps. You may think of a tuple as an array in JSON notation, if that makes things easier to understand.
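To make the JSON comparison concrete, here is a minimal sketch in Python. The element values are made up for illustration, and JSON is only one possible encoding, not a decision about the wire format.

```python
import json

# A tuple written as a Python list: order matters and element types may differ.
# The element values here are made up for illustration.
tup = ["fs", "deleted", "notes.txt", 1024, True, {"owner": "alice"}]

# Encode as a JSON array for transport, then decode on the receiving side.
wire = json.dumps(tup)
decoded = json.loads(wire)

print(wire)
print(decoded == tup)
```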
Sessions send and receive tuples to communicate with each other. On the sending side, a session simply "passes" a tuple to the network. This is a non-blocking, asynchronous operation. In fact, it makes no sense to make this a blocking action, because the sender cannot know whether the tuple will be received by any other session anyway. The tuple may be received by many other sessions, or there may not even be a single session interested in the tuple at all.
On the receiving side, sessions register patterns. A pattern itself is mostly just a tuple, but with a more limited set of allowed types: only those types for which exact matching makes sense, like booleans, integers and strings. A pattern matches an incoming tuple if the first n elements of the tuple, where n is the number of elements in the pattern, exactly match the corresponding elements of the pattern. A special wildcard element may be used to match any value of any type.
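The matching rule can be sketched as a small Python function. The names `ANY` and `matches` are hypothetical, and two details are assumptions rather than settled semantics: a pattern that is longer than the tuple never matches, and "exact" means same type and same value.

```python
ANY = object()  # stand-in for the wildcard element (hypothetical name)

def matches(pattern, tup):
    """Return True if the first len(pattern) elements of tup match the pattern."""
    if len(pattern) > len(tup):
        return False  # assumption: the tuple must cover the whole pattern
    for p, t in zip(pattern, tup):
        if p is ANY:
            continue  # the wildcard accepts any value of any type
        # Assumption: "exact" means same type and same value, so 1 != True.
        if type(p) is not type(t) or p != t:
            return False
    return True

print(matches(["object", ANY], ["object", "add", 7]))
print(matches(["object", "add"], ["object", "delete", 7]))
```

Since only the first n elements are compared, a shorter pattern is simply a less specific one.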
A session thus only receives tuples from other sessions if it has registered a pattern for them. As mentioned, it is not illegal to send a tuple for which no other sessions have registered. In this case, the tuple will just be discarded. It is also possible that many sessions have registered a matching pattern, in which case all of these sessions will receive the tuple. As an additional rule, if a session sends out a tuple that matches one of its own patterns, then it will receive its own tuple. (However, programming interfaces might allow this to be detected and/or disabled if this eases the implementation of a session.)
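As a rough illustration of these delivery rules, the sketch below models the network inside a single Python process. `Network`, `Session` and `inbox` are hypothetical names, and one detail is an assumption: a session receives a tuple once even if several of its patterns match.

```python
from collections import deque

ANY = object()  # wildcard marker (hypothetical name)

def matches(pattern, tup):
    return len(pattern) <= len(tup) and all(
        p is ANY or p == t for p, t in zip(pattern, tup))

class Network:
    """In-process stand-in for the network: it only keeps track of sessions."""
    def __init__(self):
        self.sessions = []

    def connect(self, name):
        session = Session(self, name)
        self.sessions.append(session)
        return session

class Session:
    def __init__(self, network, name):
        self.network, self.name = network, name
        self.patterns = []
        self.inbox = deque()  # received tuples wait here until the session reads them

    def register(self, pattern):
        self.patterns.append(pattern)

    def send(self, tup):
        # Queue the tuple for every session with a matching pattern, the sender
        # included. Nothing is returned: the sender cannot tell who received it.
        for session in self.network.sessions:
            if any(matches(p, tup) for p in session.patterns):
                session.inbox.append(tup)

net = Network()
fs, log, ui = net.connect("fs"), net.connect("log"), net.connect("ui")
fs.register(["fs", "delete", ANY])
log.register(["fs"])                 # everything that starts with "fs"
ui.register(["ui", ANY])

ui.send(["fs", "delete", "a.txt"])   # reaches fs and log
ui.send(["ui", "redraw"])            # matches the sender's own pattern
ui.send(["nobody", "listens"])       # no registration, so it is discarded

for s in (fs, log, ui):
    print(s.name, list(s.inbox))
```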
Finally, there is the concept of a return-path. Upon sending out a tuple, a session may indicate that it is interested in receiving replies. The network is then responsible for providing a return-path: a way for receivers of the tuple to reply to it. When a tuple is received, the session has the option to reply to it: a reply consists of one or more tuples that are sent directly to the session from which the tuple originated, using this return-path. When a receiver is done replying to the tuple or when it has no intention of sending back a reply, it should close the return-path to indicate this. The session that sent the original tuple is then notified that the return-path is closed, and no more replies will be received. If there is no session that has registered for the tuple, the return-path is closed immediately (or at least, the sending session is notified that there won't be a reply). If the tuple is received by multiple sessions, then the replies will be interleaved over the return-path, and the path is closed when all of the receiving sessions have closed their end.
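The bookkeeping behind a return-path could look roughly like the following sketch. `ReturnPath` and `ReplyEnd` are hypothetical names, and the network that would hand a `ReplyEnd` to each receiver is left out to keep the example short.

```python
class ReturnPath:
    """Sender's side: collects replies until every receiver has closed its end."""
    def __init__(self):
        self.replies = []
        self.open_ends = 0

    @property
    def closed(self):
        return self.open_ends == 0

    def new_end(self):
        # Called once for every session that receives the tuple.
        self.open_ends += 1
        return ReplyEnd(self)

class ReplyEnd:
    """Receiver's side of a return-path."""
    def __init__(self, path):
        self.path = path
        self.is_open = True

    def reply(self, tup):
        if not self.is_open:
            raise RuntimeError("return-path end already closed")
        self.path.replies.append(tup)

    def close(self):
        if self.is_open:
            self.is_open = False
            self.path.open_ends -= 1

path = ReturnPath()
a, b = path.new_end(), path.new_end()  # two sessions received the tuple
a.reply(["file", "a.txt"])
b.reply(["file", "b.txt"])             # replies from both receivers interleave
a.reply(["file", "c.txt"])
a.close()
print(path.closed)                     # b still holds its end open
b.close()
print(path.closed, path.replies)

unanswered = ReturnPath()              # nobody registered for the tuple
print(unanswered.closed)
```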
Common design patterns and solutions
The previous section was rather abstract. This section provides several examples of how to implement common tasks and design patterns using the previously described concepts.
Broadcast notifications are commonly implemented in OOP systems using the Observer pattern. Implementing the same using tuples and patterns is an order of magnitude simpler, as broadcast notifications are pretty much the native means of communication.
In OOP you have the "observers" that can add themselves to the "observer list" of any "object". This observer list is usually managed by the object that is to be observed. If something happens to the object, it will walk through the observer list and notify each observer.
If you represent an object as a session and define a notification as a tuple that follows a certain pattern, then you very easily achieve the same functionality as with an OOP implementation. In fact, there are some advantages to doing it this way:
- Sessions stay registered to the same notifications even if the "object" (the session that is being observed) is restarted or replaced with something else. It's the network itself that keeps track of the registrations, not the sessions that provide the notifications. Of course, this can be seen as a drawback, but you can easily emulate OOP behaviour by providing an extra notification when the "object" is shut down, indicating that the observing sessions can remove their patterns.
- Since there is no need for the session that is being observed to keep a list of sessions that are observing it, it also doesn't have to walk the list and send out multiple notifications. Notifying the observers is as simple as sending out a single tuple.
- Many implementations of the Observer pattern maintain only a single list of observers per object, and each listed observer will be notified of every change to the object. For example, if an object maintains a list and provides notifications when something is added to or deleted from the list, every observer will be notified of both the "added" action and the "deleted" action. The use of tuples and patterns allows observers to register for all actions, or just for a single one. If an "add" action is announced with a tuple of ["object", "add", id] and a "delete" action with ["object", "delete", id], then an observing session can register with the pattern ["object", *] to be notified of both actions, or just ["object", "add"] to register only for additions.
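The selective registration from the last point can be checked with a few lines of Python. The session names `audit` and `indexer` are made up, and `matches` is a hypothetical helper implementing the prefix rule described earlier.

```python
ANY = object()  # wildcard marker (hypothetical name)

def matches(pattern, tup):
    return len(pattern) <= len(tup) and all(
        p is ANY or p == t for p, t in zip(pattern, tup))

# Patterns registered by two hypothetical observing sessions.
registrations = {
    "audit": ["object", ANY],      # both additions and deletions
    "indexer": ["object", "add"],  # additions only
}

def receivers(tup):
    """Names of the sessions whose pattern matches the notification tuple."""
    return sorted(n for n, pattern in registrations.items() if matches(pattern, tup))

print(receivers(["object", "add", 42]))
print(receivers(["object", "delete", 42]))
```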
Of course, this is only one way to implement a notification mechanism. There are also solutions that more accurately mimic the behaviour of the OOP Observer pattern in cases where that is desired.
A command is what I call something along the lines of one session telling another session to do something. Suppose we have a session representing a file system. A command for this session could then be something like "delete file X".
In a sense, this isn't much different from a notification as described above. The file system session would have registered a pattern like ["fs", "delete", *], where the wildcard is used for the file name. If another session then wants to have a file deleted, the only thing it will have to do is send out a tuple matching that pattern, and the file system session will take care of deleting it.
In the above scenario, the session sending the command has no feedback whatsoever on whether the command has been successfully executed or not. Whether this is acceptable depends of course on the specific application. One way of still providing some form of feedback is to have the file system session send out a notification tuple, e.g. ["fs", "deleted", "file"]. (Note that the second element is now deleted rather than delete. Using the same tuple for actions and notifications is going to be very messy...) This way the session sending the command, in addition to any other sessions that happen to be interested in file deletion, will be notified of the deletion of the file. An alternative solution is to use the RPC-like method, as described below.
RPC is in essence nothing more than providing an interface similar to a regular function call to a component that can't be reached via a regular function call (e.g. because the object isn't inside the address space of the program). RPC is generally a request-response type of interaction, and by making use of the return-path facility described earlier, all of the functionality of RPC is also available with the concept of tuple communication.
Commands, the RPC-way
Take the previous file system example. Instead of just sending the command tuple to delete the file, the session could indicate that it is interested in replies and the network will create a return-path. If the return-path is closed before any replies have been received, then the commanding session knows that the file system session is either down or broken. Otherwise, the file system session has the ability to send back a response. This could be a simple "okay, file has been deleted" tuple if things went alright, or an error indication if things didn't go too well. The commanding session has the option to either block and wait for a reply (or a close of the return-path), or continue doing whatever it wanted to do and asynchronously check for a reply.
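The three possible outcomes of such a command can be sketched as follows. This is the blocking variant only, and everything in it is hypothetical: the `Network.request` method, the shape of the reply tuples, and the shortcut of letting a handler's return value stand in for "send replies, then close the return-path".

```python
ANY = object()  # wildcard marker (hypothetical name)

def matches(pattern, tup):
    return len(pattern) <= len(tup) and all(
        p is ANY or p == t for p, t in zip(pattern, tup))

class Network:
    def __init__(self):
        self.handlers = []  # (pattern, handler) pairs

    def register(self, pattern, handler):
        self.handlers.append((pattern, handler))

    def request(self, tup):
        """Send tup with a return-path and collect the replies.

        Each handler returns a list of reply tuples; returning from the handler
        stands in for closing its end of the return-path.
        """
        replies = []
        for pattern, handler in self.handlers:
            if matches(pattern, tup):
                replies.extend(handler(tup))
        return replies  # empty: the path was closed without any reply

files = {"a.txt", "b.txt"}  # toy file system state

def fs_delete(tup):
    name = tup[2]
    if name in files:
        files.remove(name)
        return [["ok"]]
    return [["error", "no such file"]]

net = Network()
print(net.request(["fs", "delete", "a.txt"]))  # no file system session yet
net.register(["fs", "delete", ANY], fs_delete)
print(net.request(["fs", "delete", "a.txt"]))  # the file exists
print(net.request(["fs", "delete", "a.txt"]))  # the file is already gone
```

An asynchronous variant would return a handle right away and let the caller poll it or attach a callback, instead of collecting the replies before returning.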
The downside of using the return-path rather than the previously mentioned notification approach is that other sessions can't easily be notified of file deletion. Of course, another session can register for the same pattern as the file system did and thus receive the same command, but it would have no way of knowing whether the delete was actually successful or not. For other sessions to be notified as well, the file system session would probably have to send out a notification tuple. Whether this is necessary depends on the application: you only have to implement the functionality that you actually need.
Another use of RPC, and thus also of the return-path, is to allow sessions to request information from each other. Using the same example again, the file system session could register for a pattern such as ["fs", "list"]. Upon receiving a tuple matching that pattern, the session would send a list of all its files over the return-path. Other sessions can then request this list by simply sending out the right tuple and waiting for the replies.
Advantages over other systems
Now that I've hopefully convinced you that my communication concept is powerful enough to build applications with, you may be wondering why you should use it instead of other technologies. After all, you can achieve pretty much the same functionality with just regular OOP, RPC, message passing, or other systems. Let me present some of the inherent advantages that this system has compared to others, and why it will help in designing flexible and modular applications.
Loose coupling of components
Sessions (representing the components of a system) do not have to have a lot of knowledge about each other. Sessions implicitly provide abstracted services using tuple communications, in much the same way as interfaces explicitly do in OOP.
Very much unlike in OOP, however, sessions do not even have to know how the others should be used in threaded or event-based environments. For example, threading in OOP is a pain: which objects should implement synchronisation and which shouldn't? The answer to this question is not nearly as obvious as it should be. With event-based systems, you'll always need to worry about how long a certain function call blocks the caller's thread. Since communication between the different sessions is completely asynchronous, these worries are gone.
Sessions can communicate with other sessions without knowing where they are. The major advantage of this is that a session can be moved around without having to change a single line of code in any of the sessions relying on its service. This allows sessions that communicate a lot with each other to be placed in the same process, while resource-heavy sessions may be distributed among several physical devices.
Programming language independence
All communication is done solely with tuples, which can be represented as abstract objects and serialized and deserialized (or marshalled/unmarshalled, whichever terminology you prefer) for communication. I used a JSON array as an example of a tuple earlier, and perhaps it's not such a bad one: JSON data can be interchanged between many programming languages, and is quite often not that annoying to use. Still, there are many other alternatives (Bencoding, XML, binary encodings, etc.), and it all depends on the exact data types and values you wish to use for communication.
Language independence allows each session to be (re)implemented in a different language, again without affecting any other sessions. Did you write an application in a high-level language and notice that performance wasn't as good as you wanted? Then you can very easily rewrite the most resource-heavy sessions in a low-level language such as C. Similarly, it allows developers to hook into your application even when they are not familiar with your favorite programming language.
Not only can other applications and/or plugins hook into your application, you can also connect a simple debugger to the network. The debugger just has to register for a pattern and then print out any received tuples, allowing you to see exactly what is being sent over the network and whether the sessions react as expected. Similarly, the debugger could allow you to send tuples back to the network and see whether the sessions react as they should. Unfortunately, what is being sent over a return-path is generally not visible to anyone but the receiver of the replies, although a network implementation might allow a debugging application to look into that as well.
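A debugger of this kind takes only a few lines in a toy in-process model. The `Network` class below is hypothetical, and the sketch assumes that an empty pattern is allowed and, following the prefix rule, matches every tuple.

```python
ANY = object()  # wildcard marker (hypothetical name)

def matches(pattern, tup):
    return len(pattern) <= len(tup) and all(
        p is ANY or p == t for p, t in zip(pattern, tup))

class Network:
    def __init__(self):
        self.handlers = []  # (pattern, handler) pairs

    def register(self, pattern, handler):
        self.handlers.append((pattern, handler))

    def send(self, tup):
        for pattern, handler in list(self.handlers):
            if matches(pattern, tup):
                handler(tup)

net = Network()
deleted = []  # records what a toy file system session was asked to delete
net.register(["fs", "delete", ANY], lambda tup: deleted.append(tup[2]))

seen = []
def debugger(tup):
    seen.append(tup)
    print("debug:", tup)

net.register([], debugger)  # assumption: an empty pattern matches every tuple

net.send(["ui", "redraw"])
net.send(["fs", "delete", "tmp.txt"])  # injected by hand to test the fs session
print(deleted)
```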
Where to go from here
What I've described above is nothing more than a bunch of ideas. To actually use this, there's a lot to be done.
- Defining a "tuple"
What types can be used in tuples? Should a tuple have some maximum size or a maximum number of elements? Should a NULL type be included? What about a boolean type, why not use the integers 1 and 0 for that? Should it be possible to interchange binary data, or only UTF-8 strings?
What will be the size of an integer that a session can reasonably assume to be available? Specifying something like "infinite" is going to be either inefficient in terms of memory and CPU overhead or will require extra overhead (in terms of code) in usage. Specifying that everything should fit in a 64-bit integer is a lot more practical, but may be somewhat annoying to cope with in many dynamically typed languages running on 32-bit architectures. Specifying that integers are 32 bits will definitely ease the implementation of the network library in interpreted languages, but lowers the usefulness of the integer type and is still a pain to use in OCaml (which has 31-bit integers).
These choices greatly affect the ease of implementing a networking library for specific programming languages and the ease of using the network to actually develop an application.
- The exact semantics of matching
Somewhat similar to the previous point, the semantics of matching tuples with patterns should also be defined in some way. Some related questions are whether values of different types may be equivalent. For example, is the string "1234" equivalent to an integer with that value? What about NULL and/or boolean types? If there is a floating point type, you probably won't need exact matching on those values (floating points are too imprecise for that anyway), but you might still want the floating point number 10.0 to match the integer 10 to ease the use in dynamic languages where the distinction between integer and float is blurred.
- Defining the protocol(s)
Making my vision of modularity and ease of use a reality requires that any session can easily communicate with another session, even if they have a vastly different implementation. To do this, we need a protocol to connect multiple processes together, whether they run on a local machine or over a physical network.
- Coding the stuff
Obviously, all of this remains a mere concept if nothing ever gets implemented. Easy-to-use libraries are needed for several programming languages. And more importantly, actual applications will have to be developed using these libraries.
Of course, realizing all of the above is an iterative process. You can't write an implementation without knowing what data types a tuple is made of, but it is equally impossible to determine the exact definition of a tuple without having experimented with an actual implementation.
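To make the matching question from the list above more concrete, here are two candidate element comparisons in Python. Neither is a decision about the final semantics, and both function names are made up. The only difference is whether an integer and a float with the same value are considered equal.

```python
def strict_equal(a, b):
    """Same type and same value: 10 != 10.0, 1 != True, "1234" != 1234."""
    return type(a) is type(b) and a == b

def numeric_equal(a, b):
    """Like strict_equal, but an int and a float with the same value match."""
    def is_number(x):
        # bool is a subclass of int in Python, so it is excluded explicitly
        return isinstance(x, (int, float)) and not isinstance(x, bool)
    if is_number(a) and is_number(b):
        return a == b
    return strict_equal(a, b)

for a, b in [(10, 10.0), (1, True), ("1234", 1234), ("x", "x")]:
    print(repr(a), repr(b), strict_equal(a, b), numeric_equal(a, b))
```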
What's the plan?
I've been working on documenting the basics of the semantics and the point-to-point communication protocol, and have started on an early implementation in the Go programming language to experiment with. I've dubbed the project Tanja, and have published my progress on a git repo.
My intention is to also write implementations for C and Perl, experiment with that, and see if I can refine the semantics to make this concept one that is both efficient and easy to use.
Since I still have no idea whether this concept is actually a convenient one to write large applications with, I'd love to experiment with that as well. My original intention has always been to write a flexible client for the Direct Connect network, possibly extending it to other P2P or chat networks in the future. So I'd love to write a large application using this concept, and see how things work out.
In either case, if this article managed to get you interested in this concept or in project Tanja, and you have any questions, feedback or (gasp!) feel like helping out, don't hesitate to contact me! I'm available as 'Yorhel' on Direct Connect at adc://blicky.net:2780 and IRC at irc.synirc.net, or just drop me a mail at