Howdy, strangers ! In today’s article on networking we’re looking at the application layer. If you’ve been paying attention, you know that it’s a formless layer. The reason is in the name : when he invented the Internet, Al Gore simply couldn’t predict what applications you’d use it for.
This means you need to come up with your own network protocol, optimized for your application’s needs. But what does that mean ? It kinda sounds like I’m telling you to come up with an arbitrary language, and that sounds hard !
We can’t all be Noam Chomsky, and fortunately we don’t have to.
Quick note : this article is mostly theoretical and comes in support of project-specific articles. If you’re looking for examples, that’s where you should go next.
Networking and Payloads
When I say the application layer is formless, what I mean is that there’s no grammar or vocabulary to it. But there are letters : bytes. Those bytes will form the payload of network packets, and one reason for using bytes is that they are the computer science equivalent of atoms. Memory capacity is expressed in bytes, and so are memory addresses. And while your PC may use a 64-bit processor, there are many, many more 8-bit processors out there, for example the one that runs the timer on your oven. Communicating with bytes makes it easier for everything to talk together.
Another important reason networks use bytes is endianness. Not every computer agrees on which 8 bits should come out first when you transmit a 16-bit word or larger. We send bytes because they are the common denominator.
So you have bytes. What can you do with them ? What should you do with them ?
What do network packets do ?
That’s the fundamental question you need to answer in order to design your application layer.
Packets do two things. The most obvious is data transmission. The less obvious (though not by much) is timing control. In other words, it’s not just the data you carry that matters, it’s also when you carry it. A good protocol addresses both aspects.
Data transmission : the “vocabulary”
Typically, you will design packets to carry specific variables from one machine to another. This means you’ll need to map those variables onto an array of bytes. That array of bytes will be your packet payload.
You can do this mapping with many different tools depending on which programming language you’re using and, of course, your own preferences. You may even have to follow strict coding rules, if you’re working in the automotive or aeronautics industries. Whether you use C unions, byte pointers or the C# “tobytearray” method is up to you.
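To make the mapping concrete, here is a minimal sketch using Python’s standard `struct` module. The three variables, their sizes and the names `SENSOR_FORMAT`, `encode_sensor` and `decode_sensor` are all assumptions made up for illustration, not something taken from one of the projects :

```python
import struct

# Hypothetical payload layout (an assumption for illustration):
#   temperature : signed 16-bit integer, in tenths of a degree
#   humidity    : unsigned 8-bit integer, in percent
#   uptime      : unsigned 32-bit integer, in seconds
# "<" = little-endian with no padding, "h" = int16, "B" = uint8, "I" = uint32.
SENSOR_FORMAT = "<hBI"


def encode_sensor(temperature, humidity, uptime):
    """Map three variables onto a byte array usable as a packet payload."""
    return struct.pack(SENSOR_FORMAT, temperature, humidity, uptime)


def decode_sensor(payload):
    """Recover the three variables from a received payload."""
    return struct.unpack(SENSOR_FORMAT, payload)


payload = encode_sensor(-125, 40, 3600)
print(len(payload))            # 7 bytes: 2 + 1 + 4
print(decode_sensor(payload))  # (-125, 40, 3600)
```

The format string plays the same role as a packed C struct : it fixes, once and for all, which byte of the payload belongs to which variable.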
It’s entirely possible to design a single packet that can carry every possible variable your program may wish to communicate. In practice however, you’ll want to use bandwidth more efficiently. Moreover your network may limit a packet’s size. Luckily, the application layer being formless, you can design as many different packet payload formats as you wish… as long as you make sure they include something to let your program tell them apart. Consider those payload formats the words of your protocol’s vocabulary.
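One common way to let a program tell payload formats apart is to reserve the first byte of every payload for a type code. The sketch below assumes two made-up packet types (`TYPE_BUTTON` and `TYPE_SENSOR`) with arbitrary numeric values; your own protocol would define its own :

```python
import struct

# Hypothetical packet types; the numeric values are arbitrary assumptions.
TYPE_BUTTON = 0x01  # payload: type byte + 1-byte button state
TYPE_SENSOR = 0x02  # payload: type byte + int16 temperature + uint8 humidity


def encode_button(pressed):
    return struct.pack("<BB", TYPE_BUTTON, 1 if pressed else 0)


def encode_sensor(temperature, humidity):
    return struct.pack("<BhB", TYPE_SENSOR, temperature, humidity)


def decode(payload):
    """Use the first byte to pick the right layout for the rest."""
    packet_type = payload[0]
    if packet_type == TYPE_BUTTON:
        (state,) = struct.unpack_from("<B", payload, 1)
        return ("button", bool(state))
    if packet_type == TYPE_SENSOR:
        temperature, humidity = struct.unpack_from("<hB", payload, 1)
        return ("sensor", temperature, humidity)
    raise ValueError(f"unknown packet type {packet_type:#04x}")


print(decode(encode_button(True)))     # ('button', True)
print(decode(encode_sensor(215, 55)))  # ('sensor', 215, 55)
```

Note that the button packet is only 2 bytes long while the sensor packet takes 4 : each “word” is as long as it needs to be, and no longer.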
Ultimately you’ll face a trade-off : using lots of small packets will help you shape your application’s network traffic, but using fewer larger packets will reduce network overhead. Overhead isn’t just bandwidth : it also includes all the API calls to your network protocol stack. Where you set the needle will depend entirely on your application.
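A bit of arithmetic shows the bandwidth side of that trade-off. This sketch assumes UDP over IPv4 with no IP options, which costs 20 + 8 = 28 header bytes per packet; link-layer framing is ignored, and the function name `bytes_on_wire` is just an illustration :

```python
import math

# Assumption: UDP over IPv4 with no IP options, so each packet carries
# 20 bytes of IPv4 header plus 8 bytes of UDP header. Link-layer framing
# (Ethernet, Wi-Fi...) would add more and is ignored here.
HEADER_BYTES = 20 + 8


def bytes_on_wire(total_payload, payload_per_packet):
    """Return (number of packets, total bytes sent including headers)."""
    packets = math.ceil(total_payload / payload_per_packet)
    return packets, total_payload + packets * HEADER_BYTES


small = bytes_on_wire(1000, 10)   # many small packets
large = bytes_on_wire(1000, 500)  # a few large packets
print(small)  # (100, 3800)
print(large)  # (2, 1056)
```

Under these assumptions, sending the same 1000 bytes of data costs 3800 bytes in 10-byte payloads but only 1056 bytes in 500-byte payloads. The small packets also mean 100 send calls instead of 2.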
The protocol / language analogy is very apt, though : whatever language you speak, you’re well aware that the words you use most often are also the shortest. You can apply that same principle to designing your application layer packets.
Timing : the “grammar”
When to send a packet is as important as what you put in it. Likewise, the time at which you receive a packet can be as important as what’s in it. Case in point : the ping service, where timing is literally everything.
One of the key choices you get to make as you design an application layer is the frequency at which you send packets. If you’re transferring files (bulk data) then you want maximum frequency, sending packets back to back. Or you could send packets in reaction to an event : asynchronously. Third option, you could send your packets at regular intervals.
Which timing to use is not always obvious. For example, a networked button could simply send a packet when the user presses it. Yet by using periodic packets instead you can also convey the information that the button is still “alive”.
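On the receiving side, the “alive” information boils down to a timeout. Here is a minimal sketch, assuming the button sends one packet per second and that you tolerate up to 3 seconds of silence; both numbers and the `LivenessMonitor` class are hypothetical. Timestamps are passed in explicitly so the logic is easy to test; a real program would typically feed it `time.monotonic()` :

```python
class LivenessMonitor:
    """Declare a sender dead if no packet arrived within `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = None  # no packet received yet

    def packet_received(self, now):
        """Call this every time a periodic packet arrives."""
        self.last_seen = now

    def is_alive(self, now):
        if self.last_seen is None:
            return False
        return (now - self.last_seen) <= self.timeout


monitor = LivenessMonitor(timeout=3.0)
print(monitor.is_alive(now=9.0))   # False: nothing heard yet
monitor.packet_received(now=10.0)
print(monitor.is_alive(now=11.0))  # True: heard 1 second ago
print(monitor.is_alive(now=13.5))  # False: silent for 3.5 seconds
```

With purely event-driven packets you can’t build this : silence could mean “nobody pressed the button” just as well as “the button is unplugged”.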
Where to start
Crafting an application layer can be as easy or as complicated as you want to make it. My advice : always start simple, with as few different packets as possible. Keep in mind that you’ll need to write code to send and process each type of packet you design, so starting with few packets gets you to something that works more quickly. Then you can refine your application layer.
Endianness
Nope, I didn’t invent that word just to troll you. It’s a real thing, and you need to keep a very close eye on it as you design your application layer.
Another reason network protocols are based on bytes is that computers sometimes disagree on how to store wider formats in memory. Take the 16-bit value 0x1234, made of bytes 0x12 and 0x34 : if you’re on an Intel-based machine, it will be stored in memory as 0x34 followed by 0x12, because the “least significant” byte goes first. This is called little-endian. Other machine types do the opposite, which is called big-endian. Now here’s the rub : it was decided long ago that networks are big-endian. I guess Murphy was involved somewhere.
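You can see both byte orders for yourself with Python’s standard `struct` module, where `<` asks for little-endian, `>` for big-endian and `!` for network order. This is only a quick demonstration of the 0x1234 example, not a piece of any particular protocol :

```python
import struct
import sys

value = 0x1234

little = struct.pack("<H", value)   # least significant byte first
big = struct.pack(">H", value)      # most significant byte first
network = struct.pack("!H", value)  # "network order", same as big-endian

print(little.hex())   # 3412
print(big.hex())      # 1234
print(network == big) # True

# Byte order of the machine running this script: "little" or "big".
print(sys.byteorder)
```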
What does it mean to you, as a programmer ? Maybe nothing. If you’re crafting an application layer that will only be used among little-endian machines (for example, you’re programming a PC multiplayer game) then it doesn’t matter to you which order the bytes of your packets get sent in. The network is not going to shuffle them around in transit.
Most machines today are little-endian. This includes PCs and ARM-based machines like Android devices.
However, if you’re going to network together machines with different endianness, then you need to decide which endianness you’re going to use. It doesn’t need to be that of the network (big-endian). The smart thing to do is to use the endianness of the least powerful devices in your project. It is far easier for a PC to swap bytes around than it is for a microcontroller.
You’ll always end up with some machine(s) needing code to convert between big and little endianness. This can be implemented in different ways : inline functions, methods, macros… if you’re lucky your processor might also have dedicated byte-swapping instructions. The good news is you only need to code those tools once. The bad news is you need to remember to use them any time you read or write packet payloads. That’s one of the dubious joys of network programming.
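As a rough idea of what those tools look like, here are two hypothetical helpers, `swap16` and `swap32`, written with plain shifts and masks so they translate almost word for word into C macros or inline functions. They assume unsigned values that fit in 16 and 32 bits respectively :

```python
def swap16(value):
    """Swap the two bytes of an unsigned 16-bit value."""
    return ((value & 0x00FF) << 8) | ((value & 0xFF00) >> 8)


def swap32(value):
    """Reverse the four bytes of an unsigned 32-bit value."""
    return (((value & 0x000000FF) << 24) |
            ((value & 0x0000FF00) << 8) |
            ((value & 0x00FF0000) >> 8) |
            ((value & 0xFF000000) >> 24))


print(hex(swap16(0x1234)))      # 0x3412
print(hex(swap32(0x12345678)))  # 0x78563412

# Swapping twice gets you back where you started.
print(swap16(swap16(0xBEEF)) == 0xBEEF)  # True
```

In Python itself you would rarely need these, since `int.to_bytes` and `int.from_bytes` take the byte order as an argument. The shift-and-mask version is shown because it is the form you’re most likely to write on a microcontroller.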
This is perhaps the shortest article I’ve written so far. Part of that is due to the lack of examples (holding your hands takes time) but really this is because of the vastness of the topic : I could either give you this quick to-the-point introduction or start writing a whole book on protocol design. The latter has been done already, but I’ve been told the former was very hard to come by. So here you are.
I expect you came to this page from a link in one of my projects. Otherwise I suggest you check out my various networking-enabled projects : they all contain an example of application layer design from which you could draw inspiration.
In my next article I’ll move on to the layer directly below the application layer : the transport layer.
See you there !