Datagram Congestion Control Protocol

 

This site was derived from http://wlug.org.nz/DCCP but this is now the master site for the DCCP implementation for Linux.

DCCP is a transport-layer protocol (like TCP and UDP) that provides congestion control for unreliable datagram flows. It is useful for applications that do not need the reliability and retransmission of TCP but, unlike UDP, do want a session and congestion control.

DCCP is currently at proposed standard RFC status (4340-4342).

The main reference page on the web for DCCP.

There is also a writeup at LWN.

DCCP stack for Linux

 

There is a GPL version of DCCP in the Linux Kernel. This is being maintained by Arnaldo C. Melo at present. The history of this is that it draws from the code of Patrick McManus, Lulea and the WAND group that Ian McDonald is part of.

The core DCCP stack was written by Arnaldo C. Melo using the Linux TCP implementation as a model, with DCCP being used as a way to identify code in the TCP implementation that could be made generic and shared with other INET transport level implementations. This resulted in the generalisation of code related to the minisockets representing both TCP_SYN_RECV/DCCP_RESPOND and TCP_TIME_WAIT/DCCP_TIME_WAIT status, code related to established/timewait/listen sockets (inet_lookup, inet_lookup_established, etc), the interface to get sock information (tcp_diag), and many other functions and data structures, with more expected to be generalised and eventually used by SCTP and any other INET transport protocols that may be introduced in the future.

The CCID 3 code was drawn from the WAND group, that in turn got it initially from the Lulea FreeBSD codebase and made it work in the core DCCP stack written by Patrick McManus. Ian McDonald got it relicensed from BSD license to GPL by getting written permission via email from the original Lulea authors of the code. It was modified by Arnaldo C. Melo to fit Linux standards wrt list handling and several other aspects. Since then Gerrit Renker and Ian McDonald have spent a lot of time improving CCID 3.

The CCID 4 code was drawn from the Embedded Laboratory, by Leandro Melo de Sales, Ivo Calado and Erivaldo Xavier, who started from the CCID 3 code and made it work in the core DCCP stack, adding the CCID 4 differences specified in RFC 5622. The CCID 4 implementation should be considered a work in progress; Gerrit Renker is now also contributing to it.

The CCID 5/249 code was drawn from the Embedded Laboratory, by Ivo Calado and Leandro Melo de Sales. This CCID implements Cubic congestion control in an attempt to match the performance of TCP Cubic on long fat networks. The code is based on a mix of the TCP Cubic and CCID 2 implementations. The CCID 5/249 implementation should be considered a work in progress.


The CCID modular infrastructure was written to fit the existing CCID 3 interface, but will probably be changed in the near future in an effort to have a generic CA (Congestion Avoidance) infrastructure shared between TCP and the CCIDs (and others, who knows), continuing the work on the existing TCP CA infrastructure put in place by Stephen Hemminger.

To have a look at the theoretical performance of CCID 3 see xcalc spreadsheet - the codebase currently assumes s=256, unless you override with an option.
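As a rough illustration of the theoretical rate, the sketch below evaluates the TCP throughput equation from the TFRC specification (RFC 5348, section 3.1), on which CCID 3 is built. The function name and the sample inputs are illustrative only; b=1 and t_RTO=4*R are the simplifications the RFC recommends, not values taken from this page, and s=256 matches the default packet size mentioned above.

```python
import math

def tfrc_rate(s, rtt, p, b=1):
    """Allowed sending rate in bytes/second from the TFRC throughput equation.

    s   -- packet size in bytes (the codebase assumes 256 by default)
    rtt -- round-trip time R in seconds
    p   -- loss event rate, 0 < p <= 1
    b   -- packets acknowledged by one ACK (assumed 1 here)
    """
    if not 0 < p <= 1:
        raise ValueError("loss event rate p must be in (0, 1]")
    t_rto = 4 * rtt  # simplification recommended by the RFC
    denominator = (rtt * math.sqrt(2 * b * p / 3)
                   + t_rto * 3 * math.sqrt(3 * b * p / 8) * p * (1 + 32 * p ** 2))
    return s / denominator

# Example: 100 ms round-trip time at three different loss event rates
for loss in (0.001, 0.01, 0.1):
    print("p=%g: %.0f bytes/s" % (loss, tfrc_rate(256, 0.1, loss)))
```

The rate scales linearly with the packet size s and falls as the loss event rate grows, which is why the assumed value of s matters when comparing against measurements.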

There is a mailing list for DCCP work which is dccp at vger dot kernel dot org. Discussion also occurs on the main Linux networking mailing list - netdev at vger dot kernel dot org.

Experimental Work

 

Andrea Bittau is experimenting with wiring the TCP pluggable congestion control to DCCP; graphs can be seen here.

Gerrit Renker is fixing up the CCID3 implementation and his notes can be found here.

At present Gerrit's tree has a substantial number of improvements and is where most work should be done. Details of using this can be found at DCCP_Testing#Experimental_DCCP_source_tree

Choosing and initialising your CCID

 

To use CCID2 do something like:

sudo modprobe dccp
sudo modprobe dccp_ccid2
sudo sysctl -w net.dccp.default.seq_window=10000
sudo sysctl -w net.dccp.default.rx_ccid=2
sudo sysctl -w net.dccp.default.tx_ccid=2

To use CCID3 do something like:

sudo modprobe dccp
sudo modprobe dccp_ccid3
sudo sysctl -w net.dccp.default.seq_window=10000
sudo sysctl -w net.dccp.default.rx_ccid=3
sudo sysctl -w net.dccp.default.tx_ccid=3

The seq_window setting raises the sequence validation window from its default of 100, which can cause problems. send_ackvec must be set appropriately for the CCID in use. Ideally the defaults would be correct without this manual tuning.

Applications

tcpdump support

 

 

tcpdump now has DCCP support in its tree. For older versions, a tcpdump patch is available here. It applies to many versions, including at least the weekly CVS build of tcpdump dated 22nd August 2005, and is being tidied up for submission to the tcpdump maintainers. Remember to run tcpdump(8) with the -s0 parameter (or some other suitable value) to capture all data, because in many cases the default size captures the base DCCP header but not the options. To capture only DCCP traffic, use the primitive ip[9] == 33 in the filter expression.
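The ip[9] == 33 primitive works because byte 9 of the IPv4 header is the protocol field, and 33 is the protocol number assigned to DCCP. A minimal Python sketch of the same check on a raw IPv4 header follows; the helper names and the hand-built sample header are illustrative only.

```python
import struct

IPPROTO_DCCP = 33  # IANA protocol number for DCCP

def is_dccp(ipv4_header):
    """Return True if the IPv4 protocol field (byte offset 9) says DCCP."""
    if len(ipv4_header) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes")
    return ipv4_header[9] == IPPROTO_DCCP

def sample_header(protocol):
    """Build a minimal 20-byte IPv4 header for demonstration purposes."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                   # version 4, header length of 5 words
        0,                      # DSCP/ECN
        20,                     # total length
        0, 0,                   # identification, flags/fragment offset
        64,                     # TTL (byte 8)
        protocol,               # protocol field (byte 9)
        0,                      # header checksum, not computed in this sketch
        bytes([127, 0, 0, 1]),  # source address
        bytes([127, 0, 0, 1]),  # destination address
    )

print(is_dccp(sample_header(IPPROTO_DCCP)))
```

Note that the ip[...] form indexes the IPv4 header, so this filter matches IPv4 traffic only.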

ethereal/wireshark support

See here

The latest version of Wireshark seems to work just fine with DCCP.

iperf support

There is a patch for DCCP available on the iperf page.

GStreamer support

GStreamer is a library for constructing graphs of media-handling components. The applications it supports range from simple Ogg/Vorbis playback, audio/video streaming to complex audio (mixing) and video (non-linear editing) processing. Applications can take advantage of advances in codec and filter technology transparently. Developers can add new codecs and filters by writing a simple plugin with a clean, generic interface. For more about gstreamer go to GStreamer.net

D-ITG

D-ITG supports DCCP. It is available on the D-ITG page.

netcat support

 

A patch for netcat is available here or here. It applies to netcat 0.7.1 and was used on 8th September for what we believe to be the first public transmission of DCCP over the Internet.

The above patch doesn't handle service codes, so Guillaume Teissier has modified it to do so. At some point the new patch will be merged with the original, or the original will be deleted; until then it can be found at http://wand.net.nz/~iam4/dccp/netcat-0.7.1.complete.patch

Python support

 

Python's low level socket library is transparent enough to support DCCP without even knowing it! Here is an adaptation of the http://wand.net.nz/~iam4/dccp/dccp-cs-0.01.tar.bz2 example code:

#!/usr/bin/python3

import socket

# Constants from the Linux headers, set here in case the socket module lacks them
socket.DCCP_SOCKOPT_PACKET_SIZE = 1
socket.DCCP_SOCKOPT_SERVICE     = 2
socket.SOCK_DCCP                = 6
socket.IPPROTO_DCCP             = 33
socket.SOL_DCCP                 = 269
packet_size                     = 256
address                         = (socket.gethostname(), 12345)

# Create sockets
server, client = [socket.socket(socket.AF_INET, socket.SOCK_DCCP,
                                socket.IPPROTO_DCCP) for i in range(2)]
for s in (server, client):
    s.setsockopt(socket.SOL_DCCP, socket.DCCP_SOCKOPT_PACKET_SIZE, packet_size)
    s.setsockopt(socket.SOL_DCCP, socket.DCCP_SOCKOPT_SERVICE, True)

# Connect sockets
server.bind(address)
server.listen(1)
client.connect(address)
s, a = server.accept()

# Echo (sockets carry bytes, so encode on send and decode on receive)
while True:
    client.send(input("IN: ").encode())
    print("OUT:", s.recv(1024).decode())

Ruby support

 

Ruby's low level socket library is also able to support DCCP. Here is the Ruby code corresponding to the previous Python script:

#!/usr/bin/ruby

require 'socket'
class Socket
    DCCP_SOCKOPT_PACKET_SIZE = 1
    DCCP_SOCKOPT_SERVICE     = 2
    SOCK_DCCP                = 6
    IPPROTO_DCCP             = 33
    SOL_DCCP                 = 269
end
packet_size = 256
address = [12345,'localhost']

# Create sockets
server = Socket.new(Socket::AF_INET, Socket::SOCK_DCCP, Socket::IPPROTO_DCCP)
client = Socket.new(Socket::AF_INET, Socket::SOCK_DCCP, Socket::IPPROTO_DCCP)

[server,client].each do |s|
    s.setsockopt(Socket::SOL_DCCP, Socket::DCCP_SOCKOPT_PACKET_SIZE, packet_size)
    s.setsockopt(Socket::SOL_DCCP, Socket::DCCP_SOCKOPT_SERVICE, true)
end

# Connect sockets
sockaddr = Socket.pack_sockaddr_in(*address)
server.bind(sockaddr)
server.listen(1)
client.connect(sockaddr)
s,a = server.accept

# Send/receive stuff
client.send('tartine',0)
p s.recv(10)

PHP

 

PHP's low level socket library is also able to support DCCP. Here is the PHP code corresponding to the previous Python and Ruby scripts:

Server:

#!/usr/bin/php
<?php
$SOCK_DCCP    = 6;
$IPPROTO_DCCP = 33;

$bind_address = "192.168.1.1";
$port = 9011;

// Create the listening socket and accept one client
$server_socket_fd = socket_create(AF_INET, $SOCK_DCCP, $IPPROTO_DCCP);
socket_bind($server_socket_fd, $bind_address, $port);
socket_listen($server_socket_fd);
$client_socket_fd = socket_accept($server_socket_fd);
// The client's "Hello world" message is 11 bytes long
$recv = socket_read($client_socket_fd, 11);
echo $recv;
socket_close($client_socket_fd);
socket_close($server_socket_fd);

Client:

#!/usr/bin/php
<?php
$SOCK_DCCP    = 6;
$IPPROTO_DCCP = 33;

$bind_address = "192.168.1.1";
$port = 9011;

$client_socket_fd = socket_create(AF_INET, $SOCK_DCCP, $IPPROTO_DCCP);

$message = "Hello world";

if (socket_connect($client_socket_fd, $bind_address, $port)) {
        socket_send($client_socket_fd, $message, strlen($message), 0);
}

socket_close($client_socket_fd);

C Application Programming Interface to DCCP

 

DCCP's implementation on Linux is based on the TCP implementation. The API is provided through the Linux socket library, so much of the code looks similar to the initialization/send/receive code of a regular TCP socket. The server initializes a socket, binds it to a port, and waits for clients to connect. The client initializes a socket and connects to the server. Note that although the connection setup of DCCP is like TCP, transmission of data is unreliable like UDP.

Here's how to get started: There are some general #includes and #defines needed by both client and server. These are:

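//----------------------------//
//General defines and includes needed
#include <sys/socket.h>  //socket(), bind(), connect(), send(), recv()
#include <arpa/inet.h>
#include <errno.h>
//newer system headers may already define some of these, hence the guards
#ifndef SOCK_DCCP
#define SOCK_DCCP 6
#endif
#ifndef IPPROTO_DCCP
#define IPPROTO_DCCP 33  //must be this number: it is assigned to DCCP by IANA
#endif
#ifndef SOL_DCCP
#define SOL_DCCP 269
#endif
#define MAX_DCCP_CONNECTION_BACK_LOG 5
//----------------------------//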

A DCCP server is much like a TCP server. The code for initializing, binding and accepting sockets is shown below. Note that the code for getting the local IP is not shown. Read Brief Socket Tutorial for linux socket basics.

//----------------------------//
//DCCP Server initialization
//get a socket handle just like TCP Server
int mSocketHandle = socket(PF_INET, SOCK_DCCP, IPPROTO_DCCP);
//turn off bind address checking, and allow port numbers to be reused - otherwise
//  the TIME_WAIT phenomenon will prevent binding to these address/port combinations for (2 * MSL) seconds.
//SO_REUSEADDR is a socket-level option, so it is set at level SOL_SOCKET (not SOL_DCCP)
int on = 1;
int result = setsockopt(mSocketHandle, SOL_SOCKET, SO_REUSEADDR, (const char *) &on, sizeof(on));
//bind the socket to the local address and port. mLocalName is sockaddr of local IP and port
result = bind(mSocketHandle, (struct sockaddr *)&mLocalName, sizeof(mLocalName));
//listen on that port for incoming connections
result = listen(mSocketHandle, MAX_DCCP_CONNECTION_BACK_LOG);
//wait to accept a client connecting. mRemoteLength (a socklen_t) must be set to sizeof(mRemoteName) before the call;
//  when a client joins, mRemoteName and mRemoteLength are filled out by accept()
mClientSocketHandle = accept(mSocketHandle, (struct sockaddr *)&mRemoteName, &mRemoteLength);
//----------------------------//

A DCCP client is also much like a TCP client in that it connects to a remote server.

//----------------------------//
//DCCP Client initialization
//get a socket handle just like a TCP client
int mSocketHandle = socket(PF_INET, SOCK_DCCP, IPPROTO_DCCP);
//turn off bind address checking, and allow port numbers to be reused - otherwise
//  the TIME_WAIT phenomenon will prevent binding to these address/port combinations for (2 * MSL) seconds.
//SO_REUSEADDR is a socket-level option, so it is set at level SOL_SOCKET (not SOL_DCCP)
int on = 1;
int result = setsockopt(mSocketHandle, SOL_SOCKET, SO_REUSEADDR, (const char *) &on, sizeof(on));
//connect to remote server: mRemoteName is sockaddr containing remote server IP address and port
result = connect(mSocketHandle, (struct sockaddr *)&mRemoteName, sizeof(mRemoteName));
//----------------------------//

Once the connection has been set up (client successfully connected to server and server got a valid socket for the client), they can exchange data. Take a look at the send routine and notice the checking for errno. This is because DCCP's congestion control can refuse to send a packet.

//----------------------------//
//DCCP client sends message to server
char sendbuffer[100];
int status;
do
{
   status = send(mSocketHandle, sendbuffer, 100, 0);
}
while((status<0)&&(errno == EAGAIN));
//Note that the while loop retries for as long as errno is EAGAIN.
//DCCP's congestion control might hold the data back
//(for example when the transmit queue is full),
//in which case send() fails and errno is set to EAGAIN
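The same retry idea can be sketched in Python, where a send() that fails with EAGAIN surfaces as a BlockingIOError exception. The helper name, the retry limit and the delay below are illustrative choices, not part of the DCCP API; sleeping between attempts avoids spinning on the CPU while congestion control holds the data back.

```python
import time

def send_with_retry(sock, data, max_tries=50, delay=0.01):
    """Send one datagram, retrying while the stack reports EAGAIN.

    Returns the number of bytes sent, or raises BlockingIOError if every
    attempt was refused.
    """
    for _ in range(max_tries):
        try:
            return sock.send(data)
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: wait briefly, then try again
            time.sleep(delay)
    raise BlockingIOError("send still refused after %d tries" % max_tries)
```

Because DCCP is unreliable anyway, an application that sends time-sensitive media may prefer to drop the datagram after a few tries rather than wait.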

The receive routine is simple.

//----------------------------//
//DCCP server receives the client's message
char receive_buff[100];
int rec_size = recv(mClientSocketHandle, receive_buff, 100, 0);

Some Important Details

 

1. How do you know which congestion control algorithm is being used?

By default, it should be CCID2 which is a TCP-like congestion control but you can make sure by typing this in a console:

sysctl -a | grep dccp

This should give you something like:

net.dccp.default.tx_qlen = 5
net.dccp.default.retries2 = 15
net.dccp.default.retries1 = 3
net.dccp.default.request_retries = 5
net.dccp.default.send_ndp = 1
net.dccp.default.send_ackvec = 0
net.dccp.default.ack_ratio = 2
net.dccp.default.tx_ccid = 2
net.dccp.default.rx_ccid = 2
net.dccp.default.seq_window = 100

which means that both the outgoing and incoming congestion control mechanisms are CCID2
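If you want to make this check from a script, the sysctl output can be parsed as sketched below. This is only an illustration: the function name is made up, and the sample text is the output shown above rather than a live reading.

```python
def parse_dccp_sysctls(text):
    """Parse `sysctl -a | grep dccp` output into a {name: int} dictionary."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        name = name.strip()
        if not sep or not name.startswith("net.dccp."):
            continue  # not a "name = value" DCCP line
        try:
            settings[name] = int(value.strip())
        except ValueError:
            continue  # skip values that are not plain integers
    return settings

SAMPLE = """\
net.dccp.default.tx_qlen = 5
net.dccp.default.retries2 = 15
net.dccp.default.retries1 = 3
net.dccp.default.request_retries = 5
net.dccp.default.send_ndp = 1
net.dccp.default.send_ackvec = 0
net.dccp.default.ack_ratio = 2
net.dccp.default.tx_ccid = 2
net.dccp.default.rx_ccid = 2
net.dccp.default.seq_window = 100
"""

settings = parse_dccp_sysctls(SAMPLE)
print("tx_ccid:", settings["net.dccp.default.tx_ccid"])
print("rx_ccid:", settings["net.dccp.default.rx_ccid"])
```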

2. What if I want to change the congestion control algorithm to CCID3?

See Choosing and initialising your CCID above. Newer revisions of DCCP allow the CCID to be set through the setsockopt() socket function.
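A minimal sketch of the per-socket approach follows. It assumes the option number DCCP_SOCKOPT_CCID = 13 from the kernel's linux/dccp.h and assumes that the option value is an array of one-byte CCID numbers; neither detail is stated on this page, so check them against your kernel headers and documentation. The helper name is illustrative.

```python
SOL_DCCP = 269
DCCP_SOCKOPT_CCID = 13  # assumption: taken from linux/dccp.h, sets TX and RX CCID together

def set_ccid(sock, ccid):
    """Request the given CCID for both directions of a DCCP socket.

    Intended to be called before connect() or listen().
    """
    if not 0 <= ccid <= 255:
        raise ValueError("a CCID number must fit in one byte")
    # Assumed value format: one byte per CCID, in order of preference
    sock.setsockopt(SOL_DCCP, DCCP_SOCKOPT_CCID, bytes([ccid]))
```

With a DCCP socket s created as in the Python example earlier on this page, set_ccid(s, 3) would request CCID 3 for that socket only, leaving the system-wide sysctl defaults untouched.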

3. The current implementation of DCCP on Linux does NOT support packet fragmentation

This means that if the packet you are sending over DCCP is larger than your Maximum Transmission Unit (MTU), it will not be sent! For example, if you call the send() routine with a size of 30000, chances are the packet will not be sent. So you must take care to split your data into packets that are smaller than the MTU and still leave room for the IP header (20 bytes) and the DCCP header (32 bytes). A packet size of about 1400 bytes is recommended.
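One way to stay under the limit is to split the data before sending, as in the sketch below. The function name is illustrative and the 1400-byte default is simply the size recommended above; with a 1500-byte MTU, the header sizes quoted above leave 1500 - 20 - 32 = 1448 bytes, so 1400 keeps some margin.

```python
def packetize(payload, max_payload=1400):
    """Split payload into pieces of at most max_payload bytes each."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [payload[i:i + max_payload]
            for i in range(0, len(payload), max_payload)]

# A 30000-byte buffer becomes a series of datagram-sized pieces
pieces = packetize(b"x" * 30000)
print(len(pieces), "pieces, last one", len(pieces[-1]), "bytes")
```

Each piece is then passed to send() separately. Since DCCP delivery is unreliable, the receiver cannot assume that every piece arrives, so any reassembly scheme has to be provided by the application.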

Other applications

 

More info on apps can be found at Gerrit's page or Ian's page.

TIMEWAIT sockets

 

TIMEWAIT sockets are finally implemented and we have initial support for iproute2, so just enable INET_DIAG and, if it is built as a module, make sure it is loaded prior to using the iproute2 utilities, like ss.

The latest iproute2 version now includes DCCP support directly.

Then use it:

[root@qemu ~]# ./ss -dane
State       Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN      0      0                  *:5001             *:*      ino:730 sk:cfd503a0
ESTAB       0      0          127.0.0.1:5001     127.0.0.1:32770  ino:731 sk:cfd51480
ESTAB       0      0          127.0.0.1:32770    127.0.0.1:5001   ino:741 sk:cfd517e0
[root@qemu ~]#
[root@qemu ~]# ./ss -dane
State       Recv-Q Send-Q Local Address:Port Peer Address:Port
TIME-WAIT   0      0          127.0.0.1:32770   127.0.0.1:5001    timer:(timewait,59sec,0)
                                                                  ino:0 sk:cf12a620

The above listing was with the ttcp test.

Mailing List Archives

 

DCCP@IETF - A discussion of the DCCP protocol by the IETF

dccp@vger.kernel.org - a discussion of the Linux implementation of DCCP

dccp@vger.kernel.org - Another archive for the DCCP@vger mailing list

TODO & testing

There is also a TODO list, which tracks the issues that need work.

There is also a DCCP Testing page, which describes how to test DCCP.

FAQ

 

Q: Why do I get an errno 13 (EACCES) or permission denied?
A: You are running SELinux that does not have DCCP support. Disable SELinux or upgrade to a newer version of the kernel.

Q: Why do I get an errno 90 (EMSGSIZE) or Message too long?
A: The packet size used is bigger than the PMTU.

The initial idea was to set the packet size option, as in the example below, but this option has since been deprecated.

 int dccp_set_packet_size(int sock, int new_size)
 {
         /* This option is deprecated, DON'T USE THIS! */
         return setsockopt(sock, SOL_DCCP, DCCP_SOCKOPT_PACKET_SIZE,
                           &new_size, sizeof(new_size));
 }

Instead, you can get the current maximum packet size (MPS), which is derived from the PMTU, as shown below, and have your application's sending mechanism fill its packets according to that size:

 int dccp_get_mtu_size(int sock)
 {
         int mtu_size;
         socklen_t len = sizeof(mtu_size);
         int ret = getsockopt(sock, SOL_DCCP, DCCP_SOCKOPT_GET_CUR_MPS,
                              &mtu_size, &len);
         /* on failure, return the (negative) getsockopt() result */
         return ret < 0 ? ret : mtu_size;
 }
 ...

Then use the dccp_get_mtu_size function to retrieve the mtu size and implement your sending mechanism considering this value.

IRC

 

server: irc.freenode.net
channel: #dccp-linux