The Intermediate Functional Block (IFB) device is the successor to the IMQ iptables module, which was never integrated. Its advantages over the current IMQ: it is cleaner, in particular on SMP, and it needs a _lot_ less code. The old dummy device functionality is preserved, while the new behavior only kicks in if you use actions.
As far as I know, the reasons listed below are why people use IMQ. It would be nice to hear of anything else that I missed.
Drops excess packets …, throttling TCP window sizes and reducing the overall output rate of TCP-based flows
But I won't go back to putting netfilter hooks in the device to satisfy this. I also don't think it's worth hacking ifb some more to make it aware of, say, L3 info and to play ip rule tricks to achieve this.
Instead, the plan is to have a conntrack-related action. This action will selectively either query or create conntrack state on incoming packets. Packets could then be redirected to ifb based on what happens; for example, if we find that incoming packets are of known state, we could send them to a different queue than the one used for packets without existing state. All of this, however, depends on whatever rules the admin enters.
At the moment this function does not exist. Instead of sitting on the patch, I have decided to release it; if there's pressure, I will add this feature.
What you can do with ifb currently with actions
Let's say you are policing packets from the alias 192.168.200.200/32 and you don't want those to exceed 100kbps going out.
tc filter add dev eth0 parent 1: protocol ip prio 10 u32 \
    match ip src 192.168.200.200/32 flowid 1:2 \
    action police rate 100kbit burst 90k drop
If you run tcpdump on eth0 you will see all packets going out with src 192.168.200.200/32, whether they were dropped or not. Extend the rule a little to see only the ones that made it out:
tc filter add dev eth0 parent 1: protocol ip prio 10 u32 \
    match ip src 192.168.200.200/32 flowid 1:2 \
    action police rate 10kbit burst 90k drop \
    action mirred egress mirror dev ifb0
Now fire up tcpdump on ifb0 to see only those packets:
tcpdump -n -i ifb0 -x -e -t
Essentially a good debugging/logging interface.
If you replace mirror with redirect, those packets will be blackholed and will never make it out. This redirect behavior changes with the new patch; the mirror behavior does not.
Below is what you can do with the patch to provide the functionality that most people use IMQ for:
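The two mirred modes differ only in one keyword of the rule, which the following dry-run sketch makes explicit. It only prints the command and does not touch the system; the function name `mirred_filter` is an illustrative assumption, and the rule itself is the police filter from the example above (with the 100kbit rate stated in the text).

```shell
#!/bin/sh
# Dry-run sketch: print (do not run) the police filter from the example
# with a chosen mirred mode. "mirred_filter" is an illustrative name.
mirred_filter() {
    mode=$1    # "mirror" copies the packet to ifb0; "redirect" steals it
    case $mode in
        mirror|redirect) ;;
        *) echo "usage: mirred_filter mirror|redirect" >&2; return 1 ;;
    esac
    echo "tc filter add dev eth0 parent 1: protocol ip prio 10 u32" \
         "match ip src 192.168.200.200/32 flowid 1:2" \
         "action police rate 100kbit burst 90k drop" \
         "action mirred egress $mode dev ifb0"
}

mirred_filter mirror
mirred_filter redirect
```

To apply one of the printed rules for real, run it as root on a box where eth0 already has a root qdisc with handle 1:.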
export TC="/sbin/tc"

$TC qdisc add dev ifb0 root handle 1: prio
$TC qdisc add dev ifb0 parent 1:1 handle 10: sfq
$TC qdisc add dev ifb0 parent 1:2 handle 20: tbf rate 20kbit buffer 1600 limit 3000
$TC qdisc add dev ifb0 parent 1:3 handle 30: sfq
$TC filter add dev ifb0 protocol ip pref 1 parent 1: handle 1 fw classid 1:1
$TC filter add dev ifb0 protocol ip pref 2 parent 1: handle 2 fw classid 1:2

ifconfig ifb0 up

$TC qdisc add dev eth0 ingress

# redirect all IP packets arriving in eth0 to ifb0
# use mark 1 --> puts them onto class 1:1
$TC filter add dev eth0 parent ffff: protocol ip prio 10 u32 \
    match u32 0 0 flowid 1:1 \
    action ipt -j MARK --set-mark 1 \
    action mirred egress redirect dev ifb0
From another machine, ping so that you have packets going into the box:
[root@jzny action-tests]# ping 10.22
PING 10.22 (10.0.0.22): 56 data bytes
64 bytes from 10.0.0.22: icmp_seq=0 ttl=64 time=2.8 ms
64 bytes from 10.0.0.22: icmp_seq=1 ttl=64 time=0.6 ms
64 bytes from 10.0.0.22: icmp_seq=2 ttl=64 time=0.6 ms

--- 10.22 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.6/1.3/2.8 ms
[root@jzny action-tests]#
Now look at some stats:
[root@jmandrake]:~# $TC -s filter show parent ffff: dev eth0
filter protocol ip pref 10 u32
filter protocol ip pref 10 u32 fh 800: ht divisor 1
filter protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1
  match 00000000/00000000 at 0
        action order 1: tablename: mangle hook: NF_IP_PRE_ROUTING
        target MARK set 0x1
        index 1 ref 1 bind 1 installed 4195sec used 27sec
        Sent 252 bytes 3 pkts (dropped 0, overlimits 0)

        action order 2: mirred (Egress Redirect to device ifb0) stolen
        index 1 ref 1 bind 1 installed 165 sec used 27 sec
        Sent 252 bytes 3 pkts (dropped 0, overlimits 0)

[root@jmandrake]:~# $TC -s qdisc
qdisc sfq 30: dev ifb0 limit 128p quantum 1514b
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)
qdisc tbf 20: dev ifb0 rate 20Kbit burst 1575b lat 2147.5s
 Sent 210 bytes 3 pkts (dropped 0, overlimits 0)
qdisc sfq 10: dev ifb0 limit 128p quantum 1514b
 Sent 294 bytes 3 pkts (dropped 0, overlimits 0)
qdisc prio 1: dev ifb0 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 504 bytes 6 pkts (dropped 0, overlimits 0)
qdisc ingress ffff: dev eth0 ----------------
 Sent 308 bytes 5 pkts (dropped 0, overlimits 0)

[root@jmandrake]:~# ifconfig ifb0
ifb0    Link encap:Ethernet HWaddr 00:00:00:00:00:00
        inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
        UP BROADCAST RUNNING NOARP MTU:1500 Metric:1
        RX packets:6 errors:0 dropped:3 overruns:0 frame:0
        TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:32
        RX bytes:504 (504.0 b) TX bytes:252 (252.0 b)
Dummy continues to behave like it always did. If you send it any packet not originating from the actions, it will drop it. In this case the three dropped packets were IPv6 ndisc.
Many readers have found this page unhelpful in explaining how IFB is useful and how it should be used.
These examples are taken from a posting by Jamal at http://www.mail-archive.com/netdev@vger.kernel.org/msg04900.html
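The RX/TX and dropped counters shown by ifconfig above can also be read directly from sysfs, which is easier to script. The sketch below is one way to do that; the function name `if_counters` and its optional second argument (an alternative statistics directory, handy for testing) are assumptions made for illustration, while the `/sys/class/net/<dev>/statistics` files are the standard kernel interface.

```shell
#!/bin/sh
# Sketch: read interface counters from sysfs instead of parsing ifconfig.
# The optional second argument (a statistics directory) is an illustrative
# hook for testing; by default the standard /sys/class/net path is used.
if_counters() {
    dev=$1
    dir=${2:-/sys/class/net/$dev/statistics}
    for name in rx_packets rx_dropped tx_packets tx_dropped; do
        printf '%s %s=%s\n' "$dev" "$name" "$(cat "$dir/$name")"
    done
}
```

On a box with the setup above, `if_counters ifb0` would be expected to report the same packet and drop counts that ifconfig shows.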
What this script will demonstrate is the following sequence:
export TC="/root/tc"

$TC qdisc del dev ifb0 root handle 1: prio
$TC qdisc add dev ifb0 root handle 1: prio
$TC qdisc add dev ifb0 parent 1:1 handle 10: sfq
$TC qdisc add dev ifb0 parent 1:2 handle 20: tbf \
    rate 20kbit buffer 1600 limit 3000
$TC qdisc add dev ifb0 parent 1:3 handle 30: sfq
$TC filter add dev ifb0 parent 1: protocol ip prio 1 u32 \
    match ip dst 11.0.0.0/24 flowid 1:1
$TC filter add dev ifb0 parent 1: protocol ip prio 2 u32 \
    match ip dst 10.0.0.0/24 flowid 1:2

ifconfig ifb0 up

$TC qdisc del dev eth0 root handle 1: htb default 2
$TC qdisc add dev eth0 root handle 1: htb default 2
$TC class add dev eth0 parent 1: classid 1:1 htb rate 800Kbit
$TC class add dev eth0 parent 1: classid 1:2 htb rate 800Kbit
$TC class add dev eth0 parent 1:1 classid 1:10 htb rate 256kbit ceil 384kbit
$TC class add dev eth0 parent 1:1 classid 1:20 htb rate 512kbit ceil 648kbit
$TC filter add dev eth0 parent 1: protocol ip prio 1 u32 \
    match ip dst 10.0.0.229/32 flowid 1:10 \
    action mirred egress redirect dev ifb0
A little test (be careful if you are logged in over ssh and are classifying on that IP; the counters may not be easy to follow)
A ping
mambo:~# ping -c2 10.0.0.229
First, look at ifb0 and observe that the second filter matched twice:
mambo:~# $TC -s filter show dev ifb0 parent 1:
filter protocol ip pref 1 u32
filter protocol ip pref 1 u32 fh 800: ht divisor 1
filter protocol ip pref 1 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1 (rule hit 2 success 0)
  match 0b000000/ffffff00 at 16 (success 0 )
filter protocol ip pref 2 u32
filter protocol ip pref 2 u32 fh 801: ht divisor 1
filter protocol ip pref 2 u32 fh 801::800 order 2048 key ht 801 bkt 0 flowid 1:2 (rule hit 2 success 2)
  match 0a000000/ffffff00 at 16 (success 2 )
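The match lines in this output are the hexadecimal form of the addresses given in the script: `match ip dst 10.0.0.0/24` is shown as `0a000000/ffffff00 at 16`, where 16 is the byte offset of the destination address in the IPv4 header (the source address sits at offset 12). The following sketch reproduces that value/mask pair so you can tell which rule a given line belongs to. The helper name `ip_to_u32` is an assumption for illustration and is not part of tc.

```shell
#!/bin/sh
# Hypothetical helper (not part of tc): print the value/mask pair that
# "tc -s filter show" reports for an IPv4 "match ip src|dst ADDR/LEN" key.
ip_to_u32() {
    addr=${1%/*}
    len=${1#*/}
    [ "$len" = "$1" ] && len=32    # no "/LEN" given: assume a host match

    # Split the dotted quad into four octets ($1..$4).
    old_ifs=$IFS
    IFS=.
    set -- $addr
    IFS=$old_ifs

    value=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    if [ "$len" -eq 0 ]; then
        mask=0
    else
        mask=$(( (0xffffffff << (32 - len)) & 0xffffffff ))
    fi
    printf '%08x/%08x\n' $(( value & mask )) "$mask"
}

ip_to_u32 10.0.0.0/24      # prints 0a000000/ffffff00
ip_to_u32 10.0.0.229/32    # prints 0a0000e5/ffffffff
```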
Next the qdisc numbers, observe that 1:2 has 2 packets
mambo:~# $TC -s qdisc show dev ifb0
qdisc prio 1: bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 196 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 10: parent 1:1 limit 128p quantum 1514b
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc tbf 20: parent 1:2 rate 20000bit burst 1599b lat 546.9ms
 Sent 196 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p----- requeues 0
qdisc sfq 30: parent 1:3 limit 128p quantum 1514b
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
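When the counters get long, it can help to condense the output above to one line per qdisc. The sketch below does this with awk; the function name `qdisc_summary` is an illustrative assumption, and it relies only on the `qdisc <kind> <handle>` and `Sent <bytes> bytes <pkts> pkt` line shapes visible in the output above.

```shell
#!/bin/sh
# Hypothetical helper: condense "tc -s qdisc show" output to one line per
# qdisc, printing kind, handle, bytes sent and packets sent.
qdisc_summary() {
    awk '
        $1 == "qdisc" { kind = $2; handle = $3 }
        $1 == "Sent"  { print kind, handle, $2, $4 }
    '
}

# Intended use on a live system (needs the ifb0 setup from the script):
#   tc -s qdisc show dev ifb0 | qdisc_summary
```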
Next look at eth0 and observe class 1:10, which is where the pings went through after they came back from the ifb0 device.
mambo:~# $TC -s class show dev eth0
class htb 1:1 root rate 800000bit ceil 800000bit burst 1699b cburst 1699b
 Sent 196 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 16425 ctokens: 16425

class htb 1:10 parent 1:1 prio 0 rate 256000bit ceil 384000bit burst 1631b cburst 1647b
 Sent 196 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 lended: 2 borrowed: 0 giants: 0
 tokens: 49152 ctokens: 33110

class htb 1:2 root prio 0 rate 800000bit ceil 800000bit burst 1699b cburst 1699b
 Sent 47714 bytes 321 pkt (dropped 0, overlimits 0 requeues 0)
 rate 3920bit 3pps backlog 0b 0p requeues 0
 lended: 321 borrowed: 0 giants: 0
 tokens: 16262 ctokens: 16262

class htb 1:20 parent 1:1 prio 0 rate 512000bit ceil 648000bit burst 1663b cburst 1680b
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 26624 ctokens: 21251
And now…
mambo:~# $TC -s filter show dev eth0 parent 1:
filter protocol ip pref 1 u32
filter protocol ip pref 1 u32 fh 800: ht divisor 1
filter protocol ip pref 1 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 235 success 4)
  match 0a0000e5/ffffffff at 16 (success 4 )
        action order 1: mirred (Egress Redirect to device ifb0) stolen
        index 2 ref 1 bind 1 installed 114 sec used 100 sec
        Action statistics:
        Sent 196 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
        rate 0bit 0pps backlog 0b 0p requeues 0
In order to use ifb you need:
export netif="ifb1" prio="2"
ip link set dev $netif up
# ifb7 (mirred flowid 1:7) default class in ifb1 ($netif)
ip link set dev ifb7 up
interface eth0
$TC qdisc del dev eth0 ingress 2>/dev/null
$TC qdisc add dev eth0 ingress
$TC filter add dev eth0 parent ffff: protocol ip prio 10 u32 \
    match u32 0 0 flowid 1:1 \
    action mirred egress redirect dev $netif
interface ifb1
$TC qdisc del root dev $netif 2>/dev/null
####
$TC qdisc add dev $netif root handle 1:0 hfsc default 7
#### main class
$TC class add dev $netif parent 1:0 classid 1:1 hfsc rt m2 10240kbit
## Class
# Admin
$TC class add dev $netif parent 1:1 classid 1:2 hfsc rt m2 2048kbit
$TC qdisc add dev $netif parent 1:2 handle 2 sfq perturb 10
# all user
$TC class add dev $netif parent 1:1 classid 1:4 hfsc rt m2 9144kbit
# user
$TC class add dev $netif parent 1:4 classid 1:5 hfsc rt m2 9144kbit
# default --> bin
$TC class add dev $netif parent 1:4 classid 1:7 hfsc rt m2 256kbit
$TC qdisc add dev $netif parent 1:7 handle 7 sfq perturb 10
filters
# Admin ip
$TC filter add dev $netif protocol ip parent 1:0 prio $prio u32 ht 800:: \
    match ip src 172.1.0.0/16 flowid 1:2
# users ip
$TC filter add dev $netif protocol ip parent 1:0 prio $prio u32 ht 800:: \
    match ip src 10.1.1.0/24 flowid 1:5
$TC filter add dev $netif protocol ip parent 1:0 prio $prio u32 ht 800:: \
    match ip src 10.2.1.0/24 flowid 1:5
# default
$TC filter add dev $netif protocol ip parent 1:0 prio $prio u32 ht 800:: \
    match ip src 0.0.0.0/0 at 12 flowid 1:7 \
    action mirred egress mirror dev ifb7
# ok,
# show traffic in ifb1 and ifb7 (default class)
tcpdump -i ifb1 -n
tcpdump -i ifb7 -n

# show
tc -s filter show parent ffff: dev eth0
tc -s filter show dev ifb1 | grep "flowid 1:7"
tc -s filter show dev ifb1 | grep mirred -A3 -B3

# del qdisc
tc qdisc del dev eth0 handle ffff: ingress
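The last command only removes the ingress qdisc from eth0. A fuller teardown of the eth0 -> ifb1 -> ifb7 example might look like the sketch below. It prints the commands by default and only executes them when `APPLY=1` is set (as root); `APPLY`, `run` and `teardown` are illustrative names, not part of tc or ip, and the choice to also bring both ifb devices down is an assumption about what "clean" means for your setup.

```shell
#!/bin/sh
# Teardown sketch for the eth0 -> ifb1 -> ifb7 example. Prints the commands
# by default; set APPLY=1 (as root) to execute them. APPLY and run() are
# illustrative names, not part of tc or ip.
run() {
    if [ "${APPLY:-0}" = 1 ]; then
        "$@" 2>/dev/null || true    # ignore "no such qdisc" on repeat runs
    else
        echo "$@"
    fi
}

teardown() {
    run tc qdisc del dev eth0 ingress    # drops the redirect filter too
    run tc qdisc del dev ifb1 root       # removes the hfsc tree and filters
    run ip link set dev ifb7 down
    run ip link set dev ifb1 down
}

teardown
```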