-------------------------------------------------------------------------------
Centre for Advanced Internet Architectures,
Swinburne University of Technology, Melbourne, Australia

18 April, 2016

Dummynet AQM v0.2 - CoDel, FQ-CoDel, PIE and FQ-PIE for FreeBSD's
ipfw/dummynet framework

Author: Rasool Al-Saadi
-------------------------------------------------------------------------------

This document provides an overview of the v0.2 Dummynet AQM patch, which
includes CoDel, FQ-CoDel, PIE and FQ-PIE for FreeBSD's ipfw/dummynet framework.

For more details, refer to CAIA technical report CAIA-TR-160418A at
http://caia.swin.edu.au/reports/160418A/CAIA-TR-160418A.pdf

Dummynet AQM patch v0.2 applies to, and has been tested against,
FreeBSD 11.0-CURRENT:

    Branch:        freebsd-head
    Architecture:  amd64
    Revision:      r297692*
    Date:          2016-04-08

Additionally, we provide another patch for FreeBSD 10.x-RELEASE. That patch
has been applied, built and loaded without problems on FreeBSD 10.{1, 2, 3}.
However, actual performance testing has only been done under
FreeBSD 10.1-RELEASE.

* Should also be applicable to revisions from r266941 up to r297692 (newer
  revisions not tested).

Table of Contents

1. OVERVIEW
    1.1 Licence
2. CODEL OVERVIEW
    2.1 CoDel Parameters
    2.2 Codel Synopsis
    2.3 Examples of using Codel
3. FQ-CODEL OVERVIEW
    3.1 FQ-CoDel Parameters
    3.2 FQ-CoDel Synopsis
    3.3 Examples of using FQ-CoDel
4. PIE OVERVIEW
    4.1 PIE Parameters
    4.2 pie Synopsis
    4.3 Examples of using pie
5. FQ-PIE OVERVIEW
    5.1 FQ-PIE Parameters
    5.2 fq_pie Synopsis
    5.3 Examples of using fq_pie
6. APPLYING THE PATCH/INSTALLATION
    6.1 Applying the patch
    6.2 Testing the patched ipfw/dummynet
    6.3 Install the patched ipfw/dummynet
7. REFERENCES
8. DEVELOPMENT TEAM
9. ACKNOWLEDGEMENTS

1. OVERVIEW
------------
Dummynet AQM patch v0.2 adds CoDel and PIE AQM (Active Queue Management) and
FQ-CoDel and FQ-PIE scheduler/AQM support to FreeBSD's ipfw/dummynet.
It also implements a loadable AQM module framework for dummynet, which makes
implementing new AQM mechanisms much easier.

1.1 Licence

Dummynet AQM patch v0.2 is released under a BSD licence, and some parts are
released under LGPL dual-licensing. Refer to the licence headers in each
source file for further details.

2. CODEL OVERVIEW
------------------
CoDel (Controlled-Delay Active Queue Management) [1] is an AQM algorithm
designed to address the bufferbloat problem by dropping packets depending on
packet sojourn time in the queue. One of CoDel's goals is to be parameterless,
working reasonably well on the Internet without changes to its parameters.

2.1 CoDel Parameters

As mentioned, CoDel is designed to work properly on the Internet without
changes to its default configuration. However, the defaults are not suitable
for all scenarios (for example, paths with high RTT) and can reduce the
throughput of TCP connections. Thus, our codel implementation provides options
to change codel parameters for each pipe/queue individually, as well as to
change the default values.

The codel configuration parameters and their default values are:

target:   the minimum acceptable persistent queue delay.
          Default: 5 ms. The default can be changed with the sysctl variable
          net.inet.ip.dummynet.codel.target

interval: CoDel does not drop a packet as soon as packet sojourn time becomes
          higher than target, but waits for an interval of time before
          dropping. interval should be set to the maximum RTT of all expected
          connections.
          Default: 100 ms. The default can be changed with the sysctl variable
          net.inet.ip.dummynet.codel.interval

[no]ecn:  enable/disable Explicit Congestion Notification (ECN). Codel marks
          packets with ECN instead of dropping them when ECN is on.
          Default: no ECN.

2.2 Codel Synopsis

Codel AQM is used with a dummynet 'pipe' or 'queue' and is configured through
the ipfw interface. For more details about ipfw/dummynet, refer to the ipfw(8)
FreeBSD man page [3].
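
The CoDel defaults described in section 2.1 can be inspected from the shell
before configuring a pipe or queue. The following is a minimal sketch, not part
of the patch: it uses only the two sysctl names listed in section 2.1, and the
helper name show_default is hypothetical. The values are printed in whatever
units the module reports, which this document does not state.

```shell
#!/bin/sh
# Print one dummynet default, or a hint when the sysctl is absent
# (for example because the patched dummynet module is not loaded).
# show_default is a hypothetical helper name, not part of the patch.
show_default() {
    if value=$(sysctl -n "$1" 2>/dev/null); then
        echo "$1 = $value"
    else
        echo "$1: not available"
    fi
}

# The two names below are the CoDel sysctl variables from section 2.1.
show_default net.inet.ip.dummynet.codel.target
show_default net.inet.ip.dummynet.codel.interval
```

Reading the values first is a cheap way to confirm that the patched module is
the one actually loaded.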
Note that any token after the 'codel' token is treated as a codel parameter,
so make sure that all pipe/queue configuration options are written before the
'codel' token. The codel synopsis is as follows:

    ipfw pipe/queue x config [...] codel [target t] [interval t] [ecn | noecn]

where t is time in seconds (s), milliseconds (ms) or microseconds (us). The
default interpretation is milliseconds.

2.3 Examples of using Codel

a) One pipe controlled by CoDel AQM (default configuration) and rate limited
   to 1 Mbits/s.

    ipfw pipe 1 config bw 1mbits/s codel
    ipfw add 100 pipe 1 ip from any to any

b) Two queues controlled by CoDel AQM using different CoDel configuration
   parameters. The pipe that queues 1 and 2 use is rate limited to 10 Mbits/s
   and has 20ms of emulated delay. In more detail, queues 1 and 2 are
   connected to an implicit WF2Q+ scheduler that uses pipe 1 for traffic
   shaping and for adding emulated delay.

    ipfw pipe 1 config bw 10mbits/s delay 20ms
    ipfw queue 1 config pipe 1 codel target 7ms ecn
    ipfw queue 2 config pipe 1 codel target 8ms interval 160ms ecn
    ipfw add 100 queue 1 ip from 192.168.0.0/16 to 192.168.0.0/16
    ipfw add 200 queue 2 ip from 172.16.0.0/16 to 172.16.0.0/16

c) Two queues: queue 1 is controlled by CoDel AQM and queue 2 uses droptail.
   Both queues are connected to a QFQ scheduler that uses pipe 1 to rate limit
   to 5 Mbits/s.

    ipfw pipe 1 config bw 5mbits/s
    ipfw sched 1 config pipe 1 type qfq
    ipfw queue 1 config sched 1 codel
    ipfw queue 2 config sched 1
    ipfw add 100 queue 1 ip from 192.168.0.0/16 to 192.168.0.0/16
    ipfw add 200 queue 2 ip from 172.16.0.0/16 to 172.16.0.0/16

3. FQ-CODEL OVERVIEW
---------------------
FlowQueue-CoDel [2] is a scheduler/AQM hybrid scheme that ensures fairness
among flows sharing the same bottleneck while controlling queue delay at the
same time. The FQ-CoDel scheduler is based on the DRR (Deficit Round Robin)
scheduler and manages two lists of queues (old queues and new queues) to
ensure fairness between heavy and lightweight flows.
Each FQ-CoDel queue is controlled by CoDel AQM.

3.1 FQ-CoDel Parameters

Because FQ-CoDel uses CoDel AQM, which works properly in most Internet
situations with its default configuration, FQ-CoDel does not require any
change to the defaults in most cases. However, in certain scenarios the
default parameters are not suitable and can reduce performance. Thus, our
fq_codel implementation provides options to change FQ-CoDel parameters for
each scheduler individually, as well as to change the default values.

The fq_codel configuration parameters and their default values are:

target:   the minimum acceptable persistent queue delay.
          Default: 5 ms. The default can be changed with the sysctl variable
          net.inet.ip.dummynet.fqcodel.target

interval: CoDel does not drop a packet as soon as packet sojourn time becomes
          higher than target, but waits for an interval of time before
          dropping. interval should be set to the maximum RTT of all expected
          connections.
          Default: 100 ms. The default can be changed with the sysctl variable
          net.inet.ip.dummynet.fqcodel.interval

[no]ecn:  enable/disable Explicit Congestion Notification (ECN). By default,
          fq_codel marks packets with ECN instead of dropping them.

quantum:  the number of bytes a queue can be served before it is moved to the
          tail of the old queues list.
          Default: 1514 bytes. The default can be changed with the sysctl
          variable net.inet.ip.dummynet.fqcodel.quantum

limit:    the hard limit on the total size of all queues managed by an
          fq_codel scheduler instance.
          Default: 10240 packets. The default can be changed with the sysctl
          variable net.inet.ip.dummynet.fqcodel.limit

flows:    the number of flow queues that fq_codel creates and manages.
          Default: 1024 queues. The default can be changed with the sysctl
          variable net.inet.ip.dummynet.fqcodel.flows

3.2 fq_codel Synopsis

fq_codel is used with the dummynet scheduler object ('sched') and is
configured through the ipfw interface.
Note that any token after the 'fq_codel' token is treated as an FQ-CoDel
parameter, so make sure that all scheduler configuration options are written
before the 'fq_codel' token. The fq_codel synopsis is as follows:

    ipfw sched x config [...] type fq_codel [target t] [interval t]
        [ecn | noecn] [quantum n] [limit n] [flows n]

where t is time in seconds (s), milliseconds (ms) or microseconds (us). The
default interpretation is milliseconds. n is an integer number.

3.3 Examples of using fq_codel

a) One scheduler with one queue, 2048 fq_codel sub-queues, target 7ms and
   quantum 2000 bytes.

    ipfw pipe 1 config bw 10mbits/s
    ipfw sched 1 config pipe 1 type fq_codel target 7ms quantum 2000 flows 2048 ecn
    ipfw queue 1 config sched 1
    ipfw add 100 queue 1 ip from 192.168.0.0/16 to 192.168.0.0/16

b) One scheduler with two queues (1024 fq_codel sub-queues by default),
   interval 150ms, ECN enabled.

    ipfw pipe 1 config delay 10ms
    ipfw sched 1 config pipe 1 type fq_codel interval 150ms ecn
    ipfw queue 1 config sched 1
    ipfw queue 2 config sched 1
    ipfw add 100 queue 1 ip from 192.168.0.0/16 to 192.168.0.0/16
    ipfw add 200 queue 2 ip from 172.16.0.0/16 to 172.16.0.0/16

4. PIE OVERVIEW
----------------
PIE drops or marks packets during the enqueue process according to a
calculated drop probability p, with the aim of achieving high throughput while
keeping queue delay low. At regular time intervals (tupdate), a background
process (re)calculates p based on queue delay deviations from target and on
queue delay trends. PIE approximates the current queue delay by using a
departure rate estimation method or (optionally) a packet timestamp method
similar to CoDel's.

PIE was designed to work properly on the Internet using its default
configuration. However, the defaults in [4] are not appropriate for all
network environments, such as datacenters, where latency is very low.
Moreover, PIE's configuration should be tunable for study and evaluation
purposes.
Thus, our implementation provides options to change PIE parameters away from
the defaults for each pipe/queue individually.

4.1 PIE Parameters

PIE has a number of configurable parameters and options derived from [4]:

target:         the acceptable persistent queue delay. Drop probability
                increases as queue delay rises above target.
tupdate:        how often the drop probability is recalculated.
alpha and beta: drop probability weights.
max_burst:      the maximum period of time during which PIE does not
                drop/mark packets.
max_ecnth:      when ECN is enabled, PIE drops packets instead of marking them
                once the drop probability becomes higher than the ECN
                probability threshold max_ecnth.

PIE has the following options:

[no]ecn:     enable (ecn) or disable (noecn) ECN marking for ECN-enabled TCP
             flows.
[no]capdrop: enable (capdrop) or disable (nocapdrop) cap drop adjustment.
[no]derand:  enable (derand) or disable (noderand) drop probability
             de-randomisation. De-randomisation eliminates the problem of
             dropping packets too close together or too far apart.
onoff:       enable turning PIE on and off depending on queue load. PIE turns
             on when over 1/3 of the queue becomes full.
[dre|ts]:    calculate queue delay using departure rate estimation (dre) or
             timestamps (ts).

4.2 PIE Synopsis

PIE is used with a dummynet 'pipe' or 'queue' and is configured through the
ipfw [3] interface. PIE has the following synopsis:

    ipfw pipe/queue x config [...] pie [target t] [tupdate t] [alpha n]
        [beta n] [max_burst t] [max_ecnth n] [ecn | noecn]
        [capdrop | nocapdrop] [derand | noderand] [onoff] [dre | ts]

where t is time in seconds (s), milliseconds (ms) or microseconds (us). The
default interpretation is milliseconds. n is a real number.

Note: Any token after 'pie' is considered a PIE parameter, so ensure all
pipe/queue configuration options are written before 'pie'.

4.3 Examples of using PIE

This subsection includes some examples of using PIE with ipfw/dummynet.
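
As background for choosing target, tupdate, alpha and beta in the examples of
this subsection, the core of the drop probability update described in [4] can
be written as below. This is a simplified sketch rather than a statement of
what this patch computes: it leaves out the burst allowance and the scaling of
the adjustment at low drop probabilities that [4] also specifies, and the
symbols d_cur, d_old and d_target are notation introduced here.

```latex
% Evaluated once every tupdate interval.
% d_cur: current queue delay estimate; d_old: estimate at the previous update;
% d_target: the configured target. These symbol names are illustrative only.
\begin{align*}
\Delta p &= \alpha \,(d_{\mathrm{cur}} - d_{\mathrm{target}})
          + \beta \,(d_{\mathrm{cur}} - d_{\mathrm{old}}) \\
p &\leftarrow \min\bigl(1,\ \max(0,\ p + \Delta p)\bigr)
\end{align*}
```

The alpha term reacts to how far the delay is from target, and the beta term
reacts to whether the delay is rising or falling. These are the "deviation"
and "trend" components mentioned in the PIE overview.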
Note that ipfw passes packets that match a dummynet pipe/queue classification
rule on to the next rule by default. Thus, a rule with an 'allow' action
should be added at some point after the pipe/queue rule.

a) One pipe controlled by PIE AQM (default configuration) and rate limited to
   1 Mbits/s.

    ipfw pipe 1 config bw 1mbits/s pie
    ipfw add 100 pipe 1 ip from any to any

b) Two queues controlled by PIE AQM using different PIE configuration
   parameters. The pipe that queues 1 and 2 use is rate limited to 10 Mbits/s
   and has 20ms of emulated delay. In more detail, queues 1 and 2 are
   connected to an implicit WF2Q+ scheduler that uses pipe 1 for traffic
   shaping and for adding emulated delay.

    ipfw pipe 1 config bw 10mbits/s delay 20ms
    ipfw queue 1 config pipe 1 pie target 25ms ecn
    ipfw queue 2 config pipe 1 pie target 20ms tupdate 30ms ecn
    ipfw add 100 queue 1 ip from 192.168.0.0/16 to 192.168.0.0/16
    ipfw add 200 queue 2 ip from 172.16.0.0/16 to 172.16.0.0/16

c) Two queues: queue 1 is controlled by PIE AQM and queue 2 uses CoDel. Both
   queues are connected to a QFQ scheduler that uses pipe 1 to rate limit to
   5 Mbits/s.

    ipfw pipe 1 config bw 5mbits/s
    ipfw sched 1 config pipe 1 type qfq
    ipfw queue 1 config sched 1 pie
    ipfw queue 2 config sched 1 codel
    ipfw add 100 queue 1 ip from 192.168.0.0/16 to 192.168.0.0/16
    ipfw add 200 queue 2 ip from 172.16.0.0/16 to 172.16.0.0/16

5. FQ-PIE OVERVIEW
-------------------
In the absence of any normative reference implementation or Internet Draft,
our implementation of FQ-PIE combines FQ-CoDel's FlowQueue logic with PIE
queue management on every dynamically created sub-queue. The goals are similar
to those of FQ-CoDel: control queuing delays while sharing bottleneck capacity
relatively evenly among competing flows. We set each instance of PIE to use
timestamps (ts) rather than departure rate estimation (dre) in the context of
FQ-PIE, as doubts have been raised about the accuracy of dre in such a
context [4].
Our implementation uses the same default parameters for FQ-PIE as in [2], and
provides options to change FQ-PIE parameters for each scheduler individually,
as well as to change the defaults.

5.1 FQ-PIE Parameters

FQ-PIE configuration parameters and options, default values and the sysctl
control variables that change the default values are equivalent to the PIE
parameters described in section 4.1. The remaining parameters are borrowed
from FQ-CoDel:

quantum: the number of bytes a queue can be served before it is moved to the
         tail of the old queues list.
limit:   the hard limit on the total size of all queues managed by an
         instance of the fq_pie scheduler.
flows:   the number of flow queues that fq_pie creates and manages.

5.2 fq_pie Synopsis

FQ-PIE is used with the dummynet scheduler object ('sched') and is configured
through the ipfw [3] interface. FQ-PIE has the following synopsis:

    ipfw sched x config [...] type fq_pie [target t] [tupdate t] [alpha n]
        [beta n] [max_burst t] [max_ecnth n] [quantum n] [limit n] [flows n]
        [ecn | noecn] [capdrop | nocapdrop] [derand | noderand] [onoff]
        [dre | ts]

where t is time in seconds (s), milliseconds (ms) or microseconds (us). The
default interpretation is milliseconds. n is a real number.

Note: Any token after 'fq_pie' is considered an FQ-PIE parameter, so ensure
all scheduler configuration options are written before 'fq_pie'.

5.3 Examples of using fq_pie

This subsection includes some examples of using fq_pie with ipfw/dummynet.
Note that ipfw passes packets that match a dummynet pipe/queue classification
rule on to the next rule by default. Thus, a rule with an 'allow' action
should be added at some point after the pipe/queue rule.

a) One scheduler with one queue, 2048 fq_pie sub-queues, target 10ms, tupdate
   10ms and quantum 2000 bytes.
   ECN is disabled.

    ipfw pipe 1 config bw 10mbits/s
    ipfw sched 1 config pipe 1 type fq_pie target 10ms tupdate 10ms quantum 2000 flows 2048 noecn
    ipfw queue 1 config sched 1
    ipfw add 100 queue 1 ip from 192.168.0.0/16 to 192.168.0.0/16

b) One scheduler with two queues (1024 fq_pie sub-queues by default), ECN
   enabled, maximum ECN threshold 0.5 and nocapdrop.

    ipfw pipe 1 config delay 10ms
    ipfw sched 1 config pipe 1 type fq_pie max_ecnth 0.5 ecn nocapdrop
    ipfw queue 1 config sched 1
    ipfw queue 2 config sched 1
    ipfw add 100 queue 1 ip from 192.168.0.0/16 to 192.168.0.0/16
    ipfw add 200 queue 2 ip from 172.16.0.0/16 to 172.16.0.0/16

6. APPLYING THE PATCH/TESTING/INSTALLATION
-------------------------------------------
Before applying Dummynet AQM patch v0.2, make sure you have installed
FreeBSD 11-CURRENT r297692 with its kernel source tree. You can check out
FreeBSD 11-CURRENT r297692 using:

    svn checkout -r r297692 svn://svn.freebsd.org/base/head/ /usr/src/

We provide another patch for FreeBSD 10.x-RELEASE (10.0, 10.1, 10.2, 10.3).
In addition to the codel/fq_codel/pie/fq_pie code, it includes ECN marking
code for FreeBSD 10.x-RELEASE (extracted from FreeBSD 11.0-CURRENT r266941).

Once the patch is applied, you only need to (re)build the dummynet.ko kernel
module and the ipfw userland command, rather than rebuild a complete kernel
and world from source. As the root user, do the following steps:

6.1 Applying The Patch

Apply the patch as follows:

a) Extract the patch file.

    tar -xvf dummynet-aqm-patch-0.2.tgz -C /usr/src/
    cd /usr/src

b) Apply the patch.

   For FreeBSD 11-CURRENT use:

    patch -p1 < dummynet-aqm-patch-0.2/freebsd11-r297692.patch

   For FreeBSD 10.x-RELEASE use:

    patch -p1 < dummynet-aqm-patch-0.2/freebsd10.x.patch

c) Copy ip_dummynet.h to the include directory.

    cp /usr/src/sys/netinet/ip_dummynet.h /usr/include/netinet/

d) Build the ipfw userland command.

    cd /usr/src/sbin/ipfw
    make

e) Build the dummynet kernel module.

    cd /usr/src/sys/modules/dummynet/
    make

6.2 Testing the patched ipfw/dummynet

Use the following steps to check that the patch applied cleanly and that both
the userland ipfw command and the dummynet kernel module were built:

a) Check whether an old dummynet is already loaded.

    kldstat | grep dummynet

b) If dummynet is already loaded, unload it.

    kldunload dummynet

c) Load the patched dummynet into the FreeBSD kernel.

    kldload /usr/src/sys/modules/dummynet/dummynet.ko

d) Check the debug messages using the 'dmesg' command:

    dmesg | grep 'CODEL\|PIE'

   The output should be something like:

    load_dn_aqm dn_aqm PIE loaded
    load_dn_aqm dn_aqm CODEL loaded
    load_dn_sched dn_sched FQ_CODEL loaded
    load_dn_sched dn_sched FQ_PIE loaded

e) Use the patched ipfw interface (specify the full pathname when using ipfw):

    /usr/src/sbin/ipfw/ipfw pipe 1 config codel
    /usr/src/sbin/ipfw/ipfw pipe 1 show

   The output should be something like:

    00001: unlimited 0 ms burst 0
    q131073 50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0
     AQM CoDel target 5ms interval 100ms
     sched 65537 type FIFO flags 0x0 0 buckets 0 active

6.3 Install the Patched ipfw/dummynet

To use the patched ipfw/dummynet by default, install them as follows:

a) Install the ipfw interface. Ignore any errors that appear.

    cd /usr/src/sbin/ipfw
    make install

b) Install the dummynet kernel module.

    cp /usr/src/sys/modules/dummynet/dummynet.ko /boot/kernel/

c) (Optional) To avoid the warning
   "KLD '/boot/kernel/dummynet.ko' is newer than the linker.hints file",
   generate hints for the kernel loader using the following command:

    kldxref /boot/kernel

7. REFERENCES
--------------
[1] Nichols, K. et al, "Controlled Delay Active Queue Management",
    Internet-draft, IETF, March 2016,
    https://tools.ietf.org/html/draft-ietf-aqm-codel-03
[2] Hoeiland-Joergensen, T.
    et al, "Flowqueue-codel", Internet-draft, IETF, March 2016,
    https://tools.ietf.org/html/draft-ietf-aqm-fq-codel-06
[3] ipfw(8), FreeBSD Man Pages,
    https://www.freebsd.org/cgi/man.cgi?ipfw(8)
[4] Pan, R. et al, "PIE: A Lightweight Control Scheme To Address the
    Bufferbloat Problem", Internet-draft, IETF, April 2016,
    https://tools.ietf.org/html/draft-ietf-aqm-pie-06

8. DEVELOPMENT TEAM
--------------------
The members of this project team are:

    Lead developer: Rasool Al-Saadi (ralsaadi@swin.edu.au)
    Project leader: Grenville Armitage (garmitage@swin.edu.au)

9. ACKNOWLEDGEMENTS
--------------------
This project has been made possible in part by a gift from The Comcast
Innovation Fund.