CS 498 Lecture 9: Traffic Control for QoS
Jennifer Hou
Department of Computer Science, University of Illinois at Urbana-Champaign
Reading: Chapter 18, The Linux Networking Architecture: Design and Implementation of Network Protocols in the Linux Kernel
Traffic Control
• Two major functions
  • Policing
    • Usually implemented at the router.
    • Data connections are monitored, and packets whose transmission violates a specified strategy are discarded.
  • Traffic shaping
    • Usually implemented at end hosts.
    • Data connections are regulated to conform to a certain rate. Surplus packets are either marked and then sent, or delayed at the sender until sending them no longer violates the rate constraint.
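The contrast between policing and shaping can be sketched with a token bucket. The slides do not name a particular rate-control mechanism, so the token bucket itself, the struct, and every function name below are illustrative assumptions rather than kernel code. The policer drops a packet that exceeds the available credit; the shaper instead reports how long the sender should hold it.

```c
#include <stdint.h>

#define TB_SCALE 1000000ULL  /* credit is kept in millionths of a byte */

/* Hypothetical token bucket; not a kernel structure. */
struct token_bucket {
    uint64_t rate;     /* refill rate, bytes per second */
    uint64_t burst;    /* bucket depth, bytes * TB_SCALE */
    uint64_t tokens;   /* available credit, bytes * TB_SCALE */
    uint64_t last_us;  /* time (microseconds) up to which credit is accounted */
};

/* Add the credit earned since last_us, capped at the bucket depth. */
static void tb_refill(struct token_bucket *tb, uint64_t now_us)
{
    if (now_us <= tb->last_us)
        return;  /* credit up to last_us has already been handed out */
    uint64_t earned = (now_us - tb->last_us) * tb->rate;
    tb->tokens = (tb->tokens + earned > tb->burst) ? tb->burst
                                                   : tb->tokens + earned;
    tb->last_us = now_us;
}

/* Policing: returns 1 to let the packet go, 0 to discard it. */
static int tb_police(struct token_bucket *tb, uint64_t now_us, uint32_t len)
{
    uint64_t need = (uint64_t)len * TB_SCALE;
    tb_refill(tb, now_us);
    if (tb->tokens < need)
        return 0;
    tb->tokens -= need;
    return 1;
}

/* Shaping: returns how many microseconds the sender must delay the packet
 * (0 means send now). Credit for the delayed packet is reserved in advance
 * by moving last_us into the future. */
static uint64_t tb_shape_delay(struct token_bucket *tb, uint64_t now_us,
                               uint32_t len)
{
    uint64_t need = (uint64_t)len * TB_SCALE;
    tb_refill(tb, now_us);
    if (tb->tokens >= need) {
        tb->tokens -= need;
        return 0;
    }
    uint64_t deficit = need - tb->tokens;
    uint64_t wait = (deficit + tb->rate - 1) / tb->rate;  /* round up */
    uint64_t start = (now_us > tb->last_us) ? now_us : tb->last_us;
    tb->tokens = 0;
    tb->last_us = start + wait;
    return tb->last_us - now_us;
}
```

With a rate of 1000 bytes/s and a 1500-byte bucket, two back-to-back 1000-byte packets show the difference: the policer drops the second one, while the shaper asks the sender to hold it for half a second.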
Processing of Network Data
[Figure: path of network data through the kernel. Labels: upper layers (TCP, UDP, …), traffic control, ingress policing, input de-multiplexing, forwarding, output queuing.]
Traffic Control in Linux Kernel
[Figure: location of traffic control in the kernel source. Labels: local delivery and locally created data; net/core/dev.c and net/ipv4/ip_input.c; forwarding; dev_queue_xmit; traffic control in the incoming direction (net/sched/sch_ingress.c); traffic control in the outgoing direction (net/sched/sch_*.c, net/sched/cls_*.c); net/core/dev.c and driver.c (dev->hard_start_xmit).]
Traffic Control in Linux Kernel
[Figure: receive and transmit call paths across eth0/eth1 and CPU1/CPU2. Receive side: net_interrupt, dev_alloc_skb(), eth_type_trans() (driver.c); netif_rx, softnet_data[cpun].input_pkt_queue, do_softirq, net_rx_action, handle_bridge (CONFIG_BRIDGE, br_input.c); protocol handlers p8022_rcv (ETH_P_802_2), arp_rcv, ip_rcv. Transmit side: arp_send, ip_queue_xmit, dev_queue_xmit, dev->qdisc->enqueue, net_tx_action, qdisc_run, qdisc_restart, dev->qdisc->dequeue, dev->hard_start_xmit (dev.c, driver.c).]
Components of Traffic Control
• Queueing disciplines
  • Packets to be sent are passed to a queueing discipline and sorted within the queue according to specific rules.
  • A packet can be removed only once the queueing discipline has marked it as ready for transmission.
• Classes (within a queueing discipline)
  • Within a queueing discipline, packets can be allocated to different classes.
• Filters
  • Filters are used to allocate packets to classes within a queueing discipline.
Queueing Discipline
• Each network device has a queueing discipline
• It controls how packets enqueued on the device are treated
• Possible operations: keep, drop, mark
• A simple one may just consist of a single queue
[Figure: a queueing discipline consisting of a single queue.]
Complex Queueing Discipline
• Queueing discipline
  • May use filters to distinguish among different classes of packets
  • Processes each class in a specific way
  • Two filters can point to one class
• Classes
  • Do not store packets themselves
  • Use another queueing discipline to do that
[Figure: a queueing discipline with enqueue and dequeue ends; three filters map packets to Class 1 and Class 2, each of which has its own inner queueing discipline.]
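The filter/class/inner-queue relationship can be modelled in a few lines of user-space C. This is a minimal sketch under stated assumptions: all type and function names are hypothetical, filters match only on a destination port, the inner discipline is a small FIFO, and the outer discipline dequeues in strict class order. None of this is the kernel's actual code.

```c
#include <stddef.h>

#define FIFO_CAP  4
#define NCLASSES  2
#define NFILTERS  3

struct pkt { int dport; int len; };

/* Inner queueing discipline: a bounded FIFO kept in a ring buffer. */
struct fifo { struct pkt *slot[FIFO_CAP]; int head, count; };

static int fifo_enqueue(struct fifo *f, struct pkt *p)
{
    if (f->count == FIFO_CAP)
        return -1;  /* queue full */
    f->slot[(f->head + f->count) % FIFO_CAP] = p;
    f->count++;
    return 0;
}

static struct pkt *fifo_dequeue(struct fifo *f)
{
    if (f->count == 0)
        return NULL;
    struct pkt *p = f->slot[f->head];
    f->head = (f->head + 1) % FIFO_CAP;
    f->count--;
    return p;
}

/* A class stores no packets itself; it owns an inner queueing discipline. */
struct cls { struct fifo inner; };

/* A filter matches a destination port and points to a class. */
struct filter { int dport; int class_idx; };

struct classful_qdisc {
    struct cls classes[NCLASSES];
    struct filter filters[NFILTERS];
    int default_class;  /* used when no filter matches */
};

/* Walk the filters in order; the first match selects the class. */
static int cq_classify(const struct classful_qdisc *q, const struct pkt *p)
{
    for (size_t i = 0; i < NFILTERS; i++)
        if (q->filters[i].dport == p->dport)
            return q->filters[i].class_idx;
    return q->default_class;
}

static int cq_enqueue(struct classful_qdisc *q, struct pkt *p)
{
    return fifo_enqueue(&q->classes[cq_classify(q, p)].inner, p);
}

/* Strict priority between classes: class 0 is always served first. */
static struct pkt *cq_dequeue(struct classful_qdisc *q)
{
    for (int c = 0; c < NCLASSES; c++) {
        struct pkt *p = fifo_dequeue(&q->classes[c].inner);
        if (p)
            return p;
    }
    return NULL;
}
```

If two filters (say ports 22 and 53) both point to class 0 and a third (port 80) points to class 1, packets enqueued in the order 80, 22, 53 come out as 22, 53, 80.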
Policing
• When packets of a connection are enqueued, the connection can be policed by:
  • Letting the packets go
  • Dropping the packets
  • Letting the packets go but marking them
Data Structures
include/net/pkt_sched.h
include/net/sch_generic.h
Traffic Control in Linux Kernel
• Traffic control kernel code resides mainly in net/sched.
• Traffic control in the incoming direction is handled by net/sched/sch_ingress.c.
• Queueing disciplines and filters for the outgoing direction are in
  • net/sched/sch_*.c
  • net/sched/cls_*.c
Traffic Control in Linux Kernel
• The interfaces used inside the kernel can be found in
  • /usr/src/linux-(version)/include/net/pkt_cls.h
  • /usr/src/linux-(version)/include/net/pkt_sched.h
• The interfaces between kernel traffic control and user-space programs are declared in
  • /usr/include/linux/pkt_cls.h
  • /usr/include/linux/pkt_sched.h
Inserting Traffic Control
[Figure: transmit path with traffic control inserted. dev_queue_xmit calls dev->qdisc->enqueue and then qdisc_run, which calls qdisc_restart, dev->qdisc->dequeue, and dev->hard_start_xmit (dev.c, net/sched/*, driver.c). A timer handler or netif_schedule raises NET_TX_SOFTIRQ through cpu_raise_softirq; do_softirq then runs net_tx_action, which calls qdisc_run (softirq.c, netdevice.h). The queueing discipline is drawn with three filters, Class 1 and Class 2, and one inner queueing discipline per class.]
Queueing Discipline -- Qdisc
struct Qdisc {
    int (*enqueue)(struct sk_buff *skb, struct Qdisc *dev);
    struct sk_buff * (*dequeue)(struct Qdisc *dev);
    unsigned flags;
#define TCQ_F_BUILTIN 1
#define TCQ_F_THROTTLED 2
#define TCQ_F_INGRESS 4
    int padded;
    struct Qdisc_ops *ops;
    u32 handle;
    u32 parent;
    atomic_t refcnt;
    struct sk_buff_head q;
    struct net_device *dev;
    struct list_head list;
    struct gnet_stats_basic bstats;
    struct gnet_stats_queue qstats;
    struct gnet_stats_rate_est rate_est;
    spinlock_t *stats_lock;
    struct rcu_head q_rcu;
    int (*reshape_fail)(struct sk_buff *skb, struct Qdisc *q);
    struct Qdisc *__parent;
};
• ops: the Qdisc_ops data structure
• q: the socket buffer queue governed by this qdisc
• dev: the network device to which the Qdisc is allocated
• reshape_fail: when an outer queue passes a packet to an inner queue, the packet may have to be discarded. If the outer queueing discipline implements the callback function reshape_fail, the inner queueing discipline can invoke it.
Queueing Discipline -- Qdisc_ops
struct Qdisc_ops {
    struct Qdisc_ops *next;
    struct Qdisc_class_ops *cl_ops;
    char id[IFNAMSIZ];
    int priv_size;
    int (*enqueue)(struct sk_buff *, struct Qdisc *);
    struct sk_buff * (*dequeue)(struct Qdisc *);
    int (*requeue)(struct sk_buff *, struct Qdisc *);
    unsigned int (*drop)(struct Qdisc *);
    int (*init)(struct Qdisc *, struct rtattr *arg);
    void (*reset)(struct Qdisc *);
    void (*destroy)(struct Qdisc *);
    int (*change)(struct Qdisc *, struct rtattr *arg);
    int (*dump)(struct Qdisc *, struct sk_buff *);
};
• requeue: the packet should be placed at the position in the queueing discipline where it was before
• A queueing discipline can be added via register_qdisc() in init_module()
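The `next` pointer and `id` field suggest how registration works: each operations table is linked into a list and found again by name. The sketch below models that idea in user space. The `toy_` names are hypothetical, the table is cut down to three fields, and the duplicate-name check is an assumption about reasonable behaviour, not a transcription of register_qdisc().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for struct Qdisc_ops (hypothetical). */
struct toy_qdisc_ops {
    struct toy_qdisc_ops *next;  /* next registered discipline */
    char id[16];                 /* name used to look the discipline up */
    int (*init)(void *priv);     /* one callback, to show the pattern */
};

/* Head of the list of registered disciplines. */
static struct toy_qdisc_ops *toy_qdisc_base;

/* Append ops to the list; refuse a second discipline with the same id. */
static int toy_register_qdisc(struct toy_qdisc_ops *ops)
{
    struct toy_qdisc_ops **pp;

    for (pp = &toy_qdisc_base; *pp; pp = &(*pp)->next)
        if (strcmp((*pp)->id, ops->id) == 0)
            return -EEXIST;
    ops->next = NULL;
    *pp = ops;
    return 0;
}

/* Find a registered discipline by name, or return NULL. */
static struct toy_qdisc_ops *toy_qdisc_lookup(const char *id)
{
    struct toy_qdisc_ops *q;

    for (q = toy_qdisc_base; q; q = q->next)
        if (strcmp(q->id, id) == 0)
            return q;
    return NULL;
}
```

A module would fill in one such table and register it once at load time; later configuration requests then resolve the discipline by its `id` string.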
Qdisc_ops
• enqueue()
  • Enqueues a packet
  • Return values are
    • NET_XMIT_SUCCESS, if the packet is accepted
    • NET_XMIT_DROP, if the packet is discarded
    • NET_XMIT_CN, if the packet is discarded because of buffer overflow
    • NET_XMIT_POLICED, if the packet is discarded because it violates a policing rule
    • NET_XMIT_BYPASS, if the packet is accepted but will not leave the queue via the regular dequeue() function
Qdisc_ops
• dequeue()
  • Returns a pointer to a packet (skb) eligible for sending
  • A return value of NULL means that no packets are ready to be sent. (The total number of packets in the queue is given by q.qlen in struct Qdisc.)
• requeue()
  • Puts a packet back into the position in the queue where it was before.
  • The count of packets that have passed through the queue should not be incremented.
• drop()
  • Drops one packet from the queue
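The four packet operations can be tried out on a toy FIFO. This is a user-space sketch, not kernel code: the `toy_` types are hypothetical, the two return codes are local stand-ins for NET_XMIT_SUCCESS and NET_XMIT_DROP, and `toy_drop()` removing the newest packet and returning 1 or 0 is a simplifying assumption.

```c
#include <stddef.h>

enum { TOY_XMIT_SUCCESS = 0, TOY_XMIT_DROP = 1 };  /* local stand-ins */
#define TOY_LIMIT 3

struct toy_skb { int id; };

struct toy_fifo {
    struct toy_skb *ring[TOY_LIMIT];
    unsigned head;          /* index of the oldest packet */
    unsigned qlen;          /* packets currently queued */
    unsigned long packets;  /* statistics: packets accepted by enqueue */
    unsigned long drops;    /* statistics: packets discarded */
};

/* Append at the tail; refuse the packet when the queue is full. */
static int toy_enqueue(struct toy_skb *skb, struct toy_fifo *q)
{
    if (q->qlen == TOY_LIMIT) {
        q->drops++;
        return TOY_XMIT_DROP;
    }
    q->ring[(q->head + q->qlen) % TOY_LIMIT] = skb;
    q->qlen++;
    q->packets++;
    return TOY_XMIT_SUCCESS;
}

/* Remove the oldest packet; NULL means nothing is ready to send. */
static struct toy_skb *toy_dequeue(struct toy_fifo *q)
{
    if (q->qlen == 0)
        return NULL;
    struct toy_skb *skb = q->ring[q->head];
    q->head = (q->head + 1) % TOY_LIMIT;
    q->qlen--;
    return skb;
}

/* Put a just-dequeued packet back at the head. The packets counter is
 * deliberately left alone so the packet is not counted twice. */
static int toy_requeue(struct toy_skb *skb, struct toy_fifo *q)
{
    if (q->qlen == TOY_LIMIT) {
        q->drops++;
        return TOY_XMIT_DROP;
    }
    q->head = (q->head + TOY_LIMIT - 1) % TOY_LIMIT;
    q->ring[q->head] = skb;
    q->qlen++;
    return TOY_XMIT_SUCCESS;
}

/* Discard the newest packet; returns 1 if one was dropped, 0 if empty. */
static unsigned toy_drop(struct toy_fifo *q)
{
    if (q->qlen == 0)
        return 0;
    q->qlen--;
    q->drops++;
    return 1;
}
```

A typical caller dequeues a packet, tries to hand it to the driver, and calls `toy_requeue()` if the driver is busy; the next `toy_dequeue()` then returns the same packet again.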
Qdisc_ops
• init()
  • Initializes the queueing discipline
• reset()
  • Resets the queueing discipline to its initial state (empty queue, reset counters, delete timers)
• destroy()
  • Removes a queueing discipline and frees all the resources reserved during its runtime.
• change()
  • Changes the parameters of a queueing discipline
• dump()
  • Outputs the configuration parameters and statistics of a queueing discipline.
Qdisc_class_ops
struct Qdisc_class_ops {
    /* Child qdisc manipulation */
    int (*graft)(struct Qdisc *, unsigned long cl, struct Qdisc *, struct Qdisc **);
    struct Qdisc * (*leaf)(struct Qdisc *, unsigned long cl);
    /* Class manipulation routines */
    unsigned long (*get)(struct Qdisc *, u32 classid);
    void (*put)(struct Qdisc *, unsigned long);
    int (*change)(struct Qdisc *, u32, u32, struct rtattr **, unsigned long *);
    int (*delete)(struct Qdisc *, unsigned long);
    void (*walk)(struct Qdisc *, struct qdisc_walker * arg);
    /* Filter manipulation */
    struct tcf_proto ** (*tcf_chain)(struct Qdisc *, unsigned long);
    unsigned long (*bind_tcf)(struct Qdisc *, unsigned long, u32 classid);
    void (*unbind_tcf)(struct Qdisc *, unsigned long);
    /* rtnetlink specific */
    int (*dump)(struct Qdisc *, unsigned long, struct sk_buff *skb, struct tcmsg*);
    int (*dump_stats)(struct Qdisc *, unsigned long, struct gnet_dump *);
};
Qdisc_class_ops
• graft(): binds a queueing discipline to a class
• leaf(): returns a pointer to the queueing discipline currently bound to the class
• get(): maps the classid to the internal identification and increments the reference counter by one.
  • Each class is associated with two ids
    • classid (of type u32) is used by the user and by the configuration tools in user space.
    • The internal identification (of type unsigned long) is used within the kernel.
• put(): decrements the usage counter.
Qdisc_class_ops
• change(): changes the class parameters
• delete(): checks whether the class is still referenced and, if it is not, deletes the class.
• walk(): walks through the linked list of all the classes of a queueing discipline and invokes the associated callback functions to obtain configuration/statistics data.
• tcf_chain(): returns a pointer to the linked list of filters bound to the class.
• bind_tcf(): binds a filter to a class.
• dump(): outputs configuration and statistics data of a class.
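The two-identifier scheme and the reference checks can be illustrated with a small user-space model. Everything here is an assumption made for illustration: the `toy_` names are hypothetical, the internal identification is simply the array index plus one (so 0 can mean "no such class"), and "still referenced" is modelled as "a filter is still bound".

```c
#include <errno.h>
#include <stdint.h>

#define TOY_NCLASS 4

struct toy_class {
    uint32_t classid;  /* identifier used by user space */
    int in_use;        /* slot holds a live class */
    int refcnt;        /* usage counter maintained by get()/put() */
    int filters;       /* number of filters bound via bind_tcf() */
};

struct toy_sched { struct toy_class cl[TOY_NCLASS]; };

/* Linear search; returns index + 1, or 0 when the classid is unknown. */
static unsigned long toy_find(const struct toy_sched *s, uint32_t classid)
{
    for (unsigned long i = 0; i < TOY_NCLASS; i++)
        if (s->cl[i].in_use && s->cl[i].classid == classid)
            return i + 1;
    return 0;
}

/* get(): map classid to the internal id and take a reference. */
static unsigned long toy_get(struct toy_sched *s, uint32_t classid)
{
    unsigned long cl = toy_find(s, classid);
    if (cl)
        s->cl[cl - 1].refcnt++;
    return cl;
}

/* put(): release the reference taken by get(). */
static void toy_put(struct toy_sched *s, unsigned long cl)
{
    s->cl[cl - 1].refcnt--;
}

/* bind_tcf(): record that one more filter points at the class. */
static unsigned long toy_bind_tcf(struct toy_sched *s, uint32_t classid)
{
    unsigned long cl = toy_find(s, classid);
    if (cl)
        s->cl[cl - 1].filters++;
    return cl;
}

static void toy_unbind_tcf(struct toy_sched *s, unsigned long cl)
{
    s->cl[cl - 1].filters--;
}

/* delete(): refuse while the class is still referenced by a filter. */
static int toy_delete(struct toy_sched *s, unsigned long cl)
{
    struct toy_class *c = &s->cl[cl - 1];
    if (c->filters > 0)
        return -EBUSY;
    c->in_use = 0;
    return 0;
}
```

Callers bracket every operation on a class with `toy_get()` and `toy_put()`, and a delete request fails with `-EBUSY` until the last filter has been unbound.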
tcf_proto
struct tcf_proto {
    /* Fast access part */
    struct tcf_proto *next;
    void *root;
    int (*classify)(struct sk_buff*, struct tcf_proto*, struct tcf_result *);
    u32 protocol;
    /* All the rest */
    u32 prio;
    u32 classid;
    struct Qdisc *q;
    void *data;
    struct tcf_proto_ops *ops;
};
tcf_proto_ops
struct tcf_proto_ops {
    struct tcf_proto_ops *next;
    char kind[IFNAMSIZ];
    int (*classify)(struct sk_buff*, struct tcf_proto*, struct tcf_result *);
    int (*init)(struct tcf_proto*);
    void (*destroy)(struct tcf_proto*);
    unsigned long (*get)(struct tcf_proto*, u32 handle);
    void (*put)(struct tcf_proto*, unsigned long);
    int (*change)(struct tcf_proto*, unsigned long, u32 handle, struct rtattr **, unsigned long *);
    int (*delete)(struct tcf_proto*, unsigned long);
    void (*walk)(struct tcf_proto*, struct tcf_walker *arg);
    /* rtnetlink specific */
    int (*dump)(struct tcf_proto*, unsigned long, struct sk_buff *skb, struct tcmsg*);
    struct module *owner;
};
tcf_proto_ops
• classify(): classifies a packet (checks whether the filtering rule applies to the packet)
  • Possible return values are
    • TC_POLICE_OK: the packet is accepted by the filter.
    • TC_POLICE_RECLASSIFY: the packet violates agreed parameters and should be allocated to a different class.
    • TC_POLICE_SHOT: the packet was dropped because it violates agreed parameters.
    • TC_POLICE_UNSPEC: the rule does not match the packet, and the packet should be passed to the next filter.
  • tcf_result contains the classid and the internal identification of the class.
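The "pass to the next filter" rule amounts to a walk over a linked list of filters that stops at the first definite verdict. The following user-space sketch shows that walk. The `toy_` names, the port-and-length matching rule, and the enum values are illustrative assumptions, not the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the TC_POLICE_* verdicts. */
enum { TOY_UNSPEC = -1, TOY_OK = 0, TOY_RECLASSIFY = 1, TOY_SHOT = 2 };

struct toy_pkt { uint16_t protocol; uint16_t dport; uint32_t len; };
struct toy_result { uint32_t classid; };

struct toy_filter {
    struct toy_filter *next;  /* next filter in the chain */
    uint16_t protocol;        /* 0 matches any protocol in this sketch */
    int (*classify)(const struct toy_pkt *, const struct toy_filter *,
                    struct toy_result *);
    uint16_t dport;           /* rule: destination port to match */
    uint32_t max_len;         /* rule: largest packet the class accepts */
    uint32_t classid;         /* class selected on a match */
};

/* Example rule: match on port, shoot oversized packets. */
static int toy_match_port(const struct toy_pkt *pkt,
                          const struct toy_filter *f,
                          struct toy_result *res)
{
    if (pkt->dport != f->dport)
        return TOY_UNSPEC;  /* rule does not apply: try the next filter */
    if (pkt->len > f->max_len)
        return TOY_SHOT;    /* violates the agreed parameters */
    res->classid = f->classid;
    return TOY_OK;
}

/* Walk the chain until some filter returns a definite verdict. */
static int toy_classify(const struct toy_pkt *pkt,
                        const struct toy_filter *chain,
                        struct toy_result *res)
{
    for (const struct toy_filter *f = chain; f; f = f->next) {
        if (f->protocol && f->protocol != pkt->protocol)
            continue;
        int verdict = f->classify(pkt, f, res);
        if (verdict != TOY_UNSPEC)
            return verdict;
    }
    return TOY_UNSPEC;  /* no filter matched */
}
```

When the whole chain returns the unspecified verdict, the calling queueing discipline has to fall back on its own default, for example a default class.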
Queueing Discipline Example net/sched/sch_red.c
[Figure: RED dropping probability. Axis label: Dropping Probability; marks: 1, maxp, minth, maxth; curves: pa, pb, and the Linux implementation.]
RED implementation I
struct red_sched_data {
    /* Parameters */
    u32 limit;      /* HARD maximal queue length */
    u32 qth_min;    /* Min average length threshold: A scaled */
    u32 qth_max;    /* Max average length threshold: A scaled */
    char Wlog;      /* log(W) */
    char Plog;      /* random number bits */
    …
    unsigned long qave;       /* Average queue length: A scaled */
    int qcount;               /* Packets since last random number generation */
    u32 qR;                   /* Cached random number */
    psched_time_t qidlestart; /* Start of idle period */
    struct tc_red_xstats st;
};
RED implementation II: Compute average queue length
• We want:
  • avg = avg * (1 - w) + w * backlog
• Code in Linux:
  • q->qave += sch->stats.backlog - (q->qave >> q->Wlog);
• Why:
  • q->qave stores the average scaled by 1/w, that is, avg = q->qave * w
  • w = 2^(-Wlog), so multiplying by w is a right shift by Wlog
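Dividing the update rule by w shows why the shift form works: with qave = avg / w, the rule avg' = avg * (1 - w) + w * backlog becomes qave' = qave - qave * w + backlog, and qave * w is qave >> Wlog. The function below is a stand-alone sketch of that single step (the name `red_ewma_step` is hypothetical) so the behaviour can be checked with plain integers.

```c
/* One step of the scaled moving average. qave holds avg * 2^Wlog, so the
 * true average is qave >> Wlog. Because qave >= (qave >> Wlog), the
 * unsigned arithmetic cannot underflow. */
static unsigned long red_ewma_step(unsigned long qave,
                                   unsigned long backlog, int Wlog)
{
    return qave + backlog - (qave >> Wlog);
}

/* Recover the unscaled average (truncated to an integer). */
static unsigned long red_avg(unsigned long qave, int Wlog)
{
    return qave >> Wlog;
}
```

With Wlog = 2 (w = 1/4) and a constant backlog of 8, starting from qave = 0, the scaled value goes 8, 14, 19, ... and settles at 32, which is an average of 8. The third value differs slightly from the exact real-valued result (19/4 = 4.75 versus 4.625) because the shift truncates.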
RED implementation III
• Ideally, avg should be recalculated at a constant clock interval
• In Linux it is updated on every outgoing packet
• Care needs to be taken for idle periods
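RED implementation IV: Decide dropping probability
• We want: enqueue if
  • max_P * (avg - minth) / (maxth - minth) < rnd / count
  • where rnd is a random number in [0, 1) and count is the number of packets since the last random number generation (q->qcount)
• Linux code:
  • if (((q->qave - q->qth_min)>>q->Wlog)*q->qcount < q->qR) goto enqueue;
• with
  • max_P = (qth_max - qth_min)/2^Plog
  • q->qR = rnd * 2^Plog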
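The integer test and its real-valued counterpart can be put side by side to see that they agree. The sketch below is a user-space illustration with hypothetical function names; it assumes the caller has already established that the average lies between the two thresholds, so that qave >= qth_min.

```c
#include <stdint.h>

/* Integer form: qave and qth_min are scaled by 2^Wlog, and qR is a cached
 * random number in the range 0 .. 2^Plog. Returns 1 to enqueue the packet,
 * 0 to drop or mark it. */
static int red_enqueue_int(unsigned long qave, uint32_t qth_min, int Wlog,
                           unsigned long qcount, uint32_t qR)
{
    return ((qave - qth_min) >> Wlog) * qcount < qR;
}

/* Real-valued form of the same test, in unscaled units:
 * max_p * (avg - minth) / (maxth - minth) < rnd / count */
static int red_enqueue_fp(double avg, double minth, double maxth,
                          double max_p, double rnd, unsigned long count)
{
    return max_p * (avg - minth) / (maxth - minth) < rnd / (double)count;
}
```

As a worked case, take Wlog = 2, Plog = 10, minth = 5 and maxth = 15 packets, so qth_min = 20 and max_P = 10/1024. With avg = 10 (qave = 40) and 100 packets since the last random number, the left side of the integer test is 5 * 100 = 500, so the packet is enqueued for qR = 600 and not for qR = 400; the real-valued form gives the same answers for rnd = 600/1024 and rnd = 400/1024.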