Interface between the Kernel and User Space for the Traffic Controller on Linux
softgear@dcn.ssu.ac.kr
Overview
• User space: tc
• Shared headers: include/linux/pkt_cls.h, include/linux/pkt_sched.h, include/linux/rtnetlink.h
• net/netlink: struct sockaddr_nl, struct nlmsghdr
• netlink socket, rtnetlink socket
• Kernel space: net/core/rtnetlink.c
Boot Time
• __initfunc in net/core/dev.c calls pktsched_init
• pktsched_init is in net/sched/sch_api.c
• declarations
• binding
pktsched_init
• struct rtnetlink_link *link_p;
• if (link_p) {
    link_p[RTM_NEWQDISC-RTM_BASE].doit = tc_ctl_qdisc;
    link_p[RTM_DELQDISC-RTM_BASE].doit = tc_ctl_qdisc;
    link_p[RTM_GETQDISC-RTM_BASE].doit = tc_ctl_qdisc;
    link_p[RTM_GETQDISC-RTM_BASE].dumpit = tc_dump_qdisc;
    link_p[RTM_NEWTCLASS-RTM_BASE].doit = tc_ctl_tclass;
    link_p[RTM_DELTCLASS-RTM_BASE].doit = tc_ctl_tclass;
    link_p[RTM_GETTCLASS-RTM_BASE].doit = tc_ctl_tclass;
    link_p[RTM_GETTCLASS-RTM_BASE].dumpit = tc_dump_tclass;
  }
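The binding step can be pictured as filling a table of function pointers indexed by message type minus a base value. The sketch below is a user-space model only: every `demo_`/`DEMO_` name is hypothetical, the numeric constants are stand-ins, and the handlers just echo their argument so the routing can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's message-type constants. */
enum {
    DEMO_RTM_BASE     = 16,
    DEMO_RTM_NEWQDISC = 36,
    DEMO_RTM_DELQDISC = 37,
    DEMO_RTM_GETQDISC = 38,
    DEMO_RTM_MAX      = 47
};

/* One table slot: a handler for single requests and one for dumps. */
struct demo_link {
    int (*doit)(int msg_type);
    int (*dumpit)(int msg_type);
};

struct demo_link demo_table[DEMO_RTM_MAX - DEMO_RTM_BASE + 1];

/* Hypothetical handlers; they only echo values so routing is visible. */
static int demo_ctl_qdisc(int msg_type)  { return msg_type; }
static int demo_dump_qdisc(int msg_type) { return -msg_type; }

/* Models the binding step: fill slots indexed by (type - base). */
void demo_pktsched_init(void)
{
    struct demo_link *link_p = demo_table;

    link_p[DEMO_RTM_NEWQDISC - DEMO_RTM_BASE].doit   = demo_ctl_qdisc;
    link_p[DEMO_RTM_DELQDISC - DEMO_RTM_BASE].doit   = demo_ctl_qdisc;
    link_p[DEMO_RTM_GETQDISC - DEMO_RTM_BASE].doit   = demo_ctl_qdisc;
    link_p[DEMO_RTM_GETQDISC - DEMO_RTM_BASE].dumpit = demo_dump_qdisc;
}

/* Look up the slot for a message type and call its doit handler.
 * Returns 0 when the type is out of range or has no handler bound. */
int demo_dispatch(int msg_type)
{
    struct demo_link *slot;

    if (msg_type < DEMO_RTM_BASE || msg_type > DEMO_RTM_MAX)
        return 0;
    slot = &demo_table[msg_type - DEMO_RTM_BASE];
    return slot->doit ? slot->doit(msg_type) : 0;
}
```

Calling `demo_pktsched_init()` once and then `demo_dispatch(DEMO_RTM_NEWQDISC)` reaches the bound handler; an unbound type falls through to the default return value.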
User-Level Application
• Create netlink socket
• sendto
• netlink_sendmsg (net/netlink/af_netlink.c)
• rtnetlink_rcv_msg
• call function in rtnetlink_link (net/core/rtnetlink.c)
rtnetlink_links
• rtnetlink_links: array of pointers to rtnetlink_link
• rtnetlink_link: command
TC program
• do_qdisc: tc_qdisc_modify, tc_qdisc_list
• do_class
• usage
• do_filter
tc_qdisc_modify
• allocate “req”
• initialize it
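As a rough illustration of what allocating and initializing “req” involves, the sketch below lays out a netlink header followed by a `struct tcmsg` and spare room for attributes. It uses the structures and macros from the Linux user-space headers; `struct demo_req`, `demo_init_req`, and `demo_addattr` are hypothetical names introduced here, and the 1024-byte buffer size is an assumption.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Request layout: netlink header, then tcmsg, then room for attributes.
 * The buffer size is an arbitrary choice for this sketch. */
struct demo_req {
    struct nlmsghdr n;
    struct tcmsg    t;
    char            buf[1024];
};

/* Zero the request and fill in the fixed header fields. */
void demo_init_req(struct demo_req *req, int cmd, unsigned int flags,
                   int ifindex)
{
    memset(req, 0, sizeof(*req));
    req->n.nlmsg_len   = NLMSG_LENGTH(sizeof(struct tcmsg));
    req->n.nlmsg_flags = NLM_F_REQUEST | flags;
    req->n.nlmsg_type  = cmd;
    req->t.tcm_family  = AF_UNSPEC;
    req->t.tcm_ifindex = ifindex;
}

/* Hypothetical helper: append one attribute (type + payload) after the
 * current end of the message. Returns 0, or -1 if it would not fit. */
int demo_addattr(struct nlmsghdr *n, size_t maxlen, unsigned short type,
                 const void *data, size_t alen)
{
    size_t len = RTA_LENGTH(alen);
    struct rtattr *rta;

    if (NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len) > maxlen)
        return -1;
    rta = (struct rtattr *)((char *)n + NLMSG_ALIGN(n->nlmsg_len));
    rta->rta_type = type;
    rta->rta_len  = (unsigned short)len;
    memcpy(RTA_DATA(rta), data, alen);
    n->nlmsg_len = NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len);
    return 0;
}
```

A caller would typically follow `demo_init_req(&req, RTM_NEWQDISC, NLM_F_CREATE | NLM_F_EXCL, ifindex)` with `demo_addattr(&req.n, sizeof(req), TCA_KIND, "prio", 5)` to name the queueing discipline.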
tc_qdisc_modify (cont’d)
• rtnl_open: create ‘rtnetlink’ socket
  family = AF_NETLINK
  type = SOCK_RAW
  protocol = NETLINK_ROUTE
• set up and bind local address (sockaddr_nl local)
• call “rtnl_talk”
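The socket setup listed above can be sketched in user space as follows. This is a minimal version, assuming a Linux system; `demo_rtnl_open` is a hypothetical name, and leaving `nl_pid` at 0 so the kernel assigns an address is a choice made for this sketch.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Open and bind a routing netlink socket.
 * Returns the descriptor, or -1 on failure. On success, *local holds
 * the address the kernel actually bound. */
int demo_rtnl_open(struct sockaddr_nl *local)
{
    socklen_t len;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;

    memset(local, 0, sizeof(*local));
    local->nl_family = AF_NETLINK;
    local->nl_groups = 0;   /* no multicast subscriptions */
    /* nl_pid stays 0: the kernel picks a unique port id on bind */

    if (bind(fd, (struct sockaddr *)local, sizeof(*local)) < 0) {
        close(fd);
        return -1;
    }

    /* Read back the bound address and sanity-check it. */
    len = sizeof(*local);
    if (getsockname(fd, (struct sockaddr *)local, &len) < 0 ||
        len != sizeof(*local) || local->nl_family != AF_NETLINK) {
        close(fd);
        return -1;
    }
    return fd;
}
```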
rtnl_talk
• allocate “msghdr msg”
• call “sendmsg”
• sys_sendmsg
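To make the “msghdr msg” step concrete, the sketch below fills a `struct msghdr` that points at a netlink request and addresses it to the kernel, then hands it to `sendmsg`. `demo_prepare_talk` and `demo_talk` are hypothetical names, and reply handling is left out entirely.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/netlink.h>

/* Fill msg so that sendmsg() would deliver the netlink message n to
 * the kernel. nladdr and iov must outlive msg, which only points at
 * them. */
void demo_prepare_talk(struct msghdr *msg, struct sockaddr_nl *nladdr,
                       struct iovec *iov, struct nlmsghdr *n)
{
    memset(nladdr, 0, sizeof(*nladdr));
    nladdr->nl_family = AF_NETLINK;
    nladdr->nl_pid    = 0;   /* port id 0 addresses the kernel */
    nladdr->nl_groups = 0;   /* unicast */

    iov->iov_base = n;
    iov->iov_len  = n->nlmsg_len;

    memset(msg, 0, sizeof(*msg));
    msg->msg_name    = nladdr;
    msg->msg_namelen = sizeof(*nladdr);
    msg->msg_iov     = iov;
    msg->msg_iovlen  = 1;
}

/* Send one request on an already opened netlink socket.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t demo_talk(int fd, struct nlmsghdr *n)
{
    struct sockaddr_nl nladdr;
    struct iovec iov;
    struct msghdr msg;

    demo_prepare_talk(&msg, &nladdr, &iov, n);
    return sendmsg(fd, &msg, 0);
}
```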
sys_sendmsg
• copy req and msg from user space to kernel space
• sock_sendmsg
• scm_cookie scm
• call ‘scm_send’
• call socket’s ‘sendmsg’ = netlink_ops
• netlink_sendmsg
netlink_sendmsg
• memcpy_from_iovec: copy msg into skbuff
• netlink_broadcast or netlink_unicast, depending on dstgroups
netlink_unicast
• find ‘linked list’ in nl_table by socket’s protocol, then match pid
• add_wait_queue
• put skbuff on socket’s receive queue
• call ‘data_ready’ = rtnetlink_rcv
rtnetlink_rcv
• take skbuff from socket’s receive queue
• invoke ‘rtnetlink_rcv_skb’
rtnetlink_rcv_skb
• get nlh from skbuff
• invoke ‘rtnetlink_rcv_msg’, passing ‘nlh’
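Pulling one nlh after another out of a buffer has a close user-space analogue in the `NLMSG_OK` and `NLMSG_NEXT` macros. The sketch below is not the kernel routine; `demo_walk_messages` is a hypothetical helper that shows the same walk over a plain byte buffer.

```c
#include <assert.h>
#include <string.h>
#include <linux/netlink.h>

/* Visit each complete netlink message in buf[0..len) and pass it to
 * handler (which may be NULL). Returns the number of messages seen.
 * buf must be suitably aligned for struct nlmsghdr. */
int demo_walk_messages(void *buf, int len,
                       void (*handler)(struct nlmsghdr *nlh))
{
    int count = 0;
    struct nlmsghdr *nlh = (struct nlmsghdr *)buf;

    /* NLMSG_OK checks that a whole message fits in the remaining
     * bytes; NLMSG_NEXT advances nlh and decrements len. */
    while (NLMSG_OK(nlh, len)) {
        if (handler)
            handler(nlh);
        count++;
        nlh = NLMSG_NEXT(nlh, len);
    }
    return count;
}
```

A truncated trailing message is skipped rather than handed to the handler, because `NLMSG_OK` fails once the remaining length is too short.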
rtnetlink_rcv_msg
• invoke ‘doit’ in ‘rtnetlink_link’
• in this case, doit = tc_modify_qdisc
Middle summary
• User space: tc
• nlmsghdr, tcmsg carried over netlink, rtnetlink
• Kernel space: rtnetlink_rcv, which leads to tc_get_qdisc, tc_ctl_tfilter, tc_modify_qdisc
tc_modify_qdisc
• dev_get_by_index, index = tcm->tcm_ifindex
• if qdisc parent is set, call ‘qdisc_lookup’ to find parent Q, then call ‘qdisc_leaf’
tc_modify_qdisc (cont’d)
• if tcm->tcm_handle is not empty, call ‘qdisc_lookup’ for band Q
• if the lookup fails: create_n_graft
• otherwise: graft
tc_modify_qdisc (cont’d)
• if tcm->tcm_handle is empty:
• if q is empty: create_n_graft (create, then graft)
• else: keep the existing q
tc_modify_qdisc (cont’d)
• if tcm->tcm_parent is not specified and tcm->tcm_handle is not empty, call ‘qdisc_lookup’
• call qdisc_change(q, tca)
• ‘qdisc_change’ calls ‘prio_tune’
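The branches of tc_modify_qdisc are easier to follow when collapsed into one decision function. The sketch below is a simplified model, not the kernel code: the inputs are hypothetical booleans standing in for the lookups, flag checks are omitted, and two outcomes are assumptions made here rather than statements from the slides (an existing leaf with an empty handle leads to a change, and a request with neither parent nor handle is treated as an error).

```c
#include <assert.h>
#include <stdbool.h>

enum demo_action {
    DEMO_CREATE_N_GRAFT,   /* create a new qdisc, then graft it */
    DEMO_GRAFT,            /* graft an existing qdisc */
    DEMO_CHANGE,           /* change parameters of an existing qdisc */
    DEMO_ERROR
};

/* Hypothetical inputs standing in for the lookups in the slides. */
struct demo_lookup {
    bool parent_set;     /* tcm->tcm_parent is specified */
    bool handle_set;     /* tcm->tcm_handle is not empty */
    bool leaf_found;     /* qdisc_leaf found a qdisc under the parent */
    bool handle_found;   /* qdisc_lookup by tcm_handle succeeded */
};

enum demo_action demo_modify_decision(struct demo_lookup in)
{
    if (in.parent_set) {
        if (in.handle_set)
            return in.handle_found ? DEMO_GRAFT : DEMO_CREATE_N_GRAFT;
        /* Empty handle: create only when nothing sits at the parent.
         * Treating the other case as a change is an assumption. */
        return in.leaf_found ? DEMO_CHANGE : DEMO_CREATE_N_GRAFT;
    }
    /* No parent: the qdisc must be found by handle to be changed.
     * Returning an error otherwise is an assumption of this sketch. */
    if (!in.handle_set || !in.handle_found)
        return DEMO_ERROR;
    return DEMO_CHANGE;
}
```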
create_n_graft
• call qdisc_create(dev, tcm->tcm_handle, tca, &err)
qdisc_create
• find qdisc’s kind
• using kind, get ‘Qdisc_ops’
• allocate space for Q discipline
• call ‘skb_queue_head_init’
• set up ‘enqueue’, ‘dequeue’
• call ‘ops->init’ = prio_init
• insert new Q into qdisc_list
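The lookup-by-kind and init steps can be modelled in user space with a small registry of operations. Everything in the sketch below is hypothetical (`demo_qdisc_ops`, `demo_register`, `demo_qdisc_create`, and so on); it omits the packet queue and the enqueue/dequeue pointers and keeps only the kind lookup, allocation, init call, and list insertion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct demo_qdisc;

/* Hypothetical, trimmed-down analogue of Qdisc_ops. */
struct demo_qdisc_ops {
    struct demo_qdisc_ops *next;
    const char *id;                        /* the "kind", e.g. "prio" */
    int (*init)(struct demo_qdisc *sch);
};

struct demo_qdisc {
    struct demo_qdisc *next;               /* link in the created list */
    struct demo_qdisc_ops *ops;
    unsigned int handle;
    int initialized;
};

struct demo_qdisc_ops *demo_ops_base;      /* registered kinds */
struct demo_qdisc *demo_qdisc_list;        /* created qdiscs */

void demo_register(struct demo_qdisc_ops *ops)
{
    ops->next = demo_ops_base;
    demo_ops_base = ops;
}

static struct demo_qdisc_ops *demo_lookup_ops(const char *kind)
{
    struct demo_qdisc_ops *ops;

    for (ops = demo_ops_base; ops; ops = ops->next)
        if (strcmp(ops->id, kind) == 0)
            return ops;
    return NULL;
}

/* Find ops by kind, allocate, run init, and link into the list.
 * Returns NULL if the kind is unknown, allocation or init fails. */
struct demo_qdisc *demo_qdisc_create(const char *kind, unsigned int handle)
{
    struct demo_qdisc_ops *ops = demo_lookup_ops(kind);
    struct demo_qdisc *sch;

    if (!ops)
        return NULL;
    sch = calloc(1, sizeof(*sch));
    if (!sch)
        return NULL;
    sch->ops = ops;
    sch->handle = handle;
    if (ops->init && ops->init(sch) != 0) {
        free(sch);
        return NULL;
    }
    sch->next = demo_qdisc_list;
    demo_qdisc_list = sch;
    return sch;
}

/* Example kind for the sketch; its init only marks the qdisc. */
static int demo_prio_init(struct demo_qdisc *sch)
{
    sch->initialized = 1;
    return 0;
}

struct demo_qdisc_ops demo_prio_ops = { NULL, "prio", demo_prio_init };
```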
graft
• call ‘qdisc_graft’
• connect ‘new’ to parent’s class or dev
• if parent Q discipline is empty, call ‘dev_graft_qdisc(dev,new)’
• else call ‘get’ from class
• call ‘qdisc_notify’
dev_graft_qdisc
• dev_deactivate
• put old ‘qdisc_sleeping’ to ‘oqdisc’
• if new Q is empty, set new Q to noop_qdisc
• then, set dev’s qdisc_sleeping to new Q, dev->qdisc to noop_qdisc
• reactivate device
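The swap performed by dev_graft_qdisc can be traced with a toy device structure. In the sketch below all types and names are hypothetical, locking and queue reset are omitted, and the reactivation step assumes that activating a device makes `qdisc` point at `qdisc_sleeping`.

```c
#include <assert.h>
#include <stddef.h>

struct demo_qdisc { const char *name; };

struct demo_dev {
    struct demo_qdisc *qdisc;            /* qdisc currently in use */
    struct demo_qdisc *qdisc_sleeping;   /* configured qdisc */
    int active;                          /* stand-in for "device is up" */
};

struct demo_qdisc demo_noop = { "noop" };

/* Replace the device's root qdisc with new_q and return the old one. */
struct demo_qdisc *demo_dev_graft(struct demo_dev *dev,
                                  struct demo_qdisc *new_q)
{
    int was_active = dev->active;
    struct demo_qdisc *old;

    /* Deactivate: nothing is queued while the swap happens. */
    dev->active = 0;
    dev->qdisc = &demo_noop;

    old = dev->qdisc_sleeping;
    if (new_q == NULL)
        new_q = &demo_noop;
    dev->qdisc_sleeping = new_q;
    dev->qdisc = &demo_noop;

    /* Reactivate (assumption: activation switches qdisc over to
     * qdisc_sleeping). */
    if (was_active) {
        dev->active = 1;
        dev->qdisc = dev->qdisc_sleeping;
    }
    return old;
}
```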
prio_get
• get minor class ID
prio_graft
• use minor class ID as index to select the band
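The mapping from class ID to band is a small piece of arithmetic: the low 16 bits of the class ID give a 1-based band number, which is then reduced by one to index the band array. The sketch below illustrates that under stated assumptions; the `DEMO_` macro and both function names are hypothetical, and returning 0 or -1 for an out-of-range value is a convention chosen here.

```c
#include <assert.h>

/* Hypothetical mask for the minor (low 16-bit) part of a class ID. */
#define DEMO_TC_H_MIN_MASK 0x0000FFFFU
#define DEMO_TC_H_MIN(h)   ((h) & DEMO_TC_H_MIN_MASK)

/* Returns the 1-based band number for classid, or 0 if the minor ID
 * does not name one of the configured bands. */
unsigned long demo_prio_get(unsigned int classid, unsigned int bands)
{
    unsigned long band = DEMO_TC_H_MIN(classid);

    /* For band == 0 the subtraction wraps around, so that case is
     * rejected by the same comparison. */
    if (band - 1 >= bands)
        return 0;
    return band;
}

/* Converts the value returned by demo_prio_get into an array index,
 * or -1 if it is out of range. */
int demo_prio_band_index(unsigned long arg, unsigned int bands)
{
    unsigned long band = arg - 1;

    if (band >= bands)
        return -1;
    return (int)band;
}
```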
qdisc_change
• directly call ‘sch->ops->change’
• change = prio_tune
prio_tune
• argument opt contains ‘bands’
• bands outside the configured range are set to ‘noop_qdisc’
• update child Q by ‘prio2band array’
• if Q == noop_qdisc, call qdisc_create_dflt
• qdisc_create_dflt: set up child Q, set up operator to ‘pfifo_qdisc_ops’
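The prio_tune steps can be modelled in user space as follows. This is a sketch under stated assumptions rather than the kernel routine: all `demo_`/`DEMO_` names and the array limits are hypothetical, child queues are plain heap objects, and the validation at the top is an assumption about what a sensible implementation would check.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_MAX_BANDS 16   /* hypothetical upper limit on bands */
#define DEMO_PRIO_MAX  15   /* hypothetical highest priority value */

struct demo_qdisc { const char *kind; };

static struct demo_qdisc demo_noop = { "noop" };

struct demo_prio {
    int bands;
    unsigned char prio2band[DEMO_PRIO_MAX + 1];
    struct demo_qdisc *queues[DEMO_MAX_BANDS];
};

/* Hypothetical stand-in for creating a default pfifo child. */
static struct demo_qdisc *demo_create_dflt(void)
{
    struct demo_qdisc *q = malloc(sizeof(*q));

    if (q)
        q->kind = "pfifo";
    return q;
}

/* Start with every band pointing at the no-op qdisc. */
void demo_prio_setup(struct demo_prio *q)
{
    int i;

    q->bands = 0;
    for (i = 0; i <= DEMO_PRIO_MAX; i++)
        q->prio2band[i] = 0;
    for (i = 0; i < DEMO_MAX_BANDS; i++)
        q->queues[i] = &demo_noop;
}

/* Apply a new band count and priority map. Returns 0, or -1 if the
 * arguments are invalid (nothing is modified in that case). */
int demo_prio_tune(struct demo_prio *q, int bands,
                   const unsigned char *priomap)
{
    int i;

    if (bands < 2 || bands > DEMO_MAX_BANDS)
        return -1;
    for (i = 0; i <= DEMO_PRIO_MAX; i++)
        if (priomap[i] >= bands)
            return -1;

    q->bands = bands;
    for (i = 0; i <= DEMO_PRIO_MAX; i++)
        q->prio2band[i] = priomap[i];

    /* Bands outside the new range fall back to the no-op qdisc. */
    for (i = bands; i < DEMO_MAX_BANDS; i++) {
        if (q->queues[i] != &demo_noop)
            free(q->queues[i]);
        q->queues[i] = &demo_noop;
    }

    /* Every band the map refers to gets a default child if it is
     * still the no-op qdisc. */
    for (i = 0; i <= DEMO_PRIO_MAX; i++) {
        int band = q->prio2band[i];

        if (q->queues[band] == &demo_noop) {
            struct demo_qdisc *child = demo_create_dflt();

            if (child)
                q->queues[band] = child;
        }
    }
    return 0;
}
```

Only bands that the priority map actually references receive a child queue; a band that is in range but never referenced stays on the no-op qdisc in this model.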