DBLA: Distributed Block Learning Algorithm for Channel Selection in Cognitive Radio Networks
Chowdhury Sayeed Hyder and Li Xiao
Department of Computer Science & Engineering, Michigan State University
Outline
• Background
  • Cognitive Radio Network
  • Channel Selection Problem
• Distributed Block Learning Algorithm
  • Decision Period
  • Channel Ranking
  • Channel Switching
• Simulation Results
  • Regret
  • Switching cost
WoWMoM 2012
Background
Figure: Underutilized Spectrum
Figure: Current Spectrum Allocation in US
Ref: I. Akyildiz, W. Lee, M. Vuran, and S. Mohanty, “NeXt Generation/Dynamic Spectrum Access/Cognitive Radio Wireless Networks: A Survey,” Computer Networks, 2006
Background
• Current Status
  • Spectrum scarcity
  • Underutilized spectrum
• Cognitive radio (CR)
  • Adapts its transmission and reception parameters (frequency, modulation rate, power, etc.)
• Cognitive Radio Network
  • Two types of users
    • Primary user or licensed user (PU)
    • Secondary user or opportunistic user (SU)
  • Requirements
    • An SU must not affect the ongoing transmissions of PUs
    • An SU must vacate the spectrum when a PU arrives
Problem Statement
• Channel Selection Problem
  • Unknown PU activity
  • Time-varying channel conditions
  • Channel switching is not free!
  • Learning algorithm (exploration vs. exploitation)
• Our goal is to design a distributed learning algorithm that minimizes regret, minimizes switching cost, and adapts to time-varying channels.
Problem Statement
Equation (labels only):
• The expected regret following policy ρ
• Difference in reward between optimal channel selection and channel selection by any learning algorithm
• Switching regret
Problem Statement
Equations (labels only):
• The expected reward following optimal policy ρ
• The expected reward following centralized policy ρcent
• The expected reward following distributed policy ρdist
Problem Statement
• Switching regret
  • Number of switches × unit switching cost
  • Defined as the number of packets that could have been transmitted during that time if the user had not switched channels.
• Unit switching cost = switching delay / estimated packet transmission time
Ref: Y. Xiao and F. Hu, Cognitive Radio Networks, CRC Press, 2008
Problem Statement
Equations (labels only):
• The expected regret following centralized policy ρcent
• The expected regret following distributed policy ρdist
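The regret bookkeeping described on these slides can be sketched in a few lines. This is a minimal illustration, not the authors' formula: it assumes the total regret is the reward gap plus the switching regret, and the function names and the numbers in the example are hypothetical.

```python
def unit_switching_cost(switching_delay, packet_tx_time):
    """Switching delay expressed in packet transmission times."""
    return switching_delay / packet_tx_time


def total_regret(optimal_reward, achieved_reward, num_switches, unit_cost):
    """Reward gap plus switching regret (additive form is an assumption)."""
    reward_gap = optimal_reward - achieved_reward
    switching_regret = num_switches * unit_cost
    return reward_gap + switching_regret


# Hypothetical numbers: a switch takes half a packet time, the user
# collected 900 of 1000 achievable rewards and switched 40 times.
cost = unit_switching_cost(switching_delay=0.5, packet_tx_time=1.0)
regret = total_regret(1000.0, 900.0, num_switches=40, unit_cost=cost)
```

With these inputs the reward gap is 100 and the switching regret is 40 × 0.5 = 20, so frequent switching can contribute a noticeable share of the total.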
Distributed Block Learning Algorithm
• Formulate the channel selection problem as a multi-armed bandit problem with multiple plays and switching cost.
• Present a distributed ‘block’ approach in which each user selects a channel independently
  • Decision period (when)
  • Channel ranking (on what)
  • Channel switching (why)
  • Channel adaptation (how)
Decision Period
• Block and frame:
  • Time slots are arranged in blocks, and blocks are grouped into frames.
  • Block length increases linearly and frame length increases exponentially with the frame number
  • All blocks in a frame are of equal length
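One way to picture the block and frame layout is the sketch below. The slides do not give the exact lengths, so the choices here are assumptions: frame f uses blocks of f slots and holds enough blocks to cover roughly 2^f slots. The name `block_schedule` is hypothetical.

```python
import math


def block_schedule(total_slots):
    """Yield (frame, block_start_slot, block_length) tuples.

    Assumed layout: block length equals the frame number (linear growth)
    and each frame spans about 2**frame slots (exponential growth).
    """
    t = 0
    frame = 1
    while t < total_slots:
        block_len = frame
        n_blocks = math.ceil(2 ** frame / block_len)
        for _ in range(n_blocks):
            if t >= total_slots:
                return
            yield frame, t, block_len
            t += block_len  # the last block may run past total_slots
        frame += 1


schedule = list(block_schedule(30))
decision_slots = [start for _, start, _ in schedule]
```

Under these assumptions a user would revisit its channel choice only at the block starts in `decision_slots`, so decisions become rarer as the frame number grows.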
Channel Ranking
• Channel ranking is based on
  • Time-average statistics: what we have already got from the channel
  • Upper-bound statistics: what we expect from the channel
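The two statistics can be illustrated with a UCB1-style index. The slides do not state DBLA's actual formula, so the confidence term below is a stand-in borrowed from the standard UCB1 bandit algorithm, and the function names are hypothetical.

```python
import math


def channel_statistics(idle_counts, sense_counts, t):
    """Return (time_average, upper_bound) for each channel.

    time_average: observed idle fraction so far.
    upper_bound: time_average plus a UCB1-style confidence term (assumed).
    """
    stats = []
    for idle, n in zip(idle_counts, sense_counts):
        if n == 0:
            # A channel never sensed gets an infinite bound so it is tried.
            stats.append((0.0, math.inf))
            continue
        avg = idle / n
        bound = avg + math.sqrt(2 * math.log(t) / n)
        stats.append((avg, bound))
    return stats


def rank_channels(idle_counts, sense_counts, t):
    """Channel indices sorted by upper bound, best first."""
    stats = channel_statistics(idle_counts, sense_counts, t)
    return sorted(range(len(stats)), key=lambda ch: stats[ch][1], reverse=True)


# Channel 0: idle 8 of 10 times, channel 1: 3 of 10, channel 2: never sensed.
ranking = rank_channels([8, 3, 0], [10, 10, 0], t=20)
```

Here the unsensed channel ranks first, followed by the channel with the higher observed idle fraction.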
Channel Switching
• Only one candidate channel (chosen round robin) is compared with the current channel at each decision period
• Channel switching rule
  • If the candidate channel has a higher expectation than the current one.
  • If the current channel is not in the top rank
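A minimal sketch of the round-robin comparison and the switching rule follows. Two points are assumptions because the slide leaves them open: the two conditions are combined with "or", and "top rank" is read as the best `num_users` channels. All names are hypothetical.

```python
def next_candidate(current, last_candidate, num_channels):
    """Round robin: the channel after last_candidate, skipping the current one.

    Assumes num_channels >= 2.
    """
    cand = (last_candidate + 1) % num_channels
    if cand == current:
        cand = (cand + 1) % num_channels
    return cand


def should_switch(current, candidate, expectation, num_users):
    """Apply the two switching conditions (combined with 'or' by assumption)."""
    ranking = sorted(range(len(expectation)),
                     key=lambda ch: expectation[ch], reverse=True)
    top = set(ranking[:num_users])  # assumed meaning of "top rank"
    candidate_is_better = expectation[candidate] > expectation[current]
    return candidate_is_better or current not in top


expectation = [0.9, 0.5, 0.7]
candidate = next_candidate(current=1, last_candidate=0, num_channels=3)
decision = should_switch(current=1, candidate=candidate,
                         expectation=expectation, num_users=2)
```

Comparing against a single candidate per decision period keeps the number of possible switches per block at one, which is what limits the switching cost.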
Channel Adaptation
• Opportunity cost
  • Increase the expectation of other channels if the idle rate of the current channel is not consistent with its overall idle rate.
  • Increases the probability of switching
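The opportunity-cost idea can be sketched as below. The slide does not say how inconsistency is measured or how much the other channels are boosted, so the `tolerance` and `bonus` parameters, their default values, and the function name are all assumptions made for illustration.

```python
def adapt_expectations(expectation, current, recent_idle_rate,
                       overall_idle_rate, tolerance=0.2, bonus=0.1):
    """Raise the expectation of every other channel when the current
    channel's recent idle rate deviates from its overall idle rate.

    tolerance and bonus are illustrative values, not taken from the slides.
    """
    if abs(recent_idle_rate - overall_idle_rate) <= tolerance:
        return list(expectation)  # consistent: leave expectations unchanged
    return [e if ch == current else e + bonus
            for ch, e in enumerate(expectation)]


# The current channel (0) was idle 80% of the time overall,
# but only 30% of the time in the most recent block.
adapted = adapt_expectations([0.8, 0.5, 0.6], current=0,
                             recent_idle_rate=0.3, overall_idle_rate=0.8)
```

Because only the other channels are boosted, a candidate is more likely to beat the current channel at the next decision period, which raises the probability of switching.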
Simulation
• Simulator: NS2
• Channels’ idle probability follows a Bernoulli distribution
• Number of channels: 9
• Number of users: 4-8
• Time slots: 50000
• Unit switching cost: 0.5
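The channel model of this setup can be mimicked in plain Python as a rough stand-in for the NS2 configuration. The sketch assumes each channel's idle probability is drawn uniformly at random (the slides do not say how the probabilities were chosen) and that the optimal benchmark places the users on the channels with the highest idle probabilities. Function names are hypothetical.

```python
import random

NUM_CHANNELS = 9
TIME_SLOTS = 50_000
UNIT_SWITCHING_COST = 0.5


def draw_idle_probabilities(num_channels, rng):
    """Assumed: idle probabilities drawn uniformly from [0, 1)."""
    return [rng.random() for _ in range(num_channels)]


def sample_slot(idle_probs, rng):
    """One Bernoulli draw per channel: True means idle in this slot."""
    return [rng.random() < p for p in idle_probs]


def optimal_expected_reward(idle_probs, num_users, time_slots):
    """Expected reward if users occupy the num_users best channels (assumed benchmark)."""
    best = sorted(idle_probs, reverse=True)[:num_users]
    return sum(best) * time_slots


rng = random.Random(0)  # fixed seed for repeatable runs
idle_probs = draw_idle_probabilities(NUM_CHANNELS, rng)
states = sample_slot(idle_probs, rng)
benchmark = optimal_expected_reward(idle_probs, num_users=4,
                                    time_slots=TIME_SLOTS)
```

A learning policy's collected reward over the same horizon can then be subtracted from `benchmark` to obtain its reward gap.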
Results (Regret)
• DBLA outperforms RAND in terms of regret minimization
Figure: Normalized regret vs. time (with and without switching cost)
Ref (ρrand): A. Anandkumar, N. Michael, and A. Tang, “Opportunistic Spectrum Access with Multiple Users: Learning Under Competition,” INFOCOM 2010
Results (Scalability)
• With RAND, regret increases exponentially, whereas with DBLA the rate of change in regret is almost linear.
Results (Switching)
Figure: Regret vs. switching cost
Figure: Number of switches vs. number of users
• DBLA has much lower regret and fewer switches than RAND
Results (Adaptability)
• Channel idle probabilities change every 10000 slots
Conclusion & Future Work
• Learning algorithm to rank channels which
  • minimizes regret
  • minimizes switching
  • is scalable
  • adapts to dynamic channel conditions
• Future Work
  • More realistic channel model
  • Theoretical analysis and proof of the upper bound