This research aims to test the belief that Bayesian Decision Making (DM) is a universal tool for fair and cost-efficient proposal ranking. It explores a promising negotiation methodology and provides insights for both FET and proposing researchers.
Miroslav Kárný
Department of Adaptive Systems
Institute of Information Theory and Automation
Academy of Sciences of the Czech Republic
school@utia.cas.cz, http://as.utia.cas.cz
… speaker’s home institute … nickname for the Institute of Information Theory and Automation
Cybernetics: Communication & Control in Machines & Animals
Cybernetics is the speaker’s research domain and has led to applications in:
• Adaptive control of paper machines, rolling mills, drum boilers, …
• Nuclear medicine modeling & DM, dynamic image studies …
• Support of operators of complex systems (FET)
• Traffic control in cities, optimization of financial strategies
• Multiple participants’ DM and E-democracy …
Bayesian DM: a single horse on a decades-long trip with a good team
FET organizes a review process … to select the best proposals p among all submitted proposals:
• An expert e assigns marks emp ∈ {0, …, M} to several proposals within a small group ep of proposals
• A small group of experts pe, reviewing the proposal p, harmonizes the final mark mp via discussion
• The assembly of all experts completely ranks all proposals
• The EC supports the top proposals up to a budget-implied border line
Addressed problem
The procedure is good & fair … up to the extremely disturbing step:
• An expert e assigns marks emp ∈ {0, …, M} to several proposals within a small group ep of proposals
• A small group of experts pe, reviewing the proposal p, harmonizes the final mark mp via discussion
• The assembly of all experts completely ranks all proposals
Why the final ranking is disturbing:
• Each expert e has studied only a tiny portion of all proposals
• Experts’ marks emp are subjectively scaled
• Discrete-valued marks cause many coincidences (ties)
• The time slot of the assembly is strongly limited
⇒ errors, manipulation, expenses
Aims
… of the research:
• to test the belief that Bayesian DM is an (almost) universal tool relying on proper modeling only
• to test a promising negotiation methodology needed in other contexts, too
… of the talk:
• to help FET be fair and cost-efficient
• to help proposing researchers be treated fairly
• to share the fun (?) of the conclusions
Basic idea
• Experts serve as rank-measuring devices
• A project proposal p has an objective rank rp
• Ranking = estimation of the rank rp from the marks emp, which are noise-corrupted observations of the objective rank
Guide
• Experts as measuring devices
• Prior knowledge
• MAP estimate
• Experimental results
• Discussion
Experts as measuring devices
emp = rp + ee, where
• emp … mark of proposal p by the expert e
• rp … objective rank of proposal p
• ee … personal error of expert e
Experts try to be fair: the mark emp is proportional to rp, and the error ee is independent of p.
The personal error decomposes as ee = eb + eε, where
• eb … personal bias
• eε … personal fluctuations with variance ev
Interpretation of marks: the top mark M may mean anything from “Nobel Prize” to merely “flawless”.
Simplicity & maximum entropy: eε is assumed to be Gaussian.
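A minimal sketch of this observation model (Python; the sizes, ranges and variable names are illustrative assumptions, not taken from the talk): each expert marks only a small group of proposals, and every mark is the objective rank plus the expert’s bias plus Gaussian noise.

```python
import numpy as np

# Sketch of the slide's model: mark emp = rank rp + bias eb + Gaussian noise.
# All sizes and ranges below are illustrative assumptions.
rng = np.random.default_rng(0)
M = 30.0                     # largest mark on the scale
P, E = 40, 12                # number of proposals and experts

r_true = rng.uniform(0, M, P)    # objective ranks rp (unknown in practice)
b_true = rng.uniform(-3, 3, E)   # personal biases eb
v_true = rng.uniform(1, 9, E)    # personal noise variances ev

# Each expert marks only a small group of proposals -> sparse set of marks.
marks = {}                       # (e, p) -> mark emp
for e in range(E):
    for p in rng.choice(P, size=6, replace=False):
        marks[(e, p)] = r_true[p] + b_true[e] + rng.normal(0.0, np.sqrt(v_true[e]))
```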
Prior knowledge
Needed because the number of data is only about 1–2 times the number of unknowns, and the data alone cannot separate ranks from biases:
emp = rp + eb + eε = (rp − C) + (eb + C) + eε, for any C
Available:
• rank rp ∈ [0, largest mark], i.e., rp ∈ [0, M]
• bias eb ∈ [−M, M]
• noise variance ev ∈ [0, M²]
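The shift by C above is a genuine non-identifiability: the likelihood cannot distinguish a common offset moved between ranks and biases, which is exactly what the prior boxes resolve. Continuing the simulated marks from the previous sketch (the Gaussian likelihood is the assumption from the preceding slide):

```python
def log_lik(marks, r, b, v):
    """Gaussian log-likelihood of all observed marks under the slide's model."""
    return sum(-0.5 * ((m - r[p] - b[e]) ** 2 / v[e] + np.log(2 * np.pi * v[e]))
               for (e, p), m in marks.items())

# Shifting all ranks by -C and all biases by +C leaves the fit unchanged for
# any C, so only the prior boxes rp in [0, M], eb in [-M, M], ev in [0, M^2]
# pin the estimates down.
C = 5.0
assert np.isclose(log_lik(marks, r_true, b_true, v_true),
                  log_lik(marks, r_true - C, b_true + C, v_true))
```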
MAP estimate
The posterior log-likelihood function is:
• smoothly dependent on the estimated r, b, v
• concave in the estimated r, b, v
• defined on a convex domain ⇒ unique maximum
• harmonised domain and data range ⇒ maximum in the interior
Evaluation: the conditions for the extremum are solved by successive approximations … fast, simple and reliable … can be used “on-line”.
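A hedged reconstruction of such a successive-approximation scheme, continuing the sketch above: with flat priors on the boxes, the stationarity conditions give closed-form coordinate updates for the ranks, biases and variances, each projected back onto its prior box. The concrete update formulas and starting values are my assumptions, not the authors’ published algorithm.

```python
def map_estimate(marks, P, E, M, iters=50):
    """Successive approximations to the MAP estimates of ranks, biases and
    variances; a sketch assuming flat priors on the boxes of the previous slide."""
    r = np.full(P, M / 2)            # ranks, started mid-scale
    b = np.zeros(E)                  # biases
    v = np.full(E, (M / 4) ** 2)     # noise variances
    by_p = {p: [(e, m) for (e, q), m in marks.items() if q == p] for p in range(P)}
    by_e = {e: [(q, m) for (f, q), m in marks.items() if f == e] for e in range(E)}
    for _ in range(iters):
        for p, obs in by_p.items():  # rank: precision-weighted mean of debiased marks
            if obs:
                w = np.array([1.0 / v[e] for e, _ in obs])
                y = np.array([m - b[e] for e, m in obs])
                r[p] = np.clip(w @ y / w.sum(), 0.0, M)
        for e, obs in by_e.items():  # per-expert bias and fluctuation variance
            if obs:
                res = np.array([m - r[q] for q, m in obs])
                b[e] = np.clip(res.mean(), -M, M)
                # keep the variance strictly positive inside its box [0, M^2]
                v[e] = np.clip(((res - b[e]) ** 2).mean(), 1e-6, M ** 2)
    return r, b, v

r_hat, b_hat, v_hat = map_estimate(marks, P, E, M)
print(np.corrcoef(r_true, r_hat)[0, 1])  # rough check of rank recovery
```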
Experiments: proposals’ viewpoint
Processed marks m ∈ {0, 0.5, …, 30}; assembly (A) ranking available.
Extreme cases:

                                    small case   large case
#Proposals                                  32         1341
#Experts                                    33          588
Acceptance threshold T                      22           25
#proposals above T by A                     11          157
#proposals above T by us                    16           72
#proposals chosen by A and us               11           57
common acceptance / A-one [%]              100           36

• typical numbers
• the prior does not spoil the results when data are few
[Figure: histogram of the rank estimates; box width about 2 % of the mark range!
#(r > T = 25) = 57 in the large case; #(r > T = 22) = 11 in the small case.]
Experiments: experts’ viewpoint

                              small case   large case
mean(bias) / T [%]                     6            4
minimum(bias) / T [%]                -13          -45
maximum(bias) / T [%]                 15           13
mean(std. dev.) / T [%]               13           12
minimum(std. dev.) / T [%]            10            7
maximum(std. dev.) / T [%]            21           38

The box width containing a significant number of proposals is 3 % of T!
Discussion
Evaluation aspects:
• it works
• it exhibits fast and reliable convergence
• it is reasonably robust to variations of the prior statistics
Operational aspects:
• it can substitute for, or at least support, the assembly ranking
• it allows continuous-valued marking
• it avoids the need to harmonize marks within pe
• it makes the ranking less sensitive to experts’ biases & variations
• it suppresses lottery-type results for gray-zone proposals (those ranked around the threshold)
• it makes the evaluation more objective
Discussion
Quality assurance aspects:
• it checks the reliability of experts via their biases & variances: 70–80 % of experts are o.k., but the unreliable or cheating rest still forms a significant portion
• it allows tracking of “bad” experts
• it opens a way to relate the prior & posterior ranking, i.e., to the achieved results of the supported projects
Methodological aspects:
• it can be tailored to other problems
• it can serve as a tool supporting negotiation
Future
• alternative models of experts, e.g., non-normal or Markov-chain type
• comparison of prior and posterior ranking
• application to other negotiation-type processes
• application to individual marks & thresholds
• quality assurance of the evaluation, including experts’ competence!