k-NN Classifier Experiment

1. Objective
Master the basic idea of the k-NN (k-nearest-neighbor) algorithm; understand the factors that influence its performance; write a program that correctly classifies real pattern samples.

2. Experiment
(1) Generate two-dimensional normally distributed patterns and randomly split the generated sample set into a reference set and a test set (a minimal sketch of this step follows the list);
(2) Estimate the misclassification probability and analyze the causes of the misclassifications;
(3) (Optional) Repeat the above experiment on a video dataset or a banknote dataset.
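The split in step (1) can be sketched as follows. This is only an illustration, not part of the course code: mvnrnd (Statistics and Machine Learning Toolbox) stands in for the NormRandomFun helper used later in this handout, and the parameters and the 50/50 split ratio are arbitrary choices.

% GenAndSplit.m -- illustrative sketch: generate 2-D normal samples
% and split them at random into a reference set and a test set.
N     = 100;
mu    = [-1, 0];
Sigma = [1, 0.5; 0.5, 1];
X = mvnrnd(mu, Sigma, N);      % N samples ~ N(mu, Sigma), one per row
idx  = randperm(N);            % random permutation of the sample indices
Xref = X(idx(1:N/2), :);       % first half  -> reference set
Xtst = X(idx(N/2+1:end), :);   % second half -> test set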
3. Extension (optional): moving-object detection in video (1 of 6)

Task: using the principles of statistical pattern recognition, design a program that detects moving objects in a video.

Figure 1 (block diagram of moving-object detection in video) outlines the pipeline:
- Input video: feed in the pixel values s(x,y,t) frame by frame, point by point.
- GMM (Gaussian Mixture Model): GMM_i ~ N(μ_i, Σ_i), i = 1, 2, ..., c.
- GMM generation/update: build or update the c Gaussian models GMM_i, i = 1, 2, ..., c, by the method of moments.
- GMM Bayes classifier: decide to which GMM_i the pixel s(x,y,t) belongs, i = 1, 2, ..., c.
- Object decision: mark the moving objects.
3. Extension (optional): moving-object detection in video (2 of 6)

I. Input data structure: Frames, an h×w×n three-dimensional array.

s(x,y,t) = Frames(x,y,t), with 1 ≤ x ≤ h, 1 ≤ y ≤ w, 1 ≤ t ≤ n; Frames(x,y,1:n) is the sequence of values that pixel (x,y) takes over the n frames.

[Figure 2: video data structure -- frames 1, 2, ..., n of size h×w stacked along the time axis t.]
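A minimal sketch of filling Frames from a video file, using the standard VideoReader interface and assuming a color video; 'traffic.avi' is a placeholder file name:

% LoadFrames.m -- sketch: read a video into the h-by-w-by-n Frames array
v = VideoReader('traffic.avi');   % placeholder file name
h = v.Height;  w = v.Width;
Frames = zeros(h, w, 0);
t = 0;
while hasFrame(v)
    t = t + 1;
    Frames(:,:,t) = double(rgb2gray(readFrame(v)));  % grayscale s(x,y,t)
end
n = t;  % total number of frames
% (preallocating Frames from the video duration would be faster;
%  the growing array keeps the sketch short)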
3. Extension (optional): moving-object detection in video (3 of 6)

II. GMM: Gaussian Mixture Model, GMM_i ~ N(μ_i, Σ_i), i = 1, 2, ..., c.

Table 1: the GMM(x,y) parameters.

GMM data structure: GMM(x,y,θ,i), θ = (P(ω_i), μ_i, σ_i)′ -- an h×w×3×c four-dimensional array. For each pixel (x,y) and model i, the three slots θ = 1, 2, 3 hold the sample count and the first and second sample moments, from which P(ω_i), μ_i and σ_i are estimated.
3. Extension (optional): moving-object detection in video (4 of 6)

III. GMM initialization

GMM data structure: GMM(x,y,θ,i), θ = (P(ω_i), μ_i, σ_i)′ -- an h×w×3×c four-dimensional array.

c = 3;                            % assume 3 Gaussian models per pixel
GMM = zeros(h, w, 3, c);          % GMM_2 and GMM_3 start empty
GMM(:,:,1,1) = ones(h, w);        % GMM_1 contains a single point
GMM(:,:,2,1) = Frames(:,:,1);     % first moment (mean) of GMM_1 = first frame
GMM(:,:,3,1) = Frames(:,:,1).^2;  % second moment of GMM_1 = first frame, squared elementwise
3. Extension (optional): moving-object detection in video (5 of 6)

IV. GMM update (per-pixel flow)

for k = 2:n, x = 1:h, y = 1:w:
  s = Frames(x,y,k);
  Does s match an existing model, i.e. s ∈ GMM(x,y,:,i) for some i = 1:c?
  - Yes: update that model GMM(x,y,:,j), j = i:
        GMM(x,y,1,j)++;  GMM(x,y,2,j) += s;  GMM(x,y,3,j) += s^2;
  - No: merge the two most similar models GMM(x,y,:,j1) and GMM(x,y,:,j2),
        freeing the slot j = arg min{GMM(x,y,1,j1), GMM(x,y,1,j2)},
        and build a new model there:
        GMM(x,y,1,j) = 1;  GMM(x,y,2,j) = s;  GMM(x,y,3,j) = s^2;
  Finally, decide whether s is background.
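The flow above leaves the match test and the similarity measure open. The following sketch fills them in with common choices that are assumptions, not part of the handout: s matches model i when |s - μ_i| ≤ T·σ_i (T = 2.5 here is an arbitrary choice), and the "most similar" pair is the pair with the closest means.

function GMM = UpdateGMMPixel(GMM, x, y, s, c, T)
% One update step for pixel (x,y) with the new value s (sketch).
Ni = squeeze(GMM(x,y,1,:));                % per-model sample counts
mu = squeeze(GMM(x,y,2,:)) ./ max(Ni, 1);  % mean estimates, sum(s)/N_i
sg = sqrt(max(squeeze(GMM(x,y,3,:)) ./ max(Ni, 1) - mu.^2, 1)); % std., floored at 1
j = find(Ni > 0 & abs(s - mu) <= T*sg, 1); % first matching model, if any
if ~isempty(j)
    GMM(x,y,1,j) = GMM(x,y,1,j) + 1;       % s matches model j:
    GMM(x,y,2,j) = GMM(x,y,2,j) + s;       % accumulate its count
    GMM(x,y,3,j) = GMM(x,y,3,j) + s^2;     % and moments
else
    if any(Ni == 0)
        j = find(Ni == 0, 1);              % use a free slot if one exists
    else
        best = inf;                        % find the pair with closest means
        for a = 1:c
            for b = a+1:c
                if abs(mu(a) - mu(b)) < best
                    best = abs(mu(a) - mu(b)); j1 = a; j2 = b;
                end
            end
        end
        if Ni(j1) <= Ni(j2), j = j1; jk = j2; else, j = j2; jk = j1; end
        GMM(x,y,:,jk) = GMM(x,y,:,j1) + GMM(x,y,:,j2); % merge counts and moments
    end
    GMM(x,y,1,j) = 1;                      % new model built from s alone
    GMM(x,y,2,j) = s;
    GMM(x,y,3,j) = s^2;
end
end

A driver would call GMM = UpdateGMMPixel(GMM, x, y, Frames(x,y,k), c, 2.5) inside the triple loop over k, x and y.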
3. Extension (optional): moving-object detection in video (6 of 6)

V. Background decision

μ(:)  = GMM(x,y,2,:) ./ GMM(x,y,1,:)
σ²(:) = GMM(x,y,3,:) ./ GMM(x,y,1,:) - μ(:).^2
g(:)  = GMM(x,y,1,:) ./ σ²(:)

if j == arg max{g(:)}, then s is background.
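A direct transcription of this rule for one pixel; the max(..., eps) guard against zero variance is a numerical-safety addition, not in the slides:

function bg = IsBackground(GMM, x, y, j)
% s (matched to model j) is background iff j is the model with the
% largest fitness g = N/sigma^2: many samples, small variance.
Ni  = squeeze(GMM(x,y,1,:));
mu  = squeeze(GMM(x,y,2,:)) ./ max(Ni, 1);
sg2 = squeeze(GMM(x,y,3,:)) ./ max(Ni, 1) - mu.^2;
g   = Ni ./ max(sg2, eps);     % guard against division by zero
[~, jmax] = max(g);
bg = (j == jmax);
end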
% OneNN_Exe_6_3.m  Solve Exercise 2.16 with 1-NN
clear all; close all; clc;
X1 = [1 0; 0 1; 0 -1];        % class-1 sample set
X2 = [0 0; 0 2; 0 -2; -2 0];  % class-2 sample set
sX{1} = X1; sX{2} = X2;
figure; title('1-NN for Exe 6.3');
xlabel('x'); ylabel('y');
hold on;
plot(X1(:,1),X1(:,2),'ro'); plot(X2(:,1),X2(:,2),'bs');
for y = -2.5:0.1:2.5
    for x = -2.5:0.1:2.5
        z = OneNNFun(sX,[x y]);
        if (z == 1)
            plot(x,y,'m*');
        elseif z == 2
            plot(x,y,'b.');   % marker added: 'b' alone draws no visible point
        end
    end
end
function ClassNo = OneNNFun(sX,x)
% 1-NN nearest-neighbor classification function
% Input:
%   sX: sample set, a cell array;
%       ClassNum = length(sX) is the number of classes;
%       sX{i}: Ni-by-d matrix whose rows are observations (sample
%       vectors) and whose columns are variables (features), where
%       Ni = length(sX{i}(:,1)) is the number of samples in class i,
%       d is the dimension of the sample vectors, i = 1:ClassNum;
%   x:  1-by-d sample vector to be classified.
% Output:
%   ClassNo: the class to which x belongs, by the 1-NN algorithm.
ClassNum = length(sX);
ClassNo = 0;
if (ClassNum < 2)
    return;
end
Index = 0;
for i = 1:ClassNum
    N = length(sX{i}(:,1));
    for j = 1:N
        Index = Index + 1;
        D(Index) = norm(x - sX{i}(j,:)); % distance from x to each sample
    end % next j
    k(i) = Index;                        % cumulative end index of class i
end % next i
[Min, IX] = min(D);                      % index of the nearest neighbor
ClassNo = 1;
for i = 1:ClassNum
    if IX > k(i)
        ClassNo = ClassNo + 1;           % past class i's segment: next class
    else
        break;
    end
end % next i
% EditKnn.m
%--- Edited k-NN algorithm
clear all; close all; clc;
N = 100;       % the number of samples per class
BestK = 3;
mu1 = [-1,0]';  mu2 = [1,0]';
Cov1 = [1,0.5;0.5,1];  Cov2 = [1,-0.5;-0.5,1];
%--- generate N random vectors per class with the above parameters
%    (see the sketches after this script for possible NormRandomFun
%     and EditKnnFun implementations)
X1 = NormRandomFun(mu1,Cov1,N);
X2 = NormRandomFun(mu2,Cov2,N);
%--- build the sample-set cell array
oX{1} = X1; oX{2} = X2;
%--- edit the samples
sX = EditKnnFun(oX,BestK);
%--- test the 1-NN classifier's error rate with a second sample group
X11 = NormRandomFun(mu1,Cov1,N); % generate the test sample sets
X22 = NormRandomFun(mu2,Cov2,N);
E12 = 0;  % probability of misclassifying class 1 as class 2
E21 = 0;  % probability of misclassifying class 2 as class 1
for i = 1:N
    y1 = KnnFun(X11(i,:),sX,1);
    if (y1 ~= 1)
        E12 = E12 + 1;  % class 1 misclassified as class 2
    end
    y1 = KnnFun(X22(i,:),sX,1);
    if (y1 ~= 2)
        E21 = E21 + 1;  % class 2 misclassified as class 1
    end
end
E12 = E12/N
E21 = E21/N
Pe = (E21+E12)/2
%--- display the sample distribution
delta = 0.2;
[X,Y] = meshgrid(-5:delta:5);
d = length(X(1,:));
figure;
%--- plot the original sample sets
plot(X1(:,1),X1(:,2),'ro');
hold on;
plot(X2(:,1),X2(:,2),'bo');
%--- mark the edited sample sets
plot(sX{1}(:,1),sX{1}(:,2),'ro', ...
    'MarkerEdgeColor','k','MarkerFaceColor','r');
plot(sX{2}(:,1),sX{2}(:,2),'bo', ...
    'MarkerEdgeColor','k','MarkerFaceColor','b');
axis([X(1,1),X(1,d),Y(1,1),Y(d,1)]);
title([num2str(N),' normal samples, Edit Knn, K=',num2str(BestK), ...
    ', ErrorProb=',num2str(Pe)]);
xlabel('x1');
ylabel('x2');
%--- classify with 1-NN and display the decision regions
for i = 1:d
    for j = 1:d
        x = [X(i,j),Y(i,j)];
        y1 = KnnFun(x,sX,1);
        if (y1 == 1)
            plot(X(i,j),Y(i,j),'m.');
        else
            plot(X(i,j),Y(i,j),'g.');
        end
    end
end
% END of EditKnn.m
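The script calls two helper functions that are not listed in this handout, NormRandomFun and EditKnnFun. The following are minimal sketches of what they might look like, not the original implementations: NormRandomFun draws samples from N(mu, Cov) via a Cholesky factor, and EditKnnFun performs Wilson-style editing (classify each sample by k-NN against all remaining samples and discard it if misclassified).

function X = NormRandomFun(mu, Cov, N)
% Sketch: N samples from N(mu, Cov) as an N-by-d matrix, by transforming
% standard-normal rows with a Cholesky factor of Cov and shifting by mu.
X = randn(N, length(mu)) * chol(Cov) + repmat(mu', N, 1);
end

function sX = EditKnnFun(oX, k)
% Sketch of Wilson-style editing: classify every sample by k-NN against
% the remaining samples (leave-one-out); keep only the correctly
% classified ones.
ClassNum = length(oX);
for i = 1:ClassNum
    Ni = length(oX{i}(:,1));
    keep = false(Ni, 1);
    for j = 1:Ni
        tX = oX;
        tX{i}(j,:) = [];            % leave the j-th sample out
        keep(j) = (KnnFun(oX{i}(j,:), tX, k) == i);
    end
    sX{i} = oX{i}(keep,:);          % the edited class-i sample set
end
end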
function ClassNo = KnnFun(x,sX,k)
% k-NN nearest-neighbor classification function
% Input:
%   x:  1-by-d sample vector to be classified;
%   sX: sample set, a cell array;
%       ClassNum = length(sX) is the number of classes;
%       sX{i}: Ni-by-d matrix whose rows are sample vectors and whose
%       columns are features, where Ni = length(sX{i}(:,1)) is the
%       number of samples in class i, d is the dimension of the sample
%       vectors, i = 1:ClassNum;
%   k:  the number of neighbors.
% Output:
%   ClassNo: the class to which x belongs, by the k-NN algorithm.
ClassNum = length(sX);
ClassNo = 0;
if ((ClassNum < 2)||(k < 1))
    return;
end
Index = 0;
for i = 1:ClassNum
    N = length(sX{i}(:,1));
    for j = 1:N
        Index = Index + 1;
        D(Index) = norm(x - sX{i}(j,:));
        Si(Index) = i;   % the class No. of the Index-th sample
    end % next j
end % next i
% counters of nearest neighbors per class
Knum = zeros(1,ClassNum);
for j = 1:k
    [V, Mi] = min(D);   % find the minimum element in D
    % increase the neighbor counter of class Si(Mi)
    Knum(Si(Mi)) = Knum(Si(Mi)) + 1;
    % delete this minimum element so the next nearest
    % neighbor can be found
    D(Mi) = inf;
end % next j
[V, ClassNo] = max(Knum);
return;
Questions for thought:
(1) If the labeled sample set contains no mislabeled samples, is sample editing still necessary?
(2) In the edited k-NN method, the first step edits the samples with the k-NN rule; why does the second step then classify with the 1-NN rule?
(3) In the 1-NN method, do all labeled samples contribute equally to correct classification? If not, which samples contribute most? Illustrate with a two-dimensional sketch.
(KnnFun using the sort function)

function ClassNo = KnnFun(x,sX,k)
% k-NN nearest-neighbor classification function using the sort
% function; suited to the case k >= log(N).
% Input:
%   x:  1-by-d sample vector to be classified;
%   sX: sample set, a cell array;
%       ClassNum = length(sX) is the number of classes;
%       sX{i}: Ni-by-d matrix whose rows are sample vectors and whose
%       columns are features, where Ni = length(sX{i}(:,1)) is the
%       number of samples in class i, d is the dimension of the sample
%       vectors, i = 1:ClassNum;
%   k:  the number of neighbors.
% Output:
%   ClassNo: the class to which x belongs, by the k-NN algorithm.
ClassNum = length(sX);
ClassNo = 0;
if ((ClassNum < 2)||(k < 1))
    return;
end
Index = 0;
for i = 1:ClassNum
    N = length(sX{i}(:,1));
    for j = 1:N
        Index = Index + 1;
        D(Index) = norm(x - sX{i}(j,:));  % distance from x to each sample
    end % next j
    ClassSegment(i) = Index;              % record the class segment boundary
    NNo(i) = 0;
end % next i
[D, IX] = sort(D);  % sort in ascending order
for j = 1:k
    cNo = 1;        % candidate class number
    for i = 1:ClassNum
        if IX(j) > ClassSegment(i)
            cNo = cNo + 1;   % beyond class i's segment: try the next class
        else
            break;           % the class has been found
        end
    end % next i
    NNo(cNo) = NNo(cNo) + 1; % one more nearest neighbor for this class
end % next j
MaxNNum = 0;  % largest nearest-neighbor count
for i = 1:ClassNum
    if MaxNNum < NNo(i)
        MaxNNum = NNo(i);
        ClassNo = i;  % x is assigned to this class
    end
end % next i
% END of function ClassNo = KnnFun(x,sX,k)
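As a quick consistency check (not from the handout), the two KnnFun variants should agree; here the sort-based version is assumed to be saved under the hypothetical name KnnFunSort.m:

% CheckKnnVersions.m -- sketch: compare the two k-NN implementations
sX{1} = randn(20,2) - 1;  sX{2} = randn(20,2) + 1;  % two toy classes
ok = true;
for t = 1:100
    x = 4*rand(1,2) - 2;            % random query point
    ok = ok && (KnnFun(x,sX,3) == KnnFunSort(x,sX,3));
end
disp(ok)  % prints 1 (true) when the two versions always agree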