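A Study of the Sensitivity of Feedforward Artificial Neural Networks
Xiaoqin Zeng
College of Computer and Information Engineering, Hohai University
October 2003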
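I. Introduction
1. Feedforward neural networks (FNNs)
● Neurons
– Discrete: adaptive linear element (Adaline)
– Continuous: perceptron
● Neural networks
– Discrete: multilayer Adaline network (Madaline)
– Continuous: multilayer perceptron (BP network, or MLP)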
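2. Problem statement
● Problems
– Effect of hardware precision on the weights
– Effect of environmental noise on the inputs
● Motivation
– How do parameter perturbations affect a network?
– How can the size of the network's output deviation be measured?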
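3. Research scope
● Establish the relationship between the network output and perturbations of the network parameters
● Analyze this relationship to reveal regularities in network behavior
● Quantify the network's output deviation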
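4. Significance
● Guide network design and strengthen the network's resistance to disturbances
● Measure network performance, such as fault tolerance and generalization ability
● Provide a basis for other network research topics, such as pruning the network architecture and selecting parameters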
II. Research overview (representative methods and literature)
1. Sensitivity of Madalines
● n-dimensional geometric model (hypersphere)
M. Stevenson, R. Winter, and B. Widrow, "Sensitivity of Feedforward Neural Networks to Weight Errors," IEEE Trans. on Neural Networks, vol. 1, no. 1, 1990.
● Statistical model (variance)
S. W. Piché, "The Selection of Weight Accuracies for Madalines," IEEE Trans. on Neural Networks, vol. 6, no. 2, 1995.
2. Sensitivity of MLPs
● Analytical approach (partial derivatives)
S. Hashem, "Sensitivity Analysis for Feed-Forward Artificial Neural Networks with Differentiable Activation Functions," Proc. of IJCNN, vol. 1, 1992.
● Statistical approach (standard deviation)
J. Y. Choi and C. H. Choi, "Sensitivity Analysis of Multilayer Perceptron with Differentiable Activation Functions," IEEE Trans. on Neural Networks, vol. 3, no. 1, 1992.
3. Applications of sensitivity
● Input feature selection
J. M. Zurada, A. Malinowski, and S. Usui, "Perturbation Method for Deleting Redundant Inputs of Perceptron Networks," Neurocomputing, vol. 14, 1997.
● Network architecture pruning
A. P. Engelbrecht, "A New Pruning Heuristic Based on Variance Analysis of Sensitivity Information," IEEE Trans. on Neural Networks, vol. 12, no. 6, 2001.
● Fault tolerance and generalization
J. L. Bernier et al., "A Quantitative Study of Fault Tolerance, Noise Immunity and Generalization Ability of MLPs," Neural Computation, vol. 12, 2000.
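III. Research methods
1. Bottom-up approach
● Single neuron
● Entire network
2. Probabilistic and statistical approach
● Probability (discrete type)
● Mean (continuous type)
3. n-dimensional geometric model
● Vertices of a hyper-rectangle (discrete type)
● Body of a hyper-rectangle (continuous type)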
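For the discrete case, the probabilistic view above can be illustrated with a small Monte Carlo experiment. This is only a sketch under stated assumptions, not the analytical method of the cited papers: the hard-limit Adaline, the uniform weight noise, the example weights, and the function names `adaline` and `flip_probability` are all hypothetical choices made for illustration. Inputs are drawn from the vertices {-1, +1}^n, and sensitivity is estimated as the probability that the output changes when the weights are perturbed.

```python
import random

def adaline(x, w):
    """Hard-limit Adaline (assumed form): +1 if the weighted sum is >= 0, else -1."""
    s = sum(xi * wi for xi, wi in zip(x, w))
    return 1 if s >= 0 else -1

def flip_probability(w, weight_dev, trials=20000, seed=0):
    """Estimate P(output changes) when every weight receives uniform noise in
    [-weight_dev, +weight_dev]; inputs are random vertices of {-1, +1}^n."""
    rng = random.Random(seed)
    flips = 0
    for _ in range(trials):
        x = [rng.choice((-1, 1)) for _ in w]
        w_pert = [wi + rng.uniform(-weight_dev, weight_dev) for wi in w]
        if adaline(x, w) != adaline(x, w_pert):
            flips += 1
    return flips / trials

weights = [0.5, -0.3, 0.9]  # illustrative weights, not taken from the slides
p_zero = flip_probability(weights, 0.0)
p_small = flip_probability(weights, 0.1)
p_large = flip_probability(weights, 2.0)
print(p_zero, p_small, p_large)
```

With no perturbation the estimate is exactly zero, and it grows as the deviation grows while remaining a probability, so it can never exceed 1.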
IV. Results to date (representative papers)
● Sensitivity analysis:
"Sensitivity Analysis of Multilayer Perceptron to Input and Weight Perturbations," IEEE Trans. on Neural Networks, vol. 12, no. 6, pp. 1358-1366, Nov. 2001.
● Sensitivity quantification:
"A Quantified Sensitivity Measure for Multilayer Perceptron to Input Perturbation," Neural Computation, vol. 15, no. 1, pp. 183-212, Jan. 2003.
● Pruning of hidden-layer neurons (application of sensitivity):
"Hidden Neuron Pruning for Multilayer Perceptrons Using Sensitivity Measure," Proc. of IEEE ICMLC2002, pp. 1751-1757, Nov. 2002.
● Determining the importance of input features (application of sensitivity):
"Determining the Relevance of Input Features for Multilayer Perceptrons," Proc. of IEEE SMC2003, Oct. 2003.
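V. Future work
● Further refine the existing results to make them more practical
– Relax the restrictive conditions
– Broaden the scope of the analysis
– Compute the quantified measures more precisely
● Apply the results further to solve practical problems
● Explore new methods and study new types of networks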
End
Thank you all!
Effects of input & weight deviations on neurons’ sensitivity
• Sensitivity increases with the input and weight deviations, but the increase has an upper bound.
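The trend stated on this slide can be reproduced qualitatively with a small simulation. The sketch below is an illustration under assumptions, not the quantified measure from the cited papers: it assumes a single sigmoid neuron, inputs uniform on [0, 1], uniform perturbations, and sensitivity estimated as the mean absolute output deviation. The name `neuron_sensitivity` and the example weights are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron_sensitivity(w, deviation, trials=20000, seed=0):
    """Mean |perturbed output - original output| of one sigmoid neuron when
    every input and weight receives uniform noise in [-deviation, +deviation]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x = [rng.uniform(0.0, 1.0) for _ in w]
        x_pert = [xi + rng.uniform(-deviation, deviation) for xi in x]
        w_pert = [wi + rng.uniform(-deviation, deviation) for wi in w]
        y = sigmoid(sum(xi * wi for xi, wi in zip(x, w)))
        y_pert = sigmoid(sum(xi * wi for xi, wi in zip(x_pert, w_pert)))
        total += abs(y_pert - y)
    return total / trials

weights = [0.5, -0.3, 0.9]  # illustrative weights, not taken from the slides
deviations = [0.0, 0.1, 1.0, 5.0]
estimates = [neuron_sensitivity(weights, d) for d in deviations]
for d, s in zip(deviations, estimates):
    print(d, round(s, 4))
```

Because the sigmoid output lies strictly between 0 and 1, the absolute deviation of the output can never reach 1, which gives one simple reason why the growth in this estimate must flatten out as the deviations become large.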
Effects of input dimension on neurons’ sensitivity There exists an optimal value for the dimension of input, which yields the highest sensitivity value.
Effects of input & weight deviations on MLPs’ sensitivity Sensitivity of an MLP increases with the input and weight deviations.
Effects of the number of neurons in a layer
• Sensitivity of MLPs { n-2-2-1 | 1 ≤ n ≤ 10 } to the dimension of input.
• Sensitivity of MLPs { 2-n-2-1 | 1 ≤ n ≤ 10 } to the number of neurons in the 1st layer.
• Sensitivity of MLPs { 2-2-n-1 | 1 ≤ n ≤ 10 } to the number of neurons in the 2nd layer.
There exists an optimal value for the number of neurons in a layer, which yields the highest sensitivity value. The nearer a layer is to the output layer, the more effect the number of neurons in that layer has.
Effects of the number of layers
• Sensitivity of MLPs { 2-1, 2-2-1, ..., 2-2-2-2-2-2-2-2-2-2-1 } to the number of layers.
Sensitivity decreases as the number of layers increases, and the decrease almost levels off when the number becomes large.
Simulation 1 (Function Approximation)
• Implement an MLP to approximate the function: where
• Implementation considerations
– The MLP architecture is restricted to 2-n-1.
– The convergence condition is MSE goal = 0.01 and epoch ≤ 10^5.
– The lowest trainable number of hidden neurons is n=5.
– The pruning processes start with MLPs of 2-5-1 and stop at an architecture of 2-4-1.
– The data used by and resulting from the pruning process are listed in Table 1 and Table 2.
TABLE 1. Data for 3 MLPs with 5 hidden neurons (2-5-1) to realize the function
Training settings: goal=0.01 & epoch<=100000

MLP 1: Epoch 30586 | MSE (training) 0.000999816 | MSE (testing) 0.0117005 | bias=0
Hidden neuron | Input weights | Output weight | Sensitivity | Relevance
1 | [-12.9212 -0.2999] | -5.4507 | 0.031794 | 0.1733
2 | [33.7943 -34.6057] | 12.7341 | 0.002272 | 0.0289
3 | [31.4768 -31.0169] | -13.0816 | 0.001406 | 0.0184
4 | [-0.5607 -0.8140] | -12.0171 | 0.027066 | 0.3253
5 | [1.1737 -1.1026] | 8.7152 | 0.001815 | 0.0158

MLP 2: Epoch 65209 | MSE (training) 0.000999959 | MSE (testing) 0.0124573 | bias=0
Hidden neuron | Input weights | Output weight | Sensitivity | Relevance
1 | [32.6223 -33.3731] | 11.9959 | 0.002176 | 0.0261
2 | [-0.7361 0.7202] | -15.4905 | 0.000463 | 0.0072
3 | [-31.8412 31.2399] | 12.2103 | 0.001821 | 0.0222
4 | [-15.1872 -0.0937] | -6.0877 | 0.031017 | 0.1888
5 | [-0.3989 -1.0028] | -12.5057 | 0.027068 | 0.3385

MLP 3: Epoch 26094 | MSE (training) 0.000999944 | MSE (testing) 0.0120354 | bias=0
Hidden neuron | Input weights | Output weight | Sensitivity | Relevance
1 | [-15.0940 17.6184] | 8.8172 | 0.013547 | 0.1194
2 | [-19.9163 21.4109] | -18.6532 | 0.006661 | 0.1242
3 | [-14.0535 -0.8460] | -6.8307 | 0.026220 | 0.1791
4 | [1.0263 -0.1258] | 16.8506 | 0.028352 | 0.4777
5 | [26.7757 -26.1259] | -10.4671 | 0.002324 | 0.0243
TABLE 2. Data for the 3 pruned MLPs with 4 hidden neurons (2-4-1) to realize the function
Retraining settings: goal=0.01 & epoch<=100000

MLP 1 (obtained by removing the 5th neuron from the MLP of 2-5-1): Epoch 2251 | MSE (training) 0.000999998 | MSE (testing) 0.0114834 | bias=4.2349
Hidden neuron | Input weights | Output weight | Sensitivity | Relevance
1 | [-14.4387 -0.7003] | -5.7036 | 0.027014 | 0.1541
2 | [34.8366 -35.6080] | 13.0579 | 0.002100 | 0.0274
3 | [33.1285 -32.6271] | -13.2457 | 0.001460 | 0.0193
4 | [-1.5065 0.0184] | -12.1803 | 0.031343 | 0.3818

MLP 2 (obtained by removing the 2nd neuron from the MLP of 2-5-1): Epoch 1945 | MSE (training) 0.000999921 | MSE (testing) 0.0119645 | bias=-7.9468
Hidden neuron | Input weights | Output weight | Sensitivity | Relevance
1 | [33.5805 -34.2727] | 12.6267 | 0.001954 | 0.0247
2 | [-32.9313 32.3172] | 12.7961 | 0.001800 | 0.0230
3 | [-15.8016 -0.5610] | -6.1782 | 0.026902 | 0.1662
4 | [-1.3318 0.0103] | -13.3652 | 0.029283 | 0.3914

MLP 3 (obtained by removing the 5th neuron from the MLP of 2-5-1): Epoch 13253 | MSE (training) 0.000999971 | MSE (testing) 0.011926 | bias=-1.4194
Hidden neuron | Input weights | Output weight | Sensitivity | Relevance
1 | [-34.3974 33.8148] | 15.7984 | 0.001637 | 0.0259
2 | [-34.3250 34.7990] | -15.6503 | 0.001316 | 0.0206
3 | [-1.2909 0.0198] | -12.9606 | 0.028834 | 0.3737
4 | [11.8097 0.8879] | 6.0722 | 0.028122 | 0.1708
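The relevance values in Table 1 appear to equal each hidden neuron's sensitivity multiplied by the absolute value of its weight to the output neuron. That relation is inferred from the tabulated numbers and is not stated on the slides, so treat it as an assumption. Under it, the sketch below recomputes the relevance from the Table 1 data and selects the least relevant hidden neuron of each 2-5-1 MLP as the pruning candidate. The helper name `relevance` is hypothetical.

```python
# Output-layer weights and hidden-neuron sensitivities copied from Table 1.
output_weights = [
    [-5.4507, 12.7341, -13.0816, -12.0171, 8.7152],
    [11.9959, -15.4905, 12.2103, -6.0877, -12.5057],
    [8.8172, -18.6532, -6.8307, 16.8506, -10.4671],
]
sensitivities = [
    [0.031794, 0.002272, 0.001406, 0.027066, 0.001815],
    [0.002176, 0.000463, 0.001821, 0.031017, 0.027068],
    [0.013547, 0.006661, 0.026220, 0.028352, 0.002324],
]
tabulated_relevance = [
    [0.1733, 0.0289, 0.0184, 0.3253, 0.0158],
    [0.0261, 0.0072, 0.0222, 0.1888, 0.3385],
    [0.1194, 0.1242, 0.1791, 0.4777, 0.0243],
]

def relevance(sensitivity, weight):
    """Assumed relation: relevance = sensitivity * |output weight|."""
    return sensitivity * abs(weight)

computed = [
    [relevance(s, w) for s, w in zip(s_row, w_row)]
    for s_row, w_row in zip(sensitivities, output_weights)
]

# Largest gap between the recomputed and the tabulated relevance values.
max_error = max(
    abs(c - t)
    for c_row, t_row in zip(computed, tabulated_relevance)
    for c, t in zip(c_row, t_row)
)

# 1-based index of the least relevant hidden neuron in each MLP.
pruned = [row.index(min(row)) + 1 for row in computed]

print(max_error < 1e-4)  # True
print(pruned)            # [5, 2, 5]
```

The selected neurons (5th, 2nd, 5th) agree with the neurons that Table 2 reports as removed, which supports reading the pruning rule as "remove the hidden neuron with the lowest relevance".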
Simulation 2 (Classification)
• Implement an MLP to solve the XOR problem.
• Implementation considerations
– The MLP architecture is restricted to 2-n-1.
– The convergence condition is MSE goal = 0.1 and epoch ≤ 10^5.
– The pruning processes start with MLPs of 2-5-1 and stop at an architecture of 2-4-1.
– The data used by and resulting from the pruning process are listed in Table 3 and Table 4.
TABLE 3. Data for 3 MLPs with 5 hidden neurons (2-5-1) to realize the function
Training settings: goal=0.1 & epoch<=100000

MLP 1: Epoch 44518 | MSE (training) 0.0999997 | MSE (testing) 0.109217 | bias=0
Hidden neuron | Input weights | Output weight | Sensitivity | Relevance
1 | [2.8188 -8.1143] | 14.0153 | 0.047599 | 0.6671
2 | [2.4420 -0.5450] | -43.9907 | 0.035747 | 1.5725
3 | [2.5766 3.7037] | 28.0636 | 0.031518 | 0.8845
4 | [1.4955 -2.9245] | 19.5486 | 0.027355 | 0.5348
5 | [-2.5714 -3.7124] | -68.6432 | 0.031513 | 2.1632

MLP 2: Epoch 51098 | MSE (training) 0.0999998 | MSE (testing) 0.113006 | bias=0
Hidden neuron | Input weights | Output weight | Sensitivity | Relevance
1 | [1.4852 -3.8902] | 23.9314 | 0.037593 | 0.8997
2 | [1.0692 0.1466] | -19.1824 | 0.020170 | 0.3869
3 | [-1.0723 -0.1455] | 27.1565 | 0.020178 | 0.5480
4 | [-7.0301 2.5695] | 14.9694 | 0.045504 | 0.6812
5 | [-3.1382 -2.8094] | -91.6363 | 0.032550 | 2.9828

MLP 3: Epoch 33631 | MSE (training) 0.0999994 | MSE (testing) 0.11369 | bias=0
Hidden neuron | Input weights | Output weight | Sensitivity | Relevance
1 | [3.2920 2.9094] | 45.7579 | 0.031498 | 1.4413
2 | [-1.0067 3.4724] | -30.0598 | 0.039166 | 1.1773
3 | [-7.0578 2.4377] | 16.5386 | 0.046210 | 0.7642
4 | [-3.2921 -2.9096] | -52.2874 | 0.031497 | 1.6469
5 | [1.5303 -0.0606] | -29.7040 | 0.031715 | 0.9421
TABLE 4. Data for the 3 pruned MLPs with 4 hidden neurons (2-4-1) to realize the function
Retraining settings: goal=0.1 & epoch<=100000

MLP 1 (obtained by removing the 4th neuron from the MLP of 2-5-1): Epoch 22611 | MSE (training) 0.0999999 | MSE (testing) 0.109085 | bias=5.5570
Hidden neuron | Input weights | Output weight | Sensitivity | Relevance
1 | [2.8745 -6.8849] | 22.5649 | 0.043173 | 0.9742
2 | [1.9844 0.0405] | -51.3458 | 0.028627 | 1.4699
3 | [2.6295 3.8648] | 33.0982 | 0.030708 | 1.0164
4 | [-2.6270 -3.8656] | -74.4371 | 0.030717 | 2.2865

MLP 2 (obtained by removing the 2nd neuron from the MLP of 2-5-1): Epoch 14457 | MSE (training) 0.0999998 | MSE (testing) 0.112792 | bias=-12.4656
Hidden neuron | Input weights | Output weight | Sensitivity | Relevance
1 | [1.1511 -3.9352] | 26.3668 | 0.040841 | 1.0768
2 | [-1.4080 -0.2348] | 31.5437 | 0.029591 | 0.9334
3 | [-6.8277 2.3307] | 16.3482 | 0.045979 | 0.7517
4 | [-3.2002 -2.9670] | -98.8089 | 0.031612 | 3.1235

MLP 3 (obtained by removing the 3rd neuron from the MLP of 2-5-1): Epoch 17501 | MSE (training) 0.0999997 | MSE (testing) 0.111499 | bias=1.7474
Hidden neuron | Input weights | Output weight | Sensitivity | Relevance
1 | [3.0386 3.7789] | 59.1526 | 0.029114 | 1.7222
2 | [-1.3471 4.6670] | -34.0215 | 0.043097 | 1.4662
3 | [-3.0386 -3.7789] | -58.5949 | 0.029114 | 1.7059
4 | [3.5143 -0.7579] | -36.1761 | 0.041372 | 1.4967