An Optimal State Dependent Haptic Guidance Controller Via a Hard Rein
Anuradha Ranasinghe*, Kaspar Althoefer*, Jacques Penders†, Prokar Dasgupta‡, and Thrishantha Nanayakkara*
* Department of Informatics, King's College London, London, United Kingdom
† Sheffield Centre for Robotics, Sheffield Hallam University, Sheffield, United Kingdom
‡ Urology Centre, MRC Centre for Transplantation, King's College London, London, United Kingdom
Blind people depend on haptic feedback for their movement in daily life
Human-human demonstration experiment
[Figure: a low-vision "follower" is guided by a "guider" through a hard rein with a swing arm on a free axis. Wireless motion sensors record motion, a tactile sensor array instruments the rein, and locomotion and tug forces are measured; EMG is recorded from the biceps, triceps, and anterior and posterior deltoid muscles. The rein is ultimately intended to be held by a robot.]
State (ɸ): the follower's error of following. Action (θ): the swing angle of the rein.
The aim is to understand the guider-follower control policy: learn its optimum order and its temporal structure.
[Figure: timeline of past states (n−4 … n−1), the present action (n), and predicted states (n+1 … n+4).]
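The policy above is an autoregressive map from states to the present action. A minimal sketch of identifying such a predictive policy by least squares, where the assumed form θ_n = a1·ɸ(n+1) + a2·ɸ(n+2) + a3·ɸ(n+3) + c, the coefficients, and all data are synthetic stand-ins rather than the experiment's values:

```python
import numpy as np

# Sketch of fitting a predictive control policy of the (assumed) form
#   theta_n = a1*phi_{n+1} + a2*phi_{n+2} + a3*phi_{n+3} + c,
# i.e. the present rein action regressed on predicted future states.
# All data and coefficients below are synthetic, for illustration only.
rng = np.random.default_rng(0)
phi = rng.normal(size=200)            # error-of-following trace
true_a = np.array([0.8, -0.3, 0.1])   # assumed true coefficients
order = 3

n = len(phi) - order                  # number of usable samples
X = np.column_stack([phi[k + 1:k + 1 + n] for k in range(order)])
theta = X @ true_a + 0.05 * rng.normal(size=n)   # synthetic swing angles

A = np.column_stack([X, np.ones(n)])  # append an intercept column
coef, *_ = np.linalg.lstsq(A, theta, rcond=None)
print(coef[:order])                   # estimates close to true_a
```

With low observation noise the least-squares estimates recover the generating coefficients closely, which is the same mechanism used to compare candidate model orders.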
Best model selection
[Table: coefficient of determination (R²) of candidate autoregressive (AR) models; rows give the dependent-variable coefficients a1-a4 for models built on the states ɸ1-ɸ4.]
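Selecting the best AR model order by R² can be illustrated as follows. The traces are synthetic (generated from a 2nd-order dependence), and the helper name `fit_ar_r2` is a hypothetical label, not code from the study:

```python
import numpy as np

def fit_ar_r2(y, X):
    """Least-squares fit of y on X (with intercept); return R^2."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic traces: the action depends mainly on the next two states.
rng = np.random.default_rng(1)
phi = rng.normal(size=300)
theta = 0.9 * phi[1:-3] - 0.4 * phi[2:-2] + 0.05 * rng.normal(size=296)

r2 = {}
for order in (1, 2, 3, 4):
    X = np.column_stack([phi[k + 1:k + 1 + 296] for k in range(order)])
    r2[order] = fit_ar_r2(theta, X)
    print(order, round(r2[order], 3))
```

R² jumps sharply from order 1 to order 2 and then plateaus, which is the signature used to pick the most parsimonious adequate model.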
The guider's model: what is the order of the proposed control policy, and is it reactive or predictive?
[Figure: the guider's predictive and reactive models; statistical significance of the model order, reported as Δ% improvements from 1st- to 2nd-, 3rd-, and 4th-order models; timeline of past states (n−4 … n−1), present action (n), and predicted states (n+1 … n+4).]
(*Statistical significance was computed using the Mann-Whitney U test.)
The response model of the follower: what is the order of the proposed response model, and is it reactive or predictive?
[Figure: statistical significance of the model order; timeline of past states (n−4 … n−1), present action (n), and predicted states (n+1 … n+4).]
(*Statistical significance was computed using the Mann-Whitney U test.)
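A minimal sketch of the significance test named on these slides, using SciPy's `mannwhitneyu` on hypothetical per-subject Δ% improvements in model fit; the sample values are illustrative, not the paper's data:

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical per-subject R^2 improvements (Delta %) when moving from a
# 1st- to a 2nd-order model versus a 1st- to a 3rd-order model.
rng = np.random.default_rng(2)
delta_1_to_2 = rng.normal(5.0, 2.0, size=12)   # modest improvement
delta_1_to_3 = rng.normal(15.0, 2.0, size=12)  # larger improvement

u_stat, p_value = mannwhitneyu(delta_1_to_2, delta_1_to_3,
                               alternative="two-sided")
print(p_value < 0.05)  # True: the extra order gives a significant gain
```

The Mann-Whitney U test is rank-based, so it makes no normality assumption about the Δ% samples, which suits small per-subject sample sizes.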
Polynomial parameters of the guider's predictive model
[Table, with pairing inferred from the slide layout:
a0gPre: avg −2.60, std 0.57
a1gPre: avg 2.40, std 1.06
a3gPre: avg −0.74, std 0.54
cgPre: avg 7.1e−04, std 0.005]
Follower as a virtual damped inertial system
[Figure: the experimental rig as before: low-vision follower, locomotion force, free axis, tug force, swing arm, tactile sensor array, hard rein, robot.]

f_n = M (x_n − 2x_{n−1} + x_{n−2}) / ΔT² + C (x_n − x_{n−1}) / ΔT

where ΔT is the sampling step, f the force applied along the follower's heading direction, M the virtual mass, x the position vector in the horizontal plane, and C the virtual damping coefficient.
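The damped inertial follower model can be stepped forward in time by solving the difference relation f_n = M(x_n − 2x_{n−1} + x_{n−2})/ΔT² + C(x_n − x_{n−1})/ΔT for x_n. The mass, damping, sampling step, and force profile below are illustrative values, not parameters identified in the study:

```python
import numpy as np

# Rearranging the damped inertial relation for x[n] gives
#   x[n] = (f[n]*dT**2 + M*(2*x[n-1] - x[n-2]) + C*dT*x[n-1]) / (M + C*dT)
# M, C, dT and the constant tug force are illustrative values only.
M, C, dT = 60.0, 25.0, 0.01      # virtual mass, damping, sampling step
steps = 2000
f = np.full(steps, 10.0)         # constant tug force along heading (N)
x = np.zeros(steps)              # position along the heading direction

for n in range(2, steps):
    x[n] = (f[n] * dT**2 + M * (2 * x[n-1] - x[n-2])
            + C * dT * x[n-1]) / (M + C * dT)

velocity = (x[-1] - x[-2]) / dT
print(round(velocity, 3))        # -> 0.4, the steady-state value f/C
```

Under a constant tug force the virtual mass accelerates until damping balances the input, so the speed converges to f/C; the simulation reproduces exactly that steady state.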
Optimality of muscle activation over trials: did the subjects learn, and did they try to optimize energy?
A cost function of total muscle activation was computed across the four recorded muscles (biceps, triceps, anterior deltoid, posterior deltoid).
[Figure: muscle activation cost over trials; Δ% improvements from 1st- to 2nd-, 3rd-, and 4th-order models.]
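One way to sketch such a cost of total muscle activation, assuming a sum of squared normalised EMG envelopes over the four muscles (the exact cost function used in the study may differ, and the EMG data here is synthetic):

```python
import numpy as np

# Illustrative per-trial cost: sum of squared, normalised EMG envelopes
# over four muscles (biceps, triceps, anterior/posterior deltoid).
# The EMG traces are synthetic and only mimic decreasing effort.
rng = np.random.default_rng(3)
n_trials, n_samples, n_muscles = 20, 1000, 4

def trial_cost(emg):
    """emg: (n_samples, n_muscles) normalised activation levels."""
    return float(np.sum(emg ** 2))

costs = []
for t in range(n_trials):
    level = 1.0 - 0.03 * t                       # effort shrinks per trial
    emg = level * np.abs(rng.normal(size=(n_samples, n_muscles)))
    costs.append(trial_cost(emg))

print(costs[0] > costs[-1])  # True: later trials spend less energy
```

Plotting such costs against trial number is how one would check whether energy expenditure falls as the pair learns, mirroring the question posed on the slide.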
Conclusions
The guider learns a 3rd-order predictive model; the follower learns a 2nd-order reactive model. Exploration happens over the first half of the trials; learning and optimization happen simultaneously over the second half.
What remains to be explored
1. The control algorithm will be tested in human-robot interaction; its limitations will be characterised, and it will be used in an adaptive path-planning algorithm.
2. The control algorithm will be tested on uneven terrain, on stairs, and in other complex environments.
3. What is the dynamic nature of the predictive controller? The properties of predictive behaviour should be quantified.