
Natural User Interface with Kinect for Windows


Presentation Transcript


  1. Natural User Interface with Kinect for Windows Clemente Giorio & Paolo Patierno

  2. Natural User Interface

  3. Hardware Overview
     Components: 3-axis accelerometer, IR projector, mic array, depth camera, RGB camera, tilt motor
     • Hardware requirements:
       • Windows 7, Windows 8, Windows Embedded Standard 7, or Windows Embedded POSReady 7
       • x86 or x64 CPU
       • Dual-core 2.66 GHz
       • Dedicated USB 2.0 bus
       • 2 GB RAM

  4. Inside Kinect

  5. IR Projector The projected pattern is composed of 3x3 sub-patterns of 211x165 dots each (633x495 dots in total). In each sub-pattern, one spot is much brighter than all the others. 827 nm

  6. Depth Camera CMOS sensor with an IR-pass filter, up to 640x480 pixels. Each pixel, encoded on 11 bits, can represent 2048 levels of depth.

  7. RGB Camera CMOS sensor: 1280x960 at 12 fps, or 640x480 at 30 fps, with 8 bits per channel, producing Bayer filter output with an RGGB pattern. (Figure: IR frame)

  8. Tilt Motor & 3-axis Accelerometer The 3-axis accelerometer is configured for a 2g range (g is the acceleration due to gravity), with 1-3 degree accuracy.

  9. Mic Array
     • 4 microphones
     • 24-bit analog-to-digital converter
     • The captured audio is encoded using pulse-code modulation (PCM) with a sampling rate of 16 kHz and 16-bit depth.
     • Advantages of multiple microphones: enhanced noise suppression, acoustic echo cancellation, beamforming
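
The PCM figures on this slide translate directly into a data rate. The following is a minimal sketch of that arithmetic, assuming a single output channel (the slide does not state the channel count); `PcmMath` and `BytesPerSecond` are illustrative names, not part of the Kinect SDK.

```csharp
using System;

public static class PcmMath
{
    // Bytes per second of uncompressed PCM audio.
    public static int BytesPerSecond(int sampleRateHz, int bitsPerSample, int channels)
    {
        return sampleRateHz * (bitsPerSample / 8) * channels;
    }

    public static void Main()
    {
        // 16 kHz sampling rate and 16-bit depth come from the slide;
        // the single channel is an assumption.
        Console.WriteLine(BytesPerSecond(16000, 16, 1)); // prints 32000
    }
}
```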

  10. SDK Overview

  11. Camera Data

  12. Step 1: Register for the ColorFrameReady Event

      /// Active Kinect sensor
      private KinectSensor sensor;

      // Turn on the color stream to receive color frames
      this.sensor.ColorStream.Enable(ColorImageFormat.RgbResolution640x480Fps30);

      // Add an event handler to be called whenever there is new color frame data
      this.sensor.ColorFrameReady += this.SensorColorFrameReady;

      // Start the sensor!
      this.sensor.Start();

  13. Step 2: Read the Stream

      /// Event handler for Kinect sensor's ColorFrameReady event
      private void SensorColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
      {
          using (ColorImageFrame colorFrame = e.OpenColorImageFrame())
          {
              if (colorFrame != null)
              {
                  // Copy the pixel data from the image to a temporary array
                  colorFrame.CopyPixelDataTo(this.colorPixels);

                  // Write the pixel data into our bitmap
                  this.colorBitmap.WritePixels(
                      new Int32Rect(0, 0, this.colorBitmap.PixelWidth, this.colorBitmap.PixelHeight),
                      this.colorPixels,
                      this.colorBitmap.PixelWidth * sizeof(int),
                      0);
              }
          }
      }
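
The `WritePixels` call above passes `PixelWidth * sizeof(int)` as the stride, which implies 4 bytes per pixel. The sketch below works through the resulting buffer arithmetic for a 640x480 frame. It assumes a 32-bit pixel format; `ColorFrameLayout` and its members are hypothetical helpers written for illustration, not SDK types.

```csharp
using System;

public static class ColorFrameLayout
{
    // 4 bytes per pixel, matching the sizeof(int) stride used on the slide.
    public const int BytesPerPixel = sizeof(int);

    // Number of bytes in one row of the image.
    public static int Stride(int widthPixels)
    {
        return widthPixels * BytesPerPixel;
    }

    // Total size of the byte array needed to hold one frame.
    public static int BufferLength(int widthPixels, int heightPixels)
    {
        return Stride(widthPixels) * heightPixels;
    }

    // Byte offset of pixel (x, y) in a row-major buffer.
    public static int PixelOffset(int x, int y, int widthPixels)
    {
        return y * Stride(widthPixels) + x * BytesPerPixel;
    }

    public static void Main()
    {
        Console.WriteLine(Stride(640));            // prints 2560
        Console.WriteLine(BufferLength(640, 480)); // prints 1228800
        Console.WriteLine(PixelOffset(1, 1, 640)); // prints 2564
    }
}
```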

  14. DepthFrameReady Event

      void sensor_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
      {
          using (DepthImageFrame depthFrame = e.OpenDepthImageFrame())
          {
              if (depthFrame != null)
              {
                  // Copy the pixel data from the image to a temporary array
                  depthFrame.CopyDepthImagePixelDataTo(this.depthPixels);

                  // Convert the depth pixels to colored pixels
                  ConvertDepthData2RGB(depthFrame.MinDepth, depthFrame.MaxDepth);

                  this.depthBitmap.WritePixels(
                      new Int32Rect(0, 0, this.depthBitmap.PixelWidth, this.depthBitmap.PixelHeight),
                      this.colorDepthPixels,
                      this.depthBitmap.PixelWidth * sizeof(int),
                      0);

                  UpdateFrameRate();
              }
          }
      }
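
The slide does not show the body of `ConvertDepthData2RGB`. One plausible way to do such a conversion is sketched below: each depth value in millimeters is mapped to a gray level in a 4-byte-per-pixel buffer, with near objects bright and out-of-range values black. This is an assumption about the approach, not the presenters' implementation, and `DepthToColor.ToBgr32` is a hypothetical name.

```csharp
using System;

public static class DepthToColor
{
    // Maps depth values (millimeters) to gray pixels, 4 bytes each (B, G, R, unused).
    // Values outside [minDepth, maxDepth] are rendered black.
    public static byte[] ToBgr32(short[] depthsMm, int minDepth, int maxDepth)
    {
        if (maxDepth <= minDepth)
        {
            throw new ArgumentException("maxDepth must be greater than minDepth");
        }

        var pixels = new byte[depthsMm.Length * 4];
        for (int i = 0; i < depthsMm.Length; i++)
        {
            int depth = depthsMm[i];
            byte intensity = 0;
            if (depth >= minDepth && depth <= maxDepth)
            {
                // Near = bright (255), far = dark (0); integer division truncates.
                intensity = (byte)(255 - (depth - minDepth) * 255 / (maxDepth - minDepth));
            }

            int offset = i * 4;
            pixels[offset] = intensity;     // blue
            pixels[offset + 1] = intensity; // green
            pixels[offset + 2] = intensity; // red
            // pixels[offset + 3] is left at 0 (unused channel)
        }
        return pixels;
    }

    public static void Main()
    {
        // The range 800..4000 mm is an illustrative choice.
        byte[] result = ToBgr32(new short[] { 800, 2400, 4000, 0 }, 800, 4000);
        Console.WriteLine(result[0]);  // prints 255
        Console.WriteLine(result[4]);  // prints 128
        Console.WriteLine(result[8]);  // prints 0
        Console.WriteLine(result[12]); // prints 0
    }
}
```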

  15. Depth Data
     • ImageFrame.Image.Bits
     • Array of bytes: public byte[] Bits;
     • Array
       – Starts at top left of image
       – Moves left to right, then top to bottom
       – Represents distance for pixel in millimeters

  16. Distance
     • 2 bytes per pixel (16 bits)
     • Depth: distance per pixel
       – Bit-shift the second byte by 8
       – C#: Distance (0,0) = (int)(Bits[0] | Bits[1] << 8);
       – VB: Distance (0,0) = CInt(Bits(0)) Or (CInt(Bits(1)) << 8)
     • DepthAndPlayerIndex: includes the player index
       – Bit-shift the first byte right by 3 (to drop the player index) and the second byte left by 5
       – C#: Distance (0,0) = (int)(Bits[0] >> 3 | Bits[1] << 5);
       – VB: Distance (0,0) = (CInt(Bits(0)) >> 3) Or (CInt(Bits(1)) << 5)
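
The two byte layouts above can be checked with a small self-contained program. This is a sketch that mirrors the slide's shift expressions. The player index is taken as the low 3 bits of the first byte, as the layout implies. `DepthDecoder` and its method names are illustrative, not SDK members.

```csharp
using System;

public static class DepthDecoder
{
    // Depth-only format: 16-bit little-endian distance in millimeters.
    public static int DecodeDepth(byte low, byte high)
    {
        return low | (high << 8);
    }

    // Depth + player index format: the low 3 bits hold the player index,
    // the remaining 13 bits hold the distance.
    public static int DecodeDepthWithPlayer(byte low, byte high)
    {
        return (low >> 3) | (high << 5);
    }

    // Player index: low 3 bits of the first byte.
    public static int DecodePlayerIndex(byte low)
    {
        return low & 0x07;
    }

    public static void Main()
    {
        // 2000 mm stored in the depth-only format: 0x07D0 -> bytes D0, 07.
        Console.WriteLine(DecodeDepth(0xD0, 0x07)); // prints 2000

        // 2000 mm with player index 2: (2000 << 3) | 2 = 0x3E82 -> bytes 82, 3E.
        Console.WriteLine(DecodeDepthWithPlayer(0x82, 0x3E)); // prints 2000
        Console.WriteLine(DecodePlayerIndex(0x82));           // prints 2
    }
}
```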

  17. Skeleton Tracking • Skeleton data (figure: X, Y, and Z axes)

  18. Skeleton • Seated: 10 joints • Default: 20 joints

  19. Skeleton API

  20. Joint Data
     • At most two players tracked at once
     • Six player proposals
     • Each player has a set of <x, y, z> joints, in meters
     • Each joint has an associated state: Tracked, Not tracked, or Inferred
     • Inferred: occluded, clipped, or low-confidence joints
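
To illustrate how joint positions and states might be used together, the sketch below computes the distance between two joints in meters and refuses to do so when either joint is not tracked. The types `JointState` and `JointSample` are simplified stand-ins defined here for illustration; they are not the SDK's own `Joint` and `JointTrackingState` types.

```csharp
using System;

// Simplified stand-in for the three joint states listed on the slide.
public enum JointState { NotTracked, Inferred, Tracked }

// Simplified stand-in for a joint: a position in meters plus a state.
public struct JointSample
{
    public float X, Y, Z;
    public JointState State;

    public JointSample(float x, float y, float z, JointState state)
    {
        X = x; Y = y; Z = z; State = state;
    }
}

public static class JointMath
{
    // Euclidean distance in meters, or null if either joint is not tracked.
    // Inferred joints are accepted here; a stricter caller could reject them.
    public static double? BoneLength(JointSample a, JointSample b)
    {
        if (a.State == JointState.NotTracked || b.State == JointState.NotTracked)
        {
            return null;
        }

        double dx = a.X - b.X;
        double dy = a.Y - b.Y;
        double dz = a.Z - b.Z;
        return Math.Sqrt(dx * dx + dy * dy + dz * dz);
    }

    public static void Main()
    {
        var shoulder = new JointSample(0.0f, 0.0f, 2.0f, JointState.Tracked);
        var elbow = new JointSample(0.3f, 0.4f, 2.0f, JointState.Inferred);
        var hand = new JointSample(0.0f, 0.0f, 0.0f, JointState.NotTracked);

        Console.WriteLine(BoneLength(shoulder, elbow).HasValue); // prints True
        Console.WriteLine(BoneLength(elbow, hand).HasValue);     // prints False
    }
}
```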

  21. Step 1: SkeletonFrameReady Event

      // Turn on the skeleton stream to receive skeleton frames
      this.sensor.SkeletonStream.Enable();

      // Add an event handler to be called whenever there is new skeleton frame data
      this.sensor.SkeletonFrameReady += this.SensorSkeletonFrameReady;

      /// Event handler for Kinect sensor's SkeletonFrameReady event
      private void SensorSkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
      {
          Skeleton[] skeletons = new Skeleton[0];

          using (SkeletonFrame skeletonFrame = e.OpenSkeletonFrame())
          {
              if (skeletonFrame != null)
              {
                  skeletons = new Skeleton[skeletonFrame.SkeletonArrayLength];
                  skeletonFrame.CopySkeletonDataTo(skeletons);
              }
          }
          // (the handler continues on the next slide)

  22. Step 2: Read the skeleton data

          using (DrawingContext dc = this.drawingGroup.Open())
          {
              // Draw a transparent background to set the render size
              dc.DrawRectangle(Brushes.Black, null, new Rect(0.0, 0.0, RenderWidth, RenderHeight));

              if (skeletons.Length != 0)
              {
                  foreach (Skeleton skel in skeletons)
                  {
                      RenderClippedEdges(skel, dc);

                      if (skel.TrackingState == SkeletonTrackingState.Tracked)
                      {
                          this.DrawBonesAndJoints(skel, dc);
                      }
                      else if (skel.TrackingState == SkeletonTrackingState.PositionOnly)
                      {
                          dc.DrawEllipse(
                              this.centerPointBrush,
                              null,
                              this.SkeletonPointToScreen(skel.Position),
                              BodyCenterThickness,
                              BodyCenterThickness);
                      }
                  }
              }

              // Prevent drawing outside of our render area
              this.drawingGroup.ClipGeometry =
                  new RectangleGeometry(new Rect(0.0, 0.0, RenderWidth, RenderHeight));
          }
      }

  23. Step 3: Use the joint data

      // Left Arm
      this.DrawBone(skeleton, drawingContext, JointType.ShoulderLeft, JointType.ElbowLeft);
      this.DrawBone(skeleton, drawingContext, JointType.ElbowLeft, JointType.WristLeft);
      this.DrawBone(skeleton, drawingContext, JointType.WristLeft, JointType.HandLeft);

  24. Step 4: Fine-tune

  25. Audio • As microphone • For Speech Recognition

  26. Speech Recognition
     • Kinect Grammar available to download
     • Grammar: what we are listening for
       – Code: GrammarBuilder, Choices
       – Speech Recognition Grammar Specification (SRGS)
     • C:\Program Files (x86)\Microsoft Speech Platform SDK\Samples\Sample Grammars\

  27. Grammar

      <grammar version="1.0" xml:lang="it-IT" root="rootRule" tag-format="semantics/1.0-literals" xmlns="http://www.w3.org/2001/06/grammar">
        <rule id="rootRule">
          <one-of>
            <item>
              <tag>FORWARD</tag>
              <one-of>
                <item> avanti </item>
                <item> vai avanti </item>
                <item> avanza </item>
              </one-of>
            </item>
            <item>
              <tag>BACKWARD</tag>
              <one-of>
                <item> indietro </item>
                <item> vai indietro </item>
                <item> indietreggia </item>
              </one-of>
            </item>
          </one-of>
        </rule>
      </grammar>
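
To make the structure of the grammar above concrete, the sketch below reads an SRGS document with LINQ to XML and builds a phrase-to-tag lookup, so that each spoken alternative maps to its command tag. It only inspects the XML and does not involve the speech engine. `GrammarReader`, `PhraseToTag`, and the shortened `SampleGrammar` are illustrative names and data, not part of any SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class GrammarReader
{
    static readonly XNamespace Srgs = "http://www.w3.org/2001/06/grammar";

    // Shortened copy of the slide's grammar (single-quoted attributes for readability).
    public const string SampleGrammar = @"
<grammar version='1.0' xml:lang='it-IT' root='rootRule'
         tag-format='semantics/1.0-literals'
         xmlns='http://www.w3.org/2001/06/grammar'>
  <rule id='rootRule'>
    <one-of>
      <item><tag>FORWARD</tag>
        <one-of><item> avanti </item><item> vai avanti </item></one-of>
      </item>
      <item><tag>BACKWARD</tag>
        <one-of><item> indietro </item><item> vai indietro </item></one-of>
      </item>
    </one-of>
  </rule>
</grammar>";

    // Builds a lookup from each phrase to the tag of the command it belongs to.
    public static Dictionary<string, string> PhraseToTag(string srgsXml)
    {
        var map = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        XElement root = XElement.Parse(srgsXml);

        // Command items are the <item> elements that carry a <tag> child.
        var commands = root.Descendants(Srgs + "item")
                           .Where(item => item.Element(Srgs + "tag") != null);

        foreach (XElement command in commands)
        {
            string tag = command.Element(Srgs + "tag").Value.Trim();
            XElement alternatives = command.Element(Srgs + "one-of");
            if (alternatives == null)
            {
                continue; // command without phrase alternatives
            }

            foreach (XElement phrase in alternatives.Elements(Srgs + "item"))
            {
                map[phrase.Value.Trim()] = tag;
            }
        }
        return map;
    }

    public static void Main()
    {
        Dictionary<string, string> map = PhraseToTag(SampleGrammar);
        Console.WriteLine(map["vai avanti"]); // prints FORWARD
        Console.WriteLine(map["indietro"]);   // prints BACKWARD
    }
}
```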

  28. Netduino Plus based robot
     • Magician chassis: structure, 2 DC motors
     • Motor driver
     • WiFi bridge
     • Netduino Plus

  29. Demo (architecture diagram) Components: MotionControlRemote, MotionClient, MotionServer, MotionControlTB6612FNG, TB6612FNG. Link label: Connect & Commands

  30. Demo

  31. Demo To view a few moments recorded during the session:
     • Demo Gesture Recognition: https://vimeo.com/58336449
     • Demo Speech Recognition in Neapolitan: https://vimeo.com/58336020

  32. Resources & Contact
     • Kinect for Windows: http://www.microsoft.com/en-us/kinectforwindows/
     • MSDN: http://msdn.microsoft.com/en-us/library/hh855347.aspx
     • Clemente Giorio: http://it.linkedin.com/pub/clemente-giorio/11/618/3a
     • Paolo Patierno: http://it.linkedin.com/in/paolopatierno

  33. Thanks to our sponsors!
