Iron Reign Computer Vision

Iron Reign Computer Vision Rover Ruckus Season

Outline
• What is FTC (backgrounder for Computer Visionaries; not needed for DPRG)
• ftc_app - basic description
• Vuforia
• OpenCV4Android
• “TensorFlow Lite” - a black box mineral recognition engine that all teams can use without really having to understand computer vision
• Roll your own CNN in TensorFlow

ftc_app
• Android app framework published by ftctechnh on GitHub
• Common framework for all FTC teams - this is our starting point
• Robot controller - phone on the robot connected to the underlying hardware controllers
• Driver station - phone that operators use to:
  • Send remote joystick commands
  • Get telemetry on robot status
  • Change the active opmode
  • Restart the robot
• Opmodes
• Configurations

ftc_app control system (diagram: Robot Side, Driver Side)

Robot Wiring

The REV Expansion Hub
REV Robotics was started by Greg Needel, a former DPRG member.

Game Elements
• Gold cubes
• Silver balls

Why do we need Computer Vision in the first place?
Main reason: Sampling is part of the FTC challenge this year. It awards 30 points to a team that moves only the gold mineral, not the silver minerals, during the autonomous portion of the match (i.e., with no driver control).

Vuforia
• Augmented reality SDK for mobile devices that also includes image tracking
• Developed by Qualcomm, now owned by PTC, developers of Creo and MathCAD
• What we use it for
  • Localization against reference targets - 1 slide
  • Real-time target tracking - demo of cartbot
  • Initial frame capture for hand off to downstream CV
• Not using Vuforia target tracking in this year’s challenge
• Demo - Cartbot tracking a target

What is OpenCV?
• Collection of open source Computer Vision algorithms
• This is the “standard” computer vision library
• Algorithms can be combined into powerful pipelines
• Has been an FTC standard for many years
• Very stable and well-tested library
• Written in native code for maximum efficiency
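To make "combined into pipelines" concrete, here is a minimal sketch in plain Python, with no OpenCV calls, of two stages chained together: an HSV color threshold followed by a centroid computation. This is not Iron Reign's pipeline. The HSV bounds and the function names `is_gold`, `threshold`, `centroid`, and `find_gold` are assumptions chosen for illustration, and a real pipeline would use OpenCV's native routines instead of Python loops.

```python
import colorsys

def is_gold(rgb, hue_range=(0.08, 0.20), min_sat=0.5, min_val=0.4):
    """True if an (r, g, b) pixel (0-255 each) falls in an assumed 'gold' HSV band."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return hue_range[0] <= h <= hue_range[1] and s >= min_sat and v >= min_val

def threshold(image):
    """Stage 1: turn a 2D list of RGB pixels into a 2D boolean mask."""
    return [[is_gold(px) for px in row] for row in image]

def centroid(mask):
    """Stage 2: average (x, y) of all True cells, or None if the mask is empty."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, on in enumerate(row):
            if on:
                sum_x += x
                sum_y += y
                count += 1
    return None if count == 0 else (sum_x / count, sum_y / count)

def find_gold(image):
    """The pipeline: each stage consumes the previous stage's output."""
    return centroid(threshold(image))
```

Tools like GRIP generate this kind of stage-by-stage structure automatically, with blur, erode/dilate, and contour-filtering stages typically inserted between the two shown here.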

Pros and Cons of OpenCV
Pros
• Easy to use and design
• Tools like GRIP make it even easier
• Iron Reign has a good track record with OpenCV - using it for the last 3 years
• As an added benefit, this means that OpenCV is already integrated into the build, so we don’t have to do any fiddling around with Gradle
Cons
• We can only test pipelines on lighting conditions that we have images for
• If a competition has different lighting conditions, our pipeline might fail
• OpenCV lacks the “wow factor” associated with Machine Learning/AI during judging

OpenCV4Android
• The name describes it
• Java wrappers around OpenCV functionality tuned for on-phone use
• Native code is possible but a heavier lift
• Not integrated with the shipping ftc_app project - teams have to do it themselves
• High school teams struggle with learning the basic vision algorithms
• Then coding accurately becomes an issue
• We started with sample apps like ColorBlobDetector and adapted them to ftc_app
• Manually coded hybrids of Vuforia and OpenCV
• Now we use interactive vision pipeline explorers and code generators
  • RoboRealm - contributed licenses, closed source
  • GRIP - based on OpenCV - generates pipeline code in Java, C++ and Python

But OpenCV looks too scary....

Use GRIP to design your pipelines

Our OpenCV Pipeline

Handling anomalies

Our final pipeline

TensorFlow Lite
TensorFlow is Google’s open source machine learning library. It models neural networks as operations on “tensors,” which are multidimensional arrays of numbers (such as images, weights, and activations). TensorFlow Lite is the solution for running ML models on mobile and embedded devices.

Refresher on Neural Networks

TensorFlow Object Detection
TensorFlow’s Object Detection is implemented via a sliding window classifier.
(Figure: a basic sliding window classifier)
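The sliding window idea can be sketched in a few lines of plain Python: move a fixed-size window across the image, score each patch with a classifier, and keep the windows that score above a threshold. This is a generic illustration, not the code inside TensorFlow or ftc_app. The names `sliding_windows`, `detect`, and `mean_brightness` are hypothetical, and `mean_brightness` is only a stand-in for what would normally be a trained classifier such as a CNN.

```python
def sliding_windows(width, height, win, stride):
    """Yield (x, y) top-left corners of every win x win window that fits in the image."""
    for y in range(0, height - win + 1, stride):
        for x in range(0, width - win + 1, stride):
            yield x, y

def detect(image, win, stride, score_fn, threshold):
    """Score every window with score_fn; return (score, x, y) hits, best first."""
    height, width = len(image), len(image[0])
    hits = []
    for x, y in sliding_windows(width, height, win, stride):
        patch = [row[x:x + win] for row in image[y:y + win]]
        score = score_fn(patch)
        if score >= threshold:
            hits.append((score, x, y))
    return sorted(hits, reverse=True)

def mean_brightness(patch):
    """Stand-in classifier (assumption): mean of a grayscale patch with values in [0, 1]."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values)
```

The cost grows with the number of windows, so a smaller stride or several window sizes quickly multiplies the number of classifier evaluations per frame.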

“TensorFlow” as bundled in ftc_app
• Game-specific solution to targeting game elements
• Easy-to-follow tutorial on getting it working
• Gives a recognition confidence level and location of detected elements
• Slow and sloppy on our phones (what’s the fps?)
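As a rough illustration of how a confidence level and a location per detected element can be turned into a sampling decision, the sketch below orders three detections by horizontal position and reports where the gold mineral sits. It does not use the ftc_app API. The `Detection` tuple, its field names, the label strings, and the 0.5 confidence cutoff are all assumptions made for this example.

```python
from collections import namedtuple

# Hypothetical detection record: class label, confidence in [0, 1], left edge in pixels.
Detection = namedtuple("Detection", ["label", "confidence", "left"])

def gold_position(detections, min_confidence=0.5):
    """Return 'left', 'center', or 'right' for the gold mineral, or None if unsure."""
    seen = [d for d in detections if d.confidence >= min_confidence]
    golds = [d for d in seen if d.label == "gold"]
    silvers = [d for d in seen if d.label == "silver"]
    # Only decide when exactly one gold and two silvers are confidently visible.
    if len(golds) != 1 or len(silvers) != 2:
        return None
    ordered = sorted(golds + silvers, key=lambda d: d.left)
    return ("left", "center", "right")[ordered.index(golds[0])]
```

Returning None when the counts are wrong lets the autonomous routine retry on the next frame or fall back to a default instead of acting on a bad read.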

“TensorFlow” as bundled in ftc_app
• Black box
  • Not sure what algorithms are involved, though likely a small CNN such as a MobileNet
  • Probably a network pre-trained on ImageNet and then “re-trained” (transfer learning)
  • But without knowing, not sure how to improve recognition or speed
  • Trying to improve repeatability through on-bot lighting
• Sketchy performance
  • We have had trouble getting it to work right
  • Higher accuracy with minerals on the left than the right during testing
  • Our sister team said that “mounting their phone at a sideways angle helped improve accuracy” ¯\_(ツ)_/¯

Pros and Cons of “TensorFlow”
Pros
• Easy to get started (sample code provided)
• Already integrated into ftc_app build
Cons
• Black box - we don’t know what is happening
• Very sketchy performance
• Little to no “wow factor” as this is available to all teams

Rolling our own Convolutional Neural Network
Instead of using a black box model, why not write our own model? We actually had the idea to train our own CNN before TensorFlow Lite integration was even released to ftc_app.

Step 0: Determine Training Objective of the model
Given an image, we wanted the network to output two integers x and y (0 ≤ x < 320 and 0 ≤ y < 240; our images are 320x240). These two integers are the location of the gold mineral in the image.
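One common way to set up an objective like this is to train on coordinates scaled to the range 0 to 1 and convert the network's output back to pixels afterwards, with mean squared error as the loss. The sketch below shows only that bookkeeping. It assumes this normalization and loss for illustration and does not claim to be what our model actually uses; the function names are hypothetical.

```python
WIDTH, HEIGHT = 320, 240  # image size stated in the objective

def to_target(x, y):
    """Scale a pixel label to [0, 1) so both outputs share one range."""
    return x / WIDTH, y / HEIGHT

def to_pixels(u, v):
    """Map a network output back to integer pixel coordinates, clamped to the image."""
    x = min(max(int(round(u * WIDTH)), 0), WIDTH - 1)
    y = min(max(int(round(v * HEIGHT)), 0), HEIGHT - 1)
    return x, y

def mse(predicted, actual):
    """Mean squared error between two equal-length coordinate tuples."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
```

Clamping in `to_pixels` matters because a regression output is not guaranteed to stay inside the image bounds.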

Step 1: Capturing training data

Step 2: Label training data
We could label coordinates by hand, but this is too difficult. Instead, we wrote a program to help us label the data.

Step 2.1: Write labeling program
Available at github.com/arjvik/MineralLabler

Step 2.2: Use labeling program to label images
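Whatever tool collects the clicks, the labels eventually need to be stored in a form the training code can read. The sketch below shows one simple option, a CSV file with one row per image. The column names `filename`, `x`, and `y` and the helper names are assumptions for illustration; this is not the format MineralLabler actually writes.

```python
import csv
import io

def write_labels(labels, stream):
    """Write {filename: (x, y)} labels as CSV rows with a header line."""
    writer = csv.writer(stream)
    writer.writerow(["filename", "x", "y"])
    for filename, (x, y) in labels.items():
        writer.writerow([filename, x, y])

def read_labels(stream):
    """Read the CSV back into a {filename: (x, y)} dictionary of integer pixels."""
    reader = csv.DictReader(stream)
    return {row["filename"]: (int(row["x"]), int(row["y"])) for row in reader}
```

A quick round trip through an in-memory buffer, for example `io.StringIO()`, is an easy way to check that no labels are lost or reordered before training on them.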

Step 3: Train model

CNN Structure

Step 3.1: Try again with Java/DL4J

Our CNN Structure (in Java)

Step 4: Continue adapting the model to improve it
We are considering turning our model into a sliding window classifier, which could increase accuracy. We also need to capture and label more training data so that the model fits better.

Step 5: ??????
Step 6: Profit
Of course, if we can get this to work, it will be a great benefit both during the robot game and during judging. For now, we will keep trying to improve it, but we haven’t forgotten about the other two vision methods either. Iron Reign believes in parallel development, and we will continue working on all three before evaluating which one to use later in the season.

Summary

Resources
• GRIP pipeline generator: https://wpilib.screenstepslive.com/s/4485/m/24194/l/463566-introduction-to-grip
• TensorFlow Lite on Android: https://medium.com/tensorflow/using-tensorflow-lite-on-android-9bbc9cb7d69d
• TensorFlow for Poets codelab: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets
• Find out more about Iron Reign’s vision algorithms: https://www.ironreignrobotics.com/tags/vision/index
