Android Sensor Programming
Lecture 13
Wenbing Zhao
Department of Electrical Engineering and Computer Science
Cleveland State University
w.zhao1@csuohio.edu
Outline
Google mobile vision (part II): Text recognition
ML Kit: Machine Learning for Mobile Developers
Optical Character Recognition (OCR)
https://codelabs.developers.google.com/codelabs/mobile-vision-ocr/#0
OCR gives a computer the ability to read text that appears in an image, letting applications make sense of signs, articles, flyers, pages of text, menus, or any other place where text appears as part of an image. The app built in this lecture involves:
Initializing the Mobile Vision TextRecognizer
Setting up a Processor to receive frames from a camera as they come in and look for text
Rendering that text to the screen at its location
Sending that text to Android's TextToSpeech engine to speak it aloud
A condensed preview of this pipeline follows this slide; the full version of each piece appears in the slides below.
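A minimal sketch of the four steps above, assuming the OcrDetectorProcessor and CameraSource classes introduced later in these slides, and assuming a Context, a GraphicOverlay, and an initialized TextToSpeech instance are in scope (the preview size and fps are simply the values used later):

    // Sketch: wire a TextRecognizer to a camera feed and a TextToSpeech engine.
    TextRecognizer textRecognizer = new TextRecognizer.Builder(context).build();
    // The processor receives each frame's detections and draws them on the overlay.
    textRecognizer.setProcessor(new OcrDetectorProcessor(graphicOverlay));
    CameraSource cameraSource = new CameraSource.Builder(context, textRecognizer)
            .setFacing(CameraSource.CAMERA_FACING_BACK)
            .setRequestedPreviewSize(1280, 1024)
            .setRequestedFps(2.0f)
            .build();
    // Later, when the user taps a detected block, it is spoken aloud:
    tts.speak(textBlock.getValue(), TextToSpeech.QUEUE_ADD, null, "DEFAULT");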
OCR Reader
Create a new app and name it OCRReader
Modify the manifest:

    <?xml version="1.0" encoding="utf-8"?>
    <manifest xmlns:android="http://schemas.android.com/apk/res/android"
        package="com.wenbing.ocrreader"
        android:installLocation="auto" >

        <uses-feature android:name="android.hardware.camera" />
        <uses-permission android:name="android.permission.CAMERA" />

        <application
            android:allowBackup="true"
            android:fullBackupContent="false"
            android:hardwareAccelerated="true"
            android:label="OcrReaderApp"
            android:supportsRtl="true"
            android:theme="@style/Theme.AppCompat" >

            <meta-data
                android:name="com.google.android.gms.version"
                android:value="@integer/google_play_services_version" />
            <meta-data
                android:name="com.google.android.gms.vision.DEPENDENCIES"
                android:value="ocr" />

            <activity android:name="com.wenbing.ocrreader.MainActivity"
                android:label="Read Text">
                <intent-filter>
                    <action android:name="android.intent.action.MAIN" />
                    <category android:name="android.intent.category.LAUNCHER" />
                </intent-filter>
            </activity>
        </application>
    </manifest>
OCR Reader
Modify build.gradle (Module: app):

    apply plugin: 'com.android.application'

    android {
        compileSdkVersion 26
        buildToolsVersion "26.0.2"
        defaultConfig {
            applicationId "com.wenbing.ocrreader"
            minSdkVersion 24
            targetSdkVersion 26
            versionCode 1
            versionName "1.0"
            testInstrumentationRunner "android.support.test.runner.AndroidJUnitRunner"
        }
        buildTypes {
            release {
                minifyEnabled false
                proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.pro'
            }
        }
    }

    dependencies {
        compile fileTree(dir: 'libs', include: ['*.jar'])
        androidTestCompile('com.android.support.test.espresso:espresso-core:2.2.2', {
            exclude group: 'com.android.support', module: 'support-annotations'
        })
        compile 'com.android.support:support-v4:24.2.0'
        compile 'com.google.android.gms:play-services-vision:9.4.0+'
        compile 'com.android.support:design:24.2.0'
        testCompile 'junit:junit:4.12'
    }
OCR Reader
Add a Java class and name it CameraSource. Copy and paste the file you downloaded in the BarcodeReader app.
Modify the activity_main.xml layout:

    <?xml version="1.0" encoding="utf-8"?>
    <LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
        android:id="@+id/topLayout"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:keepScreenOn="true">

        <com.wenbing.ocrreader.CameraSourcePreview
            android:id="@+id/preview"
            android:layout_width="match_parent"
            android:layout_height="match_parent">

            <com.wenbing.ocrreader.GraphicOverlay
                android:id="@+id/graphicOverlay"
                android:layout_width="match_parent"
                android:layout_height="match_parent" />

        </com.wenbing.ocrreader.CameraSourcePreview>
    </LinearLayout>
OCR Reader
Modify strings.xml:

    <resources>
        <string name="ok">OK</string>
        <string name="permission_camera_rationale">Access to the camera is needed for detection</string>
        <string name="no_camera_permission">This application cannot run because it does not have the camera permission. The application will now exit.</string>
        <string name="low_storage_error">Ocr dependencies cannot be downloaded due to low device storage</string>
        <string name="title_activity_main">Ocr Detector Sample</string>
        <string name="ocr_header">Click "Detect Text" to detect text</string>
        <string name="read_text">Detect Text</string>
        <string name="auto_focus">Auto Focus</string>
        <string name="use_flash">Use Flash</string>
        <string name="ocr_success">Text read successfully</string>
        <string name="ocr_failure">No text captured</string>
        <string name="ocr_error">"Error reading text: %1$s"</string>
    </resources>
OCR Reader
Add a Java class and name it CameraSourcePreview. It is the same as the one in BarcodeReader except for the following method:

    @Override
    protected void onLayout(boolean changed, int left, int top, int right, int bottom) {
        int previewWidth = 320;
        int previewHeight = 240;
        if (mCameraSource != null) {
            Size size = mCameraSource.getPreviewSize();
            if (size != null) {
                previewWidth = size.getWidth();
                previewHeight = size.getHeight();
            }
        }

        // Swap width and height sizes when in portrait, since it will be rotated 90 degrees
        if (isPortraitMode()) {
            int tmp = previewWidth;
            previewWidth = previewHeight;
            previewHeight = tmp;
        }

        final int viewWidth = right - left;
        final int viewHeight = bottom - top;

        int childWidth;
        int childHeight;
        int childXOffset = 0;
        int childYOffset = 0;
        float widthRatio = (float) viewWidth / (float) previewWidth;
        float heightRatio = (float) viewHeight / (float) previewHeight;
OCR Reader
CameraSourcePreview (continued):

        // To fill the view with the camera preview, while also preserving the correct aspect ratio,
        // it is usually necessary to slightly oversize the child and to crop off portions along one
        // of the dimensions. We scale up based on the dimension requiring the most correction, and
        // compute a crop offset for the other dimension.
        if (widthRatio > heightRatio) {
            childWidth = viewWidth;
            childHeight = (int) ((float) previewHeight * widthRatio);
            childYOffset = (childHeight - viewHeight) / 2;
        } else {
            childWidth = (int) ((float) previewWidth * heightRatio);
            childHeight = viewHeight;
            childXOffset = (childWidth - viewWidth) / 2;
        }

        for (int i = 0; i < getChildCount(); ++i) {
            // One dimension will be cropped. We shift child over or up by this offset and adjust
            // the size to maintain the proper aspect ratio.
            getChildAt(i).layout(-1 * childXOffset, -1 * childYOffset,
                    childWidth - childXOffset, childHeight - childYOffset);
        }

        try {
            startIfReady();
        } catch (SecurityException se) {
            Log.e(TAG, "Do not have permission to start the camera", se);
        } catch (IOException e) {
            Log.e(TAG, "Could not start camera source.", e);
        }
    }
OCR Reader
Add a Java class and name it GraphicOverlay. It is the same as the one in FaceTracker except for the addition of the following method:

    /**
     * Returns the first graphic, if any, that exists at the provided absolute screen coordinates.
     * These coordinates will be offset by the relative screen position of this view.
     * @return First graphic containing the point, or null if no text is detected.
     */
    public T getGraphicAtLocation(float rawX, float rawY) {
        synchronized (mLock) {
            // Get the position of this View so the raw location can be offset relative to the view.
            int[] location = new int[2];
            this.getLocationOnScreen(location);
            for (T graphic : mGraphics) {
                if (graphic.contains(rawX - location[0], rawY - location[1])) {
                    return graphic;
                }
            }
            return null;
        }
    }
OCR Reader
Add a Java class and name it OcrGraphic:

    import android.graphics.Canvas;
    import android.graphics.Color;
    import android.graphics.Paint;
    import android.graphics.RectF;
    import com.google.android.gms.vision.text.Text;
    import com.google.android.gms.vision.text.TextBlock;
    import java.util.List;

    // Graphic instance for rendering TextBlock position, size, and ID within an associated graphic overlay view.
    public class OcrGraphic extends GraphicOverlay.Graphic {
        private int mId;
        private static final int TEXT_COLOR = Color.WHITE;
        private static Paint sRectPaint;
        private static Paint sTextPaint;
        private final TextBlock mText;

        OcrGraphic(GraphicOverlay overlay, TextBlock text) {
            super(overlay);
            mText = text;
            if (sRectPaint == null) {
                sRectPaint = new Paint();
                sRectPaint.setColor(TEXT_COLOR);
                sRectPaint.setStyle(Paint.Style.STROKE);
                sRectPaint.setStrokeWidth(4.0f);
            }
            if (sTextPaint == null) {
                sTextPaint = new Paint();
                sTextPaint.setColor(TEXT_COLOR);
                sTextPaint.setTextSize(54.0f);
            }
            // Redraw the overlay, as this graphic has been added.
            postInvalidate();
        }
OCR Reader
OcrGraphic (continued):

        public int getId() {
            return mId;
        }

        public void setId(int id) {
            this.mId = id;
        }

        public TextBlock getTextBlock() {
            return mText;
        }

        /**
         * Checks whether a point is within the bounding box of this graphic.
         * The provided point should be relative to this graphic's containing overlay.
         * @param x An x parameter in the relative context of the canvas.
         * @param y A y parameter in the relative context of the canvas.
         * @return True if the provided point is contained within this graphic's bounding box.
         */
        public boolean contains(float x, float y) {
            if (mText == null) {
                return false;
            }
            RectF rect = new RectF(mText.getBoundingBox());
            rect.left = translateX(rect.left);
            rect.top = translateY(rect.top);
            rect.right = translateX(rect.right);
            rect.bottom = translateY(rect.bottom);
            return (rect.left < x && rect.right > x && rect.top < y && rect.bottom > y);
        }
OCR Reader
In OcrGraphic:

        /**
         * Draws the text block annotations for position, size, and raw value on the supplied canvas.
         */
        @Override
        public void draw(Canvas canvas) {
            if (mText == null) {
                return;
            }

            // Draws the bounding box around the TextBlock.
            RectF rect = new RectF(mText.getBoundingBox());
            rect.left = translateX(rect.left);
            rect.top = translateY(rect.top);
            rect.right = translateX(rect.right);
            rect.bottom = translateY(rect.bottom);
            canvas.drawRect(rect, sRectPaint);

            // Break the text into multiple lines and draw each one according to its own bounding box.
            List<? extends Text> textComponents = mText.getComponents();
            for (Text currentText : textComponents) {
                float left = translateX(currentText.getBoundingBox().left);
                float bottom = translateY(currentText.getBoundingBox().bottom);
                canvas.drawText(currentText.getValue(), left, bottom, sTextPaint);
            }
        }
    }

We check whether the graphic has text, translate its bounding box to the appropriate coordinates for the canvas, and then draw the box and text. Why do we have to translate the coordinates of the bounding box? Because the bounding coordinates are relative to the frame the text was detected in, not the view we are looking at. If you zoom in using pinch-to-zoom, for instance, they won't line up. A sketch of the translation helpers follows this slide.
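The translateX() and translateY() helpers used above live in GraphicOverlay.Graphic, whose body is not reproduced on these slides. A sketch of what they typically look like in the Google mobile-vision samples (the mOverlay field names and scale factors are assumptions based on that sample code):

    // Sketch: map preview-frame coordinates onto view coordinates.
    public float scaleX(float horizontal) {
        return horizontal * mOverlay.mWidthScaleFactor;   // view width / frame width
    }

    public float scaleY(float vertical) {
        return vertical * mOverlay.mHeightScaleFactor;    // view height / frame height
    }

    // The x-axis is mirrored when the front-facing camera is in use.
    public float translateX(float x) {
        if (mOverlay.mFacing == CameraSource.CAMERA_FACING_FRONT) {
            return mOverlay.getWidth() - scaleX(x);
        }
        return scaleX(x);
    }

    public float translateY(float y) {
        return scaleY(y);
    }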
Draw the Graphics to screen
In OcrGraphic.java: an alternative implementation of draw() that renders the whole block's text at the bottom of its bounding box instead of line by line:

    @Override
    public void draw(Canvas canvas) {
        if (mText == null) {
            return;
        }

        // Draws the bounding box around the TextBlock.
        RectF rect = new RectF(mText.getBoundingBox());
        rect.left = translateX(rect.left);
        rect.top = translateY(rect.top);
        rect.right = translateX(rect.right);
        rect.bottom = translateY(rect.bottom);
        canvas.drawRect(rect, sRectPaint);

        // Render the text at the bottom of the box.
        canvas.drawText(mText.getValue(), rect.left, rect.bottom, sTextPaint);
    }
OCR Reader
Add a Java class and name it OcrDetectorProcessor:

    import android.util.Log;
    import android.util.SparseArray;
    import com.google.android.gms.vision.Detector;
    import com.google.android.gms.vision.text.TextBlock;

    // A very simple Processor which gets detected TextBlocks and adds them to the overlay as OcrGraphics.
    public class OcrDetectorProcessor implements Detector.Processor<TextBlock> {
        private GraphicOverlay<OcrGraphic> mGraphicOverlay;

        OcrDetectorProcessor(GraphicOverlay<OcrGraphic> ocrGraphicOverlay) {
            mGraphicOverlay = ocrGraphicOverlay;
        }

        // Called by the detector to deliver detection results.
        @Override
        public void receiveDetections(Detector.Detections<TextBlock> detections) {
            mGraphicOverlay.clear();
            SparseArray<TextBlock> items = detections.getDetectedItems();
            for (int i = 0; i < items.size(); ++i) {
                TextBlock item = items.valueAt(i);
                if (item != null && item.getValue() != null) {
                    Log.d("OcrDetectorProcessor", "Text detected! " + item.getValue());
                }
                OcrGraphic graphic = new OcrGraphic(mGraphicOverlay, item);
                mGraphicOverlay.add(graphic);
            }
        }

        // Frees the resources associated with this detection processor.
        @Override
        public void release() {
            mGraphicOverlay.clear();
        }
    }

receiveDetections() is called by the detector to deliver detection results. If your application called for it, this could be a place to check for equivalent detections by tracking TextBlocks that are similar in location and content from previous frames, or to reduce noise by eliminating TextBlocks that have not persisted through multiple detections; a sketch of the latter idea follows this slide.
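One hypothetical way to implement the persistence-based noise filter just described (the helper, the field, and the frame threshold are all illustrative assumptions, not part of the codelab):

    // Hypothetical sketch: only surface a TextBlock whose text has been seen in
    // several consecutive frames. Counts are keyed on the recognized string
    // (assumes item.getValue() != null); clearing the map occasionally would
    // keep it from growing without bound.
    private final Map<String, Integer> mSeenCounts = new HashMap<>();
    private static final int MIN_FRAMES = 3;  // assumed persistence threshold

    private boolean hasPersisted(TextBlock item) {
        String key = item.getValue();
        Integer prev = mSeenCounts.get(key);
        int count = (prev == null) ? 1 : prev + 1;
        mSeenCounts.put(key, count);
        return count >= MIN_FRAMES;
    }

Inside receiveDetections(), the graphic would then only be added when hasPersisted(item) returns true. (This sketch also needs import java.util.HashMap; and import java.util.Map;.)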
OCR Reader
In MainActivity.java: imports

    import android.Manifest;
    import android.annotation.SuppressLint;
    import android.app.Activity;
    import android.app.AlertDialog;
    import android.app.Dialog;
    import android.content.Context;
    import android.content.DialogInterface;
    import android.content.Intent;
    import android.content.IntentFilter;
    import android.content.pm.PackageManager;
    import android.hardware.Camera;
    import android.speech.tts.TextToSpeech;
    import android.support.annotation.NonNull;
    import android.support.design.widget.Snackbar;
    import android.support.v4.app.ActivityCompat;
    import android.support.v7.app.AppCompatActivity;
    import android.os.Bundle;
    import android.util.Log;
    import android.view.GestureDetector;
    import android.view.MotionEvent;
    import android.view.ScaleGestureDetector;
    import android.view.View;
    import android.widget.Toast;
    import com.google.android.gms.common.ConnectionResult;
    import com.google.android.gms.common.GoogleApiAvailability;
    import com.google.android.gms.vision.text.TextBlock;
    import com.google.android.gms.vision.text.TextRecognizer;
    import java.io.IOException;
    import java.util.Locale;
OCR Reader
In MainActivity.java: member variables

    private static final String TAG = "OcrCaptureActivity";

    // Intent request code to handle updating play services if needed.
    private static final int RC_HANDLE_GMS = 9001;

    // Permission request codes need to be < 256
    private static final int RC_HANDLE_CAMERA_PERM = 2;

    // Constants used to pass extra data in the intent
    public static final String AutoFocus = "AutoFocus";
    public static final String UseFlash = "UseFlash";
    public static final String TextBlockObject = "String";

    private CameraSource mCameraSource;
    private CameraSourcePreview mPreview;
    private GraphicOverlay<OcrGraphic> mGraphicOverlay;

    // Helper objects for detecting taps and pinches.
    private ScaleGestureDetector scaleGestureDetector;
    private GestureDetector gestureDetector;

    // A TextToSpeech engine for speaking a String value.
    private TextToSpeech tts;
OCR Reader
In MainActivity.java:

    // Initializes the UI and creates the detector pipeline.
    @Override
    public void onCreate(Bundle bundle) {
        super.onCreate(bundle);
        setContentView(R.layout.activity_main);

        mPreview = (CameraSourcePreview) findViewById(R.id.preview);
        mGraphicOverlay = (GraphicOverlay<OcrGraphic>) findViewById(R.id.graphicOverlay);

        // Set good defaults for capturing text.
        boolean autoFocus = true;
        boolean useFlash = false;

        int rc = ActivityCompat.checkSelfPermission(this, Manifest.permission.CAMERA);
        if (rc == PackageManager.PERMISSION_GRANTED) {
            createCameraSource(autoFocus, useFlash);
        } else {
            requestCameraPermission();
        }

        gestureDetector = new GestureDetector(this, new CaptureGestureListener());
        scaleGestureDetector = new ScaleGestureDetector(this, new ScaleListener());

        Snackbar.make(mGraphicOverlay, "Tap to Speak. Pinch/Stretch to zoom",
                Snackbar.LENGTH_LONG).show();
OCR Reader
In MainActivity.java: onCreate(Bundle bundle), continued

        // Set up the Text To Speech engine.
        TextToSpeech.OnInitListener listener = new TextToSpeech.OnInitListener() {
            @Override
            public void onInit(final int status) {
                if (status == TextToSpeech.SUCCESS) {
                    Log.d("OnInitListener", "Text to speech engine started successfully.");
                    tts.setLanguage(Locale.US);
                } else {
                    Log.d("OnInitListener", "Error starting the text to speech engine.");
                }
            }
        };
        tts = new TextToSpeech(this.getApplicationContext(), listener);
    }
OCR Reader
In MainActivity.java:

    private void requestCameraPermission() {
        Log.w(TAG, "Camera permission is not granted. Requesting permission");

        final String[] permissions = new String[]{Manifest.permission.CAMERA};

        if (!ActivityCompat.shouldShowRequestPermissionRationale(this, Manifest.permission.CAMERA)) {
            ActivityCompat.requestPermissions(this, permissions, RC_HANDLE_CAMERA_PERM);
            return;
        }

        final Activity thisActivity = this;

        View.OnClickListener listener = new View.OnClickListener() {
            @Override
            public void onClick(View view) {
                ActivityCompat.requestPermissions(thisActivity, permissions, RC_HANDLE_CAMERA_PERM);
            }
        };

        Snackbar.make(mGraphicOverlay, R.string.permission_camera_rationale, Snackbar.LENGTH_INDEFINITE)
                .setAction(R.string.ok, listener)
                .show();
    }
OCR Reader
In MainActivity.java:

    @Override
    public boolean onTouchEvent(MotionEvent e) {
        boolean b = scaleGestureDetector.onTouchEvent(e);
        boolean c = gestureDetector.onTouchEvent(e);
        return b || c || super.onTouchEvent(e);
    }

    @Override
    protected void onResume() {
        super.onResume();
        startCameraSource();
    }

    @Override
    protected void onPause() {
        super.onPause();
        if (mPreview != null) {
            mPreview.stop();
        }
    }

    @Override
    protected void onDestroy() {
        super.onDestroy();
        if (mPreview != null) {
            mPreview.release();
        }
    }
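One detail worth noting: the lifecycle methods above release the camera preview but never release the TextToSpeech engine created in onCreate(). A hedged variant of onDestroy() that frees it as well (this addition is not in the original sample; TextToSpeech.stop() and shutdown() are standard Android APIs):

    @Override
    protected void onDestroy() {
        super.onDestroy();
        if (mPreview != null) {
            mPreview.release();
        }
        // Assumed addition: stop any in-progress speech and release the engine.
        if (tts != null) {
            tts.stop();
            tts.shutdown();
        }
    }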
OCR Reader
A text recognizer is created to find text. An associated processor instance is set to receive the text recognition results and maintain a graphic for each text block on screen.
In MainActivity.java:

    // Creates and starts the camera.
    @SuppressLint("InlinedApi")
    private void createCameraSource(boolean autoFocus, boolean useFlash) {
        Context context = getApplicationContext();

        TextRecognizer textRecognizer = new TextRecognizer.Builder(context).build();
        textRecognizer.setProcessor(new OcrDetectorProcessor(mGraphicOverlay));

        if (!textRecognizer.isOperational()) {
            Log.w(TAG, "Detector dependencies are not yet available.");
            // Check for low storage, a common reason the detector download fails.
            IntentFilter lowstorageFilter = new IntentFilter(Intent.ACTION_DEVICE_STORAGE_LOW);
            boolean hasLowStorage = registerReceiver(null, lowstorageFilter) != null;
            if (hasLowStorage) {
                Toast.makeText(this, R.string.low_storage_error, Toast.LENGTH_LONG).show();
                Log.w(TAG, getString(R.string.low_storage_error));
            }
        }

        // Creates and starts the camera.
        mCameraSource = new CameraSource.Builder(getApplicationContext(), textRecognizer)
                .setFacing(CameraSource.CAMERA_FACING_BACK)
                .setRequestedPreviewSize(1280, 1024)
                .setRequestedFps(2.0f)
                .setFlashMode(useFlash ? Camera.Parameters.FLASH_MODE_TORCH : null)
                .setFocusMode(autoFocus ? Camera.Parameters.FOCUS_MODE_CONTINUOUS_PICTURE : null)
                .build();
    }
OCR Reader
In MainActivity.java:

    @Override
    public void onRequestPermissionsResult(int requestCode,
                                           @NonNull String[] permissions,
                                           @NonNull int[] grantResults) {
        if (requestCode != RC_HANDLE_CAMERA_PERM) {
            Log.d(TAG, "Got unexpected permission result: " + requestCode);
            super.onRequestPermissionsResult(requestCode, permissions, grantResults);
            return;
        }

        if (grantResults.length != 0 && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
            Log.d(TAG, "Camera permission granted - initialize the camera source");
            // We have permission, so create the camera source.
            boolean autoFocus = getIntent().getBooleanExtra(AutoFocus, false);
            boolean useFlash = getIntent().getBooleanExtra(UseFlash, false);
            createCameraSource(autoFocus, useFlash);
            return;
        }

        DialogInterface.OnClickListener listener = new DialogInterface.OnClickListener() {
            public void onClick(DialogInterface dialog, int id) {
                finish();
            }
        };

        AlertDialog.Builder builder = new AlertDialog.Builder(this);
        builder.setTitle("Multitracker sample")
                .setMessage(R.string.no_camera_permission)
                .setPositiveButton(R.string.ok, listener)
                .show();
    }
OCR Reader
In MainActivity.java:

    /**
     * Starts or restarts the camera source, if it exists. If the camera source doesn't exist yet
     * (e.g., because onResume was called before the camera source was created), this will be called
     * again when the camera source is created.
     */
    private void startCameraSource() throws SecurityException {
        // Check that the device has play services available.
        int code = GoogleApiAvailability.getInstance().isGooglePlayServicesAvailable(
                getApplicationContext());
        if (code != ConnectionResult.SUCCESS) {
            Dialog dlg = GoogleApiAvailability.getInstance().getErrorDialog(this, code, RC_HANDLE_GMS);
            dlg.show();
        }

        if (mCameraSource != null) {
            try {
                mPreview.start(mCameraSource, mGraphicOverlay);
            } catch (IOException e) {
                Log.e(TAG, "Unable to start camera source.", e);
                mCameraSource.release();
                mCameraSource = null;
            }
        }
    }
OCR Reader
In MainActivity.java:

    /**
     * onTap is called to speak the tapped TextBlock, if any, out loud.
     *
     * @param rawX - the raw position of the tap
     * @param rawY - the raw position of the tap.
     * @return true if the tap was on a TextBlock
     */
    private boolean onTap(float rawX, float rawY) {
        OcrGraphic graphic = mGraphicOverlay.getGraphicAtLocation(rawX, rawY);
        TextBlock text = null;
        if (graphic != null) {
            text = graphic.getTextBlock();
            if (text != null && text.getValue() != null) {
                Log.d(TAG, "text data is being spoken! " + text.getValue());
                // Speak the string.
                tts.speak(text.getValue(), TextToSpeech.QUEUE_ADD, null, "DEFAULT");
            } else {
                Log.d(TAG, "text data is null");
            }
        } else {
            Log.d(TAG, "no text detected");
        }
        return text != null;
    }
OCR Reader
In MainActivity.java:

    private class CaptureGestureListener extends GestureDetector.SimpleOnGestureListener {
        @Override
        public boolean onSingleTapConfirmed(MotionEvent e) {
            return onTap(e.getRawX(), e.getRawY()) || super.onSingleTapConfirmed(e);
        }
    }

    private class ScaleListener implements ScaleGestureDetector.OnScaleGestureListener {
        // Responds to scaling events for a gesture in progress. Reported by pointer motion.
        @Override
        public boolean onScale(ScaleGestureDetector detector) {
            return false;
        }

        // Responds to the beginning of a scaling gesture. Reported by new pointers going down.
        @Override
        public boolean onScaleBegin(ScaleGestureDetector detector) {
            return true;
        }

        // Responds to the end of a scale gesture. Reported by existing pointers going up.
        @Override
        public void onScaleEnd(ScaleGestureDetector detector) {
            if (mCameraSource != null) {
                mCameraSource.doZoom(detector.getScaleFactor());
            }
        }
    }
} // end of MainActivity
ML Kit: Machine Learning for Mobile Developers
https://developers.google.com/ml-kit/
Image labeling
Text recognition
Face detection
Barcode scanning
Landmark detection
ML Kit: Machine Learning for Mobile Developers
You will need to register a Google account using your existing email (any email will be fine)
Log in to the Firebase console and go to the ML Kit panel
ML Kit: Machine Learning for Mobile Developers
For every app you build, you will have to register it with Firebase
After registration, you will be asked to download a JSON file, google-services.json, to be placed into your project for compilation
Add Firebase to your Android project. Add to the root-level build.gradle file:

    buildscript {
        // ...
        dependencies {
            // ...
            classpath 'com.google.gms:google-services:4.0.1' // google-services plugin
        }
    }

    allprojects {
        // ...
        repositories {
            // ...
            google() // Google's Maven repository
        }
    }
ML Kit: Machine Learning for Mobile Developers
Add Firebase to your Android project
Then, in your module Gradle file (usually app/build.gradle), add the apply plugin line at the bottom of the file to enable the Gradle plugin
You should also add the dependencies for the Firebase SDKs you want to use. We recommend starting with com.google.firebase:firebase-core, which provides Google Analytics for Firebase functionality
Available Firebase libraries: https://firebase.google.com/docs/android/setup#available_libraries

    dependencies {
        // ...
        implementation 'com.google.firebase:firebase-core:16.0.1'
        // Getting a "Could not find" error? Make sure you have
        // added the Google maven repository to your root build.gradle
    }

    // ADD THIS AT THE BOTTOM
    apply plugin: 'com.google.gms.google-services'
ML Kit: Image Labeling
https://firebase.google.com/docs/ml-kit/label-images
With ML Kit's image labeling APIs, you can recognize entities in an image without having to provide any additional contextual metadata, using either an on-device API or a cloud-based API
Image labeling gives you insight into the content of images
When you use the API, you get a list of the entities that were recognized: people, things, places, activities, etc.
Each label found comes with a score that indicates the confidence the ML model has in its relevance
With this information, you can perform tasks such as automatic metadata generation and content moderation
ML Kit: Image Labeling
The device-based API supports 400+ labels
ML Kit: Image Labeling
The cloud-based API supports 10,000+ labels; a sketch of invoking it follows this slide
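For reference, a hedged sketch of invoking the cloud-based labeler with the same firebase-ml-vision library. The class names (FirebaseVisionCloudDetectorOptions, FirebaseVisionCloudLabelDetector, FirebaseVisionCloudLabel) are from the contemporaneous Firebase API; treat the exact signatures as assumptions, and note that the cloud APIs require a billing-enabled Firebase project:

    // Sketch: cloud-based labeling; detectInImage() is asynchronous, like the on-device API.
    FirebaseVisionCloudDetectorOptions options =
            new FirebaseVisionCloudDetectorOptions.Builder()
                    .setMaxResults(15)   // assumed cap on returned labels
                    .build();
    FirebaseVisionCloudLabelDetector detector =
            FirebaseVision.getInstance().getVisionCloudLabelDetector(options);
    detector.detectInImage(image)
            .addOnSuccessListener(new OnSuccessListener<List<FirebaseVisionCloudLabel>>() {
                @Override
                public void onSuccess(List<FirebaseVisionCloudLabel> labels) {
                    // Each label carries getLabel(), getEntityId(), and getConfidence().
                }
            })
            .addOnFailureListener(new OnFailureListener() {
                @Override
                public void onFailure(@NonNull Exception e) {
                    // Task failed with an exception.
                }
            });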
ML Kit: Google Knowledge Graph entity IDs
In addition to the text description of each label, ML Kit also returns the label's Google Knowledge Graph entity ID
This ID is a string that uniquely identifies the entity represented by the label, and is the same ID used by the Knowledge Graph Search API https://developers.google.com/knowledge-graph/
You can use this string to identify an entity across languages, and independently of the formatting of the text description; a sketch follows this slide
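A minimal sketch of putting the entity ID to use, keying counts on the ID rather than the display text so that results stay stable across languages and formatting (the map and its use are illustrative assumptions; this also needs import java.util.HashMap; and import java.util.Map;):

    // Sketch: aggregate label hits by the stable entity ID instead of display text.
    Map<String, Integer> entityCounts = new HashMap<>();
    for (FirebaseVisionLabel label : labels) {
        String entityId = label.getEntityId();  // an opaque string of the form "/m/..."
        if (entityId != null) {                 // the ID may be unavailable for some labels
            Integer prev = entityCounts.get(entityId);
            entityCounts.put(entityId, prev == null ? 1 : prev + 1);
        }
    }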
Label Images with ML Kit on Android
Configuration in Android Studio
In the project-level build.gradle, add:

    dependencies {
        classpath 'com.android.tools.build:gradle:3.1.3'
        classpath 'com.google.gms:google-services:4.0.1'
    }

In the app-level build.gradle, add:

    dependencies {
        implementation fileTree(dir: 'libs', include: ['*.jar'])
        implementation 'com.android.support:appcompat-v7:27.1.1'
        implementation 'com.android.support.constraint:constraint-layout:1.1.2'
        implementation 'com.google.firebase:firebase-core:16.0.1'
        implementation 'com.google.firebase:firebase-ml-vision:16.0.0'
        implementation 'com.google.firebase:firebase-ml-vision-image-label-model:15.0.0'
        testImplementation 'junit:junit:4.12'
        androidTestImplementation 'com.android.support.test:runner:1.0.2'
        androidTestImplementation 'com.android.support.test.espresso:espresso-core:3.0.2'
    }

    apply plugin: 'com.google.gms.google-services'
Label Images with ML Kit on Android
Configuration in Android Studio
If you use the on-device API, configure your app to automatically download the ML model to the device after your app is installed from the Play Store
To do so, add the following declaration to your app's AndroidManifest.xml file:

    <application ...>
        ...
        <meta-data
            android:name="com.google.firebase.ml.vision.DEPENDENCIES"
            android:value="label" />
        <!-- To use multiple models: android:value="label,model2,model3" -->
    </application>

If you do not enable install-time model downloads, the model will be downloaded the first time you run the on-device detector. Requests you make before the download has completed will produce no results.
Register with Firebase
Go to https://console.firebase.google.com/u/0/project/mlkit-aa998/overview
Click the “Add another app” link, then “Add Firebase to your Android app”
Register with Firebase
Then, follow the steps as prompted
On-device image labeling
Configure the image labeler
By default, the on-device image labeler returns at most 10 labels for an image. If you want to change this setting, create a FirebaseVisionLabelDetectorOptions object as in the following example:

    FirebaseVisionLabelDetectorOptions options =
            new FirebaseVisionLabelDetectorOptions.Builder()
                    .setConfidenceThreshold(0.8f)
                    .build();

Run the image labeler
Create a FirebaseVisionImage object from a Bitmap object:

    FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);

You can also create a FirebaseVisionImage from a file, from a ByteBuffer or byte array, or from a media.Image taken by the device's camera; sketches of those variants follow this slide.
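Hedged sketches of the other factory methods just mentioned (the width, height, and rotation values are placeholder assumptions; fromFilePath() throws IOException; a similar fromByteArray() variant takes a byte[] plus the same metadata):

    // Sketch: build a FirebaseVisionImage from a file URI.
    FirebaseVisionImage fromFile = FirebaseVisionImage.fromFilePath(context, uri);

    // Sketch: build one from a ByteBuffer of NV21 camera data plus metadata.
    FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
            .setWidth(1280)                                           // assumed frame width
            .setHeight(720)                                           // assumed frame height
            .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
            .setRotation(FirebaseVisionImageMetadata.ROTATION_90)     // assumed rotation
            .build();
    FirebaseVisionImage fromBuffer = FirebaseVisionImage.fromByteBuffer(buffer, metadata);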
On-device image labeling
Run the image labeler
Get an instance of FirebaseVisionLabelDetector:

    FirebaseVisionLabelDetector detector =
            FirebaseVision.getInstance().getVisionLabelDetector();
    // Or, to set the minimum confidence required:
    FirebaseVisionLabelDetector detector =
            FirebaseVision.getInstance().getVisionLabelDetector(options);

Finally, pass the image to the detectInImage method:

    Task<List<FirebaseVisionLabel>> result = detector.detectInImage(image)
            .addOnSuccessListener(new OnSuccessListener<List<FirebaseVisionLabel>>() {
                @Override
                public void onSuccess(List<FirebaseVisionLabel> labels) {
                    // Task completed successfully
                    // ...
                }
            })
            .addOnFailureListener(new OnFailureListener() {
                @Override
                public void onFailure(@NonNull Exception e) {
                    // Task failed with an exception
                    // ...
                }
            });
On-device image labeling
Get information about labeled objects
If the image labeling operation succeeds, a list of FirebaseVisionLabel objects will be passed to the success listener
Each FirebaseVisionLabel object represents something that was labeled in the image. For each label, you can get the label's text description, its Knowledge Graph entity ID (if available), and the confidence score of the match:

    for (FirebaseVisionLabel label : labels) {
        String text = label.getLabel();
        String entityId = label.getEntityId();
        float confidence = label.getConfidence();
    }
A complete on-device image labeling app
Preparation: find an interesting image and copy it to res/mipmap (the example below uses a resource named windows, referenced as R.mipmap.windows)
A complete on-device image labeling app
Set up the layout for the activity as shown here:

    <?xml version="1.0" encoding="utf-8"?>
    <RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
        xmlns:tools="http://schemas.android.com/tools"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        tools:context=".MainActivity">

        <Button
            android:id="@+id/btnLabelImage"
            android:text="Label Image"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:layout_centerHorizontal="true" />

        <ImageSwitcher
            android:id="@+id/imageSwitcher"
            android:layout_width="match_parent"
            android:layout_height="250dp"
            android:layout_alignParentStart="true"
            android:layout_alignParentTop="true"
            android:layout_marginTop="50dp" />

        <TextView
            android:id="@+id/txtLabels"
            android:layout_width="match_parent"
            android:layout_height="172dp"
            android:layout_alignParentBottom="true"
            android:layout_alignParentEnd="true"
            android:layout_below="@+id/imageSwitcher"
            android:layout_centerHorizontal="true"
            android:layout_marginTop="5dp"
            android:textColor="#ffffff"
            android:background="#000000" />
    </RelativeLayout>
A complete on-device image labeling app
Populate MainActivity:

    import android.support.v7.app.AppCompatActivity;
    import android.os.Bundle;
    import android.graphics.BitmapFactory;
    import android.graphics.Bitmap;
    import java.util.List;
    import com.google.android.gms.tasks.OnSuccessListener;
    import com.google.android.gms.tasks.OnFailureListener;
    import android.support.annotation.NonNull;
    import android.widget.ImageSwitcher;
    import android.support.v7.app.ActionBar;
    import android.widget.TextView;
    import android.widget.ViewSwitcher;
    import android.widget.ImageView;
    import android.view.View;
    import android.widget.Button;
    import com.google.android.gms.tasks.Task;
    import com.google.firebase.ml.vision.FirebaseVision;
    import com.google.firebase.ml.vision.common.FirebaseVisionImage;
    import com.google.firebase.ml.vision.label.FirebaseVisionLabel;
    import com.google.firebase.ml.vision.label.FirebaseVisionLabelDetector;
A complete on-device image labeling app
Populate MainActivity:

    public class MainActivity extends AppCompatActivity {
        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_main);

            ImageSwitcher imgSwitcher = (ImageSwitcher) findViewById(R.id.imageSwitcher);
            imgSwitcher.setFactory(new ViewSwitcher.ViewFactory() {
                @Override
                public View makeView() {
                    ImageView myView = new ImageView(getApplicationContext());
                    myView.setScaleType(ImageView.ScaleType.FIT_CENTER);
                    myView.setLayoutParams(new ImageSwitcher.LayoutParams(
                            ActionBar.LayoutParams.WRAP_CONTENT,
                            ActionBar.LayoutParams.WRAP_CONTENT));
                    return myView;
                }
            });
            imgSwitcher.setImageResource(R.mipmap.windows);

            Button btnLabelImage = (Button) findViewById(R.id.btnLabelImage);
Populate MainActivity (continued):

            btnLabelImage.setOnClickListener(new View.OnClickListener() {
                @Override
                public void onClick(View v) {
                    Bitmap bm = BitmapFactory.decodeResource(getResources(), R.mipmap.windows);
                    FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bm);
                    FirebaseVisionLabelDetector detector =
                            FirebaseVision.getInstance().getVisionLabelDetector();
                    Task<List<FirebaseVisionLabel>> result = detector.detectInImage(image)
                            .addOnSuccessListener(new OnSuccessListener<List<FirebaseVisionLabel>>() {
                                @Override
                                public void onSuccess(List<FirebaseVisionLabel> labels) {
                                    // Task completed successfully
                                    for (FirebaseVisionLabel label : labels) {
                                        String text = label.getLabel();
                                        String entityId = label.getEntityId();
                                        float confidence = label.getConfidence();
                                        TextView tv = (TextView) findViewById(R.id.txtLabels);
                                        tv.append(text + "; entity id=" + entityId
                                                + "; confidence=" + confidence + "\n");
                                    }
                                }
                            })
                            .addOnFailureListener(new OnFailureListener() {
                                @Override
                                public void onFailure(@NonNull Exception e) {
                                    // Task failed with an exception
                                    // ...
                                }
                            });
                }
            });
        }
    }
Homework #25: OCR Reader
Add the following feature to the OCR Reader app: open/create a file in a public folder in onCreate() in MainActivity.java, and append each TextBlock that is tapped on to the file
Much more challenging: develop a business card reader app that is capable of reading the card owner's name, affiliation, email, and homepage, and storing such information in a database (the SQLite database in Android). You may also allow the app user to enter information regarding the location and occasion when the card was obtained.