# TensorFlow Lite Android image classification example
This document walks through the code of a simple Android mobile application that
demonstrates
[image classification](https://www.tensorflow.org/lite/models/image_classification/overview)
using the device camera.
## Explore the code
We're now going to walk through the most important parts of the sample code.
### Get camera input
This mobile application gets the camera input using the functions defined in the
file
[`CameraActivity.java`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/app/src/main/java/org/tensorflow/lite/examples/classification/CameraActivity.java).
This file depends on
[`AndroidManifest.xml`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/app/src/main/AndroidManifest.xml)
to set the camera orientation.
`CameraActivity` also contains code to capture user preferences from the UI and
make them available to other classes via convenience methods.
```java
model = Model.valueOf(modelSpinner.getSelectedItem().toString().toUpperCase());
device = Device.valueOf(deviceSpinner.getSelectedItem().toString());
numThreads = Integer.parseInt(threadsTextView.getText().toString().trim());
```
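Both `valueOf` and `Integer.parseInt` throw on unexpected input, so a defensive variant may be worth considering. The following is a minimal sketch in plain Java, not the app's code: the `Model` and `Device` constants and the `PreferenceParser` helper are hypothetical stand-ins, and falling back to a default is just one possible policy.

```java
import java.util.Locale;

/** Hypothetical stand-ins for the app's enums; the real constants may differ. */
enum Model { FLOAT, QUANTIZED }

enum Device { CPU, GPU, NNAPI }

/** Illustrative helper (not part of the sample app) that parses UI strings defensively. */
final class PreferenceParser {
  private PreferenceParser() {}

  static Model parseModel(String text, Model fallback) {
    try {
      // Locale.ROOT avoids locale-specific upper-casing surprises.
      return Model.valueOf(text.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException | NullPointerException e) {
      return fallback;
    }
  }

  static Device parseDevice(String text, Device fallback) {
    try {
      return Device.valueOf(text.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException | NullPointerException e) {
      return fallback;
    }
  }

  static int parseThreads(String text, int fallback) {
    try {
      int n = Integer.parseInt(text.trim());
      // A thread count below 1 is treated as invalid in this sketch.
      return n > 0 ? n : fallback;
    } catch (NumberFormatException | NullPointerException e) {
      return fallback;
    }
  }
}
```

With this helper, `PreferenceParser.parseThreads("abc", 1)` returns the fallback value `1` instead of throwing.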
### Classifier
This Image Classification Android reference app demonstrates two implementation
solutions,
[`lib_task_api`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/lib_task_api)
that leverages the out-of-box API from the
[TensorFlow Lite Task Library](https://www.tensorflow.org/lite/inference_with_metadata/task_library/image_classifier),
and
[`lib_support`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/lib_support)
that creates the custom inference pipeline using the
[TensorFlow Lite Support Library](https://www.tensorflow.org/lite/inference_with_metadata/lite_support).
Both solutions implement the file `Classifier.java` (see
[the one in lib_task_api](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/lib_task_api/src/main/java/org/tensorflow/lite/examples/classification/tflite/Classifier.java)
and
[the one in lib_support](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/lib_support/src/main/java/org/tensorflow/lite/examples/classification/tflite/Classifier.java))
that contains most of the complex logic for processing the camera input and
running inference.
`Classifier` has two subclasses, `ClassifierFloatMobileNet.java` and
`ClassifierQuantizedMobileNet.java`, which contain the settings for
floating point and
[quantized](https://www.tensorflow.org/lite/performance/post_training_quantization)
models, respectively.
The `Classifier` class implements a static method, `create`, which is used to
instantiate the appropriate subclass based on the supplied model type (quantized
vs floating point).
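The static-factory idea can be sketched without any TensorFlow Lite dependency. Everything in the block below is a simplified assumption made for illustration: `ModelType`, `SketchClassifier`, its two subclasses, and the placeholder file names are not the sample's actual types or assets.

```java
/** Hypothetical model-type enum; the sample app's actual constants may differ. */
enum ModelType { FLOAT, QUANTIZED }

/** Simplified stand-in for the app's Classifier base class. */
abstract class SketchClassifier {
  /** Static factory that hides the choice of subclass from callers. */
  static SketchClassifier create(ModelType type) {
    switch (type) {
      case QUANTIZED:
        return new QuantizedSketchClassifier();
      case FLOAT:
        return new FloatSketchClassifier();
      default:
        throw new UnsupportedOperationException("Unknown model type: " + type);
    }
  }

  /** Each subclass supplies its own model file name (placeholder names here). */
  abstract String getModelPath();
}

final class FloatSketchClassifier extends SketchClassifier {
  @Override
  String getModelPath() {
    return "float_model.tflite";
  }
}

final class QuantizedSketchClassifier extends SketchClassifier {
  @Override
  String getModelPath() {
    return "quantized_model.tflite";
  }
}
```

Callers depend only on the base type, so adding another model variant means adding a subclass and one `case` in the factory.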
#### Using the TensorFlow Lite Task Library
Inference can be done using just a few lines of code with the
[`ImageClassifier`](https://www.tensorflow.org/lite/inference_with_metadata/task_library/image_classifier)
in the TensorFlow Lite Task Library.
##### Load model and create ImageClassifier
`ImageClassifier` expects a model populated with the
[model metadata](https://www.tensorflow.org/lite/convert/metadata) and the label
file. See the
[model compatibility requirements](https://www.tensorflow.org/lite/inference_with_metadata/task_library/image_classifier#model_compatibility_requirements)
for more details.
`ImageClassifierOptions` configures various inference options, such as the
maximum number of top-scored results to return
(`setMaxResults(MAX_RESULTS)`) and the score threshold
(`setScoreThreshold(scoreThreshold)`).
```java
// Create the ImageClassifier instance.
ImageClassifierOptions options =
    ImageClassifierOptions.builder().setMaxResults(MAX_RESULTS).build();
imageClassifier = ImageClassifier.createFromFileAndOptions(activity,
    getModelPath(), options);
```
`ImageClassifier` currently does not support configuring delegates or
multithreading, but both are on our roadmap. Please stay tuned!
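To make the effect of the two options concrete, here is a plain-Java sketch of the filtering they describe, assuming they behave as their names suggest (drop results below the threshold, then keep the best `maxResults`). `ScoredLabel` and `ResultFilter` are hypothetical names for this illustration and are not Task Library classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a scored label; not the Task Library's Category class. */
final class ScoredLabel {
  final String label;
  final float score;

  ScoredLabel(String label, float score) {
    this.label = label;
    this.score = score;
  }
}

final class ResultFilter {
  private ResultFilter() {}

  /** Keeps entries scoring at least the threshold, best first, at most maxResults of them. */
  static List<ScoredLabel> filter(List<ScoredLabel> all, int maxResults, float threshold) {
    List<ScoredLabel> kept = new ArrayList<>();
    for (ScoredLabel s : all) {
      if (s.score >= threshold) {
        kept.add(s);
      }
    }
    // Sort by descending score.
    kept.sort(Comparator.comparingDouble((ScoredLabel s) -> s.score).reversed());
    int limit = Math.max(0, Math.min(maxResults, kept.size()));
    return new ArrayList<>(kept.subList(0, limit));
  }
}
```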
##### Run inference
`ImageClassifier` contains built-in logic to preprocess the input image, such as
rotating and resizing it. Processing options can be configured through
`ImageProcessingOptions`. In the following example, input images are rotated to
the upright orientation and cropped to the center, as the model expects a square
input (`224x224`). See the
[Java doc of `ImageProcessingOptions`](https://github.com/tensorflow/tflite-support/blob/195b574f0aa9856c618b3f1ad87bd185cddeb657/tensorflow_lite_support/java/src/java/org/tensorflow/lite/task/core/vision/ImageProcessingOptions.java#L22)
for more details about how the underlying image processing is performed.
```java
TensorImage inputImage = TensorImage.fromBitmap(bitmap);
int width = bitmap.getWidth();
int height = bitmap.getHeight();
int cropSize = Math.min(width, height);
ImageProcessingOptions imageOptions =
    ImageProcessingOptions.builder()
        .setOrientation(getOrientation(sensorOrientation))
        // Set the ROI to the center of the image.
        .setRoi(
            new Rect(
                /*left=*/ (width - cropSize) / 2,
                /*top=*/ (height - cropSize) / 2,
                /*right=*/ (width + cropSize) / 2,
                /*bottom=*/ (height + cropSize) / 2))
        .build();
List<Classifications> results = imageClassifier.classify(inputImage,
    imageOptions);
```
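The region-of-interest arithmetic above can be checked in isolation. The sketch below reproduces only that calculation in plain Java; `Roi` is a hypothetical stand-in for `android.graphics.Rect` so the example runs off-device.

```java
/** Plain-Java stand-in for android.graphics.Rect, used only for this sketch. */
final class Roi {
  final int left;
  final int top;
  final int right;
  final int bottom;

  Roi(int left, int top, int right, int bottom) {
    this.left = left;
    this.top = top;
    this.right = right;
    this.bottom = bottom;
  }
}

final class CenterCrop {
  private CenterCrop() {}

  /** Returns the largest centered square that fits inside a width x height image. */
  static Roi centerSquare(int width, int height) {
    int cropSize = Math.min(width, height);
    return new Roi(
        (width - cropSize) / 2,
        (height - cropSize) / 2,
        (width + cropSize) / 2,
        (height + cropSize) / 2);
  }
}
```

For a 640x480 frame, `CenterCrop.centerSquare(640, 480)` gives left 80, top 0, right 560, bottom 480, which is a 480x480 square.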
The output of `ImageClassifier` is a list of `Classifications` instances, where
each `Classifications` element is the result of a single classification head.
All the demo models are single-head models, so `results` contains only one
`Classifications` object. Use `Classifications.getCategories()` to get a list of
top-k categories as specified with `MAX_RESULTS`. Each `Category` object
contains the string label and the score of that category.
To match the implementation of
[`lib_support`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/lib_support),
`results` is converted into a `List<Recognition>` in the `getRecognitions`
method.
#### Using the TensorFlow Lite Support Library
##### Load model and create interpreter
To perform inference, we need to load a model file and instantiate an
`Interpreter`. This happens in the constructor of the `Classifier` class, along
with loading the list of class labels. Information about the device type and
number of threads is used to configure the `Interpreter` via the
`Interpreter.Options` instance passed into its constructor. Note that if a GPU,
DSP (Digital Signal Processor) or NPU (Neural Processing Unit) is available, a
[`Delegate`](https://www.tensorflow.org/lite/performance/delegates) can be used
to take full advantage of this hardware.
Please note that there are performance edge cases, and developers are advised to
test with a representative set of devices prior to production.
```java
protected Classifier(Activity activity, Device device, int numThreads) throws
    IOException {
  tfliteModel = FileUtil.loadMappedFile(activity, getModelPath());
  switch (device) {
    case NNAPI:
      nnApiDelegate = new NnApiDelegate();
      tfliteOptions.addDelegate(nnApiDelegate);
      break;
    case GPU:
      gpuDelegate = new GpuDelegate();
      tfliteOptions.addDelegate(gpuDelegate);
      break;
    case CPU:
      break;
  }
  tfliteOptions.setNumThreads(numThreads);
  tflite = new Interpreter(tfliteModel, tfliteOptions);
  labels = FileUtil.loadLabels(activity, getLabelPath());
...
```
For Android devices, we recommend pre-loading and memory mapping the model file
to offer faster load times and reduce the dirty pages in memory. The method
`FileUtil.loadMappedFile` does this, returning a `MappedByteBuffer` containing
the model.
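For readers unfamiliar with memory mapping, the sketch below shows the general technique with the standard `java.nio` API. It is not the implementation of `FileUtil.loadMappedFile`, which works on Android assets; `MappedModelLoader` is a hypothetical helper that assumes the model is an ordinary file on disk.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper that memory-maps a regular file read-only. */
final class MappedModelLoader {
  private MappedModelLoader() {}

  static MappedByteBuffer map(Path modelFile) throws IOException {
    try (FileChannel channel = FileChannel.open(modelFile, StandardOpenOption.READ)) {
      // The mapping remains valid after the channel is closed.
      return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
    }
  }
}
```

Because the file contents are paged in on demand rather than copied onto the Java heap, the operating system can share and evict those pages as needed.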
The `MappedByteBuffer` is passed into the `Interpreter` constructor, along with
an `Interpreter.Options` object. This object can be used to configure the
interpreter, for example by setting the number of threads (`.setNumThreads(1)`)
or enabling [NNAPI](https://developer.android.com/ndk/guides/neuralnetworks)
(`.addDelegate(nnApiDelegate)`).
##### Pre-process bitmap image
Next, the `Classifier` takes the input camera bitmap image, converts it to a
`TensorImage` for efficient processing, and pre-processes it. The steps are
shown in the private `loadImage` method:
```java
/** Loads input image, and applies preprocessing. */
private TensorImage loadImage(final Bitmap bitmap, int sensorOrientation) {
  // Loads bitmap into a TensorImage.
  inputImageBuffer.load(bitmap);
  // Creates processor for the TensorImage.
  int cropSize = Math.min(bitmap.getWidth(), bitmap.getHeight());
  int numRotation = sensorOrientation / 90;
  ImageProcessor imageProcessor =
      new ImageProcessor.Builder()
          .add(new ResizeWithCropOrPadOp(cropSize, cropSize))
          .add(new ResizeOp(imageSizeX, imageSizeY, ResizeMethod.BILINEAR))
          .add(new Rot90Op(numRotation))
          .add(getPreprocessNormalizeOp())
          .build();
  return imageProcessor.process(inputImageBuffer);
}
```
The pre-processing is largely the same for quantized and float models with one
exception: Normalization.
In `ClassifierFloatMobileNet`, the normalization parameters are defined as:
```java
private static final float IMAGE_MEAN = 127.5f;
private static final float IMAGE_STD = 127.5f;
```
In `ClassifierQuantizedMobileNet`, normalization is not required. Thus the
normalization parameters are defined as:
```java
private static final float IMAGE_MEAN = 0.0f;
private static final float IMAGE_STD = 1.0f;
```
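The effect of these constants follows from the normalization formula `(value - mean) / std`. The sketch below applies that formula in plain Java; `PixelNormalizer` is a hypothetical helper for illustration and does not use the Support Library.

```java
/** Hypothetical helper that applies (value - mean) / std to raw channel values. */
final class PixelNormalizer {
  private PixelNormalizer() {}

  static float[] normalize(int[] channelValues, float mean, float std) {
    float[] out = new float[channelValues.length];
    for (int i = 0; i < channelValues.length; i++) {
      out[i] = (channelValues[i] - mean) / std;
    }
    return out;
  }
}
```

With a mean and standard deviation of 127.5, channel values in [0, 255] map to [-1, 1]. With a mean of 0 and a standard deviation of 1, the values pass through unchanged, which is why these settings amount to no normalization for the quantized model.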
##### Allocate output object
Allocate the `TensorBuffer` that will hold the output of the model.
```java
/** Output probability TensorBuffer. */
private final TensorBuffer outputProbabilityBuffer;
//...
// Get the array size for the output buffer from the TensorFlow Lite model file
int probabilityTensorIndex = 0;
int[] probabilityShape =
    tflite.getOutputTensor(probabilityTensorIndex).shape(); // {1, 1001}
DataType probabilityDataType =
    tflite.getOutputTensor(probabilityTensorIndex).dataType();
// Creates the output tensor and its processor.
outputProbabilityBuffer =
    TensorBuffer.createFixedSize(probabilityShape, probabilityDataType);
// Creates the post processor for the output probability.
probabilityProcessor =
    new TensorProcessor.Builder().add(getPostprocessNormalizeOp()).build();
```
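As a rough picture of the sizing arithmetic only (not of how `TensorBuffer` is implemented), the number of elements is the product of the shape's dimensions, and the byte size depends on the element type. `OutputBufferSketch` and its `bytesPerElement` parameter are assumptions made for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper illustrating how a tensor shape translates into a buffer size. */
final class OutputBufferSketch {
  private OutputBufferSketch() {}

  /** Multiplies all dimensions, e.g. {1, 1001} gives 1001 elements. */
  static int elementCount(int[] shape) {
    int count = 1;
    for (int dim : shape) {
      count *= dim;
    }
    return count;
  }

  /** Allocates a direct buffer: 4 bytes per element for float32, 1 for uint8. */
  static ByteBuffer allocate(int[] shape, int bytesPerElement) {
    return ByteBuffer.allocateDirect(elementCount(shape) * bytesPerElement)
        .order(ByteOrder.nativeOrder());
  }
}
```

For the `{1, 1001}` shape noted above, a float output needs 4004 bytes and a quantized (one byte per element) output needs 1001 bytes.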
For quantized models, we need to de-quantize the prediction with `NormalizeOp`
(both operations are essentially linear transformations). For float models,
de-quantization is not required, but to keep the API uniform it is applied to
float models too, with mean and std set to 0.0f and 1.0f, respectively. More
specifically:
In `ClassifierQuantizedMobileNet`, the normalization parameters are defined as:
```java
private static final float PROBABILITY_MEAN = 0.0f;
private static final float PROBABILITY_STD = 255.0f;
```
In `ClassifierFloatMobileNet`, the normalization parameters are defined as:
```java
private static final float PROBABILITY_MEAN = 0.0f;
private static final float PROBABILITY_STD = 1.0f;
```
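To see why a standard deviation of 255 de-quantizes the output, consider the sketch below, which applies `(value - mean) / std` to raw unsigned 8-bit scores. `Dequantizer` is a hypothetical plain-Java helper, and the assumption is that the quantized model emits one unsigned byte per class.

```java
/** Hypothetical helper that turns raw uint8 scores into floats in [0, 1]. */
final class Dequantizer {
  private Dequantizer() {}

  static float[] dequantize(byte[] raw, float mean, float std) {
    float[] out = new float[raw.length];
    for (int i = 0; i < raw.length; i++) {
      // Java bytes are signed, so mask to recover the 0..255 value.
      int unsigned = raw[i] & 0xFF;
      out[i] = (unsigned - mean) / std;
    }
    return out;
  }
}
```

With a mean of 0 and a standard deviation of 255, a raw score of 255 becomes 1.0 and a raw score of 0 stays 0.0.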
##### Run inference
Inference is performed using the following call in the `Classifier` class:
```java
tflite.run(inputImageBuffer.getBuffer(),
outputProbabilityBuffer.getBuffer().rewind());
```
##### Recognize image
Rather than calling `run` directly, callers use the method `recognizeImage`. It
accepts a bitmap and sensor orientation, runs inference, and returns a sorted
`List` of `Recognition` instances, each corresponding to a label. The method
returns at most `MAX_RESULTS` results, which is 3 by default.
`Recognition` is a simple class that contains information about a specific
recognition result, including its `title` and `confidence`. Using the
post-processing normalization method specified, the confidence is converted to a
value between 0 and 1 that represents the probability of the image showing the
given class.
```java
/** Gets the label to probability map. */
Map<String, Float> labeledProbability =
    new TensorLabel(labels,
        probabilityProcessor.process(outputProbabilityBuffer))
        .getMapWithFloatValue();
```
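Conceptually, this step pairs the i-th label with the i-th probability. The plain-Java sketch below illustrates that pairing under the assumption that labels and scores are index-aligned; `LabelMapper` is a hypothetical helper, not the `TensorLabel` implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that pairs index-aligned labels and probabilities. */
final class LabelMapper {
  private LabelMapper() {}

  static Map<String, Float> toMap(List<String> labels, float[] probabilities) {
    if (labels.size() != probabilities.length) {
      throw new IllegalArgumentException(
          "Expected " + labels.size() + " probabilities but got " + probabilities.length);
    }
    // LinkedHashMap keeps the label order of the input list.
    Map<String, Float> map = new LinkedHashMap<>();
    for (int i = 0; i < probabilities.length; i++) {
      map.put(labels.get(i), probabilities[i]);
    }
    return map;
  }
}
```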
A `PriorityQueue` is used for sorting.
```java
/** Gets the top-k results. */
private static List<Recognition> getTopKProbability(
    Map<String, Float> labelProb) {
  // Find the best classifications.
  PriorityQueue<Recognition> pq =
      new PriorityQueue<>(
          MAX_RESULTS,
          new Comparator<Recognition>() {
            @Override
            public int compare(Recognition lhs, Recognition rhs) {
              // Intentionally reversed to put high confidence at the head of
              // the queue.
              return Float.compare(rhs.getConfidence(), lhs.getConfidence());
            }
          });
  for (Map.Entry<String, Float> entry : labelProb.entrySet()) {
    pq.add(new Recognition("" + entry.getKey(), entry.getKey(),
        entry.getValue(), null));
  }
  final ArrayList<Recognition> recognitions = new ArrayList<>();
  int recognitionsSize = Math.min(pq.size(), MAX_RESULTS);
  for (int i = 0; i < recognitionsSize; ++i) {
    recognitions.add(pq.poll());
  }
  return recognitions;
}
```
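The method above queues every label and then polls the best few. A common alternative, shown here only as a sketch and not as what the sample does, keeps a min-heap bounded to `k` entries so the queue never grows beyond the number of results needed. `TopK` is a hypothetical helper that works on map entries rather than `Recognition` objects.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

/** Hypothetical helper: top-k selection with a bounded min-heap. */
final class TopK {
  private TopK() {}

  static List<Map.Entry<String, Float>> topK(Map<String, Float> labelProb, int k) {
    if (k <= 0) {
      return new ArrayList<>();
    }
    // Min-heap by score: the weakest retained entry sits at the head.
    PriorityQueue<Map.Entry<String, Float>> heap =
        new PriorityQueue<>(k, Map.Entry.<String, Float>comparingByValue());
    for (Map.Entry<String, Float> e : labelProb.entrySet()) {
      heap.offer(new AbstractMap.SimpleImmutableEntry<String, Float>(e));
      if (heap.size() > k) {
        heap.poll(); // Evict the current weakest entry.
      }
    }
    List<Map.Entry<String, Float>> result = new ArrayList<>(heap);
    // Present the survivors best first.
    result.sort(Map.Entry.<String, Float>comparingByValue().reversed());
    return result;
  }
}
```

For roughly a thousand labels and `k` of 3 the difference is small, so the simpler approach in the sample is a reasonable choice.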
### Display results
The classifier is invoked and inference results are displayed by the
`processImage()` function in
[`ClassifierActivity.java`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/app/src/main/java/org/tensorflow/lite/examples/classification/ClassifierActivity.java).
`ClassifierActivity` is a subclass of `CameraActivity` that contains method
implementations that render the camera image, run classification, and display
the results. The method `processImage()` runs classification on a background
thread as fast as possible and posts the results to the UI thread for rendering,
so that drawing does not block inference or add latency.
```java
@Override
protected void processImage() {
  rgbFrameBitmap.setPixels(getRgbBytes(), 0, previewWidth, 0, 0, previewWidth,
      previewHeight);
  final int imageSizeX = classifier.getImageSizeX();
  final int imageSizeY = classifier.getImageSizeY();
  runInBackground(
      new Runnable() {
        @Override
        public void run() {
          if (classifier != null) {
            final long startTime = SystemClock.uptimeMillis();
            final List<Classifier.Recognition> results =
                classifier.recognizeImage(rgbFrameBitmap, sensorOrientation);
            lastProcessingTimeMs = SystemClock.uptimeMillis() - startTime;
            LOGGER.v("Detect: %s", results);
            runOnUiThread(
                new Runnable() {
                  @Override
                  public void run() {
                    showResultsInBottomSheet(results);
                    showFrameInfo(previewWidth + "x" + previewHeight);
                    showCropInfo(imageSizeX + "x" + imageSizeY);
                    showCameraResolution(imageSizeX + "x" + imageSizeY);
                    showRotationInfo(String.valueOf(sensorOrientation));
                    showInference(lastProcessingTimeMs + "ms");
                  }
                });
          }
          readyForNextImage();
        }
      });
}
```
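The threading pattern can be sketched in plain Java without Android classes. The block below assumes a policy of dropping any frame that arrives while the previous one is still being processed, which is an assumption made for this illustration; `FrameGate` is a hypothetical class, and resetting its flag stands in for the role that `readyForNextImage()` plays above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical gate that runs one inference at a time on a background thread. */
final class FrameGate {
  private final ExecutorService background = Executors.newSingleThreadExecutor();
  private final AtomicBoolean processing = new AtomicBoolean(false);

  /** Returns false (frame dropped) if a previous frame is still in flight. */
  boolean trySubmit(Runnable inference) {
    if (!processing.compareAndSet(false, true)) {
      return false;
    }
    try {
      background.execute(
          () -> {
            try {
              inference.run();
            } finally {
              // Signals that the next frame may be accepted.
              processing.set(false);
            }
          });
    } catch (RejectedExecutionException e) {
      processing.set(false);
      throw e;
    }
    return true;
  }

  /** Stops accepting work and waits for the in-flight frame to finish. */
  boolean shutdownAndWait(long timeoutMillis) throws InterruptedException {
    background.shutdown();
    return background.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
  }
}
```

Dropping frames keeps the displayed result close to what the camera currently sees, at the cost of not classifying every frame.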
Another important role of `ClassifierActivity` is to determine user preferences
(by interrogating `CameraActivity`), and instantiate the appropriately
configured `Classifier` subclass. This happens when the video feed begins (via
`onPreviewSizeChosen()`) and when options are changed in the UI (via
`onInferenceConfigurationChanged()`).
```java
private void recreateClassifier(Model model, Device device, int numThreads) {
  if (classifier != null) {
    LOGGER.d("Closing classifier.");
    classifier.close();
    classifier = null;
  }
  if (device == Device.GPU && model == Model.QUANTIZED) {
    LOGGER.d("Not creating classifier: GPU doesn't support quantized models.");
    runOnUiThread(
        () -> {
          Toast.makeText(this, "GPU does not yet support quantized models.",
                  Toast.LENGTH_LONG)
              .show();
        });
    return;
  }
  try {
    LOGGER.d(
        "Creating classifier (model=%s, device=%s, numThreads=%d)", model,
        device, numThreads);
    classifier = Classifier.create(this, model, device, numThreads);
  } catch (IOException e) {
    LOGGER.e(e, "Failed to create classifier.");
  }
}
```