PyCoral API Examples¶
A self-contained port of the pycoral examples, intended for learning the PyCoral API. It covers 7 examples that run on a single Edge TPU.
| Example | Description | Key APIs |
|---|---|---|
| classify_image | Image classification | classify.get_classes, common.input_size, common.set_input |
| detect_image | Object detection | detect.get_objects, common.set_resized_input |
| semantic_segmentation | Semantic segmentation | segment.get_output, common.set_resized_input |
| movenet_pose_estimation | Pose estimation (17 keypoints) | common.output_tensor, common.set_input |
| small_object_detection | Tile-based small object detection + NMS | detect.get_objects, common.set_resized_input |
| backprop_last_layer | Transfer learning (last layer backprop) | SoftmaxRegression, classify.get_scores |
| imprinting_learning | Transfer learning (weight imprinting) | ImprintingEngine, classify.get_classes |
Prerequisites¶
- CMake 3.16 or later
- Internet connection (only for initial model/test data download)
- pycoral / tflite-runtime installed on Coral Dev Board (pre-installed on Mendel Linux)
- `python3-opencv` installed on the board when using the `--display` flag with image-output examples
Build, Deploy, and Run¶
Download Models and Test Data¶
Downloaded files are saved to build/pycoral-examples/data/.
Deploy to Board¶
Scripts and data are transferred to /home/mendel/work/pycoral-examples/ on the board.
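The transfer is driven by the `deploy` target (the same command listed in the Target List table):

```shell
# Copy scripts and downloaded model/test data to the board
cmake --build build/pycoral-examples --target deploy
```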
Run Examples (ctest)¶
```sh
# Run all examples (excluding transfer learning)
ctest --test-dir build/pycoral-examples -V -E "(backprop|imprinting)"

# Classification only
ctest --test-dir build/pycoral-examples -V -R classify

# Detection examples only
ctest --test-dir build/pycoral-examples -V -L detect
```
Running Transfer Learning¶
Transfer learning examples require downloading training data to the board first.
```sh
# For backprop_last_layer
cmake --build build/pycoral-examples --target download-flower-photos
ctest --test-dir build/pycoral-examples -V -R backprop

# For imprinting_learning
cmake --build build/pycoral-examples --target download-imprinting-data
ctest --test-dir build/pycoral-examples -V -R imprinting
```
Example Walkthrough¶
Image Classification¶
Classifies a bird image with MobileNet V2 and outputs the top-k class labels and scores.
Source code: classify_image.py
APIs used: edgetpu.make_interpreter, common.input_size, common.input_details, common.set_input, classify.get_classes, dataset.read_label_file
Processing flow:
1. `read_label_file()` loads the label file
2. `make_interpreter()` creates an interpreter with the Edge TPU delegate
3. `common.input_size()` gets the model's input size; PIL resizes the image
4. Check quantization parameters to determine whether preprocessing is needed
5. `common.set_input()` sets data to the input tensor
6. `interpreter.invoke()` runs inference (the 1st run includes model load)
7. `classify.get_classes()` retrieves classification results
```python
# 1. Load labels
labels = read_label_file(args.labels) if args.labels else {}

# 2. Create Edge TPU interpreter
interpreter = make_interpreter(args.model)
interpreter.allocate_tensors()

# 3. Get input size and resize image
if common.input_details(interpreter, 'dtype') != np.uint8:
    raise ValueError('Only support uint8 input type.')
size = common.input_size(interpreter)
image = Image.open(args.input).convert('RGB').resize(size, Image.LANCZOS)

# 4. Check quantization parameters
params = common.input_details(interpreter, 'quantization_parameters')
scale = params['scales']
zero_point = params['zero_points']

# 5. Set input tensor (determine if preprocessing is needed)
if abs(scale * 128.0 - 1) < 1e-5 and abs(128.0 - zero_point) < 1e-5:
    common.set_input(interpreter, image)  # No preprocessing needed
else:
    normalized = (np.asarray(image) - 128.0) / (128.0 * scale) + zero_point
    np.clip(normalized, 0, 255, out=normalized)
    common.set_input(interpreter, normalized.astype(np.uint8))

# 6. Run inference
interpreter.invoke()

# 7. Get classification results
classes = classify.get_classes(interpreter, args.top_k, args.threshold)
for c in classes:
    print(f'{labels.get(c.id, c.id)}: {c.score:.5f}')
```
Object Detection¶
Detects objects in an image and draws bounding boxes with labels.
Source code: detect_image.py
APIs used: edgetpu.make_interpreter, common.set_resized_input, detect.get_objects, dataset.read_label_file
Processing flow:
1. `read_label_file()` loads the label file
2. `make_interpreter()` creates the interpreter
3. `common.set_resized_input()` resizes the image to the model input size and returns the scale factor
4. `interpreter.invoke()` runs inference
5. `detect.get_objects()` retrieves detection results above the score threshold
6. OpenCV draws bounding boxes
```python
# 1. Load labels
labels = read_label_file(args.labels) if args.labels else {}

# 2. Create Edge TPU interpreter
interpreter = make_interpreter(args.model)
interpreter.allocate_tensors()

# 3. set_resized_input handles resize and scale calculation in one call
image = Image.open(args.input)
_, scale = common.set_resized_input(
    interpreter, image.size,
    lambda size: image.resize(size, Image.LANCZOS))

# 4. Run inference
interpreter.invoke()

# 5. Get detection results
objs = detect.get_objects(interpreter, args.threshold, scale)
for obj in objs:
    print(f'{labels.get(obj.id, obj.id)}')
    print(f'  id:    {obj.id}')
    print(f'  score: {obj.score}')
    print(f'  bbox:  {obj.bbox}')

# 6. Draw bounding boxes with OpenCV
result = cv2.cvtColor(np.array(image.convert('RGB')), cv2.COLOR_RGB2BGR)
for obj in objs:
    bbox = obj.bbox
    cv2.rectangle(result, (bbox.xmin, bbox.ymin), (bbox.xmax, bbox.ymax),
                  (0, 0, 255), 2)
    text = f'{labels.get(obj.id, obj.id)} {obj.score:.2f}'
    cv2.putText(result, text, (bbox.xmin + 4, bbox.ymin + 16),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1)
```
Use `--output` to save the result image and `--display` to show it on an HDMI display.
Semantic Segmentation¶
Uses DeepLab V3 to estimate per-pixel classes and visualize with PASCAL VOC colormap.
Source code: semantic_segmentation.py
APIs used: edgetpu.make_interpreter, common.input_size, common.set_resized_input, common.set_input, segment.get_output
Processing flow:
1. `make_interpreter()` creates the interpreter
2. `common.input_size()` gets the model input size
3. With `--keep_aspect_ratio`, `set_resized_input()` maintains the aspect ratio; otherwise `set_input()` is used
4. `interpreter.invoke()` runs inference
5. `segment.get_output()` retrieves segmentation results
6. The PASCAL VOC colormap assigns a color to each class
7. The input image and mask are concatenated side-by-side
```python
# 1. Create Edge TPU interpreter
interpreter = make_interpreter(args.model, device=':0')
interpreter.allocate_tensors()

# 2. Get input size
width, height = common.input_size(interpreter)

# 3. Resize and set input image
img = Image.open(args.input)
if args.keep_aspect_ratio:
    resized_img, _ = common.set_resized_input(
        interpreter, img.size,
        lambda size: img.resize(size, Image.LANCZOS))
else:
    resized_img = img.resize((width, height), Image.LANCZOS)
    common.set_input(interpreter, resized_img)

# 4. Run inference
interpreter.invoke()

# 5. Get segmentation results
result = segment.get_output(interpreter)
if len(result.shape) == 3:
    result = np.argmax(result, axis=-1)

# 6. Apply PASCAL VOC colormap
new_width, new_height = resized_img.size
result = result[:new_height, :new_width]
colormap = create_pascal_label_colormap()
mask = colormap[result]

# 7. Concatenate input image and mask side-by-side
resized_bgr = cv2.cvtColor(np.array(resized_img.convert('RGB')),
                           cv2.COLOR_RGB2BGR)
mask_bgr = cv2.cvtColor(mask, cv2.COLOR_RGB2BGR)
output_image = np.hstack([resized_bgr, mask_bgr])
```
Use `--output` to save the result image and `--display` to show it on an HDMI display.
Pose Estimation¶
Estimates 17 human body keypoints using MoveNet and draws the skeleton.
Source code: movenet_pose_estimation.py
APIs used: edgetpu.make_interpreter, common.input_size, common.set_input, common.output_tensor
Processing flow:
1. `make_interpreter()` creates the interpreter
2. `common.input_size()` gets the input size; the image is resized with PIL
3. `common.set_input()` sets the resized image
4. `interpreter.invoke()` runs inference
5. `common.output_tensor()` retrieves the output tensor (17 x 3: y, x, score)
6. Keypoints and skeleton connections are drawn with OpenCV
```python
# 1. Create Edge TPU interpreter
interpreter = make_interpreter(args.model)
interpreter.allocate_tensors()

# 2-3. Get input size, resize, and set input
img = Image.open(args.input)
resized_img = img.resize(common.input_size(interpreter), Image.LANCZOS)
common.set_input(interpreter, resized_img)

# 4. Run inference
interpreter.invoke()

# 5. Get output tensor (17 keypoints x [y, x, score])
pose = common.output_tensor(interpreter, 0).copy().reshape(17, 3)
# pose[i] = [y, x, score] (normalized coordinates 0.0-1.0)

# 6. Draw skeleton
_SKELETON = [
    (0, 1), (0, 2), (1, 3), (2, 4),          # head
    (5, 6),                                  # shoulders
    (5, 7), (7, 9), (6, 8), (8, 10),         # arms
    (5, 11), (6, 12),                        # torso
    (11, 12),                                # hips
    (11, 13), (13, 15), (12, 14), (14, 16),  # legs
]
result = cv2.cvtColor(np.array(img.convert('RGB')), cv2.COLOR_RGB2BGR)
h, w = result.shape[:2]
for i, j in _SKELETON:
    if pose[i][2] > 0.2 and pose[j][2] > 0.2:
        y1, x1 = int(pose[i][0] * h), int(pose[i][1] * w)
        y2, x2 = int(pose[j][0] * h), int(pose[j][1] * w)
        cv2.line(result, (x1, y1), (x2, y2), (0, 255, 0), 2)
for i in range(17):
    if pose[i][2] > 0.2:
        y, x = int(pose[i][0] * h), int(pose[i][1] * w)
        cv2.circle(result, (x, y), 4, (0, 0, 255), -1)
```
17 keypoints: nose, left_eye, right_eye, left_ear, right_ear, left_shoulder, right_shoulder, left_elbow, right_elbow, left_wrist, right_wrist, left_hip, right_hip, left_knee, right_knee, left_ankle, right_ankle
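For programmatic lookups, the keypoint order above can be captured in a list. This is a small helper sketch, not part of the example script; the names `KEYPOINT_NAMES` and `name_of` are illustrative:

```python
# MoveNet keypoint order; index i corresponds to pose[i] in the example above.
KEYPOINT_NAMES = [
    'nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear',
    'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow',
    'left_wrist', 'right_wrist', 'left_hip', 'right_hip',
    'left_knee', 'right_knee', 'left_ankle', 'right_ankle',
]

def name_of(index: int) -> str:
    """Map a keypoint index (0-16) to its name."""
    return KEYPOINT_NAMES[index]
```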
Use `--output` to save the result image and `--display` to show it on an HDMI display.
Tile-based Small Object Detection¶
Splits a large image into tiles at multiple sizes, runs detection on each tile, then applies NMS to remove duplicates. A no-NMS model (`_no_nms`) combined with custom post-processing yields roughly a 2x speedup.
Source code: small_object_detection.py
APIs used: edgetpu.make_interpreter, common.set_resized_input, detect.get_objects, dataset.read_label_file
Processing flow:
1. `make_interpreter()` creates the interpreter; `read_label_file()` loads labels
2. Loop over multiple tile sizes specified by `--tile_sizes`
3. Crop each tile and feed it to the model via `set_resized_input()`
4. `detect.get_objects()` retrieves detection results
5. Remap detection bounding boxes from tile coordinates to original-image coordinates
6. Apply NMS (non-maximum suppression) per label
```python
# 1. Create interpreter and load labels
interpreter = make_interpreter(args.model)
interpreter.allocate_tensors()
labels = read_label_file(args.labels) if args.labels else {}

img = Image.open(args.input).convert('RGB')
img_size = img.size
tile_sizes = [
    tuple(map(int, ts.split('x')))
    for ts in args.tile_sizes.split(',')
]

# 2-5. Detect per tile, remap BBox to original image coordinates
objects_by_label = {}
for tile_size in tile_sizes:
    for tile_location in tiles_location_gen(img_size, tile_size,
                                            args.tile_overlap):
        tile = img.crop(tile_location)
        _, scale = common.set_resized_input(
            interpreter, tile.size,
            lambda size, t=tile: t.resize(size, Image.NEAREST))
        interpreter.invoke()
        objs = detect.get_objects(interpreter, args.score_threshold, scale)

        for obj in objs:
            bbox = [obj.bbox.xmin, obj.bbox.ymin,
                    obj.bbox.xmax, obj.bbox.ymax]
            # Remap tile coordinates to original image coordinates
            bbox[0] += tile_location[0]
            bbox[1] += tile_location[1]
            bbox[2] += tile_location[0]
            bbox[3] += tile_location[1]
            label = labels.get(obj.id, '')
            objects_by_label.setdefault(label, []).append(
                Object(label, obj.score, bbox))

# 6. Apply NMS per label
all_objects = []
for label, objects in objects_by_label.items():
    idxs = non_max_suppression(objects, args.iou_threshold)
    for idx in idxs:
        all_objects.append(objects[idx])
Use `--output` to save the result image and `--display` to show it on an HDMI display.
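The helpers `tiles_location_gen()` and `non_max_suppression()` are defined inside small_object_detection.py and not shown above. A minimal sketch of the same ideas follows; the details (step arithmetic, greedy IoU loop) are illustrative, not the script's exact code:

```python
import numpy as np

def tiles_location_gen(img_size, tile_size, overlap):
    """Yield (xmin, ymin, xmax, ymax) crop boxes covering the image,
    stepping by tile size minus the overlap in pixels."""
    img_w, img_h = img_size
    tile_w, tile_h = tile_size
    x_step = max(1, tile_w - overlap)
    y_step = max(1, tile_h - overlap)
    for ymin in range(0, img_h, y_step):
        for xmin in range(0, img_w, x_step):
            yield (xmin, ymin,
                   min(xmin + tile_w, img_w),
                   min(ymin + tile_h, img_h))

def non_max_suppression(objects, threshold):
    """Greedy IoU-based NMS; returns indices of the kept objects."""
    boxes = np.array([o.bbox for o in objects], dtype=np.float32)
    scores = np.array([o.score for o in objects], dtype=np.float32)
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the top box with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = ((boxes[order[1:], 2] - boxes[order[1:], 0]) *
                 (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + areas - inter)
        # Drop boxes that overlap the kept box too much
        order = order[1:][iou <= threshold]
    return keep
```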
Transfer Learning (Last Layer Backprop)¶
Extracts features with an embedding extractor and trains the last layer using SoftmaxRegression. Uses the flower photos dataset.
Source code: backprop_last_layer.py
APIs used: edgetpu.make_interpreter, common.input_size, common.set_input, classify.get_scores, classify.num_classes, SoftmaxRegression
Processing flow:
1. `get_image_paths()` extracts image paths and labels from the subdirectory structure
2. `shuffle_and_split()` splits the data into train / val / test sets
3. `make_interpreter()` creates the embedding-extractor interpreter
4. Feature vectors are extracted for all images using `classify.get_scores()`
5. The last layer is trained with `SoftmaxRegression` using SGD (500 iterations)
6. `model.serialize_model()` combines the trained weights with the original model and saves it
7. Accuracy is verified on the test data with the saved model
```python
# 1-2. Load data and split
image_paths, labels, label_map = get_image_paths(data_dir)
train_and_val, test_data = shuffle_and_split(image_paths, labels)

# 3. Create embedding extractor interpreter
interpreter = make_interpreter(model_path, device=':0')
interpreter.allocate_tensors()

# 4. Extract feature vectors
input_size = common.input_size(interpreter)
feature_dim = classify.num_classes(interpreter)
embeddings = np.empty((len(image_paths), feature_dim), dtype=np.float32)
for idx, path in enumerate(image_paths):
    with Image.open(path) as img:
        img = img.convert('RGB').resize(input_size, Image.NEAREST)
        common.set_input(interpreter, img)
        interpreter.invoke()
        embeddings[idx, :] = classify.get_scores(interpreter)

# 5. Train last layer with SoftmaxRegression
feature_dim = train_and_val['data_train'].shape[1]
num_classes = np.max(train_and_val['labels_train']) + 1
model = SoftmaxRegression(
    feature_dim, num_classes, weight_scale=5e-2, reg=0.0)
model.train_with_sgd(
    train_and_val, num_iter=500, learning_rate=1e-2, batch_size=100)

# 6. Combine trained weights with original model and save
out_model_path = os.path.join(output_dir, 'retrained_model_edgetpu.tflite')
with open(out_model_path, 'wb') as f:
    f.write(model.serialize_model(model_path))

# 7. Verify accuracy on test data
retrained_interpreter = make_interpreter(out_model_path, device=':0')
retrained_interpreter.allocate_tensors()
test_embeddings = extract_embeddings(
    test_data['data_test'], retrained_interpreter)
accuracy = np.mean(
    np.argmax(test_embeddings, axis=1) == test_data['labels_test'])
```
Training data: download to the board via the `download-flower-photos` target (`flower_photos`: 5 classes x ~700 images)
Transfer Learning (Weight Imprinting)¶
Adds new classes via weight imprinting. Fast learning without backpropagation.
Source code: imprinting_learning.py
APIs used: ImprintingEngine, edgetpu.make_interpreter, common.input_size, common.set_input, classify.get_scores, classify.get_classes
Processing flow:
1. `ImprintingEngine` generates an extractor from the model
2. `make_interpreter()` creates the extractor interpreter
3. `read_data()` splits the dataset into train / test sets
4. Feature vectors are extracted from the training images with `classify.get_scores()` and weights are updated via `engine.train()`
5. `engine.serialize_model()` saves the trained model
6. Top-1 through Top-5 accuracy is evaluated with `classify.get_classes()` on the saved model
```python
# 1. Generate extractor from model via ImprintingEngine
engine = ImprintingEngine(args.model_path, keep_classes=args.keep_classes)

# 2. Create extractor interpreter
extractor = make_interpreter(
    engine.serialize_extractor_model(), device=':0')
extractor.allocate_tensors()
shape = common.input_size(extractor)

# 3. Split dataset
train_set, test_set = read_data(args.data, args.test_ratio)

# 4. Extract features and update weights
num_classes = engine.num_classes
for class_id, (category, image_list) in enumerate(train_set.items()):
    images = prepare_images(
        image_list, os.path.join(args.data, category), shape)
    for tensor in images:
        common.set_input(extractor, tensor)
        extractor.invoke()
        embedding = classify.get_scores(extractor)
        engine.train(embedding, class_id=num_classes + class_id)

# 5. Save trained model
with open(args.output, 'wb') as f:
    f.write(engine.serialize_model())

# 6. Evaluate accuracy with saved model
interpreter = make_interpreter(args.output)
interpreter.allocate_tensors()
size = common.input_size(interpreter)
for category, image_list in test_set.items():
    for img_name in image_list:
        img = Image.open(os.path.join(args.data, category, img_name))
        img = img.resize(size, Image.NEAREST)
        common.set_input(interpreter, img)
        interpreter.invoke()
        candidates = classify.get_classes(
            interpreter, top_k=5, score_threshold=0.1)
```
Training data: download to the board via the `download-imprinting-data` target (`open_image_v4_subset`: 10 categories x 20 images)
CMake Targets / ctest List¶
Configuration¶
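A typical configure step might look like the following. The `-S .` source path and the example IP address are assumptions; `CORAL_IP` is the connection variable documented in Connection Settings below and may be left unset for mdt auto-detection:

```shell
# Configure the build tree for the pycoral examples (paths are illustrative)
cmake -S . -B build/pycoral-examples -DCORAL_IP=192.168.0.10
```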
Target List¶
| Target | Command | Description |
|---|---|---|
| `deploy` | `cmake --build build/pycoral-examples --target deploy` | Transfer scripts + data to board |
| `download-flower-photos` | `cmake --build build/pycoral-examples --target download-flower-photos` | Download flower_photos dataset on board |
| `download-imprinting-data` | `cmake --build build/pycoral-examples --target download-imprinting-data` | Download imprinting dataset on board |
ctest List¶
| Test Name | Labels | Description |
|---|---|---|
| `classify-image` | edgetpu, classify | Image classification |
| `detect-image` | edgetpu, detect | Object detection |
| `semantic-segmentation` | edgetpu, segmentation | Semantic segmentation |
| `movenet-pose` | edgetpu, pose | Pose estimation |
| `small-object-detection` | edgetpu, detect | Tile-based small object detection |
| `backprop-last-layer` | edgetpu, transfer-learning | Transfer learning (backprop) |
| `imprinting-learning` | edgetpu, transfer-learning | Transfer learning (imprinting) |
Connection Settings¶
| Variable | Default | Description |
|---|---|---|
| `CORAL_IP` | (empty = auto-detect via mdt) | Coral Dev Board IP address |
Results¶
classify_image¶
| Item | Value |
|---|---|
| Model | mobilenet_v2_1.0_224_inat_bird_quant_edgetpu |
| 1st inference (includes model load) | 14.70 ms |
| 2nd inference | 3.37 ms |
| Label | Score |
|---|---|
| Ara macao (Scarlet Macaw) | 0.75781 |
| Platycercus elegans (Crimson Rosella) | 0.07422 |
| Coracias caudatus (Lilac-breasted Roller) | 0.01562 |
detect_image¶
| Item | Value |
|---|---|
| Model | ssd_mobilenet_v2_coco_quant_postprocess_edgetpu |
| 1st inference (includes model load) | 36.96 ms |
| 2nd inference | 12.58 ms |
| Label | Score | BBox |
|---|---|---|
| tie | 0.840 | (227, 419, 292, 541) |
| person | 0.805 | (2, 4, 513, 595) |
semantic_segmentation¶
| Item | Value |
|---|---|
| Model | deeplabv3_mnv2_pascal_quant_edgetpu |
| 1st inference (includes model load) | 223.14 ms |
| 2nd inference | 220.26 ms |
movenet_pose_estimation¶
| Item | Value |
|---|---|
| Model | movenet_single_pose_lightning_ptq_edgetpu |
| 1st inference (includes model load) | 29.12 ms |
| 2nd inference | 26.90 ms |
| Keypoint | Coordinates (x, y) | Score |
|---|---|---|
| nose | (0.578, 0.332) | 0.500 |
| left_eye | (0.590, 0.320) | 0.635 |
| right_eye | (0.565, 0.311) | 0.701 |
| left_shoulder | (0.578, 0.422) | 0.635 |
| right_shoulder | (0.418, 0.414) | 0.500 |
| left_hip | (0.389, 0.598) | 0.754 |
| right_hip | (0.287, 0.615) | 0.430 |
| left_ankle | (0.512, 0.848) | 0.635 |
| right_ankle | (0.340, 0.889) | 0.701 |
small_object_detection¶
| Item | Value |
|---|---|
| Model | ssd_mobilenet_v2_coco_quant_no_nms_edgetpu |
| Tile sizes | 1352x900, 500x500, 250x250 |
| Detection time | 922.44 ms |
| Detected | 18 objects (kite: 7, person: 10, backpack: 1) |
backprop_last_layer¶
| Item | Value |
|---|---|
| Model | mobilenet_v1_1.0_224_quant_embedding_extractor_edgetpu |
| Dataset | flower_photos (5 classes x ~700 images) |
| Preprocessing time | 38.48 sec |
| Training time | 7.64 sec |
| Total time | 50.34 sec |
| Test accuracy | 89.65% |
imprinting_learning¶
| Item | Value |
|---|---|
| Model | mobilenet_v1_1.0_224_l2norm_quant_edgetpu |
| Dataset | open_image_v4_subset (10 categories x 20 images) |
| Training time | 36576.60 ms |
| Top-k | Accuracy |
|---|---|
| Top 1 | 98% |
| Top 2 | 100% |
| Top 3 | 100% |
| Top 4 | 100% |
| Top 5 | 100% |