Structured Parsing of DAMO-YOLO Phone Detection Results: Python Examples for Extracting Coordinates, Confidence, and Class

张建站

2026/7/5 5:23:20

10 min read

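1. Background and value

In real-world phone detection applications, we often need structured data that goes beyond a visualization. In a security monitoring system, for example, it is not enough to see a box drawn around a phone; the system also needs to record each detected phone's position coordinates, confidence score, and timestamp for later analysis and alerting.

The DAMO-YOLO phone detection system provides a friendly web interface, but its real engineering value lies in obtaining and processing detection results programmatically. This article shows how to parse DAMO-YOLO's detection output in Python and extract the key coordinate, confidence, and class information.

Why structured parsing is needed:

- Data logging: detection results need to be saved to a database or file
- Downstream processing: further analysis and decisions build on the detection results
- System integration: the detection capability is embedded in a larger application
- Performance monitoring: metrics such as detection accuracy and false-positive rate are tracked

2. Environment setup and quick start

2.1 Installing the required dependencies

Before writing the parsing code, make sure the following Python libraries are installed:

    pip install torch torchvision
    pip install opencv-python
    pip install Pillow
    pip install numpy

2.2 Obtaining the detection results

DAMO-YOLO detection results usually contain the following core information:

- Bounding box coordinates (xmin, ymin, xmax, ymax)
- Confidence score
- Class label
- Sometimes additional metadata

3. Core parsing code

3.1 Defining the basic data structure

First, we define a simple class to store a detection result:

    class DetectionResult:
        def __init__(self, bbox, confidence, class_name):
            """
            Initialize a detection result.

            Args:
                bbox: bounding box coordinates [xmin, ymin, xmax, ymax]
                confidence: confidence score (0-1)
                class_name: class name
            """
            self.bbox = bbox
            self.confidence = confidence
            self.class_name = class_name

        def __str__(self):
            return f"{self.class_name}: {self.confidence:.2f} at {self.bbox}"

3.2 Parsing the DAMO-YOLO output

Assume DAMO-YOLO returns a list of detection boxes, where each box is a dictionary:

    def parse_damo_yolo_output(detection_output, confidence_threshold=0.5):
        """
        Parse the DAMO-YOLO detection output.

        Args:
            detection_output: raw detection results returned by DAMO-YOLO
            confidence_threshold: detections below this confidence are filtered out

        Returns:
            List[DetectionResult]: the parsed detection results
        """
        results = []

        # detection_output is assumed to be a list of detection dictionaries
        for detection in detection_output:
            # Extract the bounding box coordinates [xmin, ymin, xmax, ymax]
            bbox = detection["bbox"]
            # Extract the confidence
            confidence = detection["confidence"]
            # Extract the class
            class_name = detection["class_name"]

            # Keep only detections at or above the threshold
            if confidence >= confidence_threshold:
                result = DetectionResult(bbox, confidence, class_name)
                results.append(result)

        return results

3.3 Handling a real application scenario

In practice we may need to handle various input and output formats. The following is a more complete example:

    def process_image_detection(image_path, confidence_threshold=0.5):
        """
        Full pipeline: load image -> run detection -> parse results.

        Args:
            image_path: path to the input image
            confidence_threshold: confidence threshold

        Returns:
            List[DetectionResult]: the parsed detection results
        """
        # 1. Load the image (cv2.imread returns None if the file cannot be read)
        import cv2
        image = cv2.imread(image_path)
        if image is None:
            raise FileNotFoundError(f"Could not read image: {image_path}")

        # 2. Run DAMO-YOLO detection; adjust this to the actual API.
        # run_detection() is a placeholder for the function that performs inference.
        raw_detections = run_detection(image)

        # 3. Parse the detection results
        results = parse_damo_yolo_output(raw_detections, confidence_threshold)

        # 4. Print the structured results
        print(f"Detected {len(results)} phone(s):")
        for i, result in enumerate(results, 1):
            print(f"Phone {i}: {result}")

        return results

4. Advanced parsing techniques

4.1 Handling different output formats

Different model versions or configurations may produce output in different formats. The following is a more robust parsing function:

    def robust_parse_detection(output_data, confidence_threshold=0.5):
        """
        Robust parser that handles several detection output formats.

        Args:
            output_data: detection output, possibly in one of several formats
            confidence_threshold: confidence threshold

        Returns:
            List[DetectionResult]: the parsed detection results
        """
        results = []

        # Handle list-style output
        if isinstance(output_data, list):
            for item in output_data:
                # Handle dictionary-style detections
                if isinstance(item, dict):
                    try:
                        # Try several key names for each field
                        bbox = item.get("bbox", item.get("box", item.get("bounding_box", [])))
                        confidence = item.get("confidence", item.get("conf", item.get("score", 0)))
                        class_name = item.get("class_name", item.get("class", item.get("label", "unknown")))

                        if confidence >= confidence_threshold and len(bbox) == 4:
                            results.append(DetectionResult(bbox, confidence, class_name))
                    except Exception as e:
                        print(f"Error while parsing a detection: {e}")
                        continue

        return results

4.2 Coordinate conversion and normalization

Coordinates in the detection results may be normalized or absolute, and need to be handled accordingly:

    def convert_bbox_coordinates(bbox, image_size, from_normalized=True):
        """
        Convert bounding box coordinates to absolute pixel coordinates.

        Args:
            bbox: bounding box [x1, y1, x2, y2]; boxes given as
                [x_center, y_center, width, height] must be converted
                to corner format before calling this function
            image_size: image size (width, height)
            from_normalized: whether the input coordinates are normalized (0-1)

        Returns:
            List: absolute coordinates [xmin, ymin, xmax, ymax]
        """
        img_width, img_height = image_size

        if from_normalized:
            # Normalized coordinates -> absolute coordinates
            xmin = int(bbox[0] * img_width)
            ymin = int(bbox[1] * img_height)
            xmax = int(bbox[2] * img_width)
            ymax = int(bbox[3] * img_height)
        else:
            # Already absolute coordinates
            xmin, ymin, xmax, ymax = map(int, bbox)

        return [xmin, ymin, xmax, ymax]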
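Section 4.2 mentions boxes given as [x_center, y_center, width, height], but the conversion function only scales corner-format boxes. The following is a minimal sketch of the missing step, assuming center-format input in the same units as the desired output. The names `center_to_corner` and `clip_bbox` are hypothetical helpers introduced here for illustration; they are not part of DAMO-YOLO.

```python
def center_to_corner(bbox):
    """Convert [x_center, y_center, width, height] to [xmin, ymin, xmax, ymax]."""
    x_center, y_center, width, height = bbox
    half_w = width / 2
    half_h = height / 2
    return [x_center - half_w, y_center - half_h,
            x_center + half_w, y_center + half_h]


def clip_bbox(bbox, image_size):
    """Clamp a corner-format box so it lies inside an image of (width, height)."""
    img_width, img_height = image_size
    xmin, ymin, xmax, ymax = bbox
    return [
        max(0, min(xmin, img_width)),
        max(0, min(ymin, img_height)),
        max(0, min(xmax, img_width)),
        max(0, min(ymax, img_height)),
    ]


if __name__ == "__main__":
    corner = center_to_corner([50, 40, 20, 10])
    print(corner)
    print(clip_bbox([-5, 10, 700, 300], (640, 480)))
```

Clipping is optional, but it guards against boxes that extend slightly past the image border, which would otherwise produce negative or out-of-range pixel coordinates downstream.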
5. Exporting and saving results

5.1 Exporting to JSON

    import json
    from datetime import datetime

    def export_to_json(detection_results, image_path, output_path):
        """
        Export detection results to a JSON file.

        Args:
            detection_results: list of detection results
            image_path: path to the original image
            output_path: path of the JSON file to write
        """
        # Build the export data structure
        export_data = {
            "timestamp": datetime.now().isoformat(),
            "image_path": image_path,
            "detection_count": len(detection_results),
            "detections": []
        }

        # Add each detection result
        for result in detection_results:
            detection_info = {
                "bbox": result.bbox,
                "confidence": result.confidence,
                "class_name": result.class_name,
                "bbox_center": [
                    (result.bbox[0] + result.bbox[2]) / 2,  # x_center
                    (result.bbox[1] + result.bbox[3]) / 2   # y_center
                ],
                "bbox_size": [
                    result.bbox[2] - result.bbox[0],  # width
                    result.bbox[3] - result.bbox[1]   # height
                ]
            }
            export_data["detections"].append(detection_info)

        # Save as a JSON file
        with open(output_path, "w", encoding="utf-8") as f:
            json.dump(export_data, f, indent=2, ensure_ascii=False)

        print(f"Detection results exported to: {output_path}")

5.2 Exporting to CSV

    import csv

    def export_to_csv(detection_results, image_path, output_path):
        """
        Export detection results to a CSV file.

        Args:
            detection_results: list of detection results
            image_path: path to the original image
            output_path: path of the CSV file to write
        """
        with open(output_path, "w", newline="", encoding="utf-8") as csvfile:
            fieldnames = ["image_path", "class_name", "confidence",
                          "xmin", "ymin", "xmax", "ymax", "width", "height"]
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
            writer.writeheader()

            for result in detection_results:
                width = result.bbox[2] - result.bbox[0]
                height = result.bbox[3] - result.bbox[1]

                writer.writerow({
                    "image_path": image_path,
                    "class_name": result.class_name,
                    "confidence": f"{result.confidence:.4f}",
                    "xmin": result.bbox[0],
                    "ymin": result.bbox[1],
                    "xmax": result.bbox[2],
                    "ymax": result.bbox[3],
                    "width": width,
                    "height": height
                })

        print(f"Detection results exported to: {output_path}")
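Exported files are only useful if they can be read back for later analysis. The sketch below assumes a JSON file in the layout described in section 5.1 (a top-level "detections" list whose entries carry "bbox", "confidence", and "class_name"). The function name `load_detections` and its `min_confidence` parameter are hypothetical, introduced here for illustration.

```python
import json


def load_detections(json_path, min_confidence=0.0):
    """Read a section 5.1 style JSON file and return (bbox, confidence, class_name) tuples."""
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)

    rows = []
    # A missing "detections" key is treated as an empty result set
    for det in data.get("detections", []):
        if det["confidence"] >= min_confidence:
            rows.append((det["bbox"], det["confidence"], det["class_name"]))
    return rows
```

Returning plain tuples keeps the loader independent of any particular result class; the caller can wrap them in its own objects if needed.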
6. Complete usage example

6.1 End-to-end example code

    # Complete usage example
    def complete_example():
        # Image path
        image_path = "example_image.jpg"

        # Run detection and parse the results
        results = process_image_detection(image_path, confidence_threshold=0.5)

        # Export the results
        json_output = "detection_results.json"
        csv_output = "detection_results.csv"
        export_to_json(results, image_path, json_output)
        export_to_csv(results, image_path, csv_output)

        # Print statistics
        print("\n=== Detection statistics ===")
        print(f"Total detections: {len(results)}")
        if results:
            confidences = [r.confidence for r in results]
            print(f"Mean confidence: {sum(confidences)/len(confidences):.2%}")
            print(f"Highest confidence: {max(confidences):.2%}")
            print(f"Lowest confidence: {min(confidences):.2%}")

        return results

    # Run the example
    if __name__ == "__main__":
        detection_results = complete_example()

6.2 Extending to real-world scenarios

    # Batch-process multiple images
    def batch_process_images(image_paths, output_dir):
        """
        Batch-process multiple images.

        Args:
            image_paths: list of image paths
            output_dir: output directory
        """
        import os

        # Make sure the output directory exists
        os.makedirs(output_dir, exist_ok=True)

        all_results = []
        for image_path in image_paths:
            print(f"Processing image: {image_path}")

            # Run detection
            results = process_image_detection(image_path)
            all_results.extend(results)

            # Build the output file names
            base_name = os.path.splitext(os.path.basename(image_path))[0]
            json_output = os.path.join(output_dir, f"{base_name}_detection.json")
            csv_output = os.path.join(output_dir, f"{base_name}_detection.csv")

            # Export the results
            export_to_json(results, image_path, json_output)
            export_to_csv(results, image_path, csv_output)

        return all_results

    # Process a video stream frame by frame
    def process_video_stream(video_path, output_callback=None):
        """
        Process every frame of a video stream.

        Args:
            video_path: video file path or camera index
            output_callback: callback that handles the detection results
        """
        import cv2

        cap = cv2.VideoCapture(video_path)
        frame_count = 0

        while True:
            ret, frame = cap.read()
            if not ret:
                break

            # Run detection
            # Note: frame_to_detection_input must be implemented for your model
            detection_input = frame_to_detection_input(frame)
            raw_detections = run_detection(detection_input)
            results = parse_damo_yolo_output(raw_detections)

            # Hand the results to the callback
            if output_callback:
                output_callback(frame_count, frame, results)

            frame_count += 1

        cap.release()
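The video function in section 6.2 passes (frame_count, frame, results) to `output_callback` but does not show what a callback looks like. The following is one possible sketch, assuming each result exposes `bbox`, `confidence`, and `class_name` attributes as `DetectionResult` does. The factory name `make_collecting_callback` and the record layout are hypothetical choices made for illustration.

```python
def make_collecting_callback(records, min_confidence=0.5):
    """Build a callback that appends one plain dict per confident detection to `records`."""

    def callback(frame_index, frame, results):
        # `frame` is accepted to match the callback signature but is not used here
        for r in results:
            if r.confidence >= min_confidence:
                records.append({
                    "frame": frame_index,
                    "class_name": r.class_name,
                    "confidence": float(r.confidence),
                    "bbox": [float(v) for v in r.bbox],
                })

    return callback
```

A caller would create an empty list, pass `make_collecting_callback(that_list)` as `output_callback`, and inspect or export the list once the stream ends. Converting values with `float` keeps the records serializable even if the detector returns non-native numeric types.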
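7. Summary

With the Python code in this article, you can parse the output of the DAMO-YOLO phone detection system and extract structured data, including bounding box coordinates, confidence scores, and class information. This data can be used for:

- Data logging and analysis: saving detection results to a database or file for later analysis
- System integration: embedding phone detection in a larger application
- Performance monitoring: tracking metrics such as detection accuracy and false-positive rate
- Real-time processing: handling phone detection on video streams

Key points:

- The DetectionResult class stores detection information in a structured form
- The parse_damo_yolo_output function parses the raw detection output
- JSON and CSV export cover different downstream needs
- Batch processing and video stream processing are included as extension examples

These examples can be modified and extended to fit the specific needs of your own phone detection project.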
