Thermal imaging offers unique advantages for automated agricultural inspection due to its robustness against lighting variations and surface texture inconsistencies. However, deploying deep learning models on thermal data under real-time and resource-constrained conditions remains challenging.
In this study, we benchmark four lightweight YOLO nano variants (YOLOv5n, YOLOv8n, YOLOv11n, and YOLOv12n) on a thermal image dataset for okra maturity grading. The models are evaluated across heterogeneous computing platforms, including GPU-accelerated inference with TensorRT and CPU-based inference with ONNX Runtime, with a focus on accuracy–latency trade-offs.
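As a rough illustration of how per-frame latency and throughput can be measured for such a comparison, the sketch below times an arbitrary inference callable. It is not the study's benchmarking code: the `benchmark` helper, its `warmup` and `runs` parameters, and the `dummy_infer` stand-in are all hypothetical names introduced here, and a real measurement would wrap a TensorRT or ONNX Runtime call in place of the dummy.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(infer: Callable[[], object], warmup: int = 10, runs: int = 100) -> Dict[str, float]:
    """Time a zero-argument inference callable and summarize the results."""
    # Warm-up calls are discarded so that one-time costs (lazy initialization,
    # cache fills) do not skew the measurements.
    for _ in range(warmup):
        infer()

    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.fmean(latencies_ms)
    return {
        "mean_ms": mean_ms,
        "median_ms": statistics.median(latencies_ms),
        # For sequential, batch-size-1 calls, throughput is the reciprocal
        # of the mean latency.
        "fps": 1000.0 / mean_ms,
    }


def dummy_infer() -> int:
    # Hypothetical stand-in for a real model call; replace with the
    # actual inference function of the runtime being measured.
    return sum(range(10_000))


stats = benchmark(dummy_infer, warmup=5, runs=50)
print(stats)
```

With sequential, single-image inference, a mean latency of 2 ms corresponds to 500 FPS, which is why latency and throughput are usually reported together. When timing GPU inference, the callable should block until the device has finished (asynchronous launches otherwise return early and understate latency).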
The results show that YOLOv8n achieves the best balance between detection accuracy and inference speed under short training budgets, delivering sub-2 ms GPU latency and throughput exceeding 600 FPS. YOLOv5n demonstrates superior CPU efficiency, making it well suited for edge and embedded deployments. Attention-based architectures achieve higher peak accuracy only when longer training durations are permitted.
This benchmark provides practical guidance for selecting lightweight object detection models for real-time thermal imaging systems and highlights architectural considerations for deploying AI in high-performance and edge-based agricultural applications.
🔗 Article link: https://link.springer.com/article/10.1007/s11227-026-08226-w