Deep Learning in the Smart Pig Farming: Improving Farrowing Pen Management Using Computer Vision


Student thesis: Doctoral Thesis


Detail(s)

Supervisors/Advisors
  • Weitao XU (Supervisor)
  • Kai LIU (Co-supervisor)
  • Chee Wei TAN (External person) (External Co-Supervisor)
Award date: 27 Jun 2023

Abstract

Smart pig farming has emerged with the help of recent technologies, including the Internet of Things, wearable sensors, and computer vision. Among these, computer vision is an attractive choice for smart pig farming due to its low cost and contactless operation. In smart pig farming, farrowing pens are critical enclosures where sows give birth and care for their piglets until weaning. When computer vision is applied to farrowing pens, however, occlusion problems frequently arise, especially from farrowing crates: metal stalls that confine a sow in place and prevent it from accidentally lying on its piglets, thereby reducing piglet mortality. These crates cause unavoidable visual occlusions, and the effect is amplified for small piglets, making them difficult to detect and monitor accurately. Such occlusions can lower accuracy or even cause a computer vision-based method to fail, limiting the use of computer vision in real-world applications. Deep learning, a powerful tool for analyzing images and videos, has great potential to support computer vision for automatic animal monitoring, even under occlusion. This thesis therefore investigates how deep learning can manage the occlusion problem when computer vision is used in farrowing pens, guided by four research questions: (1) How are deep learning methods affected by occlusion? (2) How can piglets be detected under occlusion? (3) How can occluded piglets be represented? And (4) how can the invisible parts of piglets under occlusion be recovered?

For the first question, the thesis assesses the applicability and capacity limits of deep learning methods on scenes of pigs in farrowing pens as a case study. Six state-of-the-art methods spanning three major computer vision tasks were evaluated: object detection (Faster Region-based Convolutional Neural Network (Faster R-CNN) and You Only Look Once v4 (Yolov4)), semantic segmentation (Fully Convolutional Networks (FCN) and Unet), and instance segmentation (Mask R-CNN and You Only Look At CoefficienTs++ (Yolact++)). To further challenge these methods, virtual occlusions were randomly and gradually added to the images in ten cases, testing each method's capacity limit under extreme occlusion. All six methods achieved satisfactory performance on the original baseline case, while performance fell as occlusion grew heavier. Object detection methods were less affected by occlusion than instance segmentation methods because of their coarser output. The semantic segmentation method Unet was least affected by occlusion, owing to its pixel-wise classification nature, while the instance segmentation method Mask R-CNN was affected the most.
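The virtual-occlusion protocol can be sketched as follows. This is a minimal illustration, not the thesis's exact procedure: the square patch shape, the patch count, and the gray fill value are assumptions chosen for clarity.

```python
import numpy as np

def add_virtual_occlusions(image, n_patches, patch_size, rng):
    """Paint n_patches random gray squares onto a copy of the image.

    A simplified stand-in for a virtual-occlusion test: as n_patches
    grows, more of the scene is hidden, and a detector or segmenter
    evaluated on the occluded images should show degrading accuracy.
    """
    occluded = image.copy()
    h, w = image.shape[:2]
    for _ in range(n_patches):
        y = int(rng.integers(0, max(1, h - patch_size)))
        x = int(rng.integers(0, max(1, w - patch_size)))
        occluded[y:y + patch_size, x:x + patch_size] = 127  # gray patch
    return occluded

rng = np.random.default_rng(0)
img = np.zeros((240, 320), dtype=np.uint8)  # placeholder for a pen image
for n in (0, 5, 10):
    occ = add_virtual_occlusions(img, n, 40, rng)
    print(f"{n} patches -> {(occ != img).mean():.2%} of pixels occluded")
```

In the thesis's setting, each occlusion level would be fed to the six trained networks and their metrics compared against the unoccluded baseline.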

For the second question, the thesis proposes detecting a piglet by its object center, which can be determined from its visible pixels. To this end, a two-stage center clustering network, CClusnet-Count, was proposed for detecting piglets under occlusion and applied to piglet counting. In the first stage, CClusnet-Count predicts a semantic segmentation map and a center offset vector map for each image. In the second stage, scattered center points are produced by combining the two maps, and the mean-shift algorithm is applied to determine the number of piglets and their object centers. CClusnet-Count achieved a mean absolute error of 0.43 per image for piglet counting, performing better than previous network architectures and outperforming existing counting methods. CClusnet-Count can be applied in similar settings with different types of occlusion and has the potential to achieve high accuracy in animal detection and monitoring.
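The two-stage vote-and-cluster idea can be illustrated with a toy example. Everything below is an illustrative assumption: in the thesis, a network learns the segmentation and offset maps, whereas here they are built by hand, and a minimal flat-kernel mean-shift stands in for the full algorithm.

```python
import numpy as np

def mean_shift_modes(points, bandwidth, n_iter=30):
    """Tiny flat-kernel mean-shift: move every point to the mean of its
    neighbours within the bandwidth, then merge nearby converged modes."""
    shifted = points.astype(float).copy()
    for _ in range(n_iter):
        for i, p in enumerate(shifted):
            near = points[np.linalg.norm(points - p, axis=1) < bandwidth]
            shifted[i] = near.mean(axis=0)
    modes = []
    for p in shifted:
        if not any(np.linalg.norm(p - m) < bandwidth for m in modes):
            modes.append(p)
    return np.array(modes)

# Stage 1 output (hand-built here): a binary segmentation map and a
# per-pixel offset field pointing from each piglet pixel to its centre.
seg = np.zeros((60, 60), dtype=bool)
seg[10:20, 10:20] = True                      # piglet A (partly visible)
seg[40:50, 35:45] = True                      # piglet B
centers = {0: np.array([14.5, 14.5]), 1: np.array([44.5, 39.5])}
ys, xs = np.nonzero(seg)
coords = np.stack([ys, xs], axis=1).astype(float)
which = np.where(coords[:, 0] < 30, 0, 1)     # ground-truth assignment
offsets = np.stack([centers[int(c)] for c in which]) - coords

# Stage 2: every visible pixel votes a centre; mean-shift groups the
# votes, so the number of modes is the piglet count.
votes = coords + offsets
modes = mean_shift_modes(votes, bandwidth=5.0)
print("piglet count:", len(modes))            # -> 2
```

Because every visible pixel casts a vote, a piglet remains countable even when a crate bar hides much of its body, as long as some pixels stay visible.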

For the third question, the thesis investigates the advantage of masks from instance segmentation as an object representation. A center clustering network for instance segmentation (CClusnet-Inseg) was developed as an extension of CClusnet-Count. CClusnet-Inseg traces the clustered centers back to their originating pixels and groups the pixels whose centers fall in the same cluster to form masks. Although an occluded mask covers only the visible part of an object, CClusnet-Inseg predicts a unique, occlusion-resistant object center to represent the object's location. CClusnet-Inseg achieved a mean average precision (mAP) of 0.84, outperforming all other methods compared in this study, and could be further applied to multi-object tracking for animal movement monitoring and spatial distribution analysis.
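The trace-back step can be sketched as follows. The tiny arrays are illustrative assumptions; the point is that each foreground pixel inherits the instance label of the cluster its center vote fell into, which turns the clustering result into per-object masks.

```python
import numpy as np

# Assumed inputs from the clustering stage: foreground pixel coordinates,
# each pixel's voted centre, and the mean-shift cluster modes.
coords = np.array([[2, 2], [2, 3], [8, 8], [8, 9]])       # visible pixels
votes = np.array([[2.5, 2.5], [2.5, 2.5], [8.5, 8.5], [8.5, 8.5]])
modes = np.array([[2.5, 2.5], [8.5, 8.5]])                # object centres

# Trace back: a pixel's instance id is the nearest mode of its vote.
dists = np.linalg.norm(votes[:, None, :] - modes[None, :, :], axis=2)
labels = dists.argmin(axis=1)

# Pixels sharing a cluster form one instance mask.
masks = np.zeros((len(modes), 12, 12), dtype=bool)
for (y, x), k in zip(coords, labels):
    masks[k, y, x] = True
print([int(m.sum()) for m in masks])  # pixels per instance -> [2, 2]
```

The mode itself doubles as the occlusion-resistant object center: even if the visible mask shrinks under occlusion, the remaining pixels still vote for the same location.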

For the last question, the thesis examines the need for an amodal mask (i.e., a complete mask covering both visible and invisible parts) and thus amodal instance segmentation for animal monitoring under occlusion. Because obtaining an amodal dataset (data labeled with amodal masks) for amodal instance segmentation is difficult, a novel semi-supervised generative adversarial network for amodal instance segmentation, denoted AISGAN, was devised. This method requires only a regular modal dataset and thus extends the applicability of amodal methods. AISGAN achieved a mean Intersection over Union (mIoU) of 0.823, higher than the mIoUs of Mask R-CNN, Raw, and Convex Hull (0.801, 0.780, and 0.778, respectively). As a semi-supervised method, AISGAN's mIoU was further improved by 0.6% when fine-tuned with unlabeled new data. AISGAN could be applied to occlusion-resistant monitoring and is thus a promising tool for automated animal monitoring in complex housing environments.
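The mIoU metric behind these comparisons, and why a modal-only mask scores low against an amodal ground truth, can be shown with a toy case. The masks below are hypothetical, not thesis data.

```python
import numpy as np

def iou(pred, gt):
    """Intersection over Union of two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

# Toy amodal ground truth: the whole piglet, including the part hidden
# behind a crate bar. A modal ("Raw") prediction covers visible pixels only.
gt = np.zeros((10, 10), dtype=bool)
gt[2:8, 2:8] = True            # full piglet: 36 pixels
raw = gt.copy()
raw[4:6, :] = False            # a bar occludes two rows of the body
print(f"modal-only IoU: {iou(raw, gt):.3f}")  # -> 0.667
```

An amodal method that recovers even part of the hidden region raises this score toward 1.0, which is what the reported mIoU gap between AISGAN and the modal baselines reflects; averaging the per-object IoUs over a dataset gives the mIoU.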

In this thesis, the occlusion problem in farrowing pens is investigated and three deep learning algorithms are developed for detecting, representing, and recovering piglets in these scenes. The proposed algorithms enable computer vision to manage occlusion problems in farrowing pens and improve the management of smart pig farming. With further acceleration and embedding of the algorithms in hardware, this work lays the foundation for future research and practical applications in precision livestock farming.