A Review of Machine Vision Algorithms in Basketball

Authors

  • Anshuo Zhang, Southwest Jiaotong University, Mechanical Design, Manufacturing and Automation, Chengdu, Sichuan Province, 611756, China

Keywords:

Basketball Foul Detection, Machine Vision, Tactical Analysis Algorithm, Technical Movement Decomposition Algorithm

Abstract

This paper systematically reviews research from 2020 to 2025 on the application of machine vision algorithms to basketball, focusing on three core domains: foul detection, tactical analysis, and technical movement decomposition. The findings indicate that deep learning-based machine vision has significantly improved the accuracy of game analysis and the scientific rigor of training, but major challenges remain in real-time processing, adaptation to complex environments, and multimodal fusion. In foul detection, research has evolved from traditional feature extraction methods to deep learning models. Li (2021) employed the Baum-Welch algorithm to construct two-dimensional motion contours, yet achieved limited recognition accuracy in practice. Jia (2025) combined a convolutional neural network (CNN) with hybrid Gaussian filtering, raising accuracy to 98.1%, although the model's response time reached 1.07 seconds. Wang (2025) integrated attention mechanisms with the ResNet50 network, reducing detection time to under 14 milliseconds while maintaining a 95% feature matching rate, which reflects a shift from accuracy-first design to real-time optimization. In tactical analysis, Fu (2020) optimized a multi-object tracking (MOT) framework to reach a multi-object tracking accuracy (MOTA) of 83.6%, laying the foundation for subsequent tactical analysis. Chen (2021) developed a VR defensive trajectory generation model that significantly improves real-time performance through autoregressive generation algorithms. The TacViT model of Xu (2024) uses visual self-attention mechanisms to strengthen global trajectory feature extraction, advancing the symbolic analysis of tactics. In technical movement decomposition, Tian (2024) compressed redundant video frames by 91% using keyframe extraction algorithms and designed the DAMR_3DNet model, which achieved an F1 score of 0.92 on the UCF101 dataset. Ma (2021) constructed the NPU RGB+D dataset and employed an LSTM-DGCN model to reduce the action recognition error rate under dense defense to 13%. Sun (2022) further established an injury early-warning model from a biomechanical perspective, forming a closed-loop system of “identification → analysis → prevention.” This study follows a systematic literature review methodology, drawing on authoritative publications from the ScienceDirect and Engineering Village databases to construct a three-dimensional technical analysis framework. The review identifies four major bottlenecks in current research: insufficient data scale, difficulty balancing real-time performance and accuracy, shallow multimodal fusion, and poor model interpretability. Future research should focus on constructing large-scale, fine-grained datasets; developing lightweight model architectures; deepening cross-modal fusion and AIoT integration; and enhancing algorithm interpretability. These advances would move basketball machine vision from theoretical innovation to practical application, ultimately enabling intelligent referee assistance, scientifically optimized training, and smart arena construction.
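For readers less familiar with the two evaluation metrics quoted in the abstract, the sketch below shows how MOTA and the F1 score are conventionally computed from error counts. It is a minimal illustration that assumes the standard definitions of both metrics; the function names and the example counts are hypothetical and are not taken from any of the reviewed studies.

```python
def mota(false_negatives, false_positives, id_switches, ground_truth_objects):
    """Multi-object tracking accuracy under the standard definition:
    MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if ground_truth_objects <= 0:
        raise ValueError("ground_truth_objects must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / ground_truth_objects


def f1_score(true_positives, false_positives, false_negatives):
    """F1 score, the harmonic mean of precision and recall,
    written directly in terms of counts: 2TP / (2TP + FP + FN)."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0  # convention used here when there are no positives at all
    return 2 * true_positives / denominator


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_mota = mota(false_negatives=100, false_positives=50,
                    id_switches=14, ground_truth_objects=1000)
example_f1 = f1_score(true_positives=92, false_positives=8, false_negatives=8)
print(f"MOTA = {example_mota:.3f}, F1 = {example_f1:.2f}")
```

Because MOTA subtracts all three error types from one, it can be negative when a tracker makes more errors than there are ground-truth objects, whereas F1 always lies between 0 and 1.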

Published

2025-11-30