Welcome to Journal of Graphics
Bimonthly, Started in 1980
Administrated by: China Association for Science and Technology
Sponsored by: China Graphics Society
Edited and Published by: Editorial Board of Journal of Graphics
Chief Editor: Guoping Wang
Editorial Director: Xiaohong Hou
ISSN 2095-302X
CN 10-1034/T
Current Issue
28 February 2025, Volume 46 Issue 1
Cover
Cover of issue 1, 2025
2025, 46(1): 1. 
Contents
Table of Contents for Issue 1, 2025
2025, 46(1): 2. 
Image Processing and Computer Vision
Cascade detection method for insulator defects in distribution lines based on improved YOLOv8
ZHAO Zhenbing, HAN Yu, TANG Chenkang
2025, 46(1): 1-12.  DOI: 10.11996/JG.j.2095-302X.2025010001

To address the issues of complex and dynamic backgrounds due to safety constraints, irregular shapes of insulator defects, indistinct defect features, and difficulty in capturing defect information during aerial photography of power distribution lines using unmanned aerial vehicles, a cascaded detection method for insulator defects in power distribution lines based on an improved YOLOv8 was proposed. In the first stage, the YOLOv8 model automatically extracted images of insulator components, providing accurate inputs for the second stage of insulator defect detection and eliminating the influence of redundant background information. In the second stage, the ConvNeXt V2 backbone network was utilized to enhance the model's ability to recognize irregularly shaped targets and improve its feature extraction capabilities. By incorporating the edge knowledge fusion module into the feature fusion process, precise extraction of defect edge information was achieved. Furthermore, an adaptive shape IoU enhancement method was designed, adopting an adaptive training sample selection strategy to optimize the ratio of positive and negative samples. Additionally, the Shape-IoU loss function was employed, considering the inherent attributes of bounding box regression samples such as shape and scale, enabling the model to focus on essential target features, thereby improving the detection accuracy and robustness by reducing missed and false detections. Experimental results demonstrated that the proposed cascaded detection method for insulator defects in power distribution lines based on the improved YOLOv8 achieved a 17.3% increase in average precision compared to baseline models, effectively enhancing the accuracy of insulator defect detection in power distribution lines and providing robust technical support for the safe maintenance of power systems.
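The Shape-IoU loss described above extends the standard intersection-over-union between predicted and ground-truth boxes. As a point of reference only (a plain IoU in Python, not the paper's shape- and scale-aware variant), the overlap term can be sketched as:

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Shape-IoU additionally weights the deviation terms by the ground-truth box's own shape and scale, which this plain version deliberately omits.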

A multi-scene fire sign detection algorithm based on EE-YOLOv8s
CUI Kebin, GENG Jiachang
2025, 46(1): 13-27.  DOI: 10.11996/JG.j.2095-302X.2025010013

To mitigate the current issues of spurious and missed detections of fire signs in smoke and fire scene detection, caused by interfering factors such as illumination variations, fire dynamics, complex backgrounds, and excessively small targets, an improved YOLOv8s model named EE-YOLOv8s was proposed. The EE-YOLOv8s model integrated the MBConv-Block convolution module into the YOLOv8 Backbone and employed the EfficientNetEasy feature extraction network to refine image feature extraction while preserving a lightweight design. Additionally, the SPPELAN module was upgraded to SPP_LSKA_ELAN by incorporating the large separable kernel attention mechanism (LSKA), which captured spatial detail information in intricate and dynamic fire scenes, thereby distinguishing target objects from convoluted backgrounds. The Neck section introduced deformable convolution (DCN) and cross-space efficient multi-scale attention (EMA), implementing the C2f_DCN_EMA deformable convolution calibration module to enhance the adaptation to edge contour changes of fire and smoke targets, facilitating feature fusion and calibration, and emphasizing key target features. A small target detection head, equipped with the lightweight, parameter-free attention mechanism SimAM, was integrated into the Head section, and the channel configuration was refined to strengthen multi-size target characterization while minimizing redundancy and maximizing parameter utilization efficiency. Experimental results demonstrated that EE-YOLOv8s reduced the parameter count by 13.6%, while improving accuracy by 6.8%, recall by 7.3%, and mAP by 5.4% compared to the original model, ensuring rapid detection speed and superior detection performance for fire targets.
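SimAM, the parameter-free attention mechanism named above, scores each position with a closed-form energy function rather than learned weights. A minimal numpy sketch of the published SimAM formulation (the exact variant wired into EE-YOLOv8s may differ):

```python
import numpy as np

def simam(x, lam=1e-4):
    """SimAM parameter-free attention over a (C, H, W) feature map.

    Per channel, each position's importance comes from the closed-form
    minimal energy; features are reweighted by sigmoid(1 / energy).
    """
    c, h, w = x.shape
    n = h * w - 1
    mu = x.mean(axis=(1, 2), keepdims=True)
    d = (x - mu) ** 2
    var = d.sum(axis=(1, 2), keepdims=True) / n
    e_inv = d / (4 * (var + lam)) + 0.5   # inverse of the minimal energy
    return x / (1 + np.exp(-e_inv))       # x * sigmoid(e_inv)
```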

The defect detection method for communication optical cables based on lightweight improved YOLOv8
WANG Zhidong, CHEN Chenyang, LIU Xiaoming
2025, 46(1): 28-34.  DOI: 10.11996/JG.j.2095-302X.2025010028

In the field of defect detection for all-dielectric self-supporting (ADSS) communication cables, cross-scale detection of galvanic corrosion defects suffers from high computational demands and low detection accuracy. In this paper, a defect detection method for ADSS communication cables based on an improved YOLOv8 was proposed. Firstly, the self-built communication cable defect dataset was sliced to prevent small galvanic corrosion defects on the fiber optic cable from being lost in the process of scaling; secondly, an LS-FPN structure replaced the traditional neck structure, retaining the favorable positional information in the channel dimension, which resolved the cross-scale defect problem on the surface of the fiber optic cable while enhancing defect localization capability; furthermore, the idea of deformable convolution was introduced, replacing the convolution in the original backbone network, allowing the network to better focus on the surrounding defect information in the process of feature extraction; finally, the original CIoU was replaced by the Focus-MPDIoU loss function, which excels in handling boundary cases and avoids overly radical loss gradients. Experimental results on the ADSS communication fiber optic cable defect dataset validated the method: the improved model achieved 50.6% and 87.8% on mAP50-95 and mAP50, respectively, reflecting increases of 2.1% and 3.7% compared to YOLOv8n. Meanwhile, the computational GFLOPs were reduced to 6.8 and the number of parameters decreased to 1.96 M, reducing the configuration requirements of the inspection equipment and meeting the lightweight industrial demand.
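The slicing step described above can be illustrated with a simple tiling helper (a hypothetical `tile_image` sketch, not the authors' code): overlapping tiles keep tiny corrosion defects from disappearing when large images are downscaled to the network input size.

```python
def tile_image(h, w, tile, overlap):
    """Top-left corners of overlapping tiles covering an h x w image.

    Assumes tile > overlap; tiles at the right/bottom edges are shifted
    inward so the whole image is covered.
    """
    step = tile - overlap
    ys = list(range(0, max(h - tile, 0) + 1, step))
    xs = list(range(0, max(w - tile, 0) + 1, step))
    if h > tile and ys[-1] != h - tile:
        ys.append(h - tile)  # cover the bottom edge
    if w > tile and xs[-1] != w - tile:
        xs.append(w - tile)  # cover the right edge
    return [(y, x) for y in ys for x in xs]
```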

TSA-SFNet: transpose self-attention and CNN based stereoscopic fusion network for image super-resolution
CHEN Guanhao, XU Dan, HE Kangjian, SHI Hongzhen, ZHANG Hao
2025, 46(1): 35-46.  DOI: 10.11996/JG.j.2095-302X.2025010035

Transformer-based image super-resolution methods have demonstrated remarkable performance in recent years. Nonetheless, existing approaches encounter challenges such as incomplete recovery of high-frequency information, insufficient activation of additional pixels for image reconstruction, inadequate cross-window information interaction, and training instability caused by residual connections. To address these challenges, the transpose self-attention and CNN based stereoscopic fusion network (TSA-SFNet) was proposed. TSA-SFNet adapted the window multi-head self-attention modules to mitigate amplitude issues caused by residual connections and incorporated channel attention to activate more pixels for image reconstruction. Additionally, to bolster the interaction between adjacent windows for capturing additional structural information and achieving a more comprehensive reconstruction of high-frequency details, overlapping window attention and a convolutional feedforward neural network were introduced. Quantitative and qualitative evaluations of the network model were conducted on classical super-resolution tasks and real-world super-resolution challenges. The experimental results demonstrated that the proposed TSA-SFNet achieved state-of-the-art results on five commonly used benchmark datasets and generated more realistic super-resolution images.
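Window-based multi-head self-attention, which TSA-SFNet adapts, starts from partitioning the feature map into fixed-size windows. A minimal sketch of the standard non-overlapping partition, for orientation only (the paper's overlapping-window attention extends this):

```python
import numpy as np

def window_partition(x, ws):
    """Split a (H, W, C) feature map into non-overlapping ws x ws windows.

    Returns (num_windows, ws, ws, C); H and W must be divisible by ws.
    Attention is then computed independently within each window.
    """
    h, w, c = x.shape
    x = x.reshape(h // ws, ws, w // ws, ws, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws, ws, c)
```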

A deepfake face detection method that enhances focus on forgery regions
ZHANG Wenxiang, WANG Xiali, WANG Xinyi, YANG Zongbao
2025, 46(1): 47-58.  DOI: 10.11996/JG.j.2095-302X.2025010047

The rapid development of deepfake face technology has led to its widespread use in various undesirable ways, making the detection of manipulated facial images and videos an important research topic. Existing convolutional neural networks suffer from overfitting and poor generalization, performing poorly on unknown synthetic face data. To address this limitation, a deepfake face detection method with enhanced focus on forgery regions was proposed. Firstly, an attention mechanism was introduced to process the feature map used for classification, and the learned attention map could highlight the manipulated facial area, thereby improving the generalization capability of the model. Secondly, a forgery regions detection module was connected to the backbone network, reducing the interference of global face information by detecting forgery traces in the multi-scale anchors, further strengthening the model's attention to the local forgery regions. Finally, a consistent representation learning framework was introduced, ensuring that the model pays more attention to the inherent evidence of forgery and avoids overfitting by explicitly constraining the consistency between different representations of the same input. Experiments were conducted on three datasets, including FaceForensics++, Celeb-DF-v2, and DFDC, using EfficientNet-b4 and Xception as the backbone networks, respectively. The results demonstrated that the proposed method achieved good performance in intra-dataset evaluation, and outperformed the original networks and other advanced methods in cross-dataset evaluation.

Point cloud feature enhanced 3D object detection in complex indoor scenes
YUAN Chao, ZHAO Mingxue, ZHANG Fengyi, FENG Xiaoyong, LI Bing, CHEN Rui
2025, 46(1): 59-69.  DOI: 10.11996/JG.j.2095-302X.2025010059

3D point cloud object detection in complex indoor scenes presents challenges due to large-scale point clouds and dense objects with many details. When dealing with point cloud data, existing detection algorithms lose a significant amount of local features and fail to extract enough spatial and semantic information, resulting in low detection accuracy. To solve this problem, a point cloud features enhanced 3D object detection in complex indoor scenes (PEF) algorithm was proposed based on an improved VoteNet. Firstly, a dynamic feature compensation module was used to simulate the interactive query process between seed point set features and grouping set features, gradually recovering lost features for feature compensation. Secondly, a residual MLP module was introduced into the feature extraction part, and a deeper feature learning network was constructed through a residual structure to mine more detailed point cloud features. Finally, in the proposal stage, a feature self-attention mechanism was introduced to model the semantic relationship between a set of independent object points, generating a new feature map. Experiments conducted on the public datasets SUN RGB-D and ScanNet V2 demonstrated that the improved model enhanced the detection accuracy for indoor objects by 5.0% and 11.5% respectively on mAP@0.25 compared with the baseline model. Extensive ablation experiments confirmed the effectiveness of each improved module.
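The residual MLP idea mentioned above wraps a small per-point MLP in a skip connection so deeper feature-learning stacks remain trainable. A toy numpy sketch (hypothetical shapes and names, not the PEF implementation):

```python
import numpy as np

def residual_mlp_block(x, w1, w2):
    """One residual MLP block on per-point features x of shape (N, C).

    w1: (C, H) expansion weights, w2: (H, C) projection weights.
    The skip connection adds the input back before the final ReLU.
    """
    h = np.maximum(x @ w1, 0.0)          # linear + ReLU
    return np.maximum(h @ w2 + x, 0.0)   # project back, add skip, ReLU
```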

Lightweight wild bat detection method based on multi-scale feature fusion
WANG Yang, MA Chang, HU Ming, SUN Tao, RAO Yuan, YUAN Zhenyu
2025, 46(1): 70-80.  DOI: 10.11996/JG.j.2095-302X.2025010070

Bat detection in the wild is crucial for ecological protection and scientific research. To address the challenges brought by limited computing resources and complex wild environments, a lightweight bat detection model (LiteDETR-Bat) was proposed to achieve efficient real-time detection. Firstly, in order to solve the problem of feature mapping redundancy, the reparameterized convolutional efficient layer aggregation network (RCELAN) was introduced, replacing the traditional ResNet backbone network and adopting a multi-branch feature aggregation mechanism, thereby reducing computational complexity and parameter quantity. Secondly, a dynamic sampling multi-scale feature fusion (DS-MFF) module was designed. This structure integrated dilated convolution and dynamic sampling operators, optimizing multi-scale feature fusion by expanding the receptive field and adaptively adjusting sampling positions, which enhanced the flexibility and robustness of the model in processing diversified features. Finally, a bat dataset covering various lighting conditions, perspective changes, and bat morphology changes was collected in the wild environment of Anhui Province, and related experiments such as model performance were conducted on this dataset. Experimental results showed that the proposed LiteDETR-Bat model not only reduced the number of parameters by 46.5% and achieved an mAP of 97.2%, but also made certain improvements in accuracy and real-time performance compared with the YOLO series algorithms. The LiteDETR-Bat model provided strong technical support for the monitoring and protection of wild bats, and demonstrated its application potential in ecological monitoring and biodiversity conservation.
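The benefit of the dilated convolutions in the DS-MFF structure comes from receptive-field growth: each layer widens context without extra parameters. For stride-1 layers the receptive field follows a standard formula (not specific to this paper):

```python
def receptive_field(kernels, dilations):
    """Receptive field of a stack of stride-1 dilated convolutions.

    Each layer with kernel size k and dilation d adds (k - 1) * d
    to the receptive field, starting from a single pixel.
    """
    rf = 1
    for k, d in zip(kernels, dilations):
        rf += (k - 1) * d
    return rf
```

For example, three 3x3 layers with dilations 1, 2, 4 see a 15-pixel-wide context, versus 7 pixels without dilation.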

Few-shot pointer meters detection method based on meta-learning
SUN Qianlai, LIN Shaohang, LIU Dongfeng, SONG Xiaoyang, LIU Jiayao, LIU Ruizhen
2025, 46(1): 81-93.  DOI: 10.11996/JG.j.2095-302X.2025010081

The accuracy of meter localization is a critical factor in ensuring the accuracy of meter reading recognition. However, instrument data is challenging to collect in complex industrial scenarios, and existing pointer meter detection methods exhibit low detection accuracy and poor real-time performance in few-shot situations. For this reason, the Sparse-Meta-DETR method was proposed for few-shot pointer meter detection based on meta-learning. Inspired by the object detection model Meta-DETR, this method adopted the meta-learning strategy. During the meta-training stage, few-shot tasks were created to train the Sparse-Meta-DETR model, enhancing the metric ability of the correlational aggregation module for support set and query set classes in the feature space. This enabled the model to recognize classes present in images during the few-shot training stage, quickly adapt to few-shot tasks with novel classes, and detect pointer meters in complex industrial scenarios. A lightweight backbone network, EfficientNet-B1, was introduced as the feature extractor to reduce the computational complexity and parameter count of the model, thereby improving the detection speed. Simultaneously, a scoring network was designed as a token sparsification sampler, creating a sparsification mask to select foreground features from query features. This guided the Transformer encoders and decoders to focus on foreground features, thereby reducing the computational complexity of the few-shot training stage and improving detection accuracy. The Sparse-Meta-DETR model achieved an AP50 of 94.2% and an AP75 of 87.5% in the 20-shot task, and an AP50 of 91.1% in the 10-shot task. Compared to the baseline model, the improved model reduced time complexity by 74.5%. Experimental results demonstrated that Sparse-Meta-DETR can not only effectively ensure the accuracy of pointer meter positioning detection but also improve real-time performance in few-shot settings. Its overall performance surpassed other few-shot deep-learning algorithms such as Meta RCNN.
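The token sparsification sampler described above can be approximated by top-k selection over foreground scores. A hedged sketch (hypothetical `sparsify_tokens`, not the paper's scoring network):

```python
import numpy as np

def sparsify_tokens(tokens, scores, keep_ratio=0.3):
    """Keep the top-scoring fraction of query tokens.

    tokens: (N, C) features; scores: (N,) foreground scores.
    Returns the kept tokens and a boolean mask, so attention is
    restricted to likely-foreground features.
    """
    n_keep = max(1, int(round(len(scores) * keep_ratio)))
    idx = np.argsort(scores)[::-1][:n_keep]
    mask = np.zeros(len(scores), dtype=bool)
    mask[idx] = True
    return tokens[mask], mask
```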

SDENet: a synthetic defect data evaluation network based on multi-scale attention quality perception
LU Yang, CHEN Linhui, JIANG Xiaoheng, XU Mingliang
2025, 46(1): 94-103.  DOI: 10.11996/JG.j.2095-302X.2025010094

The quality evaluation of defect data synthesized through data augmentation can facilitate high-quality expansion of defect data, thereby mitigating the problem of poor detection model performance caused by insufficient defect data. When evaluating the quality of synthetic defect data, existing quality evaluation algorithms primarily focus on the distortion characteristics of the data but tend to overlook the defect attributes of the data. To address this issue, an SDENet model based on attention feature enhancement (AFE) and multi-scale attention quality perception (MAQP) was proposed, which comprehensively considered the distortion characteristics and defect attributes of synthesized defect data for quality evaluation. Firstly, the AFE module improved the model's generalization ability to defects of different sizes and positions through a dual-branch pooling operation, while also using an attention mechanism to enhance the feature expression ability of the model. Secondly, the MAQP module vectorized and fused the features enhanced by AFE to better perceive the quality of synthetic defect data. Finally, the fused features were fed into the quality evaluation section, and the final evaluation score was generated. Experiments conducted on the constructed synthetic defect dataset of road cracks demonstrated that the SDENet model achieved optimal results in RMSE, RMAE, PLCC, and SROCC metrics, with improvements of 10.7%, 5.0%, 1.8% and 1.8% compared to the suboptimal model, thereby verifying the effectiveness of the model. On the distorted dataset TID2013, the SDENet model also produced competitive results, reaching 0.902 and 0.876 on the PLCC and SROCC metrics, respectively.
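PLCC and SROCC, the correlation metrics reported above, are standard and easy to state precisely. A small numpy reference (the no-ties Spearman case; library implementations handle ties more carefully):

```python
import numpy as np

def plcc(x, y):
    """Pearson linear correlation coefficient."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

def srocc(x, y):
    """Spearman rank-order correlation (no ties): PLCC of the ranks."""
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return plcc(rank(np.asarray(x)), rank(np.asarray(y)))
```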

Deepfake detection method based on multi-feature fusion of frequency domain and spatial domain
DONG Jiale, DENG Zhengjie, LI Xiyan, WANG Shiyun
2025, 46(1): 104-113.  DOI: 10.11996/JG.j.2095-302X.2025010104

In today's society, the rapid advancement of facial forgery technology has posed a substantial challenge to social security, especially in the context where deep learning techniques have been widely employed to generate realistic fake videos. These high-quality forged contents not only threaten personal privacy but can also be utilized for illegal activities. Faced with this challenge, traditional forgery detection methods based on single features have become inadequate to meet detection demands. To address this issue, a deepfake detection method based on multi-feature fusion in both frequency and spatial domains was proposed to enhance the detection accuracy and generalization capability for facial forgeries. The frequency domain was dynamically divided into three bands to extract forgery artifacts that cannot be mined in the spatial domain. The spatial domain employed the EfficientNet_b4 network and Transformer architecture to segment image blocks at multiple scales, calculate differences between different blocks, perform detection based on consistency information between upper and lower image blocks, and capture more detailed forgery feature information. Finally, a fusion block using a query-key-value mechanism integrated the methods from the frequency and spatial domains, thereby more comprehensively mining feature information from both domains to enhance the accuracy and transferability of forgery detection. Extensive experimental results confirmed the effectiveness of the proposed method, demonstrating significantly superior performance compared to traditional deepfake detection methods.
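The frequency-band division can be illustrated with a fixed radial split of the 2D spectrum (fixed cutoffs here, whereas the paper's division is dynamic; the three bands sum back to the original image):

```python
import numpy as np

def split_frequency_bands(img, r1=0.1, r2=0.35):
    """Split a grayscale image into low/mid/high radial frequency bands.

    img: 2D float array. Returns three images whose sum reconstructs
    the input; high-band content carries artifacts that are hard to
    see in the spatial domain.
    """
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    # normalized distance from the spectrum center, 0 at DC, 1 at corners
    r = np.hypot(yy - h / 2, xx - w / 2) / np.hypot(h / 2, w / 2)
    bands = []
    for lo, hi in [(0, r1), (r1, r2), (r2, 1.01)]:
        mask = (r >= lo) & (r < hi)
        bands.append(np.fft.ifft2(np.fft.ifftshift(f * mask)).real)
    return bands
```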

Consistent and unbiased teacher model research for domain adaptive object detection
CHENG Xudong, SHI Caijuan, GAO Weixiang, WANG Sen, DUAN Changyu, YAN Xiaodong
2025, 46(1): 114-125.  DOI: 10.11996/JG.j.2095-302X.2025010114

The self-training method has significantly enhanced the performance of domain adaptive object detection. The self-training method primarily predicts target domain data through a teacher network, and then selects high-confidence predictions as pseudo-labels to guide student network training. However, due to significant domain differences between the source and target domains, the pseudo-labels generated by the teacher network are often of poor quality, adversely impacting student network training and reducing the performance of object detection. To address this challenge, a consistent and unbiased teacher (CUT) model for domain adaptive object detection was proposed. Firstly, an adaptive threshold generation (ATG) module was designed within the teacher network. The ATG module utilized a Gaussian mixture model (GMM) during training to generate adaptive thresholds for each image, ensuring temporal consistency of pseudo-label quantities and thereby enhancing their quality. Secondly, a prediction-guided sample selection (PSS) strategy was introduced, which leveraged predictions from the region proposal network within the teacher network to guide the selection of positive and negative samples for the student network. The PSS strategy effectively aligned the selected samples with real outcomes, thereby mitigating the impact of low-quality pseudo-labels on the student network. Furthermore, to improve detection performance for small objects and challenging objects with fewer instances, a mixed domain augmentation (MDA) module was devised to generate mixed-domain images containing random information from both the source and target-like domains to supervise student network training. Extensive experiments conducted on three scenario datasets demonstrated the effectiveness of the proposed CUT, with performance improvements of 4.0%, 5.8%, and 3.7%, respectively.
Notably, the proposed CUT model applied the self-training method for the first time to address the problem of large domain disparities between visual images and infrared images.
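The ATG module fits a Gaussian mixture to per-image confidence scores; as a lighter stand-in, a 1-D two-cluster split conveys the idea of an adaptive (rather than fixed) pseudo-label threshold:

```python
import numpy as np

def adaptive_threshold(conf, iters=50):
    """Per-image pseudo-label threshold from a two-cluster split.

    A simplified stand-in for the paper's GMM: 1-D 2-means on the
    confidence scores, thresholding midway between the low- and
    high-confidence cluster centers.
    """
    conf = np.asarray(conf, float)
    c = np.array([conf.min(), conf.max()])  # initial cluster centers
    for _ in range(iters):
        assign = np.abs(conf[:, None] - c[None, :]).argmin(axis=1)
        for k in (0, 1):
            if (assign == k).any():
                c[k] = conf[assign == k].mean()
    return c.mean()
```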

Image colorization via semantic similarity propagation
MENG Sihong, LIU Hao, FANG Haotian, SENG Bingfeng, DU Zhengjun
2025, 46(1): 126-138.  DOI: 10.11996/JG.j.2095-302X.2025010126

Image colorization aims to convert grayscale images into color images, a technique that has long received extensive attention from researchers in the fields of computer graphics and computer vision. It has found wide applications in areas such as image restoration, medical imaging, film restoration, and artistic creation. Over decades of development, researchers have proposed a large number of interaction-based, rule-based, and deep learning-based algorithms to enhance the colorization effect of images. Nevertheless, the existing image colorization algorithms exhibit some significant shortcomings, such as low computational efficiency, cumbersome user interaction, low color saturation, and the occurrence of color overflow. To address these challenges, an image colorization algorithm based on semantic similarity propagation was proposed. Semantic features of the input grayscale image were extracted using deep neural networks, and a feature space was constructed. Then, the image colorization task was formalized as an efficient energy optimization problem based on semantic similarity propagation, enabling the propagation of user-supplied stroke colors to other regions of the image. In addition, a trilinear interpolation method was employed to accelerate both energy optimization and color propagation, significantly enhancing computational efficiency. In order to verify the effectiveness of the algorithm, experiments were conducted on a collected image set, evaluating multiple dimensions, such as image visual effect, generated image quality, algorithm running time, and user interaction experience. The results of a large number of qualitative and quantitative experiments demonstrated that the proposed algorithm achieved more accurate, efficient, and natural colorization with reduced user interaction requirements, while achieving substantial improvements in computational efficiency.
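Trilinear interpolation, used above to accelerate the optimization, blends the eight surrounding grid values with weights from the fractional coordinates. A minimal reference implementation (assumes the query point lies strictly inside the grid):

```python
import numpy as np

def trilerp(grid, p):
    """Trilinear interpolation of a 3D grid at point p = (x, y, z)."""
    x, y, z = p
    x0, y0, z0 = int(x), int(y), int(z)
    dx, dy, dz = x - x0, y - y0, z - z0
    out = 0.0
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                # weight is the product of the per-axis blend factors
                wgt = ((dx if i else 1 - dx) *
                       (dy if j else 1 - dy) *
                       (dz if k else 1 - dz))
                out += wgt * grid[x0 + i, y0 + j, z0 + k]
    return out
```

Because the scheme is exact for functions linear in each coordinate, it reproduces smooth precomputed fields cheaply, which is where the speedup comes from.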

Computer Graphics and Virtual Reality
Generalization optimization method for text to material texture maps based on diffusion model
TU Qinghao, LI Yuanqi, LIU Yifan, GUO Jie, GUO Yanwen
2025, 46(1): 139-149.  DOI: 10.11996/JG.j.2095-302X.2025010139

Existing material texture datasets lack sufficient textual descriptions, while pure image datasets are massive in scale; moreover, when traditional generative models encounter inference errors, it is difficult to obtain the additional hyperparameters needed to generate new results. To address this, a generalization optimization method for text to material texture maps based on a stable diffusion model was proposed. The model was trained in a staged manner: firstly, a large-scale pure image dataset was used to fine-tune the diffusion model to fit image generation. Secondly, a small-scale dataset with text annotations was employed to learn semantic information. Thirdly, a new decoder was introduced to reconstruct texture maps from the latent codes generated by the diffusion model. Finally, multiple randomly generated texture maps that conformed to the given descriptions were obtained by inputting textual descriptions. The method employed the Colossal architecture to organize the code, significantly reducing hardware requirements for training. By separating the tasks of image fitting and semantic information learning, with large-scale image datasets used for model parameter fitting and small-scale text data used for learning semantic information, the method enhanced the generalization of the model and reduced the demand for multimodal dataset scale.

Unsupervised 3D point cloud non-rigid registration based on multi-feature extraction and point correspondence
WU Yiqi, HE Jiale, ZHANG Tiantian, ZHANG Dejun, LI Yanli, CHEN Yilin
2025, 46(1): 150-158.  DOI: 10.11996/JG.j.2095-302X.2025010150

To achieve accurate registration between non-rigid point clouds while ensuring the precise establishment of point correspondences, an unsupervised 3D point cloud non-rigid registration network based on multi-feature extraction and point correspondence was proposed. The network comprised modules for multi-feature extraction, matching refinement, and shape-aware attention. Firstly, multiple features were extracted from the input source and target point clouds, and the feature similarity matrix was obtained by feature similarity calculation. Subsequently, the similarity matrix was input into the matching refinement module of the network, where a combination of soft and hard matching was used to generate the point correspondence matrix. Finally, the target point cloud features, source point cloud, and correspondence matrix were input into the shape-aware attention module to obtain the final registration result. With this method, the registration results simultaneously possessed point correspondence and shape similarity with the target point cloud. Experimental results on public and synthetic datasets, as well as visual effects and quantitative comparisons, demonstrated that the proposed method accurately obtained the point correspondence and shape similarity between the source and target point clouds, effectively achieving unsupervised 3D point cloud non-rigid registration.
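The combination of soft and hard matching can be sketched as two ways of turning a feature-similarity matrix into a correspondence matrix (a generic sketch, not the network's refinement module):

```python
import numpy as np

def correspondence(sim, temp=0.05, hard=False):
    """Turn a similarity matrix sim of shape (N, M) into correspondences.

    Soft matching: row-wise softmax over similarities (differentiable).
    Hard matching: each source point snaps to its single best match.
    """
    if hard:
        corr = np.zeros_like(sim)
        corr[np.arange(sim.shape[0]), sim.argmax(axis=1)] = 1.0
        return corr
    e = np.exp((sim - sim.max(axis=1, keepdims=True)) / temp)
    return e / e.sum(axis=1, keepdims=True)
```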

Delaunay triangulation partitioning processing algorithm based on compute shaders
CHEN Guojun, LI Zhenshuo, CHEN Haozhen
2025, 46(1): 159-169.  DOI: 10.11996/JG.j.2095-302X.2025010159

Delaunay triangulation is a classic computational geometry algorithm, which is widely used in many fields. With the increasing demand for practical applications, existing Delaunay triangulation algorithms have proven inadequate for large-scale data. Therefore, a parallel Delaunay triangulation method based on compute shaders was proposed. This method input point set data into the compute shader through a texture buffer and utilized the compute shader to execute the Delaunay triangulation algorithm. At the same time, a dynamic insertion method was proposed based on the existing method to address the remapping problem of point sets in discrete space. In addition, to enable GPUs with limited video memory to construct Delaunay triangulations that significantly exceeded their video memory limitations, a partitioned bidirectional scanning algorithm based on compute shaders was proposed. This method divided the point set into multiple sub-regions, and then constructed the triangulation by scanning each sub-region. Experimental results indicated that under the same operating environment, the method based on compute shaders shortened the triangulation construction time compared with existing methods. Moreover, the partitioned bidirectional scanning algorithm effectively solved the GPU memory bottleneck problem, allowing GPUs with limited video memory to construct Delaunay triangulations that far exceeded their video memory capacity.
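At the core of any Delaunay construction, including GPU variants, is the incircle test: a triangulation is Delaunay only if no point lies inside any triangle's circumcircle. The classic determinant predicate (shown here in Python rather than shader code):

```python
import numpy as np

def in_circumcircle(a, b, c, d):
    """True if point d lies inside the circumcircle of triangle a, b, c.

    a, b, c must be in counter-clockwise order; each point is (x, y).
    """
    m = np.array([
        [a[0] - d[0], a[1] - d[1], (a[0] - d[0])**2 + (a[1] - d[1])**2],
        [b[0] - d[0], b[1] - d[1], (b[0] - d[0])**2 + (b[1] - d[1])**2],
        [c[0] - d[0], c[1] - d[1], (c[0] - d[0])**2 + (c[1] - d[1])**2],
    ])
    return np.linalg.det(m) > 0
```

Robust implementations use exact or adaptive-precision arithmetic for this determinant; the floating-point version above is only a sketch.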

Robustness analysis and control of lateral driving stability of novel inspection vehicle
NI Liwei, WU Liang, JIANG Hongsheng, XING Biao
2025, 46(1): 170-178.  DOI: 10.11996/JG.j.2095-302X.2025010170

Traditional vehicles often experience deviations in sideslip angle and yaw rate from their ideal values when subjected to uncertain disturbances, resulting in a degradation of the lateral driving stability of the vehicle. To enhance the lateral driving stability and robustness of vehicles under such conditions, firstly, a hierarchical collaborative control strategy (HCC) with strong robustness was proposed based on an integrated dynamic model, the sequential quadratic programming method, and the adaptive sliding mode control algorithm (ASMC). Secondly, a novel four-wheel steering and distributed drive inspection vehicle (FSDDIV) was designed based on steering-by-wire, drive-by-wire, and brake-by-wire. Finally, the lateral driving stability analysis based on Simulink and Carsim was carried out. The results demonstrated that compared with the ADM control strategy, the proposed HCC control strategy exhibited better control performance. When faced with different driving conditions, different driving speeds, and system parameter uncertainty, the improvement ratios of the maximum deviation errors of the sideslip angle and yaw rate reached 75.5% and 84.8%, 72.8% and 86.0%, and 71.0% and 83.8%, respectively. In addition, the HCC strategy exhibited better overall performance compared to the similar HLQR control theory. In summary, the proposed control strategy is insensitive to uncertain disturbances, delivering robust and stable control effects, making it well-suited for inspection tasks under different working conditions.

Active view selection for radiance fields using surface object points
XIE Wenxiang, XU Weiwei
2025, 46(1): 179-187.  DOI: 10.11996/JG.j.2095-302X.2025010179

Neural radiance fields (NeRF) have significantly enhanced the quality of novel view synthesis and 3D reconstruction. However, the data collection process for NeRF training still relies on manual experience, which limits its applications in tasks such as unknown environment exploration and planning. Therefore, it becomes crucial to effectively select views with the highest information gain for training. A novel active view selection strategy was proposed to address this. Firstly, volume rendering weights were utilized to obtain 3D points near the surface of the scene where training rays were projected. Then, the visibility of each 3D point for candidate views was calculated, and photometric confidence weighting was employed to measure the candidate views. Finally, candidate views with fewer visible 3D points and lower confidence were selected as the new training views. Experiments on the Blender datasets demonstrated that the proposed approach achieved a PSNR improvement of 3.88 dB and 5.88 dB for single-view and batch view selections, respectively, compared with existing methods, and increased view selection speed by nearly 30 times.
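The volume rendering weights used above to locate near-surface points follow the standard NeRF formulation: each sample's weight is its opacity times the transmittance accumulated in front of it. A numpy sketch for one ray:

```python
import numpy as np

def render_weights(sigma, delta):
    """NeRF volume-rendering weights along one ray.

    sigma: per-sample densities; delta: distances between samples.
    w_i = T_i * (1 - exp(-sigma_i * delta_i)), where T_i is the
    transmittance accumulated before sample i. Weights peak near
    the first surface the ray hits.
    """
    alpha = 1.0 - np.exp(-sigma * delta)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    return trans * alpha
```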

Brush-based interactive element packing
LIANG Mu, XU Pengfei, HUANG Hui
2025, 46(1): 188-199.  DOI: 10.11996/JG.j.2095-302X.2025010188

Element packing is a visual art form that tightly packs elements within an area, often seen in logos and artworks, particularly posters and advertisements. The traditional process of designing an element packing requires manually adjusting each element to the appropriate size and orientation, which is repetitive and time-consuming. Existing algorithms assist designers by dividing the design process into two steps: outlining the packing area and then packing the elements. In actual creative workflows, however, these two steps occur simultaneously, so existing methods are incompatible with the real design process. To address this issue, an easy-to-use brush tool was introduced that packs elements while the boundary is being outlined. A brush-based element packing technique was proposed in which each element was represented as a triangle mesh and embedded in a spring-mass system. During simulation, non-overlapping elements attracted each other, while overlapping elements repelled each other. Like traditional brushes used for color filling, the brush tool allows real-time element packing. The effectiveness of the proposed method was evaluated by inviting multiple users to design with the system. Results indicated that the technique significantly reduced design time, improved the compactness of packing results, and enhanced the controllability of the process.
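The attract/repel rule can be illustrated with circles standing in for the triangle-mesh elements (a deliberate simplification of the paper's spring-mass system): pairs separated by a gap pull together, overlapping pairs push apart, so elements settle at tangency.

```python
import math

def relax(circles, iters=500, step=0.05):
    """Toy spring-mass relaxation for circle 'elements' given as
    (x, y, radius). A positive gap between two circles attracts them;
    a negative gap (overlap) repels them, so they converge to contact."""
    for _ in range(iters):
        forces = [[0.0, 0.0] for _ in circles]
        for i in range(len(circles)):
            for j in range(i + 1, len(circles)):
                (xi, yi, ri), (xj, yj, rj) = circles[i], circles[j]
                dx, dy = xj - xi, yj - yi
                d = math.hypot(dx, dy) or 1e-9
                gap = d - (ri + rj)   # > 0: apart (attract); < 0: overlap (repel)
                fx, fy = step * gap * dx / d, step * gap * dy / d
                forces[i][0] += fx; forces[i][1] += fy
                forces[j][0] -= fx; forces[j][1] -= fy
        circles = [(x + fx, y + fy, r)
                   for (x, y, r), (fx, fy) in zip(circles, forces)]
    return circles
```

Running the same per-pair rule at every brush stroke is what makes the packing update in real time: the simulation only has to re-settle locally around newly deposited elements.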

Industrial Design
Evaluation of optimal design of human-machine interface for subway dispatching based on visual factors
LI Bo, XUAN Jinge, XUE Yanmin, YU Suihuai, WANG Juan
2025, 46(1): 200-210.  DOI: 10.11996/JG.j.2095-302X.2025010200

To optimize the task performance of subway dispatchers, a virtual human model was constructed in CATIA and the field of view was divided into levels. The scheduling interface and field of view were rasterized, and a human-machine layout optimization model was constructed. The scheduling interface was modularized, and the importance of each module was analyzed. Based on the particle swarm optimization algorithm with an introduced inertia weight, the optimal layout plan was obtained. Following the optimization strategy for human-machine interface information display, the colors, fonts, and icons of the interface were optimized. An eye-tracking experimental platform was built, with AOI first fixation time, AOI fixation duration, AOI fixation frequency, AOI average reaction time, and hotspot maps selected as evaluation indicators of human-machine interface task performance. The results demonstrated that: ① the layout derived from the human-machine layout optimization model increased the dispatcher's attention by 52%; ② first fixation time, AOI fixation duration, and AOI average reaction time all had P-values less than 0.05, indicating significant differences, and the optimized scheduling interface improved average AOI reaction time by 50%; ③ although the P-value of AOI fixation frequency was greater than 0.05 and thus not statistically significant, data comparison revealed its practical significance; ④ the hotspot maps conformed to the field-of-view level divisions, with eye movements mainly concentrated within the optimal field of view. The layout scheme and information display optimization strategy derived from the constructed human-machine layout optimization model can facilitate the rational allocation of dispatchers' attention, enhance their task performance, and provide a reference for optimizing human-machine interface design.
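The inertia-weighted velocity update used in this family of optimizers can be sketched generically; the cost function, bounds, and coefficients below are placeholders for illustration, not the paper's layout model.

```python
import random

def pso(cost, dim=2, n=20, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer with inertia weight w.
    c1/c2 weight the pull toward each particle's personal best and
    the swarm's global best; returns (best position, best cost)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]
    pbest_c = [cost(p) for p in pos]
    g = min(range(n), key=lambda i: pbest_c[i])
    gbest, gbest_c = pbest[g][:], pbest_c[g]
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                # inertia term + cognitive term + social term
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < pbest_c[i]:
                pbest[i], pbest_c[i] = pos[i][:], c
                if c < gbest_c:
                    gbest, gbest_c = pos[i][:], c
    return gbest, gbest_c
```

In a layout application the cost would score a candidate module arrangement (e.g. penalizing important modules placed outside the optimal field of view); the inertia weight `w` balances global exploration early on against local refinement later.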

Research on upper limb rehabilitation product design based on NFBMS innovative design synthesis model
LI Xiaoying, YANG Lin, WANG Xingda, YAN Luochuang, TANG Yi
2025, 46(1): 211-220.  DOI: 10.11996/JG.j.2095-302X.2025010211

In order to improve the practicability and rationality of upper limb rehabilitation training products, the function-behavior-structure (FBS) model was extended, taking the features of upper limb rehabilitation movements as the starting point. A design model for upper limb rehabilitation products based on need-function-behavior-motion-structure (NFBMS) was constructed by integrating user needs and rehabilitation actions. A demand index evaluation system was established through user research, and the entropy weight method was employed to prioritize user needs. The NFBMS model was then used as the design framework to perform design decoupling. Combined with upper limb rehabilitation actions, the mapping process of the conceptual model of upper limb rehabilitation products was systematically analyzed, and the mapping results were solved through design structure matrix (DSM) clustering. A function-structure coupling method for upper limb rehabilitation products was established and used to design a wearable upper limb rehabilitation device for stroke patients. NX 12.0 software was used for kinematic simulation analysis to verify the rationality of the product structure. The research demonstrated that the NFBMS-based method, combined with kinematic simulation, enhanced the functional structure of upper limb rehabilitation equipment and provided a new idea for the innovative design of similar products.
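The entropy weight step can be illustrated in isolation with a generic sketch (the demand data below are invented, not the paper's survey results): indicators whose scores vary more across respondents carry more information, have lower entropy, and therefore receive larger weights.

```python
import math

def entropy_weights(matrix):
    """Entropy weight method sketch: rows are respondents (or samples),
    columns are demand indicators. Returns one normalized weight per
    column; a column with identical scores gets (near-)zero weight."""
    m, n = len(matrix), len(matrix[0])
    col_sums = [sum(row[j] for row in matrix) for j in range(n)]
    k = 1.0 / math.log(m)                 # normalizes entropy to [0, 1]
    weights = []
    for j in range(n):
        # information entropy of column j over its share distribution p_ij
        e = -k * sum((row[j] / col_sums[j]) * math.log(row[j] / col_sums[j])
                     for row in matrix if row[j] > 0)
        weights.append(1.0 - e)           # divergence degree of indicator j
    total = sum(weights)
    return [w / total for w in weights]
```

With two respondents scoring indicator A identically (1, 1) and indicator B very differently (1, 9), essentially all of the weight lands on indicator B, which is the intuition behind using entropy to rank demand indices.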

Visual interactive meaning evaluation method of movie posters based on deep learning
WANG Yan, ZHANG Muyu, LIU Xiuzhen
2025, 46(1): 221-232.  DOI: 10.11996/JG.j.2095-302X.2025010221

In recent years, applying deep learning to the intelligent evaluation of image aesthetics has become a trend. However, high-level aesthetic description tasks still lack sufficient annotated data, and the quality and diversity of dataset annotations need improvement. To address this, the research took the interactive meaning of visual grammar as its starting point and introduced deep convolutional neural networks to evaluate the visual interactive meaning of movie posters. Firstly, a word-segmentation tool was utilized to extract the core semantics of visual interactive meaning from academic literature on movie poster reviews, and the mapping relationship between visual interactive meaning and the characteristic elements of movie posters was summarized with the aid of morphological analysis. Secondly, a collection of outstanding movie posters was gathered, and a dataset for evaluating the visual interactive meaning of movie posters was constructed in combination with expert reviews. Finally, a convolutional neural network was employed to extract features from the movie poster samples, establishing an evaluation model for the visual interactive meaning of movie posters whose feasibility was verified through practical creation. This method extends computer aesthetic evaluation to the field of movie poster design, constructing an objective evaluation model by simulating human vision and aesthetic cognition. The model can provide designers with more precise insights into users' aesthetic needs and offer a reference for forward-looking design.

The effects of eye-control speed and target travel distance on user interaction efficiency
ZHANG Ting, LAI Jiandu, HOU Guanhua, ZHANG Jingjing
2025, 46(1): 233-240.  DOI: 10.11996/JG.j.2095-302X.2025010233

Eye-controlled interaction, a human-computer interaction method with natural interaction characteristics, enables users to perform convenient, fast, real-time operations through eye movements. Existing research on eye control has primarily compared uniform-speed performance across different trajectories, which remains somewhat disconnected from practical applications of eye control. To investigate the impact of different eye-control speeds and target movement distances on user interaction performance in an eye-control environment, 23 participants were recruited for two eye-control experiments. The first experiment adopted a single-factor repeated-measures design (speed: 2 deg/s, 4 deg/s, 6 deg/s, 8 deg/s) and evaluated the interaction performance and user experience at each speed by collecting data such as task completion time and accuracy. The second experiment adopted a 2 (speed: constant, variable) × 3 (distance: short, medium, long) repeated-measures design to further compare interaction performance between constant and variable speed at different eye-control distances. Each participant used their eyes to move the target a given distance under each speed condition. The results indicated that when users interacted with targets by eye control, 6 deg/s was a comfortable uniform eye-control speed, with significantly better interaction performance at short and medium distances; over long distances, however, variable speed outperformed constant speed. These findings suggest that speed and distance interact to affect eye-control performance, providing guidance for designers in setting eye-control speed and target movement distance parameters to enhance the eye-control interaction experience.

Published as
Published as 1, 2025
2025, 46(1): 241-241. 