Journal of Graphics

Current Issue

    Image Processing and Computer Vision
    A post-training quantization method for lightweight CNNs
    YANG Jie, LI Cong, HU Qinghao, CHEN Xianda, WANG Yunpeng, LIU Xiaojing
    2025, 46(4): 709-718.  DOI: 10.11996/JG.j.2095-302X.2025040709

    Current post-training quantization methods can achieve near-lossless quantization at high bit-widths; however, for lightweight convolutional neural networks (CNNs), the quantization error remains non-negligible, especially at low bit-widths (<4 bits). To address this, a post-training quantization method for lightweight CNNs, called the block-level BatchNorm learning (BBL) method, was proposed. Unlike current post-training quantization methods that merge the batch normalization layers, this method retained the weights of the batch normalization layers on a per-block basis, learned the quantized model parameters and batch normalization parameters from a block-level feature-map reconstruction loss, and updated the mean and variance statistics of the batch normalization layers. This mitigated, in a simple and effective manner, the distribution-shift problem caused by low-bit quantization of lightweight CNNs. Furthermore, to reduce overfitting to the calibration dataset, a block-level data augmentation approach was constructed by ensuring that different model blocks did not learn from the same batch of calibration data. Extensive experiments on the ImageNet dataset demonstrated that, compared with current post-training quantization algorithms, the BBL method improved accuracy by up to 7.72 percentage points and effectively reduced the quantization error caused by low-bit post-training quantization of lightweight CNNs.
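    As a hedged illustration of why sub-4-bit error becomes non-negligible, the sketch below applies a generic symmetric uniform quantizer (a standard PTQ baseline, not the paper's BBL method) to random toy weights and compares reconstruction error at 8 and 3 bits:

```python
import numpy as np

def uniform_quantize(w, bits):
    """Symmetric uniform quantization of a weight tensor.
    A generic PTQ baseline, not the BBL method from the paper."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale  # de-quantized weights

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=(64, 32))            # toy conv weights
err8 = np.mean((w - uniform_quantize(w, 8)) ** 2)   # 8-bit MSE
err3 = np.mean((w - uniform_quantize(w, 3)) ** 2)   # 3-bit (<4 bit) MSE
```

    The block-level BatchNorm retention and feature-map reconstruction loss described above would operate on top of such a quantizer to recover the accuracy lost at the 3-bit setting.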

    Figures and Tables | References | Related Articles | Metrics
    An object detection algorithm for powerline inspection based on the feature focus & diffusion network
    GUO Ruidong, LAN Guiwen, FAN Donglin, ZHONG Zhan, XU Zirui, REN Xinyue
    2025, 46(4): 719-726.  DOI: 10.11996/JG.j.2095-302X.2025040719

    UAV images for powerline inspection usually have complex backgrounds and often contain many small targets, which may lead to high rates of missed and false detections when processed by general object-detection feature-extraction networks. To address this, a feature focus & diffusion network (FFDN) was proposed for feature fusion, and an improved algorithm (YOLOv8-SFD) based on FFDN and YOLOv8 was designed for powerline component detection. Space-to-depth non-strided convolutions (SPDConv) were employed in the backbone network to preserve small-scale features and reduce the feature loss caused by strided convolutions. The traditional feature pyramid network was replaced with the proposed FFDN: at the feature-fusion stage, its feature focus modules expanded the receptive field and fused multi-scale features, and their output feature maps were then diffused across different scales to enhance small-target detection accuracy. Finally, the original YOLOv8 head was replaced with a dynamic detection head (DyHead) integrating three attention mechanisms (scale, space, and task) to further enhance performance. Experimental results demonstrated that YOLOv8-SFD achieved a precision of 76.7%, 7.6 percentage points higher than YOLOv8n; a recall of 43.0%, 2.0 percentage points higher; and an mAP of 48.2%, 3.8 percentage points higher. YOLOv8-SFD effectively enhanced detection precision for small and occluded targets, and its detection speed reached 119 FPS, satisfying real-time detection requirements.
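    The space-to-depth step behind SPDConv can be sketched as follows: a lossless rearrangement of spatial blocks into channels, so no information is discarded the way a strided convolution would discard it (the exact block size and memory layout used in YOLOv8-SFD are assumptions here):

```python
import numpy as np

def space_to_depth(x, block=2):
    """Rearrange (C, H, W) spatial blocks into channels.
    Layout details are an assumption, not taken from the paper."""
    c, h, w = x.shape
    x = x.reshape(c, h // block, block, w // block, block)
    x = x.transpose(0, 2, 4, 1, 3)
    return x.reshape(c * block * block, h // block, w // block)

x = np.arange(1 * 4 * 4, dtype=float).reshape(1, 4, 4)
y = space_to_depth(x)  # (1, 4, 4) -> (4, 2, 2), all values preserved
```

    A non-strided convolution applied after this rearrangement sees the full-resolution signal in its channel dimension, which is the motivation for using SPDConv on small targets.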

    Zero-shot style transfer based on decoupled diffusion models
    LEI Songlin, ZHAO Zhengpeng, YANG Qiuxia, PU Yuanyuan, GU Jinjing, XU Dan
    2025, 46(4): 727-738.  DOI: 10.11996/JG.j.2095-302X.2025040727

    Zero-shot style transfer aims to transfer a given source image into a target style domain described by a text prompt, without relying on a style image. Existing methods typically require time-consuming fine-tuning or optimization, while those avoiding such steps often fail to achieve satisfactory alignment between content and style. A two-branch framework was proposed that enabled zero-shot style transfer with content-style alignment, without training or optimization. Leveraging the diffusion model's U-Net denoising network, the content branch first denoised the input image and extracted content features, preserving the source domain's content structure. The style branch then employed a gradient-guided method to extract style information from the text prompt, which was transferred to the denoised image. Additionally, style features were derived from the U-Net's skip connections during the style branch's sampling process, ensuring a clear separation between content and style. This decoupling allowed effective style transfer while mitigating the entanglement of content and style within a single network. Finally, a feature modulation module (FMM) was introduced to fuse the content and style features from the two branches, ensuring alignment and minimizing the impact on content during style transfer. Experimental results demonstrated that the proposed method achieved high-quality style transfer on any content image without training or optimization.

    Intelligent depiction to illumination and shadow: robust video shadow extraction based on SAM
    CHEN Dong, LI Changlong, DU Zhenlong, SONG Shuang, LI Xiaoli
    2025, 46(4): 739-745.  DOI: 10.11996/JG.j.2095-302X.2025040739

    A video shadow detection method based on the segment anything model (SAM) was proposed to address the low accuracy and robustness of traditional methods when handling complex, dynamic shadows caused by lighting variations and object occlusions. The SAM decoder was fine-tuned to better adapt to shadow detection, leveraging SAM's accurate segmentation ability to extract shadow areas in key frames. The XMem model, incorporating sensory memory, short-term memory, and long-term memory, was introduced to integrate information from adjacent frames, thereby optimizing and stabilizing the shadow detection results. Experimental results showed that the proposed method reduced the mean absolute error by approximately 31.8% and improved the intersection-over-union ratio by about 19.7% compared with traditional approaches. Both qualitative and quantitative evaluations indicated that the proposed method not only improved the accuracy of video shadow detection but also exhibited superior robustness.
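    The two reported metrics can be computed from binary shadow masks as below (toy masks for illustration, not the paper's data):

```python
import numpy as np

def mask_mae(pred, gt):
    """Mean absolute error between two binary masks (fraction of wrong pixels)."""
    return float(np.mean(np.abs(pred.astype(float) - gt.astype(float))))

def mask_iou(pred, gt):
    """Intersection-over-union ratio between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter / union) if union else 1.0

gt   = np.zeros((8, 8), bool); gt[2:6, 2:6] = True    # 16-pixel shadow
pred = np.zeros((8, 8), bool); pred[2:6, 3:7] = True  # shifted estimate
mae = mask_mae(pred, gt)  # 8 mismatched pixels / 64 = 0.125
iou = mask_iou(pred, gt)  # 12 / 20 = 0.6
```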

    Hierarchical attention spatial-temporal feature fusion algorithm for 3D human pose and shape estimation
    YAN Zhuoyue, LIU Li, FU Xiaodong, LIU Lijun, PENG Wei
    2025, 46(4): 746-755.  DOI: 10.11996/JG.j.2095-302X.2025040746

    Monocular-video-based 3D human pose and shape estimation plays an important role in virtual try-on and special-effects production. To address insufficient human-body modeling, overly simple spatial-temporal feature representation, and limited estimation accuracy in 3D human pose and shape estimation from monocular videos, a hierarchical-attention spatial-temporal feature-fusion algorithm was proposed. Firstly, hierarchical attention was applied to model human body parts in hierarchical spatial modeling, yielding learnable human-pose spatial features. Secondly, these features were combined with a parametric human template to guide the spatial-temporal modeling of human motion temporal features, achieving spatial-temporal feature fusion. Finally, a method for joint optimization of 3D human pose and shape was proposed, and a more accurate and smoother 3D human mesh was regressed by a multilayer perceptron. Experimental results on the Human3.6M dataset showed an MPJPE of 56.1 mm and an ACC-ERR of 3.4 mm/s², reductions of 0.5% and 5.6% compared with the state-of-the-art method, improving the accuracy of 3D human pose and shape estimation and generating accurate, smooth 3D human meshes. Furthermore, testing on 3DPW and Internet videos confirmed the robustness of the proposed method under the challenge of fast motion.
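    MPJPE, the first metric reported above, is the mean Euclidean distance between predicted and ground-truth joints. A minimal sketch with a hypothetical 17-joint skeleton:

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error, in the units of the inputs (mm here)."""
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))

gt = np.zeros((17, 3))                 # 17 joints at the origin (toy data)
pred = gt + np.array([3.0, 0.0, 4.0])  # every joint off by a 3-4-5 vector
err = mpjpe(pred, gt)                  # 5.0 mm
```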

    Object depth estimation methods for high photon flux environments
    YANG Jiaxi, YU Letian, BAO Qirui, BI Sheng, MA Xiaodou, Yang Shengqi, JIANG Yutong, FANG Jianru, WEI Xiaopeng, YANG Xin
    2025, 46(4): 756-762.  DOI: 10.11996/JG.j.2095-302X.2025040756

    The high temporal resolution and high precision of single-photon avalanche diodes (SPAD) have opened up a wide range of applications, especially in fields such as computer vision and computational imaging with increasing algorithmic performance demands. Accurate depth estimation can be achieved for various targets using SPAD measurements; however, each time a SPAD device detects a photon, it enters a quenching period during which it cannot detect further photons. When the environment contains a large number of photons, photon arrivals are therefore more likely to be recorded in earlier bins than later ones, producing an obvious histogram distortion towards the shorter end of the temporal axis, and the distortion worsens as photon flux (the number of photons detected per unit time) increases. This phenomenon, known as the pileup effect, reduces the accuracy of depth estimation algorithms. In this paper, a SPAD-based prototype was first constructed to collect single-photon measurements under several photon-flux settings, and a single-photon dataset was developed to study the pileup effect in depth estimation tasks. Based on this dataset, a depth estimation network was then designed to learn photon flux as a global feature, integrating local spatial features and global flux-based features from the SPAD measurements. Extensive experiments demonstrated that the network achieved significantly superior depth estimation performance under several photon-flux settings with pileup effects.
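    The pileup distortion described above can be reproduced in a few lines: when only the first photon of each laser cycle is recorded, high flux piles counts into early bins. This is a deliberately simplified dead-time model, not the authors' sensor or acquisition pipeline:

```python
import numpy as np

def first_photon_histogram(arrival_prob, n_bins, n_cycles, rng):
    """Histogram only the FIRST detected photon per laser cycle, which is
    what a SPAD with a long quenching (dead) time effectively records."""
    hist = np.zeros(n_bins, int)
    for _ in range(n_cycles):
        hits = rng.random(n_bins) < arrival_prob  # photon arrivals per bin
        idx = np.argmax(hits)                     # first True (or 0 if none)
        if hits[idx]:
            hist[idx] += 1
    return hist

rng = np.random.default_rng(1)
low  = first_photon_histogram(0.01, 20, 5000, rng)  # low photon flux
high = first_photon_histogram(0.50, 20, 5000, rng)  # high photon flux
# `high` concentrates in the earliest bins; `low` stays near-uniform
```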

    Semi-supervised pulmonary airway segmentation based on dynamically decoupling intra-class regions
    WANG Ziyu, CAO Weiwei, CAO Yuzhu, LIU Meng, CHEN Jun, LIU Zhaobang, ZHENG Jian
    2025, 46(4): 763-774.  DOI: 10.11996/JG.j.2095-302X.2025040763

    Accurate airway segmentation from computed tomography (CT) images serves as the foundation for diagnosing and treating various pulmonary diseases. However, the complex tree-like structure renders pixel-level annotation extremely difficult. Semi-supervised learning methods provide a way to segment airways with limited labeled data. However, in airway segmentation tasks there are significant intra-class differences between the large airways (trachea and main branches) and the small airways (peripheral bronchioles) in voxel quantity, branch count, and morphological structure. This severe intra-class imbalance makes the model prone to overfitting to the dominant segmentation classes in semi-supervised learning with limited annotations, resulting in insufficient representation learning for peripheral bronchioles, poor segmentation accuracy, and limited clinical applicability. To address this issue, a novel semi-supervised pulmonary airway segmentation framework based on a single-teacher dual-student three-branch network was proposed, achieving accurate segmentation of the airway tree structure with limited labeled data. In addition, a plug-and-play dynamic threshold module was developed to guide the network in identifying sub-regions of different segmentation difficulty during training. Moreover, a novel intra-class region decoupling strategy was designed, enabling representation learning for sub-regions of different segmentation difficulty through different constrained optimization methods. Experimental results on two public datasets and a real-world dataset demonstrated that the proposed method outperformed current state-of-the-art airway segmentation methods: the Dice coefficient reached 91.96%, while the TD and BD metrics reached 81.88% and 78.32%, respectively, enabling fast and accurate airway segmentation from CT images.
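    The Dice coefficient reported above is the standard overlap measure for segmentation masks; a minimal sketch on toy 2D masks (the paper evaluates 3D airway volumes):

```python
import numpy as np

def dice(pred, gt, eps=1e-8):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A|+|B|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return float((2 * inter + eps) / (pred.sum() + gt.sum() + eps))

gt = np.zeros((10, 10), bool); gt[:, :5] = True       # 50 airway voxels
pred = np.zeros((10, 10), bool); pred[:, 1:6] = True  # shifted prediction
score = dice(pred, gt)  # 2*40 / (50+50) = 0.8
```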

    Computer Graphics and Virtual Reality
    A virtual reality experience for artistic re-creation based on emotion capture
    ZHANG Yufei, DING Ding, LI Zhuying
    2025, 46(4): 775-782.  DOI: 10.11996/JG.j.2095-302X.2025040775

    With the vigorous growth of the virtual reality (VR) industry and the public's rising aesthetic demands, the presentation of artistic works has become a key area of study for VR applications. Immersive art based on VR technology is popular for its ability to provide audiences with a multi-sensory, all-encompassing experience, including visual and auditory stimuli. However, existing applications are confronted with issues such as a monolithic experience, limited interaction with the environment, and an insufficient sense of immersion. In view of these problems, an emotion-capture-based artistic re-creation experience system in a virtual reality environment was designed and developed, using technologies such as facial recognition to capture emotions and enable interaction between users and artworks. Users' feelings were fed back into the artwork through interaction, thereby enriching and personalizing the experience of the artwork and further regulating the experiencer's emotions. In addition, an empirical study using questionnaires was conducted to evaluate system usability, immersion, functionality, and users' emotions. The results showed that users were satisfied with the system and that users' emotional states improved to a certain extent.

    Study on the interaction of an AI-based motion capture technology in rehabilitation training systems for neuromyelitis optica
    ZHANG Shuai, HONG Ao, HU Hengrui, LAN Mingying, XI Xiaochao
    2025, 46(4): 783-792.  DOI: 10.11996/JG.j.2095-302X.2025040783

    Artificial intelligence (AI) vision models and their motion capture technologies have attracted much attention in the medical field in recent years, but they have not been extensively applied to neuromyelitis optica (NMO) rehabilitation. The purpose of this study was to design and implement a rehabilitation training system for neuromyelitis optica using Mediapipe posture tracking technology and interactive feedback design. Specifically, the following core work was conducted: ① Research on the formulation of rehabilitation training plans. Through a combination of qualitative and quantitative methods, 36 NMO rehabilitation patients were shadowed and interviewed, and the follow-up form of the Chinese Central Nervous System Inflammatory Demyelinating Disease Registry was used to establish a comprehensive evaluation system. ② Feedback analysis of rehabilitation training action guidance. A rehabilitation training action library was designed, and key indicators were determined, such as the coordinates of human joint points, the angles formed at joint points, and the difference between joint-point information and standard information. A human posture detection model was constructed using the Mediapipe framework to realize real-time recognition of and feedback on rehabilitation training actions. ③ Design of the rehabilitation system's interactive feedback mechanism. Design principles of timeliness, specificity, guidance, and traceability were established; a feedback process covering scene positioning, dimensional analysis, and interaction mode was designed; and the Fogg behavior model was introduced. Finally, a lightweight rehabilitation training platform based on WeChat mini-programs was built.
In this study, a neuromyelitis optica rehabilitation training system and its supporting methods were obtained, verifying the feasibility of AI motion capture technology in this field and improving the experience of NMO patients during rehabilitation training. The system has been deployed at Beijing Tiantan Hospital, Capital Medical University, demonstrating its broad application value.
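    The joint-angle indicator described in step ② can be computed directly from pose landmarks. The sketch below uses hypothetical 2D landmark coordinates (Mediapipe itself returns normalized 3D landmarks; the landmark choice here is illustrative):

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by points a-b-c, e.g. the elbow
    angle from shoulder, elbow, and wrist landmarks."""
    u, v = np.asarray(a) - np.asarray(b), np.asarray(c) - np.asarray(b)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# hypothetical landmarks: shoulder above, elbow at origin, wrist to the right
angle = joint_angle([0.0, 1.0], [0.0, 0.0], [1.0, 0.0])  # 90 degrees
deviation = abs(angle - 90.0)  # difference from a standard-pose angle
```

    Comparing such angles against the stored standard-action values is one simple way to drive the real-time guidance feedback described above.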

    Investigation of scene scaling gains in redirected walking within virtual reality
    DU Xin, REN Yangfu, XU Senzhe, WANG Juhong, ZHENG Yufei, ZHANG Songhai
    2025, 46(4): 793-806.  DOI: 10.11996/JG.j.2095-302X.2025040793

    The continuous advancement of virtual reality (VR) technology offers increasingly immersive experiences. However, optimizing user perception and walking efficiency in virtual environments (VE) remains a pressing challenge. This study conducted user experiments to investigate the perception threshold of scaling gain, user acceptance of extreme scene downscaling, and the combined effects of scaling and translational gains, aiming to enhance interaction experiences in virtual environments. First, subjects performed a target-following task in virtual scenes with varying scaling gains. Perception data was collected across three different scenarios, revealing a discernible range of perception thresholds for scaling gain. Scene characteristics, including virtual scene size and object density, significantly influenced the perception thresholds. Subjects demonstrated varying sensitivity to scaling gain across different scenarios, with smaller virtual environments and higher object density resulting in narrower perception threshold ranges. Additionally, a Likert scale was used to evaluate subject acceptance of extreme scene downscaling. The results indicated that excessive downscaling substantially reduced comfort and negatively impacted user experience. Furthermore, experiments on the combined effects of scaling and translational gains showed minimal impact on user perception, with participants reporting low discomfort and high acceptance of the combined gains.

    Acceleration method for neural implicit surface reconstruction with joint point cloud priors
    GUO Mingce, HUANG Bei, CHENG Lechao, WANG Zhangye
    2025, 46(4): 807-817.  DOI: 10.11996/JG.j.2095-302X.2025040807

    To address the high training-time cost of current neural implicit surface reconstruction tasks, a sampling method guided by joint point-cloud priors was proposed, which reduced model training time while preserving surface reconstruction quality. Training of neural implicit surface reconstruction networks was accelerated in three ways: firstly, the method alternated between random training-pixel sampling and adaptive training-pixel sampling based on point-cloud projection density, accelerating the model's optimization of the surface locations to be reconstructed; secondly, by utilizing point-cloud priors and the adjacency relationships of sampled pixels, the proposed approach concentrated sampling at locations near the surface along training rays, reducing the number and time cost of importance samples; in addition, it leveraged a sparse point-cloud prior loss to optimize the signed distance field network and periodically updated the point-cloud cache at a fixed iteration interval. Comparative experiments on ten test scenes from the DTU and Tanks-and-Temples datasets demonstrated that the proposed method significantly reduced the training-time cost of neural implicit surface reconstruction while preserving reconstruction quality. Compared with the NeuS neural implicit surface reconstruction method, the approach reduced training time by 35%; with the same training duration, it achieved a 3.1% average increase in peak signal-to-noise ratio (PSNR) and a 3.4% average improvement in structural similarity index (SSIM) for novel-view image predictions.
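    The adaptive half of the alternating sampling scheme can be sketched as drawing training pixels with probability proportional to a projected point-cloud density map (the density construction and alternation schedule used in the paper are assumptions here):

```python
import numpy as np

def sample_pixels(density, n, rng):
    """Draw n pixel coordinates with probability proportional to the
    projected point-cloud density (higher density = likelier surface)."""
    p = density.ravel() / density.sum()
    flat = rng.choice(density.size, size=n, replace=True, p=p)
    return np.stack(np.unravel_index(flat, density.shape), axis=1)

rng = np.random.default_rng(0)
density = np.ones((8, 8)); density[2:4, 2:4] = 50.0  # dense surface region
pix = sample_pixels(density, 1000, rng)
in_dense = np.mean((pix[:, 0] >= 2) & (pix[:, 0] < 4) &
                   (pix[:, 1] >= 2) & (pix[:, 1] < 4))
# most samples land in the dense 2x2 region (expected fraction ~0.77)
```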

    Geometric feature-based non-uniform rectangular mesh generation approach
    LENG Juelin, XU Quan, BAO Xianfeng
    2025, 46(4): 818-825.  DOI: 10.11996/JG.j.2095-302X.2025040818

    Target objects in complex electromagnetic environment simulations typically exhibit high geometric complexity, multi-scale geometric features, and a large number of components. To ensure both low computational cost and high geometric resolution, non-uniform rectangular meshes were employed to divide the computational domain for the finite-difference time-domain (FDTD) method. In this paper, an automatic approach was proposed for generating non-uniform rectangular meshes based on geometric features. Firstly, a feature extraction algorithm for the discrete facets of CAD models was applied to capture curvature and thickness features. Secondly, according to the extracted geometric features and user-defined refinement settings, local size functions reflecting the desired mesh step size were constructed separately along each coordinate direction. Finally, the locations of mesh nodes on each coordinate axis were calculated from the local size functions, and material properties were assigned to the corresponding mesh cells. Numerical results demonstrated that the proposed approach was remarkably effective in generating non-uniform rectangular meshes for complex geometric models, and it has been successfully applied to time-domain full-wave electromagnetic simulations of complex, electrically large targets.
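    The per-axis node placement step can be sketched as a simple march driven by a local size function: smaller steps near a geometric feature, larger steps elsewhere. The size function below is illustrative, not the paper's curvature/thickness-derived one:

```python
def mesh_nodes(start, end, size_fn):
    """Place 1D mesh nodes by marching along an axis with the local
    size function; the final node is clamped to the domain boundary."""
    nodes = [start]
    while nodes[-1] < end:
        step = size_fn(nodes[-1])
        nodes.append(min(nodes[-1] + step, end))
    return nodes

# illustrative size function: finer steps near a feature at x = 5
fine = mesh_nodes(0.0, 10.0, lambda x: 0.2 if abs(x - 5) < 1 else 1.0)
# more nodes than the 11 a uniform unit-step mesh would produce
```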

    Multi-dimensional implicit neural compression of time-varying volume data based on damping activation functions
    GUO Rong, JIAO Chenyue, GAO Xin, DENG Jiakang, TIAN Xiaoxian, BI Chongke
    2025, 46(4): 826-836.  DOI: 10.11996/JG.j.2095-302X.2025040826

    Time-varying volume data from large-scale numerical simulations holds significant value in scientific research, but its vast size poses severe challenges to I/O bandwidth and disk storage, hindering the efficiency of data visualization and analysis. Traditional lossy compression techniques tend to lose important features at high compression ratios. While deep learning models have shown good performance in compressing volumetric data, they typically require access to the entire dataset before compression, which imposes certain limitations. Implicit neural representations (INRs) have emerged as a powerful tool for compressing large-scale volumetric data due to their versatility. However, existing methods primarily rely on a single large multilayer perceptron (MLP) to encode the global data, resulting in slow training and inference and difficulty in accurately capturing high-frequency information. To address these issues, a uniform-partitioning implicit neural representation method based on a damping activation function was proposed for efficient compression of large-scale time-varying volume data. First, a uniform partitioning strategy was applied, with multiple MLPs fitting local data separately; by equalizing partition sizes, balanced parallelization among the MLPs was achieved, significantly improving the efficiency of both training and inference. Second, a damping function was introduced as the activation function to overcome spectral bias, enabling the capture of high-frequency information without extra positional encoding. Finally, adjacent-cell information replication was used to resolve the boundary artifacts introduced by partitioning. Experimental results showed that this method achieved higher compression efficiency and finer detail preservation across several time-varying volume datasets, demonstrating superior compression performance.
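    The paper's exact damping function is not reproduced here. As an assumed illustrative form, a damped oscillation such as e^(-α|x|)·sin(x) behaves like the sinusoidal activations used elsewhere against spectral bias while decaying for large inputs:

```python
import numpy as np

def damping_activation(x, alpha=0.1):
    """Illustrative damped-oscillation activation: exp(-alpha*|x|) * sin(x).
    ASSUMPTION: the exact form used in the paper is not known here."""
    return np.exp(-alpha * np.abs(x)) * np.sin(x)

x = np.linspace(-20, 20, 1001)
y = damping_activation(x)  # oscillatory near 0, bounded, decays outward
```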

    Adaptive two-hand reconstruction network for monocular visible light environments
    LIAO Guoqiong, HUANG Longjie, LI Qingxin, GU Yong, LI Haibo
    2025, 46(4): 837-846.  DOI: 10.11996/JG.j.2095-302X.2025040837

    An accurate reconstruction of the hand mesh is crucial for a natural human-computer interaction experience, but hand reconstruction remains highly challenging due to factors such as hand occlusion, the difficulty of collecting hand-interaction data outdoors, and interference from complex lighting environments. Most existing work has achieved good results in low-interference environments such as laboratories, but reconstruction performance in complex lighting scenes remains poor. To solve these problems, an adaptive two-hand reconstruction network was proposed for monocular visible-light environments. By introducing single-hand detection boxes and using a 2D complex-lighting-scene dataset for weak supervision, the model generalized to complex lighting scenarios. The designed hand feature interaction module effectively established long-distance dependencies between left- and right-hand features, alleviating the lack of hand-interaction information in single-hand detection boxes. The designed adaptive fusion strategy effectively integrated interaction features and single-hand features, enhancing the robustness of the model. Experimental results demonstrated that the method achieved the best results on the HIC dataset, which comprises multiple complex lighting scenarios.

    Weighted respiration waveform reconstruction algorithm based on empirical modal decomposition
    GUO Linlin, YAO Min, ZHANG Wenqing, ZHANG Jia, SUN Jiande
    2025, 46(4): 847-854.  DOI: 10.11996/JG.j.2095-302X.2025040847

    Respiration rate estimation based on Wi-Fi signals has garnered significant attention from academia and industry due to its non-contact advantage. However, extracting high-quality respiration waveforms to ensure accurate respiration rate estimation has been a persistent challenge. In this paper, a weighted respiration waveform reconstruction algorithm based on empirical mode decomposition (EMD), referred to as WEMD (weighted empirical mode decomposition), was proposed to improve the accuracy and robustness of respiration rate estimation in different environments. First, subcarriers with better periodicity were selected using the breathing-to-noise ratio (BNR) and I/O decomposition methods, and multiple respiration waveforms were generated. Second, principal component analysis (PCA) was applied to calibrate the respiration waveforms, and EMD was employed to decompose them. Finally, an adaptive weighting mechanism was designed to reconstruct the waveforms by estimating the correlation between the different frequency components and the original respiratory pattern. Experimental results demonstrated that the WEMD algorithm achieved an average respiration estimation accuracy of over 94% in four indoor experimental environments. The WEMD algorithm not only effectively addressed the impact of low-quality Wi-Fi data on personal respiration estimation, but also accurately estimated irregular respiration rates, achieving high-precision respiratory rate monitoring across various environments with an error below 10%.
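    The final weighting step can be sketched as recombining already-decomposed components, each weighted by its correlation with a reference respiration pattern. The EMD step itself is assumed done; the clamping of anti-correlated components is a design choice of this sketch, not necessarily the paper's:

```python
import numpy as np

def weighted_reconstruct(components, reference):
    """Recombine decomposed components, weighting each by its (clamped)
    correlation with a reference respiration pattern."""
    weights = []
    for c in components:
        r = np.corrcoef(c, reference)[0, 1]
        weights.append(max(r, 0.0))  # drop anti-correlated noise components
    weights = np.array(weights)
    weights = weights / weights.sum()
    return np.sum(weights[:, None] * np.asarray(components), axis=0)

t = np.linspace(0, 10, 500)
breath = np.sin(2 * np.pi * 0.25 * t)  # ~15 breaths/min component
noise = np.sin(2 * np.pi * 5.0 * t)    # high-frequency artifact component
rec = weighted_reconstruct([breath, noise], breath)
# reconstruction stays dominated by the breathing-like component
```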

    BIM/CIM
    Cross military and civilian housing information sharing mechanism based on three-dimensional spatial syntax and semantic richness
    HUANG Dongdong, HU Xinliang, DENG Zile, AN Dongyang, DENG Hui, DENG Yichuan
    2025, 46(4): 855-863.  DOI: 10.11996/JG.j.2095-302X.2025040855

    In the current context of cross-military-civilian governance, there is a lack of effective information exchange mechanisms between the military and civilian sides in housing governance, and the security of sensitive information is also challenged. Therefore, a cross-military-civilian housing information-sharing mechanism based on three-dimensional spatial syntax and semantic richness was proposed. IFC files were used as the input information source, and 3D spatial-syntax extraction algorithms were utilized so that 3D indoor space could be simplified into a model composed of nodes and edges, presenting the connection structure of the indoor space as a topological graph. External information sources were employed to semantically enrich the housing information in a Neo4j graph database, and the Cypher language was applied to implement information queries over the semantically enriched 3D spatial-syntax model, thereby ensuring the efficiency and security of cross-military information sharing. The accuracy and effectiveness of the method were verified on a 3D model of a campus building at a university. The implementation of housing information sharing not only improved resource utilization and allocation in cross-military-civilian contexts, but also enhanced emergency management and response capabilities in emergency situations.

    Digital Design and Manufacture
    Data mining and deep structured semantic model-based process sequence recommendation method
    ZHENG Jiahui, GUO Yu, WU Tao, WANG Shengbo, HUANG Shaohua, ZHENG Kaiwen
    2025, 46(4): 864-873.  DOI: 10.11996/JG.j.2095-302X.2025040864

    To address the challenge of “data overload” encountered by traditional “experience-driven” methods in aerospace manufacturing process design, a process sequence recommendation method based on data mining and deep structured semantic models was proposed. The method integrates the PrefixSpan algorithm and BERT to extract typical manufacturing process sequences and their associated capabilities from component instance data, thereby constructing a reusable and updatable manufacturing process knowledge base. Building on this foundation, an enhanced spatial-channel attention mechanism is introduced to accommodate the characteristics of aerospace manufacturing data, enabling implicit feature extraction from part instance data. Additionally, to mitigate the “cold start” issue caused by the uneven distribution of component instances, self-supervised learning is employed to uncover the deep structure of the data, ensuring the model's generalization ability and improving its capability to learn from small-sample instances. By combining a dual-channel attention-based deep semantic model with self-supervised learning, the approach effectively extracts features, acquires knowledge, and accurately recommends process sequences suitable for aerospace manufacturing, even under data imbalance. Finally, a case study involving a specific aerospace component validated the proposed method. Experimental results demonstrate that the method consistently outperforms benchmark models across various metrics in manufacturing process sequence recommendation, confirming its effectiveness and fulfilling the intelligent process design requirements of aerospace engineers.
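    The sequence-mining step can be illustrated with a much-simplified stand-in for PrefixSpan: counting contiguous process n-grams above a support threshold (PrefixSpan proper also mines non-contiguous subsequences via projected databases; the route names below are hypothetical):

```python
from collections import Counter

def frequent_subsequences(sequences, min_support, max_len=3):
    """Count contiguous process n-grams (length 2..max_len) and keep
    those meeting the support threshold. A simplified PrefixSpan stand-in."""
    counts = Counter()
    for seq in sequences:
        for n in range(2, max_len + 1):
            for i in range(len(seq) - n + 1):
                counts[tuple(seq[i:i + n])] += 1
    return {p: c for p, c in counts.items() if c >= min_support}

routes = [  # hypothetical process routes from component instances
    ["rough-mill", "finish-mill", "drill", "deburr"],
    ["rough-mill", "finish-mill", "deburr"],
    ["rough-mill", "finish-mill", "drill", "inspect"],
]
patterns = frequent_subsequences(routes, min_support=3)
# only rough-mill -> finish-mill occurs in all three routes
```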

    Figures and Tables | References | Related Articles | Metrics
    Study on flexible adaptive trajectory planning for blade robot abrasive cloth wheel polishing
    ZHANG Jingjing, GU Zhengzhao, WANG Junqing, LIU Jia
    2025, 46(4): 874-882.  DOI: 10.11996/JG.j.2095-302X.2025040874
    HTML    PDF 10     21

    In the polishing of complex curved blade parts, existing trajectory planning methods mainly treat the polishing contact as a geometric problem and do not adequately account for the flexible deformation of the polishing tool or for material removal, resulting in trajectory planning errors. To improve blade surface machining accuracy, the influence of abrasive cloth wheel flexible deformation and contact-surface material removal on the step length and row spacing was studied. First, the contact area was analyzed and a material removal model was established; the step length and row spacing of the flexible adaptive polishing path points were then calculated, and a constant feed speed was employed for interpolation. Second, the NURBS curve of the blade polishing region was extracted, and the trajectory points for blade surface polishing were generated via offline simulation. Finally, verification tests were conducted on a purpose-built polishing platform: the surface roughness of the blade concave and convex surfaces reached Ra ≤ 0.3 μm, the leading and trailing edges reached Ra ≤ 0.2 μm, and the overall polishing efficiency increased by about 9.40%. These results demonstrated that the flexible adaptive trajectory planning method can effectively improve surface machining accuracy and machining efficiency.
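The step-length and row-spacing calculation can be illustrated with the standard chord-error and scallop-height bounds used in curved-surface path planning. The formulas and the effective-radius correction below are generic CAM approximations, not the paper's flexible-deformation and material-removal model, and all numbers are placeholders.

```python
import math

def step_length(rho, chord_tol):
    """Chord-error-bounded step along the path: for local curvature
    radius rho and allowed chord deviation e, l = 2*sqrt(2*rho*e - e^2)."""
    return 2.0 * math.sqrt(2.0 * rho * chord_tol - chord_tol ** 2)

def row_spacing(rho, scallop, wheel_radius):
    """Scallop-height-bounded spacing between adjacent passes for a
    round tool of radius r on a convex surface of radius rho. A flexible
    abrasive cloth wheel flattens in contact, which would enlarge the
    effective radius and so permit wider rows than this rigid bound."""
    r_eff = rho * wheel_radius / (rho + wheel_radius)  # convex surface
    return 2.0 * math.sqrt(2.0 * r_eff * scallop - scallop ** 2)
```

Looser tolerances permit longer steps and wider rows, which is why modeling the wheel's flattened contact (rather than a rigid point contact) changes the planned point density.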

    Figures and Tables | References | Related Articles | Metrics
    Effect of process parameters on the solidification process of electron beam cold hearth melting
    RAN Hang, LIU Yu, XIAO Junhao, ZHAO Bing
    2025, 46(4): 883-887.  DOI: 10.11996/JG.j.2095-302X.2025040883
    HTML    PDF 15     14

    In the process of casting TC4 titanium ingots in an electron beam cold hearth melting furnace (EB furnace), precise control of process parameters is essential to the final quality of the titanium ingots. To address the quality problems that may arise in this process, the finite element method was used to calculate and analyze the solidification process of TC4 titanium ingots in the EB furnace in detail, and the morphology of the molten pool was studied under different pouring temperatures and pouring speeds. The results showed that increasing the pouring temperature increased the liquidus and solidus depths of the TC4 ingots, which enhanced melt fluidity and promoted a uniform grain distribution, while narrowing the solid-liquid (mushy) zone and shortening the time of solid-liquid coexistence. Increasing the pouring speed enlarged the depth and width of the melt pool, expanded the mushy zone, decreased the temperature gradient, lowered the solid fraction, and reduced grain uniformity. This study provides theoretical support for optimizing the process parameters of electron beam cold hearth melting of TC4 titanium ingots, demonstrating that reasonable control of these two parameters can improve the solidification structure and performance of the ingots.
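The kind of analysis described can be sketched with a toy one-dimensional conduction model of the ingot depth. The paper uses a full finite element model of the EB furnace; the explicit finite-difference scheme and the TC4-like property values below are rough assumptions for illustration only.

```python
import numpy as np

T_LIQ = 1923.0  # approx. TC4 liquidus [K] (assumed)

def pool_depth(T_pour, L=0.2, n=100, alpha=8e-6, T_mold=300.0, steps=2000):
    """Depth of the fully liquid zone (T > liquidus) in a 1-D slab of
    depth L, insulated at the melt surface and chilled at the mold face,
    after `steps` explicit finite-difference time steps."""
    dx = L / (n - 1)
    dt = 0.4 * dx * dx / alpha            # satisfies explicit stability limit
    T = np.full(n, float(T_pour))
    for _ in range(steps):
        T[1:-1] += alpha * dt / dx**2 * (T[2:] - 2.0 * T[1:-1] + T[:-2])
        T[0] = T[1]                       # insulated melt surface
        T[-1] = T_mold                    # water-cooled mold face
    return dx * np.count_nonzero(T > T_LIQ)
```

Even this toy model reproduces the qualitative trend reported above: a higher pouring temperature leaves a deeper liquid pool at the same elapsed time.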

    Figures and Tables | References | Related Articles | Metrics
    Research on the method of repairing optical scanning incomplete model based on three-dimensional auricle template
    LIN Zhiyuan, WANG Yewei, YU Guangzheng, LI Zhelin, ZHANG Xin
    2025, 46(4): 889-898.  DOI: 10.11996/JG.j.2095-302X.2025040889
    HTML    PDF 8     14

    Three-dimensional models of the human auricle are essential for ergonomics and numerical simulation. Optical scanning enables fast modeling, but local recessed structures such as the cavum concha cannot reflect light back to the scanning device, resulting in scanning blind spots. Traditional methods use reverse-engineering software to patch these areas based on the curvature of the surrounding triangular meshes, but both accuracy and efficiency are relatively low. To improve model accuracy, ear impression material can be injected into the cavum concha and the resulting impression model scanned to obtain the precise recessed structure; this model is then manually aligned and globally registered with the auricle scan at their overlapping regions to produce a relatively accurate, complete model. However, the overall process is complex and time-consuming. To address this issue, a method based on a three-dimensional auricle template for repairing incomplete optical scanning models was proposed. First, a statistical shape model library was constructed from accurate complete models of 35 adult subjects, and templates were generated from their correlated features. Then, an improved MeshMonk program was employed to rigidly and non-rigidly register the templates to the incomplete models of 5 new subjects to generate complete models. Finally, the root mean square (RMS) of the deviation distances was used to compare and analyze the geometric differences among the repaired complete model, the generated complete model, and the accurate complete model for the 5 subjects. The results indicated that the RMS error between the generated complete model and the accurate complete model was (0.37±0.01) mm, within the acceptable threshold of 0.50 mm. For point clouds with deviations exceeding 0.50 mm, the RMS error between the generated complete model and the accurate complete model was (0.93±0.12) mm, smaller than the (2.87±0.49) mm between the repaired complete model and the accurate complete model. This demonstrates that the proposed method improves accuracy by approximately 68% over the reverse-engineering repair approach where the point cloud deviation exceeds 0.50 mm. Moreover, the method is streamlined and efficient, suggesting applicability to 3D scanning and modeling of the entire auricle or even the head.
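The evaluation metric is straightforward to state in code: the RMS of nearest-neighbour deviation distances, optionally restricted to deviations above 0.50 mm. The toy point clouds in the usage below are illustrative; real auricle meshes would use point-to-surface distances and a KD-tree rather than brute force.

```python
import numpy as np

def deviation_rms(cloud, reference, threshold=None):
    """RMS of each cloud point's distance to its nearest reference point,
    optionally keeping only deviations greater than `threshold` (mm)."""
    # Pairwise distances via broadcasting: (n_cloud, n_ref)
    d = np.linalg.norm(cloud[:, None, :] - reference[None, :, :], axis=2)
    nearest = d.min(axis=1)
    if threshold is not None:          # e.g. only deviations > 0.50 mm
        nearest = nearest[nearest > threshold]
    return float(np.sqrt(np.mean(nearest ** 2))) if nearest.size else 0.0
```

For a cloud that is a uniform 0.3 mm offset of its reference, the full RMS is 0.3 mm while the >0.50 mm subset is empty.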

    Figures and Tables | References | Related Articles | Metrics
    Industrial Design
    Data modeling and retrieval re-ranking methods for cloud-based product design processes
    SU Zhaojing, GUO Kaiyuan, YANG Mei, CONG Hongyu, YU Suihuai, HUANG Yuexin
    2025, 46(4): 899-908.  DOI: 10.11996/JG.j.2095-302X.2025040899
    HTML    PDF 11     19

    An unstructured data modeling and retrieval method tailored for cloud-based product design was proposed to address the challenges of processing unstructured data in product design, and to overcome the limitations of conventional retrieval systems, namely fixed ranking strategies and a lack of precision on domain-specific data. First, a framework for unstructured data processing was developed to meet the practical needs of innovation and decision-making in cloud-based product design. Next, layout analysis of scientific documents was redefined as an object detection problem, and a multi-element layout analysis and recognition model was built in the context of domain-specific scientific document databases. By constructing a data feature space and label features, combined with the LambdaMART algorithm, dynamic ranking and efficient retrieval of domain-specific scientific document data were achieved. Finally, case studies validated the proposed method’s potential for application in product innovation, providing novel support for data-driven design iteration and precise decision-making.
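The re-ranking objective can be made concrete: LambdaMART is a gradient-boosted-tree ranker trained against a list-wise metric, typically NDCG. Below is a minimal NDCG@k computation with hypothetical relevance labels; the actual document features, labels, and trained ranker are not reproduced here.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k results."""
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the ideal (descending-relevance) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Re-ranking moves the most relevant documents up, raising NDCG:
baseline = [0, 2, 1, 0, 3]   # relevance labels in raw retrieval order
reranked = [3, 2, 1, 0, 0]   # after a learned re-ranker
```

A re-ranking that sorts results by true relevance attains NDCG = 1.0; the baseline order, which buries the most relevant document, scores strictly lower.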

    Figures and Tables | References | Related Articles | Metrics
    Design model of in-vehicle auxiliary information based on risk representation
    KE Shanjun, WANG Yumiao, NIE Chengyang, HE Bangsheng, GUO Dong
    2025, 46(4): 909-918.  DOI: 10.11996/JG.j.2095-302X.2025040909
    HTML    PDF 9     15

    To study how information can accurately characterize risk and thus help drivers perceive the environment correctly, information samples spanning different modalities, variables, and parameters were designed for urgency and annoyance perception measurement experiments. Based on the measurement results, a three-level in-vehicle auxiliary information design model covering modality ordering, variable selection, and parameter optimization was constructed. First, for the visual, auditory, and tactile modalities, the differential sensitivity of urgency perception was compared across the design variables; the variable with the highest differential sensitivity was selected as the risk characterization variable of each modality, and risk changes were characterized by changes in its parameter level. Second, for each non-risk-characterization variable, the differences between perceived urgency and annoyance at different parameter levels were compared; the parameter level with the largest difference was designated the optimal level for that variable, and the auxiliary information model for each modality was constructed by combining it with the risk characterization variables. Then, linear relations between urgency and annoyance were fitted for the three modalities, the annoyance of each modality at the same urgency was compared, and the modalities were prioritized according to the principle of high urgency and low annoyance. Finally, the modal auxiliary information models were superimposed in this order to form a multimodal in-vehicle auxiliary information model with “4-level visual flicker frequency + 5-level tactile vibration duty cycle + 6-level auditory pulse gap”. The constructed model can accurately characterize 15 levels of environmental risk, thereby supporting accurate driver perception and enhancing driving safety.
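One plausible reading of the 15-level encoding (4 + 5 + 6 parameter levels stacked in the derived modality order) can be sketched as follows. The level boundaries, parameter values, and stacking rule below are illustrative assumptions, not the paper's calibrated design.

```python
# Hypothetical parameter scales for the three modalities (placeholders,
# ordered from lowest to highest urgency within each modality).
VISUAL_HZ = [1, 2, 4, 8]                          # 4 flicker frequencies
TACTILE_DUTY = [0.1, 0.2, 0.3, 0.4, 0.5]          # 5 vibration duty cycles
AUDITORY_GAP_MS = [600, 500, 400, 300, 200, 100]  # 6 pulse gaps

def encode_risk(level):
    """Map a risk level in 1..15 to the active modality parameters,
    adding modalities in priority order as risk grows."""
    if not 1 <= level <= 15:
        raise ValueError("risk level must be in 1..15")
    cue = {"flicker_hz": VISUAL_HZ[min(level, 4) - 1]}
    if level > 4:                       # add tactile from level 5 on
        cue["duty_cycle"] = TACTILE_DUTY[min(level - 4, 5) - 1]
    if level > 9:                       # add auditory from level 10 on
        cue["pulse_gap_ms"] = AUDITORY_GAP_MS[level - 10]
    return cue
```

Under this stacking rule each of the 15 levels maps to a distinct cue, with low-annoyance visual cues alone at low risk and all three modalities active at the highest levels.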

    Figures and Tables | References | Related Articles | Metrics
    Published as
    Published as 4, 2025
    2025, 46(4): 919. 
    PDF 8     0
    Related Articles | Metrics