Journal of Graphics

Current Issue

    Image Processing and Computer Vision
    Defect detection method of communication optical cable based on adaptive feature extraction
    WANG Zhidong, CHEN Chenyang, LIU Xiaoming
    2025, 46(2): 241-248.  DOI: 10.11996/JG.j.2095-302X.2025020241

    With the expansion of communication line coverage, traditional inspection methods for electrical corrosion defects in all-dielectric self-supporting (ADSS) optical cables have faced issues of low efficiency and high cost. To address these issues, a detection method for electrical corrosion defects in ADSS communication cables based on adaptive feature extraction was proposed. The method detected electrical corrosion defects in ADSS cables through targeted improvements to the YOLOv8n model. Firstly, an ADown downsampling module was introduced into the backbone network to preserve more detailed information about the cable during downsampling. Subsequently, a context feature enhancement module was introduced, enabling the algorithm to learn optical cable defect features more specifically. Finally, a C2f_DSC module based on adaptive feature extraction was proposed, utilizing dynamic snake convolution in the neck network to enhance the extraction of cable-area features. Experiments on an ADSS cable electrical corrosion dataset demonstrated that, compared with the baseline YOLOv8n model, the proposed algorithm improved mAP50 by 2.5% and mAP50:95 by 2.2%, providing a new and effective method for ADSS cable inspection.
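
    For readers unfamiliar with the module, below is a minimal PyTorch sketch of an ADown-style downsampling block in the spirit of the public YOLOv9 implementation; the exact configuration used in the paper is not given in the abstract, so the channel sizes and wiring are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_act(c_in, c_out, k, s, p):
    """Convolution + BatchNorm + SiLU, the standard YOLO building block."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, s, p, bias=False),
        nn.BatchNorm2d(c_out),
        nn.SiLU(),
    )

class ADown(nn.Module):
    """ADown-style 2x downsampling: the input is softly pre-pooled, split
    into two channel halves, and each half is reduced along a different
    path (strided conv vs. max-pool + 1x1 conv) before concatenation."""
    def __init__(self, c_in, c_out):
        super().__init__()
        c = c_out // 2
        self.cv1 = conv_bn_act(c_in // 2, c, k=3, s=2, p=1)
        self.cv2 = conv_bn_act(c_in // 2, c, k=1, s=1, p=0)

    def forward(self, x):
        x = F.avg_pool2d(x, kernel_size=2, stride=1, padding=0)
        x1, x2 = x.chunk(2, dim=1)
        x1 = self.cv1(x1)                                   # conv path
        x2 = F.max_pool2d(x2, kernel_size=3, stride=2, padding=1)
        x2 = self.cv2(x2)                                   # pool path
        return torch.cat((x1, x2), dim=1)

# Example: an 80x80 feature map halves its spatial resolution.
out = ADown(64, 128)(torch.randn(1, 64, 80, 80))  # -> (1, 128, 40, 40)
```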

    Research on improved YOLOv8 detection algorithm for diesel vehicle emission of black smoke
    ZHANG Lili, YANG Kang, ZHANG Ke, WEI Wei, LI Jing, TAN Hongxin, ZHANG Xiangyu
    2025, 46(2): 249-258.  DOI: 10.11996/JG.j.2095-302X.2025020249

    Black smoke emission from diesel vehicles is a critical and challenging issue in road traffic environmental protection enforcement. To address the limitations of existing black smoke detection methods in accuracy and speed under complex environmental conditions, a lightweight detection model for diesel vehicle black smoke emission based on improved YOLOv8 was proposed. Firstly, based on the YOLOv8 backbone network, a lightweight feature extraction module, C2f-FasterRep, was designed to improve the model's feature extraction capability; the module integrated a context-anchored attention mechanism to capture long-range contextual information. Global average pooling and strip convolution were employed to enhance the features in the central region of the feature map, thereby improving detection accuracy. Secondly, a new network structure was proposed in the neck to fuse the features extracted from the backbone; a channel attention module and a dimensional matching mechanism were employed to fuse features of different scales, enhancing the model's multi-scale feature fusion capability. Lastly, the detection head of the YOLOv8 model was optimized using a Transformer decoder structure, and an intersection-over-union (IoU)-aware query mechanism was incorporated to optimize the decoder queries, improving classification and localization performance. To ensure the authenticity and effectiveness of the experiment, data were collected and verified using detection equipment deployed on a road section in Xuchang, Henan Province. The experimental results demonstrated that the proposed method achieved an mAP of 95.4%, a precision of 94.5%, and a recall of 97.5%. Compared with existing black smoke detection methods, the proposed approach exhibited higher detection accuracy and faster detection speed. Ablation experiments confirmed that the proposed lightweight feature extraction module, feature fusion module, and detection head each contributed to improving detection accuracy.
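
    The abstract's combination of global average pooling and strip convolutions to emphasize central-region features can be illustrated with a generic attention sketch; the module name, kernel size, and wiring below are assumptions, not the paper's exact C2f-FasterRep design.

```python
import torch
import torch.nn as nn

class StripContextAttention(nn.Module):
    """Generic sketch: a global context vector plus horizontal/vertical
    strip convolutions produce an attention map that re-weights the input
    features, favoring long, centered structures such as smoke plumes."""
    def __init__(self, channels, k=11):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # global context vector
        self.strip_h = nn.Conv2d(channels, channels, (1, k),
                                 padding=(0, k // 2), groups=channels)
        self.strip_v = nn.Conv2d(channels, channels, (k, 1),
                                 padding=(k // 2, 0), groups=channels)
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        ctx = self.pool(x)                            # (B, C, 1, 1)
        stripes = self.strip_v(self.strip_h(x))       # long-range strips
        attn = torch.sigmoid(self.proj(stripes + ctx))
        return x * attn                               # re-weighted features
```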

    DEMF-Net: dual-branch feature enhancement and multi-scale fusion for semantic segmentation of large-scale point clouds
    LI Zhihuan, NING Xiaojuan, LV Zhiyong, SHI Zhenghao, JIN Haiyan, WANG Yinghui, ZHOU Wenming
    2025, 46(2): 259-269.  DOI: 10.11996/JG.j.2095-302X.2025020259

    Large-scale point cloud semantic segmentation is a critical task in 3D vision, with broad applications in fields such as autonomous driving, robotic navigation, smart city construction, and virtual reality. However, existing methods that rely on down-sampling, or that exhibit excessive disparities between multi-scale features, often suffer a substantial loss in the ability to capture fine-grained details and local structures, and this degraded capacity to preserve local features impairs segmentation accuracy. To address these issues, a novel semantic segmentation framework, DEMF-Net, was proposed, integrating dual-branch feature enhancement and multi-scale fusion strategies. The network incorporated a dual-branch enhanced aggregation module designed to jointly encode point cloud attribute information and semantic features from the local neighborhood. Bilateral features were leveraged and embedded into the corresponding original features, improving the model's ability to capture local details with higher fidelity. Furthermore, a multi-scale feature fusion module was introduced to reduce the semantic gap between features at different scales. This module fused adjacent multi-scale features into a global feature representation that synthesized information across all encoding layers, significantly enhancing global context awareness and integrating upper- and lower-layer encodings to strengthen feature recognition. Comprehensive experiments on two widely used point cloud datasets, SensatUrban and S3DIS, validated the proposed approach: DEMF-Net achieved a mean intersection over union (mIoU) of 61.6% and 66.7%, respectively, outperforming existing state-of-the-art methods.
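
    The reported mIoU metric is computed the same way across segmentation papers; a minimal, framework-independent sketch:

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """mIoU as commonly reported for semantic segmentation: per-class
    intersection-over-union averaged over classes present in the data.
    A generic metric sketch, independent of the DEMF-Net code."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, g in zip(pred.ravel(), gt.ravel()):
        conf[g, p] += 1
    inter = np.diag(conf).astype(np.float64)
    union = conf.sum(0) + conf.sum(1) - np.diag(conf)
    valid = union > 0                        # ignore classes absent from both
    return (inter[valid] / union[valid]).mean()

pred = np.array([0, 1, 1, 2, 2, 2])
gt   = np.array([0, 1, 2, 2, 2, 1])
print(mean_iou(pred, gt, num_classes=3))     # ~0.611
```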

    Human skeleton action recognition method based on variational autoencoder masked reconstruction
    WANG Xueting, GUO Xin, WANG Song, CHEN Enqing
    2025, 46(2): 270-278.  DOI: 10.11996/JG.j.2095-302X.2025020270

    Masked autoencoders (MAE) have been applied in many fields due to their powerful self-supervised learning ability, especially in tasks where data is occluded or little training data is available. However, in visual classification tasks such as action recognition, classification suffers because the encoder in the autoencoder structure has limited feature-learning ability. To train the model with a small amount of labeled data and to enhance the feature extraction ability of autoencoders in human skeleton action recognition, a spatial-temporal masked reconstruction model based on a variational autoencoder (SkeletonMVAE) was proposed for skeleton action recognition. The model introduced the latent space of a variational autoencoder after the encoder of the traditional masked reconstruction model, allowing the encoder to learn the latent structure and richer information of the data. With reconstruction quality adjusted via the parameter β, the model was pretrained on masked reconstruction of the skeleton data. When the pretrained encoder was used as a feature extractor for downstream classification, its output representations were more compact, discriminative, and robust, improving classification accuracy and generalization even when trained with only a small amount of labeled data. Experimental results on the NTU-60 and NTU-120 datasets demonstrated the effectiveness of the proposed method for human skeleton action recognition.
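
    The role of the parameter β can be made concrete with the standard β-weighted VAE objective; the abstract describes the combination of reconstruction and regularization terms only qualitatively, so this is an illustrative sketch:

```python
import torch
import torch.nn.functional as F

def beta_vae_loss(x_recon, x_masked_gt, mu, logvar, beta=0.5):
    """Masked-reconstruction objective with a beta-weighted KL term, as in
    beta-VAEs generally (not SkeletonMVAE's exact loss).

    x_recon / x_masked_gt: reconstructed vs. ground-truth masked joints.
    mu, logvar: parameters of the approximate posterior q(z|x)."""
    recon = F.mse_loss(x_recon, x_masked_gt)            # reconstruction term
    # KL(q(z|x) || N(0, I)), averaged over the batch.
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl       # beta trades reconstruction vs. regularity
```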

    Multiscale dense interactive attention residual real image denoising network
    GUO Yecai, HU Xiaowei, AMITAVE Saha, MAO Xiangnan
    2025, 46(2): 279-287.  DOI: 10.11996/JG.j.2095-302X.2025020279

    To address the problem that denoised images are not clear enough due to incomplete feature extraction and low feature utilization, a multi-scale dense interactive attention residual denoising network (MDIARN) was proposed. First, a multi-scale asymmetric feature extraction module (MAFM) preliminarily extracted shallow information features, ensuring diversity of image features. Then, a multi-scale cascade module (MSCM) used multi-dimensional dense interactive residual units (MDIU) to perform multi-dimensional mapping of image features; these units were progressively cascaded to enhance information transmission and interaction between sub-models, fully fitting the training data. A dual-path global attention module (DGAM) was introduced to conduct global joint learning on multi-level features, acquiring more discriminative feature information. Skip connections were integrated to encourage parameter sharing between structures, enabling full integration of features from different dimensions and preserving information completeness. Finally, residual learning was employed to construct a clear denoised image. Experimental results demonstrated that the algorithm achieved peak signal-to-noise ratios of 39.80 dB and 39.62 dB on the real-noise datasets DND and SIDD, respectively, and structural similarities of 95.4% and 95.8%, outperforming mainstream denoising algorithms. In addition, the proposed algorithm performed well in low-light environments, preserving more details and significantly enhancing image quality.
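
    The PSNR figures quoted above follow the standard definition; a minimal sketch for 8-bit images:

```python
import numpy as np

def psnr(clean, denoised, peak=255.0):
    """Peak signal-to-noise ratio in dB, the metric quoted above
    (higher is better; ~40 dB is strong on SIDD/DND-style data)."""
    mse = np.mean((clean.astype(np.float64) - denoised.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```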

    Multi-fitting detection for transmission lines based on a cascade query-position relationship method
    ZHAI Yongjie, WANG Luyao, ZHAO Xiaoyu, HU Zhedong, WANG Qianming, WANG Yaru
    2025, 46(2): 288-299.  DOI: 10.11996/JG.j.2095-302X.2025020288

    To address the challenges posed by small-sized and densely occluded fittings in aerial images of transmission lines, a multi-fitting detection method based on a cascade query-position relationship (CQPR) was proposed. Firstly, a cascade sparse query module was proposed, which used the rough locations of small objects on small-scale feature maps to query their precise locations in large-scale feature maps, improving the accuracy of small-size fitting detection. Then, a positional feature relationship module (PRM) was proposed to optimize detection results at occlusions by exploiting the positional relationships between different fittings in the image; this enriched the features of occluded areas and improved fitting detection under dense occlusion. Experimental results demonstrated that applying the proposed CQPR method to Faster R-CNN, Cascade R-CNN, Libra R-CNN, and Dynamic R-CNN baselines yielded accuracies of 82.9%, 82.4%, 83.7%, and 77.3%, respectively, surpassing other state-of-the-art object detection models, particularly for small-size and occluded fittings. Inference speed was also improved, achieving a balance between localization accuracy and real-time detection.
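
    A cascade sparse query can be pictured as coarse-to-fine feature gathering; the function below is an illustrative sketch of that idea, with the interface and names being assumptions (the paper's module is more elaborate):

```python
import torch

def cascade_sparse_query(score_lo, feat_hi, topk=100, scale=2):
    """Sketch of a cascade sparse query: coarse objectness scores pick
    candidate cells, and only the matching (scale x scale) regions of the
    finer feature map are gathered for refinement."""
    B, C, H_hi, W_hi = feat_hi.shape
    H_lo, W_lo = score_lo.shape[-2:]
    scores = score_lo.flatten(1)                  # (B, H_lo*W_lo)
    _, idx = scores.topk(topk, dim=1)
    ys, xs = idx // W_lo, idx % W_lo              # coarse candidate cells
    patches = []
    for b in range(B):
        for y, x in zip(ys[b], xs[b]):
            y0, x0 = int(y) * scale, int(x) * scale
            patches.append(feat_hi[b, :, y0:y0 + scale, x0:x0 + scale])
    return torch.stack(patches)                   # (B*topk, C, scale, scale)

feat_hi = torch.randn(1, 256, 64, 64)
score_lo = torch.randn(1, 1, 32, 32)
print(cascade_sparse_query(score_lo, feat_hi, topk=8).shape)  # (8, 256, 2, 2)
```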

    MSFAFuse: SAR and optical image fusion model based on multi-scale feature information and attention mechanism
    PAN Shuyan, LIU Liqun
    2025, 46(2): 300-311.  DOI: 10.11996/JG.j.2095-302X.2025020300

    In response to the issue that remote sensing images obtained from a single imaging principle cannot provide rich information, heterogeneous remote sensing image fusion technology has emerged. Synthetic aperture radar (SAR) imaging is not affected by factors such as clouds and weather but lacks visual interpretability; optical imaging is susceptible to harsh environments but offers direct viewing and target interpretation capabilities. Integrating the two can fully leverage their respective advantages to obtain high-quality images that contain more feature information and remain visually interpretable. To fully utilize the different scale features of heterogeneous images, a SAR and optical image fusion model based on multi-scale feature information and attention mechanism (MSFAFuse) was proposed. Firstly, robust feature downsampling was introduced to form the feature extraction part, obtaining multi-scale features for each of the heterogeneous images. Secondly, a feature enhancement module was employed to enhance the structural features and salient regional features at each scale. Then, a dual-branch fusion module guided by feature information and the L1-norm was used to fuse the heterogeneous multi-scale features pairwise by scale. Finally, the fusion results at different scales were passed to the image reconstruction module to produce the final fused image. Experiments showed that MSFAFuse smoothly enhances prominent features while preserving more details and structural information. Compared with existing fusion methods, the model performed better across 10 different indicators, demonstrating that MSFAFuse can effectively fuse optical and SAR images, providing new insights for the development of heterogeneous image fusion and contributing to the advancement of remote sensing image fusion technologies.
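
    L1-norm-guided fusion of two feature maps is a well-established pattern (e.g., DenseFuse-style activity maps); the sketch below illustrates the principle named in the abstract, not the MSFAFuse code itself:

```python
import torch
import torch.nn.functional as F

def l1_norm_fusion(feat_a, feat_b, ksize=3):
    """Fuse two aligned feature maps: each source's activity is measured by
    the channel-wise L1 norm, locally averaged, and used as a soft weight
    map, so the more active source dominates at each pixel."""
    act_a = feat_a.abs().sum(dim=1, keepdim=True)     # (B,1,H,W) activity
    act_b = feat_b.abs().sum(dim=1, keepdim=True)
    act_a = F.avg_pool2d(act_a, ksize, stride=1, padding=ksize // 2)
    act_b = F.avg_pool2d(act_b, ksize, stride=1, padding=ksize // 2)
    w_a = act_a / (act_a + act_b + 1e-8)              # soft weights in [0,1]
    return w_a * feat_a + (1.0 - w_a) * feat_b
```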

    Computer Graphics and Virtual Reality
    3D Gaussian splatting semantic segmentation and editing based on 2D feature distillation
    LIU Gaoyi, HU Ruizhen, LIU Ligang
    2025, 46(2): 312-321.  DOI: 10.11996/JG.j.2095-302X.2025020312

    Semantic understanding of 3D scenes constitutes one of the fundamental ways humans perceive the world, and semantic tasks such as open-vocabulary segmentation and semantic editing are essential research domains in computer vision and computer graphics. However, the absence of large and diverse 3D open-vocabulary segmentation datasets makes it challenging to directly train a robust and generalizable model. To address this issue, 3D Gaussian splatting based on 2D feature distillation was proposed, which distilled semantic embeddings from the SAM and CLIP foundation models into 3D Gaussians. For each scene, pixel-wise semantic features were obtained via SAM and CLIP, and training was conducted using differentiable 3D Gaussian rendering to generate a scene-specific semantic feature field. For the semantic segmentation task, in order to obtain accurate segmentation boundaries for each object in the scene, a multi-step segmentation mask selection strategy was designed to produce accurate open-vocabulary semantic segmentation for novel-view images without requiring tedious hierarchical feature extraction and training. Through explicit 3D Gaussian scene representations, the correspondence between text and 3D objects was effectively established, enabling semantic editing. Experiments demonstrated that the method achieved comparable or superior qualitative and quantitative results in semantic segmentation compared with existing methods, while enabling open-vocabulary semantic editing through the 3D Gaussian semantic feature field.
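
    The distillation step can be summarized as pulling rendered per-pixel features toward the 2D teacher features of the same view; a minimal sketch of such a loss, with the actual loss weighting left unspecified by the abstract:

```python
import torch
import torch.nn.functional as F

def distillation_loss(rendered_feat, teacher_feat):
    """2D->3D feature distillation sketch: feature images rendered from the
    3D Gaussian field are matched to SAM/CLIP teacher features of the same
    view via cosine similarity.

    rendered_feat, teacher_feat: (B, C, H, W) feature images."""
    r = F.normalize(rendered_feat, dim=1)
    t = F.normalize(teacher_feat, dim=1)
    return (1.0 - (r * t).sum(dim=1)).mean()   # 1 - cosine, averaged over pixels
```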

    BPA-SAM: box prompt augmented SAM for traditional Chinese realistic painting
    ZHANG Tiansheng, ZHU Minfeng, REN Yiwen, WANG Chenhan, ZHANG Lidong, ZHANG Wei, CHEN Wei
    2025, 46(2): 322-331.  DOI: 10.11996/JG.j.2095-302X.2025020322

    Due to the lack of publicly available, meticulously annotated datasets for traditional Chinese realistic painting, the development of image segmentation techniques in this field is severely hindered. Traditional Chinese realistic painting exhibits characteristics such as similar color textures between objects and background, as well as blurred object boundaries caused by gradient transitions, posing challenges for image segmentation. The emergence of the segment anything model (SAM) presents new possibilities for addressing these challenges. Despite SAM's remarkable segmentation capability and zero-shot generalization in the natural image domain, it suffers from insensitivity to object details and foreground-background confusion when processing traditional Chinese realistic painting. To address these issues, a segmentation dataset of traditional Chinese realistic painting themed around flowers and birds was constructed, comprising 403 images with 5 classes of foreground objects. Subsequently, the LoRA (low-rank adaptation) method was employed to fine-tune SAM, enabling it to adapt to the characteristics of traditional Chinese realistic painting. Additionally, a novel bounding box prompt augmentation method called BPA-SAM was proposed, based on the U-Net model, to address foreground-background confusion by generating point prompts within the bounding box range. Experiments confirmed that the approach improved SAM's segmentation performance by 7.1% under bounding box prompting conditions, establishing a foundation for SAM-based image segmentation in the traditional Chinese realistic painting domain.
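
    LoRA itself is a generic technique: a frozen pretrained weight plus a trainable low-rank update. A minimal sketch follows; how the adapters are wired into SAM's attention blocks is not detailed in the abstract:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA: a frozen pretrained linear layer plus a trainable
    low-rank update, effectively W + (alpha/r) * B @ A."""
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                  # freeze pretrained weights
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)
```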

    Image to 3D vase generation technology combining procedural content generation and diffusion models
    SUN Heyi, LI Yixiao, TIAN Xi, ZHANG Songhai
    2025, 46(2): 332-344.  DOI: 10.11996/JG.j.2095-302X.2025020332

    In traditional manual production of 3D content, meshes and textures serve as the foundational elements of 3D assets. To enhance visual representation and rendering performance, meshes are typically constructed from quadrilateral faces with optimal topology and UV mapping, and textures must be congruent with the geometric shape while maintaining global consistency. However, current 3D content generation technologies based on latent diffusion models fail to meet these standards, limiting their potential in practical applications. At the same time, procedural content generation techniques have gained widespread application in the gaming and architectural industries thanks to their ability to systematically produce vast numbers of 3D assets that conform to industry best practices. To improve the usability of generated assets, an integrated solution combining procedural content generation with diffusion models was proposed. Taking the vase, a 3D body of revolution, as an example, the image-to-3D generation problem was divided into two principal tasks: 3D mesh reconstruction and 3D texture generation. For mesh reconstruction, a novel vase generation program was developed, and a deep neural network was trained to learn the mapping between image features and procedural parameters, thereby enabling reconstruction from a 2D image to a 3D model. For 3D texture generation, a novel two-stage texturing strategy was introduced, combining multi-view image synthesis and multi-view consistency sampling to produce high-quality, globally coherent texture maps. In summary, a scheme for the automatic construction of 3D vase assets from images was presented, which can be generalized to other 3D bodies of revolution and holds promise for generating other types of 3D content.
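
    The core of a procedural vase program, a body of revolution, can be sketched as sweeping a profile curve around an axis while emitting the quadrilateral faces the abstract calls for; the simplified interface below is a hypothetical reduction of the paper's richer parameterization.

```python
import numpy as np

def revolve_profile(profile_rz, n_seg=64):
    """Build a surface-of-revolution mesh (e.g., a vase) by sweeping a 2D
    profile of (radius, height) samples around the z-axis.
    Returns (vertices, quad faces)."""
    profile_rz = np.asarray(profile_rz, dtype=np.float64)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_seg, endpoint=False)
    verts = np.array([[r * np.cos(t), r * np.sin(t), z]
                      for r, z in profile_rz for t in thetas])
    quads = []
    n_rings = len(profile_rz)
    for i in range(n_rings - 1):                  # connect ring i to ring i+1
        for j in range(n_seg):
            a = i * n_seg + j
            b = i * n_seg + (j + 1) % n_seg
            quads.append([a, b, b + n_seg, a + n_seg])
    return verts, np.array(quads)

# A simple vase-like profile: (radius, height) pairs from base to lip.
verts, quads = revolve_profile(
    [(0.6, 0.0), (1.0, 0.5), (0.7, 1.2), (0.4, 1.6), (0.55, 1.9)])
```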

    Free sculpting system in virtual reality environment
    ZHU Xiaoqiang, YANG Yifei
    2025, 46(2): 345-357.  DOI: 10.11996/JG.j.2095-302X.2025020345

    Shape modeling is a dynamic area in computer graphics, and virtual sculpting is one of the important paradigms of freeform shape modeling. Traditional virtual sculpting typically relies on desktop computers, where users manipulate meshes with controllers and view the models on two-dimensional displays, which poses issues such as limited viewing angles and poor immersion. The immersive interaction offered by virtual reality in recent years provides new possibilities for virtual sculpting. By integrating the two, a real-time virtual sculpting system was designed and implemented in a virtual reality environment, using a quasi-uniform mesh as the structural foundation. The system design encompasses surface point selection algorithms, mesh optimization techniques, mesh deformation strategies, and topology merging methods. A free topology algorithm was developed to provide higher degrees of freedom for sculpting. For the general sculpting process, a series of user-friendly sculpting tools was implemented on top of these algorithms, ensuring that the mesh remains closed, manifold, and free of self-intersections. Furthermore, to address the requirement for seamless fusion between arbitrary models, two fusion methods guided by signed distance fields were proposed, based on mesh deformation and mesh merging, respectively. Models created by the system can be applied in various scenarios. Experimental results validated the effectiveness, versatility, and user-friendliness of the proposed algorithms.
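
    Signed distance fields as a fusion guide can be illustrated with the classic smooth-union of two SDFs; the paper uses SDFs to drive mesh deformation and merging, so the sketch below only shows the guiding field, not the system's algorithms:

```python
import numpy as np

def sdf_sphere(p, center, radius):
    """Signed distance from point(s) p to a sphere (negative inside)."""
    return np.linalg.norm(p - center, axis=-1) - radius

def smooth_union(d1, d2, k=0.25):
    """Polynomial smooth-min of two signed distance fields, a standard way
    to blend two shapes seamlessly; k controls the blend radius."""
    h = np.clip(0.5 + 0.5 * (d2 - d1) / k, 0.0, 1.0)
    return d2 * (1 - h) + d1 * h - k * h * (1 - h)

p = np.array([0.4, 0.0, 0.0])
d = smooth_union(sdf_sphere(p, np.zeros(3), 0.5),
                 sdf_sphere(p, np.array([0.8, 0.0, 0.0]), 0.5))
```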

    Grasp pose generation for dexterous hand with integrated knowledge transfer
    ZHANG Xuhui, GUO Yu, HUANG Shaohua, ZHENG Guanguan, TANG Pengzhou, MA Xusheng
    2025, 46(2): 358-368.  DOI: 10.11996/JG.j.2095-302X.2025020358

    Grasp pose generation for a five-finger dexterous hand plays a critical role in dexterous grasping tasks. Firstly, an intention-based grasp pose generation network was constructed to address variations in human grasping poses under different tool-usage intentions, emphasizing the functionality of grasps under each intention. Secondly, to tackle the issue that a grasp pose generation network trained with limited data cannot adapt to all intra-class tools, a knowledge transfer-based grasp pose generation method was proposed; it improved knowledge transfer to adapt functional grasps to the various poses of intra-class target tools while mitigating inter-finger self-collision. Finally, in constructing the mapping between human hand and five-finger dexterous hand grasp poses, key point correspondence-based mapping rules were optimized, enabling the generation of dexterous hand grasp poses under different intentions and laying a foundation for subsequent tool-use operations. By combining intention-based grasp pose generation with knowledge transfer, the network trained with limited data generated better grasp poses for intra-class target tools. Compared with the original network, the proposed method reduced penetration volume by an average of 0.917 cm³, simulation displacement by an average of 5.25 mm, and inter-finger self-collision probability by an average of 49.25%.

    Research on the method of 3D image reconstruction for cultural relics based on AR technology
    ZHOU Wei, CANG Minnan, CHENG Haozong
    2025, 46(2): 369-381.  DOI: 10.11996/JG.j.2095-302X.2025020369

    The surfaces of cultural relics possess complex and variable details, demanding exceptionally high precision in data acquisition to ensure the authenticity and accuracy of reconstructed models. However, traditional methods often fail to achieve high precision while maintaining efficiency, and the subsequent 3D model construction and texture mapping are highly complex, requiring substantial manual adjustment. To address these challenges, a method based on augmented reality (AR) was proposed for the digital 3D image reconstruction of cultural relics. The method employed 3D scanning and high-resolution cameras to capture point cloud and image data of the relics, and the scale-invariant feature transform (SIFT) algorithm was used to extract feature points from both types of data. Based on these feature points, Revit, in combination with SketchUp and Geomagic Studio, was used to construct 3D models of the relics. Through virtual camera registration, the constructed models were overlaid onto images of the actual relics, producing augmented virtual models; the real relics and overlaid virtual models then underwent virtual fusion, completing the digital 3D image reconstruction. Finally, by integrating virtual reality (VR) technology, interactive displays of the reconstructed digital 3D images were implemented. Experiments on the Shimao cultural relics dataset demonstrated that the method can accurately reconstruct digital 3D models of relics, and comparative experiments indicated that it outperformed traditional approaches in both reconstruction accuracy and texture mapping while reducing manual adjustment. The method effectively addressed the insufficient precision and cumbersome operations of traditional digital 3D reconstruction, enabling efficient and accurate digitization of cultural relics and providing a novel technological pathway for the preservation, virtual display, and digital archiving of cultural heritage.
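
    The SIFT step named above is available off the shelf in OpenCV; a minimal sketch of keypoint extraction and ratio-test matching between two views (file names are placeholders):

```python
import cv2

# Detect scale-invariant keypoints in two views of a relic and match them.
img1 = cv2.imread("relic_view1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("relic_view2.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Ratio-test matching (Lowe's criterion) keeps reliable correspondences.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]
print(f"{len(good)} reliable feature correspondences")
```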

    Human-in-the-loop field-specific logo generation method
    LI Jiyuan, GUAN Zheyu, SONG Haichuan, TAN Xin, MA Lizhuang
    2025, 46(2): 382-392.  DOI: 10.11996/JG.j.2095-302X.2025020382

    Compared with other types of generated images, logos are highly abstract, diversely designed, and unified in style, making it challenging to directly control generation outcomes. To efficiently generate logos that match the characteristics of various industries and satisfy multiple compositional designs, a human-in-the-loop field-specific logo generation method was proposed. Firstly, using DreamBooth, a method for fine-tuning text-to-image diffusion models, and a dataset of logos collected from publicly available online sources, the text-to-image model Stable Diffusion XL was taken as the base model and trained into a "prototype model" for basic logo generation. Then, groups of lexicons for targeted industries were constructed, and the prototype model generated logos for those industries under the guidance of the lexicons. Next, via human intervention, the generated outcomes were filtered into secondary datasets tailored to industry needs. Finally, the prototype model was iteratively fine-tuned with LoRA on the secondary datasets, yielding the final logo generation model. The results of the final model were evaluated using cosine similarity between generated images and prompt words, as well as manual questionnaire indicators. The evaluation demonstrated that logos generated by the final model exhibited significant improvements in industry relevance, structural integrity, and aesthetic appearance compared with those generated directly by the untrained base model.
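
    The image-prompt cosine-similarity evaluation can be reproduced with an open CLIP checkpoint; the abstract does not specify which embedding model was used, so the checkpoint, file name, and prompt below are placeholders:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("generated_logo.png")
prompt = "a minimalist logo for a coffee brand"

inputs = processor(text=[prompt], images=image,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    img_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
    txt_emb = model.get_text_features(input_ids=inputs["input_ids"],
                                      attention_mask=inputs["attention_mask"])
score = torch.cosine_similarity(img_emb, txt_emb).item()
print(f"CLIP image-text cosine similarity: {score:.3f}")
```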

    3D human pose and shape estimation from single-view point clouds with semi-supervised learning
    FANG Chenghao, WANG Kangkan
    2025, 46(2): 393-401.  DOI: 10.11996/JG.j.2095-302X.2025020393

    Under the condition of limited labeled samples, estimating 3D human pose and shape from single-view point clouds has consistently suffered from low estimation accuracy and weak generalization. Existing methods typically rely on a fine-tuning step to adapt models to the limited labeled samples, but this significantly increases computational cost without fundamentally improving generalization. To address these issues, a semi-supervised learning method for 3D human pose and shape estimation was proposed, which exploits large amounts of unlabeled human point clouds to improve accuracy and generalization when labeled data is scarce. Specifically, weak and strong augmentations were applied to the unlabeled data, and 3D human parametric models were estimated for both types of augmented samples. The accuracy of pseudo-labels for weakly augmented samples was then evaluated, and predictions on strongly augmented samples were constrained via consistency regularization. This procedure was applied iteratively to gradually refine pseudo-label quality and increase the number of pseudo-labels available for training, thereby enhancing estimation accuracy. Extensive quantitative and qualitative experiments on various public datasets demonstrated that the proposed method improved the accuracy of 3D human pose and shape estimation under limited labeled samples and enhanced model generalization.
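
    The weak/strong augmentation scheme with confidence-filtered pseudo-labels mirrors FixMatch-style consistency regularization; a compact sketch follows (the paper regresses human model parameters, while classification logits are used here only to keep the example short):

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, weak_batch, strong_batch, conf_thresh=0.95):
    """FixMatch-style consistency on unlabeled samples: confident
    predictions on the weak view become pseudo-labels that supervise the
    strong view; low-confidence samples are masked out."""
    with torch.no_grad():
        pseudo = model(weak_batch).softmax(dim=-1)      # weak-view prediction
        conf, label = pseudo.max(dim=-1)
        mask = conf.ge(conf_thresh).float()             # keep confident pseudo-labels
    logits_strong = model(strong_batch)                 # strong-view prediction
    loss = F.cross_entropy(logits_strong, label, reduction="none")
    return (loss * mask).mean()
```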

    Graph layout customization based on user-specified examples
    CHEN Junxu, WU Ziliang, ZHU Minfeng, CHEN Wei
    2025, 46(2): 402-414.  DOI: 10.11996/JG.j.2095-302X.2025020402

    Traditional automatic graph layout algorithms, while ensuring the overall aesthetic properties of graph layouts, cannot generate customized layouts. In practical application scenarios, users often need to adjust automatically generated layouts to meet specific requirements. Existing approaches to layout adjustment fall into two categories: manual node-level adjustment, which is extremely time-consuming and monotonous, and constraint-based layout, which often lacks flexibility. A customized graph layout adjustment method based on user examples was proposed. The method uses graph blending theory to transfer the attributes and characteristics of example graphs into source graphs, achieving flexible and efficient customized layout adjustment. Initially, the examples were preprocessed; two mapping examples and six mapping modes were then designed to generate node-level mapping matrices between the examples and the source graphs. The mapping matrices were employed to align the example graph with the source graph, and the graphs were blended at a chosen ratio to obtain the customized layout. A web-based interactive system was designed and developed to implement the method, supporting example sketch drawing, example importing and selection, source graph importing and selection, mapping mode selection, blending ratio control, and node-level fine-tuning. Case studies and evaluation experiments validated the feasibility and effectiveness of the proposed method.
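
    The blending step itself reduces to interpolating node positions through the node-level mapping matrix; a minimal sketch under that reading (the names and the toy mapping are illustrative):

```python
import numpy as np

def blend_layouts(pos_src, pos_example, mapping, t=0.5):
    """Pull each source node toward the position of its mapped example node
    by ratio t (0 = source layout, 1 = example layout).

    pos_src: (n, 2) source node positions; pos_example: (m, 2);
    mapping: (n, m) row-stochastic node-level mapping matrix."""
    target = mapping @ pos_example          # where each source node "should" go
    return (1.0 - t) * pos_src + t * target

pos_src = np.random.rand(5, 2)
pos_ex = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])   # triangle example
mapping = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0],
                    [0, 1, 0], [0, 0, 1]], dtype=float)    # toy hard mapping
blended = blend_layouts(pos_src, pos_ex, mapping, t=0.7)
```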

    A neural radiance field-based approach to ethnic dance reconstruction
    QIU Jiaxin, SONG Qianyun, XU Dan
    2025, 46(2): 415-424.  DOI: 10.11996/JG.j.2095-302X.2025020415

    Chinese folk dance, an art form inherited through generations, originates from the everyday lives of the people. However, with the development of society, some traditional dances are difficult to preserve effectively, risking cultural loss. Different ethnic dances exhibit unique features and complex movement patterns. To enhance the preservation of ethnic dance, a 3D human body reconstruction method based on an improved neural radiance field was proposed. The method first employed an improved pose estimation algorithm: after noise reduction and optimization of the poses, the deformation field was decomposed into rigid motion and non-rigid motion generated by a deep neural network. Poses were mapped from observation space to a canonical space by linear blend skinning, producing a pose-independent deformation field. A neural radiance field was then used to reconstruct the 3D model of the human body; throughout reconstruction, an attention mechanism enhanced the learning of edge colors and refined the body movements obtained from pose estimation. Finally, a new rendered view of the dancer was obtained for each frame from different viewpoints. Experimental results showed that the proposed method reconstructed dancers and dance postures in 3D more faithfully, improving restoration accuracy compared with HumanNeRF. Compared with traditional 2D dance preservation techniques, the method better restores the dancer's movements, fulfilling the purpose of folk dance preservation.
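
    Linear blend skinning, the observation-to-canonical mapping named above, is standard; a minimal sketch:

```python
import numpy as np

def linear_blend_skinning(verts, weights, transforms):
    """Standard linear blend skinning: each vertex is moved by a weighted
    sum of per-bone rigid transforms.

    verts: (n, 3); weights: (n, b) skinning weights summing to 1 per vertex;
    transforms: (b, 4, 4) per-bone rigid transforms."""
    n = verts.shape[0]
    verts_h = np.concatenate([verts, np.ones((n, 1))], axis=1)   # homogeneous
    # Blend the 4x4 matrices per vertex, then apply them.
    blended = np.einsum("nb,bij->nij", weights, transforms)      # (n, 4, 4)
    out = np.einsum("nij,nj->ni", blended, verts_h)
    return out[:, :3]
```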

    High-precision reconstruction of swept surfaces with a planar path
    LIU Shengjun, TAO Shanshan, WANG Haibo, LI Qinsong, LIU Xinru
    2025, 46(2): 425-436.  DOI: 10.11996/JG.j.2095-302X.2025020425

    Recovering the CAD modeling process from a triangular mesh is a key focus in reverse engineering, and efficient, high-precision swept surface reconstruction is of great engineering value. For mesh surfaces generated by sweeping along a planar path composed of line and arc segments, a sweep reconstruction method based on automatic profile and path extraction was proposed to achieve high-precision reconstruction of swept surfaces. Firstly, the initial path was obtained automatically from the unified curvature vector field of the triangular mesh, and the profile was generated using Gaussian-map iteration, registration, and fitting. Then, the scattered point set of the path was computed inversely; the straight-line and arc segments of the path were identified in a tangent-space representation, and a fitting optimization model with tangency constraints was established to refine the initial path. Finally, the swept surface was reconstructed by performing a sweeping operation with the computed profile and path. Experimental results demonstrated that the proposed method extracted profile and path curves automatically, reconstructing the modeling process of the sweeping model. The approach reduced tedious manual interaction, and the extracted profiles and paths avoided the accumulation of discretization errors, yielding higher precision in the final reconstructed swept surface. The method also handled models with noisy or missing data.
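
    Identifying arc segments ultimately rests on fitting circles to path samples; an algebraic (Kasa) least-squares fit is the textbook starting point, though the paper additionally enforces tangency constraints between segments:

```python
import numpy as np

def fit_circle(points):
    """Algebraic (Kasa) least-squares circle fit: solving
    x^2 + y^2 = 2*cx*x + 2*cy*y + (r^2 - cx^2 - cy^2) linearly.

    points: (n, 2) planar samples. Returns (center, radius)."""
    x, y = points[:, 0], points[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    b = x ** 2 + y ** 2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    r = np.sqrt(c + cx ** 2 + cy ** 2)
    return np.array([cx, cy]), r

# Noisy samples on an arc of radius 2 centered at (1, -1).
t = np.linspace(0.2, 1.8, 40)
pts = np.column_stack([1 + 2 * np.cos(t), -1 + 2 * np.sin(t)])
pts += np.random.normal(scale=0.01, size=pts.shape)
print(fit_circle(pts))   # ~((1, -1), 2.0)
```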

    Tracing high-quality isolines for discrete geodesic distance fields
    WANG Wensong, ZHOU Zijun, XIN Shiqing, TU Changhe, WANG Wenping
    2025, 46(2): 437-448.  DOI: 10.11996/JG.j.2095-302X.2025020437

    Geodesic isolines play an important role in visualizing the intrinsic metric variation of geometric shapes and in verifying the accuracy of geodesic algorithms. Typically, geodesic isolines are drawn on a triangular mesh by simple linear interpolation of vertex distance values within each triangle. This approach is widely used for its simplicity, but its precision is limited because the geodesic distance field is highly nonlinear; isolines obtained through linear interpolation therefore often exhibit various distortions, and even with high-resolution subdivision of the input mesh it remains challenging to capture the true topology of the isolines, such as sharp corner features. Given that ridges effectively represent discontinuous transitions in geodesic paths, a novel approach was proposed: just as a medial axis can be encoded by a Voronoi diagram of densely sampled boundary points, the geometric and topological characteristics of geodesic isolines can be encoded by an Apollonius diagram. Specifically, a cubic function was employed to approximate the variation of the distance function along each mesh edge, and an Apollonius diagram was generated from a set of weighted sample points distributed along the edges. Each triangle was then divided into several subregions according to the diagram, allowing the distance field within each subregion to be approximated effectively by a linear function. Extensive experiments validated the effectiveness of this method: at relatively low additional computational cost, the generated geodesic isolines are more accurate than those produced by traditional linear interpolation.
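
    The traditional baseline the method improves on, per-triangle linear interpolation, is easy to state precisely; a minimal sketch:

```python
import numpy as np

def isoline_segments(verts, tris, dist, level):
    """Classic linear-interpolation isoline tracing: within each triangle,
    the level set of a per-vertex distance field crosses an edge wherever
    the endpoint values straddle `level`.

    verts: (n, 3); tris: (m, 3) vertex indices; dist: (n,) distances."""
    segments = []
    for tri in tris:
        pts = []
        for a, b in [(tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])]:
            da, db = dist[a] - level, dist[b] - level
            if da * db < 0:                       # edge crosses the isovalue
                t = da / (da - db)                # linear interpolation weight
                pts.append((1 - t) * verts[a] + t * verts[b])
        if len(pts) == 2:
            segments.append((pts[0], pts[1]))
    return segments
```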

    Digital Design and Manufacture
    Simulation and prediction method of satellite solar wing deployment test driven by digital twin
    CHEN Ruiqi, LIU Xiaofei, WAN Feng, HOU Peng, SHEN Jinyi
    2025, 46(2): 449-458.  DOI: 10.11996/JG.j.2095-302X.2025020449

    The success of in-orbit deployment of satellite solar wings is a critical factor in their operational performance, and the ground deployment test serves as a crucial development step to verify the deployment and locking performance of the mechanism and to ensure compliance with deployment indicators. To address challenges such as limited data monitoring, low simulation accuracy, and difficulty in predicting outcomes during ground deployment testing, a digital twin-driven simulation and prediction method, along with a system architecture for satellite solar wing ground deployment testing, was proposed. Based on digital twin models of typical solar wing products, the associated test tooling and equipment, and the deployment testing units, a multidisciplinary joint simulation of the ground deployment test was conducted and a simulation database was produced. A training dataset for the prediction model was constructed by combining the simulation data with historical test data, after which the training and optimization of the deployment process prediction model were completed. Key parameters of the ground test were collected in real time via an Internet of Things platform and fed into the optimized model, achieving rapid and accurate prediction of key parameters such as pose, force, velocity, and deployment time. Finally, based on the digital twin model and by integrating the collected and predicted data, real-time monitoring of the deployment process and visualization of prediction results were realized, supporting intelligent management, control, and decision-making in ground deployment testing, guiding process optimization and on-site adjustment, and effectively improving deployment success rate and efficiency. Through deployment in the satellite deployment unit and application verification on typical satellite models, the proposed method achieved an online prediction accuracy of over 90% for key parameters such as deployment time and pose, verifying its effectiveness and feasibility.
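
    The prediction step, training a regressor on pooled simulation and historical test records and querying it with live IoT samples, can be sketched generically; the feature set and model family below are illustrative assumptions, not the paper's model:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
w = np.array([1.5, -0.8, 0.3, 0.1])          # hidden "true" relationship

# Abundant simulation records plus sparser historical test records.
X_sim = rng.normal(size=(500, 4))            # e.g., hinge torque, temperature, ...
y_sim = X_sim @ w + rng.normal(scale=0.05, size=500)
X_hist = rng.normal(size=(60, 4))
y_hist = X_hist @ w + rng.normal(scale=0.1, size=60)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(np.vstack([X_sim, X_hist]), np.concatenate([y_sim, y_hist]))

x_live = rng.normal(size=(1, 4))             # real-time IoT sample during a test
print("predicted deployment time:", model.predict(x_live)[0])
```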

    Industrial Design
    User experience evaluation of human machine interface in automotive autonomous driving takeover system
    WU Lei, SHENG Qinqin, ZHAO Ruisi
    2025, 46(2): 459-468.  DOI: 10.11996/JG.j.2095-302X.2025020459

    To address the user experience evaluation problem of the human-machine interface in autonomous driving takeover systems, an evaluation index system was established and validated. Firstly, based on in-depth interviews and grounded theory, interviews were conducted with typical target users to establish an evaluation index system for the human-machine interface experience of autonomous driving takeover systems, composed of 18 indicators in four dimensions: safety experience, functional experience, efficiency experience, and emotional experience. An evaluation scale for the human-machine interface experience of such systems was then completed. Next, using structural equation modeling, 241 valid questionnaires were collected to construct a structural equation model of the experience evaluation, and the weights of the evaluation dimensions and indicators were determined. Finally, a design strategy for the human-machine interface of the takeover system was proposed, and the effectiveness of the evaluation system was verified through a design case study of an L3-level autonomous driving takeover interface. The results provide references and support for interface design and experience evaluation of automotive autonomous driving systems.

    Iterative optimization of robot surgery system interface design from the perspective of cognitive mechanisms
    LI Saisai, SUN Bowen, LI Dijia, PAN Wenjuan
    2025, 46(2): 469-478.  DOI: 10.11996/JG.j.2095-302X.2025020469

    This study aimed to enhance the cognitive experience of doctors using surgical robots during diagnosis and treatment, and to complete an iterative optimization of the robot surgery system interface design that ensures treatment efficiency and safety. First, the interaction interface of a liver cancer ablation surgery robot was selected as the research object, and natural language processing (NLP) based on the BERT (bidirectional encoder representations from transformers) model was used to extract keywords from users' spoken reports and to select and classify cognition-related vocabulary. User needs were summarized and refined through the affinity diagram method. Expert weight assignments for each demand item were calculated using an innovative combination of the ICE (impact, confidence, ease) three-dimensional scoring model and the ideal-point vector projection method, and key needs were selected with reference to development cost, cognitive mechanism principles, and the FAST (function analysis system technique) model. User requirements were then analyzed and translated into design decisions by combining cognitive mechanism principles with the FAST model, and the pre- and post-iteration schemes were subjected to usability testing, collecting objective and subjective evaluation data to verify the rationality of the design. Taking cognitive characteristics as the entry point, the design iteration and optimization of the liver cancer ablation surgery robot interface were completed through combined qualitative and quantitative analysis. The usability test demonstrated that the post-iteration design significantly reduced operation time and the frequency of ineffective operations while achieving higher user experience scores. Introducing cognitive theory as a guide to robot surgery system interface design, and combining the BERT NLP model with the ICE-ideal-point vector projection demand analysis method, enhanced the usability of the original interface, optimized the cognitive experience of operation, and provided theoretical and practical guidance for the interface design of robot surgery systems.
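
    One common way to realize BERT-based keyword selection is to rank candidate terms by embedding similarity to the whole report; the sketch below uses that reading with an English checkpoint and made-up text (the authors worked with Chinese spoken reports, and their exact pipeline is not specified):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def embed(text):
    """Mean-pooled last-hidden-state embedding of a text span."""
    inputs = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = bert(**inputs).last_hidden_state       # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)

report = "the buttons are too small to locate the ablation progress quickly"
candidates = ["button size", "ablation progress", "menu hierarchy", "alarm sound"]
doc_vec = embed(report)
ranked = sorted(candidates,
                key=lambda term: torch.cosine_similarity(
                    doc_vec, embed(term), dim=0).item(),
                reverse=True)
print(ranked)   # report-relevant terms first
```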

    Published as
    Published as Issue 2, 2025
    2025, 46(2): 479.