Journal of Graphics

The three realms of visual Turing: from seeing to imagining in the LLM era

HUANG Kaiqi, WU Meiqi, CHEN Honghao, FENG Xiaokun, ZHANG Dailing

2025, 46(5): 919-930. DOI: 10.11996/JG.j.2095-302X.2025050919

The Visual Turing evaluates computer vision models through a Turing-style assessment, offering a human-aligned benchmark for advancing visual intelligence. With the advent of large language models (LLMs), computer vision technologies have advanced rapidly, achieving remarkable performance in tasks such as image classification, object detection and segmentation, and video understanding. However, despite these impressive technical achievements, a significant gap remains between current algorithms and human visual cognition in terms of adaptability and generalization. The evolution of visual intelligence was revisited from the perspective of its three progressive levels (Seeing the Visible, Seeing the Cognized, and Seeing the Conceived), and the limitations and challenges of current technologies were systematically examined. The objective was to drive computer vision toward a more human-like capacity for perception and cognition.

A review of autonomous driving image synthesis methods: from simulators to new paradigms

HUANG Jing, SHI Ruihao, SONG Wenming, GUO Hepan, WEI Huang, WEI Xiaosong, YAO Jian

2025, 46(5): 931-949. DOI: 10.11996/JG.j.2095-302X.2025050931

Image synthesis techniques are crucial for the development of autonomous driving, aiming to provide training and testing data for autonomous driving systems in a cost-effective manner. With the development of computer vision and artificial intelligence (AI) technologies, neural radiance fields (NeRF), 3D Gaussian splatting (3DGS), and generative modeling have attracted much attention in the field of image synthesis. These new paradigms show great potential in autonomous driving scene construction and image data synthesis. Recognizing the importance of these methods for autonomous driving technology, their development history was reviewed, the latest research was collected, and the methods were re-examined from the practical perspective of the autonomous driving image synthesis problem. The progress of NeRF, 3DGS, generative modeling, and virtual-reality fusion synthesis methods in the field of autonomous driving was introduced, with special focus on NeRF and 3DGS, the two reconstruction-based methods. First, some important issues in autonomous driving image generation were analyzed, followed by a detailed examination of representative NeRF and 3DGS schemes in terms of the limited-viewpoint, large-scale scene, dynamics, and acceleration problems faced in autonomous driving scenes. Considering the potential benefits of generative models for creating corner cases in autonomous driving, practical issues and existing research on the use of autonomous driving world models for scenario generation were also presented. Then, cutting-edge applications of virtual-reality fusion for autonomous driving image synthesis were analyzed, as well as the potential of combining NeRF and 3DGS with AI generative modeling for autonomous driving scenario generation. Finally, current achievements were summarized and future research directions were outlined.

Adaptive feature fusion pyramid and attention mechanism-based method for transmission line insulator defect detection

ZHAI Yongjie, ZHAI Bangchao, HU Zhedong, YANG Ke, WANG Qianming, ZHAO Xiaoyu

2025, 46(5): 950-959. DOI: 10.11996/JG.j.2095-302X.2025050950

To address the challenges of complex background interference and varying defect region scales in transmission line insulator samples, a method for transmission line insulator defect detection based on an adaptive fusion feature pyramid and attention mechanism was proposed. First, an adaptive fusion module (AF) was introduced to process multi-scale feature information, which was integrated into the feature pyramid network to mitigate the inconsistencies of defect region scales in aerial images of insulators. Next, a defect feature refinement module (DFRM) based on an attention mechanism was designed to handle interference from complex background noise by expanding the receptive field and capturing the contextual features of defective regions. Finally, the improved algorithm was validated on a real-world transmission line insulator defect dataset. Experimental results demonstrated that the proposed method outperformed existing approaches in insulator defect detection, achieving a 5.7% improvement in accuracy compared to the baseline model. These findings offered an effective solution for intelligent inspection in power grid systems.

On-site construction safety monitoring based on large vision language models

LENG Shuo, WANG Wei, OU Jiayong, XUE Zhigang, SONG Yinglong, MO Sijun

2025, 46(5): 960-968. DOI: 10.11996/JG.j.2095-302X.2025050960

To address the challenges of high development cost and limited applicability of traditional vision models in construction safety monitoring, a solution based on a large vision language model (LVLM) was proposed. Building on an open-source pretrained LVLM, several types of prompt strategies suited to construction safety monitoring tasks were designed, including text prompts, image prompts with supplementary information, and image exemplar prompts. These strategies enabled the LVLM to effectively comprehend and reason about construction site imagery. Moreover, an intelligent monitoring workflow and system architecture based on the LVLM were developed. The proposed method was applied to three representative construction safety monitoring scenarios: supervisor absence detection, hazardous zone intrusion identification, and non-compliant behavior recognition. Validation on empirical data demonstrated that, with appropriate prompting strategies, the LVLM can achieve recognition accuracy close to that of mainstream deep learning models without requiring data annotation or model training. The proposed approach has the advantages of low development cost, fast implementation, and flexible task adaptation, revealing application potential in image recognition and intelligent monitoring.

SAM2-based multi-objective automatic segmentation method for laparoscopic surgery

LIU Cheng, ZHANG Jiayi, YUAN Feng, ZHANG Rui, GAO Xin

2025, 46(5): 969-979. DOI: 10.11996/JG.j.2095-302X.2025050969

Automatic segmentation in laparoscopic surgical scenes is critical for enabling surgical robots to perform autonomous operations. However, this task faces three major challenges: the high similarity in texture and blurred boundaries of surgical targets, which make accurate segmentation difficult; significant scale differences, which hinder the synchronous segmentation of multiple targets; and intraoperative interference, such as motion artifacts and smoke occlusion, which affects segmentation completeness. To address these challenges, a multi-objective automatic segmentation method for laparoscopic surgery (SAM2-MSNet) based on the visual large model SAM2 was proposed. The network employed a LoRA+ fine-tuning strategy to optimize SAM2’s image encoder, enabling efficient adaptation to the texture features of laparoscopic images. A cross-scale feature synchronous extraction module was designed to achieve accurate segmentation of multi-scale targets. Furthermore, a global perception module of feature relationships was constructed to improve robustness to interference such as motion artifacts and smoke occlusion. Additionally, a pseudo-label-assisted supervision mechanism driven by directional gradient histograms significantly enhanced the accuracy of target edge segmentation. Experimental results demonstrated that SAM2-MSNet achieved a mean intersection over union (mIoU) of 70.2%/69.6% and a mean Dice coefficient (mDice) of 78.5%/75.0% on the Endovis2018 and AutoLaparo datasets, respectively. With an inference speed comparable to that of SAM2-UNet (23 frames per second vs. 25 frames per second), segmentation accuracy was improved by 3.0%/6.7% (mIoU) and 2.8%/6.8% (mDice). This work enabled high-precision automatic segmentation for laparoscopic surgical scenes, providing a robust technical foundation for the autonomous operation of surgical robots.
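The two metrics reported in this abstract can be illustrated with a minimal sketch. It assumes flattened integer label maps and averages per-class scores over a supplied class list; the function names `iou_and_dice` and `mean_scores` are hypothetical, and the averaging convention actually used in the paper (per image, per class, or per dataset) is not stated in the abstract.

```python
from typing import Sequence, Tuple


def iou_and_dice(pred: Sequence[int], target: Sequence[int], cls: int) -> Tuple[float, float]:
    """Return (IoU, Dice) for one class over two flattened label maps."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # Pixels labelled `cls` in both maps, and in each map separately.
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    pred_n = sum(1 for p in pred if p == cls)
    tgt_n = sum(1 for t in target if t == cls)
    union = pred_n + tgt_n - inter
    if union == 0:
        # Assumed convention: a class absent from both maps scores 1.0.
        return 1.0, 1.0
    return inter / union, 2 * inter / (pred_n + tgt_n)


def mean_scores(pred: Sequence[int], target: Sequence[int], classes: Sequence[int]) -> Tuple[float, float]:
    """Average IoU and Dice over the given classes (mIoU, mDice)."""
    scores = [iou_and_dice(pred, target, c) for c in classes]
    miou = sum(s[0] for s in scores) / len(scores)
    mdice = sum(s[1] for s in scores) / len(scores)
    return miou, mdice
```

For a single class the two metrics are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU, which is consistent with the mDice figures in the abstract exceeding the mIoU figures.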

PanoLoRA: an efficient finetuning method for panoramic image generation based on Stable Diffusion

YE Wenlong, CHEN Bin

2025, 46(5): 980-989. DOI: 10.11996/JG.j.2095-302X.2025050980

Panoramic images, which capture the overall information of the surrounding environment, have become an important means of constructing virtual scenes. However, despite the rise of artificial intelligence generated content (AIGC) technology, especially diffusion models trained on large-scale text-image datasets and parameter-efficient fine-tuning (PEFT) techniques, research on the generation and rapid transfer of panoramic images remains insufficient. To address the challenges posed by the scarcity and spatial distortion of panoramic image datasets, 14 000 open-source panoramic image datasets were collected, finely annotated, and filtered through projection transformation. On this basis, the PanoLoRA method was proposed. In the process of extracting spatial features with the original convolution and self-attention modules, PanoLoRA additionally incorporated spherical convolution and low-rank adaptation (LoRA) modules. This enabled the explicit extraction of spherical features from panoramic images, which were then fused with the original planar features, thereby achieving efficient transfer learning for panoramic image generation while retaining the strong image generation ability of Stable Diffusion. The experimental results demonstrated that PanoLoRA outperformed five recent PEFT methods in comparison tests on the collected text-panoramic image dataset, improving both image generation quality and graphic consistency. A series of ablation experiments was conducted to verify the effectiveness of each algorithm module.

Universal mesh generation method based on physics-informed neural network

ZHANG Haoxuan, LI Haisheng, WANG Min, LI Nan

2025, 46(5): 990-997. DOI: 10.11996/JG.j.2095-302X.2025050990

Structured mesh generation in numerical simulation often requires considerable time and manpower. Traditionally, the process relies on establishing a mapping between the computational domain and the physical domain, which is typically obtained by solving partial differential equations. However, existing structured mesh generation methods struggle to achieve both high efficiency and superior mesh quality. To address this issue, we propose a universal mesh generation model based on a physics-informed neural network (UMG-PINN). This model formulates the mesh generation task as a mesh deformation problem from the computational domain to the physical domain. By taking the boundary curve as input and leveraging an attention network, UMG-PINN captures the potential mapping between the computational and physical domains, thereby generating high-quality structured meshes. To enforce physical constraints, the model incorporates the Navier-Lamé equation from linear elasticity into the loss function, ensuring that the neural network conforms to the principles of elastic body deformation during optimization. A key advantage of UMG-PINN is its fully self-supervised training process, which eliminates the need for prior knowledge or pre-existing datasets, greatly reducing the effort required for structured mesh dataset construction. Experimental results show that UMG-PINN outperforms traditional transfinite interpolation methods by generating higher-quality structured meshes. In addition, UMG-PINN can be extended to unstructured mesh generation under the constraints of physical information.
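The traditional baseline named in this abstract, transfinite interpolation, maps the unit square of the computational domain onto a physical domain bounded by four curves. The sketch below shows that classical 2D formula only; it is not UMG-PINN, and the function name `tfi_grid` and the curve parameterization on [0, 1] are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Curve = Callable[[float], Point]  # maps a parameter in [0, 1] to a boundary point


def tfi_grid(bottom: Curve, top: Curve, left: Curve, right: Curve,
             n_xi: int, n_eta: int) -> List[List[Point]]:
    """Structured grid by 2D transfinite interpolation.

    The four curves must share corner points, for example
    bottom(0) == left(0) and top(1) == right(1).
    """
    if n_xi < 2 or n_eta < 2:
        raise ValueError("need at least 2 nodes per direction")
    # Corner points of the physical domain.
    p00, p10 = bottom(0.0), bottom(1.0)
    p01, p11 = top(0.0), top(1.0)
    grid: List[List[Point]] = []
    for j in range(n_eta):
        eta = j / (n_eta - 1)
        row: List[Point] = []
        for i in range(n_xi):
            xi = i / (n_xi - 1)
            b, t, l, r = bottom(xi), top(xi), left(eta), right(eta)
            coords = []
            for k in range(2):
                # Blend opposite boundaries, then subtract the doubly
                # counted bilinear corner contribution.
                edge_blend = ((1 - eta) * b[k] + eta * t[k]
                              + (1 - xi) * l[k] + xi * r[k])
                corner_blend = ((1 - xi) * (1 - eta) * p00[k]
                                + xi * (1 - eta) * p10[k]
                                + (1 - xi) * eta * p01[k]
                                + xi * eta * p11[k])
                coords.append(edge_blend - corner_blend)
            row.append((coords[0], coords[1]))
        grid.append(row)
    return grid
```

Because the interpolation is purely algebraic, it is fast but gives no control over interior mesh quality, which is the kind of efficiency and quality trade-off the abstract describes.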

Semantic segmentation of small-scale point clouds based on integration of mean shift and deep learning

ZHU Hongmiao, ZHONG Guojie, ZHANG Yanci

2025, 46(5): 998-1009. DOI: 10.11996/JG.j.2095-302X.2025050998

In the field of point cloud semantic segmentation, accurate segmentation of small semantic objects has always been an important and challenging task. Point cloud data is typically sparse and irregular, and when small or distant objects are processed, existing fully supervised point cloud segmentation algorithms often fail to capture the features of these small semantic objects, leading to lower segmentation accuracy. This issue is particularly prominent in applications such as autonomous driving, robot navigation, and urban modeling, which rely on the accurate identification and localization of small objects. To address this problem, a small semantic point cloud segmentation algorithm integrating mean shift clustering with deep learning was proposed. The shortcomings of existing point cloud segmentation algorithms in handling small semantic objects were analyzed, emphasizing that, because small objects are sparse and have weak local features, current methods are often unable to extract their semantic information effectively. To overcome this, mean shift was integrated into deep neural networks as a feature extraction module to enhance the model’s attention to small semantic objects. In terms of network architecture, a feature processing module and a small semantic object neighborhood capture module were also specifically designed. The feature processing module enhanced the local features of small objects, helping the network better distinguish small from large objects in complex backgrounds. Meanwhile, the small semantic object neighborhood capture module focused on the contextual information surrounding small objects, enabling the model to capture more precise semantic features in local regions. Experimental evaluation on multiple point cloud datasets demonstrated that the proposed method significantly improved segmentation accuracy, especially in sparse and small-object-dense scenarios. In conclusion, the small semantic point cloud segmentation algorithm based on the integration of mean shift and deep learning provided an effective solution for accurate segmentation of small semantic objects, with broad application prospects and practical significance.
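For readers unfamiliar with mean shift, the classical clustering procedure can be sketched as follows. This is the standalone algorithm with a flat kernel, not the network module described in the abstract; the function names `mean_shift` and `label_modes`, the flat kernel, and the half-bandwidth merge threshold are all assumptions chosen for brevity.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def mean_shift(points: Sequence[Point], bandwidth: float,
               max_iter: int = 100, tol: float = 1e-6) -> List[List[float]]:
    """Move each point uphill to a density mode using a flat kernel."""
    modes: List[List[float]] = []
    for p in points:
        m = list(p)
        for _ in range(max_iter):
            # Neighbours of the current estimate within the bandwidth.
            nbrs = [q for q in points if math.dist(q, m) <= bandwidth]
            if not nbrs:
                break  # guard against floating-point edge cases
            new_m = [sum(c) / len(nbrs) for c in zip(*nbrs)]
            shift = math.dist(new_m, m)
            m = new_m
            if shift < tol:
                break
        modes.append(m)
    return modes


def label_modes(modes: Sequence[Point], bandwidth: float) -> List[int]:
    """Assign one cluster label per point by merging nearby modes."""
    centers: List[Point] = []
    labels: List[int] = []
    for m in modes:
        for k, c in enumerate(centers):
            if math.dist(m, c) <= bandwidth / 2:
                labels.append(k)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels
```

Mean shift needs no preset number of clusters, so a small, sparse object can still form its own mode rather than being absorbed into a large neighbour, which is one plausible reason for pairing it with small-object segmentation.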

Research on UAV three-dimensional scene navigation based on deep reinforcement learning

LIU Bokai, YIN Xuefeng, SUN Chuanyu, GE Huilin, WEI Ziqi, JIANG Yutong, PIAO Haiyin, ZHOU Dongsheng, YANG Xin

2025, 46(5): 1010-1017. DOI: 10.11996/JG.j.2095-302X.2025051010

In recent years, as the UAV industry and its application demands have expanded, achieving UAV autonomy and intelligence has been identified as a critical challenge. As a foundational technology for autonomous UAV control, UAV navigation and exploration have become a top priority in UAV application research. Currently, most UAV navigation and exploration methods rely on reconstructing environmental information, which consumes excessive computation and memory and thus fails to meet the demands of increasingly complex scenarios and real-time operation. Therefore, drawing on the representation learning ability of deep learning and the self-learning decision-making ability of reinforcement learning, an autonomous navigation method for UAVs was proposed. By continuously optimizing decision-making strategies through self-learning, the navigation task could be better completed. The method first constructed a continuous action space and a non-sparse reward function to guide the learning process of the UAV; feature-extraction and decision-making modules were then designed to enhance the UAV's perception and decision-making capabilities. The experimental results demonstrated that the algorithm exhibited the best navigation and obstacle avoidance performance in the simulated 3D scene. The navigation success rate in the designed 3D scene reached 87%, and the converged average cumulative reward was 33% higher than that of contemporaneous methods; the method also reduced training time and improved training stability.

Multi-view synergistic visual analysis of ocean heat waves

HE Qi, XIE Qiuhan, HUANG Dongmei, CHEN Kuo, WANG Jian

2025, 46(5): 1018-1027. DOI: 10.11996/JG.j.2095-302X.2025051018

Against the background of intensifying global warming, the frequency and intensity of ocean heat waves continue to rise, with serious impacts on marine ecosystems and coastal economic activities. Existing research methods were found to inadequately capture the complex characteristics of multi-factor coupling and multi-scale interaction of ocean heat waves, especially in the quantitative characterization of their spatio-temporal dynamic evolution. To address this scientific problem, a multi-view synergistic analysis methodology incorporating high-dimensional spatio-temporal features was proposed. Firstly, a feature extraction technique based on a spatio-temporal graph convolutional network (ST-GCN) was developed. It characterized the spatio-temporal evolution of ocean heat waves by constructing a multi-dimensional feature matrix containing heat wave intensity, frequency, duration, and other indicators, and by establishing dynamic spatial adjacencies with an improved Delaunay triangulation algorithm. Secondly, a visualization system supporting multi-factor correlation analysis was designed. Multi-dimensional scaling and the HDBSCAN clustering algorithm were adopted to analyze the nonlinear coupling between ocean heat wave events and key environmental drivers, such as sea surface temperature anomalies and the wind speed field. The system enabled researchers to intuitively explore the spatial and temporal distribution patterns of ocean heat waves and their driving mechanisms through the synergistic interaction of multiple views.

DRec: large language model-driven data analysis recommendation system

CHEN Zhizhang, FENG Yingchaojie, WENG Luoxuan, SHEN Jian, CHEN Wei

2025, 46(5): 1028-1041. DOI: 10.11996/JG.j.2095-302X.2025051028

Natural language interaction systems have greatly simplified the interaction between users and data analysis, allowing users to complete data analysis and chart generation through natural language. With the rise of large language models (LLMs), LLM-driven natural language data analysis systems have become a trend in recent years. Thanks to their logical reasoning and tool invocation capabilities, LLMs are able to generate more complex logical inferences and charts. However, interactive data analysis based on LLMs poses challenges. Data analysts must clearly define the direction of analysis to drive the interactive process, which often requires a deep understanding of the data. Furthermore, when employing LLMs for data exploration, analysts are often less directly involved with the data, which may lead to insufficient understanding of the data and consequently weaken their overall control of the analysis process. To help users clarify the analysis process and deepen their understanding of the data, DRec, an LLM-based recommendation and association-driven data analysis system, was proposed. The system helped users develop a comprehensive understanding of the data through associative information and guided the data analysis process. At the same time, the system provided insights from both the semantic and data dimensions and offered query recommendations to help users determine the analysis direction. Case studies and user experiments demonstrated that DRec can improve the efficiency of data analysis interaction and guide users toward reasonable analysis results.

Geodesic distance propagation across open boundaries

YUE Zijia, WANG Wensong, CHEN Shuangmin, XIN Shiqing, TU Changhe

2025, 46(5): 1042-1049. DOI: 10.11996/JG.j.2095-302X.2025051042

In the field of digital geometry processing, the computation of geodesic distances on surfaces is a fundamental and crucial task. During the calculation, each surface point serves as both a receiver and a transmitter to propagate distances across the entire surface. When there are open-boundary defects, existing algorithms attempt to fill holes and gaps in the ambient space. However, this approach remains inadequate when open-boundary defects occur in highly curved regions. To address this, a new approach was proposed that allows geodesic distance propagation to traverse holes naturally without filling them. It was observed that traditional algorithms, after crossing holes, formed a “shadow” region. In this region, the shortest path was found to pass through the hole’s boundary, producing distances larger than the true geodesic distance. Based on this observation, three significant improvements were made to the classical fast marching method (FMM). First, boundary points were treated solely as distance receivers, preventing them from propagating distances to other points. Second, each point was allowed to propagate distances in both forward and backward directions, enabling points in the shadow region to obtain geodesic distances from the surrounding visible points. Finally, a balance was achieved between the “near to far” and “visible region to shadow region” propagation modes by adjusting the priority of distance propagation. Experimental results demonstrated that, even with highly complex open-boundary defects, the method produced geodesic distances closely approximating the true solution (the solution for a model without defects).
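The first of the three modifications, treating boundary points as receivers only, can be illustrated on a plain weighted graph. The sketch below uses Dijkstra's algorithm as a stand-in for the fast marching method, so it omits FMM's triangle-based update, the bidirectional propagation, and the priority adjustment; the function name `propagate` and the adjacency-list format are assumptions.

```python
import heapq
from typing import Dict, List, Set, Tuple

Graph = Dict[int, List[Tuple[int, float]]]  # vertex -> [(neighbour, edge length)]


def propagate(graph: Graph, source: int, receivers_only: Set[int]) -> Dict[int, float]:
    """Front propagation in which some vertices receive but never transmit.

    Every neighbour referenced in `graph` must also be a key of `graph`.
    """
    dist = {v: float("inf") for v in graph}
    dist[source] = 0.0
    heap: List[Tuple[float, int]] = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        if u in receivers_only and u != source:
            # A hole-boundary vertex keeps its distance but does not
            # pass it on, so no path is routed along the boundary.
            continue
        for v, w in graph[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

On the four-vertex graph `{0: [(1, 1.0), (3, 2.0)], 1: [(0, 1.0), (2, 1.0)], 2: [(1, 1.0), (3, 2.0)], 3: [(0, 2.0), (2, 2.0)]}` with vertex 1 marked as receiver-only, vertex 2 is reached through vertex 3 instead of through vertex 1.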

Knowledge-aware recommendation based on hypergraph representation learning and Transformer model optimization

ZUO Yuqi, ZHANG Yunfeng, ZHANG Qiuyue, XU Yingcheng

2025, 46(5): 1050-1060. DOI: 10.11996/JG.j.2095-302X.2025051050

Recommendation algorithms based on knowledge graphs have emerged as a significant research focus in the field of recommender systems in recent years. The introduction of knowledge graphs enables the acquisition of auxiliary information about items, thereby significantly enhancing the capabilities of recommendation systems and providing users with more precise and personalized recommendations. In response to this trend, a knowledge-aware recommendation model optimized via hypergraph representation learning and the Transformer model was proposed. This model leveraged the unique advantages of hypergraphs in handling high-order relationships to directly model the complex interactions between users and items, thereby greatly enriching their interaction information. Since local graphs lack global interaction information between users and items, global hypergraphs were constructed within local graphs. On the other hand, nonlocal graphs contain redundant information, so nonlocal hypergraphs were built to capture more comprehensive interaction information between users and items. Additionally, the attention mechanism of the Transformer model was employed to strengthen the collaboration between user nodes and item nodes, mining more valuable preference information from noisy user interaction data such as clicks on uninteresting items. This optimized the embeddings of user and item nodes, mitigating the impact of noise and enhancing recommendation performance based on user preferences.

Simulation technology for braiding process of composite materials based on kinematic principles

WU Haoyu, YANG Xiaochao, WANG Wei, ZHAO Gang

2025, 46(5): 1061-1071. DOI: 10.11996/JG.j.2095-302X.2025051061

A simulation algorithm for the braiding process of composite materials is presented, specifically addressing the calculation and prediction of braiding machine control parameters, yarn trajectories, and braid angles. The algorithm operates through two complementary subprocesses: inverse solution generates machine control data based on target braid structures, while forward solution computes yarn trajectories and braid angle distributions using predefined control parameters. Initially, the surface of the mandrel is discretized into a set of triangular patches as program inputs. The mandrel centerline is extracted, and local solutions are performed according to kinematic principles. The inverse solution algorithm generates the braid structure of the preset braid angle and determines the corresponding take-up speed, while the forward solution algorithm calculates the yarn trajectories and braid angle distribution of specific take-up speed and control parameters. The BraidSim module, developed on the FreeCAD platform, integrates functionalities including mandrel surface meshing, trajectory generation, and dynamic yarn deposition simulation. The proposed algorithm has been validated through typical braiding cases including circular variable cross-section mandrel, square variable cross-section mandrel, aero-engine intake mandrel and spatial curve mandrel. Results demonstrate that the obtained take-up speed and braid angle distributions align closely with design expectations. Additional simulations are conducted for variable cross-section mandrel and curved centerline mandrel, generating corresponding inverse solution machine control data and forward solution braid angle distribution, demonstrating the applicability of the algorithm to complex mandrels.

Analysis and research on the functional architecture of aero-engine health management system

ZHAN Keyi, HUANG Weina, CHUN Daoyong, GUI Yongtao, ZHANG Chunlin

2025, 46(5): 1072-1084. DOI: 10.11996/JG.j.2095-302X.2025051072

To address the challenges of ambiguous requirement boundaries and dynamic architecture optimization in complex systems, a five-layer collaborative modeling framework integrating the MagicGrid methodology was established, encompassing the “scenario-requirement-function-logic-physical” domains. Model-based systems engineering (MBSE) using the Systems Modeling Language was carried out on the MagicDraw platform, and functional architecture practices were conducted for an aero-engine health management system. The methodology was divided into four critical phases: ① Scenario-driven system boundary definition through internal block diagrams modeling stakeholder interaction topology; ② Scenario verification and requirement integrity analysis using traceability matrices constructed from requirement diagrams; ③ Dynamic functional behavior modeling via combined use case-activity diagrams, where modular decomposition was used to derive technology-neutral logical architectures; ④ Physical implementation modeling employing block definition diagrams and internal block diagrams, achieving logic-physical decoupling through standardized interface design. The case validation demonstrated that the proposed method was effective in reducing the functional overlap rate, improving prediction efficiency, enabling agile assessment of state impacts, and lowering the false alarm rate. The research not only established a full-lifecycle modeling paradigm for aero-engine health management systems, realizing end-to-end traceability spanning scenarios, requirements, functions, and physical components, but also expanded the engineering application scenarios of graphic methodologies in the MBSE domain. The multi-view collaborative modeling mechanism was shown to be of general reference value for complex equipment system design, particularly for resolving cross-domain requirement conflicts and supporting traceable architecture evolution.

Life prediction method of nuclear power heat exchanger based on improved red kite optimization algorithm and LSTM

XIAN Siyu, ZHAO Zetian, WU Xuanyu, FENG Yixiong, XUE Yang, ZHANG Zhifeng

2025, 46(5): 1085-1093. DOI: 10.11996/JG.j.2095-302X.2025051085

With the increase in the quantity and diversification of nuclear power equipment, predictive maintenance strategies based on condition detection have gradually become a focus of attention for nuclear power plants, especially for predicting the remaining service life of key equipment, such as cooling water heat exchangers. A heat exchanger life prediction method combining an improved red kite optimization algorithm (IROA) with a long short-term memory network (LSTM) was proposed to address the limitations of traditional methods in hyperparameter optimization and improve prediction accuracy. In response to the problem of premature convergence caused by insufficient initial population diversity in existing methods, crossover and mutation operations from genetic algorithms were introduced to improve the ROA algorithm, in order to enhance population diversity and the ability to escape from local optima. With an LSTM model trained using historical degradation data, we conducted a detailed analysis of the impact of different hyperparameter combinations on model performance, demonstrating that the optimized hyperparameter combinations could significantly improve prediction performance. To verify the effectiveness of the proposed method, we conducted comparative experiments between IROA-LSTM and other common prediction methods such as SVM, CNN, RNN, and standard LSTM, as well as comparative experiments with several other optimization algorithms, along with noise interference tests. The results indicated that IROA-LSTM not only performed well in various performance indicators, but also demonstrated strong robustness and stability, maintaining high prediction accuracy under different conditions. This provided reliable data support for developing scientific and reasonable maintenance strategies, thereby contributing to improved safety and economic efficiency of nuclear power plant equipment operation.

Design and characterisation of a dual-branched chain turning assist device

LIU Zheng, PAN Guoxin, YAO Pengzhen, LIU Tian, SU Peng

2025, 46(5): 1094-1104. DOI: 10.11996/JG.j.2095-302X.2025051094

Regular turning is a critical nursing measure for preventing pressure ulcers in long-term bedridden patients. Existing mechanical turning-assist devices exhibit excessive structural rigidity and do not consider human turning biomechanics, while inflatable types show misalignment between the rotation center of the human joints and that of the mechanism, potentially causing injuries during turning. Therefore, research on supine turning-assist devices with human-mechanical movement synergy is of significant importance. Based on research into human turning kinematics, a dual-branch chain turning-assist device driven by a planar linkage mechanism was designed, with the shoulder and hip regions as the main load-bearing positions. A mathematical model was established and the design parameters were derived. The linkage transmission angle was analyzed using graphical methods; in addition, kinematic simulation of the device, workspace analysis, and experiments extracting the trajectory of the human center of gravity during turning were performed, enabling kinematic characterization of the entire device. The results show that the device assists the human body in turning through angles of 30° to 45° while maintaining uniform contact stress distribution between the human body and the dual-branch chains, effectively preventing pressure ulcers. The transmission angles range from 42.19° to 90.00°, and the mechanism has good force transmission efficiency in this range. This study verified the rationality of the device design. The dual-branch chain structure can follow the motion of the shoulder and hip, provide multi-segment support chain contact, prevent localized pressure concentration, and meet the human-machine coordination principles and ergonomic requirements for rehabilitation aids. These findings establish a foundation for turning rehabilitation research and clinical application of assistive devices.

Multi-mode swimming mechanism of a biomimetic robotic fish based on CFD simulation

XIA Minghai, LUO Zirong, YIN Qian, LU Zhongyue, JIANG Tao

2025, 46(5): 1105-1112. DOI: 10.11996/JG.j.2095-302X.2025051105

Bionic robotic fish represent an innovative class of underwater vehicles, characterized by their low noise emission, high reliability, and eco-friendliness. This study investigates the swimming mechanisms of a dual-fin-driven robotic fish through numerical simulations of its multi-mode motion capabilities, employing computational fluid dynamics (CFD). Kinematic and dynamic models of the robotic fish were developed, and the spatial motion equations for the undulating fins were formulated. A fluid simulation model was constructed using Fluent software, incorporating a force-motion coupled dynamic mesh simulation algorithm. The simulation results demonstrated that the robotic fish could execute various multi-mode motions, including forward and backward movement, maneuvering turns, and in-place turns, through the coordinated action of its dual undulating fins. The propulsive force was found to be proportional to the square of the wave frequency, while both the swimming speed and turning speed were directly proportional to the wave frequency. At a frequency of 6 Hz, the robotic fish achieved a swimming speed of 1.25 m/s and a steering speed of 3.2 rad/s. These findings validate the design feasibility and multi-mode motion performance of the robotic fish, providing a theoretical foundation and computational support for the optimization and motion control of the physical prototype of the bionic robotic fish.
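
The scaling relations reported above can be illustrated with a minimal sketch. It assumes the proportionalities pass through the origin and hold away from the single reported operating point (6 Hz, 1.25 m/s, 3.2 rad/s); the function and constant names are hypothetical, and thrust is given only relative to its value at 6 Hz because the abstract reports no absolute force.

```python
# Reference operating point taken from the abstract (6 Hz case).
REF_FREQ_HZ = 6.0
REF_SWIM_SPEED_M_S = 1.25
REF_TURN_SPEED_RAD_S = 3.2


def scaled_motion(freq_hz):
    """Extrapolate from the 6 Hz point (hypothetical helper).

    Assumes speed ~ f and propulsive force ~ f^2, as stated in the
    abstract, with both relations passing through the origin.
    """
    ratio = freq_hz / REF_FREQ_HZ
    return {
        "swim_speed_m_s": REF_SWIM_SPEED_M_S * ratio,
        "turn_speed_rad_s": REF_TURN_SPEED_RAD_S * ratio,
        # Thrust relative to the 6 Hz thrust (dimensionless).
        "relative_thrust": ratio ** 2,
    }


half_rate = scaled_motion(3.0)
```

Under these assumptions, halving the wave frequency halves both speeds and reduces the propulsive force to a quarter of its 6 Hz value.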

A construction plan intelligent review method based on YOLO and natural language processing

QIAN Zengzhi, SUN Yulong, ZHANG Jie, XIAHOU Xiaer, ZHOU Daxing, KANG Weide

2025, 46(5): 1113-1122. DOI: 10.11996/JG.j.2095-302X.2025051113

The manual review of construction plans in the building industry is highly repetitive, time-consuming, and demanding of expert resources. To improve review efficiency and promote intelligent construction, an intelligent construction plan review method was proposed that integrates review rule compilation, vector model construction, and image-text recognition to review multiple types of plans. The review rules were drawn from group technical documents and historical review samples, filtered using high-frequency historical review comments and expert judgment, and then compiled item by item as regular expressions. A review model based on semantic similarity comparison was constructed: plan text was embedded into a vector space and compared by cosine similarity, which made the review more flexible and fault-tolerant. In addition, YOLO-based image text recognition was incorporated to process textual content in document images, ensuring comprehensive review coverage. Experimental results showed an average review accuracy of 90.4% and an 87.9% improvement in time efficiency compared with manual review. The system handles inputs in multiple text formats robustly, significantly improves review efficiency, and plays an important role in promoting enterprise digital transformation and the adoption of intelligent construction technology. The platform equipped with this review technology has been tested in multiple branches and projects of the group, generating accurate review reports and delivering a significant improvement in review efficiency.
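
The combination of regular-expression rules and cosine-similarity comparison described above can be sketched as follows. This is a minimal illustration, not the authors' system: the rule, the threshold, and all function names are hypothetical, and a bag-of-words count vector stands in for the embedding model, which the abstract does not specify.

```python
import math
import re
from collections import Counter

# Hypothetical review rule compiled as a regular expression.
RULES = {"safety_officer": re.compile(r"safety\s+officer", re.IGNORECASE)}


def tokenize(text):
    """Lowercase word tokens; a stand-in for a real embedding step."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors."""
    va, vb = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def review(plan_text, required_clause, threshold=0.5):
    """Report rules with no match and the best clause similarity."""
    missing = [name for name, rule in RULES.items()
               if not rule.search(plan_text)]
    sentences = [s for s in re.split(r"[.;\n]+", plan_text) if s.strip()]
    best = max((cosine_similarity(s, required_clause) for s in sentences),
               default=0.0)
    return {"missing_rules": missing,
            "clause_similarity": best,
            "clause_ok": best >= threshold}


report = review(
    "A safety officer supervises lifting. "
    "Scaffolding is inspected daily before use.",
    "Scaffolding must be inspected daily",
)
```

The regular expressions catch items that must appear literally, while the similarity score tolerates rewording, which is the flexibility the abstract attributes to the vector comparison.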

Towards unsupervised BIM product retrieval: a Weisfeiler-Lehman kernel enhanced approach

HU Huiqiang, HE Changyan, LIU Xiaojun, JIA Jinyuan, GAO Lu

2025, 46(5): 1123-1133. DOI: 10.11996/JG.j.2095-302X.2025051123

To meet the urgent need for building element retrieval in the construction industry, an unsupervised building information modeling (BIM) product retrieval method tailored to the characteristics of industry foundation classes (IFC) data was proposed. The method fully exploited the semantic and geometric information in the IFC standard to construct a product attributed graph (PAG) as the product feature. By leveraging the multi-attribute channels of the PAG, a PAG isomorphism prediction approach enhanced by the Weisfeiler-Lehman (WL) kernel was proposed to achieve BIM product retrieval. The method accepted two IFC documents as input: Document A, representing the target product to be retrieved, and Document B, serving as the product library. It returned the products in Document B that were similar to the target product in Document A. The principal contributions were threefold: ① a BIM product retrieval framework that avoided data preprocessing while maintaining semantic integrity; ② PAG feature extraction for BIM products and an enhanced PAG isomorphism prediction method with augmented WL graph kernels; ③ an unsupervised convergence assessment strategy in which convergence was determined promptly by analyzing the differences between the source attributes and the predicted attributes. Empirical findings indicated that the PAG isomorphism testing achieved convergence within a maximum of three iterations. Under the experimental conditions, the isomorphism testing of BIM products took no longer than 1 second, with an average accuracy of 95% in product retrieval.
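
For readers unfamiliar with the Weisfeiler-Lehman procedure mentioned above, the following is a minimal sketch of standard WL label refinement on small attributed graphs. It is a generic textbook version, not the enhanced method of the paper; the graph encoding (adjacency dictionaries with one string label per node) and all names are assumptions made for illustration.

```python
from collections import Counter


def wl_histogram(adjacency, labels, iterations=3):
    """Count node labels over successive WL refinement rounds.

    adjacency: dict mapping node -> list of neighbor nodes.
    labels: dict mapping node -> initial attribute label (string).
    """
    current = dict(labels)
    histogram = Counter(current.values())
    for _ in range(iterations):
        # New label = own label plus the sorted multiset of neighbor labels.
        current = {
            node: repr((current[node],
                        sorted(current[n] for n in adjacency[node])))
            for node in adjacency
        }
        histogram.update(current.values())
    return histogram


def wl_kernel(hist_a, hist_b):
    """WL subtree kernel value: dot product of the label histograms."""
    return sum(count * hist_b[label] for label, count in hist_a.items())


def maybe_isomorphic(adj_a, labels_a, adj_b, labels_b, iterations=3):
    """Equal histograms are necessary, not sufficient, for isomorphism."""
    return (wl_histogram(adj_a, labels_a, iterations)
            == wl_histogram(adj_b, labels_b, iterations))


# Two three-node chains with the same structure but different node names.
chain_a = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
chain_b = {"x": ["y"], "y": ["x", "z"], "z": ["y"]}
labels_a = {"a": "wall", "b": "door", "c": "wall"}
labels_b = {"x": "wall", "y": "door", "z": "wall"}
same = maybe_isomorphic(chain_a, labels_a, chain_b, labels_b)
```

The default of three iterations mirrors the convergence bound reported in the abstract; in general the number of rounds needed depends on the graph.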

Dimension detection method for cable support curtain wall panels based on unbalanced optimal transport theory

TAN Liyun, LIU Jiepeng, LI Hantao, ZENG Yan, LIAO Yue, WU Xiaofeng, CUI Na

2025, 46(5): 1134-1143. DOI: 10.11996/JG.j.2095-302X.2025051134

Glass curtain walls are widely used in large venues and landmark buildings due to their unique aesthetic appeal and powerful shaping capabilities. Construction proceeds by first erecting the keel support system and then installing and adjusting the glass panels. In practice, however, the axis of the as-built keel usually deviates from the design axis, complicating subsequent installation of the glass panels. To address this, a curtain-wall axis extraction method based on unbalanced optimal transport theory was developed. The method made full use of point cloud data in combination with unbalanced optimal transport theory. Its steps comprised inputting the point cloud data, preprocessing it, random sampling to obtain an initial axis point set, extracting a coarse axis of the rod, and extracting a fine axis of the rod. In this way, the axis features of the target point cloud were obtained, and the curtain-wall dimensions were then computed from the extracted axis. Experimental results showed that the method effectively extracted the axis of the curtain-wall keel and exhibited strong centrality and robustness. Its effectiveness was demonstrated by comparison with other algorithms. Compared with the measured values, the calculated curtain-wall dimensions deviated by no more than ±2 mm, which is within the allowable error range.
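
To illustrate why unbalanced optimal transport suits noisy point clouds, the sketch below implements the generic entropic scaling iteration for unbalanced transport with relaxed marginals. It is not the axis-extraction pipeline of the paper: the one-dimensional toy data, the parameter values `eps` and `rho`, and the function name are all assumptions chosen for illustration.

```python
import math


def unbalanced_sinkhorn(a, b, cost, eps=0.05, rho=1.0, n_iter=500):
    """Entropic unbalanced OT via scaling iterations (pure Python).

    a, b: source and target masses; cost: matrix as nested lists.
    rho controls how strongly the marginals are enforced; eps is the
    entropic regularization. Returns the transport plan.
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)]
              for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    power = rho / (rho + eps)  # exponent < 1 relaxes the marginals
    for _ in range(n_iter):
        u = [(a[i] / sum(kernel[i][j] * v[j] for j in range(m))) ** power
             for i in range(n)]
        v = [(b[j] / sum(kernel[i][j] * u[i] for i in range(n))) ** power
             for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)]
            for i in range(n)]


# Toy data: three points near the target positions plus one far outlier.
source = [0.0, 0.5, 1.0, 5.0]
target = [0.0, 0.5, 1.0]
a = [0.25] * 4
b = [1.0 / 3.0] * 3
cost = [[(s - t) ** 2 for t in target] for s in source]
plan = unbalanced_sinkhorn(a, b, cost)
row_mass = [sum(row) for row in plan]
```

Because the marginal constraints are only penalized rather than enforced, the outlier at 5.0 ends up transporting almost none of its mass, so it has little influence on the matching. Balanced transport would force all of that mass onto the target points.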

Graph-based diffusion solver for basic job-shop scheduling problem

YU Kexiong, HE Hongjun, YI Renjiao, ZHAO Hang, XU Kai, ZHU Chenyang

2025, 46(5): 1144-1151. DOI: 10.11996/JG.j.2095-302X.2025051144

The job-shop scheduling problem (JSSP) is a classic NP-hard combinatorial optimization problem with broad applications in manufacturing, logistics, and related domains. Due to its exponential computational complexity with increasing jobs and machines, traditional exact algorithms struggle with large-scale instances, while existing heuristic and deep learning-based methods often inadequately exploit global information. Furthermore, these approaches typically generate solutions from a single distribution, failing to capture the inherent multi-modality of combinatorial optimization problems. To address these limitations, we propose a novel global information prediction method based on diffusion probabilistic models. Our approach adapts the diffusion model to the structural constraints of JSSP, predicting a heatmap that represents the distribution of optimal solutions. Leveraging this heatmap, we perform constrained optimization and local search, effectively harnessing the model’s multi-modal generation capability and global information encoding. This results in high-quality, constraint-satisfying scheduling solutions. For enhanced computational efficiency, we implement our framework on the domestic deep learning platform Jittor, developing an optimized JSSP solving pipeline that achieves up to 40% faster inference than PyTorch. Extensive experiments on mainstream benchmarks demonstrate that our method outperforms existing approaches across varying problem scales, delivering state-of-the-art solution quality. To the best of our knowledge, this work presents the first diffusion-based solver for JSSP.
