Welcome to Journal of Graphics
Bimonthly, Started in 1980
Administrated: China Association for Science and Technology
Sponsored: China Graphics Society
Edited and Published: Editorial Board of Journal of Graphics
Chief Editor: Guoping Wang
Editorial Director: Xiaohong Hou
ISSN 2095-302X
CN 10-1034/T
Current Issue
30 June 2024, Volume 45 Issue 3
Cover
Cover of issue 3, 2024
2024, 45(3): 1. 
Contents
Table of Contents for Issue 3, 2024
2024, 45(3): 2. 
Review
Opportunities and challenges: automatic generation technologies for graphical user interfaces
LI Xiangdong, XIA Hanfei, SHAN Yifei, YIN Kailin, GENG Weidong
2024, 45(3): 409-421.  DOI: 10.11996/JG.j.2095-302X.2024030409

As emerging technologies like big data and artificial intelligence continue to evolve and find applications, a trend towards algorithm-driven automation and intelligent graphical user interface (GUI) design is rapidly transforming traditional manual methods. The need for automated GUI generation arises from the diversity of human-computer interaction devices, increasing functional complexity, and the demand for personalization. This approach not only addresses these challenges but also establishes a new paradigm in intelligent, collaborative human-computer interface design. To broaden developers’ and researchers’ understanding of cutting-edge methods and application cases in automatic GUI generation, this paper examined the intersection of human-computer interaction and artificial intelligence, covering the latest advancements in automated GUI generation and intelligent evaluation methods. The current technological approaches, algorithmic models, design methods, and evaluation frameworks for automatic GUI generation were analyzed. Advancements in GUI auto-generation technology based on artificial intelligence generated content (AIGC) were specifically highlighted, along with the new challenges posed to existing user interface design paradigms. Furthermore, the development trends and opportunities in the field of automatic GUI generation were summarized.

Detection of traffic signs based on lightweight YOLOv8s
ZHU Qiangjun, HU Bin, WANG Huilan, WANG Yang
2024, 45(3): 422-432.  DOI: 10.11996/JG.j.2095-302X.2024030422

To enhance the real-time capability and feasibility of traffic sign detection, a lightweight traffic sign detection model based on YOLOv8s was proposed. Firstly, the BottleNeck in the C2f module was replaced with the residual module FasterNetBlock in FasterNet, reducing the model’s parameter count and computational complexity. Secondly, the large object detection layer was replaced with a small object detection layer, decreasing the number of network layers in Backbone and achieving a significant improvement in detection speed and a reduction in parameter count. Finally, the original complete intersection over union (CIOU) loss function was replaced with the wise intersection over union (Wise-IOU), thereby enhancing both speed and accuracy. Verified on the TT100K traffic sign dataset, compared with the YOLOv8s model, mAP50 increased by 5.16%, parameter count decreased by 76.48%, computational complexity decreased by 13.33%, and frames per second (FPS) improved by 35.83%. In comparison to other models, mAP50 exhibited an average increase of 15.11%, an average decrease of 85.74% in parameter count, an average decrease of 46.23% in computational complexity, and an average increase of 31.49% in FPS. The model thus combined high detection accuracy, a small parameter count, low computational complexity, and fast speed. It represented a substantial improvement over the original algorithm and remained strongly competitive with other advanced traffic sign detection models.

Detection of dress code violations based on improved YOLOv5s
LI Yuehua, ZHONG Xin, YAO Zhangyan, HU Bin
2024, 45(3): 433-445.  DOI: 10.11996/JG.j.2095-302X.2024030433

Existing algorithms for detecting non-compliant attire among kitchen staff tend to have low detection accuracy and are prone to false detections and omissions against the complex background of a catering kitchen. To address this issue, this paper proposed an improved attire compliance detection algorithm, YOLOv5s-ESW, based on YOLOv5s. Firstly, a novel multi-scale attention mechanism was introduced into the main network to enhance the network’s feature extraction capability. Secondly, within the neck network, the spatial and channel reconstruction convolution module (SCConv) replaced the original convolution module (Conv) to reduce model parameter redundancy while enhancing model accuracy. Lastly, the WIoU loss function was introduced in the prediction part to accelerate convergence and enhance the model’s generalization capability. The improved algorithm was applied to a self-compiled dataset of catering kitchen staff attire for experimentation. The results showed that the improved model raised mean detection accuracy by 4.1% and reduced the parameter count by 11.4%. While enhancing detection accuracy, the model also reduced network complexity, thereby satisfying the requirements for attire compliance detection among catering kitchen staff.

Defect detection method of rubber seal ring based on improved YOLOv7-tiny
ZHANG Xiangsheng, YANG Xiao
2024, 45(3): 446-453.  DOI: 10.11996/JG.j.2095-302X.2024030446

Aiming at the problem of low efficiency in traditional detection of surface defects of rubber seal rings, an improved YOLOv7-tiny algorithm for surface defect detection of rubber seal rings was proposed. The PConv optimized ELAN structure was introduced into the backbone feature extraction network to enhance the algorithm’s feature extraction capability and to reduce the number of parameters. The global attention mechanism (GAM) was introduced into the feature fusion network, utilizing the attention weights between each pair of 3D channels, spatial widths, and spatial heights to improve efficiency by capturing the important features in three dimensions, thus enhancing the algorithm’s feature fusion capability. The WIoU loss function was employed to optimize the original bounding box loss function, enhancing the algorithm’s ability to locate the detected targets through a situation-compliant gradient gain allocation strategy. Additionally, a P2 small-target detection layer was added to strengthen the fusion of the deep and shallow feature information, thereby enhancing the algorithm’s ability to detect small-target defects. Experimental comparisons were conducted using the O-Rings dataset. The improved algorithm was compared with the YOLOv7-tiny algorithm, resulting in a 7.8% improvement in mAP and achieving a detection accuracy of 90.9%, meeting the needs of actual industrial production.

Monocular depth estimation combining pyramid structure and attention mechanism
LI Tao, HU Ting, WU Dandan
2024, 45(3): 454-463.  DOI: 10.11996/JG.j.2095-302X.2024030454

Monocular depth estimation is the prediction of a dense depth image from a single color image. A monocular depth estimation algorithm combining pyramid structure and attention mechanism was proposed to address the issues of boundary ambiguity and insufficient capture of contextual information in current monocular depth estimation algorithms. The algorithm adopted the overall framework of encoder-decoder, in which the encoder selected the PVTv2 network to obtain more adequate global semantic information by taking advantage of the Transformer network in modeling global information. The decoder consisted of a depth estimation main branch and two pyramid sub-branches. The depth estimation main branch adaptively focused on important feature regions and feature channels between the encoder and decoder features through spatial and channel attention mechanisms. The Laplacian pyramid sub-branch and depth residual pyramid sub-branch aimed to learn rich local information from color images and from the depth features of the main branch, transferring it to the depth estimation main branch to address the problems of missing details and chaotic structures in monocular depth estimation. Experimental results demonstrated that on the indoor public dataset NYU Depth V2, compared with the advanced algorithm P3Depth, the accuracy at the δ1.25 threshold was increased by 1.22%, and the absolute error and root mean square error were decreased by 5.8% and 2.8%, respectively. On the outdoor public dataset KITTI, the absolute error, root mean square logarithmic error, and root mean square error of the algorithm were decreased by 8.5%, 3.9%, and 0.4%, respectively. The algorithm improved the accuracy of depth estimation and achieved a good visual rendering.
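
The evaluation figures quoted above can be made concrete with a minimal sketch of three common depth metrics. It assumes that the "δ1.25 threshold" accuracy is the share of pixels with max(pred/gt, gt/pred) < 1.25 and that "absolute error" means the mean absolute relative error; the function name `depth_metrics` and the flat-list input are hypothetical and not taken from the paper.

```python
import math


def depth_metrics(pred, gt, thresh=1.25):
    """Return (threshold accuracy, absolute relative error, RMSE).

    `pred` and `gt` are flat sequences of depths; pixels whose predicted or
    ground-truth depth is not positive are skipped (an assumed convention).
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid depth pairs")
    n = len(pairs)
    # Share of pixels whose prediction is within a factor `thresh` of the truth.
    delta = sum(1 for p, g in pairs if max(p / g, g / p) < thresh) / n
    # Mean absolute error relative to the ground-truth depth.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    # Root mean square error in depth units.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    return delta, abs_rel, rmse
```

Higher threshold accuracy and lower error values indicate better depth estimates, which is the direction of the improvements reported in the abstract.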

3D piece-wise planar reconstruction from a single indoor image based on self-augmented attention mechanism
ZHU Guanghui, MIAO Jun, HU Hongli, SHEN Ji, DU Ronghua
2024, 45(3): 464-471.  DOI: 10.11996/JG.j.2095-302X.2024030464

The piece-wise 3D reconstruction of indoor scenes using convolutional neural networks (CNN) has become one of the hot topics in the research of indoor scene modeling. However, the intertwining of planar and non-planar elements often leads to the network’s extraction of non-planar information mixed with planar features, thereby affecting the final segmentation accuracy. Moreover, there are significant scale differences in the planes present in indoor scenes, leading to pronounced class imbalances, where small-scale plane instances are prone to distortion. To address these challenges, this paper proposed a self-enhanced attention-based multi-scale feature fusion network for 3D plane segmentation reconstruction. This network can automatically learn planar features in the scene and effectively fuse feature information from different scales, thereby enhancing the accuracy of plane instance segmentation. At the same time, by assigning different weights to each pixel in the plane instance, particularly increasing the weight values for small-scale plane edge pixels, the channel representation of small-scale plane segmentation objects was further enhanced. Finally, a new loss function was constructed using balanced cross-entropy loss and dice loss to train the model, further improving the accuracy of plane segmentation. Extensive experiments demonstrated that the proposed algorithm achieved significant improvements in plane recall rate and segmentation accuracy, resulting in more accurate indoor 3D segmented plane reconstruction models.
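
The loss described above can be illustrated with a minimal sketch that sums a class-balanced binary cross-entropy and a dice loss over one flattened mask. The balancing rule (positives weighted by the fraction of negatives), the smoothing constant, and the `dice_weight` parameter are assumptions for illustration; the abstract does not give the exact form used in the paper.

```python
import math


def balanced_bce(probs, labels, eps=1e-7):
    """Binary cross-entropy with positives weighted by the negative fraction."""
    n = len(labels)
    beta = 1.0 - sum(labels) / n  # assumed weighting: rare class gets more weight
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= beta * y * math.log(p) + (1.0 - beta) * (1 - y) * math.log(1.0 - p)
    return total / n


def dice_loss(probs, labels, smooth=1.0):
    """One minus the smoothed dice overlap between prediction and mask."""
    inter = sum(p * y for p, y in zip(probs, labels))
    return 1.0 - (2.0 * inter + smooth) / (sum(probs) + sum(labels) + smooth)


def combined_loss(probs, labels, dice_weight=1.0):
    """Hypothetical combination: balanced cross-entropy plus weighted dice."""
    return balanced_bce(probs, labels) + dice_weight * dice_loss(probs, labels)
```

The cross-entropy term penalizes each pixel individually, while the dice term scores the overlap of the whole region, which is why the two are often paired when small regions are heavily outnumbered.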

Orthogonal fusion image descriptor based on global attention
AI Liefu, TAO Yong, JIANG Changyu
2024, 45(3): 472-481.  DOI: 10.11996/JG.j.2095-302X.2024030472

Image descriptors are important research objects in computer vision tasks and are widely applied to the fields of image classification, segmentation, recognition, and retrieval. The deep image descriptor lacks the correlation between the high-dimensional feature space and channel information in the local feature extraction branch, resulting in insufficient information for local feature expression. Therefore, an image descriptor combining local and global features was proposed. The multi-scale feature map was extracted through dilated convolution in the local feature extraction branch. After the output features were spliced, the relevant channel-space information was captured through a global attention mechanism with a multilayer perceptron. Then the final local features were output after processing. The high-dimensional global branches generated global feature vectors through global pooling and full convolution. The components of the local features orthogonal to the global feature vector were extracted and then concatenated with the global features to form the final descriptor. At the same time, the robustness of the model on large-scale datasets was enhanced by employing the angular domain loss function containing the sub-class center. The experimental results on the publicly available datasets ROxford5k and RParis6k demonstrated that in medium and hard modes, the average retrieval accuracy of this descriptor reached 81.87% and 59.74%, and 91.61% and 79.12%, respectively. This represented an improvement of 1.70% and 1.56%, and 2.00% and 1.83% compared to that of deep orthogonal fusion descriptors. It exhibited superior retrieval accuracy over other image descriptors.
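
The orthogonal fusion step can be sketched for a single local feature vector: the part of the local feature that lies along the global feature is subtracted, and the remainder is concatenated with the global feature. This is a minimal sketch assuming plain vector projection; the function names are hypothetical, and the paper's network operates on feature maps rather than on single vectors.

```python
def orthogonal_component(local, global_vec):
    """Return the part of `local` that is orthogonal to `global_vec`."""
    norm_sq = sum(g * g for g in global_vec)
    if norm_sq == 0:
        raise ValueError("global feature vector must be non-zero")
    # Scalar projection coefficient of `local` onto `global_vec`.
    scale = sum(l * g for l, g in zip(local, global_vec)) / norm_sq
    return [l - scale * g for l, g in zip(local, global_vec)]


def fuse(local, global_vec):
    """Concatenate the orthogonal local component with the global feature."""
    return orthogonal_component(local, global_vec) + list(global_vec)
```

Removing the projection keeps only the local information that the global feature does not already carry, so the concatenated descriptor avoids storing the same direction twice.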

A highly robust image segmentation algorithm based on trade-off factors and multidimensional spatial metrics
LIU Yi, QIU Junhai, ZHANG Jiaxing, ZHANG Xiaofeng, WANG Hua, ZHANG Caiming
2024, 45(3): 482-494.  DOI: 10.11996/JG.j.2095-302X.2024030482

Image segmentation is an important research direction in computer vision. Clustering algorithms, serving as an unsupervised method, have always been a powerful tool for image segmentation. However, in scenarios where images possess high-intensity noise and complex structures, the segmentation effect of clustering algorithms might prove unsatisfactory. To address this problem, a highly robust image segmentation algorithm was proposed based on trade-off factors and multi-dimensional space metrics. Firstly, a trade-off factor was introduced to effectively reduce the influence of noise on the segmentation result by adjusting the factor. Secondly, the algorithm integrated both low-dimensional and high-dimensional space metrics, enabling the capture of linear and nonlinear features in the image. In this way, the algorithm facilitated a more comprehensive understanding of the complex structure and texture in the image, thereby enhancing the accuracy and robustness of segmentation. Finally, the algorithm achieved image segmentation through the application of an enhanced fuzzy clustering algorithm. To verify the performance of the algorithm, extensive experiments were conducted on synthetic, natural, and medical images, and the results demonstrated that the proposed method significantly outperformed other algorithms in terms of segmentation.

Self-supervised active label cleaning
LIN Xiao, ZHANG Qiuyang, ZHENG Xiaomei, YANG Qizhe
2024, 45(3): 495-504.  DOI: 10.11996/JG.j.2095-302X.2024030495

Active label cleaning utilizes the active learning method for label noise processing to lower the cost of manual annotation. However, the existing active label cleaning methods still suffer from high cost of extra manual annotation, particularly due to a high proportion of correctly labeled samples among the selected suspicious ones. To address this problem, a self-supervised active label cleaning method based on core-set was proposed. Firstly, self-supervised tasks were employed for representation learning of all samples, followed by mapping the samples to a feature space. Suspicious samples were then identified using a greedy K-Center set covering method, and label noise samples were selected for re-labeling based on uncertainty. By considering both the representativeness and uncertainty of samples, this method could effectively lower the proportion of correct samples in suspicious ones. Experimental results on public datasets with varying proportions of label noise demonstrated that the proposed method could significantly reduce the cost of extra manual annotation in each iteration, while also mitigating the cold start problem to some extent. Additionally, the effectiveness of the self-supervised core-set sampling module and the uncertainty prediction module in this method was validated through ablation experiments.
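
The greedy K-Center selection mentioned above can be sketched as follows. This is the textbook farthest-point heuristic applied to points in a feature space, not the paper's full pipeline; the function name, the Euclidean distance, and the choice of the first center are assumptions.

```python
import math


def greedy_k_center(points, k, first=0):
    """Pick k point indices by repeatedly taking the point farthest from
    all centers chosen so far (farthest-point heuristic)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centers = [first]
    # Distance from every point to its nearest chosen center.
    nearest = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        idx = max(range(len(points)), key=nearest.__getitem__)
        centers.append(idx)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[idx]))
    return centers
```

Each pick covers the currently worst-covered region of the feature space, which is what makes the selected subset representative of the whole sample set.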

Computer Graphics and Virtual Reality
A path planning for cultural tourism service robot combining improved A* algorithm and improved dynamic window approach
JIA Mingchao, FENG Bin, WU Peng, ZHANG Kun, SANG Shengju
2024, 45(3): 505-515.  DOI: 10.11996/JG.j.2095-302X.2024030505

To meet the needs for the guidance of algorithm search in a complex environment, the optimality of global path in a static environment, and the security of real-time obstacle avoidance in a dynamic environment for path planning of cultural and tourism service robots, an algorithm based on the fusion of the improved A* algorithm and the dynamic window approach was proposed. Firstly, based on the traditional A* algorithm, a more accurate search neighborhood selection strategy was adopted, the obstacle occupation grid rate was introduced to quantify map information, and the heuristic function and weight coefficients were dynamically adjusted. Secondly, the concept of safe distance was introduced, and a cubic polyline optimization method was proposed to eliminate redundant nodes and inflection points, enhancing the smoothness of the path. For the narrow channel environment, an adaptive arc optimization method was proposed to make the path more consistent with the kinematic constraints of the robot. The integration of the vertical distance cost function of dynamic obstacles effectively reduced the risk of conflicts and collisions between the robot and dynamic obstacles. Finally, the improved A* algorithm was integrated with the dynamic window method, selecting the key path point as the temporary target point for the dynamic window method, and applying the dynamic window method segmentally for local real-time path correction. Experimental results demonstrated that this fusion algorithm has search guidance, global path optimality and dynamic obstacle avoidance capabilities, and can reach the target point safely and quickly, thus offering certain application value.
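
For orientation, the traditional A* search that the improved algorithm starts from can be sketched on an occupancy grid. This is a minimal baseline assuming 4-connected moves of unit cost and a Manhattan heuristic; it includes none of the paper's improvements (neighborhood selection, obstacle occupation rate, dynamic weights, path smoothing), and the function name is hypothetical.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) and 1 (obstacle) cells.

    Returns the list of (row, col) cells from start to goal, or None.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance never overestimates the cost with unit moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    g_cost = {start: 0}
    came_from = {}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if g > g_cost[cell]:
            continue  # stale heap entry; a cheaper route was found already
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = ng
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None
```

In a fusion scheme such as the one described, key points on a global path like this would serve as temporary targets for a local planner.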

Lightweight human pose estimation algorithm combined with coordinate Transformer
HUANG Youwen, LIN Zhiqin, ZHANG Jin, CHEN Junkuan
2024, 45(3): 516-527.  DOI: 10.11996/JG.j.2095-302X.2024030516

Addressing issues such as large model size, high computational costs, and limited compatibility with edge devices in most existing bottom-up human pose estimation algorithms, this study proposed a lightweight multi-person pose estimation network model named YOLOv5s6-Pose-CT based on YOLOv5s6-Pose. In order to reduce feature redundancy across both spatial and channel dimensions, the network model introduced spatial and channel reconstruction convolution in the neck network. Simultaneously, a coordinate Transformer was incorporated into the backbone network to enhance long-distance dependence while maintaining efficient local feature extraction ability. Furthermore, unbiased feature position alignment was employed to resolve feature dislocation during multi-scale fusion. Finally, this study redefined the regression loss of bounding boxes using the MPDIoU (minimum point distance-based IoU) loss function. Experimental results on the COCO 2017 dataset demonstrated that compared with EfficientHRNet-H1 (a mainstream lightweight network), our optimized network model reduced parameters by 16.2% and computation by 66.1%, respectively, while maintaining comparable accuracy levels. Moreover, compared with the baseline approach, our proposed model achieved parameter and computation reductions of 11.2% and 5.8%, respectively, along with improvements of 2.5% in average detection accuracy and 2.6% in recall rate.
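
The MPDIoU term can be illustrated with a minimal sketch. It assumes the commonly cited form of the metric, in which the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalized by the squared diagonal of the input image, are subtracted from the plain IoU. The box format (x1, y1, x2, y2) and the function names are assumptions, and the paper's exact formulation may differ.

```python
def mpdiou(pred, gt, img_w, img_h):
    """MPDIoU of two boxes given as (x1, y1, x2, y2) in image coordinates."""
    # Plain IoU from the overlap rectangle.
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    union = area_p + area_g - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distances between matching corners.
    d1_sq = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d2_sq = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    diag_sq = img_w ** 2 + img_h ** 2
    return iou - d1_sq / diag_sq - d2_sq / diag_sq


def mpdiou_loss(pred, gt, img_w, img_h):
    """Regression loss: zero when the boxes coincide."""
    return 1.0 - mpdiou(pred, gt, img_w, img_h)
```

Unlike plain IoU, the corner-distance terms still give a useful signal when the predicted and ground-truth boxes do not overlap.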

Research on the influence of eye movement interaction frequency on visual fatigue in virtual reality
YAN Jiahao, LV Jian, HOU Yukang, MO Xinzhu
2024, 45(3): 528-538.  DOI: 10.11996/JG.j.2095-302X.2024030528

To investigate whether the frequency of eye movement interaction in virtual reality (VR) causes visual fatigue in users, 25 participants were selected for a visual fatigue experiment in a VR room. Eight interaction frequencies were set within the range of 0.2 to 1.6 Hz, and each participant underwent a 20-minute experiment at each frequency. Data on pupil diameter and blink frequency were captured using the built-in eye tracker in the head-mounted display (HMD) and were then subjected to linear interpolation and noise reduction processing. Firstly, subjective evaluations of the participants were recorded using a five-level fatigue scale, and the relative change rate in pupil diameter was utilized to reflect the degree of fatigue. Then, Spearman correlation analysis was employed to explore the relationship between subjective comfort ratings and blink frequency. Additionally, Kruskal-Wallis tests were conducted to analyze the relationship between blink frequency and interaction frequency. Finally, the proposed method was validated using the FAST disassembly and assembly robot digital twin system in VR. The experimental results demonstrated that the pupil diameter variation rate was minimal at 0.6 Hz, ranging from -1.86% to 2.26%, indicating a relatively comfortable interaction frequency. The highest blink frequency was 52 blinks per minute, with the highest level of visual fatigue. Within the frequency range of 0.2 to 0.6 Hz, the participants’ visual fatigue increased as the interaction frequency decreased. However, within the range of 0.8 to 1.6 Hz, the visual fatigue increased with the increase in frequency, with the lowest blink frequency observed at 0.6 Hz. Thus, appropriate eye movement interaction frequencies can reduce the degree of visual fatigue.
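
Two of the preprocessing steps above, linear interpolation of missing eye-tracker samples and the relative change rate of pupil diameter, can be sketched as follows. The sketch assumes that missing samples (for example during blinks) are marked as `None` and that the change rate is measured against a baseline diameter; both conventions and the function names are assumptions, since the abstract does not define them.

```python
def interpolate_gaps(samples):
    """Fill None entries linearly between the nearest valid neighbours.

    Gaps at either end are filled with the nearest valid value.
    """
    valid = [i for i, v in enumerate(samples) if v is not None]
    if not valid:
        raise ValueError("no valid samples to interpolate from")
    filled = list(samples)
    for i, v in enumerate(samples):
        if v is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            filled[i] = samples[right]
        elif right is None:
            filled[i] = samples[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = samples[left] + t * (samples[right] - samples[left])
    return filled


def relative_change_rate(baseline, value):
    """Percentage change of `value` relative to `baseline` (assumed definition)."""
    return (value - baseline) / baseline * 100.0
```

Under this definition, a reported range such as -1.86% to 2.26% would mean the pupil diameter stayed within roughly two percent of its baseline.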

Unsupervised clothing animation prediction for different styles
SHI Min, ZHUO Xinru, SUN Bilian, HAN Guoqing, ZHU Dengming
2024, 45(3): 539-547.  DOI: 10.11996/JG.j.2095-302X.2024030539

Dressing animation generation is a key technology in 3D animation, with clothing deformation as its core, which has been a focal point of research in this field. Existing clothing deformation methods mostly focus on a single clothing style for research, requiring retraining for style changes, thus consuming time and increasing computational costs. Additionally, most current methods rely on supervised approaches for network training, necessitating extensive data preparation and training expenses. In light of these challenges, an unsupervised clothing animation generation method applicable to different styles was proposed. Firstly, a learnable style feature representation was introduced, capturing the probabilistic distribution model of style-constrained motion latent space. Secondly, an unsupervised clothing deformation prediction network was established with style constraints and grounded in an encoder-decoder architecture. Furthermore, a Transformer encoder-decoder layer was incorporated to extract temporal motion features. Finally, multiple style animation generation experiments were conducted, comparing the proposed method with existing methods in terms of visual effects and quantitative metrics. Experimental results demonstrated that the proposed method can generate visually plausible clothing animations with adjustable styles, outperforming existing methods in prediction accuracy and reducing penetration loss.

Depth completion with large holes based on structure-guided boundary propagation
ZHAO Sheng, WU Xiaoqun, LIU Xin
2024, 45(3): 548-557.  DOI: 10.11996/JG.j.2095-302X.2024030548

Depth information collected with consumer depth cameras is often affected by factors such as equipment, environment, and object material, which leads to missing depth values and holes and limits the application of depth images in subsequent vision tasks. Existing depth-completion algorithms often struggle to handle large missing areas, resulting in poor completion and poorly preserved object boundaries. To tackle these two problems, a depth-completion algorithm for large holes based on structure-guided boundary growth was proposed. First, combined with the boundary information provided by the RGB images, the structure-guided boundary growth strategy was employed to complete the missing depth at object boundaries. Then, the large holes inside the object were completed using a combination of large-hole cut-and-fill and mean filtering. The experimental results demonstrated that the algorithm was able to efficiently maintain object boundaries with large missing areas and across missing objects, while also completing the depth information of large missing areas. Quantitative and qualitative results on multiple datasets demonstrated the effectiveness of the method.

Parametric custom design of insoles based on 3D foot shape and plantar pressure distribution
HUANG Yi, ZHU Zhaohua, WANG Jia, ZHOU Kexuan
2024, 45(3): 558-563.  DOI: 10.11996/JG.j.2095-302X.2024030558

To reduce peak plantar pressure, enhance the comfort of insole products, and prevent foot-related diseases, a process for designing parametric, customized, porous insoles was proposed. This process was driven by a three-dimensional foot model and plantar pressure distribution. First, a method was constructed for the automatic generation of highly adaptive insole models using the user’s three-dimensional foot model and relevant foot dimension parameters. Secondly, an image sampling algorithm was employed to establish a mapping point set for the distribution of plantar pressure on the user’s foot to the spatial density distribution in the insole. This served as the foundational data for generating spatial unit structures within the defined domain of the insole model through the Voronoi 3D tool. Finally, Voronoi skeletal lines adapted to the insole model’s boundary were extracted, and a porous insole mesh model was established by integrating a mesh generation algorithm. Dynamic and static plantar pressure experimental results indicated that, compared to traditional insoles, porous insoles designed based on foot shapes and pressure distribution could effectively reduce peak plantar pressure, increase the contact area on the plantar surface, and improve the symmetry of pressure distribution between the left and right feet.

Research on drive axle assembly sequence planning based on semantic process knowledge
WANG Gangfeng, ZHANG Huan, LIU Simeng, YUE Ping, ZHANG Dong
2024, 45(3): 564-574.  DOI: 10.11996/JG.j.2095-302X.2024030564

In response to the lack of knowledge-based reasoning and decision-making in the assembly sequence planning of construction machinery drive axles, which makes it difficult to realize intelligent assembly of complex products, a method for assembly sequence planning based on semantic process knowledge was proposed. The semantic process knowledge information model of the drive axle assembly was constructed to express the hierarchical structure information, attribute information, and assembly semantic information of sub-assemblies. By establishing an assembly sequence planning ontology and introducing Semantic Web Rule Language (SWRL) rules into the assembly ontology, the semantic representation of process knowledge and the reasoning and decision-making based on it were studied. The process knowledge graph of the drive axle was constructed using Neo4j, and the assembly sequence of typical structural sub-assemblies was quickly identified. By quantifying the influence of geometric properties, physical properties, and assembly process information of parts on assembly efficiency, the assembly weight sequence was generated, and the assembly sequence of atypical structure sub-assemblies was iteratively modified by SWRL rules. Finally, the feasibility of the proposed method was verified by generating and simulating the assembly sequence of the drive axle of a certain model of road roller, providing a reference for knowledge-driven complex product assembly sequence planning.

Complex shell structure analysis and optimization based on isogeometric analysis and simulated annealing algorithm
XUE Yutong, WANG Aizeng, YUE Yike, HE Chuan, ZHAO Gang
2024, 45(3): 575-584.  DOI: 10.11996/JG.j.2095-302X.2024030575

This paper proposed a shell structure analysis and optimization algorithm based on the isogeometric method and simulated annealing, aiming to enhance the efficiency of CAD and CAE integrated design for complex shell structures. Firstly, based on NURBS technology, the parametric modeling of the thin shell structure was carried out, and then the isogeometric analysis was realized based on the Kirchhoff-Love shell theory. Then, based on the simulated annealing algorithm, the geometric parameters of the shell served as the design variables, while multiple mechanical properties acted as the objective functions for optimization, thus realizing the integration of isogeometric analysis and intelligent optimization algorithm for shell structure analysis and optimization. Finally, the effectiveness of the algorithm was verified through two examples. Compared with traditional finite element methods, this algorithm offered advantages in precision and efficiency.
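
The optimization loop can be illustrated with a generic simulated annealing sketch. In the paper's setting, the design variables would be shell geometry parameters and the objective would come from isogeometric analysis; here the objective is any Python callable, and the step size, cooling schedule, and function name are assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(objective, x0, step=0.5, t0=1.0, cooling=0.995,
                        iters=5000, seed=0):
    """Minimize `objective` over a list of floats; return (best_x, best_f)."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    x, fx = list(x0), objective(x0)
    best, fbest = list(x), fx
    temp = t0
    for _ in range(iters):
        # Propose a random perturbation of every design variable.
        cand = [xi + rng.uniform(-step, step) for xi in x]
        fc = objective(cand)
        # Accept improvements always, and worse moves with Boltzmann probability.
        if fc <= fx or rng.random() < math.exp((fx - fc) / temp):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = list(x), fx
        temp *= cooling  # geometric cooling schedule
    return best, fbest
```

For example, `simulated_annealing(lambda v: (v[0] - 1) ** 2 + (v[1] + 2) ** 2, [5.0, 5.0])` searches a toy quadratic that stands in for a mechanical objective; occasional acceptance of worse designs at high temperature is what lets the search escape local minima.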

Deadlock determination for digital twin workshops based on Petri nets and Banker’s algorithm
YANG Yifeng, CHEN Yazhou, CHEN Yiming, LIN Xiaochuan, WANG Hongxing
2024, 45(3): 585-593.  DOI: 10.11996/JG.j.2095-302X.2024030585

Unreasonable resource allocation or process arrangement in workshop production can lead to deadlock, halting production and greatly reducing workshop efficiency. To address this problem, the theories of Petri nets and Banker’s algorithm were integrated, and the conditions for deadlock formation were classified into four kinds: mutual exclusion waiting, possession waiting, cyclic waiting, and inalienability. On the basis of these four conditions, deadlocks were classified into four manifestations, namely resource allocation deadlock, process order deadlock, collaborative object deadlock, and dynamic resource deadlock. The existence of deadlocks was determined using Banker’s algorithm, their specific location in the workshop was determined using the time accessibility analysis method, and deadlock recovery strategies were established for the different forms of deadlock. The proposed method was integrated into the workshop digital twin system using software such as Tina and Unity 3D, thus achieving workshop process deadlock monitoring and prediction. Finally, the method was verified using the production process of precision stamping parts in a workshop as an example, and the results demonstrated that it could effectively achieve real-time monitoring and efficient prediction of the production process.
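
The deadlock check can be illustrated with the safety test at the core of Banker’s algorithm. This is the textbook version over generic resource vectors, a minimal sketch rather than the paper's workshop model; the function name and the mapping of processes to production tasks are assumptions.

```python
def safe_sequence(available, allocation, need):
    """Return an order in which all processes can finish, or None if unsafe.

    available:  free units of each resource type
    allocation: units currently held by each process
    need:       units each process may still request
    """
    work = list(available)
    finished = [False] * len(allocation)
    order = []
    progressed = True
    while progressed:
        progressed = False
        for i, (alloc, nd) in enumerate(zip(allocation, need)):
            if not finished[i] and all(n <= w for n, w in zip(nd, work)):
                # Process i can run to completion and release what it holds.
                work = [w + a for w, a in zip(work, alloc)]
                finished[i] = True
                order.append(i)
                progressed = True
    return order if all(finished) else None
```

A state with no safe sequence is one from which deadlock cannot be ruled out, so a monitoring system can flag it before the corresponding allocation is granted.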

BIM/CIM
A new interaction paradigm for building design driven by large language model: proof of concept with Rhino7
JIANG Can, ZHENG Zhe, LIANG Xiong, LIN Jiarui, MA Zhiliang, LU Xinzheng
2024, 45(3): 594-600.  DOI: 10.11996/JG.j.2095-302X.2024030594

As society places higher demands on the quality of building designs, design software has become more specialized and complicated. Current design software not only incurs high learning costs but also features complex interaction modes. Recent breakthroughs in large language models (LLMs) have enabled computers to comprehend instructions given in natural language and to generate code accurately, which is expected to provide new ideas for how humans interact with software. Therefore, this study designed a new paradigm of interactive building design driven by LLMs, i.e., shifting from designers interacting with the design software through multiple keyboard and mouse operations to LLMs writing scripts that invoke APIs according to architects’ instructions. A methodology was proposed and its feasibility in building design was validated. The methodology included three steps: ① the LLM retrieved task-related APIs from the API set according to user instructions; ② the LLM wrote a program script based on the instructions and the abstracts of the candidate APIs, and ran it; ③ the LLM revised the script based on feedback from the environment, users, etc. To validate the capabilities of current LLMs in executing the key steps of the methodology, multiple design tasks were completed with the Rhino7 design software, GPT-4, and CodeLlaMa. The results not only demonstrated that the LLM-driven interactive design paradigm held initial prospects for implementation in building design, but also provided experience and suggestions for its implementation. This design paradigm could lower the entry threshold and learning costs, improve efficiency in many scenarios, and was expected to play a key role in future building design software.
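
The three steps of this methodology can be sketched as a retrieve, write, and revise loop. Everything below is an assumption made for illustration: the API names (`add_box`, `add_sphere`, `delete_object`) are hypothetical and are not Rhino7 functions, retrieval is approximated by simple word overlap rather than an LLM, and the script writer and execution environment are stub callables rather than calls to GPT-4 or CodeLlaMa.

```python
from typing import Callable, Dict, List, Optional, Tuple


def retrieve_apis(instruction: str, api_abstracts: Dict[str, str],
                  top_k: int = 3) -> List[str]:
    """Step 1 stand-in: rank APIs by word overlap with the instruction."""
    words = set(instruction.lower().split())
    scored = [(len(words & set(text.lower().split())), name)
              for name, text in api_abstracts.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_k] if score > 0]


def design_loop(instruction: str,
                api_abstracts: Dict[str, str],
                write_script: Callable[[str, List[str], Optional[str]], str],
                run_script: Callable[[str], Tuple[bool, str]],
                max_rounds: int = 3) -> Tuple[str, bool]:
    """Steps 2 and 3: write a script, run it, revise it using feedback."""
    candidates = retrieve_apis(instruction, api_abstracts)
    feedback: Optional[str] = None
    script = ""
    for _ in range(max_rounds):
        script = write_script(instruction, candidates, feedback)
        ok, feedback = run_script(script)
        if ok:
            return script, True
    return script, False


# Hypothetical API abstracts (not real Rhino7 APIs).
API_ABSTRACTS = {
    "add_box": "create a box from width depth height",
    "add_sphere": "create a sphere from center radius",
    "delete_object": "remove an object by id",
}


def fake_llm(instruction, candidates, feedback):
    # Stub writer: makes a typo on the first attempt, fixes it after feedback.
    if feedback is None:
        return "add_bx(2, 2, 2)"
    return f"{candidates[0]}(2, 2, 2)"


def fake_env(script):
    # Stub environment: only accepts calls to the hypothetical add_box API.
    if script.startswith("add_box("):
        return True, "ok"
    return False, "NameError: unknown function"


final_script, succeeded = design_loop(
    "create a box with width 2", API_ABSTRACTS, fake_llm, fake_env
)
```

Passing the writer and the environment in as callables keeps the loop independent of any particular model or design tool, so each step can be swapped or tested separately.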

An intelligent railway operation and maintenance management approach based on BIM and semantic web
HE Qing, JING Chuanyu, SUN Huakun, YAO Li, XU Jingmang, WANG Ping
2024, 45(3): 601-612.  DOI: 10.11996/JG.j.2095-302X.2024030601

Building information modeling (BIM) technology plays a crucial role in enhancing the efficiency of railway operation and maintenance management. However, the heterogeneity of data generated by various inspection and maintenance activities, coupled with complex spatiotemporal relationships, hinders BIM data interpretation and integration. To address this challenge, a railway maintenance ontology (TOMO) based on the industry foundation classes (IFC) and semantic Web technology was developed. TOMO served three main functions: ① simplifying BIM model information according to railway maintenance lifecycle requirements; ② introducing mapping rules and establishing data extraction and transformation modules to integrate heterogeneous data from multiple sources and to structurally define the complex spatiotemporal relationships among the data; ③ combining data-driven techniques to study intelligent optimization methods for railway fine-tuning, providing flexible decision support. Finally, the effectiveness and practicality of the framework were verified using static inspection data from a high-speed railway as an example. The framework held practical engineering significance in promoting data interoperability in the field, reducing the labor intensity of maintenance personnel, and enhancing the intelligence of maintenance management.

Industrial Design
Effects of text visual density and interface color hue on digital health education for the elderly
PENG Cheng, YI Minzhe, WU Qun, YU Hangyong
2024, 45(3): 613-623.  DOI: 10.11996/JG.j.2095-302X.2024030613

Digital health education has contributed greatly to improving people's health literacy through its rich content and flexible education methods. However, owing to declines in cognitive and visual ability, the elderly often have difficulty identifying and understanding information during digital education. In this paper, the reading interface of a digital health education platform served as the research object for analyzing the impact of interface visual design on the reading performance and cognitive load of the elderly. The independent variables were text visual density (normal density, low density) and interface color hue (cool color, warm color), and the dependent variable was reading performance. The results of the 2×2 two-factor experiment indicated that: ① reading comprehension was better under low text density than under normal density; ② reading comprehension was better with cool color than with warm color; ③ there was a significant interaction between text visual density and interface color hue on reading time. Cool color resulted in shorter reading time when paired with normal density, while warm color led to shorter reading time when paired with low density. This study analyzed interface design from the perspective of communicating health education content, enriching the data on the digital reading experience of the elderly. It discussed the effects and interaction of text visual density and color hue in the interface, and provided specific design suggestions for enhancing the effectiveness of digital education for the elderly.

The regularity of cognitive response to image perception based on EEG information and its quantitative analysis method
YI Peng, TIAN Xinghui, LIU Guangdou, WEI Qingbing, WANG Shuai, LIU Yancong
2024, 45(3): 624-630.  DOI: 10.11996/JG.j.2095-302X.2024030624

“Images” serve as an important medium for human perception of the world, and image perception is a key channel for the effective conversion between planar elements and three-dimensional forms in cognitive thinking. However, the process by which the human brain accepts image input, completes cognitive processing, and produces knowledge output lies at the intersection of cognitive psychology and brain science, and currently lacks sufficient research methods and theoretical basis. Therefore, in view of the unclear cognitive process of image perception and the lack of cognitive analysis methods, EEG cognitive test experiments were designed. Based on event-related potential theory, these experiments analyzed changes in the P300 potential, using two-dimensional projection drawings and three-dimensional models as the cognitive objects of image perception. The results demonstrated that the brain regions involved in image perception were mainly located in the left frontal lobe. The P300 potential value could be employed to reflect the degree of the brain’s cognition in image perception: the stronger the brain’s ability to accept the image content, the lower the P300 peak value. By comparing the P300 potential changes of different samples, the cognitive process of image perception could be analyzed to form a quantitative analysis method for the cognitive ability of image perception. This could improve the efficiency with which the cognitive content of image perception is input to the brain and shed light on the input and cognitive mechanisms of image perception, providing an effective basis for applying feedback optimization to in-depth image interaction.

Published as
Published as 3, 2024
2024, 45(3): 631. 