Journal of Graphics

Table of Contents

    Cover
    Cover of issue 4, 2023
    2023, 44(4): 0. 
    Contents
    Table of Contents for Issue 4, 2023
    2023, 44(4): 1. 
    Review
    A survey of video human action recognition based on deep learning
    BI Chun-yan, LIU Yue
    2023, 44(4): 625-639.  DOI: 10.11996/JG.j.2095-302X.2023040625

    With the rapid advancement of network multimedia technology and the continuous improvement of video capture equipment, an increasing number of videos are shared on network platforms and have gradually become an integral part of daily life. Consequently, video understanding has become a hot spot of computer vision research, with action recognition being one of its pivotal tasks. Deep learning-based methods for 2D image recognition and classification have made significant strides, yet video action recognition remains a formidable challenge. Videos differ from 2D images by an additional temporal dimension: understanding actions such as walking, running, high jumping, and long jumping requires not only the spatial semantic information that 2D images carry but also temporal information. Effectively exploiting the temporal information of videos is therefore critical for action recognition. This paper first introduced the research background and development of action recognition and then analyzed the current challenges in video action recognition. Methods of temporal modeling and parameter optimization were presented in detail, along with the commonly used action recognition datasets and evaluation metrics. Finally, the paper outlined future research directions in this field.

    Survey of methods for scene analysis and content processing in panoramic images and videos
    XIE Hong-xia, HU Yu-ning, ZHANG Yun, WANG Ya-qi, DU Hui, QIN Ai-hong
    2023, 44(4): 640-657.  DOI: 10.11996/JG.j.2095-302X.2023040640

    In recent years, the rapid development of software and hardware for acquiring and interacting with panoramic content has led to a significant increase in the number of panoramic images and videos. Immersive media built on 360-degree panoramic images and videos has been widely used in virtual reality and augmented reality. Compared with traditional 2D images and videos, panoramic images and videos offer users a new immersive experience: with wearable devices, users can freely view the content from any perspective through head movement. Although the volume of panoramic images and videos has soared, satisfactory results remain hard to obtain because panoramic content is difficult to capture and effective editing tools are lacking. High-quality analysis and processing of panoramic content has therefore become an increasingly important research topic in virtual reality, and it faces significant challenges in both theory and application. Yet the existing literature lacks a systematic and comprehensive summary of the key issues in this field. To better promote research and application in this area, a survey was provided of recent work on scene analysis and content processing for panoramic images and videos. For panoramic scene analysis, the survey reviewed research on deep learning networks, depth recovery, saliency detection, and object detection for panoramic images and videos. For panoramic content processing, it analyzed research on interactive browsing, stabilization and correction, and content editing of panoramic images and videos. Finally, the survey was summarized, with an outlook on future research trends in scene analysis and content processing of panoramic images and videos under the stereo view.

    Small object detection algorithm in UAV image based on feature fusion and attention mechanism
    LI Li-xia, WANG Xin, WANG Jun, ZHANG You-yuan
    2023, 44(4): 658-666.  DOI: 10.11996/JG.j.2095-302X.2023040658

    Detecting small objects in UAV aerial images is challenging because the objects are tiny and carry little feature information. To address this, a multi-head attention mechanism was incorporated into the YOLOv5 backbone network to integrate global feature information. As network depth increased, the model tended to emphasize high-level semantic information at the expense of the low-level detailed texture features that are vital for detecting small objects. A shallow feature enhancement module was therefore devised to capture low-level feature information and strengthen small-object features. Furthermore, a multi-level feature fusion module was developed to combine feature information from different layers, enabling the network to dynamically adjust the weights of each output detection layer. Experimental results on the publicly available VisDrone2021 dataset demonstrated that the mean average precision of the proposed algorithm reached 45.7%, a 3.1% improvement over the baseline YOLOv5 algorithm. In addition, the proposed algorithm achieved a detection speed of 41 frames per second on high-resolution images, meeting the requirement for real-time performance while noticeably improving detection accuracy over other prevalent methods.

    Multi-class defect target detection method for transmission lines based on TR-YOLOv5
    HAO Shuai, ZHAO Xin-sheng, MA Xu, ZHANG Xu, HE Tian, HOU Li-xiang
    2023, 44(4): 667-676.  DOI: 10.11996/JG.j.2095-302X.2023040667

    To address the multi-scale detection of multi-class defect targets on transmission lines in complex environments, a YOLOv5-based multi-class defect detection algorithm built on Transformer and receptive field modules, abbreviated as TR-YOLOv5, was proposed. First, to counter the low saliency of defect targets against complex backgrounds, which hinders accurate detection, a Transformer module was introduced into the Backbone of the YOLOv5 network. Its multi-head attention structure captured the correlations and global information between the pixels of feature maps, enhancing the feature expression of defect targets and thereby improving the detection accuracy of the model. Secondly, because the targets to be detected vary in scale, a receptive field module was introduced into the Neck to extract target features at different scales. Dilated (atrous) convolution was employed to enlarge the receptive field while preserving more detailed features for the subsequent PANet structure, which strengthened the feature fusion capability of the Neck and improved the detection accuracy for multi-scale defect targets. In addition, the CIoU loss function was introduced to make bounding-box regression more precise and further boost detection accuracy. Finally, the proposed algorithm was validated on three years of data from a power inspection department. The experimental results demonstrated that the proposed algorithm surpassed seven comparative algorithms in both detection accuracy and real-time performance, with an average detection accuracy of 95.6% and a detection speed of 125 frames/second on 1280×720 inspection images.
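
    The CIoU term mentioned in this abstract can be illustrated with a minimal sketch. It follows the commonly published CIoU definition (IoU, minus a normalized centre-distance penalty, minus an aspect-ratio penalty); the function name `ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` guard are assumptions made for illustration and are not taken from the TR-YOLOv5 paper.

```python
import math

def ciou_loss(pred, target, eps=1e-9):
    """Return 1 - CIoU for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    # Widths and heights of both boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1
    # Plain intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)
    # Squared distance between the two box centres.
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return 1 - (iou - rho2 / c2 - alpha * v)

same = ciou_loss((0, 0, 2, 2), (0, 0, 2, 2))      # perfectly aligned boxes
apart = ciou_loss((0, 0, 1, 1), (2, 2, 3, 3))     # disjoint boxes
```

    Unlike plain IoU, the loss keeps growing as disjoint boxes move further apart, which is why it still gives a useful regression signal when the predicted box does not overlap the target.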

    Image Processing and Computer Vision
    A real-time metallic surface defect detection algorithm based on E-YOLOX
    CAO Yi-qin, ZHOU Yi-wei, XU Lu
    2023, 44(4): 677-690.  DOI: 10.11996/JG.j.2095-302X.2023040677

    For metallic surface defect detection, a novel algorithm, E-YOLOX, was proposed to address the shortcomings of current methods, such as poor generalization ability and low detection speed. The algorithm used a new feature extraction network, ECMNet, which employed depthwise convolutions to reduce the parameters and computational cost of the network. An inverted residual structure with linear bottlenecks was used to enhance feature extraction while preserving more of the key features that lie on manifolds in low-dimensional subspaces of high-dimensional tensors during forward propagation. Additionally, the extended cross-stage partial network structure diversified the gradient flow paths, helping deep neural networks learn and converge more efficiently. Moreover, a new data augmentation method, edge Cutout, was proposed, which generated adaptive masks covering random regions of the image during training, enhancing the detection and generalization ability of the network. The experimental results demonstrated that E-YOLOX-l achieved 77.2% mAP on the aluminum profile surface defect dataset AL6-DET and 36.8% mAP on the steel surface defect dataset GC10-DET, which were 3.6% and 1.7% higher, respectively, than the baseline algorithm YOLOX-l. At the same time, the number of parameters was reduced by 55% and the computational cost by 49%. The detection speed was 57 FPS, an increase of 21 FPS. Compared with other related algorithms, the new algorithm achieved higher detection accuracy and a better balance between accuracy and speed.
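
    As a rough illustration of why depthwise convolutions cut parameters, the sketch below compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1×1 pointwise convolution. It assumes the abstract's convolutions are of the depthwise-separable kind; the layer sizes (128 input channels, 256 output channels, 3×3 kernel) and function names are hypothetical and do not describe ECMNet itself.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out

# Hypothetical layer: 128 -> 256 channels with a 3 x 3 kernel.
std = standard_conv_params(128, 256, 3)
dws = depthwise_separable_params(128, 256, 3)
ratio = dws / std  # fraction of the standard layer's weights that remain
```

    For this single hypothetical layer the separable form keeps roughly an eighth of the weights; the 55% figure in the abstract refers to the whole network, where not every layer is replaced.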

    Text detection method for electrical equipment nameplates based on deep learning
    WANG Dao-lei, KANG Bo, ZHU Rui
    2023, 44(4): 691-698.  DOI: 10.11996/JG.j.2095-302X.2023040691

    Prompt detection of power equipment nameplates helps substations and power plants efficiently obtain device information and carry out necessary maintenance, thus ensuring proper operation. This paper addressed the problem of improving text detection efficiency while also improving precision. To that end, the convolutional block attention module (CBAM) was introduced into DBNet, and the detection head was improved. A multi-scale feature pyramid network (FPN) structure was introduced into the backbone network, improving upon the original FPN. Meanwhile, because public data for power equipment nameplates is lacking and difficult to obtain, a data augmentation technique was proposed that cuts nameplate images into rectangles and splices them together into new images, effectively expanding the dataset. The experimental results showed that both the data augmentation method and the improved DBNet structure proposed in this paper improved detection performance, surpassing most current text detection networks. The improved DBNet combined with the data augmentation method yielded a precision of 90.3% and a recall of 79.7%. The F-measure rose to about 84.7%, a 3.3% improvement over the original model, indicating that detection performance was greatly improved with only a minimal loss of detection speed.
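
    The reported F-measure can be cross-checked against the reported precision and recall, assuming the usual F1 definition (the harmonic mean of the two); the abstract does not state which F-measure variant was used, so this is a consistency check rather than the authors' evaluation code.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)

# Values quoted in the abstract, expressed as fractions.
f1 = f_measure(0.903, 0.797)
print(f"{f1:.3f}")  # prints 0.847
```

    The result agrees with the roughly 84.7% quoted above, so the three figures are mutually consistent under the F1 assumption.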

    Content semantics and style features match consistent artistic style transfer
    LI Xin, PU Yuan-yuan, ZHAO Zheng-peng, XU Dan, QIAN Wen-hua
    2023, 44(4): 699-709.  DOI: 10.11996/JG.j.2095-302X.2023040699

    The development of computer vision has rendered image style transfer a challenging and valuable subject of research. Nonetheless, existing methods are unable to effectively preserve object contours of content images while migrating many different style features with the same content semantics. In response, an artistic style transfer network, with consistent matching of content semantics and style features, was proposed. First, a two-branch feature processing module was employed to enhance the style and content features and retain the object contours of content images. Subsequently, feature distribution alignment and fusion were achieved within the attentional feature space. Finally, an interpolation module with spatial perception capability was utilized to achieve style consistency of content semantics. The network was trained with 82 783 actual photos and 80 095 artistic portraits for style transfer. Furthermore, 1 000 actual photos and 1 000 artistic portraits were used for testing. The effectiveness of the proposed framework and the added loss function was verified through experiments, which included comparing it with the latest four style transfer methods and conducting ablation experiments, respectively. The experimental results demonstrated that the proposed network could run at an average time of 9.42 ms in 256-pixel image generation and 10.23 ms in 512-pixel image generation, while avoiding distortion of content structure and matching content semantics and style features consistently, with better artistic visual effects.

    Landscape image generation based on conditional residual generative adversarial network
    SHAO Jun-qi, QIAN Wen-hua, XU Qi-hao
    2023, 44(4): 710-717.  DOI: 10.11996/JG.j.2095-302X.2023040710

    The semantic segmentation map of a landscape image encompasses many categories of information, such as the sky, white clouds, mountains, rivers, and trees. Because the map contains so many categories and the color transitions between regions are subtle, the landscape images generated by current methods are deficient in both clarity and authenticity. Consequently, a method based on a conditional residual generative adversarial network (CRGAN) was proposed to generate landscape images with higher resolution and more realistic content. Firstly, the proposed method used upsampling and downsampling structures in the generator network to enhance the generator's feature extraction from the semantic segmentation map. Secondly, skip connections were used between the encoder and decoder to pass on feature information from the semantic segmentation map, ensuring that this information was retained rather than lost in the encoder. Finally, a residual module was added between the encoder and decoder of the network, facilitating better extraction, transmission, and retention of semantic information. In addition, the mean square error (MSE) was employed to enhance the similarity between semantic segmentation maps and generated images. The experimental results demonstrated that, compared with the pix2pix and CycleGAN methods, the FID score of images generated by CRGAN improved by 26.769 and 119.333, respectively, reflecting better clarity and authenticity of the generated landscape images. The universality and validity of CRGAN were also validated on a common dataset.

    Object detection for nameplate based on neural architecture search
    DENG Wei-ming, YANG Tie-jun, LI Chun-chun, HUANG Lin
    2023, 44(4): 718-727.  DOI: 10.11996/JG.j.2095-302X.2023040718

    In order to enhance the automation of building deep convolutional neural network (CNN) for object detection and further improve the detection accuracy, an improved DenseNAS-based neural architecture search method was proposed to automatically build a CNN for nameplate detection. First, the searchable subnet modules (CSP-Block1 and CSP-Block2) were designed to fuse deep and shallow layer feature mapping by enhancing the Head layer of DenseNAS. Subsequently, the search space was established based on the CSP-Block1 and CSP-Block2 to explore the Backbone and Head of CNN for nameplate detection. The experimental results demonstrated that the proposed method required about 9.35 GPU hours to search the optimal neural network on a nameplate dataset consisting of 5 classes, and that the detection accuracy mAP was about 97.3% on the test set, exceeding those of state-of-the-art methods, such as YOLOv5.

    Monocular depth estimation based on Laplacian pyramid with attention fusion
    YU Wei-qun, LIU Jia-tao, ZHANG Ya-ping
    2023, 44(4): 728-738.  DOI: 10.11996/JG.j.2095-302X.2023040728

    With the rapid development of deep neural networks, research on deep learning-based monocular depth estimation has centered on regressing depth through encoder-decoder structures and has yielded significant results. However, most existing methods simply repeat basic upsampling operations during decoding and fail to take full advantage of the encoder's features for monocular depth estimation. To address this problem, this study proposed a dense feature decoding structure combined with an attention mechanism. Taking a single RGB image as input, the feature map of each encoder level was fused into the corresponding branch of the Laplacian pyramid to make fuller use of the feature maps at every level. Attention mechanisms were introduced into the decoder to further enhance depth estimation. Finally, data loss and structural similarity loss were combined to improve the stability and convergence speed of model training and reduce its training cost. The experimental results on the KITTI dataset demonstrated that, relative to the advanced algorithm LapDepth, the root mean square error decreased by 4.8% and the training cost was reduced by 36%, with notable improvements in depth estimation accuracy and convergence speed.

    Image feature matching based on repeatability and specificity constraints
    GUO Yin-hong, WANG Li-chun, LI Shuang
    2023, 44(4): 739-746.  DOI: 10.11996/JG.j.2095-302X.2023040739

    Image feature matching determines whether a pair of pixels can be matched by comparing their distance in the feature space. Learning robust pixel features is therefore one of the primary concerns in deep learning-based image feature matching. The learning of pixel feature representations is also affected by the quality of the source image. To learn more robust pixel feature representations, the proposed method improved the image feature matching network LoFTR. For the coarse-grained feature reconstruction branch, a specificity constraint was defined to maximize the feature distance between pixels within the same image, making different pixels strongly distinguishable. A repeatability constraint was defined to minimize the feature distance between matched pixels from different images, making matched pixels strongly similar across images and thus improving matching accuracy. Additionally, an image reconstruction layer was incorporated into the decoding phase of the Backbone, and an image reconstruction loss was defined to constrain the encoder to learn more robust feature representations. The experimental results on the indoor dataset ScanNet and the outdoor dataset MegaDepth showed the effectiveness of the proposed method. Furthermore, experiments on images of different quality verified that the proposed method adapts better to image feature matching when the source images vary in quality.

    Research on real-time dense reconstruction for open road scene
    LI Xin-li, MAO Hao, WANG Wu, YANG Guo-tian
    2023, 44(4): 747-754.  DOI: 10.11996/JG.j.2095-302X.2023040747

    To tackle the inefficient and inaccurate mapping prevalent in intelligent driving, a two-stage dense mapping algorithm for outdoor open road scenes based on multi-sensor fusion was proposed. The algorithm comprised a real-time extrinsic parameter calibration module and a mapping module. The former constructed and optimized constraints based on typical semantic and geometric features in road scenes, achieving real-time online calibration of the extrinsic parameters between sensors. The core of the latter was a two-stage incremental mapping algorithm that performed incremental coarse mapping for the entire scene and fine mapping for the road surface area. Coarse mapping guaranteed the real-time performance of the algorithm, while fine mapping accurately restored road surface textures such as traffic markings. Experimental results in an outdoor open road scene demonstrated that the proposed algorithm could perform real-time dense mapping in large-scale outdoor scenes with high accuracy and efficiency.

    Computer Graphics and Virtual Reality
    Geometric feature guided multi-level segmentation for object point clouds
    LIU Yan, XIONG You-yi, HAN Miao-miao, YANG Long
    2023, 44(4): 755-763.  DOI: 10.11996/JG.j.2095-302X.2023040755

    The proliferation of scanning object point clouds has brought surface shape analysis to the forefront of research in the computer graphics community. Tasks such as structure extraction, shape editing, and human-object interaction necessitate as precise a point cloud part segmentation as possible. However, due to the complexities of shallow geometric feature extraction, shape loss, and noise interference, the part segmentation, especially the small instance part extraction for scanning point clouds, remains relatively difficult. To address this issue, a multi-level part instance segmentation method guided by geometric features was proposed for object point clouds. The concave, convex, and boundary features were extracted based on the local bending extent of the point cloud surface. Firstly, our method segmented the general structure along the most global prominent concave line. Then it subdivided those segments with obvious geometric differences according to the local shallow features. Finally, the method performed concave-convex collaborative segmentation for some segments to obtain the multi-level segmentation results. The three-stage segmentation process from coarse to fine, along the geometric features from deep to shallow, allowed for better consideration of parts with different scales and finer grained part instance segmentation. The experimental results demonstrated that the proposed method could achieve superior segmentation results on both scanning point clouds and sampled point clouds from CAD models. It provided a more precise and effective method for part segmentation of scanning object point clouds.

    3D low-poly mesh generation for building models
    YUE Ming-yu, GAO Xi-feng, BI Chong-ke
    2023, 44(4): 764-774.  DOI: 10.11996/JG.j.2095-302X.2023040764

    In 3D virtual scenes, low-poly meshes (meshes composed of few triangles) used in levels of detail play a pivotal role in improving the efficiency of real-time rendering. However, given detailed building models (high-poly meshes), the low-poly meshes generated by existing methods struggle to maintain good visual similarity with the high-poly meshes at extremely low simplification rates, so defects must be corrected manually. We proposed a novel method in which the user only needs to provide a few robust parameters to generate low-poly building meshes that have good visual similarity and satisfy desirable geometric properties such as watertightness and two-manifoldness. Firstly, new meshes were morphed by inverse rendering to capture the important geometric features of the high-poly meshes, so that they resembled the high-poly meshes in appearance while retaining only the large-scale appearance features. To keep the mesh topology consistent during morphing, the outer hull of the voxelized high-poly meshes was used to initialize the new meshes. Secondly, to remove the intersecting triangles that arose during morphing, we designed an alpha wrapping algorithm with topology-adaptive parameters. The algorithm ensured that the genus of the outer hull of the voxelized morphed meshes and the genus of the results remained the same, generating approximate meshes from the morphed meshes that were watertight and two-manifold with no intersecting triangles. Finally, an improved edge collapse algorithm, which optimized the contraction of vertex pairs on planes, simplified the meshes to the user's target facet count. We evaluated the robustness and effectiveness of our method on a dataset of building models and compared it with popular and state-of-the-art methods.
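
    The genus that this abstract says must be preserved can be computed for a closed, connected, orientable triangle mesh from the Euler characteristic, V - E + F = 2 - 2g. The sketch below is a generic check under those assumptions, not the paper's algorithm; the function name and the face-list input format are hypothetical.

```python
def genus_of_closed_mesh(faces):
    """Genus of a closed, connected, orientable triangle mesh.

    `faces` is a list of (a, b, c) vertex-index triples. Uses
    V - E + F = 2 - 2g, so the result is only meaningful when the
    mesh really is watertight and two-manifold.
    """
    vertices = set()
    edges = set()
    for a, b, c in faces:
        vertices.update((a, b, c))
        for u, v in ((a, b), (b, c), (c, a)):
            # Store each undirected edge once, whatever its orientation.
            edges.add((min(u, v), max(u, v)))
    euler = len(vertices) - len(edges) + len(faces)
    return (2 - euler) // 2

# A tetrahedron is topologically a sphere: 4 vertices, 6 edges, 4 faces.
tetrahedron = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
g = genus_of_closed_mesh(tetrahedron)
```

    Comparing this number before and after a wrapping or simplification step is one inexpensive way to detect that a handle or tunnel was accidentally created or closed.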

    Generation and selection of synthetic data for cross-domain person re-identification
    CAI Yi-wu, ZHANG Yu-jia, ZHANG Yong-fei
    2023, 44(4): 775-783.  DOI: 10.11996/JG.j.2095-302X.2023040775

    The reliance of mainstream deep learning-based person re-identification models on large-scale labeled data for training is a costly process that requires extensive collection and labeling efforts. Additionally, the existing virtual data generation methods neglect to account for the characteristics of target domain, thereby compromising the performance of cross-domain re-identification. To address these issues, this paper proposed a synthetic data generation and selection algorithm for cross-domain person re-identification. First, this algorithm utilized the foreground information of the target domain, including the color distribution of individuals’ clothing, to guide the generation of virtual 3D human models. The background information of the target domain was employed to replace the background of source domain data. This served to enhance the data quality at the pixel level, while also guiding the model to distinguish different persons based on the foreground. Finally, the proposed method employed distribution metrics such as Wasserstein Distance to measure the feature distribution distance between the source domain and target domain. This distance was used to select the source domain training subset closest to the target domain for model training. The experimental results demonstrated the superiority of this method over other existing person virtual data generation algorithms, as it can significantly improve the cross-domain generalization performance of the person re-identification model.
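
    The selection step described above can be sketched with a one-dimensional Wasserstein-1 distance. For two empirical samples of equal size, this distance is the mean absolute difference between the sorted values. Real re-identification features are high-dimensional and the abstract does not specify how the distance is computed on them, so the scalar features, subset names, and helper functions below are purely illustrative assumptions.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally sized samples")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def closest_subset(candidates, target):
    """Name of the candidate source subset nearest to the target sample."""
    return min(candidates, key=lambda name: wasserstein_1d(candidates[name], target))

# Hypothetical scalar features for a target domain and two source subsets.
target = [0.9, 1.1, 1.0, 1.2]
candidates = {
    "subset_a": [0.0, 0.1, 0.2, 0.3],
    "subset_b": [1.0, 1.0, 1.1, 1.3],
}
best = closest_subset(candidates, target)
```

    Here `subset_b` is selected because its sorted values sit closest to the target's; the same ranking idea carries over when the distance is computed on learned feature distributions.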

    Audience screen display system based on light-emitting devices
    GAO Yue, HAN Hong-lei
    2023, 44(4): 784-793.  DOI: 10.11996/JG.j.2095-302X.2023040784

    At large-scale sporting events, common problems are monotonous viewing content and the audience's lack of engagement and immersion during pre-game and inter-game intervals. In response, a user interaction method using light-emitting smart devices was proposed. The approach turned the audience into a large screen, with each light-emitting device in the stands serving as one pixel. With every audience member equipped with a light-emitting smart device, the stadium became a large display on which the audience could follow the animation and music, wave the devices in their hands, and interact with the host. The display effect was also synchronized on the big screen in the center of the arena. This solution strengthened the connection between audience members, improved the viewing experience, and made the atmosphere of the game more enthusiastic. Based on this solution, the spectator screen display system based on light-emitting devices was successfully applied to the ice hockey events of the 2022 Beijing Winter Olympics. The results showed that the system could significantly enhance the spectator experience and has promising potential for applications such as content display within the arena, information dissemination, and seating guidance.
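
    The idea of treating each seat's device as one pixel can be illustrated by resampling an animation frame onto the seating grid. The sketch below uses nearest-neighbour sampling on a rectangular grid; the function name, the rectangular seating layout, and the on/off pixel values are assumptions for illustration and say nothing about how the deployed system maps content to seats.

```python
def frame_to_seats(frame, seat_rows, seat_cols):
    """Nearest-neighbour resample of a 2-D frame onto a seat_rows x seat_cols grid."""
    src_rows, src_cols = len(frame), len(frame[0])
    return [
        [frame[r * src_rows // seat_rows][c * src_cols // seat_cols]
         for c in range(seat_cols)]
        for r in range(seat_rows)
    ]

# A 4 x 4 checkerboard frame (1 = device lit, 0 = device dark).
frame = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
seats = frame_to_seats(frame, 2, 2)  # one value per seat in a 2 x 2 block
```

    Each entry of `seats` is the value the device at that seat would show for this frame; playing an animation amounts to repeating the mapping for every frame.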

    Hand reconstruction incorporating biomechanical constraints and multi-modal data
    XUE Hao-wei, WANG Mei-li
    2023, 44(4): 794-800.  DOI: 10.11996/JG.j.2095-302X.2023040794

    To address the high cost and slow response of current monocular hand reconstruction, a method for hand 3D reconstruction using a monocular camera to acquire hand shape and posture was proposed. The method adopted a deep learning-based architecture that used image data with 2D and 3D annotations, as well as hand motion capture data for training. Firstly, 3D joint positions were accurately regressed and mapped to joint rotations through a joint detection module (3DHandNet) and an inverse kinematic module (IRNet). Then, biomechanical constraints were introduced to achieve high-quality mesh image alignment for real-time predictions. Finally, the resulting prediction vector with joint rotation representation was input to the hand mesh template to fit the hand shape. This approach was more suitable for computer vision and graphics applications compared to only regressing 3D joint positions. Experimental results on a benchmark dataset demonstrated that the proposed method achieves real-time runtime performance (60 fps) and high reconstruction accuracy, outperforming current methods in terms of hand posture estimation accuracy and hand image alignment.

    BIM/CIM
    Design and development of automatic drawing system for construction deepening based on BIM
    CHEN Jing, YU Fang-qiang, YI Si-kun, QIU Chun-hua, CAO Ying
    2023, 44(4): 801-809.  DOI: 10.11996/JG.j.2095-302X.2023040801

    In response to the low efficiency and inconvenience of current BIM software for drawing construction detailing drawings, a technology and system for the automatic generation of such drawings based on BIM was proposed and developed. This technology explored the conventional family and special-shaped family in the construction detailing model, and studied family identification and information matching technology, thus ensuring the completeness of the parameter information of the component family in the generated drawings. By extracting the information of the component size, positioning, engineering attributes, and other information in the model, we automatically generated labeling and output 2D drawings. Drawing lines, layers, and other drawing standards were controlled through prefabricated view templates to ensure that the generated drawings meet general drawing requirements. A BIM drawing system was then developed, and a drawing evaluation standard was formulated based on the output drawings and used to verify the feasibility of the drawing system. Finally, the drawing system was applied in engineering in combination with the evaluation standard. The application results showed that the BIM drawing system could quickly and automatically generate drawings that meet engineering standards, thus ensuring both the quality and efficiency of drawings.

    Research on design responsibility and right traceability system integrating BIM and blockchain
    GAN Zi-chen, FAN Jing-jing, XUE Zhi-ting, LIN Ding-liang, XU Zhen
    2023, 44(4): 810-817.  DOI: 10.11996/JG.j.2095-302X.2023040810
    Abstract ( 93 )   HTML ( 1 )   PDF (1381KB) ( 40 )  

    In recent years, designers have often been held accountable in the aftermath of engineering accidents. As the BIM model becomes the source material for the review and archiving of engineering drawings, new solutions are needed to prevent engineering quality accidents at their source. Consequently, a design responsibility and right traceability system integrating BIM and blockchain was proposed. Blockchain, a decentralized database, is tamper-proof and traceable, and has been successfully used in the field of quality traceability. Within the framework of a distributed application, an alliance chain for BIM metadata storage was proposed based on Ethereum technology, and the degree of decentralization was reinforced through three types of nodes: administrator, designer, and supervisor. To resolve the problem of excessive BIM model volume, the third-party platform BIMFACE was used as an external database, filling the gaps in supported formats and practical applications when integrating blockchain with BIM. A hash algorithm-based integrity check prevented third-party tampering with the model. In addition, the system was highly extensible: it enabled collaborative work functions such as online browsing and model description, and it could accommodate additional application scenarios in the lifecycle of BIM products in the future. An application case of a university’s Xiong’an campus demonstrated the system’s ability to permanently record and accurately trace the design process.
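
    The hash-based integrity check mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not name the hash algorithm or the record layout, so SHA-256, the `ledger_record` dictionary, and the function names below are assumptions made purely for illustration; the dictionary merely stands in for a record stored on the chain.

```python
import hashlib


def model_digest(model_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of a serialized model file."""
    return hashlib.sha256(model_bytes).hexdigest()


def is_untampered(model_bytes: bytes, recorded_digest: str) -> bool:
    """Compare a freshly computed digest with the digest recorded at submission."""
    return model_digest(model_bytes) == recorded_digest


# Hypothetical stand-in for the on-chain metadata record (not the paper's schema).
original = b"example BIM model payload, version 1"
ledger_record = {"model_id": "demo-model", "digest": model_digest(original)}

# The externally stored copy is re-hashed and checked against the recorded digest.
tampered = original + b" (modified)"
print(is_untampered(original, ledger_record["digest"]))
print(is_untampered(tampered, ledger_record["digest"]))
```

    Because only the fixed-length digest is recorded, the large model file itself can stay in external storage while any later change to it still becomes detectable.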

    Research on key technologies of automatic evaluation and design of road light barriers based on 3D simulation
    XU Yang, ZHANG Liang-tao, WANG Zhen-gang, ZHANG Liu-jun, WU Si-yuan, DENG Yi-chuan, SONG Jie
    2023, 44(4): 818-827.  DOI: 10.11996/JG.j.2095-302X.2023040818
    Abstract ( 46 )   HTML ( 3 )   PDF (3293KB) ( 39 )  

    Glare is a serious issue that can cause momentary blindness among drivers, leading to poor decision-making and a higher risk of traffic accidents. Where railways and highways run parallel or cross, the glare produced by trains can also affect the safety of nearby road traffic. To address this problem systematically, a 3D simulation and evaluation method was developed for train glare on parallel or intersecting railways and highways. This method analyzed the impact of train glare on nearby road traffic safety and identified the height and location of the necessary glare reduction facilities. Firstly, the data of railway and highway sections were discretized and integrated using the secondary development technology of AutoCAD. The two-dimensional line data on the plan were then integrated into three-dimensional data, thus realizing the transformation from a two-dimensional data model to a three-dimensional data model. Based on the resulting 3D data model, the spatial relationship between the train light source and the driver’s line of sight on the highway was considered. Using glare calculation theory, the 3D simulation and evaluation of train glare was conducted for the entire line, thereby automatically determining the height and location of glare reduction facilities along the line. Finally, the accuracy of both the 3D glare simulation technology and the automatic evaluation scheme was verified by numerical examples.
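
    As a rough illustration of the spatial relationship between the train light source and the driver’s line of sight, the sketch below computes the angle between the two in 3D. It does not reproduce the paper’s glare calculation; the function name, the coordinate convention (metres, z pointing up), and the sample coordinates are all assumptions chosen for the example.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def angle_to_light_deg(eye: Vec3, sight_dir: Vec3, light: Vec3) -> float:
    """Angle in degrees between the line of sight and the eye-to-light direction."""
    to_light = tuple(l - e for l, e in zip(light, eye))
    dot = sum(a * b for a, b in zip(sight_dir, to_light))
    norm = math.hypot(*sight_dir) * math.hypot(*to_light)
    if norm == 0.0:
        raise ValueError("sight direction and light offset must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


# Hypothetical geometry: driver eye 1.2 m above the road, looking along +x,
# with a train headlight ahead and to the side at the same height.
driver_eye = (0.0, 0.0, 1.2)
line_of_sight = (1.0, 0.0, 0.0)
headlight = (100.0, 100.0, 1.2)
print(angle_to_light_deg(driver_eye, line_of_sight, headlight))
```

    Evaluating such an angle at each discretized station along the highway would be one way to feed a glare model point by point, although the abstract does not specify how the evaluation is actually carried out.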

    Industrial Design
    Research on parametric design method of color style for small household appliances
    YANG Dong-mei, CUI Zhi-qi, ZHANG Jian-nan, WANG Ze-yuan, WANG Peng-fei
    2023, 44(4): 828-837.  DOI: 10.11996/JG.j.2095-302X.2023040828
    Abstract ( 84 )   HTML ( 4 )   PDF (7300KB) ( 68 )  

    With the proliferation of e-commerce, consumers now have an array of options in online shopping. To steer users effectively towards a purchase, it is essential to provide excellent product color matching. To this end, a color space positioning method based on the three elements of color was proposed to investigate the relationship between product color and style, facilitating the parametric design of color schemes that cater to users’ stylistic preferences. Based on an analysis of the concept of product color style image ontology, the HSV color space was introduced to quantify the three elements of color (hue, saturation, and brightness). Additionally, a 3D display of the HSV color space served as the visualization medium. To narrow the gap between the existing color scheme and the ideal color scheme, this study optimized the existing color scheme by combining parameterization, cluster analysis, correlation analysis, and other methods. This process established a correlation between the color and style of the product. The experimental evaluation results revealed the influence factors of different color elements on style and the optimal color scheme under different styles. Furthermore, the visualization ontology of the color style image of small household appliances was constructed, using a Bluetooth speaker as an example for color verification. Taking the color style image of small household appliances as a case study, this paper proposed a research method for the parametric design of product color style, which provided a novel perspective for the study of color design of small household appliances.
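
    To make the quantification step concrete, the sketch below converts 8-bit RGB colors into the three HSV elements using Python’s standard `colorsys` module. The helper name, the output units (degrees and percentages), and the sample palette are assumptions for illustration and are not taken from the study; the V component corresponds to what the abstract calls brightness.

```python
import colorsys
from typing import Tuple


def rgb255_to_hsv(r: int, g: int, b: int) -> Tuple[float, float, float]:
    """Convert 8-bit RGB to (hue in degrees, saturation %, value %)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return (round(h * 360.0, 1), round(s * 100.0, 1), round(v * 100.0, 1))


# Hypothetical palette for illustration only; not colors from the study.
palette = {"body": (255, 0, 0), "grille": (128, 128, 128)}
for part, rgb in palette.items():
    print(part, rgb255_to_hsv(*rgb))
```

    Once each color is expressed as an (H, S, V) triple, it can be treated as a point in the HSV space, which is the kind of representation that clustering and correlation analysis can operate on.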

    Research on creative support system of visual metaphor design for Chinese style posters
    NING Bin, LIU Fang, LIU Zhi-xiong, WANG Zun-fu
    2023, 44(4): 838-848.  DOI: 10.11996/JG.j.2095-302X.2023040838
    Abstract ( 109 )   HTML ( 3 )   PDF (6648KB) ( 59 )  

    The development of creativity support systems is an area of active research in computer-aided design. While most existing creativity support systems aim to assist the generation of works, they often fall short in providing effective support during the creative conception stage, which can hinder the divergence and convergence of inspiration. Chinese-style posters, in particular, require designers to select appropriate cultural elements for layout to convey specific themes and cultural connotations. This task can prove challenging for graphic designers, especially novices who may overlook the multiple metaphors of cultural elements. In turn, a solidified image schema leads to works that are mere formalities and lack cultural connotations. To assist designers in constructing visual metaphors for Chinese-style posters, this paper proposed key design elements and thinking paths for the construction of such metaphors. Using the semantic differential and crowdsourcing methods, six types of elements were constructed as “metaphor associations”, and automatic generation rules for metaphor labels in Chinese-style posters were developed based on these associations. Subsequently, a creativity support system called “Afflatus” was constructed to facilitate deeper exploratory thinking, inspiration, and visualization of the metaphorical reasoning process. The system explicitly represented the bidirectional correlation between elements and metaphors and offered a retrieval function with “metaphor” as high-level semantic information, supporting sample divergence across the multiple design dimensions of “metaphor”, “color”, and “form”. Finally, through user experiments and expert interviews, the effectiveness of Afflatus was verified, demonstrating its superiority over existing systems in supporting the divergence of user inspiration across multiple dimensions while effectively enhancing the efficiency of conception and design creativity.

    Published as
    Published as Issue 4, 2023
    2023, 44(4): 849. 
    Abstract ( 26 )   PDF (147265KB) ( 50 )  
    Format of references in this issue
    22 references in Issue 4, 2023
    2023, 44(4): 850. 
    Abstract ( 12 )   PDF (223KB) ( 16 )  