Journal of Graphics

Table of Contents for Issue 2, 2022
    Review
    Literature review of audio-driven cross-modal visual generation algorithms
    JIANG Lai, YU Zhen, WANG Peng-fei, ZHOU Dong-sheng, HOU Ya-qing
    2022, 43(2): 181-188.  DOI: 10.11996/JG.j.2095-302X.2022020181
    Audio-driven cross-modal visual generation algorithms have been widely employed in many fields and
    have gained attention from industry and academia in recent years. Audio and vision are the most important and
    common modalities in people's daily life. However, creatively generating a visual scene corresponding to a given
    audio signal remains a great challenge, and the existing literature has not systematically and comprehensively
    surveyed the topic of audio-driven cross-modal visual generation. This paper summarized the existing algorithms for
    audio-driven cross-modal visual generation and divided them into three categories: audio to image, audio to
    body-motion video, and audio to talking-face video. For each category, we first described its specific application
    fields and the pipelines of mainstream algorithms, and analyzed the framework technologies involved. Then the core
    contents, advantages, and disadvantages of related algorithms were described in the order of technological
    advancement, and their generation quality and performance were explained. Finally, the opportunities and challenges
    in the current field were discussed and future research suggestions were provided.

    Image Processing and Computer Vision
    Deep learning based pixel-level public architectural floor plan space recognition
    GAO Ming, ZHANG He-hua, ZHANG Ting-rui, ZHANG Xuan-ming
    2022, 43(2): 189-196.  DOI: 10.11996/JG.j.2095-302X.2022020189
    Pixel-level floor plan space recognition plays an important role in applications such as floor plan review and model reconstruction from drawings. Existing methods target housing floor plans and recognize spaces directly via semantic segmentation. Public architectural floor plans, however, feature more noisy lines and elements, higher resolution, and a greater variety of spaces. The higher resolution makes it hard to acquire global information in a floor plan, while the variety of spaces makes it impossible to fix a clear range of room types; both render existing space recognition approaches impractical. To recognize spaces in public architectural floor plans, a dataset named Public Architectural Floor Plan Dataset was proposed, including 20 floor plans labeled with walls at the pixel level and 100 floor plans labeled with elements at the bounding-box level. A deep learning-based space boundary recognition approach was proposed, which enhances wall recognition accuracy with the proposed center-line extraction and key-line minimum square error loss function, and recognizes spaces by boundary enclosure. A space contour optimization algorithm was also proposed, which in experiments reduced the number of contour points while preserving the shape of spaces. Experimental results show that this method breaks through the limitation of resolution and room-type range, attains satisfying space recognition performance, and presents a solution to recognizing spaces in public architectural floor plans. Compared with existing methods, the proposed method reaches a higher recall while maintaining precision.
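The contour optimization step above reduces contour points while preserving the space shape. As a minimal illustrative sketch (not the paper's algorithm), simply dropping points that are collinear with their two neighbors already collapses an axis-aligned contour to its corners:

```python
def simplify_contour(points):
    """Drop contour points collinear with their two neighbors.

    A toy stand-in for the paper's contour optimization: only corners
    (where the cross product of adjacent edges is non-zero) survive.
    """
    kept = []
    n = len(points)
    for i in range(n):
        (x0, y0) = points[i - 1]
        (x1, y1) = points[i]
        (x2, y2) = points[(i + 1) % n]
        cross = (x1 - x0) * (y2 - y0) - (y1 - y0) * (x2 - x0)
        if cross != 0:  # neighbors are not collinear: keep this corner
            kept.append((x1, y1))
    return kept
```

A real pipeline would need a tolerance (Douglas-Peucker style) to cope with noisy, non-axis-aligned contours.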

    Multimodal small target detection based on remote sensing image
    HU Jun, GU Jing-jing, WANG Qiu-hong
    2022, 43(2): 197-204.  DOI: 10.11996/JG.j.2095-302X.2022020197
    Since targets in remote sensing images are relatively small and easily affected by illumination, weather, and
    other factors, deep learning-based target detection methods using single-modality remote sensing images suffer from
    low accuracy. However, the image information of different modalities can reinforce each other to improve target
    detection performance. Therefore, based on the fusion of RGB and infrared images, we proposed a balanced
    multimodal depth model (BMDM) for multimodal small target detection in remote sensing images. Instead of
    simple element-wise summation, element-wise multiplication, or concatenation to fuse the feature information of the
    two modalities, we designed a balanced multimodal feature method to enhance target features and make up for the
    shortcomings of single-modality information. We first extracted low-level features from RGB and infrared images,
    respectively. Secondly, we fused the feature information of the two modalities and extracted deep-level features.
    Thirdly, we constructed a multimodal small target detection model based on the one-stage method. Finally, the
    effectiveness of the proposed method was verified by experimental results of multimodal small target detection
    on the public remote sensing dataset VEDAI.
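The abstract contrasts the baseline fusion operators with the proposed balanced method. The details of BMDM's balancing are not given here, so the sketch below shows the three baselines plus a hypothetical convex-weight fusion whose per-channel weight would, in a real model, be learned:

```python
def fuse_sum(a, b):
    """Element-wise summation of two feature vectors."""
    return [x + y for x, y in zip(a, b)]

def fuse_mul(a, b):
    """Element-wise multiplication."""
    return [x * y for x, y in zip(a, b)]

def fuse_concat(a, b):
    """Channel concatenation (doubles the feature dimension)."""
    return a + b

def fuse_balanced(a, b, w=0.5):
    """Hypothetical balanced fusion: a convex combination whose weight w
    would be learned per channel in a model such as BMDM."""
    return [w * x + (1.0 - w) * y for x, y in zip(a, b)]
```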
    Image segmentation algorithm based on improved pixel correlation model
    ZHANG Yan, GAO Xin, LIU Yi, ZHANG Xiao-feng, ZHANG Cai-ming
    2022, 43(2): 205-213.  DOI: 10.11996/JG.j.2095-302X.2022020205
    Image segmentation is a research hotspot and a difficult problem in computer vision. By using local
    information, the fuzzy local information C-means (FLICM) clustering algorithm improves robustness to a certain
    extent, but cannot attain the expected segmentation effect under high noise intensity. Aiming at the low
    segmentation accuracy of traditional fuzzy clustering algorithms, an improved image segmentation algorithm based on
    a pixel correlation model was proposed. Firstly, a new pixel correlation model was designed by analyzing the local
    statistical characteristics of pixels. On this basis, non-local information was effectively employed to mine details in
    the image and improve the segmentation effect. In the experiments, a variety of evaluation indexes were used to
    evaluate the segmentation results, which were compared with those of several common fuzzy clustering algorithms.
    Experimental results show that the fuzzy clustering algorithm based on improved pixel correlation can effectively
    balance noise resistance against the retention of image details in synthetic, natural, medical, and remote sensing
    images, and that its segmentation effect and robustness are superior to those of the compared algorithms.
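FLICM extends the classical fuzzy C-means (FCM) membership update with a local spatial term. For reference, the plain FCM update it builds on looks like this on 1-D pixel intensities (a baseline sketch only, not the paper's improved correlation model):

```python
def fcm_memberships(pixels, centers, m=2.0):
    """One fuzzy C-means membership update: u[i][k] is the degree to
    which pixel k belongs to cluster i (standard FCM, no spatial term).
    m > 1 is the fuzzifier; memberships per pixel sum to 1."""
    u = []
    for c in centers:
        row = []
        for x in pixels:
            d = abs(x - c) + 1e-12  # distance to this cluster center
            s = sum((d / (abs(x - cj) + 1e-12)) ** (2.0 / (m - 1.0))
                    for cj in centers)
            row.append(1.0 / s)
        u.append(row)
    return u
```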
    Monocular depth estimation of ASPP networks based on hierarchical compress excitation
    LIAO Zhi-wei, JIN Jing, ZHANG Chao-fan, YANG Xue-zhi
    2022, 43(2): 214-222.  DOI: 10.11996/JG.j.2095-302X.2022020214
    Scene depth estimation is a basic task of scene understanding, and its accuracy reflects the degree of the
    computer's understanding of the scene. Traditional depth estimation employs the atrous spatial pyramid pooling (ASPP)
    module to process different pixel features without changing the image resolution. However, this module does not
    consider the relationships between different pixel features, leading to inaccurate scene feature extraction. In view of these disadvantages of the ASPP module in depth estimation, an improved ASPP module was proposed to solve the
    distortion problem of the ASPP module in image processing. Firstly, the proposed module was added after the
    convolution kernel. By exploiting the relationships between the features of each pixel, enabling the network to
    adaptively learn the parts of interest allows features to be extracted accurately from the given image. Then the
    problem of network hierarchy optimization was solved by constructing a difference matrix. Finally, the depth
    estimation network model was built on the indoor public dataset NYU-Depthv2. Compared with current
    mainstream algorithms, the algorithm achieves good performance in both qualitative and quantitative indexes.
    Under the same evaluation indexes, compared with the most advanced algorithm, the δ1 threshold accuracy is
    improved by nearly 3%, the root mean square error and absolute error are decreased by 1.7%, and the log-domain error
    (lg) is decreased by about 0.3%. The improved ASPP network model proposed in this paper addresses the problem that
    traditional ASPP modules fail to take into account the relationships between different pixel features. It can
    effectively make the model converge better, significantly improve the feature extraction ability, and produce more
    accurate scene depth estimation results.

    Sequential multi-scale autoencoder for video anomaly detection
    LYU Hao, YI Peng-fei, LIU Rui, ZHOU Dong-sheng, ZHANG Qiang, WEI Xiao-peng
    2022, 43(2): 223-229.  DOI: 10.11996/JG.j.2095-302X.2022020223
    Video anomaly detection refers to identifying events inconsistent with expected behaviors. Many current
    methods detect abnormalities through reconstruction errors. However, due to the powerful capabilities of deep neural
    networks, abnormal behaviors may also be reconstructed well, which is inconsistent with the hypothesis that the
    reconstruction error of abnormal behavior is large. The alternative of predicting future frames for anomaly detection
    has achieved good results, but most such methods neither consider the diversity of normal samples nor establish
    associations between consecutive frames of the video. To solve this problem, we proposed a sequential
    multi-scale autoencoder network to predict future frames, and performed video anomaly detection through the
    difference between the predicted value and the ground-truth value. The network not only explicitly considers the
    diversity of normal events, but also constructs long-range spatial dependencies through a powerful encoder, thereby
    enhancing the diversity of output features. In addition, for complex datasets containing more noise, we proposed a
    denoising network to further improve the accuracy of the model. Under the premise of fulfilling real-time
    requirements, this method achieved the best accuracy so far on the Avenue dataset.
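Prediction-based detectors commonly turn the gap between the predicted and ground-truth frame into a regularity score via PSNR: well-predicted (normal) frames score high, poorly predicted (anomalous) frames score low. A generic per-frame sketch, not this paper's exact scoring:

```python
import math

def psnr_score(pred, truth, peak=255.0):
    """Frame regularity score: PSNR between predicted and true pixels.
    Higher PSNR -> smaller prediction error -> more likely normal."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred)
    if mse == 0.0:
        return float("inf")  # perfect prediction
    return 10.0 * math.log10(peak * peak / mse)
```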

    Efficient pedestrian detector combining depthwise separable convolution and standard convolution
    ZHANG Yun-bo, YI Peng-fei, ZHOU Dong-sheng, ZHANG Qiang, WEI Xiao-peng
    2022, 43(2): 230-238.  DOI: 10.11996/JG.j.2095-302X.2022020230
    Pedestrian detectors must be both fast and accurate. Although pedestrian detectors based on deep
    convolutional neural networks (DCNN) have high detection accuracy, they require considerable computational
    capacity and therefore cannot be deployed well on lightweight systems such as mobile devices, embedded devices,
    and autonomous driving systems. Considering these problems, a lightweight and effective pedestrian detector (EPDNet) was proposed, which can better balance speed and accuracy. First, the shallow convolution layers of the backbone network employed depthwise separable convolution to compress the model parameters, while the deeper convolution layers utilized standard convolution to extract high-level semantic features. In addition, to further improve the performance of the model, the backbone network adopted a feature fusion method to enhance the expressive ability of its output features. In comparative experiments, EPDNet showed superior performance on two
    challenging pedestrian datasets, Caltech and CityPersons. Compared with the benchmark model, EPDNet obtained a
    better trade-off, improving speed and accuracy at the same time.
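The parameter savings from the depthwise separable layers can be checked with a quick count (bias terms omitted):

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution: one k x k filter
    per (input channel, output channel) pair."""
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    """Depthwise separable: one k x k filter per input channel,
    followed by a 1 x 1 pointwise convolution mixing channels."""
    return c_in * k * k + c_in * c_out

# 3x3 conv, 128 -> 256 channels: roughly 8.7x fewer parameters.
standard = conv_params(128, 256, 3)           # 294912
separable = dw_separable_params(128, 256, 3)  # 33920
```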

    Face detection and embedded implementation of lightweight network
    ZHANG Ming, ZHANG Fang-hui, ZONG Jia-ping, SONG Zhi, CEN Yi-gang, ZHANG Lin-na
    2022, 43(2): 239-246.  DOI: 10.11996/JG.j.2095-302X.2022020239
    In recent years, face detection based on convolutional neural networks (CNN) has dominated this field, and
    detection results on public benchmarks have also improved significantly. However, computational cost and model
    complexity are on the rise, and it remains a challenge to apply face detection models to embedded devices
    with limited computing power and memory capacity. Aiming at embedded face detection on 320×240-resolution
    input images, a low-resolution face detection algorithm based on a lightweight network was proposed. The backbone
    network employed an attention module, combined Distance-IoU (DIoU) with non-maximum suppression (NMS), and
    adopted the Mish activation function; meanwhile, appropriate prior boxes were set for face aspect ratios. In doing so,
    a balance could be achieved between precision and speed, and the model could be deployed on the embedded
    platform. Specifically, depthwise separable convolution was used to replace ordinary convolution, and a
    convolutional block attention module (CBAM) was added after the convolution block to keep the network focused on
    the target object to be recognized. The Mish activation function was used instead of ReLU to improve model
    inference speed. By combining DIoU with NMS, the algorithm's detection accuracy for small faces was enhanced.
    Results of experiments on the WIDER FACE dataset prove that the proposed method not only detects human faces
    accurately in real time, but also achieves higher accuracy than traditional algorithms on small-resolution inputs.
    After expanding the dataset, the proposed model also improves detection accuracy under complex illumination.
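DIoU augments plain IoU with a penalty on the normalized distance between box centers, which is what makes DIoU-based NMS better at separating nearby small faces. A generic sketch of the DIoU measure for axis-aligned boxes:

```python
def diou(box_a, box_b):
    """Distance-IoU for boxes given as (x1, y1, x2, y2):
    DIoU = IoU - d^2 / c^2, where d is the distance between box centers
    and c the diagonal of the smallest enclosing box."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # intersection and union
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # squared distance between box centers
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
       + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # squared diagonal of the smallest enclosing box
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2
    return iou - d2 / c2
```

In DIoU-NMS, this value replaces IoU when deciding whether a lower-scoring box should be suppressed.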

    Fully automatic matting algorithm for portraits based on deep learning
    SU Chang-bao, GONG Shi-cai
    2022, 43(2): 247-253.  DOI: 10.11996/JG.j.2095-302X.2022020247
    Aiming at the problems of low completeness of character matting, insufficiently refined edges, and
    cumbersome matting procedures in matting tasks, a fully automatic matting algorithm for portraits based on deep
    learning was proposed. The algorithm employed a three-branch network for learning: a semantic segmentation
    branch (SSB) learning the semantic information of the alpha matte, a detail branch (DB) learning the detail
    information of the alpha matte, and a combination branch (COM) summarizing the learning results of the two
    branches. First, the algorithm's encoding network utilized the lightweight convolutional neural network
    MobileNetV2 to accelerate feature extraction. Second, an attention mechanism was added to the SSB branch
    to weight the importance of image feature channels, an atrous spatial pyramid pooling module was added
    to the DB branch, and multi-scale fusion was achieved for features extracted from different receptive
    fields of the image. Then, the two branches of the decoding network merged the features extracted by the
    encoding network at different stages through skip connections before decoding. Finally, the features learned by
    the two branches were fused to obtain the alpha matte of the image. Experimental results show that on public
    datasets this algorithm outperforms semi-automatic and fully automatic matting algorithms based on deep learning,
    and that its real-time streaming video matting is superior to that of MODNet.
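The alpha matte the network predicts is what makes downstream compositing possible: each output pixel is a matte-weighted blend of foreground and background, via the standard matting equation (shown per pixel):

```python
def composite(fg, bg, alpha):
    """Matting/compositing equation: I = alpha*F + (1 - alpha)*B,
    applied per pixel; alpha values lie in [0, 1]."""
    return [a * f + (1.0 - a) * b for f, b, a in zip(fg, bg, alpha)]
```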

    Stereo image zero watermarking algorithm based on DOCT and SURF
    HAN Shao-cheng, ZHANG Peng
    2022, 43(2): 254-262.  DOI: 10.11996/JG.j.2095-302X.2022020254
    Aiming at the poor resistance to geometric attacks of most current stereo image zero-watermarking schemes, a
    blind-detection stereo image zero-watermarking algorithm based on the discrete octonion cosine transform (DOCT) and
    speeded-up robust features (SURF) was proposed. Firstly, the stationary wavelet transform (SWT) was performed on six
    components of the left and right views of the original stereo image in the CIEXYZ color space. Secondly, the six
    low-frequency subbands were divided into non-overlapping blocks to construct octonion image blocks at the
    corresponding positions, and the DC coefficients of all image blocks after DOCT were calculated directly in the spatial
    domain. The robust feature matrix was then constructed by comparing the modulus of each octonion DC coefficient
    with their overall mean value. Finally, the zero watermark used for authentication was generated by an XOR operation
    between the feature matrix and the encrypted watermark, which had been processed by quantum-key scrambling and
    2D-LALM encryption. In addition, the SURF method was employed to geometrically correct the stereo image to be
    authenticated before zero-watermark detection. Experimental results show that the proposed algorithm displays better
    robustness against both conventional and geometric attacks.
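The XOR construction is what makes this a *zero* watermark: nothing is embedded in the image, and the same XOR recovers the watermark at detection time. A bit-level sketch, with feature extraction and encryption omitted:

```python
def make_zero_watermark(feature_bits, watermark_bits):
    """Zero-watermark generation: XOR the robust feature matrix with the
    (encrypted) watermark and store the result as the detection key."""
    return [f ^ w for f, w in zip(feature_bits, watermark_bits)]

def recover_watermark(feature_bits, zero_watermark):
    """Detection: XOR the re-extracted features with the stored key;
    if the features are robust, the original watermark reappears."""
    return [f ^ z for f, z in zip(feature_bits, zero_watermark)]
```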
    Ultrasound image segmentation model based on edge entropy and local FT distribution
    CUI Wen-chao, XU De-wei, SUN Shui-fa, PAN Zhi-hong, WANG Xi-dong
    2022, 43(2): 263-272.  DOI: 10.11996/JG.j.2095-302X.2022020263
    Local Gaussian distribution fitting (LGDF) and local Rayleigh distribution fitting (LRDF) models often
    perform relatively poorly in segmenting ultrasound images, due to the large bias in describing ultrasound images
    with either a Gaussian or a Rayleigh distribution, and the lack of guidance from ultrasound image edge information
    during segmentation. To deal with these problems, an edge-entropy-weighted local Fisher-Tippett (FT) distribution
    fitting model was presented in this paper. Based on the fact that the object and background in local regions of
    ultrasound images follow different FT distributions, the proposed model adopted maximum a posteriori (MAP)
    probability to derive an energy function to be minimized, which was solved by the level set method. Meanwhile,
    edge entropy was included in the length regularization term as a weight function to guide the active contour to
    better capture the obscure and weak edges of the object. Extensive experiments on synthetic and real ultrasound
    images demonstrate that the proposed model not only benefits from the local FT distribution fitting and the inclusion
    of edge entropy, but also qualitatively and quantitatively outperforms many existing methods.

    A U-Net based contour enhanced attention for medical image segmentation
    LI Cui-yun, BAI Jing, ZHENG Liang
    2022, 43(2): 273-278.  DOI: 10.11996/JG.j.2095-302X.2022020273
    Medical image segmentation is vital to medical image processing. With the development of deep learning,
    image segmentation techniques have achieved remarkable progress. However, the discrimination of contour pixels
    of lesion features remains fuzzy and inaccurate. To address this, we proposed a contour enhanced attention (CEA)
    module. It obtains rich location information by encoding features along two different directions, and strengthens
    contours by calculating the offset between location features and input features. Furthermore, we constructed a
    U-Net for medical image segmentation based on the proposed module. It can break through the spatial limitation
    of the convolution kernel, capturing position-aware cross-channel information and clearer edge contour
    information, thereby improving segmentation accuracy. Experiments on the public Kvasir-SEG dataset demonstrate
    that the network with the CEA module achieves better results in Dice, precision, recall, and other evaluation
    indexes for medical segmentation.
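Encoding features along two directions, as the CEA module does, can be sketched as axis-wise pooling of a feature map: row means retain vertical position information, column means retain horizontal position. A simplified illustration of the idea, not the exact module:

```python
def directional_pool(feat):
    """Pool a 2-D feature map along each axis. Each row mean keeps the
    y-position of its row; each column mean keeps the x-position."""
    h, w = len(feat), len(feat[0])
    row_means = [sum(row) / w for row in feat]
    col_means = [sum(feat[i][j] for i in range(h)) / h for j in range(w)]
    return row_means, col_means
```

In coordinate-attention-style designs, these two pooled vectors are transformed and broadcast back to reweight the feature map, preserving position cues that global pooling would discard.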


    Finger-knuckle-print recognition based on multi-dimensional matching distances fusion
    HUANG Jie, WEI Xin, YANG Zi-yuan, MIN Wei-dong
    2022, 43(2): 279-287.  DOI: 10.11996/JG.j.2095-302X.2022020279
    As a novel biometric modality, finger-knuckle-print (FKP) recognition has gained much attention for its
    security and stability. Coding-based methods are considered among the most effective in this field. Such methods
    distinguish samples by a single matching distance between two images, computed from the extracted features in the
    template-matching stage. However, some fuzzy samples cannot be effectively distinguished by a single matching
    distance, leading to false acceptances and false rejections. To address this problem, a lightweight and effective
    method based on multi-dimensional matching-distance fusion was proposed in this paper. The proposed method
    utilized the difference and complementarity between the matching distances of multiple coding-based methods, and
    applied a support vector machine (SVM) to classify the multi-dimensional feature vectors constructed from the
    multiple matching distances. Moreover, the proposed method is general and can be easily embedded into existing
    coding-based methods. Extensive experiments, ranging from two-dimensional to four-dimensional matching
    distances, were conducted on the public FKP database PolyU-FKP. The results show that the proposed method can
    generally improve their performance, with a maximum reduction of 22.19% in EER.

    Landmark detection based on perspective down-sampling and neural network
    LI Yu-zhen, CHEN Hui, WANG Jie, RONG Wen
    2022, 43(2): 288-295.  DOI: 10.11996/JG.j.2095-302X.2022020288
    In the field of intelligent driving, a landmark detection method based on a neural network and perspective
    down-sampling was proposed to accurately detect road guide signs in real time. The proposed method can
    effectively solve the problems of poor real-time performance of traditional detection methods and low detection
    accuracy for complex scenes and distant small targets. Firstly, the region of interest of the image was selected for
    perspective down-sampling to reduce the near-field resolution of the road image, reduce the image size, and
    eliminate the perspective projection error. Secondly, the YOLOv3-tiny target detection network was enhanced:
    bounding-box clustering on the self-built dataset was implemented with k-means++, a convolution layer was added
    to strengthen shallow features and enhance small-target representation, and by changing the fusion scale of the
    feature pyramid, the prediction outputs were adjusted to 26×26 and 52×52. Finally, on the self-built multi-scene
    dataset, the accuracy rate was elevated from 78% to 99%, and the model size was reduced from 33.8 MB to 8.3 MB.
    The results show that the proposed method displays strong robustness and higher detection accuracy for small
    targets, and is readily deployable on low-end embedded devices.
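The k-means++ seeding used for bounding-box clustering can be sketched as below. Note a real YOLO pipeline would cluster (width, height) pairs under an IoU-based distance; this sketch uses squared Euclidean distance for brevity:

```python
import random

def kmeans_pp_init(points, k, seed=0):
    """k-means++ seeding: after a random first center, each new center
    is drawn with probability proportional to its squared distance to
    the nearest already-chosen center."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # squared distance of each point to its nearest chosen center
        d2 = [min((px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centers)
              for px, py in points]
        r = rng.uniform(0.0, sum(d2))
        acc = 0.0
        for point, weight in zip(points, d2):
            acc += weight
            if acc >= r:  # roulette-wheel selection
                centers.append(point)
                break
    return centers
```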

    A traffic police object detection method based on optimized YOLO model
    LI Ni-ni, WANG Xia-li, FU Yang-yang, ZHENG Feng-xian, HE Dan-dan, YUAN Shao-xin
    2022, 43(2): 296-305.  DOI: 10.11996/JG.j.2095-302X.2022020296
    To tackle the low accuracy of detection and localization of traffic police objects in complex traffic scenes, a
    traffic police detection method based on an optimized YOLOv4 model was proposed in this study. Firstly, four
    random transformation methods were employed to expand the self-built traffic police dataset, so as to mitigate
    model over-fitting and improve the generalization ability of the network. Secondly, the YOLOv4 backbone was
    replaced with the lightweight MobileNet, and the Inception-ResNet-v1 structure was introduced to reduce the
    number of parameters and effectively deepen the network. Then, the k-means++ clustering algorithm was adopted to
    perform cluster analysis on the self-built dataset, redefining the initial candidate boxes of the network and
    improving the learning efficiency of deep traffic police features. Finally, to address the imbalance of positive and
    negative samples during training, the focal loss function was introduced to optimize the classification loss.
    Experimental results demonstrate that the optimized YOLOv4 model is only 50 M in size and its AP value reaches
    98.01%. Compared with Faster R-CNN, YOLOv3, and the original YOLOv4 model, the optimized network is
    significantly improved. The proposed method can effectively alleviate the missed detections, false detections, and
    low accuracy of traffic police detection in current complex traffic scenes.
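The focal loss used to rebalance positive and negative samples down-weights easy examples via the (1 - p_t)^gamma factor, so training focuses on hard samples. The standard binary form, as a sketch:

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for a single prediction.
    p: predicted probability of the positive class; y: label in {0, 1}.
    With gamma = 0 and alpha = 1 this reduces to cross-entropy."""
    pt = p if y == 1 else 1.0 - p          # probability of the true class
    a = alpha if y == 1 else 1.0 - alpha   # class-balance weight
    return -a * (1.0 - pt) ** gamma * math.log(pt)
```

An easy example (pt close to 1) contributes almost nothing, while a misclassified hard example keeps a large gradient.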
    Computer Graphics and Virtual Reality
    Mixed reality simulation system for emergency escape design of civil aircraft flight crew
    WU Cheng-cheng, LYU Yi, YUAN Xin-hao, XU Shu-hong
    2022, 43(2): 306-315.  DOI: 10.11996/JG.j.2095-302X.2022020306
    Emergency escape simulation for civil aircraft crew helps identify potential problems in crew escape hatch
    design during early aircraft development, and helps ensure the safety of crew members. This paper presented a
    mixed-reality simulation system for the emergency escape of civil aircraft flight crew. To solve the key problem of
    virtual-physical matching of the human body, an optical-inertial hybrid whole-body motion capture method was
    proposed. The method, working together with a Kinect2-based human body key-dimension matching technique, can
    effectively improve the efficiency and robustness of virtual-physical matching. The proposed mixed-reality
    simulation system has been successfully applied to the development of large domestic aircraft. Experimental results
    show its efficiency in the evaluation of crew escape hatch design.
    Two-stage adjustable perceptual distillation network for virtual try-on
    CHEN Bao-yu, ZHANG Yi, YU Bing-bing, LIU Xiu-ping
    2022, 43(2): 316-323.  DOI: 10.11996/JG.j.2095-302X.2022020316
    Image-based virtual try-on fits a target garment image to a person image; this task has gained much
    attention in recent years for its wide applications in e-commerce and fashion image editing. In response to the
    characteristics of the task and the shortcomings of existing approaches, a two-stage adjustable perceptual
    distillation (TS-APD) method was proposed in this paper. The method consists of three steps. Firstly, two
    semantic segmentation networks were pre-trained on garment images and person images respectively, generating
    more accurate garment foreground segmentations and upper-garment segmentations. Then, these two semantic
    segmentations and other parsing information were employed to train a parser-based "tutor" network. Finally, a
    parser-free "student" network was trained through a two-stage adjustable perceptual distillation scheme, taking the
    fake image generated by the "tutor" network as input and the original real person image as supervision. The
    distilled "student" model can thus produce high-quality try-on images without human parsing. Experimental results
    on the VITON dataset show that the algorithm achieves an FID score of 9.10, an L1 score of 0.0153, and a PCKh
    score of 0.9856, outperforming existing methods. A user survey also shows that, compared with other methods, the
    images generated by the proposed method are more photo-realistic, with all preference scores above 77%.
    Track fastener localization algorithm based on geometric features and the spike center point localization
    CAO Yi-qin, YI Hu, QIU Yi, ZHOU Yi-wei
    2022, 43(2): 324-332.  DOI: 10.11996/JG.j.2095-302X.2022020324
    To solve positioning failure and reduced accuracy caused by skewness and nonstandard sizes of track images,
    a fastener positioning algorithm based on spike center-point location and geometric structure features was proposed.
    The new method first locates the center point of the spike, and then locates the fasteners using geometric features.
    In the edge image obtained by preprocessing, the edges of the track spike become roughly circular after erosion and
    dilation. By means of the Hough transform circle detection algorithm, the rough area of the spike was located and
    expanded, so that the spike area could be extracted from the original image. The edges of the spike-area image were
    then detected, and the contour extraction and polygon approximation algorithms in OpenCV were employed to
    accurately fit the spike hexagon and calculate the spike center point. Finally, the coordinates of each vertex of the
    fastener bounding box were obtained using the proposed geometric-structure-based fastener location algorithm.
    Experimental results show that the positioning accuracy of the new algorithm is 99.33%, the precision is 0.997, and
    the speed is 29.8 fps, superior to the compared algorithms. Meanwhile, under different conditions, such as varying
    weather, spike corrosion, or occlusion, the new algorithm displays better robustness and anti-interference ability.
    Lightweight human pose estimation with global pose perception
    LIU Yu-jie, ZHANG Min-jie, LI Zong-min, LI Hua
    2022, 43(2): 333-341.  DOI: 10.11996/JG.j.2095-302X.2022020333
    Human pose estimation has been a hot topic in human-computer interaction in recent years. At present,
    common human pose estimation methods focus on improving accuracy by increasing network complexity, ignoring
    the cost-effectiveness of the model; the resulting models are accurate in practice but consume huge computational
    resources. In this paper, a lightweight human pose estimation model with global pose perception was designed. It
    achieves an accuracy of 68.2% AP on the MSCOCO dataset at 255 fps, with a parameter count and FLOPs that are
    10% and 0.9% of those of the OpenPose method, respectively. In the human pose estimation task, the number of
    output channels of the network is set according to the number of predicted key joints, leading to independent
    detection of each key joint. Global information, such as the relative positions of key points and the overall layout, is
    of great significance to pose estimation for difficult samples, but was absent from previous studies. To utilize the
    global pose information, a global pose perception module was designed to extract global pose features, and a
    two-branch network was employed to fuse the global and local pose features. Experiments show that the lightweight
    human pose estimation network with global pose perception can increase accuracy by 1.5% and 1.3% on the MPII
    and MSCOCO datasets, respectively.
    FFF of a three-dimensional continuous weaving filling pattern
    WU Huan-xiao, YAO Yuan, YANG Jin-xiu, DING Cheng
    2022, 43(2): 342-347.  DOI: 10.11996/JG.j.2095-302X.2022020342
    Abstract ( 76 )   PDF (2071KB) ( 60 )  
    In order to improve the mechanical strength and reduce the anisotropy of fused filament fabrication (FFF)
    workpieces, a three-dimensional continuous braiding path planning method was proposed. The method employed
    continuous fiber reinforced material as the printing material, designed an eight-loop structure, and utilized the 3D
    printer nozzle to extrude filament for the generation of warp/weft yarn. By controlling the movement of the FFF
    platform in the z direction, a continuous deposition route similar to 3D woven fiber was produced, and different
    layers were mutually interlaced and embedded to realize interlock between adjacent section planes, thus improving
    the connection strength within and between layers. This cyclic structure supports continuous path planning, so that
    the pattern can be manufactured on a conventional three-axis fused filament fabrication platform and is widely
    applicable. Comparison with the standard sample verified the rationality and feasibility of the braided structure.
    Experiments show that the 3D continuous fiber braided printing path can support the filling of different structures,
    and can effectively reduce the anisotropy of mechanical properties caused by layered deposition of materials, thereby
    enhancing the reliability of printed pieces with complex structures.
    Industrial Design
    Relationship model between Chinese character font stroke shape and emotional image
    OUYANG Jin-yan, GAO Xuan-han, ZHANG Shu-tao, WANG Xu-hong, ZHOU Ai-min
    2022, 43(2): 348-355.  DOI: 10.11996/JG.j.2095-302X.2022020348
    Abstract ( 132 )   PDF (2562KB) ( 105 )  
    In order to reveal the internal relationship between the morphological features of Chinese characters and the
    emotional images they evoke in the audience, a relationship model between the morphological features of Chinese
    characters and the emotional image was proposed from the perspective of visual cognition. First, the design elements
    of Chinese character font stroke shape were analyzed to construct an item and category table using the morphological
    analysis method. Then, the K-means clustering algorithm was employed to select representative emotional image
    words, and a semantic differential (SD) questionnaire was issued to obtain the emotional image scores for each font
    sample. Finally, the multiple linear regression method was used to establish the relationship model between the
    design elements of font stroke shape and emotional images. From the coefficients of the regression expression, the
    influence of each morphological feature element on the emotional images can be analyzed. The model can provide
    technical support for the image positioning of Chinese character font design, and offers a new idea and method for
    related research. The results show that the method is highly feasible and reliable, and applicable to design practice in
    the field.
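    The regression step described above can be sketched in a small self-contained example: stroke-shape categories are coded as binary features and a least-squares fit recovers each feature's contribution to the mean SD score. The feature coding, sample scores, and helper names below are invented for illustration and do not come from the paper.

    ```python
    def solve(a, b):
        """Solve the square system a x = b by Gauss-Jordan elimination
        with partial pivoting."""
        n = len(a)
        m = [row[:] + [b[i]] for i, row in enumerate(a)]
        for col in range(n):
            pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
            m[col], m[pivot] = m[pivot], m[col]
            for r in range(n):
                if r != col and m[r][col]:
                    f = m[r][col] / m[col][col]
                    m[r] = [v - f * w for v, w in zip(m[r], m[col])]
        return [m[i][n] / m[i][i] for i in range(n)]

    def fit_linear(x_rows, y):
        """Least squares: minimise ||X w - y|| via the normal equations
        X^T X w = X^T y, with an intercept column prepended to X."""
        x = [[1.0] + list(r) for r in x_rows]
        p = len(x[0])
        xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(p)] for i in range(p)]
        xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(p)]
        return solve(xtx, xty)

    # Two dummy stroke-shape category features per font sample;
    # SD scores lie on a [-2, 2] semantic-differential scale.
    samples = [(0, 0), (1, 0), (0, 1), (1, 1)]
    scores = [0.0, 1.0, 0.5, 1.5]
    w = fit_linear(samples, scores)  # [intercept, effect of f1, effect of f2]
    ```

    Reading the fitted coefficients is the analysis step the abstract describes: the larger a feature's coefficient, the stronger its pull on the emotional image score.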
    Complexity analysis method of human-machine interaction task in intelligent vehicle cockpit
    MA Ning, WANG Ya-hui
    2022, 43(2): 356-360.  DOI: 10.11996/JG.j.2095-302X.2022020356
    Abstract ( 317 )   PDF (780KB) ( 275 )  
    The tasks and behaviors of human-machine interaction (HMI) in the intelligent vehicle cockpit directly
    affect users’ experience in the cockpit. To help automobile interior and exterior designers and car UI designers avoid
    the risk of poor interface usability, HMI behaviors in intelligent vehicles were studied quantitatively, and the
    complexity indexes of HMI tasks were summarized. Then the specific task indexes affecting HMI complexity in the
    intelligent cockpit and their weight distribution were extracted, and an entropy-based measurement method for the
    HMI task complexity of intelligent vehicles was proposed. Finally, the method was verified on an example of an
    intelligent car cockpit. The results showed that the complexity of HMI tasks in the cockpit was affected by many
    factors, such as the logical structure of the HMI task, the knowledge level and cognitive load required by the
    interaction, and the complexity of the cockpit’s HMI digital interface layout. These factors warrant more attention
    from designers. The proposed method can help designers avoid the high risk of design complexity and the cost of
    user learning, and assist them in intervening in advance in design problems related to the above indicators.
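    The flavor of an entropy-based complexity measure can be illustrated with standard Shannon entropy computed over a few sub-indexes and combined as a weighted sum. The specific sub-indexes, counts, and weights below are assumptions for illustration, not the paper's actual formula.

    ```python
    import math

    def shannon_entropy(counts):
        """H = -sum p_i * log2(p_i) over a distribution given as raw counts."""
        total = sum(counts)
        return -sum((c / total) * math.log2(c / total) for c in counts if c)

    # Illustrative sub-indexes of an HMI task:
    layout_h = shannon_entropy([4, 4, 4, 4])  # 16 controls spread over 4 screens -> 2 bits
    action_h = shannon_entropy([8, 8])        # two equally frequent operation types -> 1 bit
    logic_h = shannon_entropy([1])            # a single linear task path -> 0 bits

    # Weighted sum as an overall complexity index (weights are illustrative;
    # the paper extracts its own index weights).
    complexity = 0.5 * layout_h + 0.3 * action_h + 0.2 * logic_h
    ```

    A task with more screens, more operation types, or more branching spreads probability over more outcomes, raising each sub-entropy and thus the combined index.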
    Published as
    Published as 2, 2022
    2022, 43(2): 361-361. 
    Abstract ( 51 )   PDF (109822KB) ( 43 )  