Journal of Graphics, 2022, Vol. 43, No. 6

Table of Contents

    Cover
    Cover of issue 6, 2022
    2022, 43(6): 0. 
    Contents
    Table of Contents for Issue 6, 2022
    2022, 43(6): 1. 
    Review
Computer-aided topological design: a survey on geometric design and processing based on persistent homology
    DONG Zhe-tong, LIN Hong-wei
    2022, 43(6): 957-966.  DOI: 10.11996/JG.j.2095-302X.2022060957
Persistent homology is an effective method for computing topological features at different scales. It records the birth and death times of features across a nested sequence of simplicial complexes, and uses the life span of a topological feature to measure its geometric scale and significance. The extraction and application of topological features play an important role in geometric design, and have spawned a number of studies on geometric design based on persistent homology. In this paper, we reviewed these studies in two areas: feature extraction with persistent homology, and persistent-homology-based modeling and optimization. For feature extraction, we introduced methods for extracting topological features from point clouds and from triangular meshes, and summarized the pipeline for applying topological features to geometric design problems. For persistent-homology-based modeling and optimization, we reviewed simplicial complex reconstruction methods based on topology transforms, topology-aware surface reconstruction methods, and topological denoising and optimization based on persistent homology.
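As a minimal illustration of the life-span idea described above (and not of the surveyed methods themselves), the following Python sketch computes 0-dimensional persistence pairs of a small point cloud under a Vietoris-Rips-style filtration with a union-find; the short-lived pairs correspond to points inside a cluster, the long-lived one to the separation between clusters. Production pipelines typically rely on libraries such as GUDHI or Ripser.

```python
# Minimal illustration: 0-dimensional persistent homology of a 2D point cloud.
# All components are born at filtration value 0; a component dies when it
# merges into an older one as the connection radius grows (elder rule).
import itertools
import math

def zero_dim_persistence(points):
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges of the Rips filtration, sorted by the radius at which they appear.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(len(points)), 2)
    )
    pairs = []  # (birth, death) of 0-dimensional features
    for r, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri          # one component dies at radius r
            pairs.append((0.0, r))
    pairs.append((0.0, math.inf))    # the final component never dies
    return pairs

print(zero_dim_persistence([(0, 0), (0.1, 0), (3, 0), (3, 0.2)]))
```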
    A survey of path planning and feedrate interpolation in computer numerical control 
MA Hong-yu, SHEN Li-yong, JIANG Xin, ZOU Qiang, YUAN Chun-ming
    2022, 43(6): 967-986.  DOI: 10.11996/JG.j.2095-302X.2022060967
Numerical control technology is widely employed in the modern manufacturing industry, and related research has been emphasized by both academia and industry. The traditional numerical control process is mainly composed of tool path planning and feedrate interpolation. To achieve high-speed and high-precision machining, several problems in tool path planning and feedrate interpolation are usually transformed into mathematical optimization models. In view of the complexity of engineering applications, stepwise iterative optimization methods are used to address these problems, but the results are usually only locally optimal. Moreover, although tool path planning and feedrate interpolation are both designed to machine a workpiece surface, splitting the computation into two steps simplifies it at the cost of overall optimality. Therefore, to better pursue the integrated design and optimization of tool path planning and feedrate interpolation, it is necessary to systematically review and draw on the existing representative works. We introduce the relevant methods and technical progress of tool path planning and feedrate interpolation in CNC machining, including tool path planning based on end milling, tool orientation optimization, G-code processing and corner transition, feedrate planning for parametric curves, and some recently proposed machining optimization methods. The tool path planning methods can be briefly classified into four categories: iso-parametric methods, iso-planar methods, iso-scallop-height methods, and newly proposed methods based on vector fields. The tool orientation optimization methods can be divided into local optimization methods considering error constraints and global optimization methods based on C-space. Recent work on G-code processing and corner transition mainly includes micro-line-segment corner transition methods, spline-based global fitting methods, and finite-impulse-response-based methods. For parametric curves, the corresponding feedrate planning methods mainly include acceleration/deceleration-based methods, optimization-based approaches, and methods that integrate smoothness and feedrate. Finally, we introduce some emerging methods and techniques, such as surface segmentation for subtractive/additive manufacturing and integrated interpolation methods for tool paths.
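As a hedged sketch of the acceleration/deceleration-based feedrate planning mentioned above, the following code builds a trapezoidal feedrate profile for a single block of arc length L under a commanded feedrate and an acceleration limit; the numbers and function names are illustrative only, and real interpolators additionally handle jerk limits, corner constraints, and spline parameterization.

```python
# Illustrative acceleration/deceleration-based feedrate planning for one block
# of arc length L: a trapezoidal profile v(t) that respects a commanded
# feedrate f_max and an acceleration limit a_max.
import math

def trapezoidal_profile(L, f_max, a_max):
    t_acc = f_max / a_max                 # time to reach the commanded feedrate
    d_acc = 0.5 * a_max * t_acc ** 2      # distance covered while accelerating
    if 2 * d_acc >= L:                    # block too short: triangular profile
        v_peak = math.sqrt(a_max * L)
        t_acc = v_peak / a_max
        return v_peak, t_acc, 0.0, t_acc
    t_cruise = (L - 2 * d_acc) / f_max    # constant-feedrate phase
    return f_max, t_acc, t_cruise, t_acc

def feedrate_at(t, profile):
    v_peak, t_acc, t_cruise, t_dec = profile
    if t < t_acc:
        return (v_peak / t_acc) * t
    if t < t_acc + t_cruise:
        return v_peak
    if t < t_acc + t_cruise + t_dec:
        return v_peak * (1 - (t - t_acc - t_cruise) / t_dec)
    return 0.0

profile = trapezoidal_profile(L=50.0, f_max=20.0, a_max=100.0)
print([round(feedrate_at(t, profile), 2) for t in (0.0, 0.1, 1.0, 2.6)])
```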
    A note on solid modeling: history, state of the art, future 
    ZOU Qiang
    2022, 43(6): 987-1001.  DOI: 10.11996/JG.j.2095-302X.2022060987
Solid modeling is a technique underlying CAD software as we see it today, and its theories and algorithms are among the most fundamental milestones in the historical development of CAD. Essentially, it has answered the question of what geometric information a computer should store, and how to store and manipulate it, in order for the computer to aid the processes of design and manufacturing. This paper provides a brief review of the historical development of solid modeling, its fundamental research problems, and their challenges and state of the art. It then concludes with three prospective trends of solid modeling, especially the promising paradigm shift from “Computer-Aided Design” to “Computer-Automated Design”.
A survey of theory and applications of U-system and V-system
CHEN Wei, CAI Zhan-chuan, LI Jian, LIANG Yan-yan, XIONG Gang-qiang, SONG Rui-xia
    2022, 43(6): 1002-1017.  DOI: 10.11996/JG.j.2095-302X.2022061002
Traditional Fourier analysis and continuous wavelet methods produce relatively large errors on discontinuous signals because of the Gibbs phenomenon. To solve this problem, Qi Dongxu proposed the research topic of discontinuous orthogonal function systems, among which the U-system and V-system are two typical discontinuous complete orthogonal function systems. In terms of mathematical theory, the U-system and V-system extend the well-known Walsh functions and Haar functions, respectively, from piecewise constants to piecewise polynomials of degree k. The most important feature of the U-system is that the function system contains both smooth functions and discontinuous functions at every level. Therefore, U- and V-systems can process both continuous and discontinuous information, making up for the shortcomings of Fourier analysis and continuous wavelets to a certain extent. This paper reviewed U- and V-systems from two aspects: theory and application. On the theoretical side, the construction methods of the univariate U-system and V-system were introduced first, followed by the construction of the V-system on triangular domains, and finally the main properties of U- and V-systems. On the application side, some representative application cases were introduced.
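The Haar system is the piecewise-constant (k = 0) special case that U- and V-systems generalize. The sketch below, under that assumption, projects a step function onto the Haar-spanned space up to a given level (equivalently, averages over dyadic cells) and shows that the jump is reproduced without Gibbs overshoot.

```python
# Illustrative k = 0 special case: the orthogonal projection of a discontinuous
# signal onto Haar functions up to level n equals its averages over the 2**n
# dyadic cells, so a jump is reproduced without Gibbs ringing.
import numpy as np

def haar_projection(f, level, samples=1024):
    x = (np.arange(samples) + 0.5) / samples        # midpoints of [0, 1)
    y = f(x)
    cells = 2 ** level
    cell_of = (x * cells).astype(int)
    means = np.array([y[cell_of == c].mean() for c in range(cells)])
    return means[cell_of]                           # piecewise-constant projection

step = lambda x: np.where(x < 0.5, 0.0, 1.0)        # a discontinuous signal
proj = haar_projection(step, level=3)
print(proj.min(), proj.max())                       # stays in [0, 1]: no overshoot
```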
T-splines: a new representation for CAD, CAE and CAM
HU Wen-kai, MA Hong-yu, LIU Ya-zui, WEI Xiao-dong, ZHAO Gang, SHEN Li-yong, LI Xin
    2022, 43(6): 1018-1033.  DOI: 10.11996/JG.j.2095-302X.2022061018
Geometric modeling is a main research area of computer-aided design (CAD), computer-aided engineering (CAE), and computer-aided manufacturing (CAM), focusing on the representation and design of geometric models on computers. Currently, the standard representation is non-uniform rational B-splines (NURBS). However, owing to its tensor-product structure, NURBS has several limitations in representing complex models. T-splines are a new NURBS-compatible free-form surface representation that has drawn wide attention from both academia and industry by overcoming several limitations of the standard NURBS representation. To enable a more comprehensive understanding of T-splines, a systematic review was conducted. Based on extensive literature research, the definition and basic operations of T-splines and the applications of T-splines in CAD, CAE, and CAM were summarized and compared, focusing on the basic ideas, theories, merits, and demerits of the various algorithms. Given the high accuracy and efficiency demanded of geometric representations in industry, research on T-splines is still far from complete, and a number of problems and possible directions remain to be explored.
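For context, a T-spline blending function in each parametric direction is a B-spline basis function over a local knot vector (five knots in the cubic case). The following sketch is a plain Cox-de Boor evaluation of such a basis function, not code from any particular T-spline system.

```python
# A cubic B-spline basis function over a local knot vector, evaluated with the
# Cox-de Boor recursion. T-spline blending functions are products of such
# univariate functions, one per parametric direction.
def bspline_basis(knots, degree, u):
    """N(u) of degree `degree` over the local knot vector `knots` (len = degree + 2)."""
    if degree == 0:
        return 1.0 if knots[0] <= u < knots[1] else 0.0
    left, right = 0.0, 0.0
    if knots[degree] != knots[0]:
        left = (u - knots[0]) / (knots[degree] - knots[0]) \
               * bspline_basis(knots[:-1], degree - 1, u)
    if knots[degree + 1] != knots[1]:
        right = (knots[degree + 1] - u) / (knots[degree + 1] - knots[1]) \
                * bspline_basis(knots[1:], degree - 1, u)
    return left + right

local_knots = [0.0, 1.0, 2.0, 3.0, 4.0]    # cubic: 5 local knots
print(bspline_basis(local_knots, 3, 2.0))  # peaks (value 2/3) at the middle of its support
```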
    Methods of porous structure design 
LI Ming, ZHANG Cheng-hu, HU Jing-qiao, HU Xin-zhuo, LIU Ji-kai
    2022, 43(6): 1034-1048.  DOI: 10.11996/JG.j.2095-302X.2022061034
Porous models are lightweight and have excellent composite mechanical, thermal, and magnetic properties. They are expected to break through traditional design limits, to yield mechanical parts with excellent overall performance, and to meet the extreme physical-performance requirements of advanced industrial products. In recent years, the development and maturation of additive manufacturing technology have boosted the industrial application of porous models, which play a unique and outstanding role in aerospace components, medical devices, and other important equipment and instruments. This review focused on design methods for porous models, describing related work from two aspects: forward design via geometric modeling and inverse design via topology optimization. For the former, porous model modeling methods were discussed, including discrete voxel representations, continuous parametric representations, continuous implicit representations, and other or hybrid representations. The latter was expounded in terms of optimization-based design of porous microstructure units and of overall porous model structures, together with the trends of porous model design in these two directions.
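As one hedged example of the continuous implicit representations mentioned above, the gyroid approximation sin x cos y + sin y cos z + sin z cos x = c is a widely used porous unit cell; the sketch samples it on a voxel grid and thresholds it into an occupancy grid. The resolution and thickness values are illustrative.

```python
# Illustrative continuous implicit representation of a porous unit cell: the
# gyroid approximation. Thresholding |field| < thickness gives a sheet-like lattice.
import numpy as np

def gyroid_voxels(resolution=64, periods=2, thickness=0.4):
    t = np.linspace(0.0, 2.0 * np.pi * periods, resolution)
    x, y, z = np.meshgrid(t, t, t, indexing="ij")
    field = np.sin(x) * np.cos(y) + np.sin(y) * np.cos(z) + np.sin(z) * np.cos(x)
    solid = np.abs(field) < thickness          # boolean occupancy grid
    return solid

solid = gyroid_voxels()
print("solid volume fraction:", solid.mean())  # porosity = 1 - volume fraction
```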
    Geometry-guided active 3D perception and interaction
    XU Kai, HU Rui-zhen, YANG Xin
    2022, 43(6): 1049-1056.  DOI: 10.11996/JG.j.2095-302X.2022061049
With the proliferation of 3D sensors and the growth of large-scale 3D data, visual perception based on 3D reconstruction and understanding has received much attention. Meanwhile, intelligent graphics is also driving a breakthrough in active interaction, which is becoming task-driven and targets both virtual and real environments. In this sense, computer graphics, traditionally a field of information expression, is now expanding into the territory of information sensing, and its interaction is moving towards active interaction driven by intelligent tasks. Alongside this trend, data-driven analysis and modeling of 3D data, especially the corresponding online techniques, play a critical role. This article expounded on active 3D perception and interaction from the perspective of the fusion between graphics and vision, with several concrete research examples. Special emphasis was put on the advantages and challenges of being active in 3D perception and 3D interaction, and tentative explorations were made of the open problems and trends in this direction.
    Computer Graphics and Virtual Reality
    Representation of a kind of G2 continuous composite curve 
    YAN Lan-lan, SONG Xi-chen, WEI Zi-hua, XIE Lei
    2022, 43(6): 1057-1069.  DOI: 10.11996/JG.j.2095-302X.2022061057
In view of the strict requirements that the G2 continuity conditions of the Bézier curve and of many existing extended Bézier curves with shape parameters place on the control points, a G2 continuous composite curve representation method was proposed. The method combined the advantages of the Bézier and B-spline methods, and its basis functions had explicit expressions. It possessed the automatic smoothness of the B-spline method while easily retaining the end-point geometric properties of the Bézier curve. To this end, a set of basis functions with six parameters was constructed. On this basis, a curve segment defined by four control points was constructed following the definition of the cubic Bézier curve. According to the G2-continuity conditions between curve segments, a composite curve based on a four-point piecewise scheme was then constructed following the definition of the cubic B-spline curve. The basis functions were totally positive, and contained as special cases the cubic Bernstein basis functions and the cubic B-spline basis functions determined by a knot vector whose internal knots all have multiplicity one. The curve segment was convexity-preserving, had prescribed end-point behaviour, and had an adjustable shape; it contained the cubic Bézier curve and the cubic B-spline curve segment as special cases. The definition of the composite curve automatically guaranteed its G2 continuity at every junction. By setting some of its parameters to specific values, the composite curve could interpolate the end points and be tangent to the end edges, while still retaining independent parameters for adjusting its interior shape. As long as the parameters were selected according to certain rules, the C2 continuous cubic B-spline curve could be reconstructed.
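As a small illustration of the special case contained in the proposed curves, the sketch below evaluates a cubic Bézier segment on four control points with de Casteljau's algorithm and checks the end-point interpolation property referred to in the abstract; it is not the paper's composite-curve construction.

```python
# A cubic Bezier segment on four control points, evaluated with de Casteljau's
# algorithm: it interpolates the end points and is tangent to the end edges,
# the end-point behaviour the composite curve above preserves.
import numpy as np

def de_casteljau(control_points, t):
    pts = np.asarray(control_points, dtype=float)
    while len(pts) > 1:
        pts = (1.0 - t) * pts[:-1] + t * pts[1:]   # repeated linear interpolation
    return pts[0]

P = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
print(de_casteljau(P, 0.0))   # == P[0]  (end-point interpolation)
print(de_casteljau(P, 1.0))   # == P[3]
print(de_casteljau(P, 0.5))   # interior point inside the convex hull of P
```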
    Free-form deformation based on extension factor for toric-Bézier curve 
WANG Han, ZHU Chun-gang
    2022, 43(6): 1070-1079.  DOI: 10.11996/JG.j.2095-302X.2022061070
To obtain ideal geometric deformation results, the expansion factor and toric degeneration were applied to the toric-Bézier curve to realize its free-form deformation. Firstly, a weight factor with parameter t was constructed from the given lifting function, yielding a toric-Bézier curve with parameter t. Secondly, according to the chosen center of deformation, region of deformation, smoothness of the deformation region boundary, and the selection rule of the control function f(t), an appropriate control function was selected and the extension factor was determined, thus constructing the deformation matrix. The deformation matrix was then applied to the toric-Bézier curve with parameter t. Finally, as t tended to infinity, the target curve was obtained, achieving the free-form deformation of the toric-Bézier curve. By changing the control parameters interactively, the expected deformation result could be attained, and deformation animations of the toric-Bézier curve could be produced. The experiments showed that the technique was simple and easy to control. The curve could be deformed freely both globally and locally, and the technique was adjustable and predictable. The technique could be applied repeatedly to generate rich deformation animations, and is applicable to many fields such as geometric modeling and computer animation.
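The following toy sketch, which is not the paper's construction, illustrates the limiting behaviour that weight-type factors provide in rational curves: as the weight of one control point of a rational Bézier curve grows, the curve is pulled toward that point, analogous to the degeneration-in-the-limit idea used above.

```python
# Toy illustration: in a rational Bezier curve, increasing the weight w of one
# control point pulls the curve toward it; in the limit w -> infinity the curve
# passes through that control point.
import math

def rational_bezier(control_points, weights, t):
    n = len(control_points) - 1
    num = [0.0, 0.0]
    den = 0.0
    for i, ((x, y), w) in enumerate(zip(control_points, weights)):
        b = math.comb(n, i) * (1 - t) ** (n - i) * t ** i * w
        num[0] += b * x
        num[1] += b * y
        den += b
    return num[0] / den, num[1] / den

P = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
for w in (1.0, 10.0, 1000.0):
    print(w, rational_bezier(P, [1.0, w, 1.0], 0.5))   # approaches (1.0, 2.0)
```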
    Atomic model rendering method based on reference images 
WU Chen, CAO Li, QIN Yu, WU Miao-miao, Koo SiuKong
    2022, 43(6): 1080-1087.  DOI: 10.11996/JG.j.2095-302X.2022061080
With advances in biology and in the simulation of nano-electronic devices, atomic structures play a crucial role in modern science and technology. Because of the complex details of atomic structures, the position of the light source strongly affects the rendering result, which makes atomic models difficult to render. To address this, an atomic model rendering method based on a reference image was proposed, in which the lighting parameters of the reference image are estimated and used to render the atomic model. First, a POV-Ray script was used to render a batch of models under different light angles by changing the light source position, and the light source position parameters and rendered images were collected to build a dataset of rendered images paired with light source positions. Then, a light source estimation network was designed with a residual neural network as the backbone, and an attention mechanism was embedded in the network to enhance its accuracy. The light source estimation network was trained on this dataset to regress the light source position parameters. Finally, the trained convolutional neural network was used to estimate the rendering parameters of the reference image, and the target model was rendered with these parameters. The experimental results show that the parameters predicted by the network are highly reliable, with minimal error compared with the true lighting parameters.
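A minimal sketch of the regression setup described above, assuming PyTorch and torchvision: a ResNet-18 backbone whose final layer regresses the light-source position parameters from an input image. The attention module, the POV-Ray dataset, and the training specifics of the paper are omitted.

```python
# Sketch: a ResNet-18 backbone (torchvision assumed) whose final layer
# regresses light-source position parameters from a rendered reference image.
import torch
import torch.nn as nn
import torchvision

class LightEstimator(nn.Module):
    def __init__(self, out_dim=3):             # e.g. a 3D light position
        super().__init__()
        self.backbone = torchvision.models.resnet18(weights=None)
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, out_dim)

    def forward(self, image):                   # image: (B, 3, H, W)
        return self.backbone(image)

model = LightEstimator()
pred = model(torch.randn(2, 3, 224, 224))       # predicted light parameters
loss = nn.functional.mse_loss(pred, torch.randn(2, 3))
print(pred.shape, float(loss))
```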
Fast rendering of Boolean difference of solids for CNC machining simulation
    FENG Zi-yan, WU Rui-cheng, BO Peng-bo
    2022, 43(6): 1088-1095.  DOI: 10.11996/JG.j.2095-302X.2022061088
To improve the speed of CNC milling simulation, a method for fast rendering of the Boolean difference between sets of solids was proposed. By spatially partitioning the solids, the dimension of the problem was reduced and fast rendering could be achieved. Subdividing the viewing frustum reduced both the number of tool swept volumes in the rendering sequence and the repeated drawing of primitives. At the same time, the data of each sub-window was independent during drawing, facilitating the parallel generation of images. Several experiments demonstrated that the proposed algorithm achieves real-time rendering for CNC milling process simulation; the algorithm's parameters were analyzed and verified experimentally, and machining simulation results were provided for several industrial parts.
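One way to picture the sub-window idea is screen-space tile binning: each swept volume is assigned only to the viewport tiles that its projected bounding box overlaps, so each tile draws fewer primitives and tiles can be processed in parallel. The sketch below is a generic illustration of that binning step, not the paper's renderer; projection to screen space is assumed already done.

```python
# Generic tile binning: assign each tool swept volume only to the viewport
# tiles its projected bounding box overlaps, so each tile (sub-window) draws
# far fewer primitives and tiles can be rendered independently in parallel.
from collections import defaultdict

def bin_into_tiles(screen_boxes, width, height, tiles_x, tiles_y):
    """screen_boxes: list of (xmin, ymin, xmax, ymax) in pixels."""
    tile_w, tile_h = width / tiles_x, height / tiles_y
    tiles = defaultdict(list)                      # (tx, ty) -> primitive ids
    for pid, (x0, y0, x1, y1) in enumerate(screen_boxes):
        tx0, tx1 = int(x0 // tile_w), min(int(x1 // tile_w), tiles_x - 1)
        ty0, ty1 = int(y0 // tile_h), min(int(y1 // tile_h), tiles_y - 1)
        for tx in range(tx0, tx1 + 1):
            for ty in range(ty0, ty1 + 1):
                tiles[(tx, ty)].append(pid)
    return tiles

boxes = [(0, 0, 100, 80), (300, 200, 420, 260), (90, 70, 130, 120)]
print(dict(bin_into_tiles(boxes, width=640, height=480, tiles_x=4, tiles_y=4)))
```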
    AC-HAPE3D: an algorithm for irregular packing based on reinforcement learning
    ZHU Peng-hui, YUAN Hong-tao, NIE Yong-wei, LI Gui-qing
    2022, 43(6): 1096-1103.  DOI: 10.11996/JG.j.2095-302X.2022061096
In areas such as 3D printing and express logistics, irregular packing arises from the need to place parts or goods of different shapes in a limited space. One may seek a placement that fits as many polyhedra as possible into a given container, or pack a batch of objects so closely that they occupy the smallest possible volume; this is known as the irregular packing problem. It is NP-hard and difficult to solve efficiently. This paper investigated the following setting: placing a given set of polyhedra inside a 3D container with one variable dimension, so that the variable dimension of the packed container is minimized. We proposed a reinforcement learning based algorithm, AC-HAPE3D. The algorithm models the problem as a Markov decision process using the heuristic algorithm HAPE3D, and then applies the policy-based reinforcement learning method Actor-Critic. We simplified the representation of state information by using voxels to represent containers and polyhedra, and employed neural networks to represent the value and policy functions. To handle the variable length of the state information and of the action space, we adopted a masking approach that masks some of the inputs and outputs, and introduced an LSTM to process variable-length state information. Experiments on five different datasets show that the algorithm yields good results.
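A hedged sketch of the masking trick for a variable action space, assuming PyTorch: logits of invalid placement actions are set to minus infinity before the softmax so the Actor-Critic policy never samples them. The voxel encoder, LSTM state handling, and the HAPE3D placement heuristic of AC-HAPE3D are not reproduced.

```python
# Action masking for a variable-size action space in an actor-critic head:
# invalid placements receive -inf logits, so their probability is exactly zero.
import torch
import torch.nn as nn

class MaskedPolicyHead(nn.Module):
    def __init__(self, feat_dim, max_actions):
        super().__init__()
        self.actor = nn.Linear(feat_dim, max_actions)    # policy logits
        self.critic = nn.Linear(feat_dim, 1)              # state value

    def forward(self, state_feat, valid_mask):
        logits = self.actor(state_feat)
        logits = logits.masked_fill(~valid_mask, float("-inf"))
        probs = torch.softmax(logits, dim=-1)              # invalid actions get 0
        return probs, self.critic(state_feat)

head = MaskedPolicyHead(feat_dim=128, max_actions=6)
state = torch.randn(1, 128)
mask = torch.tensor([[True, True, False, True, False, False]])
probs, value = head(state, mask)
action = torch.distributions.Categorical(probs).sample()
print(probs, action.item(), value.item())
```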
    Error-bounded unstructured T-spline surface fitting with low distortion 
    GUAN Qi-chao, LIU Hao, WANG Yuan-cheng, FU Xiao-ming
    2022, 43(6): 1104-1113.  DOI: 10.11996/JG.j.2095-302X.2022061104
To compute low-distortion unstructured T-spline fitting surfaces that satisfy a fitting error threshold with few control points for fitting domains of arbitrarily complex topology, we presented a step-by-step solution method. First, a polycube structure with the same topology as the fitting domain was generated as the parameter domain, and the correspondence between the surface to be fitted and the parameter domain was optimized through multiple re-parameterization passes, yielding a low-distortion mapping suitable for generating spline surfaces with low fitting error. Exploiting the local subdivision property of unstructured T-splines, regions that did not meet the fitting error threshold were adaptively subdivided, producing a low-distortion spline surface that satisfies the error threshold. Next, a simplification strategy for the fitting surface was presented to delete redundant control vertices: while maintaining the fitting error threshold and low distortion, redundant control vertices were removed, giving a low-distortion unstructured T-spline fitting surface with fewer control vertices and bounded error. The effectiveness of this method was verified on various complex models. Compared with the latest methods, this method attains lower parametric distortion with fewer control vertices.
    Circle packing based texture generation
    HE Ke-yu, CHEN Zhong-gui
    2022, 43(6): 1114-1123.  DOI: 10.11996/JG.j.2095-302X.2022061114
Artificial decorative textures are widely used in daily life. Traditional example-based texture generation methods first place some small primitives in the target area and then iteratively grow these primitives until the entire target area is filled. During the iterations, adjacent primitives intersect and overlap, which requires deforming, clipping, and otherwise processing the primitives and is usually time-consuming. Procedural methods can generate richly layered textures in the 2D plane by designing various rules with complex structures, but such methods are difficult to extend to 3D space. This paper presented a texture generation method based on circle packing that can generate both 2D and 3D textures. Although circle packing is NP-hard, it can be converted into a nonlinear optimization problem and thus solved quickly and approximately. Once the packing is computed, different rules can be defined to fill or replace the circles to generate textures. Since the texture is generated by rules, the proposed method avoids intersections and overlaps between primitives.
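A minimal version of the "packing as nonlinear optimization" idea, not the paper's exact energy: circle centers with fixed radii are updated by gradient descent on a penalty that punishes pairwise overlap, with a hard clamp keeping circles inside the unit square.

```python
# Circle packing as nonlinear optimization (toy version): fixed radii, centers
# optimized by gradient descent on a pairwise overlap penalty.
import numpy as np

rng = np.random.default_rng(0)
n = 20
radii = rng.uniform(0.05, 0.12, size=n)
centers = rng.uniform(0.15, 0.85, size=(n, 2))

def pack_step(centers, radii, lr=0.5):
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.linalg.norm(diff, axis=-1) + np.eye(len(radii))   # avoid /0 on diagonal
    overlap = np.maximum(radii[:, None] + radii[None, :] - dist, 0.0)
    np.fill_diagonal(overlap, 0.0)
    # move each circle along the negative gradient of the overlap penalty
    push = ((overlap / dist)[:, :, None] * diff).sum(axis=1)
    centers = centers + lr * push
    # keep every circle inside the unit square by clamping its center
    return np.clip(centers, radii[:, None], 1.0 - radii[:, None])

for _ in range(500):
    centers = pack_step(centers, radii)

diff = centers[:, None, :] - centers[None, :, :]
dist = np.linalg.norm(diff, axis=-1) + 2 * np.eye(n)
print("max residual overlap:", (radii[:, None] + radii[None, :] - dist).max())
```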
    Image Processing and Computer Vision
Point cloud face recognition based on multi-scale attention fusion and noise-resistant adaptive loss function
GUO Wen, LI Dong, YUAN Fei
    2022, 43(6): 1124-1133.  DOI: 10.11996/JG.j.2095-302X.2022061124
The key to point cloud face recognition is extracting discriminative features while remaining robust to the noise in low-quality data. To address the problems that existing lightweight point cloud face recognition algorithms cannot adequately extract discriminative features and that the large amount of noise in the dataset disturbs model training, we designed a lightweight and efficient network model and proposed a point cloud face recognition algorithm based on multi-scale attention fusion and a noise-resistant adaptive loss function. Firstly, features from receptive fields of different sizes were generalized. Then, multi-scale attention features were extracted, and high-level attention weights were used to guide the generation of low-level attention weights. Finally, channel fusion was performed to obtain multi-scale fused features, improving the model's ability to capture facial details. Meanwhile, according to the noise characteristics of low-quality point cloud face images, a novel noise-resistant adaptive loss function was designed to counter the possible negative impact of heavy dataset noise on training, thus enhancing the robustness and generalization ability of the model. Experiments on open-source datasets such as Lock3Dface and KinectFaces show that the proposed method achieves better accuracy on low-quality 3D face recognition.
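A generic sketch of attention-weighted fusion of point features from two receptive-field scales, assuming PyTorch; the paper's cross-level attention guidance and noise-resistant adaptive loss are not reproduced here.

```python
# Generic multi-scale fusion: SE-style channel attention applied to point
# features from a small and a large receptive field, then concatenated.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, feat):                       # feat: (B, N, C) point features
        weights = self.fc(feat.mean(dim=1))        # pool over points -> (B, C)
        return feat * weights.unsqueeze(1)         # reweight channels

class MultiScaleFusion(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.att_small = ChannelAttention(channels)   # small receptive field
        self.att_large = ChannelAttention(channels)   # large receptive field

    def forward(self, feat_small, feat_large):
        return torch.cat([self.att_small(feat_small),
                          self.att_large(feat_large)], dim=-1)

fusion = MultiScaleFusion()
fused = fusion(torch.randn(2, 1024, 64), torch.randn(2, 1024, 64))
print(fused.shape)                                  # (2, 1024, 128)
```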
    3D object detection based on semantic segmentation guidance 
CUI Zhen-dong, LI Zong-min, YANG Shu-lin, LIU Yu-jie, LI Hua
    2022, 43(6): 1134-1142.  DOI: 10.11996/JG.j.2095-302X.2022061134
3D object detection is one of the most popular research fields in computer vision. In a self-driving system, 3D object detection perceives surrounding objects from captured point cloud and RGB image information so that the route ahead can be planned for the vehicle; accurate detection and perception of the surrounding environment is therefore of great importance. To address the loss of foreground points caused by random sampling in 3D object detection, a random sampling algorithm based on semantic segmentation was proposed, which guides the sampling process with predicted semantic features so as to increase the proportion of sampled foreground points and improve detection precision. In addition, to address the inconsistency between the localization confidence and the classification confidence of 3D object detection, the CL joint loss was proposed, leading the network to select 3D bounding boxes with both high localization confidence and high classification confidence, and avoiding the ambiguity caused by traditional NMS considering only the classification confidence. Experiments on the KITTI 3D object detection dataset show that the proposed method improves precision at all three difficulty levels (easy, moderate, and hard), which verifies its effectiveness for the 3D object detection task.
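The sampling idea can be sketched as follows, assuming the foreground scores from a segmentation network are given: points are kept with probability proportional to their predicted foreground score rather than uniformly, which raises the fraction of foreground points in the sampled set.

```python
# Semantics-guided sampling: keep points with probability proportional to their
# predicted foreground score instead of uniform random sampling.
import numpy as np

def semantic_guided_sample(points, fg_scores, n_keep, eps=1e-3):
    probs = fg_scores + eps                       # keep a small chance for background
    probs = probs / probs.sum()
    idx = np.random.default_rng(0).choice(len(points), size=n_keep,
                                          replace=False, p=probs)
    return points[idx]

points = np.random.default_rng(1).uniform(-40, 40, size=(10000, 3))
fg_scores = (np.linalg.norm(points[:, :2], axis=1) < 10).astype(float)  # toy "foreground"
sampled = semantic_guided_sample(points, fg_scores, n_keep=2048)
print("foreground ratio:", (np.linalg.norm(sampled[:, :2], axis=1) < 10).mean())
```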
    Image smoothing based on image decomposition and relative total variation
LIU Ye-peng, YANG De-zhi, LI Si-yuan, ZHANG Fan, ZHANG Cai-ming
    2022, 43(6): 1143-1149.  DOI: 10.11996/JG.j.2095-302X.2022061143
The purpose of image smoothing is to remove texture details from an image while preserving its important structural edges; correctly distinguishing the two is therefore the key to image smoothing. The gradient, as an important measure of how fast an image changes, is an effective indicator for distinguishing structural edges from texture details. However, the gradient difference between texture details and structural edges is not fixed across different images or across different regions of the same image. To distinguish structural edges and texture details effectively based on gradients, an image smoothing method based on image decomposition and relative total variation was proposed. To enlarge the difference between structural edges and texture details, the gradients of texture details were reduced while changing the gradients of structural edges as little as possible: the image was decomposed in the frequency domain under multi-directional gradient constraints, and the smooth component of the decomposed image was extracted. Next, for the smooth component of the input image at a specific scale, based on the structural differences within regions of a specific size, the relative total variation method was employed to remove texture details at that scale while preserving structural edges. Finally, through iterative optimization, the size of the image region was adjusted continuously to gradually remove texture details of different scales. Compared with existing algorithms, the new method achieves better visual results, effectively removing texture details while fully preserving structural edges.
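A sketch of the relative total variation measure itself (after Xu et al.), not of the full decomposition-plus-smoothing pipeline: the windowed total variation D is divided by the windowed inherent variation L; oscillating texture gradients cancel in L, so D/(L + epsilon) is large on texture and close to 1 on coherent structural edges. scipy is assumed.

```python
# Relative total variation (RTV) map used as a texture/structure indicator:
# D = Gaussian-windowed sum of |gradient|, L = |Gaussian-windowed sum of gradient|.
import numpy as np
from scipy.ndimage import gaussian_filter

def rtv_map(img, sigma=3.0, eps=1e-3):
    gx = np.diff(img, axis=1, append=img[:, -1:])   # horizontal gradient
    gy = np.diff(img, axis=0, append=img[-1:, :])   # vertical gradient
    Dx = gaussian_filter(np.abs(gx), sigma)         # windowed total variation
    Dy = gaussian_filter(np.abs(gy), sigma)
    Lx = np.abs(gaussian_filter(gx, sigma))         # windowed inherent variation
    Ly = np.abs(gaussian_filter(gy, sigma))
    return Dx / (Lx + eps) + Dy / (Ly + eps)

# Toy image: left half is a fine checkerboard texture, right part a flat step edge.
img = np.zeros((64, 64))
img[:, :32] = np.indices((64, 32)).sum(axis=0) % 2      # texture
img[:, 48:] = 1.0                                       # structural edge at column 48
rtv = rtv_map(img)
print("texture region RTV:", rtv[:, 5:25].mean(), " edge RTV:", rtv[:, 46:50].mean())
```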
    Multi-scale modality perception network for referring image segmentation
LIU Jing, HU Yong-li, LIU Xiu-ping, TAN Hong-chen, YIN Bao-cai
    2022, 43(6): 1150-1158.  DOI: 10.11996/JG.j.2095-302X.2022061150
Referring image segmentation (RIS) is the task of parsing the instance referred to by a text description and segmenting that instance in the corresponding image; it is a popular research topic in computer vision and multimedia. Currently, most RIS methods fuse single-scale text/image modality information to perceive the location and semantics of the referred instance. However, single-scale modal information can hardly cover both the semantics and the structural context needed to locate instances of different sizes, which hinders the model from perceiving referred instances of arbitrary size and hence from segmenting them. To solve this problem, this paper designed a Multi-scale Visual-Language Interaction Perception Module and a Multi-scale Mask Prediction Module. The former enhances the model's ability to perceive instances at different scales and promotes effective semantic alignment between modalities; the latter improves segmentation performance by fully capturing the semantic and structural information of instances at different scales. On this basis, this paper proposed a multi-scale modality perception network for referring image segmentation (MMPN-RIS). The experimental results show that MMPN-RIS achieves state-of-the-art performance on the oIoU metric on the three public datasets RefCOCO, RefCOCO+, and RefCOCOg, and also performs well on referred instances of different scales.
    Multimodal emotion recognition with action features
    SUN Ya-nan, WEN Yu-hui, SHU Ye-zhi, LIU Yong-jin
    2022, 43(6): 1159-1169.  DOI: 10.11996/JG.j.2095-302X.2022061159
In recent years, using computer science to realize emotion recognition from multimodal data has become an important research direction in natural human-computer interaction and artificial intelligence. Emotion recognition research using visual modality information usually focuses on facial features and rarely considers action features or multimodal features fused with action features. Although action is closely related to emotion, it is difficult to extract valid action information from the visual modality. In this paper, we started from the relationship between action and emotion and introduced action data extracted from the visual modality into the classic multimodal emotion recognition dataset MELD. Body action features were extracted with an ST-GCN model and applied to an LSTM-based single-modal emotion recognition task. In addition, body action features were introduced into bi-modal emotion recognition on the MELD dataset, improving the performance of the LSTM-based fusion model. Combining body action features with text features enhanced the recognition accuracy of the context model with pre-trained memory compared with using text features alone. The experimental results show that although the accuracy of body action features for emotion recognition is not higher than that of traditional text and audio features, body action features play an important role in multimodal emotion recognition. The experiments on single-modal and multimodal emotion recognition validate that people use actions to convey emotions, and that using body action features for emotion recognition has great potential.
    Face recognition-driven low-light image enhancement  
FAN Yi-hua, WANG Yong-zhen, YAN Xue-feng, GONG Li-na, GUO Yan-wen, WEI Ming-qiang
2022, 43(6): 1170-1181.  DOI: 10.11996/JG.j.2095-302X.2022061170
Images are susceptible to external lighting conditions and camera parameters, often turning out dark overall and poorly visualized, which can degrade the performance of downstream vision tasks and thus lead to security issues. In this paper, a contrastive-learning-based unpaired low-light image enhancement method termed Low-FaceNet was proposed for face recognition tasks. The backbone of Low-FaceNet is an image enhancement network based on the U-Net structure, into which three sub-networks are introduced, i.e., a feature retention network, a semantic segmentation network, and a face recognition network, to assist the training of the image enhancement network. The contrastive learning paradigm enables a large number of real-world unpaired low-light and normal-light images to be used as negative/positive samples, improving the generalization ability of the proposed model in in-the-wild scenarios. Incorporating high-level semantic information guides the low-level image enhancement network to enhance images with higher quality. In addition, the task-driven approach makes it possible to enhance images and improve face recognition accuracy simultaneously. Validated on several publicly available datasets, both visual and quantitative results show that Low-FaceNet can effectively improve the accuracy of face recognition under low-light conditions by enhancing image brightness while maintaining detailed image features.
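A generic contrastive objective of the kind described, assuming PyTorch: features of the enhanced image are pulled toward features of unpaired normal-light images (positives) and pushed away from unpaired low-light images (negatives). Low-FaceNet's exact loss terms and sub-networks are not reproduced.

```python
# Generic InfoNCE-style contrastive loss with unpaired positives/negatives.
import torch
import torch.nn.functional as F

def contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """anchor: (B, D); positives: (B, D); negatives: (B, K, D)."""
    anchor = F.normalize(anchor, dim=-1)
    positives = F.normalize(positives, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos_sim = (anchor * positives).sum(-1, keepdim=True)          # (B, 1)
    neg_sim = torch.einsum("bd,bkd->bk", anchor, negatives)       # (B, K)
    logits = torch.cat([pos_sim, neg_sim], dim=1) / temperature
    labels = torch.zeros(anchor.size(0), dtype=torch.long)        # positive is index 0
    return F.cross_entropy(logits, labels)

loss = contrastive_loss(torch.randn(4, 128), torch.randn(4, 128), torch.randn(4, 16, 128))
print(float(loss))
```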
    The construction and application of integral invariants and differential invariants of graphics and images
MO Han-lin, HAO You, GUO Rui, HAO Hong-xiang, ZHANG He, LI Qi, LI Hua
2022, 43(6): 1182-1192.  DOI: 10.11996/JG.j.2095-302X.2022061182
As common features of graphics and images, differential invariants and integral invariants, represented by moment invariants, play significant roles in fields such as computer vision, pattern recognition, and computer graphics. Over the past two decades, based on fundamental generating functions, our research group has constructed moment invariants for various types of graphics and image data, including grayscale images, color images, vector fields, point clouds, curves, and mesh surfaces, under geometric transforms, color transforms, image blurring, and total transforms. The research proved the existence of an isomorphism between geometric moment invariants and differential invariants under the affine transform, proposed a simple method for generating affine differential invariants by means of this property, and further derived differential invariants of graphics and images under the projective transform and the Möbius transform. To enhance the invariance of deep neural networks with respect to commonly used graphic/image transform models, we also explored how to combine certain invariants of graphics or images with deep neural network models. This paper reviewed and summarized our previous work, briefly introduced how to use fundamental generating functions to generate geometric moment invariants and differential invariants of graphics and images under the affine transform, analyzed typical applications and the advantages and disadvantages of graphic and image invariants, and proposed a future research plan.
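As a concrete entry point to the integral invariants discussed above, the sketch below computes the first two Hu moment invariants of a grayscale image from normalized central moments; these classical invariants are unchanged by translation, scaling, and rotation, as the transposed-image check illustrates.

```python
# The first two Hu moment invariants of a grayscale image, computed from
# normalized central moments; invariant to translation, scaling and rotation.
import numpy as np

def hu_invariants(img):
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    m00 = img.sum()
    xc, yc = (x * img).sum() / m00, (y * img).sum() / m00   # centroid

    def mu(p, q):                                           # central moment
        return ((x - xc) ** p * (y - yc) ** q * img).sum()

    def eta(p, q):                                          # normalized central moment
        return mu(p, q) / m00 ** (1 + (p + q) / 2.0)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2

img = np.zeros((64, 64))
img[20:40, 10:50] = 1.0                     # an axis-aligned rectangle
print(hu_invariants(img))
print(hu_invariants(img.T))                 # rotated by 90 degrees: same invariants
```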
    Visual information accumulation network for person re-identification
    GENG Yuan, TAN Hong-chen, LI Jing-hua, WANG Li-chun
    2022, 43(6): 1193-1200.  DOI: 10.11996/JG.j.2095-302X.2022061193
Previous person re-identification methods mostly focused on learning image attention regions while ignoring the impact of non-attention regions on the final feature learning. If feature learning in non-attention regions is strengthened while attention regions remain the focus, the final person features can be further enriched, which benefits the accurate identification of person identity. Based on this, this paper proposed a visual information accumulation network (VIA Net) with two branches: one branch tends to learn the global features of the image, while the other is expanded into a multi-branch structure that combines features of attention and non-attention regions to gradually strengthen the learning of local features, thus accumulating visual information and further enriching the feature representation. The experimental results show that VIA Net achieves high performance on person re-identification datasets such as Market-1501. An experiment on the In-Shop Clothes Retrieval dataset further shows that the network is also applicable to general image retrieval tasks and possesses a certain universality.
    Total Contents
    Total Contents of 2022
    2022, 43(6): 1201-1204. 