Welcome to Journal of Graphics
Bimonthly, Started in 1980
Administered by: China Association for Science and Technology
Sponsored by: China Graphics Society
Edited and Published by: Editorial Board of Journal of Graphics
Chief Editor: Guoping Wang
Editorial Director: Xiaohong Hou
ISSN 2095-302X
CN 10-1034/T
Current Issue
29 February 2024, Volume 45 Issue 1
Cover
Cover of issue 1, 2024
2024, 45(1): 0. 
Contents
Table of Contents for Issue 1, 2024
2024, 45(1): 0. 
Review
A review on neural radiance fields acceleration
WANG Zhiru, CHANG Yuan, LU Peng, PAN Chengwei
2024, 45(1): 1-13.  DOI: 10.11996/JG.j.2095-302X.2024010001

Neural radiance field (NeRF) has become an important research area in computer graphics and computer vision in recent years. Due to its highly realistic visual synthesis effects, NeRF has been widely used in photorealistic rendering, virtual reality, human body modeling, urban mapping, and other domains. NeRF employs neural networks to learn implicit representations of 3D scenes from input image sets and to synthesize highly realistic novel view images. However, the training and inference speeds of the original NeRF model are very slow, posing challenges for real-world deployment and application. To address this acceleration problem, researchers have approached it from the perspectives of scene modeling methods and ray sampling strategies. These works can be categorized into the following research directions: baking models, integrating models with discrete representation methods, enhancing sampling efficiency, using hash encoding to reduce the complexity of the MLP network, introducing scene generalization, and introducing deep supervision information and field decomposition methods. After introducing the background of the NeRF model, the advantages and characteristics of the representative methods in each of these directions were discussed and analyzed. Finally, the progress made in NeRF acceleration and future prospects were summarized.

A survey of dynamic 3D scene reconstruction
HUANG Jiahui, MU Taijiang
2024, 45(1): 14-25.  DOI: 10.11996/JG.j.2095-302X.2024010014

Three-dimensional reconstruction technology aims to recover the digital 3D representation of an observed scene through sensor input. It is an important research direction in the fields of computer graphics and vision, with significant applications in visualization, simulation, route planning, and various other tasks. Compared to static scenes, dynamic scenes introduce an additional temporal dimension. The reconstruction of dynamic scenes requires not only accurately reconstructing the geometric details of each frame but also capturing the motion trends of the target over time and the correlations needed for downstream analysis tasks, presenting greater challenges to the design of reconstruction algorithms. However, research on the reconstruction of dynamic scenes is still in its infancy, and systematic summaries of existing methodologies are notably lacking. To address these problems and to inform future algorithm design, the latest dynamic 3D scene reconstruction technologies in the literature were reviewed and summarized. A general definition of dynamic 3D scene reconstruction and its general solution framework were provided. Existing technologies were reviewed from the perspectives of dynamic 3D representation methods and optimization frameworks, and the reconstruction algorithms and processing methods for structured scenes were discussed. Finally, existing datasets were summarized, the open problems in dynamic 3D scene reconstruction were identified, and an outlook on future research was provided.

Human action recognition algorithm based on semantics guided neural networks
GUO Zongyang, LIU Lidong, JIANG Donghua, LIU Zixiang, ZHU Shukang, CHEN Jinghua
2024, 45(1): 26-34.  DOI: 10.11996/JG.j.2095-302X.2024010026

In recent years, modeling the three-dimensional coordinates of skeletal joints using deep feedforward neural networks has become a trend. However, challenges such as low recognition accuracy, large parameter counts, and poor real-time performance still persist in the field of skeletal data-based action recognition. In response, an improved network model built upon semantic-guided networks (SGN) was proposed. Firstly, a non-local feature extraction module was integrated into the original network to enhance its training and prediction performance in advanced semantic guidance models, thereby decreasing its computational complexity and inference time in natural language processing tasks. Secondly, an attention mechanism was implemented to learn the channel weights of each convolutional network layer and reduce the redundant information between channels, thus further enhancing the computational efficiency and recognition accuracy of the model. Additionally, a deformable convolution module was employed to dynamically learn the weights of different graph convolutional network (GCN) layer channels and effectively aggregate the joint features across different channels for the final classification of the network, thereby boosting the utilization of feature information. Finally, human action recognition experiments were conducted on the public datasets NTU RGB+D and NTU RGB+D 120. The numerical results demonstrated that the proposed network was an order of magnitude smaller than most networks, and it significantly outperformed the original network and several other state-of-the-art algorithms in terms of recognition accuracy.

Lightweight multi-modal pedestrian detection algorithm based on YOLO
YUAN Chao, ZHAO Yadong, ZHANG Yao, WANG Jiaxuan, XU Dawei, ZHAI Yongjie, ZHU Songsong
2024, 45(1): 35-46.  DOI: 10.11996/JG.j.2095-302X.2024010035

To address the problems of low pedestrian detection accuracy in low-light environments and large numbers of model parameters, a lightweight multi-modal pedestrian detection algorithm named EF-DEM-YOLO was proposed based on the YOLO framework. This algorithm employed the lightweight ES-MobileNet as the backbone feature extraction network and integrated ECA and SE-ECA attention mechanism modules into this network to enhance the important channel features, thereby improving the detection accuracy for small-target pedestrians. A DBL module based on depthwise separable convolution was also designed in the neck network to further reduce the number of parameters in the model. In addition, to improve the detection accuracy of pedestrians under low-light conditions, a weighted fusion method for the visible and infrared modalities based on image entropy was proposed. This method utilized the complementary features of the visible and infrared modalities under different lighting conditions, and the fusion module EWF was designed accordingly. In comparison to baseline methods, the proposed algorithm yielded significant improvements for pedestrian targets under different lighting conditions. The model’s mAP was increased by 55.5%, the MR was reduced by 85.9%, and the inference speed reached 33.4 frames per second, outperforming other classical object detection algorithms. This algorithm made real-time detection of pedestrian targets feasible in edge computing and low-light scenes.
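
As an illustration of the entropy-based weighting idea described in the abstract above, the following is a minimal sketch, not the paper's EWF module. It assumes each modality is given as a flat list of gray levels and that each modality's fusion weight is proportional to the Shannon entropy of its gray-level histogram; the function names `shannon_entropy` and `entropy_weighted_fusion` are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy (in bits) of the gray-level histogram of a flat pixel list."""
    n = len(pixels)
    counts = Counter(pixels)
    # Sum of p * log2(1/p) over the gray levels that actually occur.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def entropy_weighted_fusion(visible, infrared):
    """Fuse two aligned images pixel by pixel with entropy-proportional weights.

    Assumption (not taken from the paper): the weight of each modality is its
    entropy divided by the sum of both entropies; equal weights are used when
    both images are constant (zero entropy).
    """
    h_v = shannon_entropy(visible)
    h_i = shannon_entropy(infrared)
    total = h_v + h_i
    w_v = 0.5 if total == 0 else h_v / total
    w_i = 1.0 - w_v
    fused = [w_v * v + w_i * i for v, i in zip(visible, infrared)]
    return fused, (w_v, w_i)
```

Under this assumption, a nearly uniform (for example, very dark) visible image carries little entropy, so the infrared image dominates the fused result, which matches the complementary behavior the abstract describes for low-light scenes.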

PCB defect detection method based on fusion of MBAM and YOLOv5
HU Xin, HU Shuai, MA Lijun, SI Liyun, XIAO Jian, YUAN Ye
2024, 45(1): 47-55.  DOI: 10.11996/JG.j.2095-302X.2024010047

With the rapid development of the electronic information industry, the printed circuit board (PCB) industry, serving as its foundation, plays a crucial role in determining the quality of the electronic products built on it. To address the small defect targets, numerous defect types, and indistinct features in PCBs, which often lead to false detections and missed detections in the actual production process, a method based on a multi-branch attention module (MBAM) was proposed. This method attended to the feature map along three different dimensions to enhance feature extraction capabilities and allocate more attention to defect areas. By enhancing the YOLOv5 structure and integrating MBAM into the YOLOv5 network, the detection performance for small and medium-sized targets in PCBs was effectively improved. Finally, by comparing MBAM modules placed at different locations in the network, the best location was selected. The experimental results on the PCB defect dataset demonstrated that the improved PCB defect detection algorithm exhibited superior detection performance compared to other algorithms. The final AP reached 96.7%, an increase of 2 percentage points over the 94.7% of the standard YOLOv5. All other indicators also improved, and the algorithm could accurately identify PCB defect types while maintaining the detection speed.

Multi-directional text detection based on the fusion of enhanced feature extraction network and semantic feature
LV Ling, LI Hua, WANG Wu
2024, 45(1): 56-64.  DOI: 10.11996/JG.j.2095-302X.2024010056

A text detection method based on an enhanced feature extraction network and semantic feature fusion was proposed to address challenges such as the variable length and oblique angle of scene text. Firstly, an enhanced dilated residual module (EDRM) was designed by combining deformable convolution with atrous convolution for the layers conv4_x and conv5_x of ResNet18. This module served as the backbone network, enhancing the capability of feature extraction while increasing the feature map resolution and reducing the loss of spatial information. Secondly, to address the inadequacies of the existing algorithms in extracting text semantic features, bi-directional long short-term memory (BiLSTM) was applied to the feature fusion section, enhancing the representation ability of the fused feature map for scene text, the correlation of feature sequences, and the text localization ability of the model. The model was evaluated on the multi-directional text dataset ICDAR2015 and the long text dataset MSRA-TD500. The results demonstrated that, compared with the efficient DBNet algorithm, the F value of the proposed algorithm increased by 1.8% and 3.3%, respectively, showing strong competitiveness.

Deep multimodal medical image fusion network based on high-low frequency feature decomposition
WANG Xinyu, LIU Hui, ZHU Jicheng, SHENG Yurui, ZHANG Caiming
2024, 45(1): 65-77.  DOI: 10.11996/JG.j.2095-302X.2024010065

Multimodal medical image fusion aims to enhance the interpretability and applicability of medical images in clinical settings by leveraging correlations and complementary information across different imaging modalities. However, existing manually designed models often fail to effectively extract critical target features, resulting in issues such as blurred fusion images and loss of textural details. To address this, a novel deep multimodal medical image fusion network based on high-low frequency feature decomposition was proposed. This approach incorporated channel attention and spatial attention mechanisms into the fusion process, allowing for a more intricate fusion of high-low frequency features while preserving both global structure and local textural details. Firstly, the high-frequency features of two modal images were extracted using the pre-trained model VGG-19, and their low-frequency features were extracted through downsampling to form intermediate features between high and low frequencies. Secondly, a residual attention network was embedded in the feature fusion module to sequentially infer attention maps from independent channels and spatial dimensions. These maps were then employed to guide the adaptive feature optimization of input feature maps. Finally, the reconstruction module fused high-low frequency features and output the fusion image. Experimental results on both the Harvard open dataset and a self-created abdominal dataset demonstrated that compared to the source image, the fusion image produced by the proposed method achieved an 8.29% improvement in peak signal-to-noise ratio, 85.07% in structural similarity, 65.67% in correlation coefficient, 46.76% in feature mutual information, and 80.89% in visual fidelity.

Classification and segmentation network based on Transformer for triangular mesh
LI Jiaqi, WANG Hui, GUO Yu
2024, 45(1): 78-89.  DOI: 10.11996/JG.j.2095-302X.2024010078

The triangular mesh is an important geometric data structure for effectively expressing the shape details of 3D models. However, the irregular distribution of surface elements poses a challenge in directly applying existing neural networks to triangular meshes. To address the irregular structure of triangular meshes, a Transformer-based deep neural network for triangular meshes is proposed that takes mesh faces directly as tokens. Firstly, the centroid coordinates or spectral domain features of each face are used as its position information, its intrinsic features are used as the input feature, and position embedding is applied to the input feature. Secondly, the global feature is extracted through a self-attention module, and a face convolution module is employed to strengthen the extraction of local features. Finally, the local and global features are integrated to construct the classification and segmentation deep neural network for triangular meshes. The experimental results on the SHREC classification dataset and the COSEG segmentation dataset demonstrate the proposed method’s high accuracy and its effectiveness in improving the training speed.

IDD-YOLOv7: a lightweight method for multiple defect detection of insulators in transmission lines
ZHAI Yongjie, ZHAO Xiaoyu, WANG Luyao, WANG Yaru, SONG Xiaoke, ZHU Haoshuo
2024, 45(1): 90-101.  DOI: 10.11996/JG.j.2095-302X.2024010090

The YOLO object detection algorithm is currently the mainstream method for image-based detection of insulator defects in power transmission lines. However, due to the high complexity of existing models, a reasonable and effective parameter compression method is urgently needed as a prerequisite for deployment on UAV edge devices. Additionally, the complex backgrounds of insulator defect images captured by drones and the small size of the defects can lead to problems such as false detections and omissions. To address these issues, the Insulator Defect Detection-YOLOv7 (IDD-YOLOv7) model was proposed for multi-defect detection in power transmission line insulators, aiming to reduce model complexity and enhance robustness. Firstly, a coordinate attention mechanism was incorporated during the multi-scale feature fusion process to suppress interference from complex backgrounds and enhance the model’s global perception of small objects. Secondly, a C3GhostNetV2 module was designed to capture long-range dependencies between different spatial pixels, thus enhancing the model’s expressive power while reducing the parameter count and the number of floating-point operations. Lastly, the Focal-CIoU loss function was proposed to increase the contribution of high-quality anchors to the model and accelerate model convergence. Experimental results demonstrated that, compared with the baseline model, the mAP50 of this method increased by 3.8%, precision and recall increased by 1.7% and 7.6%, respectively, and the parameter count and floating-point operations decreased by 18.3% and 14.0%, respectively. The AP50 for insulator self-explosion, damage, and flashover defects increased by 0.8%, 4.5%, and 6.3%, respectively.

Diversified generation of theatrical masks based on SASGAN
GU Tianjun, XIONG Suya, LIN Xiao
2024, 45(1): 102-111.  DOI: 10.11996/JG.j.2095-302X.2024010102

To address the problem of low resolution and lack of realism in existing automatically generated theatrical masks, a stylized generative adversarial network (SASGAN) based on a self-attentive mechanism was proposed. Firstly, SASGAN introduced the self-attentive mechanism and vector quantization method based on StyleGAN, thereby enhancing the extraction of geometric structure features of mask patterns. Subsequently, the diversified differentiation generation (DDG) method was supplemented with a mask hue-assisted algorithm by expanding the data with DDG to build a theatrical mask dataset containing 12,599 images. The final training was performed on this dataset to generate mask images with both diversity and realism. The experimental results demonstrated significant improvement in data augmentation for theatrical masks using the DDG method compared to the traditional methods, while SASGAN enhanced the resolution and realism of theatrical masks, achieving the desired effect in subjective visualization.

Steel surface defect detection algorithm based on MCB-FAH-YOLOv8
CUI Kebin, JIAO Jingyi
2024, 45(1): 112-125.  DOI: 10.11996/JG.j.2095-302X.2024010112

To address the problems of misdetection, omission, and low detection accuracy in existing deep learning-based algorithms for detecting defects on steel surfaces, a YOLOv8 steel surface defect detection algorithm was proposed based on a modified CBAM (MCB) and a replaceable four-head ASFF prediction head (FAH), abbreviated as MCB-FAH-YOLOv8. By integrating the modified convolutional block attention module (CBAM), the algorithm could better discriminate densely packed targets. By changing the FPN structure to BiFPN, it could extract context information more efficiently. It also incorporated adaptively spatial feature fusion (ASFF) for the automatic identification of the most suitable fusion features. The algorithm also boosted its precision by replacing the SPPF module with the SimCSPSPPF module. Meanwhile, for tiny object detection, a four-head ASFF prediction head was proposed, designed to be replaceable based on the dataset characteristics. The experimental results demonstrated that the MCB-FAH-YOLOv8 algorithm achieved a detection accuracy (mAP) of 88.8% on the VOC2007 dataset and 81.8% on the NEU-DET steel defect detection dataset, outperforming the benchmark model by 5.1% and 3.4%, respectively. The algorithm achieved higher detection accuracy at a small cost in detection speed, thus ensuring a good balance between accuracy and speed.

Research on stylization method of copper chiseling paper-cutting
ZHOU Leijing, ZHANG Yuxin, LEI Rui, SHEN Aoyi
2024, 45(1): 126-138.  DOI: 10.11996/JG.j.2095-302X.2024010126

Copper chiseling paper-cutting is a traditional art form that involves chiseling copper foil and coloring it with mineral pigments, resulting in dazzling final products. The production process of copper chiseling paper-cutting art is complex and time-consuming, requiring a high level of technical proficiency from craftspeople. A method for stylizing copper chiseling paper-cutting was proposed, along with the design and implementation of a computer-assisted design tool for this art. This tool generated image outlines, chiseled point maps, and effect images of copper chiseling paper-cutting, aiding artisans in quickly completing the creation and production of this art. The input image was segmented to extract its lines, thus generating image outlines. A color loss function was defined to obtain the optimal color mapping scheme using a combination of a greedy algorithm and gradient descent method. Style transfer of the image lines was performed using the VGG-19 network to generate chiseled point maps. The line-style-transferred image was merged with the color-transferred image to generate the copper chiseling paper-cutting effect image. The design tool for the copper chiseling paper-cutting art was developed based on the PyQt5 framework, which provided an interactive platform. Experimental results demonstrated that the proposed method can quickly stylize images in the copper chiseling paper-cutting style, closely resembling the actual art form. It facilitated craftspeople in rapidly generating relevant materials such as image outlines, chiseled point maps, and effect images for the production process, thereby enhancing the efficiency of producing copper chiseling paper-cutting art and offering significant practical value.

Computer Graphics and Virtual Reality
Research on multi-constrained harness layout algorithm for harness pre-assembly
LUO Yuetong, PENG Jun, GAO Jingyi, LUO Ruiming, CHEN Ji, ZHOU Bo
2024, 45(1): 139-147.  DOI: 10.11996/JG.j.2095-302X.2024010139

A wiring harness is composed of a group of harness segments connected in a tree structure. It is a wiring component connecting electrical equipment in aircraft, automobiles, and other products. To improve assembly efficiency, complex wiring harnesses need to be pre-assembled on an assembly board. This involves placing the wiring harnesses on the assembly board in a way that meets process constraints such as angle, distance, intersection, and boundary, which presents a multi-constrained harness layout problem. The paper drew upon a graph layout method, transformed the wiring harness layout into an optimization problem, and employed stochastic gradient descent (SGD) to optimize a randomly selected pair of harness segments each time, gradually iterating and converging. Because harness segments are rigid bodies of constant length, moving one segment could affect other segments, leading to oscillations in the SGD iteration process and making convergence difficult. Therefore, a bidirectional transmission segment movement algorithm was proposed to minimize the impact on other segments while ensuring that the segment moved to the target position. Both synthetic cases and a real case of an aircraft wiring harness were used to verify effectiveness, and the results showed that the various process constraints could be met and that the production requirements for wiring harness pre-assembly could be satisfied.
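
The pairwise SGD idea borrowed from graph layout can be sketched as follows. This is a generic illustration only, assuming point-like nodes and a single kind of constraint (a desired distance per pair); it does not model rigid harness segments, the angle, intersection, or boundary constraints, or the paper's bidirectional transmission movement algorithm. The function name `sgd_pair_layout` and all parameters are hypothetical.

```python
import math
import random


def sgd_pair_layout(points, target_dist, steps=3000, lr0=0.5, decay=0.01, seed=0):
    """Move 2D points so that selected pairs approach their target distances.

    points      : list of (x, y) start positions
    target_dist : dict mapping an index pair (i, j) to a desired distance
    Each step picks one random pair and moves both endpoints along the line
    joining them, closing a fraction lr of the current distance error.
    """
    rng = random.Random(seed)
    pos = [list(p) for p in points]
    pairs = list(target_dist.items())
    for t in range(steps):
        (i, j), d = rng.choice(pairs)
        lr = lr0 / (1.0 + decay * t)      # decaying step size damps oscillation
        dx = pos[i][0] - pos[j][0]
        dy = pos[i][1] - pos[j][1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue                      # direction undefined, skip this pair
        r = lr * (dist - d) / (2.0 * dist)  # each endpoint takes half the move
        pos[i][0] -= r * dx
        pos[i][1] -= r * dy
        pos[j][0] += r * dx
        pos[j][1] += r * dy
    return pos
```

Because every update satisfies one pair at the expense of its neighbors, the iteration can oscillate when the step size is too large, which is the behavior the abstract attributes to rigid, constant-length segments; the decaying step size here is one common, assumed way to damp it.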

A seamless texture mapping method with highlight processing
SHI Min, WANG Bingqi, LI Zhaoxin, ZHU Dengming
2024, 45(1): 148-158.  DOI: 10.11996/JG.j.2095-302X.2024010148

As an important step of 3D reconstruction, texture mapping is directly related to the visual effect of the generated model. Classical texture mapping methods usually employ a simple blending method to obtain the surface texture: the corresponding texture regions are obtained by projecting the model surface onto each texture image and are then blended. However, due to the inaccuracy of the camera pose, the texture mapping results exhibit evident problems such as blurriness and ghosting. In addition, texture images acquired in strongly lit environments tend to contain highlight areas, resulting in texture color loss and diminished texture authenticity. To address these issues, a seamless texture mapping method capable of effectively eliminating highlight reflections was proposed. The method selected the best texture image for each model surface by assessing the quality of the texture image, and chromaticity consistency was utilized to constrain the optimized camera pose, eliminating obvious texture misalignments. To address the highlight problem, a highlight processing module was proposed, utilizing multi-view image information and employing a two-color reflection model to identify and process the highlight texture. Finally, adjustments were made to the texture color consistency to tackle color inconsistency between textures. The experimental results demonstrated that the proposed algorithm obtained superior texture mapping results compared with state-of-the-art methods and effectively eliminated highlight reflections.

A 3D human pose estimation approach based on spatio-temporal motion interaction modeling
LV Heng, YANG Hongyu
2024, 45(1): 159-168.  DOI: 10.11996/JG.j.2095-302X.2024010159

3D human pose estimation plays a crucial role in fields such as virtual reality and human-computer interaction. In recent years, the Transformer has been introduced into the domain of 3D human pose estimation to capture the spatiotemporal motion information of human joints. However, existing studies typically focus on the collective movement of joint clusters or exclusively model the movement of individual joints, without delving into the unique movement patterns of each joint and their interdependencies. Consequently, an innovative approach was proposed, which meticulously learnt the spatial information of 2D human joints in each frame and conducted an in-depth analysis of the specific movement patterns of each joint. Through the design of a motion information interaction module based on the Transformer encoder, the proposed method accurately captured the dynamic relationships between different joints. In comparison to existing models that directly learnt the overall motion of human joints, the proposed method enhanced prediction accuracy by approximately 3%. When benchmarked against the state-of-the-art MixSTE model, which primarily focused on individual joint movement, the proposed model demonstrated greater efficiency in capturing spatiotemporal features of joints, achieving an inference speed boost of over 20%, making it especially suitable for real-time inference scenarios.

Collaborative 3D modeling technique in virtual reality
WANG Haomiao, SANG Shengju, DUAN Xiaodong, ZHANG Weihua, TAO Tiwei, MA Ting
2024, 45(1): 169-182.  DOI: 10.11996/JG.j.2095-302X.2024010169

3D modeling technology plays a crucial role in various fields, but the predominantly desktop-based interaction in 3D modeling remains complex and abstract and lacks support for online collaboration. Therefore, utilizing the immersion, interaction, and imagination advantages of virtual reality (VR) technology, a networked collaborative 3D modeling method in a VR environment was proposed. This method enabled users to create 3D models through immersive interaction and supported multi-user real-time online visual collaboration. Firstly, a 3D model drawing interaction method in a VR environment was proposed. Secondly, the 3D models were categorized, and a 3D model mesh generation algorithm based on Layered Build was proposed for building planar models and 3D models. Finally, a 3D modeling network collaboration module in a VR environment was designed, achieving network synchronization through Socket communication. Comparative experiments against the modeling methods of traditional 3D modeling software demonstrated that this method was more concise, intuitive, and efficient, making it easy for ordinary users to master.

Extracting node center coordinates of point clouds in reticulated shell structure using least squares method
WANG Peng, XIN Peikang, LIU Yin, YU Fangqiang
2024, 45(1): 183-190.  DOI: 10.11996/JG.j.2095-302X.2024010183

To address the difficulty of measuring node coordinates in reticulated shell structures and the low efficiency of processing them, an algorithm for extracting node center coordinates from scanned point clouds of the reticulated shell structure was proposed. Firstly, point cloud data of the reticulated shell structure was collected using three-dimensional laser scanning technology and was preprocessed. Subsequently, based on the design model, local point cloud segmentation and plane fitting of the shell nodes were performed on the initial model. The attributes and characteristics of the fitted planes were determined, and the normal vector directions on the sides were corrected. Finally, the accurate center coordinates of the nodes in the reticulated shell structure were determined using spatial constraint relationships and the least squares algorithm. Taking the Sanya International Duty Free City project as an example, the results demonstrated a mean deviation of 2.89 mm between the center coordinates obtained by the proposed method and the Total Station measurements. Moreover, its efficiency was nearly 4 times higher than that of the traditional method. The method improved the accuracy and efficiency of node center coordinate extraction in the reticulated shell structure, offering accurate data support for subsequent tasks, including structural construction deviation review and the detailed design and processing of curtain wall panels.
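
One way a point can be recovered from several fitted planes by least squares is sketched below. This is an assumed formulation for illustration, not the paper's algorithm: each plane is given as a normal vector n and an offset d with n·x = d, and the returned point minimizes the sum of squared distances to all planes. The function names `solve3` and `least_squares_point` are hypothetical.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            raise ValueError("planes do not determine a unique point")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for k in (2, 1, 0):
        s = m[k][3] - sum(m[k][c] * x[c] for c in range(k + 1, 3))
        x[k] = s / m[k][k]
    return x


def least_squares_point(planes):
    """Point minimizing the sum of squared distances to planes (normal, d).

    Normals are normalized first, so each residual n.x - d is a true distance.
    The normal equations are (sum n n^T) x = sum d n.
    """
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for normal, d in planes:
        length = math.sqrt(sum(c * c for c in normal))
        n = [c / length for c in normal]
        d_unit = d / length
        for r in range(3):
            b[r] += d_unit * n[r]
            for c in range(3):
                a[r][c] += n[r] * n[c]
    return solve3(a, b)
```

With exactly three non-parallel planes this reduces to their intersection point; with more planes, measurement noise in the individual plane fits is averaged out, which is the usual motivation for a least squares formulation.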

Intelligent simulation social experiment method based on related event fields
CHEN Yulong, ZHANG Zhizhong, MA Lizhuang, YE Shulan, CHEN Mingang
2024, 45(1): 191-198.  DOI: 10.11996/JG.j.2095-302X.2024010191

The advancement of intelligent technology is rapidly promoting the development of digital China and continuously changing the relevant social and humanistic landscapes, while also bringing to the fore key issues such as the socially derived risks of artificial intelligence (AI). It is of great practical significance to prospectively explore new models for AI social experimentation and to regulate the development of AI technologies under the premise of adhering to social ethics and privacy protection. Methods for constructing event fields within specific social experimental settings were examined, alongside the interrelationships between characters, behaviors, and scenarios. Social transmission patterns and socially derived effects were also predicted based on data and experimentation. Through prospective virtual simulation game experiments, an attempt was made to validate the effectiveness of an intelligent simulation social experiment model, discover innovative theoretical models and methods for new AI social experiments, verify the role of event field models in intelligent simulation social experiments, and explore the prediction of social transmission patterns and socially derived effects based on data and experiments, aiming to provide forward-looking decision-making tools and intelligent management suggestions for the government.

Hybrid-structure based multi-view 3D scene reconstruction
ZHOU Jingyi, ZHANG Qitong, FENG Jieqing
2024, 45(1): 199-208.  DOI: 10.11996/JG.j.2095-302X.2024010199

Achieving accurate and efficient 3D reconstruction through PatchMatch-based multi-view stereo (MVS) algorithms remains a challenging task. The red-black checkerboard propagation method offers high computational efficiency, yet its corresponding view selection strategy lacks accuracy. The view selection strategy based on Markov chains can obtain more accurate matching results but lacks parallelism. To balance reconstruction quality and runtime, a hybrid-structure based multi-view 3D scene reconstruction algorithm was proposed. In the first stage, the algorithm employed a parallel row/column propagation strategy and a Markov chain-based view selection strategy to produce high-quality initial depth maps. Meanwhile, multi-level processing was utilized to improve the reconstruction quality of weakly textured regions. In the second stage, checkerboard propagation and a voting-based view selection strategy were used to increase computational efficiency and reduce reconstruction time. Extensive experiments and comparisons on the Strecha and ETH3D datasets demonstrated that the proposed algorithm generated results 2.5 times faster without loss of accuracy.

TCPVis: visual analysis system of traditional Chinese painting school based on six principles of Chinese painting
WANG Sijia, FENG Yingchaojie, ZHU Hang, ZHANG Wei, ZHU Lin, CHEN Wei
2024, 45(1): 209-218.  DOI: 10.11996/JG.j.2095-302X.2024010209

Traditional Chinese painting (TCP), throughout its historical development, has formed a diverse array of painting schools. Analyzing these painting schools can greatly enhance our understanding and appreciation of the value of TCP. However, existing tools are limited in their ability to represent the distinctive stylistic features of the paintings intuitively and effectively. In response, we proposed TCPVis, a visual analysis system for multi-dimensional feature analysis of painting schools in TCP based on the Six Principles of Chinese Painting. This system could automatically analyze the common features of different schools across these six principles, recommend dimension weights, and visualize the label distribution of the Six Principles dimensions to support user verification. The system also presented the collection of paintings according to a dimensionality reduction of the weighted painting features, assisting users in exploring stylistically similar paintings and supporting the analysis of correlations between painters. The effectiveness and usability of the system in analyzing features of painting schools and exploring related painters were demonstrated through two case studies and interviews with experts.

DGOA: point cloud upsampling based on dynamic graph and offset attention
HAN Yazhen, YIN Mengxiao, MA Weizhao, YANG Shigeng, HU Jinfei, ZHU Congyang
2024, 45(1): 219-229.  DOI: 10.11996/JG.j.2095-302X.2024010219

The point clouds obtained directly from 3D scanning equipment are often sparse, uneven, and noisy. Therefore, point cloud upsampling has become increasingly vital in fields such as point cloud reconstruction and rendering. A new point cloud upsampling network named DGOA was proposed based on Dynamic Graph and Offset Attention. DGOA mainly consisted of three modules: LFE (local feature extraction), GFE (global feature extraction), and CR (coordinate reconstruction). LFE utilized a multi-layer structure to extract neighborhood information, constructed a dynamic graph based on feature similarity at each layer, and adaptively grouped point clouds in the feature space. This increased the receptive field, obtained long-distance semantic information, and more effectively modeled the local geometry of the point cloud. GFE employed offset attention based on the Laplace operator, enabling each point to obtain global information of the point cloud. This ensured that the details of the generated point cloud were consistent with the original point cloud and reduced the impact of noise. CR, inspired by the FoldingNet operation, prevented the generated points from clustering together. In addition, the entire network was permutation invariant with respect to the order of points in the input point cloud. Quantitative and qualitative experimental results on multiple datasets demonstrated that the proposed method outperformed other methods and exhibited good generalization and stability.
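
The idea of building a dynamic graph from feature similarity at each layer can be sketched as a k-nearest-neighbour search in feature space, recomputed whenever the features change. This is a minimal sketch under assumptions: the function name `knn_graph`, the squared Euclidean distance, and the toy features are illustrative and are not taken from the DGOA paper.

```python
def knn_graph(features, k):
    """For each point, return the indices of its k nearest neighbours in feature space.

    features: list of equal-length tuples, one feature vector per point.
    Ties are broken by the smaller index so the result is deterministic.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    graph = []
    for i, fi in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        others.sort(key=lambda j: (sqdist(fi, features[j]), j))
        graph.append(others[:k])
    return graph


# Layer 1: neighbours follow the input coordinates.
layer1 = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
graph1 = knn_graph(layer1, k=1)

# Layer 2: hypothetical learned features place points 0 and 2 close together,
# so the graph is rebuilt and the grouping changes.
layer2 = [(1.0,), (9.0,), (1.2,), (9.3,)]
graph2 = knn_graph(layer2, k=1)
```

Rebuilding the graph per layer lets points that are far apart in 3D but similar in feature space become neighbours, which is one way to obtain the enlarged receptive field the abstract describes.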

Dense point cloud reconstruction network based on adaptive aggregation recurrent recursion
WANG Jiang’an, HUANG Le, PANG Dawei, QIN Linzhen, LIANG Wenqian
2024, 45(1): 230-239.  DOI: 10.11996/JG.j.2095-302X.2024010230

To address problems such as difficult reconstruction of weakly textured regions, high resource consumption, and long reconstruction time, a multi-stage dense point cloud reconstruction network based on adaptive aggregation and recurrent recursive convolution was proposed, namely A2R2-MVSNet (adaptive aggregation recurrent recursive multi-view stereo network). This method first introduced a feature extraction module based on multi-scale recurrent recursive residuals to aggregate contextual semantic information, addressing the difficulty of feature extraction in weakly textured or textureless regions. In the cost volume regularization part, a residual regularization module was proposed. This module enhanced the ability of the 3D CNN to extract and aggregate contextual semantics at the cost of only a slight increase in memory consumption. The experimental results demonstrated that the proposed method ranked high in comprehensive metrics on the DTU dataset, showcasing superior performance in reconstructing details. Additionally, it could generate good depth maps and point cloud results on the BlendedMVS dataset. Furthermore, the network was tested for generalization on self-collected large-scale high-resolution datasets. Thanks to the coarse-to-fine multi-stage design and the proposed modules, the network could not only generate accurate and complete depth maps, but also perform high-resolution reconstructions suitable for practical applications.

Published as
Published as 1, 2024
2024, 45(1): 240. 