This three-volume set LNCS 14406-14408 constitutes the refereed proceedings of the 7th Asian Conference on Pattern Recognition, ACPR 2023, held in Kitakyushu, Japan, in November 2023. The 93 full papers presented were carefully reviewed and selected from 164 submissions. The conference focuses on four important areas of pattern recognition: pattern recognition and machine learning, computer vision and robot vision, signal processing, and media processing and interaction, covering various technical aspects.

Preface for ACPR 2023 Proceedings
Organization
Contents – Part II

Bridging Distinct Spaces in Graph-Based Machine Learning
1 Introduction 2 Related Work 3 Proposed Framework 3.1 Graph Edit Distances and Edit Costs 3.2 The GECL Framework 3.3 GECL Based on Supervised Metric Learning 4 Applications of the Framework 5 Experiments 5.1 Datasets 5.2 On Prediction Tasks 5.3 Graph Generations: A Pre-image Example 6 Conclusion and Future Work References

A New Contrastive Learning Based Model for Estimating Degree of Multiple Personality Traits Using Social Media Posts
1 Introduction 2 Related Work 3 Proposed Model 3.1 Contrastive Learning for Textual Feature Extraction 3.2 Contrastive Learning for Visual Feature Extraction 3.3 Weighted Fusion and Estimating Degree of Personality Traits 4 Experimental Results 4.1 Ablation Study 4.2 Experiments on Segmentation 4.3 Estimating Multiple Personality Traits 5 Conclusion and Future Work References

A New Transformer-Based Approach for Text Detection in Shaky and Non-shaky Day-Night Video
1 Introduction 2 Related Work 2.1 Methods for Text Detection in Natural Scene Images 2.2 Methods for Text Detection in Videos 2.3 Methods for Text Detection in Low Light Images 3 Proposed Model 3.1 Activation Frame Selection 3.2 Text Detection 3.3 Loss Function 4 Experimental Results 4.1 Dataset Creation and Evaluation 4.2 Ablation Study 4.3 Experiments on Active Frame Selection 4.4 Experiments on Detection 5 Conclusion and Future Work References

MobileViT Based Lightweight Model for Prohibited Item Detection in X-Ray Images
1 Introduction 2 Related Work 2.1 Vision Transformer 2.2 Prohibited Item Detection in X-Ray Images 3 MVray Method 3.1 MobileViT Backbone 3.2 Dense Connection and Learn-Group-Conv 4 Experiment Results 4.1 Experimental Datasets and Implementation Details 4.2 Comparison Results 4.3 Ablation Experiment 4.4 Visualization 5 Conclusion References

Typical Contract Graph Feature Enhanced Smart Contract Vulnerability Detection
1 Introduction 2 Related Work 2.1 Smart Contract Vulnerability 2.2 Graph Neural Network 2.3 Self-attention Mechanism 3 Proposed Method 3.1 Semantic Syntax Feature Extraction 3.2 Contract Clustering 3.3 Typical Contract Graph Enhancement 4 Experiment 4.1 Experimental Settings 4.2 Method Comparison and Analysis 4.3 Ablation Studies 5 Conclusion References

EfficientSRFace: An Efficient Network with Super-Resolution Enhancement for Accurate Face Detection
1 Introduction 2 Related Work 2.1 Face Detection 2.2 CNNs for Image Super-Resolution 3 EfficientSRFace 3.1 Network Architecture 3.2 Image Super-Resolution Enhancement 3.3 Loss Function 4 Experiments 4.1 Datasets and Evaluation Metrics 4.2 Implementation Details 4.3 Data Enhancement 4.4 Results 5 Conclusions References

CompTLL-UNet: Compressed Domain Text-Line Localization in Challenging Handwritten Documents Using Deep Feature Learning from JPEG Coefficients
1 Introduction 2 Preamble to JPEG Compressed Domain 2.1 Uncompressed Document Images 2.2 JPEG Compressed Document Images 3 Proposed Methodology 4 Experiments and Analysis 5 Conclusion References

Heatmap Overlay Using Neutral Body Model for Visualizing the Measured Gaze Distributions of Observers
1 Introduction 2 Our 3D Heatmap-Based Visualization Method 2.1 Overview 2.2 Pixel Attention Probability 2.3 Neutral Human Body Model 2.4 Vertex Attention Probability 2.5 3D Heatmap for Generating the Visualization Image 3 Experiments 3.1 Visualization Images Generation Conditions 3.2 Gaze Measurement 3.3 Visualization Results 3.4 Subjective Assessment of the Visualization Images 4 Conclusions References

Image Inpainting for Large and Irregular Mask Based on Partial Convolution and Cross Semantic Attention
1 Introduction 2 Related Work 2.1 Image Inpainting 2.2 Partial Convolution 2.3 Generative Adversarial Networks 2.4 Cross Semantic Attention 3 Approach 3.1 Network Architecture 3.2 Loss Function 4 Experiments 4.1 Experimental Setting 4.2 Qualitative Comparison 4.3 Quantitative Comparison 4.4 Analysis 5 Conclusion References

Character Structure Analysis by Adding and Pruning Neural Networks in Handwritten Kanji Recognition
1 Introduction 2 Related Works 2.1 Investigating Explainability in Neural Networks 2.2 Applying Edge Pruning to Neural Networks 3 Proposed Method 3.1 Outline 3.2 Obtaining Feature Extractor 3.3 Adding and Pruning Pattern Detectors 4 Experimental Results 5 Conclusions References

A Few-Shot Approach to Sign Language Recognition: Can Learning One Language Enable Understanding of All?
1 Introduction 2 Dataset Selection and Description 3 Methodology 3.1 Problem Statement 3.2 Overall Framework 3.3 Experimental Setup 3.4 Evaluation Metric 4 Results 5 Discussion 6 Conclusion References

CILF: Causality Inspired Learning Framework for Out-of-Distribution Vehicle Trajectory Prediction
1 Introduction 2 Related Work 3 Theoretical Analysis 3.1 Problem Formulation 3.2 OOD-CG 4 CILF 4.1 Extract Domain-Invariant Causal Feature 4.2 Extract Domain-Variant Feature 4.3 Separate Domain-Variant Causal and Non-causal Feature 5 Experiments 5.1 Experiment Design 5.2 Quantitative Experiment and Analysis 5.3 Qualitative Experiment and Analysis 6 Conclusion References

A Framework for Centrifugal Pump Diagnosis Using Health Sensitivity Ratio Based Feature Selection and KNN
1 Introduction 2 Experimental Setup 3 Proposed Framework 4 Results and Performance Evaluation 5 Conclusion References

WPT-Base Selection for Bearing Fault Feature Extraction: A Node-Specific Approach Study
1 Introduction 2 Technical Background 2.1 Wavelet Packet Decomposition 3 Proposed Methodology 3.1 Envelope Analysis 3.2 Wavelet Base Evaluation Criterion 3.3 Signal Representation Using WPT with Node-Specific Bases 3.4 Statistical Feature Extraction 4 Experimental Test Bed and Data Collection 5 Performance Evaluation and Discussion 6 Conclusions References

A Study on Improving ALBERT with Additive Attention for Text Classification
1 Introduction 2 Related Work 2.1 Efficient Transformer-Based Models 2.2 ALBERT Model 2.3 Self-attention and Additive-Attention Mechanism 3 Method 3.1 Overview and Procedure 3.2 Parameter and Computational Complexity Analysis 4 Experiments 4.1 Experimental Setup 4.2 Datasets and Evaluation Methodology 4.3 Comparison of Parameter Numbers and Inference Speed 4.4 Experimental Results 5 Conclusions and Future Work References

Dual Branch Fusion Network for Pathological Image Classification with Extreme Different Image Size
1 Introduction 2 Method 3 Experiments and Results 3.1 Datasets 3.2 Experimental Setting 3.3 Experimental Results and Analysis 4 Discussion 5 Conclusions References

Synergizing Chest X-ray Image Normalization and Discriminative Feature Selection for Efficient and Automatic COVID-19 Recognition
1 Introduction 1.1 Related Work 1.2 Data Set of Radiographic Images 2 Overview of the Lung Finder Algorithm (LFA) 2.1 Coordinates Labeling for the LFA Training Stage 2.2 Data Augmentation 2.3 Estimating the Corner Coordinates of the Lung Region by Regression 2.4 Image Warping 3 Feature Reduction and Selection 3.1 Eigenfaces for Dimensionality Reduction 3.2 Using the Fisher Discriminant to Reduce the Number of Useful Features 3.3 The Fisher Ratio as a Weight for Each Feature 4 Experiments Setup 5 Experimental Results 6 Discussion of Results 7 Conclusions and Future Work References

A Unified Convolutional Neural Network for Gait Recognition
1 Introduction 2 Related Work 3 Proposed Method 3.1 Generation of Gait Entropy Images 3.2 Convolutional Neural Network Architecture 3.3 Training the CNN 3.4 Gait Recognition 4 Experimental Results 4.1 Datasets and Test Protocol 4.2 Results 5 Conclusion References

A New Lightweight Attention-Based Model for Emotion Recognition on Distorted Social Media Face Images
1 Introduction 2 Related Work 3 Proposed Model 3.1 Overview of Proposed Architecture 3.2 GSSAN Attention Module 4 Experimental Results 4.1 Ablation Study 4.2 Experiments for Emotion Recognition 5 Conclusion and Future Work References

Uncertainty-Guided Test-Time Training for Face Forgery Detection
1 Introduction 2 Related Work 2.1 Face Forgery Detection 2.2 Test-Time Training Strategy 3 Proposed Method 3.1 Spectrum Transformation 3.2 Multi-level Interaction 3.3 Multi-modal Fusion 3.4 Uncertainty-Guided Test-Time Training 3.5 Loss Function 4 Experiments 4.1 Experimental Setup 4.2 Experimental Results 4.3 Ablation Study 4.4 Visualization 5 Conclusion References

CTC-Net: A Novel Coupled Feature-Enhanced Transformer and Inverted Convolution Network for Medical Image Segmentation
1 Introduction 2 Method 2.1 Overall Architecture 2.2 Feature-Enhanced Transformer Module 2.3 Inverted Residual Coordinate Attention Block 3 Experiments and Results 3.1 Datasets 3.2 Implementation Details 3.3 Comparison with State-of-the-Art Methods 3.4 Ablation Studies 4 Conclusion References

EEG Emotion Recognition Based on Temporal-Spatial Features and Adaptive Attention of EEG Electrodes
1 Introduction 2 Proposed Methods 2.1 Feature Construction 2.2 The Construction of Proposed Model CALS 3 Experiments and Results 3.1 Data Materials 3.2 Experimental Settings 3.3 Results and Analysis 4 Conclusion References

Replaying Styles for Continual Semantic Segmentation Across Domains
1 Introduction 2 Related Works 2.1 Semantic Segmentation 2.2 Incremental Learning 2.3 Domain Adaptation 3 Method 3.1 Problem Formulation 3.2 Style Extraction from Low-Level Representations 3.3 Replaying Styles for Knowledge Preservation 4 Experiments 4.1 Datasets 4.2 Evaluation Metrics 4.3 Implementation Details 4.4 Comparison with State-of-the-Art 4.5 Ablation Study 5 Conclusion References

ABFNet: Attention Bottlenecks Fusion Network for Multimodal Brain Tumor Segmentation
1 Introduction 2 Method 2.1 CNN Encoder 2.2 Bottleneck Fusion Transformer 2.3 Fusion Connection Gating 2.4 CNN Decoder 3 Experiments 3.1 Dataset and Evaluation Metrics 3.2 Baseline Methods and Settings 3.3 Experiment Results 3.4 Result Analysis 4 Conclusion References

Efficient Tensor Low-Rank Representation with a Closed Form Solution
1 Introduction 2 Notations and Preliminaries 3 Related Work 3.1 Low-Rank Representation (LRR) 3.2 Tensor Low-Rank Representation (TLRR) 3.3 Enhanced Tensor Low-Rank Representation (ETLRR) 3.4 Tensor Low-Rank Sparse Representation (TLRSR) 4 Efficient Tensor Low-Rank Representation with a Closed Form Solution (ETLRR/CFS) 4.1 Computational Complexity Analysis and Comparison 5 Experiments 5.1 Data Sets Description 5.2 Compared Methods 5.3 Evaluation Method 5.4 Parameter Analysis 5.5 Experimental Analysis 6 Conclusion References

Fine-Grained Face Sketch-Photo Synthesis with Text-Guided Diffusion Models
1 Introduction 2 Related Work 3 Methods 3.1 Preliminaries 3.2 Face Sketch-Text Guided Diffusion 4 Experiments 4.1 Experimental Settings 4.2 Comparison with Previous Methods 4.3 Text Control 5 Conclusions References

MMFA-Net: A New Brain Tumor Segmentation Method Based on Multi-modal Multi-scale Feature Aggregation
1 Introduction 2 Method 2.1 Dual-Branch Network 2.2 Spatial Position Activation (SPA) Module 2.3 Multi-scale Feature Aggregation (MFA) Module 3 Experiment 3.1 Implementation Details 3.2 Datasets 3.3 Evaluation Metrics 3.4 Comparison 3.5 Ablation Study 4 Conclusion References

Diffusion Init: Stronger Initialisation of Decision-Based Black-Box Attacks for Visual Object Tracking
1 Introduction 2 Related Work 2.1 Transformer-Based Visual Object Tracker 2.2 Decision-Based Attack 2.3 Diffusion Model 3 Diffusion Init 3.1 Motivation 3.2 Diffusion Init 4 Experiments 4.1 Experimental Settings 4.2 Overall Attack Results 4.3 Ablation Studies 5 Conclusion References

MMID: Combining Maximized the Mutual Information and Diffusion Model for Image Super-Resolution
1 Introduction 2 Related Work 2.1 Super-Resolution 2.2 Regression-Based SISR Method 2.3 Generative-Based SISR Methods 3 Methodology 3.1 Preliminaries 3.2 Maximizing Mutual Information for Diffusion Model 3.3 Convolution Structure with the Transform Feature into U-Net 4 Experiments 4.1 Qualitative Results 4.2 Quantitative Comparison 5 Conclusion References

Fibrosis Grading Methods for Renal Whole Slide Images Based on Uncertainty Estimation
1 Introduction 2 Related Works 2.1 Algorithms Applied to Renal Whole Slide Images 2.2 Weakly Supervised Learning-Based Approaches to WSIs Classification 3 Methods 3.1 Patches Selection Stage 3.2 Decision Aggregation Stage 3.3 Loss Function 4 Experiments and Results 4.1 Datasets 4.2 Results 5 Conclusion References

Zooplankton Classification Using Hierarchical Attention Branch Network
1 Introduction 2 Related Work 2.1 Plankton Image Datasets 2.2 Plankton Image Classification Using CNNs 3 Hierarchical Plankton Image Classification 3.1 Taxonomic Ranks of Plankton 3.2 Attention Branch Network 3.3 Hierarchical Attention Branch Network 4 Plankton Image Dataset 5 Experiments and Discussion 6 Conclusion References

Author Index