Digital Library
Vol. 19, No. 6, Dec. 2023
Yunxiang Peng, Guixian Tian
Vol. 19, No. 6, pp. 713-721, Dec. 2023
https://doi.org/10.3745/JIPS.04.0292
Keywords: BCC (VRS) Model, DEA Model, Economic performance, Mining, Stakeholders
Abstract: Conventional mining enterprises, particularly coal-related ones, exhibit substantial environmental pollution and high energy consumption, while those involved in new energy resources, such as lithium and cobalt, face severe resource shortages. Consequently, the economic efficiency of China’s mining enterprises is significantly constrained. This study examines data from nine representative listed enterprises in China spanning 2016 to 2021. Employing the DEA model, specifically the BCC (VRS) model, we analyze the economic efficiency of mining enterprises with a focus on stakeholders. The paper provides static and dynamic analyses, offering insights and recommendations for enhancing technology, reducing costs, and fortifying social relationships.
Nan Zhang
Vol. 19, No. 6, pp. 722-729, Dec. 2023
https://doi.org/10.3745/JIPS.02.0206
Keywords: Data visualization, Intangible Cultural Heritage, Visual Analytics
Abstract: Visual analytics for intangible cultural heritage has recently developed in China. Using advanced interactive visualization tools, experts can observe data distribution trends and explore the implicit relationships among data within a short time. This can enhance human cognitive and analytical abilities and improve the scientific preservation of intangible cultural heritage. To support this research topic, we have reviewed recent visualization works on intangible cultural heritage in China. We divide these works into three types: text visualization, multi-dimensional visualization, and geographical visualization. Each type is illustrated by several representative works. New development trends in this area are also discussed.
Malika Douache, Badra Nawal Benmoussat
Vol. 19, No. 6, pp. 730-744, Dec. 2023
https://doi.org/10.3745/JIPS.02.0207
Keywords: Convolutional Neural Network, Deep Learning, Egocentric Vision (or First-Person Vision), Human Activity Recognition, Image Classification, Inertial Measurement Unit (IMU)
Abstract: This paper addresses the recognition of human activities using egocentric vision, particularly video captured by body-worn cameras, which could be helpful for video surveillance, automatic search, and video indexing. It could also help assist elderly and frail persons and improve their lives. Human activity recognition remains problematic because of the large variations in how actions are executed; here it is realized through an external device, similar to a robot, acting as a personal assistant. The inferred information is used both online to assist the person and offline to support the personal assistant. The major purpose of this paper is to provide an efficient and simple recognition method, robust against the various factors of variability in action execution, that uses egocentric camera data only with a convolutional neural network and deep learning. In terms of accuracy improvement, simulation results outperform the current state of the art by a significant margin of 61% when using egocentric camera data only, more than 44% when using egocentric camera and several stationary cameras data, and more than 12% when using both inertial measurement unit (IMU) and egocentric camera data.
Xiaolei Wang, Zhe Kan
Vol. 19, No. 6, pp. 745-755, Dec. 2023
https://doi.org/10.3745/JIPS.04.0293
Keywords: Coal Mine, Deep Learning, Defect Detection, Wire Rope, YOLOv5
Abstract: The wire rope is an indispensable piece of production machinery in coal mines. It is the main force-bearing equipment of the underground traction system. Accurate detection of wire rope defects and their positions plays a crucial role in safe production. Existing defect detection solutions are deficient in the flexibility, accuracy, and real-time performance of wire rope defect detection. To solve these problems, this study uses a camera to sample the wire rope before it enters the well and proposes a surface small-defect detection model based on YOLOv5. The model realizes the accurate detection of small defects on the outside of the wire rope. The transfer learning method is also introduced to enhance the model accuracy of small sample training. The enhanced YOLOv5 algorithm effectively improves the accuracy of target detection, solves the defect detection problem for wire rope used in mines, and to some extent avoids accidents caused by wire rope damage. A large number of experiments reveal that, in the task of wire rope defect detection, the average correctness rate and the average accuracy rate of the model are significantly improved compared with those before the modification, and that the detection speed can be maintained at a real-time level.
Mingfang Jiang, Hengfu Yang
Vol. 19, No. 6, pp. 756-766, Dec. 2023
https://doi.org/10.3745/JIPS.03.0189
Keywords: Deep residual network, Multipurpose Watermarking, Perceptual hashing, Reversible Visible Watermarking
Abstract: To effectively track the illegal use of digital images and maintain the security of digital image communication on the Internet, this paper proposes a reversible multipurpose image watermarking algorithm based on a deep residual network (ResNet) and perceptual hashing (also called MWR). The algorithm first combines perceptual image hashing to generate a digital fingerprint that depends on the user’s identity information and image characteristics. Then it embeds the removable visible watermark and digital fingerprint in two different regions of the orthogonal separation of the image. The embedding strength of the digital fingerprint is computed using ResNet. Because of the embedding of the removable visible watermark, the conflict between the copyright notice and the user’s browsing is balanced. Moreover, image authentication and traitor tracking are realized through digital fingerprint insertion. The experiments show that the scheme has good visual transparency and watermark visibility. The use of chaotic mapping in the visible watermark insertion process enhances the security of the multipurpose watermark scheme, and unauthorized users without correct keys cannot effectively remove the visible watermark.
Jimin Ha, Jungho Kang, Jong Hyuk Park
Vol. 19, No. 6, pp. 767-777, Dec. 2023
https://doi.org/10.3745/JIPS.03.0190
Keywords: CCTV, Chaotic Masking, Privacy Protection, Security
Abstract: In modern society, user privacy is emerging as an important issue as closed-circuit television (CCTV) systems increase rapidly in various public and private spaces. If CCTV cameras monitor sensitive areas or personal spaces, they can infringe on personal privacy. Someone's behavior patterns, sensitive information, residence, etc. can be exposed, and if the image data collected from CCTV is not properly protected, there is a risk of data leakage through hackers or unauthorized access. This paper presents an innovative approach, a “machine learning based reversible chaotic masking method for user privacy protection in CCTV environment.” The proposed method was developed to protect an individual's identity within CCTV images while maintaining the usefulness of the data for surveillance and analysis purposes. The method uses a two-step process for user privacy. First, a machine learning model is trained to accurately detect and locate human subjects within the CCTV frame. This model is designed to identify individuals accurately and robustly by leveraging state-of-the-art object detection techniques. When an individual is detected, reversible chaotic masking is applied. This masking technique uses chaotic maps to create complex patterns that hide individual facial features and identifiable characteristics. Above all, the generated mask can be reversibly applied and removed, allowing authorized users to access the original unmasked image.
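The reversible masking step described in this abstract can be illustrated with a minimal sketch. It assumes a logistic map as the chaotic map and a bytewise XOR as the masking operation; the abstract names neither, so both are stand-ins, and the function names, the parameter `r = 3.99`, and the bounding-box format are hypothetical. The person detector is omitted and the box is supplied by hand.

```python
def logistic_keystream(n, x0, r=3.99):
    """Generate n bytes from the logistic map x -> r * x * (1 - x).

    Assumption: the abstract does not say which chaotic map is used;
    the logistic map is only a common textbook choice.
    """
    x = x0
    stream = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        stream.append(int(x * 256) % 256)
    return stream


def toggle_mask(image, box, key):
    """XOR the pixels inside box with a key-dependent chaotic keystream.

    image: list of rows of 8-bit grayscale values.
    box:   (top, left, bottom, right), bottom/right exclusive (hypothetical format).
    key:   initial value x0 in (0, 1), acting as the secret key.
    XOR is its own inverse, so calling this twice with the same key
    restores the original pixels.
    """
    top, left, bottom, right = box
    stream = logistic_keystream((bottom - top) * (right - left), key)
    out = [row[:] for row in image]  # leave the input image untouched
    k = 0
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] ^= stream[k]
            k += 1
    return out


image = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
box = (0, 0, 2, 2)
masked = toggle_mask(image, box, key=0.3141)
restored = toggle_mask(masked, box, key=0.3141)
```

Only a holder of the same initial value can regenerate the keystream, which is the sense in which an authorized user can remove the mask in this sketch.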
Yongli Liu, Congcong Zhao, Hao Chao
Vol. 19, No. 6, pp. 778-790, Dec. 2023
https://doi.org/10.3745/JIPS.04.0294
Keywords: Density peak clustering, Information Bottleneck, Multicenter Clustering
Abstract: Although density peak clustering can often easily yield excellent results, there is still room for improvement when dealing with complex, high-dimensional datasets. One of the main limitations of this algorithm is its reliance on geometric distance as the sole similarity measurement. To address this limitation, we draw inspiration from the information bottleneck theory, and propose a novel density peak clustering algorithm that incorporates this theory as a similarity measure. Specifically, our algorithm utilizes the joint probability distribution between data objects and feature information, and employs the loss of mutual information as the measurement standard. This approach not only eliminates the potential for subjective error in selecting a similarity method, but also enhances performance on datasets with multiple centers and high dimensionality. To evaluate the effectiveness of our algorithm, we conducted experiments using ten carefully selected datasets and compared the results with three other algorithms. The experimental results demonstrate that our information bottleneck-based density peaks clustering (IBDPC) algorithm consistently achieves high levels of accuracy, highlighting its potential as a valuable tool for data clustering tasks.
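For orientation, the baseline that this abstract builds on can be sketched as follows. This is standard density peak clustering with a cutoff kernel and Euclidean distance, not the IBDPC algorithm itself; the `dist` parameter is a hypothetical hook marking where a mutual-information-loss measure would replace the geometric distance, and the function name and tie-breaking rule are assumptions of this sketch.

```python
import math


def density_peaks(points, d_c, n_centers, dist=math.dist):
    """Return (rho, delta, centers) for standard density peak clustering.

    rho[i]   = number of other points closer than the cutoff d_c.
    delta[i] = distance to the nearest point of higher density
               (ties broken by input order); the densest point gets
               its largest distance to any other point.
    centers  = indices with the largest rho * delta.
    """
    n = len(points)
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < d_c) for i in range(n)]

    # Stable sort: among equal densities, the earlier point ranks higher.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(d[i])
        else:
            delta[i] = min(d[i][j] for j in order[:rank])

    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(range(n), key=lambda i: -gamma[i])[:n_centers]
    return rho, delta, centers


# Two well-separated groups of three points each.
points = [(0, 0), (1, 0), (2, 0), (50, 0), (51, 0), (52, 0)]
rho, delta, centers = density_peaks(points, d_c=1.5, n_centers=2)
```

In this toy input the middle point of each group has the highest density and a large delta, so those two points are selected as centers.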
Xu Han, Xianhao Wang, Chong Chen, Gong Li, Changhao Piao
Vol. 19, No. 6, pp. 791-802, Dec. 2023
https://doi.org/10.3745/JIPS.02.0208
Keywords: DeepLab V3+, Geometric Location, Hot Spot Location, Hot Spot Recognition, YOLO v3
Abstract: The manual inspection of photovoltaic (PV) panels to meet the requirements of inspection work for large-scale PV power plants is challenging. We present a hot spot detection and positioning method to detect hot spots in batches and locate their latitudes and longitudes. First, a network based on the YOLOv3 architecture was utilized to identify hot spots. The innovation is to modify the RU_1 unit in the YOLOv3 model for hot spot detection in the far field of view and add a neural network residual unit for fusion. In addition, because of the misidentification problem in the infrared images of the solar PV panels, the DeepLab v3+ model was adopted to segment the PV panels to filter out the misidentification caused by bright spots on the ground. Finally, the latitude and longitude of the hot spot are calculated according to the geometric positioning method utilizing known information such as the drone's yaw angle, shooting height, and lens field-of-view. The experimental results indicate that the hot spot recognition accuracy is above 98%. When keeping the drone 25 m off the ground, the hot spot positioning error is at the decimeter level.
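The geometric positioning step can be sketched under simplifying assumptions that the abstract does not state: the camera points straight down, pixels are square, the top of the image is aligned with the drone's heading, yaw is measured clockwise from north, and a flat-earth approximation of about 111,320 m per degree of latitude is acceptable over the short distances involved. All function and parameter names are hypothetical.

```python
import math

METERS_PER_DEG_LAT = 111_320.0  # rough average; an assumption of this sketch


def pixel_to_ground_offset(px, py, img_w, img_h, altitude_m, hfov_deg, yaw_deg):
    """Convert a pixel position to an (east, north) offset in meters
    from the point directly below the drone."""
    # Width of the ground footprint covered by the horizontal field of view.
    ground_w = 2.0 * altitude_m * math.tan(math.radians(hfov_deg) / 2.0)
    m_per_px = ground_w / img_w
    right = (px - img_w / 2.0) * m_per_px    # meters to the right of the heading
    forward = (img_h / 2.0 - py) * m_per_px  # meters along the heading
    yaw = math.radians(yaw_deg)
    east = right * math.cos(yaw) + forward * math.sin(yaw)
    north = -right * math.sin(yaw) + forward * math.cos(yaw)
    return east, north


def offset_to_lat_lon(lat0, lon0, east_m, north_m):
    """Shift a latitude/longitude by a small metric offset."""
    lat = lat0 + north_m / METERS_PER_DEG_LAT
    lon = lon0 + east_m / (METERS_PER_DEG_LAT * math.cos(math.radians(lat0)))
    return lat, lon


# A hot spot at the top-center pixel, drone at 25 m heading north.
east, north = pixel_to_ground_offset(320, 0, 640, 480, 25.0, 90.0, 0.0)
lat, lon = offset_to_lat_lon(30.0, 120.0, east, north)
```

With a 90 degree field of view at 25 m, the footprint is about 50 m wide, so one pixel of a 640-pixel-wide image covers roughly 8 cm of ground under these assumptions.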
Chaehyeon Kim, Hyewon Ryu, Ki Yong Lee
Vol. 19, No. 6, pp. 803-816, Dec. 2023
https://doi.org/10.3745/JIPS.04.0295
Keywords: Explainable Artificial Intelligence, Graph Convolutional Network, Gradient-based Explanation
Abstract: Explainable artificial intelligence is a method that explains how a complex model (e.g., a deep neural network) yields its output from a given input. Recently, graph-type data have been widely used in various fields, and diverse graph neural networks (GNNs) have been developed for graph-type data. However, methods to explain the behavior of GNNs have not been studied much, and only a limited understanding of GNNs is currently available. Therefore, in this paper, we propose an explanation method for node classification using graph convolutional networks (GCNs), which is a representative type of GNN. The proposed method finds out which features of each node have the greatest influence on the classification of that node using GCN. The proposed method identifies influential features by backtracking the layers of the GCN from the output layer to the input layer using the gradients. The experimental results on both synthetic and real datasets demonstrate that the proposed explanation method accurately identifies the features of each node that have the greatest influence on its classification.
Shu Tang, Yuanhong Deng, Peng Yang
Vol. 19, No. 6, pp. 817-829, Dec. 2023
https://doi.org/10.3745/JIPS.04.0298
Keywords: Bandwidth Change Response, Bandwidth Utilization, Intra-only Coding, Queuing Delay, Real-Time Video Transmission
Abstract: The variable wireless channel is a major challenge for real-time video applications, and the rate adaptation of real-time video streaming has become a hot topic. Intra-video coding is important for high-quality video communication and industrial video applications. In this paper, we propose a novel adaptive scheme for real-time video transmission with intra-only coding over a wireless network. The key idea of this scheme is to estimate the instantaneous remaining capacity of the network to adjust the quality of the next several video frames, which not only keeps queuing delay low and ensures video quality, but also responds to bandwidth changes quickly. We compare our scheme with three different schemes in the video transmission system. The experimental results show that our scheme has higher bandwidth utilization and faster bandwidth change response, while maintaining low queuing delay.
Byoungwook Kim, Hong-Jun Jang
Vol. 19, No. 6, pp. 830-841, Dec. 2023
https://doi.org/10.3745/JIPS.04.0296
Keywords: Spatio-temporal Document Classification, Tokenization, Word-Level Embedding
Abstract: Tokenization is the process of segmenting the input text into smaller units of text, and it is a preprocessing task that is mainly performed to improve the efficiency of the machine learning process. Various tokenization methods have been proposed for application in the field of natural language processing, but studies have primarily focused on efficiently segmenting text. Few studies have been conducted on the Korean language to explore which tokenization methods are suitable for document classification tasks. In this paper, an exploratory study was performed to find the most suitable tokenization method to improve the performance of a representative spatio-temporal document classifier in Korean. For the experiment, a convolutional neural network model was used, and for the final performance comparison, tasks were selected for document classification where performance largely depends on the tokenization method. As tokenization methods for the comparative experiments, the commonly used Jamo, Character, and Word units were adopted. The experiment confirmed that word-unit tokenization showed excellent performance in the representative spatio-temporal document classification task, where the semantic embedding ability of the token itself is important.
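The three tokenization units compared in this abstract can be illustrated with a short sketch. It assumes whitespace splitting for word units and Unicode canonical decomposition (NFD) for Jamo units; the study's actual tokenizers are not described in the abstract, so the function names and these choices are illustrative only.

```python
import unicodedata


def word_tokens(text):
    """Word units: split on whitespace."""
    return text.split()


def char_tokens(text):
    """Character units: one token per Hangul syllable block (spaces dropped)."""
    return [ch for ch in text if not ch.isspace()]


def jamo_tokens(text):
    """Jamo units: NFD splits each precomposed Hangul syllable into its
    leading consonant, vowel, and optional trailing consonant."""
    decomposed = unicodedata.normalize("NFD", text)
    return [ch for ch in decomposed if not ch.isspace()]


sentence = "한국 문서"  # "Korean document"
words = word_tokens(sentence)
chars = char_tokens(sentence)
jamos = jamo_tokens(sentence)
```

The same two-word sentence yields 2 word tokens, 4 character tokens, and 11 Jamo tokens, which shows how sharply sequence length and vocabulary granularity differ across the three units.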
Haiqin Tang, Ruirui Zhang
Vol. 19, No. 6, pp. 842-857, Dec. 2023
https://doi.org/10.3745/JIPS.04.0299
Keywords: Chinese Microblog Review, Deep Learning, Sentiment Classification, TextCNN-BiLSTM
Abstract: Currently, most sentiment classification models on microblogging platforms analyze sentence parts of speech and emoticons without comprehending users’ emotional inclinations and grasping moral nuances. This study proposes a hybrid sentiment analysis model. Given the distinct nature of microblog comments, the model employs a combined stop-word list and word2vec for word vectorization. To mitigate local information loss, the TextCNN model, devoid of pooling layers, is employed for local feature extraction, while BiLSTM is utilized for contextual feature extraction in deep learning. Subsequently, microblog comment sentiments are categorized using a classification layer. Given the binary classification task at the output layer and the numerous hidden layers within BiLSTM, the Tanh activation function is adopted in this model. Experimental findings demonstrate that the enhanced TextCNN-BiLSTM model attains a precision of 94.75%. This represents a 1.21%, 1.25%, and 1.25% enhancement in precision, recall, and F1 values, respectively, in comparison to the individual deep learning model TextCNN. Furthermore, it outperforms BiLSTM by 0.78%, 0.9%, and 0.9% in precision, recall, and F1 values.
Tiantian Yin, Yina Guo, Ningning Zhang
Vol. 19, No. 6, pp. 858-869, Dec. 2023
https://doi.org/10.3745/JIPS.04.0297
Keywords: Discriminator, Image restoration, SCBSS, TriGAN
Abstract: As one of the pivotal techniques of image restoration, single-channel blind source separation (SCBSS) is capable of converting a visual-only image into multi-source images. However, image degradation often results from multiple mixing methods. Therefore, this paper introduces an innovative SCBSS algorithm to effectively separate source images from a composite image in various mixed modes. The cornerstone of this approach is a novel triple generative adversarial network (TriGAN), designed based on dual learning principles. The TriGAN redefines the discriminator's function to optimize the separation process. Extensive experiments have demonstrated the algorithm's capability to distinctly separate source images from a composite image in diverse mixed modes and to facilitate effective image restoration. The effectiveness of the proposed method is quantitatively supported by an average peak signal-to-noise ratio exceeding 30 dB and an average structural similarity index surpassing 0.95 across multiple datasets.
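For readers unfamiliar with the first metric quoted above, the peak signal-to-noise ratio can be computed as in the following minimal sketch. It uses the standard definition for 8-bit images, PSNR = 10 log10(MAX^2 / MSE); the function name and the nested-list image format are assumptions for illustration, and nothing here reproduces the paper's evaluation pipeline.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized
    grayscale images given as lists of rows."""
    squared_error = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        for ref_px, res_px in zip(ref_row, res_row):
            squared_error += (ref_px - res_px) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


reference = [[0, 0], [0, 0]]
restored = [[0, 0], [0, 16]]  # one pixel off by 16 gray levels
score = psnr(reference, restored)
```

A mean squared error of 64 on an 8-bit scale already corresponds to roughly 30 dB, which gives a sense of how small the average reconstruction error must be to exceed that threshold.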