Digital Library
Vol. 20, No. 6, Dec. 2024
Daoqing Gong, Cheng Yuan, Xinyan Gan, Xiang Gao, Guizhi Sun
Vol. 20, No. 6, pp. 718-730, Dec. 2024
https://doi.org/10.3745/JIPS.04.0324
Keywords: Machine Learning, Data Mining, Optimization Model, Meteorological Data, Rainfall Forecast
Abstract: In recent years, the rapid development of artificial intelligence technology has brought new opportunities to the meteorological field. Specifically, machine learning (ML) algorithms have proven to be valuable tools in rainfall retrievals, demonstrating the practicability of ML algorithms for high-dimensional and complex data. By collecting data and using ML algorithms to mine and analyze it, ML models can address the problem of rainfall prediction in meteorology. Spurred by this advantage, this paper compares five ML algorithms for rainfall prediction using the National Population Health Science data from China, and optimizes each of the five algorithms appropriately. The data was first preprocessed to find and fill in missing values, remove duplicate values, mine the correlation between data features, and generate visual results. Then, logistic regression, k-nearest neighbors, naive Bayes, decision tree, and random forest algorithms were used to mine and analyze the meteorological data for weather prediction. Finally, the performance of the models before and after optimization is compared to provide decision support for rainfall prediction.
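The preprocessing steps named in this abstract (filling missing values, removing duplicates, and mining feature correlation) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not say how missing values are filled, so column-mean imputation is an assumption, and the function names are hypothetical.

```python
from math import sqrt
from typing import List, Optional


def fill_missing_with_mean(rows: List[List[Optional[float]]]) -> List[List[float]]:
    """Replace None entries with the mean of their column (assumed strategy)."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        present = [r[c] for r in rows if r[c] is not None]
        # Fall back to 0.0 for a column with no observed values at all.
        means.append(sum(present) / len(present) if present else 0.0)
    return [[means[c] if v is None else v for c, v in enumerate(r)] for r in rows]


def drop_duplicates(rows: List[List[float]]) -> List[List[float]]:
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for r in rows:
        key = tuple(r)
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation of two equal-length, non-constant columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

In this sketch the missing values are filled before duplicates are dropped, which follows the order in which the abstract lists the steps.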
Yiwen Dou, Hong-Chao Miao, Li-ping Zhang, Jia-Le Gong
Vol. 20, No. 6, pp. 731-745, Dec. 2024
https://doi.org/10.3745/JIPS.02.0223
Keywords: Low-illumination Image, Multi-Cues Fusion, Retinex
Abstract: To improve the visual perception of low-illumination images, image enhancement is usually required after capture. Although some popular image enhancement methods offer advantages such as high pixel values, strong denoising performance, and fast response time, they do not achieve good overall visual effects. In this paper, a multi-cues fusion method is presented that achieves better image enhancement results with limited time consumption. After homomorphic filtering, guided filtering, and Retinex enhancement, the original low-illuminance color image yields three basic source images for multi-cues fusion. Then, after discrete wavelet decomposition and principal component analysis, components with similar frequency and orientation decomposition properties are fused to reconstruct the enhanced image. Experimental results show that the proposed method improves visual perception and maintains visual consistency under different illumination conditions. Qualitative and quantitative tests indicate that the proposed method has advantages over state-of-the-art algorithms.
Jae Won Lee
Vol. 20, No. 6, pp. 746-757, Dec. 2024
https://doi.org/10.3745/JIPS.04.0325
Keywords: Bulk Metrics, Phoneme Boundary Detection, Speech Recognition, Volatility Metric
Abstract: This paper proposes a novel Korean phoneme boundary detection method that can be applied to phoneme-based Korean speech recognition systems. The proposed method employs two time-domain metrics, volatility and bulk metrics, as the foundation for phoneme boundary detection. The input speech signal is divided into blocks of 300 integer samples. For each block, a volatility metric is computed by adding up all the changes between neighboring samples within the block. A bulk is a group of consecutive samples with the same sign. For each bulk, two bulk metrics are calculated: bulk size and bulk length. Three dedicated algorithms that use both types of metrics detect phoneme boundaries by recognizing vowels, voiced consonants, and voiceless consonants in turn. The experimental results show that the proposed method can significantly reduce the error rate compared to an existing boundary detection method.
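The two time-domain metrics described in this abstract can be illustrated with a short sketch. Several details are assumptions, since the abstract does not define them: changes between neighbors are taken as absolute differences, bulk size is taken as the sum of absolute sample values in the bulk, bulk length as the number of samples, and zero-valued samples are treated as non-negative. The function names are hypothetical.

```python
from typing import List, Tuple

BLOCK_SIZE = 300  # block length stated in the abstract


def volatility(block: List[int]) -> int:
    """Sum of absolute changes between neighboring samples in one block."""
    return sum(abs(b - a) for a, b in zip(block, block[1:]))


def block_volatilities(signal: List[int], block_size: int = BLOCK_SIZE) -> List[int]:
    """Volatility of each consecutive block; a shorter final block is kept."""
    return [
        volatility(signal[i:i + block_size])
        for i in range(0, len(signal), block_size)
    ]


def bulks(signal: List[int]) -> List[Tuple[int, int]]:
    """Group consecutive same-sign samples and return (size, length) per bulk.

    Assumption: size is the sum of absolute values in the bulk, and
    zero samples count as non-negative.
    """
    result = []
    size = 0
    length = 0
    prev_sign = None
    for s in signal:
        sign = s >= 0
        if prev_sign is not None and sign != prev_sign:
            # Sign flipped: close the current bulk and start a new one.
            result.append((size, length))
            size = 0
            length = 0
        size += abs(s)
        length += 1
        prev_sign = sign
    if length:
        result.append((size, length))
    return result
```

How the three detection algorithms combine these metrics to separate vowels, voiced consonants, and voiceless consonants is not described in the abstract, so it is not sketched here.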
Li Lin, Yidan Wang, Yan Wang, Changhua Tang
Vol. 20, No. 6, pp. 758-766, Dec. 2024
https://doi.org/10.3745/JIPS.02.0221
Keywords: Attention Mechanism, BERT Model, Fused Features, Neural Network
Abstract: In Chinese long-text classification, the large volume of text data and the complexity of its features mean that methods suited to ordinary text classification often lack sufficient accuracy, which leads to frequent classification failures in long-text settings. To solve this problem, this study designed a bi-directional long short-term memory (Bi-LSTM) model that combines forward and backward operations and used attention mechanisms to improve fusion. At the same time, the bi-directional encoder representations from transformers (BERT) model was introduced into the text processing to form a long-text classification model. Finally, the model was tested on different datasets to verify its actual classification performance. The results showed that the classification accuracy of the designed model on the different datasets was 92.93% and 93.77%, respectively, the highest among models of the same type. The calculation time was 85.42 seconds and 117.51 seconds, respectively, the shortest among models of the same type. The designed long-text classification model combines the BERT model, a convolutional neural network model, the Bi-LSTM model, and an attention mechanism based on the data characteristics of long-text classification, enabling it to achieve higher classification accuracy in a shorter computation time. Moreover, it achieves better results in actual long-text classification, overcomes the classification failures caused by complex text features in long-text settings, and offers a possible path toward classification methods specific to long text.
Wenhui Si
Vol. 20, No. 6, pp. 767-778, Dec. 2024
https://doi.org/10.3745/JIPS.04.0326
Keywords: City Promotion Films, Impacts on Audience’s Psychology, Signification Mechanism, Signs
Abstract: With the rapid development of information technology, mass media has changed people’s ways of learning and communicating. People, especially the younger generation, are more inclined than ever before to use visual resources to learn about the world, exchange ideas, and entertain themselves. A city promotion film plays a vital role in constructing a city’s image and improving its reputation. Hangzhou has been making preparations for the 19th Asian Games, which is a good chance for the city to become more popular in the world and to gain a favorable position in global competition. It is of great significance to study how the city promotion film helps Hangzhou successfully create its city image and in what ways it affects the audience’s psychology. This paper selects “Hangzhou is not only a poem,” a famous city promotion film of Hangzhou, as its study object and adopts the French semiotician Roland Barthes’ signification mode theory to study the signs used in this film and how they work together to signify and create the international image of Hangzhou. In addition, it explores the film’s impacts on the audience’s psychology from the semiotic perspective. It is found that this promotional film mainly employs visual signs such as characters, landscapes, and subtitles, and audio signs such as background music and voice-over. Various signs are used jointly at three signification levels. At the first signification level, picture signs and musical signs work together to signify the images of Hangzhou as a city of ecological tourism, vitality & innovation, and happiness & civilization, respectively. At the second signification level, the three images of Hangzhou are juxtaposed to signify that Hangzhou is a city with a favorable living and working environment. At the third signification level, the multi-dimensional image of Hangzhou as the signifier refers to the ultimate purpose of this film, which is to improve the fame and influence of Hangzhou.
This city promotion film plays a significant role in making Hangzhou better known and satisfies the audience’s psychological needs, such as seeking information, curiosity, and entertainment.
Jiseob Park, Hyeob Kim, Hyuk-Jun Kwon
Vol. 20, No. 6, pp. 779-792, Dec. 2024
https://doi.org/10.3745/JIPS.04.0327
Keywords: Lexical Characteristics, Online Education, Press Releases, Remote Work Self-Efficacy, Self-Efficacy
Abstract: In this study, a comparative analysis was conducted between two crowdsourcing-based remote work platforms. In the first phase, 60 marketers (30 working exclusively on each platform) were surveyed using the “Remote Work Self-Efficacy” questionnaire. The results revealed that in three key areas (goal setting, prioritizing tasks, and sending documents via email), marketers from one platform scored higher on average than those from the other, with statistically significant differences. In the second phase, the lexical density, lexical sophistication, and type-token ratio of press releases written by marketers from both platforms were analyzed. This comparison showed that the press releases from one group scored higher on these lexical indicators, with statistically significant differences.
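Two of the lexical indicators named in this abstract have simple standard definitions: the type-token ratio is the number of distinct words divided by the total number of words, and lexical density is the share of content words among all words. The sketch below is illustrative only. The tokenizer and the small function-word list are assumptions standing in for whatever tooling the study used (real analyses usually rely on part-of-speech tagging), and lexical sophistication is omitted because it needs a reference word-frequency list.

```python
import re
from typing import List

# Hypothetical, deliberately tiny function-word list for illustration.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "to", "in", "on", "at",
    "for", "with", "is", "are", "was", "were", "be", "by", "it", "this", "that",
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens: List[str]) -> float:
    """Distinct tokens divided by total tokens (0.0 for empty input)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def lexical_density(tokens: List[str]) -> float:
    """Share of tokens that are not in the function-word list."""
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)
```

Because the type-token ratio falls as texts get longer, comparisons like the one in the study are usually made on texts of similar length.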
Zheping Quan, Jianing Li, Weijia Song
Vol. 20, No. 6, pp. 793-800, Dec. 2024
https://doi.org/10.3745/JIPS.04.0328
Keywords: DBiLSTM, Early Warning Model, Online Courses, Sports
Abstract: The development of data mining technology has gradually made data-driven decision-making the core of educational data mining. To identify students who are at risk of failing online physical education courses at an early stage, this article uses bidirectional long short-term memory (BiLSTM) neural networks to construct a deep BiLSTM (DBiLSTM) prediction model. Experimental verification showed that, in the full-attribute data experiment, the DBiLSTM specificity at Stage 1 was the highest, at 30.8%, and the accuracy at Stage 3 reached 73.6%. In the best-attribute data experiment, the accuracy of all models at Stage 2 increased relative to the full-attribute experiment, except for the SVM model, which had an accuracy of 61.8%. At Stage 3, the early warning accuracy of DBiLSTM was higher than that of the other algorithms, at 75.7%. In the experiment with balanced data, the accuracy of the DBiLSTM-SMOTE model, which combines DBiLSTM with the Synthetic Minority Oversampling Technique (SMOTE), was 72.6%. The AUC value of DBiLSTM-SMOTE reached 72.6% in the middle of the semester, significantly better than the other models. Overall, DBiLSTM is effective for early warning of students’ performance in online sports courses, and DBiLSTM-SMOTE is highly practical for early warning of performance in online sports teaching.
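The balancing step mentioned in this abstract, the Synthetic Minority Oversampling Technique, creates new minority-class samples by interpolating between a minority sample and one of its nearest minority neighbors. The sketch below shows that core idea in plain Python. It is not the authors' implementation: the function name, the default `k`, and the fixed seed are illustrative assumptions, and it expects at least two minority samples.

```python
import random
from typing import List, Sequence


def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(
    minority: List[Sequence[float]],
    n_new: int,
    k: int = 2,
    seed: int = 0,
) -> List[List[float]]:
    """Generate n_new synthetic points from minority-class samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the chosen base sample.
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: _sq_dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        lam = rng.random()  # interpolation factor in [0, 1)
        # New point lies on the segment between base and neighbor.
        synthetic.append([b + lam * (n - b) for b, n in zip(base, neighbor)])
    return synthetic
```

Oversampling of this kind is normally applied to the training split only, so that evaluation metrics such as accuracy and AUC are measured on unmodified data.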
Yuxiang Shan, Gang Yu, Yanghua Gao
Vol. 20, No. 6, pp. 801-811, Dec. 2024
https://doi.org/10.3745/JIPS.02.0222
Keywords: Block Collaborative Gait Representation, Dilated Convolution, Gait Recognition, Residual Mechanism
Abstract: Human identification based on gait analysis is a promising biometric technology that can recognize different individuals by their walking patterns. This study primarily addresses the challenges of gait representation and partial occlusion. First, considering the multi-scale and multi-perspective aspects of gait in practical application scenarios, a novel block collaborative gait representation method based on local structures is proposed, aiming to enhance the accuracy of identity recognition by integrating information from multiple scales and perspectives. Then, we propose a new gait recognition network that integrates dilated convolutions and the residual mechanism (DCRM). The DCRM network adds dilated convolutional blocks to the residual branch to expand the receptive field without losing resolution, thereby reducing the negative impact of local occlusion on recognition accuracy. Experimental results on two public datasets demonstrate that the proposed approach shows clear advantages over existing gait analysis methods.
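The claim that dilated convolutions expand the receptive field without losing resolution can be made concrete with a one-dimensional sketch. A kernel of size k with dilation d covers k + (k - 1)(d - 1) input positions, and zero padding keeps the output the same length as the input. The code below is a toy illustration of that idea plus a skip connection. It is not the DCRM network, and all function names are hypothetical.

```python
from typing import List, Sequence


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d_same(
    signal: Sequence[float], kernel: Sequence[float], dilation: int = 1
) -> List[float]:
    """Dilated 1-D convolution with zero padding; output length equals input length."""
    span = (len(kernel) - 1) * dilation
    left = span // 2
    padded = [0.0] * left + list(signal) + [0.0] * (span - left)
    return [
        sum(w * padded[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def residual_dilated_block(
    signal: Sequence[float], kernel: Sequence[float], dilation: int
) -> List[float]:
    """Add the dilated-convolution output back onto the input (skip connection)."""
    conv = dilated_conv1d_same(signal, kernel, dilation)
    return [x + y for x, y in zip(signal, conv)]
```

With a size-3 kernel, raising the dilation from 1 to 2 widens the span from 3 to 5 samples with no extra weights, which is why a locally occluded region can be bridged by context on either side.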
Jihong Kim, Nammee Moon
Vol. 20, No. 6, pp. 812-826, Dec. 2024
https://doi.org/10.3745/JIPS.01.0110
Keywords: Big Data Processing and Analysis, Credit Risk Prediction, Deep Learning, Explainable AI
Abstract: The modern financial industry demands rapid decision-making based on diverse information from dynamic environments. Predicting outcomes from such data is complex due to rapid shifts influenced by numerous factors. Despite advancements in artificial intelligence technology that offer sophisticated analytical models, accurately predicting outcomes and providing sufficient justification for these predictions remain challenging, particularly with fragmented model constructions. In this paper, we propose a novel approach for efficient processing of available public personal credit data, deriving new analysis elements, and comparing prediction interpretations. Specifically, we develop 11 prediction models that can be categorized into two types: data image transformation and time-series transformation. The models undergo standardization, preprocessing, and cross-validation for optimization, with their predictive performances compared and validated. Models leveraging convolutional neural network (CNN) and convolutional neural network-long short-term memory (CNN-LSTM) architectures demonstrate strong performance across both categories. To fully interpret the classification process, SHapley Additive exPlanations (SHAP) is applied to compare and explain the prediction results for each model type.
Chaehyeon Kim, Sara Yu, Ki Yong Lee
Vol. 20, No. 6, pp. 827-840, Dec. 2024
https://doi.org/10.3745/JIPS.04.0329
Keywords: Electric Scooter Sharing Service, Object Detection, Photo Recognition, YOLO
Abstract: The use of electric scooter (e-scooter) sharing services has increased significantly in recent years due to their convenience and economy. To rent an e-scooter, a user first finds nearby e-scooters using a smartphone application, which shows the global positioning system (GPS) locations of e-scooters around the user. However, since the GPS error can be more than 10 m, the user may have difficulty finding the exact location of the desired e-scooter. To alleviate this problem, the e-scooter sharing service “Kickgoing,” operated by Olulo in South Korea, shows users e-scooter photos taken by previous users upon return, along with their GPS locations, in its smartphone application. Those photos help subsequent users find e-scooters more accurately. However, since some users upload photos that do not include e-scooters or are unrecognizable, it is essential to provide users with only those photos that clearly include an e-scooter. Therefore, in this paper, we develop an e-scooter photo recognition system that can accurately recognize only those photos that include e-scooters. The developed system, which is based on YOLO, uses three techniques: if a whole e-scooter is not recognized, it recognizes an e-scooter by recognizing its parts individually; it treats e-scooters photographed from significantly different angles as different classes; and it provides users with only those photos in which the proportion of the e-scooter is within a certain range. Experimental results on a real dataset show that the developed system recognizes e-scooter photos more accurately than a system that uses the YOLO model as is.
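The third technique in this abstract, keeping only photos in which the e-scooter occupies a proportion of the image within a certain range, can be sketched as a filter on a detector's bounding box. The abstract does not give the range or define the proportion, so the area ratio and the `low` and `high` thresholds below are hypothetical placeholders, and the `Box` type stands in for whatever a YOLO-based detector returns.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area_proportion(box: Box, image_width: float, image_height: float) -> float:
    """Fraction of the image area covered by the bounding box."""
    width = max(0.0, box.x_max - box.x_min)
    height = max(0.0, box.y_max - box.y_min)
    return (width * height) / (image_width * image_height)


def keep_photo(
    box: Box,
    image_width: float,
    image_height: float,
    low: float = 0.1,   # hypothetical lower bound
    high: float = 0.8,  # hypothetical upper bound
) -> bool:
    """Keep the photo only if the detected e-scooter is neither too small nor too large."""
    return low <= area_proportion(box, image_width, image_height) <= high
```

A lower bound rejects photos where the e-scooter is a distant speck, and an upper bound rejects extreme close-ups that show little of the surroundings a later user would need to locate it.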
Aili Gao, Lan Chen, Xiaohan Wei, Chao Liu, Lihua Cheng
Vol. 20, No. 6, pp. 841-852, Dec. 2024
https://doi.org/10.3745/JIPS.04.0330
Keywords: Feedforward Neural Networks, Migration Characteristics, Petroleum Hydrocarbon, Spatial Distribution
Abstract: Soil pollution resulting from petroleum hydrocarbons (PHCs) arising from industrialization and human activities has emerged as a progressively severe global concern. Establishing an accurate spatial distribution prediction model for PHCs from limited sampling data plays an important role in understanding the migration characteristics of PHCs and effectively preventing soil pollution. This article employs soil samples within 8 m of a chemical plant, in conjunction with hydrogeological data, to model the spatial distribution of PHC content using a feedforward neural network (FNN). The prediction outcomes are characterized through three-dimensional visualization. The findings indicate that the FNN demonstrates superior estimation accuracy compared to traditional interpolation methods. Regarding the horizontal distribution within surface soil, there is pronounced lateral migration of PHC content in both the storage area and the manufacturing shop, with migration following the direction of groundwater. Vertically, PHC content exhibits a consistent pattern of increasing and then decreasing with greater depth. It is predominantly enriched in the lower section of the aeration zone and the upper part of the saturated zone, particularly within 4 m, under the influence of groundwater. The prediction model in this study offers an original approach to modeling the spatial distribution of soil pollutants.