Digital Library
Vol. 20, No. 4, Aug. 2024
Mika Anttonen, Dongwann Kang
Vol. 20, No. 4, pp. 418-431, Aug. 2024
https://doi.org/10.3745/JIPS.02.0216
Keywords: image annotation, Medical Imaging, Virtual reality
Abstract: The use of virtual reality (VR) in the healthcare field has been gaining attention lately. The main use cases revolve around medical imaging and clinical skill training, and healthcare professionals have found great benefit in performing these tasks in VR. While desktop medical imaging is served by a wide range of software with various tools, VR versions are mostly stripped down to basic tools. One tool group that is notably missing is annotation. In this paper, we survey the current state of medical imaging software both on the desktop and in the VR environment. We discuss general information on medical imaging and provide examples of both desktop and VR applications. We also discuss the current status of annotation in VR, the problems that need to be overcome, and possible solutions to them. The findings of this paper should help developers of future medical image annotation tools choose which problems to tackle and which methods to apply, and they will inform our own future work on developing annotation tools.
Pingping Li, Feng Zhang
Vol. 20, No. 4, pp. 432-441, Aug. 2024
https://doi.org/10.3745/JIPS.03.0199
Keywords: Binary system, Cloud computing, Full Homomorphic Encryption Algorithm, Gene Matching
Abstract: To improve the security of gene information and the accuracy of matching, this paper designs a homomorphic encryption algorithm for gene matching in a cloud computing environment. First, the gene sequences in cloud files uploaded by users are collected and converted into binary code by a binary function, so that the encrypted text differs clearly from the original text. The binary code of genes in the database is then compared with the generated code to complete gene matching. Experimental analysis indicates that when the number of fragments in a 1 GB gene file is 65, the minimum encryption time of the algorithm is 80.13 ms. In addition, the algorithm achieves the lowest gene matching time and energy consumption, at 85.69 ms and 237.89 J, respectively.
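The abstract does not give the encoding details; as a rough illustration of the binary-conversion and matching steps it describes (with the homomorphic encryption layer omitted), a minimal Python sketch might look like the following. The 2-bit nucleotide mapping and the function names are assumptions, not the authors' implementation.

```python
# Hypothetical sketch: 2-bit binary encoding of gene sequences and exact matching.
# The homomorphic encryption described in the paper is NOT implemented here.

NUCLEOTIDE_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}  # assumed mapping

def to_binary(sequence: str) -> str:
    """Convert a nucleotide sequence to a binary code string."""
    return "".join(NUCLEOTIDE_BITS[base] for base in sequence.upper())

def match_gene(query: str, database: dict[str, str]) -> list[str]:
    """Return IDs of database sequences whose binary code contains the query's code."""
    query_code = to_binary(query)
    return [gene_id for gene_id, seq in database.items()
            if query_code in to_binary(seq)]

if __name__ == "__main__":
    db = {"gene1": "ACGTACGT", "gene2": "TTGACCA"}
    print(match_gene("GTAC", db))  # -> ['gene1']
```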
Phitchawat Lukkanathiti, Chantana Chantrapornchai
Vol. 20, No. 4, pp. 442-457, Aug. 2024
https://doi.org/10.3745/JIPS.04.0313
Keywords: Cloud computing, Container Technology, Drone Imagery Processing, Infrastructure as Code, WebODM
Abstract: We study the scaling issues of an open-source drone imagery platform called WebODM. Processing drone images demands substantial resources because of the many preprocessing and post-processing steps involved, such as image loading, orthophoto generation, georeferencing, texturing, and meshing. By default, WebODM allocates one node for processing. We explored methods to expand the platform's capability to handle many processing requests, which should benefit platform designers. Our primary objective was to enhance WebODM's performance in supporting concurrent users through container technology. We modified the original process to scale the task vertically and horizontally using a Kubernetes cluster, and the scaling approaches enabled the platform to handle more concurrent users. We measured the response time per active thread and the number of responses per second. Compared with the original WebODM, our modified version sometimes had a response time up to 1.9% longer. Nonetheless, processing throughput improved by up to 101% over the original WebODM, with some differences in the drone image processing results. Finally, we discuss integration with infrastructure as code to automate the scaling.
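The abstract does not describe the scaling mechanism in detail; as a hedged illustration of horizontal scaling on a Kubernetes cluster, a processing deployment could be scaled out with `kubectl` as in the Python sketch below. The deployment name `nodeodm`, the namespace, and the threshold values are hypothetical, not taken from the paper.

```python
# Hypothetical sketch: horizontally scale a NodeODM-style processing deployment
# on Kubernetes based on the number of queued processing requests.
import subprocess

def scale_processing_nodes(deployment: str, namespace: str, replicas: int) -> None:
    """Set the replica count of a Kubernetes deployment via kubectl."""
    subprocess.run(
        ["kubectl", "scale", f"deployment/{deployment}",
         f"--replicas={replicas}", "-n", namespace],
        check=True,
    )

def desired_replicas(queued_tasks: int, tasks_per_node: int = 2, max_nodes: int = 8) -> int:
    """Simple policy: one node per `tasks_per_node` queued tasks, capped at `max_nodes`."""
    return max(1, min(max_nodes, -(-queued_tasks // tasks_per_node)))  # ceiling division

if __name__ == "__main__":
    # e.g., 5 queued tasks -> 3 processing nodes
    scale_processing_nodes("nodeodm", "webodm", desired_replicas(5))
```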
Huijuan Sun
Vol. 20, No. 4, pp. 458-476, Aug. 2024
https://doi.org/10.3745/JIPS.01.0106
Keywords: Bi-directional Gated Recurrent Unit, Class Imbalance, deep neural network, Edge Computing, Network intrusion detection, Transformer-Encoder
Abstract: To address the issue of class imbalance in network traffic data, which degrades network intrusion detection performance, a combined framework using transformers is proposed. First, Tomek Links, SMOTE, and WGAN are used to preprocess the data and mitigate the class-imbalance problem. Second, a transformer encoder is used to encode traffic data and extract the correlations within network traffic. Finally, a hybrid deep learning model combining a bidirectional gated recurrent unit (BiGRU) and a deep neural network (DNN) is proposed: the BiGRU extracts long-dependence features, the DNN extracts deep-level features, and softmax completes the classification. Experiments were conducted on the NSL-KDD, UNSW-NB15, and CICIDS2017 datasets, and the detection accuracy rates of the proposed model were 99.72%, 84.86%, and 99.89%, respectively. Compared with other recent deep learning models, it effectively improves intrusion detection performance, thereby improving the communication security of network data.
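As a hedged sketch of the resampling part of such a preprocessing pipeline (the WGAN augmentation and transformer encoder are not shown), the widely used `imbalanced-learn` package can combine SMOTE oversampling with Tomek Links cleaning; the dataset and parameter values below are placeholders, not the paper's settings.

```python
# Hypothetical sketch: rebalance an imbalanced traffic dataset with SMOTE + Tomek Links.
# The WGAN-based augmentation and transformer encoding from the paper are not shown.
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.combine import SMOTETomek

# Placeholder imbalanced data standing in for preprocessed network-traffic features.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.95, 0.05],
                           random_state=42)
print("before:", Counter(y))

# SMOTE oversamples the minority class; Tomek Links removes borderline majority samples.
resampler = SMOTETomek(random_state=42)
X_res, y_res = resampler.fit_resample(X, y)
print("after: ", Counter(y_res))
```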
Sinhye Nam, Chaerin Jang, Sunyoung Kim
Vol. 20, No. 4, pp. 477-490, Aug. 2024
https://doi.org/10.3745/JIPS.04.0314
Keywords: Corpus Analysis, Keywords Analysis, Learners of Korean as a Heritage Language, Vocabulary Development
Abstract: This study identifies the vocabulary usage patterns of Korean heritage language learners. We analyzed the interlanguage of Korean heritage language learners and examined their vocabulary usage patterns, especially the major content keywords used at each proficiency level. The Korean Learner's Corpus from the National Institute of Korean Language was used for the data analysis. We found that as heritage language learners' proficiency increases, low-frequency (high-level) vocabulary appears more often among the keywords, and the semantic areas of the vocabulary expand from daily life to social and then to specialized fields. This confirms that the vocabulary use of Korean heritage language learners develops as their proficiency increases. The study thus documents the development of Korean vocabulary in heritage language learners and exemplifies how corpus-based applied linguistic research and computer science can be integrated through a keyword extraction algorithm.
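The abstract does not specify the keyword extraction algorithm; one common corpus-linguistic approach is log-likelihood keyness, which ranks words over-represented in a study corpus relative to a reference corpus. The sketch below is a generic illustration of that idea with toy data, not the authors' implementation.

```python
# Hypothetical sketch: rank keywords by log-likelihood keyness
# (study corpus vs. reference corpus), a common corpus-linguistics measure.
import math
from collections import Counter

def keyness(study_tokens: list[str], reference_tokens: list[str], top_n: int = 10):
    """Return the top_n words most over-represented in the study corpus."""
    study, ref = Counter(study_tokens), Counter(reference_tokens)
    n_study, n_ref = len(study_tokens), len(reference_tokens)
    scores = {}
    for word, a in study.items():
        b = ref.get(word, 0)
        # Expected frequencies under the null hypothesis of equal relative frequency.
        e1 = n_study * (a + b) / (n_study + n_ref)
        e2 = n_ref * (a + b) / (n_study + n_ref)
        ll = 2 * (a * math.log(a / e1) + (b * math.log(b / e2) if b else 0))
        scores[word] = ll
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

if __name__ == "__main__":
    study = "경제 정책 사회 문제 경제 발전".split()       # advanced-level tokens (toy data)
    reference = "학교 친구 음식 학교 가족 날씨".split()    # everyday tokens (toy data)
    print(keyness(study, reference, top_n=3))
```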
Hui Xu
Vol. 20, No. 4, pp. 491-500, Aug. 2024
https://doi.org/10.3745/JIPS.02.0217
Keywords: DTW, English, Endpoint Detection, MFCC, online teaching, Voice Recognition
Abstract: To address the poor quality of speech input in online spoken English teaching, this study builds an English teaching assistance model based on the dynamic time warping (DTW) recognition algorithm and automated voice recognition technology. To improve the algorithm's efficiency, the study adjusts the time-domain properties of the speech signal during pre-processing and reduces the algorithm's computational effort and storage requirements. Finally, a simulation experiment is used to evaluate the effectiveness of the model. The findings show that the revised DTW model achieves recognition rates above 95% for all phonetic symbols and performs best on voiced consonant recognition, with rates of 98.5%, 98.8%, and 98.7% across the three tests, respectively. The enhanced DTW voice recognition model is also more efficient, requiring less time for training and testing. In the KS analysis, the DTW model's KS value of 0.63 is the highest among the models compared, and the model also shows the lowest curve position for both test functions. These results indicate that the upgraded DTW model has superior voice recognition capabilities, which could significantly improve online English teaching and lead to better outcomes.
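As a hedged illustration of the core DTW alignment used in such models (without the paper's pre-processing or endpoint-detection refinements), a standard dynamic-programming implementation over MFCC-style feature sequences might look like this; the feature arrays are placeholders.

```python
# Hypothetical sketch: classic dynamic time warping (DTW) distance between two
# feature sequences (e.g., MFCC frames). The paper's refinements are not included.
import numpy as np

def dtw_distance(x: np.ndarray, y: np.ndarray) -> float:
    """DTW distance between sequences x (n, d) and y (m, d) with Euclidean frame cost."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(x[i - 1] - y[j - 1])
            # Best of insertion, deletion, and match predecessors.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    template = rng.normal(size=(40, 13))   # placeholder MFCC template (40 frames, 13 coeffs)
    utterance = rng.normal(size=(55, 13))  # placeholder test utterance
    print(dtw_distance(template, utterance))
```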
Eunjo Jang, Ki Yong Lee
Vol. 20, No. 4, pp. 501-513, Aug. 2024
https://doi.org/10.3745/JIPS.04.0315
Keywords: Ensemble Method, Gradient Boosting, Graph Neural Network
Abstract: In recent years, graph neural networks (GNNs) have been used extensively to analyze graph data across various domains because of their powerful capability to learn complex graph-structured data. However, most research has focused on improving the performance of a single GNN with only two or three layers, because stacking layers deeply causes the over-smoothing problem, which significantly degrades GNN performance. Ensemble methods, on the other hand, combine individual weak models to obtain better generalization performance. Among them, gradient boosting is a powerful supervised learning algorithm that adds new weak models in the direction that reduces the errors of the previously created weak models; repeating this process and combining the weak models produces a strong model with better performance. Improving GNN performance by combining multiple GNNs has not yet been studied much. In this paper, we propose gradient boosted graph neural networks (GBGNN), which combine multiple shallow GNNs with gradient boosting. We use shallow GNNs as weak models and create new weak models using the proposed gradient boosting-based loss function. Our empirical evaluations on three real-world datasets demonstrate that GBGNN performs much better than a single GNN. Specifically, in experiments using a graph convolutional network (GCN) and a graph attention network (GAT) as weak models on the Cora dataset, GBGNN improves node classification accuracy by 12.3 percentage points and 6.1 percentage points over a single GCN and a single GAT, respectively.
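As a hedged sketch of the gradient-boosting idea the paper builds on (weak models fitted to the negative gradient of the loss and added to the ensemble), the toy example below uses shallow decision trees as stand-in weak learners; the paper itself uses shallow GNNs and a dedicated boosting-based loss, which are not reproduced here.

```python
# Hypothetical sketch: gradient boosting for binary classification with logistic loss.
# Stand-in weak learners are shallow regression trees; the paper uses shallow GNNs.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeRegressor

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

learning_rate, n_rounds = 0.1, 50
F = np.zeros(len(y))          # ensemble score (log-odds), initialized to zero
weak_models = []

for _ in range(n_rounds):
    p = 1.0 / (1.0 + np.exp(-F))           # current probability estimates
    residual = y - p                        # negative gradient of the logistic loss
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    weak_models.append(tree)
    F += learning_rate * tree.predict(X)    # add the new weak model to the ensemble

accuracy = np.mean((F > 0).astype(int) == y)
print(f"training accuracy after boosting: {accuracy:.3f}")
```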
Gang Cheng, Hanlin Zhang, Jie Lin, Fanyu Kong, Leyun Yu
Vol. 20, No. 4, pp. 514-523, Aug. 2024
https://doi.org/10.3745/JIPS.03.0200
Keywords: Decision Tree, Heart Disease Diagnosis, Secure multi-party computation
Abstract: In the Internet of Medical Things, medical information is sensitive, so data typically need to be retained locally. A model trained on heart disease data can predict patients' health status effectively and thereby provide reliable disease information, so making full use of multiple data sources in Internet of Medical Things applications is crucial for improving model accuracy. As network communication speeds and computational capabilities continue to evolve, approaches in which each party keeps its data locally and uses privacy protection technology to exchange information during model construction are receiving increasing attention. This shift toward secure and efficient data collaboration is expected to transform computer modeling in healthcare by ensuring both accuracy and privacy in the analysis of critical medical information. In this paper, we train and test a multiparty decision tree model for the Internet of Medical Things on a heart disease dataset, addressing the challenge of developing a practical and usable model while protecting heart disease data. Experimental results demonstrate that the accuracy of our privacy-protecting method reaches 93.24%, a difference of only 0.3% compared with a conventional plaintext algorithm.
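The abstract does not detail the secure multi-party computation protocol; as a hedged illustration of one basic building block often used in such settings, the sketch below shows additive secret sharing, where each party splits a private value into random shares so that an aggregate (here, a sum of class counts) can be reconstructed without any party revealing its raw data. This is a generic example, not the authors' protocol.

```python
# Hypothetical sketch: additive secret sharing of private counts among parties,
# a common building block of secure multi-party computation (not the paper's protocol).
import random

PRIME = 2**61 - 1  # arithmetic is done modulo a large prime

def share(secret: int, n_parties: int) -> list[int]:
    """Split a secret integer into n additive shares that sum to it modulo PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares: list[int]) -> int:
    return sum(shares) % PRIME

if __name__ == "__main__":
    # Each hospital holds a private count of positive heart-disease cases.
    private_counts = [120, 87, 45]
    # Every hospital distributes one share of its count to each party.
    all_shares = [share(c, n_parties=3) for c in private_counts]
    # Each party locally sums the shares it received; combining those partial
    # sums reconstructs the total without revealing any individual count.
    partial_sums = [sum(col) % PRIME for col in zip(*all_shares)]
    print(reconstruct(partial_sums))  # -> 252
```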
Deokyeon Jang, Minsoo Kim, Yon Dohn Chung
Vol. 20, No. 4, pp. 524-534, Aug. 2024
https://doi.org/10.3745/JIPS.04.0316
Keywords: Anonymization, Data Perturbation, data privacy, Personal Data Protection
Abstract: The release of relational data containing personal sensitive information poses a significant risk of privacy breaches, so techniques that protect sensitive information must be applied before such data are published. One technique widely used for this purpose is data perturbation, favored for its simplicity and efficiency; however, it has limitations that hinder its practical application, and alternative solutions are needed to overcome them. In this study, we propose a novel approach to preserving privacy in the release of relational data containing personal sensitive information. The approach comprises an intuitive, syntactic privacy criterion for data perturbation and two perturbation methods for relational data release. Through experiments with synthetic and real data, we evaluate the performance of our methods.
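The abstract does not specify the two perturbation methods; as a hedged illustration of what attribute-level perturbation of a relational table can look like, the sketch below adds zero-mean noise to a numeric column and randomly swaps values within a quasi-identifier column. These are generic textbook perturbations, not the methods proposed in the paper.

```python
# Hypothetical sketch: two textbook perturbation operations on a relational table
# (additive noise and value swapping); not the paper's proposed methods.
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)

df = pd.DataFrame({
    "age":     [34, 51, 27, 63, 45],
    "zipcode": ["13001", "13002", "13001", "13005", "13009"],
    "disease": ["flu", "cancer", "flu", "diabetes", "cancer"],  # sensitive attribute
})

# 1) Additive noise perturbation on a numeric attribute.
perturbed = df.copy()
perturbed["age"] = (df["age"] + rng.normal(0, 3, size=len(df))).round().astype(int)

# 2) Swapping perturbation on a quasi-identifier: randomly permute the values so
#    marginal statistics are preserved but record-level linkage is broken.
perturbed["zipcode"] = rng.permutation(df["zipcode"].to_numpy())

print(perturbed)
```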
Jianzeng Chen, Ningning Chen
Vol. 20, No. 4, pp. 535-549, Aug. 2024
https://doi.org/10.3745/JIPS.01.0107
Keywords: contextual information, Convolutional Channel Attention, Deep Learning, facial expression recognition, feature fusion, Reverse Attention
Abstract: Facial expressions (FEs) serve as fundamental components of human emotion assessment and human-computer interaction. Traditional convolutional neural networks tend to overlook valuable information during FE feature extraction, resulting in suboptimal recognition rates. To address this problem, we propose a deep learning framework that incorporates hierarchical feature fusion, contextual information, and an attention mechanism for precise FE recognition. Our approach uses an enhanced VGGNet16 as the backbone network and introduces an improved group convolutional channel attention (GCCA) module in each block to emphasize the crucial expression features. A partial decoder is added at the end of the backbone network to fuse multilevel features into a comprehensive feature map. A reverse attention mechanism guides the model to refine details layer by layer while introducing contextual information and extracting richer expression features. To enhance feature distinguishability, we employ islanding loss in combination with softmax loss as a joint loss function. Experimental results on two open datasets demonstrate the effectiveness of our framework, which achieves average accuracy rates of 74.08% on the FER2013 dataset and 98.66% on the CK+ dataset, outperforming advanced methods in both recognition accuracy and stability.
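The paper's GCCA module is not described in the abstract; as a hedged sketch of the general channel-attention idea it builds on (squeeze-and-excitation style reweighting of feature channels), a minimal PyTorch module might look like the following. The reduction ratio and layer choices are assumptions, not the authors' design.

```python
# Hypothetical sketch: a generic squeeze-and-excitation style channel attention block,
# illustrating channel reweighting; not the paper's GCCA module.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)             # squeeze: global spatial average
        self.fc = nn.Sequential(                         # excitation: per-channel weights
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                     # reweight feature channels

if __name__ == "__main__":
    feat = torch.randn(2, 64, 28, 28)        # placeholder feature map from a backbone block
    print(ChannelAttention(64)(feat).shape)  # torch.Size([2, 64, 28, 28])
```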
Xiaoyan Huang
Vol. 20, No. 4, pp. 550-557, Aug. 2024
https://doi.org/10.3745/JIPS.04.0317
Keywords: Coherence, Communication, Cultural Noise, Spoken English
Abstract: This paper provides a brief overview of cultural noise interference in English communication. It then conducts an illustrative analysis of 100 first-year students from Chongqing Vocational College of Light Industry to explore the impact of cultural noise on speaking coherence. A questionnaire is first used to assess how cultural noise influences students' judgments of speaking coherence; conversation scenarios involving different types of cultural noise interference are then introduced to progressively analyze the students' speaking coherence. The results reveal a significant impact of cultural noise on learners' speaking coherence: as the variety of cultural noise increases, its influence on speaking coherence grows more pronounced.
Kuldeep Gurjar, Surjeet Kumar, Arnav Bhavsar, Kotiba Hamad, Yang-Sae Moon, Dae Ho Yoon
Vol. 20, No. 4, pp. 558-573, Aug. 2024
https://doi.org/10.3745/JIPS.04.0318
Keywords: Explainable Deep Learning, Face Image Quality Assessment, Image Classification, MobileNet, Transfer Learning
Abstract: Considering factors such as illumination, camera quality variations, and background-specific variations, identifying a face from a smartphone-based facial image capture application is challenging. Face image quality assessment refers to the process of taking a face image as input and producing some form of "quality" estimate as output. Quality assessment techniques typically use deep learning methods to categorize images, but deep learning models are generally treated as black boxes, which raises the question of their trustworthiness. Several explainability techniques have gained importance in building this trust by providing visual evidence of the regions of an image on which a deep learning model bases its prediction. Here, we developed a technique for the reliable prediction of facial images before medical analysis and security operations. A combination of gradient-weighted class activation mapping (Grad-CAM) and local interpretable model-agnostic explanations (LIME) is used to explain the model. This approach has been applied to the preselection of facial images for skin feature extraction, which is important in critical medical science applications. We demonstrate that the combined explanations provide better visual explanations for the model, with both the saliency-map and perturbation-based explainability techniques verifying its predictions.
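As a hedged sketch of the Grad-CAM half of such an explanation pipeline (the LIME perturbation-based explanation is not shown), the example below computes a class activation map from the last convolutional block of a MobileNetV2 classifier. The model, target layer, and random input are placeholders, not the authors' fine-tuned setup.

```python
# Hypothetical sketch: a minimal Grad-CAM saliency map for a MobileNetV2 classifier,
# illustrating gradient-weighted class activation mapping; not the authors' pipeline.
import torch
import torch.nn.functional as F
from torchvision.models import mobilenet_v2

model = mobilenet_v2(weights=None).eval()   # placeholder; the paper fine-tunes MobileNet
target_layer = model.features[-1]           # last convolutional block (assumed target)

captured = {}
target_layer.register_forward_hook(lambda m, i, o: captured.update(act=o))

x = torch.randn(1, 3, 224, 224)             # placeholder face image tensor
scores = model(x)
class_score = scores[0, scores.argmax()]

# Grad-CAM: gradient of the class score w.r.t. the captured activation map,
# averaged per channel to weight that channel's activation, then ReLU.
grads = torch.autograd.grad(class_score, captured["act"])[0]
weights = grads.mean(dim=(2, 3), keepdim=True)
cam = F.relu((weights * captured["act"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
print(cam.shape)   # torch.Size([1, 1, 224, 224])
```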