Results | Kerko

Cui, Y., Liang, S., & Zhang, Y. (2024). Multimodal representation learning for tourism recommendation with two-tower architecture. PLOS ONE, 19(2). https://doi.org/10.1371/journal.pone.0299370

Personalized recommendation plays an important role in many online service fields. In the field of tourism recommendation, tourist attractions contain rich context and content information. These implicit features include not only text, but also images and videos. To make better use of these features, researchers usually introduce richer feature information or more efficient feature representation methods, but the unrestricted introduction of a large amount of feature information reduces the performance of the recommendation system. We propose a novel heterogeneous multimodal representation learning method for tourism recommendation. The proposed model is based on a two-tower architecture, in which the item tower handles multimodal latent features: Bidirectional Long Short-Term Memory (Bi-LSTM) is used to extract the text features of items, an External Attention Transformer (EANet) is used to extract the image features of items, and these feature vectors are combined with item IDs to enrich the feature representation of items. To increase the expressiveness of the model, we introduce a deep fully connected stack layer to fuse the multimodal feature vectors and capture the hidden relationships between them. The model is tested on three different datasets, where it outperforms the baseline models in NDCG and precision.

Liang, S., Chen, T., Ma, J., Ren, S., Lu, X., & Du, G. (2024). Identification of mild cognitive impairment using multimodal 3D imaging data and graph convolutional networks. Physics in Medicine & Biology, 69(23). https://doi.org/10.1088/1361-6560/ad8c94

Abstract. Objective. Mild cognitive impairment (MCI) is a precursor stage of dementia characterized by mild cognitive decline in one or more cognitive domains, without meeting the criteria for dementia. MCI is considered a prodromal form of Alzheimer’s disease (AD). Early identification of MCI is crucial for both intervention and prevention of AD. To accurately identify MCI, a novel multimodal 3D imaging data integration graph convolutional network (GCN) model is designed in this paper. Approach. The proposed model utilizes 3D-VGGNet to extract three-dimensional features from multimodal imaging data (such as structural magnetic resonance imaging and fluorodeoxyglucose positron emission tomography), which are then fused into feature vectors as the node features of a population graph. Non-imaging features of participants are combined with the multimodal imaging data to construct a population sparse graph. Additionally, in order to optimize the connectivity of the graph, we employed the pairwise attribute estimation (PAE) method to compute the edge weights based on non-imaging data, thereby enhancing the effectiveness of the graph structure. Subsequently, a population-based GCN integrates the structural and functional features of different modal images into the features of each participant for MCI classification. Main results. Experiments on the AD Neuroimaging Initiative demonstrated accuracies of 98.57%, 96.03%, and 96.83% for the normal controls (NC)-early MCI (EMCI), NC-late MCI (LMCI), and EMCI-LMCI classification tasks, respectively. The AUC, specificity, sensitivity, and F1-score are also superior to state-of-the-art models, demonstrating the effectiveness of the proposed model. Furthermore, the proposed model is applied to the ABIDE dataset for autism diagnosis, achieving an accuracy of 91.43% and outperforming the state-of-the-art models, indicating excellent generalization capabilities of the proposed model. Significance. This study demonstrates the proposed model’s ability to integrate multimodal imaging data and its excellent ability to recognize MCI. This will help achieve early warning for AD and intelligent diagnosis of other brain neurodegenerative diseases.

Al-Razgan, M., Ali, Y. A., Neira-Molina, H., Ma, H., Du, G., & Ain, Q. U. (2024). Optimizing Fetal Health Status Detection Using Quantum Intelligent Deep-Learning Methods on Cardiotocographic Data. SPIN, 15(02). https://doi.org/10.1142/S2010324724400058

This work compares the performance of five algorithms for classifying fetal health states from fetal heart rate (FHR) signals: quantum Fourier transform (QFT), Gaussian–Newton method (GNM), hyperfast, Metropolis-adjusted Langevin algorithm (MALA), and nonparametric classification and regression trees (NCART). Each algorithm was evaluated using confusion matrices, which provide per-class precision and recall and overall accuracy across three classes: Normal, Suspect, and Pathological. The QFT algorithm gives an overall accuracy of 90%; it is highly reliable in recognizing Normal (94% F1-score) and Pathological states (91% F1-score), but performs poorly on Suspect cases (58% F1-score). The GNM method gives an accuracy of 88%; it performs well on Normal cases (93% F1-score) but poorly on Suspect (50% F1-score) and Pathological classifications (82% F1-score). The hyperfast algorithm yields an accuracy of 89%, performing well on Normal classifications (93% F1-score) but less well on Suspect states (56% F1-score). The MALA algorithm outperformed all other algorithms tested in this study, giving an overall accuracy of 91% and classifying Normal, Suspect, and Pathological states with F1-scores of 94%, 63%, and 90%, respectively; the algorithm is therefore quite robust and reliable for fetal health monitoring. The NCART algorithm achieved an accuracy of 89%, showing strong classification capability for Normal cases (94% F1-score) and Pathological cases (88% F1-score), and moderate capability for Suspect cases (53% F1-score). Overall, while all algorithms exhibit potential for fetal health classification, MALA stands out as the most effective, offering reliable classification across all health states. These findings highlight the need for further refinement, particularly in enhancing the detection of Suspect conditions, to ensure comprehensive and accurate fetal health monitoring.
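
As an illustration of how the per-class figures quoted in this abstract relate to a confusion matrix, the following minimal Python sketch computes precision, recall, F1-score, and overall accuracy. The function names and the sample matrix are hypothetical and chosen only for illustration; they are not taken from the paper, and the counts are not the paper's data.

```python
def per_class_scores(confusion):
    """Return (precision, recall, f1) per class.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, truly other
        fn = sum(confusion[k]) - tp                       # truly k, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total


# Illustrative counts only (rows: true Normal, Suspect, Pathological).
example = [
    [8, 2, 0],
    [1, 6, 3],
    [0, 1, 9],
]
for name, (p, r, f1) in zip(["Normal", "Suspect", "Pathological"],
                            per_class_scores(example)):
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"accuracy={overall_accuracy(example):.2f}")
```

A class can have a low F1-score while overall accuracy stays high when that class is small or is often confused with its neighbours, which is consistent with the pattern the abstract reports for the Suspect class.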

Nizamani, A. H., Chen, Z., Nizamani, A. A., Bhatti, M. A., Ma, H., & Du, G. (2024). Trans-EffNet: A Hybrid Model for Brain Tumor Detection Using EfficientNet and Transformer Encoder. 2024 IEEE Smart World Congress (SWC). https://doi.org/10.1109/SWC62898.2024.00260

Accurate classification of brain tumors from MRI is critical for effective diagnosis and treatment. In this study, we introduce Trans-EffNet, a hybrid model combining pre-trained EfficientNet architectures with a transformer encoder to enhance brain tumor classification accuracy. By leveraging EfficientNet's deep CNN capabilities for localized feature extraction and the transformer encoder for capturing global contextual relationships, our model improves the identification of intricate tumor characteristics. Fine-tuned with ImageNet-derived weights and utilizing extensive data augmentation, Trans-EffNet was validated on both multi-class and binary datasets. Trans-EffNetB1 achieved 99.49% accuracy on the multi-class dataset, while Trans-EffNetB2 recorded 99.83% accuracy on the binary dataset, with perfect precision, recall, and F1-Score. These results underscore Trans-EffNet's robustness and potential as a significant advancement in brain tumor detection and classification.

Wang, H., Chen, Q., Wang, X., Du, G., Li, X., & Nallanathan, A. (2025). Adaptive Block Sparse Backtracking-Based Channel Estimation for Massive MIMO-OTFS Systems. IEEE Internet of Things Journal, 12(1). https://doi.org/10.1109/JIOT.2024.3466911

Orthogonal time frequency space (OTFS) modulation, combined with massive multiple-input–multiple-output (MIMO) technology, offers robust performance in high-mobility environments and high user densities by capturing the full diversity of the wireless channel and effectively utilizing spatial multiplexing. This article introduces an adaptive block sparse backtracking (ABSB) algorithm designed to enhance channel estimation in OTFS with massive MIMO (massive MIMO-OTFS) systems. The proposed ABSB algorithm features dynamic block size adjustment based on the residual signal, improving its adaptability to the varying sparsity structure of the channel. Additionally, the algorithm extends the selection range of related block atoms to increase redundancy, reducing the risk of underfitting. Comprehensive simulation results demonstrate that the ABSB algorithm significantly outperforms traditional pilot-based methods in terms of channel estimation accuracy. It also surpasses the block orthogonal matching pursuit (BOMP) method as well as other classical compressed sensing methods. Specifically, the ABSB algorithm achieves up to a 20% reduction in estimation error compared to some of these traditional methods. The enhanced adaptability and robustness of the ABSB algorithm make it a promising solution for channel estimation in massive MIMO-OTFS systems, paving the way for more reliable and efficient next-generation wireless communications.

Ma, J., Du, W., & Lu, W. (2022). Message from Program Chairs. Proceedings - 2022 IEEE/ACIS 22nd International Conference on Computer and Information Science, ICIS 2022, X. Scopus. https://doi.org/10.1109/ICIS54925.2022.9882414

Ma, J., Du, W., & Lu, W. (2023). Foreword. Studies in Computational Intelligence, 1055, v–vii. Scopus.

Li, X., Jiao, T., Ma, J., Duan, D., & Liang, S. (2023). LSDA-APF: A Local Obstacle Avoidance Algorithm for Unmanned Surface Vehicles Based on 5G Communication Environment. Computer Modeling in Engineering & Sciences, 138(1), 595–617. https://doi.org/10.32604/cmes.2023.029367

In complex marine navigation environments, especially those with multiple static and dynamic obstacles, the traditional obstacle avoidance algorithms applied to unmanned surface vehicles (USVs) are prone to becoming trapped in local optima. Therefore, this paper proposes an improved artificial potential field (APF) algorithm, which uses 5G communication technology for communication between the USV and the control center. The algorithm introduces a USV discrimination mechanism to keep the USV from falling into a local optimum when it encounters different obstacles in different scenarios. Considering the various scenarios involving the USV and other dynamic obstacles, such as vessels, while it performs its tasks, the algorithm introduces the concept of a dynamic artificial potential field. For the multiple obstacles encountered while the USV is sailing, and based on the International Regulations for Preventing Collisions at Sea (COLREGS), the USV uses the discrimination mechanism to determine whether the next step will fall into a local optimum. The local potential field of the USV is then dynamically adjusted, and a reverse virtual gravitational potential field is added to prevent the USV from falling into the local optimum and to avoid collisions. An objective function and a cost function are designed at the same time, so that the USV can switch smoothly between the global path and local obstacle avoidance. The simulation results show that the improved APF algorithm proposed in this paper can successfully avoid various obstacles in the complex marine environment while taking navigation time and economic cost into account.
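
For readers unfamiliar with the baseline that this abstract improves on, the following is a minimal Python sketch of a classical (textbook) artificial potential field step in 2D. It is not the LSDA-APF algorithm: the discrimination mechanism, the dynamic field, the reverse virtual gravitational field, and the COLREGS rules are not modelled. The function names, gains (`k_att`, `k_rep`), and influence radius `d0` are assumptions made for illustration.

```python
import math


def apf_force(pos, goal, obstacles, k_att=1.0, k_rep=100.0, d0=5.0):
    """Net force at pos: attraction to the goal plus repulsion from obstacles.

    pos, goal: (x, y) tuples; obstacles: list of (x, y) points.
    d0 is the distance beyond which an obstacle has no influence.
    """
    # Attractive force, proportional to the vector from pos to goal.
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < d0:
            # Repulsive force grows sharply as the obstacle gets closer
            # and points away from the obstacle.
            mag = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy


def apf_step(pos, goal, obstacles, step=0.5, **gains):
    """Move a fixed distance along the direction of the net force."""
    fx, fy = apf_force(pos, goal, obstacles, **gains)
    norm = math.hypot(fx, fy)
    if norm == 0.0:
        # Zero net force: the vehicle is stuck at a stationary point
        # of the potential, which may be a local optimum.
        return pos
    return (pos[0] + step * fx / norm, pos[1] + step * fy / norm)
```

The `norm == 0.0` branch marks the failure mode the abstract refers to: when attraction and repulsion cancel, the classical method stops moving, which is the situation the paper's discrimination mechanism is described as detecting.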

Liang, S., Jin, J., Du, W., & Qu, S. (2023). A Multi-Channel Text Sentiment Analysis Model Integrating Pre-training Mechanism. Information Technology and Control, 52(2), 263–275. https://doi.org/10.5755/j01.itc.52.2.31803

The number of tourist attraction reviews, travel notes and other texts has grown exponentially in the Internet age. Effectively mining users’ potential opinions and emotions about tourist attractions, and thereby helping to provide users with better recommendation services, is of great practical significance. This paper proposes a multi-channel neural network model called Pre-BiLSTM combined with a pre-training mechanism. The model uses a combination of coarse- and fine-granularity strategies to extract the features of text information such as reviews and travel notes to improve the performance of text sentiment analysis. First, we construct three channels and use the improved BERT and skip-gram methods with negative sampling to vectorize the word-level and vocabulary-level text, respectively, so as to obtain richer textual information. Second, we use the pre-training mechanism of BERT to generate deep bidirectional language representation relationships. Third, the vectors of the three channels are input into the BiLSTM network in parallel to extract global and local features. Finally, the model fuses the text features of the three channels and classifies them using a SoftMax classifier. Numerical experiments demonstrate that Pre-BiLSTM outperforms the baselines by 6.27%, 12.83% and 18.12% on average in terms of accuracy, precision and F1-score.

Liang, S., Sun, F., Sun, H., Chen, T., & Du, W. (2023). A medical text classification approach with ZEN and capsule network. The Journal of Supercomputing. https://doi.org/10.1007/s11227-023-05612-6

Text classification is an important topic in natural language processing. With the development of social networks, many question-and-answer pairs regarding health care and medicine flood social platforms. It is of great social value to mine and classify medical text and provide targeted medical services for patients. The existing text classification algorithms can deal with simple semantic text, but in the field of Chinese medical text the text structure is complex and includes a large number of medical nomenclature and professional terms, which are difficult for patients to understand. We propose a Chinese medical text classification model using a BERT-based Chinese text encoder enhanced by N-gram representations (ZEN) and a capsule network: the ZEN model represents the features and the capsule network extracts them. We also design an N-gram medical dictionary to enhance medical text representation and feature extraction. The experimental results show that the precision, recall and F1-score of our model are improved by 10.25%, 11.13% and 12.29%, respectively, compared with the baseline models on average, which shows that our model has better performance.

Li, N., Yang, X., Du, W., Ogihara, A., Zhou, S., Ma, X., Wang, Y., Li, S., & Li, K. (2022). Exploratory Research on Key Technology of Human-Computer Interactive 2.5-Minute Fast Digital Early Warning for Mild Cognitive Impairment. Computational Intelligence and Neuroscience, 2022, 1–15. https://doi.org/10.1155/2022/2495330

Objective. As the preclinical stage of Alzheimer’s disease (AD), Mild Cognitive Impairment (MCI) is characterized by hidden onset, which is difficult to detect early. Traditional neuropsychological scales are the main tools used for assessing MCI. However, due to their strong subjectivity and the influence of many factors such as subjects’ educational background, language and hearing ability, and time cost, their accuracy as the standard of early screening is low. Therefore, the purpose of this paper is to propose a new key technology of fast digital early warning for MCI based on objective analysis of eye movement data. Methodology. Firstly, four exploratory indexes (test durations, correlation degree, lengths of gaze trajectory, and drift rate) of MCI early warning are determined based on the relevant literature research and semistructured expert interviews; secondly, the eye movement state is captured with an eye tracker to extract the data for the four exploratory indexes. On this basis, the human-computer interactive 2.5-minute fast digital early warning paradigm for MCI is designed; thirdly, the rationality of the four early warning indexes proposed in this paper and their early warning effectiveness on MCI are verified. Results. In a small-sample test of the human-computer interactive 2.5-minute fast digital early warning paradigm for MCI, conducted with 32 elderly people aged 70–90 in a medical institution in Hangzhou, the two indexes that showed statistical differences, “correlation degree” and “drift rate”, are selected. The experiment results show that the AUC of this MCI early warning paradigm is 0.824. Conclusion. The key technology of human-computer interactive 2.5-minute fast digital early warning for MCI proposed in this paper overcomes the limitations of the existing MCI early warning tools, such as low objectification level, high dependence on professional doctors, long test time, and the requirement of a high educational level. The experiment results show that the early warning technology, as a new generation of objective and effective digital early warning tool, can realize 2.5-minute fast and high-precision preliminary screening and early warning for MCI in the elderly.

Wu, W., Jing, X., Du, W., & Chen, G. (2021). Learning dynamics of kernel-based deep neural networks in manifolds. Science China Information Sciences, 64(11), 212103. https://doi.org/10.1007/s11432-020-3022-3

Liang, S., Jin, J., Ren, J., Du, W., & Qu, S. (2023). An Improved Dual-Channel Deep Q-Network Model for Tourism Recommendation. Big Data, big.2021.0353. https://doi.org/10.1089/big.2021.0353

Li, X., Zhang, Y., Jin, J., Sun, F., Li, N., & Liang, S. (2023). A model of integrating convolution and BiGRU dual-channel mechanism for Chinese medical text classifications. PLOS ONE, 18(3), e0282824. https://doi.org/10.1371/journal.pone.0282824

Recently, many Chinese patients have been consulting on treatment plans through social networking platforms, and the Chinese medical text they produce contains rich information, including a large number of medical nomenclatures and symptom descriptions. Building an intelligent model that automatically classifies the text information submitted by patients and recommends the correct department is therefore very important. To address the problem of insufficient feature extraction from Chinese medical text and low accuracy, this paper proposes a dual-channel Chinese medical text classification model. The model extracts features of Chinese medical text at different granularities, comprehensively and accurately obtains effective feature information, and finally recommends departments for patients according to the text classification. One channel of the model focuses on medical nomenclatures, symptoms and other words related to hospital departments, gives them different weights, calculates the corresponding feature vectors with convolution kernels of different sizes, and then obtains a local text representation. The other channel uses the BiGRU network and an attention mechanism to obtain a text representation that highlights the important information of the whole sentence, that is, a global text representation. Finally, the model uses a fully connected layer to combine the representation vectors of the two channels, and uses a Softmax classifier for classification. The experimental results show that the accuracy, recall and F1-score of the model are improved by 10.65%, 8.94% and 11.62%, respectively, compared with the baseline models on average, which shows that our model has better performance and robustness.

Liang, S., Chen, X., Ma, J., Du, W., & Ma, H. (2021). An Improved Double Channel Long Short-Term Memory Model for Medical Text Classification. Journal of Healthcare Engineering, 2021, 1–8. https://doi.org/10.1155/2021/6664893

There are a large number of symptom consultation texts in medical and healthcare Internet communities, and Chinese health segmentation is more complex, which leads to the low accuracy of the existing algorithms for medical text classification. The deep learning model has advantages in extracting abstract features of text effectively. However, for a large number of samples of complex text data, especially for words with ambiguous meanings in the field of Chinese medical diagnosis, the word-level neural network model is insufficient. Therefore, to support the triage and precise treatment of patients, we present an improved Double Channel (DC) mechanism as a significant enhancement to Long Short-Term Memory (LSTM). In this DC mechanism, two channels receive word-level and char-level embeddings, respectively, at the same time. Hybrid attention is proposed, which combines the current time output with the current time unit state and then uses attention to calculate the weight. By calculating the probability distribution of the weights of the input data at each timestep, the weight score is obtained, and then weighted summation is performed. Finally, the data input at each timestep is subjected to trade-off learning to improve the generalization ability of the model. Moreover, we conduct an extensive performance evaluation on two different datasets: cMedQA and Sentiment140. The experimental results show that the DC-LSTM model proposed in this paper has significantly superior accuracy and ROC compared with the basic CNN-LSTM model.
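
The weighting step mentioned in this abstract (a probability distribution over timesteps followed by a weighted summation) can be sketched in generic form as below. This is plain softmax attention pooling in Python, not the paper's hybrid attention: the dot-product scoring against a `query` vector is an assumption, since the abstract does not give the scoring function, and all names are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each timestep's hidden vector and sum them.

    hidden_states: list of equal-length vectors, one per timestep.
    query: vector used to score each timestep (assumed dot-product scoring).
    Returns (weights, context), where context is the weighted sum.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query))
              for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context
```

With a zero `query`, every timestep receives the same weight and the result is the plain average of the hidden vectors; a learned scoring function shifts weight toward the timesteps that matter most for classification.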

Liang, S., Jiao, T., Du, W., & Qu, S. (2021). An improved ant colony optimization algorithm based on context for tourism route planning. PLOS ONE, 16(9), e0257317. https://doi.org/10.1371/journal.pone.0257317

To address the problem of one-sidedly pursuing the shortest distance while ignoring the tourist experience in tourism route planning, an improved ant colony optimization algorithm is proposed. Contextual information about scenic spots significantly affects people’s choice of tourism destination, so the pheromone update strategy is combined with contextual information, such as the weather and comfort degree of the scenic spot, when searching for the globally optimal route, so that the pheromone update favors paths suitable for tourists. At the same time, to avoid falling into local optima, a sub-path support degree is introduced. The experimental results show that the optimized tourism route greatly improves the tourist experience: compared with the basic algorithm, the route distance is shortened by 20.5% and the convergence speed is increased by 21.2%, which shows that the improved algorithm is notably effective.
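
One way to picture a context-aware pheromone update is sketched below in Python. It follows the standard ant colony pattern of evaporation followed by deposit, and scales each ant's deposit by a context score in [0, 1]. The scaling rule `q * context / length`, the parameter names, and the data layout are assumptions for illustration only; the abstract does not state the paper's actual update formula, and the sub-path support degree is not modelled.

```python
def update_pheromone(pheromone, routes, rho=0.1, q=1.0):
    """Evaporate pheromone on all edges, then deposit along each route.

    pheromone: dict mapping (from_spot, to_spot) -> pheromone level.
    routes: list of (path, length, context) tuples, where path is a list
        of spot names, length is the route distance, and context is a
        hypothetical score in [0, 1] summarising weather and comfort.
    rho: evaporation rate; q: deposit constant.
    """
    # Evaporation: every known edge loses a fraction rho of its pheromone.
    for edge in pheromone:
        pheromone[edge] *= (1.0 - rho)
    # Deposit: shorter routes with better context leave more pheromone.
    for path, length, context in routes:
        deposit = q * context / length
        for a, b in zip(path, path[1:]):
            pheromone[(a, b)] = pheromone.get((a, b), 0.0) + deposit
    return pheromone
```

Under this rule, two routes of equal length receive different reinforcement when their context scores differ, so later ants are biased toward the route with better conditions and not only toward the shortest one.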

Yin, F., & Du, W. (2021). Power Allocation for 5G Mobile Multiuser Cooperative Networks. Journal of Advanced Transportation, 2021, 1–7. https://doi.org/10.1155/2021/3882100

With the fifth generation (5G) communication technology, the mobile multiuser networks have developed rapidly. In this paper, the performance analysis of mobile multiuser networks which utilize decode-and-forward (DF) relaying is considered. We derive novel outage probability (OP) expressions. To improve the OP performance, we study the power allocation optimization problem. To solve the optimization problem, we propose an intelligent power allocation optimization algorithm based on grey wolf optimization (GWO). We compare the proposed GWO approach with three existing algorithms. The experimental results reveal that the proposed GWO algorithm can achieve a smaller OP, thus improving system efficiency. Also, compared with other channel models, the OP values of the 2-Rayleigh model are increased by 81.2% and 66.6%, respectively.

He, C., Liu, J., Zhu, Y., & Du, W. (2021). Data Augmentation for Deep Neural Networks Model in EEG Classification Task: A Review. Frontiers in Human Neuroscience, 15, 765525. https://doi.org/10.3389/fnhum.2021.765525

Classification of electroencephalogram (EEG) signals is a key approach to measuring the rhythmic oscillations of neural activity and is one of the core technologies of brain-computer interface systems (BCIs). However, extraction of the features from non-linear and non-stationary EEG signals is still a challenging task in current algorithms. With the development of artificial intelligence, various advanced algorithms have been proposed for signal classification in recent years. Among them, deep neural networks (DNNs) have become the most attractive type of method due to their end-to-end structure and powerful ability of automatic feature extraction. However, it is difficult to collect large-scale datasets in practical applications of BCIs, which may lead to overfitting or weak generalizability of the classifier. To address these issues, data augmentation (DA) has been proposed as a promising technique to improve the performance of the decoding model. In this article, we investigate recent studies and development of various DA strategies for EEG classification based on DNNs. The review consists of three parts: what kinds of EEG-based BCI paradigms are used, what types of DA methods are adopted to improve the DNN models, and what kind of accuracy can be obtained. Our survey summarizes the current practices and performance outcomes that aim to promote or guide the deployment of DA to EEG classification in future research and development.

Lin, C., Chen, Z., Huang, Y., Jiang, H., Du, W., & Chen, Q. (2022). A Deep Neural Network Based on Circular Representation for Target Detection. Journal of Sensors, 2022, 1–10. https://doi.org/10.1155/2022/4437446

Convolutional neural network (CNN) models based on deep learning perform excellently in target detection. However, the detection effect is poor when the object is circular or tubular, because most of the existing object detection methods use the traditional rectangular box to detect and recognize objects. To solve this problem, we propose a circular representation structure and a RepVGG module on the basis of CenterNet and expand the network prediction structure, thus obtaining RebarDet, a high-precision, high-efficiency, lightweight circular object detection method. First, detection of circular tubular objects is optimized by replacing the traditional rectangular box with a circular box. Second, we improve the resolution of the network feature map and raise the upper limit on the number of objects detected in a single detection to expand the network prediction structure, which is optimized for the dense arrangement that often occurs with circular tubular objects. Finally, the multibranch topology of RepVGG is introduced to sum the feature information extracted by different convolution modules, which improves the ability of the convolution module to extract information. We conducted extensive experiments on rebar datasets and used AB-Score as a new evaluation method to evaluate RebarDet. The experimental results show that RebarDet can achieve a detection accuracy of up to 0.8114 and a model inference speed of 6.9 fps while maintaining a moderate number of parameters, which is superior to other mainstream object detection models and verifies the effectiveness of our proposed method. At the same time, RebarDet’s high-precision detection of round tubular objects facilitates enterprise intelligent manufacturing processes.
