Results 9 resources

  • As safety is one of the most important properties of drugs, chemical toxicology prediction has received increasing attention in drug discovery research. Traditionally, researchers rely on in vitro and in vivo experiments to test the toxicity of chemical compounds. However, not only are these experiments time-consuming and costly, but experiments that involve animal testing are increasingly subject to ethical concerns. While traditional machine learning (ML) methods have been used in the field with some success, the limited availability of annotated toxicity data is the major hurdle to further improving model performance. Inspired by the success of semi-supervised learning (SSL) algorithms, we propose a Graph Convolution Neural Network (GCN) to predict chemical toxicity and train the network with the Mean Teacher (MT) SSL algorithm. Using the Tox21 data, our optimal SSL-GCN models for predicting the twelve toxicological endpoints achieve an average ROC-AUC score of 0.757 on the test set, which is a 6% improvement over GCN models trained by supervised learning and conventional ML methods. Our SSL-GCN models also exhibit superior performance when compared to models constructed using the built-in DeepChem ML methods. This study demonstrates that SSL can increase the prediction power of models by learning from unannotated data. The optimal unannotated-to-annotated data ratio ranges between 1:1 and 4:1. Given this success in chemical toxicity prediction, the same technique is expected to benefit other chemical property prediction tasks by utilizing existing large chemical databases. Our optimal model SSL-GCN is hosted on an online server accessible through: https://app.cbbio.online/ssl-gcn/home.

  • Neuropeptides are a group of neuronal signaling molecules that regulate physiological and behavioral processes in animals. Here, we used in silico mining to predict the polypeptide composition of available transcriptomic data of Turbinaria peltata. In total, 118 transcripts encoding putative peptide precursors were discovered. One neuropeptide Y/F-like peptide, named TpNPY, was identified and selected for in silico structural, in silico binding, and pharmacological studies. The anti-inflammatory effect of TpNPY was evaluated using an LPS-stimulated C8-D1A astrocyte cell model. Our results demonstrated that TpNPY, at 0.75–3 μM, inhibited LPS-induced NO production and reduced the expression of iNOS in a dose-dependent manner. Furthermore, TpNPY reduced the secretion of proinflammatory cytokines. Additionally, treatment with TpNPY reduced LPS-mediated elevation of ROS production and the intracellular calcium concentration. Further investigation revealed that TpNPY downregulated the IKK/IκB/NF-κB signaling pathway and inhibited expression of the NLRP3 inflammasome. Through molecular docking and using an NPY receptor antagonist, TpNPY was shown to have the ability to interact with the NPY Y1 receptor. On the basis of these findings, we concluded that TpNPY might prevent LPS-induced injury in astrocytes through activation of the NPY-Y1R.

  • Stock movement prediction is one of the most challenging problems in time series analysis due to the stochastic nature of financial markets. In recent years, a plethora of statistical methods and machine learning algorithms have been proposed for stock movement prediction. In particular, deep learning models are increasingly applied to the prediction of stock movement. The success of deep learning models relies on the assumption that massive training data are available. However, this assumption is impractical for stock movement prediction. In stock markets, a large number of stocks do not have enough historical data, especially those of companies that underwent an initial public offering in recent years. In these situations, the accuracy of deep learning models in predicting stock movement could be affected. To address this problem, we propose novel instance-based deep transfer learning models with an attention mechanism. In the experiments, we compare our proposed methods with state-of-the-art prediction models. Experimental results on three public datasets reveal that our proposed methods significantly improve the performance of deep learning models when limited training data are available.

  • Ligand peptides that have high affinity for ion channels are critical for regulating ion flux across the plasma membrane. These peptides are now being considered as potential drug candidates for many diseases, such as cardiovascular disease and cancers. In this work, we developed Multi-Branch-CNN, a CNN method with multiple input branches for identifying three types of ion channel peptide binders (sodium, potassium, and calcium) from intra- and inter-feature types. Real-world applications require prediction models that are able to recognize novel sequences having high or low similarities to training sequences. To this end, we tested our models on two test sets: a general test set including sequences spanning different similarity levels to those of the training set, and a novel-test set consisting of only sequences that bear little resemblance to sequences from the training set. Our experiments showed that the Multi-Branch-CNN method performs better than thirteen traditional ML algorithms (TML13), yielding an improvement in accuracy of 3.2%, 1.2%, and 2.3% on the test sets as well as 8.8%, 14.3%, and 14.6% on the novel-test sets for sodium, potassium, and calcium ion channels, respectively. We confirmed the effectiveness of Multi-Branch-CNN by comparing it to the standard CNN method with one input branch (Single-Branch-CNN) and an ensemble method (TML13-Stack). The data sets, script files to reproduce the experiments, and the final predictive models are freely available at https://github.com/jieluyan/Multi-Branch-CNN.

  • Despite the levels of air pollution in Macao continuing to improve over recent years, there are still days with high-pollution episodes that cause great health concerns to the local community. Therefore, it is very important to accurately forecast air quality in Macao. Machine learning methods such as random forest (RF), gradient boosting (GB), support vector regression (SVR), and multiple linear regression (MLR) were applied to predict the levels of particulate matter (PM10 and PM2.5) concentrations in Macao. The forecast models were built and trained using the meteorological and air quality data from 2013 to 2018, and the air quality data from 2019 to 2021 were used for validation. Our results show that there is no significant difference between the performance of the four methods in predicting the air quality data for 2019 (before the COVID-19 pandemic) and 2021 (the new normal period). However, RF performed significantly better than the other methods for 2020 (amid the pandemic) with a higher coefficient of determination (R2) and lower RMSE, MAE, and BIAS. The reduced performance of the statistical MLR and other ML models was presumably due to the unprecedented low levels of PM10 and PM2.5 concentrations in 2020. Therefore, this study suggests that RF is the most reliable prediction method for pollutant concentrations, especially in the event of drastic air quality changes due to unexpected circumstances, such as a lockdown caused by a widespread infectious disease.

  • Approximately 50 million people worldwide suffer from epilepsy. Corals have been used for treating epilepsy in traditional Chinese medicine, but the mechanism of this treatment is unknown. In this study, we analyzed the transcriptome of the branching coral Acropora digitifera and obtained its Kyoto Encyclopedia of Genes and Genomes (KEGG), EuKaryotic Orthologous Groups (KOG) and Gene Ontology (GO) annotation. Combined with multiple sequence alignment and phylogenetic analysis, we discovered three polypeptides from A. digitifera, which we named AdKuz1, AdKuz2 and AdKuz3, that showed a close relationship to Kunitz-type peptides. Molecular docking and molecular dynamics simulation indicated that AdKuz1 to 3 could interact with the GABAA receptor, but the AdKuz2–GABAA complex remained more stable than the others. The biological experiments showed that AdKuz1 and AdKuz2 exhibited an anti-inflammatory effect by decreasing the aberrant levels of nitric oxide (NO), IL-6, TNF-α and IL-1β induced by LPS in BV-2 cells. In addition, the pentylenetetrazol (PTZ)-induced epileptic effect on zebrafish was remarkably suppressed by AdKuz1 and AdKuz2. AdKuz2 in particular showed superior anti-epileptic effects compared to the other two peptides. Furthermore, AdKuz2 significantly decreased the expression of c-fos and npas4a, which were up-regulated by PTZ treatment. In addition, AdKuz2 reduced the synthesis of glutamate and enhanced the biosynthesis of gamma-aminobutyric acid (GABA). In conclusion, the results indicated that AdKuz2 may affect the synthesis of glutamate and GABA and enhance the activity of the GABAA receptor to inhibit the symptoms of epilepsy. We believe AdKuz2 could be a promising anti-epileptic agent and that its mechanism of action should be further investigated.

  • Antimicrobial resistance has become a critical global health problem due to the abuse of conventional antibiotics and the rise of multi-drug-resistant microbes. Antimicrobial peptides (AMPs) are a group of natural peptides that show promise as next-generation antibiotics due to their low toxicity to the host, their broad spectrum of biological activity (including antibacterial, antifungal, antiviral, and anti-parasitic activities), and their great therapeutic potential, such as anticancer and anti-inflammatory effects. Most importantly, AMPs kill bacteria by damaging cell membranes using multiple mechanisms of action rather than targeting a single molecule or pathway, making it difficult for bacterial drug resistance to develop. However, experimental approaches used to discover and design new AMPs are very expensive and time-consuming. In recent years, there has been considerable interest in applying in silico methods, including traditional machine learning (ML) and deep learning (DL) approaches, to drug discovery. While there are a few papers summarizing computational AMP prediction methods, none of them focuses on DL methods. In this review, we aim to survey the latest AMP prediction methods based on DL approaches. First, the biological background of AMPs is introduced, then various feature encoding methods used to represent the features of peptide sequences are presented. We explain the most popular DL techniques and highlight the recent works based on them to classify AMPs and design novel peptide sequences. Finally, we discuss the limitations and challenges of AMP prediction.

  • The key challenge of Unsupervised Domain Adaptation (UDA) for analyzing time series data is to learn domain-invariant representations by capturing complex temporal dependencies. In addition, existing unsupervised domain adaptation methods for time series data are designed to align the marginal distribution between source and target domains. However, existing UDA methods (e.g., R-DANN (Purushotham et al., 2017), VRADA (Purushotham et al., 2017), and CoDATS (Wilson et al., 2020)) neglect the conditional distribution discrepancy between the two domains, leading to misclassification in the target domain. Therefore, to learn domain-invariant representations by capturing the temporal dependencies and to reduce the conditional distribution discrepancy between the two domains, this paper proposes a novel Attentive Recurrent Adversarial Domain Adaptation method with Top-k time series pseudo-labeling, called ARADA-TK. In the experiments, our proposed method was compared with the state-of-the-art UDA methods (R-DANN, VRADA and CoDATS). Experimental results on four benchmark datasets revealed that ARADA-TK achieves superior classification accuracy compared with the competing methods.

  • Air pollution in Macau has become a serious problem following the Pearl River Delta’s (PRD) rapid industrialization that began in the 1990s. With this in mind, Macau needs an air quality forecast system that accurately predicts pollutant concentrations during pollution episodes to warn the public ahead of time. Five state-of-the-art machine learning (ML) algorithms, namely artificial neural networks (ANN), random forest (RF), extreme gradient boosting (XGBoost), support vector machine (SVM), and multiple linear regression (MLR), were applied to create predictive models that forecast PM2.5, PM10, and CO concentrations for the next 24 and 48 h, in order to determine the best ML algorithms for the respective pollutants and time scales. The diurnal measurements of air quality data in Macau from 2016 to 2021 were obtained for this work. The 2020 and 2021 datasets were used for model testing, while the four years of data before 2020 were used to build and train the ML models. Results show that the ANN, RF, XGBoost, SVM, and MLR models were able to provide good performance in building up a 24-h forecast with a higher coefficient of determination (R2) and lower root mean square error (RMSE), mean absolute error (MAE), and biases (BIAS). Meanwhile, the 48-h forecasting performance of all the ML models was satisfactory enough to be accepted as a two-day continuous forecast, even though the R2 value was lower than for the 24-h forecast. The 48-h forecasting model could be further improved by proper feature selection based on the 24-h dataset, using the Shapley Additive Explanations (SHAP) value test and the adjusted R2 value of the 48-h forecasting model. In conclusion, the above five ML algorithms were able to successfully forecast pollutant concentrations in Macau 24 and 48 h ahead, with the RF and SVM models performing the best in the prediction of PM2.5 and PM10, and CO in both 24 and 48-h forecasts.
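
The two air-quality abstracts above compare models by the coefficient of determination (R2), RMSE, MAE, and BIAS. The sketch below shows one common way to compute these four scores. It is not taken from either paper: the function name `forecast_metrics` and the sample values are hypothetical, R2 is assumed to be 1 - SS_res/SS_tot (some studies report the squared Pearson correlation instead), and BIAS is assumed to be the mean of predicted minus observed, which the abstracts do not define.

```python
import math


def forecast_metrics(observed, predicted):
    """Return R2, RMSE, MAE and BIAS for paired observed/predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    n = len(observed)
    # Signed errors: positive means the model over-predicts (assumed convention).
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "BIAS": sum(errors) / n,
    }


# Hypothetical concentration values, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
metrics = forecast_metrics(observed, predicted)
print(metrics)
```

Under these conventions a better model has R2 closer to 1 and RMSE, MAE, and absolute BIAS closer to 0, which matches how both abstracts rank the methods.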

Last update from database: 3/28/24, 4:01 PM (UTC)