Your search
Results 4 resources
-
Facial expression recognition (FER) is essential for discerning human emotions and is applied extensively in big data analytics, healthcare, security, and user experience enhancement. This paper presents an empirical study that evaluates four existing deep learning models—VGG16, DenseNet, ResNet50, and GoogLeNet—utilizing the Facial Expression Recognition 2013 (FER2013) dataset. The dataset contains seven distinct emotional expressions: angry, disgust, fear, happy, neutral, sad, and surprise. Each model underwent rigorous assessment based on metrics including test accuracy, training duration, and weight file size to test their effectiveness in FER tasks. ResNet50 emerged as the top performer with a test accuracy of 69.46%, leveraging its residual learning architecture to effectively address challenges inherent in training deep neural networks. Conversely, GoogLeNet exhibited the lowest test accuracy among the models, suggesting potential architectural constraints in FER applications. VGG16, while competitive in accuracy, demonstrated lengthier training times and a larger weight file size (512MB), highlighting the inherent balance between model complexity and computational efficiency.
-
Nowadays, the increasing number of medical diagnostic data and clinical data provide more complementary references for doctors to make diagnosis to patients. For example, with medical data, such as electrocardiography (ECG), machine learning algorithms can be used to identify and diagnose heart disease to reduce the workload of doctors. However, ECG data is always exposed to various kinds of noise and interference in reality, and medical diagnostics only based on one-dimensional ECG data is not trustable enough. By extracting new features from other types of medical data, we can implement enhanced recognition methods, called multimodal learning. Multimodal learning helps models to process data from a range of different sources, eliminate the requirement for training each single learning modality, and improve the robustness of models with the diversity of data. Growing number of articles in recent years have been devoted to investigating how to extract data from different sources and build accurate multimodal machine learning models, or deep learning models for medical diagnostics. This paper reviews and summarizes several recent papers that dealing with multimodal machine learning in disease detection, and identify topics for future research.
-
<jats:p>Facial expression recognition (FER) is essential for discerning human emotions and is applied extensively in big data analytics, healthcare, security, and user experience enhancement. This study presents a comprehensive evaluation of ten state-of-the-art deep learning models—VGG16, VGG19, ResNet50, ResNet101, DenseNet, GoogLeNet V1, MobileNet V1, EfficientNet V2, ShuffleNet V2, and RepVGG—on the task of facial expression recognition using the FER2013 dataset. Key performance metrics, including test accuracy, training time, and weight file size, were analyzed to assess the learning efficiency, generalization capabilities, and architectural innovations of each model. EfficientNet V2 and ResNet50 emerged as top performers, achieving high accuracy and stable convergence using compound scaling and residual connections, enabling them to capture complex emotional features with minimal overfitting. DenseNet, GoogLeNet V1, and RepVGG also demonstrated strong performance, leveraging dense connectivity, inception modules, and re-parameterization techniques, though they exhibited slower initial convergence. In contrast, lightweight models such as MobileNet V1 and ShuffleNet V2, while excelling in computational efficiency, faced limitations in accuracy, particularly in challenging emotion categories like “fear” and “disgust”. The results highlight the critical trade-offs between computational efficiency and predictive accuracy, emphasizing the importance of selecting appropriate architecture based on application-specific requirements. This research contributes to ongoing advancements in deep learning, particularly in domains such as facial expression recognition, where capturing subtle and complex patterns is essential for high-performance outcomes.</jats:p>