Journal articles - International Journal of Intelligent Systems and Applications

All articles: 1173

Creation and comparison of language and acoustic models using Kaldi for noisy and enhanced speech data

Thimmaraja Yadava G., H. S. Jayanna

Research article

In this work, Language Models (LMs) and Acoustic Models (AMs) are developed using the Kaldi speech recognition toolkit for noisy and enhanced speech data to build an Automatic Speech Recognition (ASR) system for the Kannada language. The speech data used for the development of the ASR models was collected in an uncontrolled environment from farmers of different dialect regions of Karnataka state. The collected speech data is preprocessed by a proposed method for noise elimination in the degraded speech data. The proposed method is a combination of Spectral Subtraction with Voice Activity Detection (SS-VAD) and the Minimum Mean Square Error Spectrum Power estimator based on Zero Crossing (MMSE-SPZC). Word-level transcription and validation of the speech data are done with an Indic language transliteration tool (IT3 to UTF-8). The Indian Language Speech Label (ILSL12) set is used for the development of the Kannada phoneme set and lexicon. 75% of the transcribed and validated speech data is used for system training and the remaining 25% for testing. The LMs are generated using Kannada language resources, and the AMs are developed using Gaussian Mixture Models (GMM) and Subspace Gaussian Mixture Models (SGMM). The proposed method is studied in detail and used for enhancing the degraded speech data. The Word Error Rates (WERs) of the ASR models for noisy and enhanced speech data are highlighted and discussed in this work. The developed ASR models can be used in a spoken query system to access real-time agricultural commodity prices and weather information in the Kannada language.
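
As an illustration of the enhancement stage, the sketch below implements plain magnitude spectral subtraction. This is a minimal sketch only: the paper's method additionally uses VAD-driven noise tracking and an MMSE spectrum power estimator, and the function and parameter names here are our own.

```python
import numpy as np

def spectral_subtraction(noisy, noise_ref, frame_len=256, floor=0.002):
    """Plain magnitude spectral subtraction over non-overlapping frames.
    `noise_ref` is assumed to be a noise-only segment of the recording."""
    noise_mag = np.abs(np.fft.rfft(noise_ref[:frame_len]))
    out = np.zeros_like(noisy, dtype=float)
    for start in range(0, len(noisy) - frame_len + 1, frame_len):
        spec = np.fft.rfft(noisy[start:start + frame_len])
        mag = np.abs(spec) - noise_mag                # subtract the noise magnitude estimate
        mag = np.maximum(mag, floor * np.abs(spec))   # spectral floor to limit musical noise
        # resynthesize the frame with the noisy phase
        out[start:start + frame_len] = np.fft.irfft(mag * np.exp(1j * np.angle(spec)))
    return out
```

In a full system the enhanced waveform would then be fed to feature extraction for Kaldi training.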

Free

Credibility Detection on Twitter News Using Machine Learning Approach

Marina Azer, Mohamed Taha, Hala H. Zayed, Mahmoud Gadallah

Research article

Social media presence is a crucial part of our life, and social media are considered a more important source of information than traditional sources. Twitter has become one of the prevalent social sites for exchanging viewpoints and feelings. This work proposes a supervised machine learning system for discovering false news. One of the credibility detection problems is finding new features that are most predictive of better classifier performance. Both features based on news content and features based on the user are used. The importance of the features and their impact on performance are examined, and the reasons for choosing the final feature set using the k-best method are explained. Seven supervised machine learning classifiers are used: Naïve Bayes (NB), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Logistic Regression (LR), Random Forest (RF), Maximum Entropy (ME), and Conditional Random Forest (CRF). Training and testing of the models were conducted using the Pheme dataset. An analysis of the features is introduced and compared with the content-based features as the decisive factors in determining validity. Random Forest shows the highest performance when using user-based features only, with an accuracy of 82.2%, and when using a mixture of both feature types; we achieved the highest results by using both types of features with the Random Forest classifier, reaching an accuracy of 83.4%. In contrast, Logistic Regression was the best when using content-based features only. Performance is measured with several metrics: accuracy, precision, recall, and F1-score. We compared our feature set with the features of other studies, measured the impact of our new features, and found that our results exhibit a notable enhancement in discovering and verifying false news compared to current results.
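
The k-best feature selection and Random Forest pipeline described above can be sketched with scikit-learn. The synthetic data below merely stands in for the Pheme content- and user-based features, so the numbers are illustrative, not the paper's.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a mixed content/user feature matrix.
X, y = make_classification(n_samples=600, n_features=20,
                           n_informative=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Keep the k most predictive features, then train the classifier on them.
selector = SelectKBest(f_classif, k=8).fit(X_tr, y_tr)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(selector.transform(X_tr), y_tr)
acc = accuracy_score(y_te, clf.predict(selector.transform(X_te)))
```

The same pattern applies unchanged once the synthetic matrix is replaced by real extracted tweet features.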

Free

Current Trends of High capacity Optical Interconnection Data Link in High Performance Optical Communication Systems

Ahmed Nabih Zaki Rashed

Research article

Optical technologies are ubiquitous in telecommunications networks and systems, providing multiple wavelength channels of transport at 2.5 Gbit/sec to 40 Gbit/sec data rates over single fiber optic cables. Market pressures continue to drive up the number of wavelength channels per fiber and the data rate per channel. This trend will continue for many years to come as electronic commerce grows and enterprises demand higher and more reliable bandwidth over long distances. Electronic commerce, in turn, is driving the growth curves for single-processor and multiprocessor performance in database transaction and Web-based servers. Ironically, the insatiable appetite for enterprise network bandwidth, which has driven up the volume and pushed down the price of optical components for telecommunications, is simultaneously stressing computer system bandwidth, increasing the need for new interconnection schemes and providing, for the first time, commercial opportunities for optical components in computer systems. The evolution of integrated circuit technology is causing system designs to move towards communication-based architectures. We present the current trends in high-performance system capacity of optical interconnection data transmission links in high-performance optical communication and computing systems over a wide range of the affecting parameters.

Free

DIMK-means “Distance-based Initialization Method for K-means Clustering Algorithm”

Raed T. Aldahdooh, Wesam Ashour

Research article

Partition-based clustering is one of several clustering techniques that attempt to directly decompose a dataset into a set of disjoint clusters. The K-means algorithm, which relies on this partition-based technique, is popular, widely used, and applied to a variety of domains. K-means clustering results are extremely sensitive to the initial centroids; this is one of the major drawbacks of the K-means algorithm. Due to this sensitivity, several different initialization approaches have been proposed for the K-means algorithm over the last decades. This paper proposes a selection method for the initial cluster centroids in K-means clustering to replace the random selection method. The research provides a detailed performance assessment of the proposed initialization method over many datasets with different dimensions, numbers of observations, groups, and clustering complexities. The ability to identify the true clusters is the performance evaluation standard in this research. The experimental results show that the proposed initialization method is more effective and converges to more accurate clustering results than the random initialization method.
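
A common distance-based seeding idea, shown here as a sketch, is maximin (farthest-point) initialization: pick one centroid, then repeatedly add the point farthest from its nearest already-chosen centroid. The DIMK-means procedure may differ in detail; this only illustrates the general principle of replacing purely random selection.

```python
import math
import random

def distance_based_init(points, k):
    """Maximin seeding: each new centroid is the point that maximizes the
    distance to its nearest already-chosen centroid. `points` is a list of
    coordinate tuples."""
    centroids = [random.choice(points)]
    while len(centroids) < k:
        def nearest_dist(p):
            return min(math.dist(p, c) for c in centroids)
        centroids.append(max(points, key=nearest_dist))
    return centroids
```

Seeds chosen this way are spread across the data, so well-separated clusters each tend to receive one initial centroid.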

Free

Data Analysis for the Aero Derivative Engines Bleed System Failure Identification and Prediction

Khalid Salmanov, Hadi Harb

Research article

Medium-size gas/diesel aero-derivative power generation engines are widely used on various industrial plants in the oil and gas industry. Bleed of Valve (BOV) system failure is one of the failure mechanisms of these engines. The BOV is part of the critical anti-surge system, and this kind of failure is almost impossible to identify while the engine is in operation. If the engine operates with an impaired BOV system, this leads to high maintenance costs during overhaul, an increased emission rate, higher fuel consumption, and a loss in efficiency. This paper proposes the use of readily available sensor data in a Supervisory Control and Data Acquisition (SCADA) system in combination with a machine learning algorithm for early identification of BOV system failure. Different machine learning algorithms and dimensionality reduction techniques are evaluated on real-world engine data. The experimental results show that BOV system failures can be effectively predicted from readily available sensor data.
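
A typical building block of such a pipeline is dimensionality reduction of the multichannel SCADA sensor matrix before classification. The sketch below shows PCA via SVD on centred data; this is our own minimal illustration, not the paper's exact preprocessing.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project rows of X onto the top principal components, computed by
    SVD of the column-centred data matrix."""
    Xc = X - X.mean(axis=0)                     # centre each sensor channel
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T             # scores in the reduced space
```

Each row would be one time-stamped vector of sensor readings; the reduced representation then feeds whatever classifier is being evaluated.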

Free

Data Clustering Using Wave Atom

Bilal A.Shehada, Mahmoud Z.Alkurdi, Wesam M. Ashour

Research article

Clustering of huge spatial databases is an important issue in tracking the dense regions of the feature space for use in data mining, knowledge discovery, and efficient information retrieval. A clustering approach should be efficient and able to detect clusters of arbitrary shape, because spatial objects cannot simply be abstracted as isolated points; they differ in boundary, size, volume, and location. In this paper we use the discrete wave atom transformation technique in clustering to achieve more accurate results. By using multi-resolution transformations such as wavelets and wave atoms, we can effectively identify arbitrarily shaped clusters at different degrees of accuracy. Experimental results on very large data sets show the efficiency and effectiveness of the proposed wave-atom-based clustering approach compared to other recent clustering methods; it yields more accurate results and better denoised output.

Free

Data Mining of Students’ Performance: Turkish Students as a Case Study

Oyebade K. Oyedotun, Sam Nii Tackie, Ebenezer O. Olaniyi, Adnan Khashman

Research article

Artificial neural networks have been used in different fields of artificial intelligence, and more specifically in machine learning. Although other machine learning options are feasible in most situations, the ease with which neural networks lend themselves to different problems, including pattern recognition, image compression, classification, computer vision, and regression, has earned them a remarkable place in the machine learning field. This research exploits neural networks as a data mining tool for predicting the number of times a student repeats a course, considering some attributes relating to the course itself, the teacher, and the particular student. Neural networks were used in this work to map the relationship between attributes related to students' course assessment and the number of times a student will possibly repeat a course before passing. The hope is that the ability to predict students' performance from such complex relationships can help facilitate the fine-tuning of academic systems and policies implemented in learning environments. To validate the power of neural networks in data mining, a Turkish students' performance database was used; feedforward and radial basis function networks were trained for this task. The performance of these networks was evaluated in terms of achieved recognition rates and training time.

Free

Data Quality for AI Tool: Exploratory Data Analysis on IBM API

Ankur Jariwala, Aayushi Chaudhari, Chintan Bhatt, Dac-Nhuong Le

Research article

A huge amount of data is produced in every domain these days. Thus, for applying automation to any dataset, appropriately prepared data plays an important role in achieving efficient and accurate results. According to data researchers, data scientists spend 80% of their time preparing and organizing data. To reduce this tedious work, IBM Research has developed the Data Quality for AI tool, which offers a variety of metrics that can be applied to different datasets (in .csv format) to assess the quality of the data. In this paper, we show how the IBM API toolkit can be used with different variants of datasets and present the results for each metric in graphical form. Readers may find this paper useful for understanding the workflow of the IBM data purifier tool; to that end, we present the entire flow of using the IBM Data Quality for AI toolkit in the form of an architecture.

Free

Data Transformation and Predictive Analytics of Cardiovascular Disease Using Machine and Ensemble Learning Techniques

J. Cruz Antony, E. Murali, D. Deepa, R. Vignesh, S. Hemalatha, Umme Fahad

Research article

About one person dies every minute from cardiovascular disease; consequently, it has almost surpassed war as the largest cause of death in the twenty-first century. In cardiology, early and accurate diagnosis of heart illness is a cornerstone of effective healthcare. Predictive analytics based on machine-learning algorithms can contribute substantially to the early detection of cardiovascular disease. This study evaluates the data preprocessing techniques involved in building machine learning models to predict cardiovascular disease and identifies the features contributing to a cardiac event. A novel data transformation technique named the superlative boundary binning method is proposed to enhance machine learning and ensemble learning classification models for predicting cardiac illness based on independent physiological feature parameters. The results revealed that the ensemble learning classifier AdaBoost using the superlative boundary binning method performed well, with a classification accuracy of 93%, when compared with the other data transformation and machine learning classifier models.
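
The superlative boundary binning method itself is the paper's contribution; as a point of reference, a generic equal-width binning transform (the kind of baseline such a method is compared against) can be sketched as:

```python
def equal_width_bins(values, n_bins):
    """Equal-width binning of one numeric feature into integer bin indices.
    A generic baseline transform, not the paper's superlative boundary
    binning method."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0   # guard against a constant feature
    return [min(int((v - lo) / width), n_bins - 1) for v in values]
```

Each physiological feature column would be discretized this way before training the classifiers.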

Free

Data Visualization and its Proof by Compactness Criterion of Objects of Classes

Saidov Doniyor Yusupovich

Research article

This paper considers the problem of reducing the dimensionality of the feature space by nonlinearly mapping object descriptions onto a numerical axis. To reduce the dimensionality of the space, rules of agglomerative hierarchical grouping of different-type (nominal and quantitative) features are used. The groups do not intersect with each other, and their number is unknown in advance. The elements of each group are mapped onto the numerical axis to form a latent feature. The set of latent features is sorted by informativeness during the hierarchical grouping process. A visual representation of objects obtained with this set or a subset of it serves as a tool for extracting hidden regularities in databases. The criterion for evaluating the compactness of class objects is based on analyzing the structure of their connectivity. The analysis uses an algorithm that partitions the representatives of a group into disjoint classes by identifying subsets of boundary objects. The algorithm guarantees a unique number of groups and a unique assignment of objects to them. This uniqueness property is used to calculate the compactness measure of the training samples. The value of compactness is a dimensionless quantity in the interval [0, 1]. Dimensionless quantities are needed for estimating the structure of the feature space when comparing different metrics, normalization methods, data transformations, and the selection and removal of noise objects.

Free

Data-driven Approximation of Cumulative Distribution Function Using Particle Swarm Optimization based Finite Mixtures of Logistic Distribution

Rajasekharreddy Poreddy, Gopi E.S.

Research article

This paper proposes a data-driven approximation of the Cumulative Distribution Function using finite mixtures of the Cumulative Distribution Function of the logistic distribution. Since it is not possible to solve the logistic mixture model using the maximum likelihood method, the mixture model is fitted to approximate the empirical cumulative distribution function using computational intelligence algorithms. The Probability Density Function is obtained by differentiating the estimate of the Cumulative Distribution Function. The proposed technique estimates the Cumulative Distribution Function of different benchmark distributions, and its performance is compared with the state-of-the-art kernel density estimator and the Gaussian Mixture Model. Experimental results on the κ−μ distribution show that the proposed technique performs equally well in estimating the probability density function while outperforming the alternatives in estimating the cumulative distribution function. The experimental results also show that the proposed technique outperforms the state-of-the-art Gaussian Mixture Model and kernel density estimation techniques with less training data.
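
The core objects involved can be sketched directly: the CDF of a finite logistic mixture, and the squared-error objective against the empirical CDF that a PSO (or any computational intelligence optimizer) would minimize. Function names here are our own.

```python
import math

def logistic_mix_cdf(x, weights, locs, scales):
    """CDF of a finite mixture of logistic distributions; `weights` should
    be non-negative and sum to one."""
    return sum(w / (1.0 + math.exp(-(x - m) / s))
               for w, m, s in zip(weights, locs, scales))

def ecdf_error(params, data):
    """Squared error between the mixture CDF and the empirical CDF of the
    data -- the fitness a particle swarm would minimize over `params`."""
    weights, locs, scales = params
    xs = sorted(data)
    n = len(xs)
    return sum((logistic_mix_cdf(x, weights, locs, scales) - (i + 1) / n) ** 2
               for i, x in enumerate(xs))
```

A PSO run would propose candidate `(weights, locs, scales)` triples and keep the one with the smallest `ecdf_error`.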

Free

Decision-Making Using Efficient Confidence-Intervals with Meta-Analysis of Spatial Panel Data for Socioeconomic Development Project-Managers

Ashok Sahai, Clement K. Sankat, Koffka Khan

Research article

It is quite common to have access to geospatial (temporal/spatial) panel data generated by a set of similar studies for analysis in a meta-data setup. Within this context, researchers often employ pooling methods to evaluate the efficacy of meta-data analysis. One of the simplest techniques used to combine individual-study results is the fixed-effects model, which assumes that the true effect is equal for all studies. An alternative, and intuitively more appealing, method is the random-effects model. A paper addressing the efficient estimation problem using this method in the aforesaid meta-data setup of geospatial data was presented by the first author and his co-authors at the Map World Forum meeting in 2007 in Hyderabad, India. The purpose of that paper was to address the estimation problem of the fixed-effects model and to present a simulation study of efficient confidence-interval estimation of the mean true effect using the panel data and a random-effects model, in order to establish confidence-interval estimation readily usable in a decision-making setup. The present paper continues the same perspective and proposes a much more efficient estimation strategy, furthering the gainful use of geospatial panel data in global, continental, regional, and national contexts of socioeconomic and other developmental issues. The statistical theme of efficient confidence-interval estimation has a wider ambit than its applicability to socioeconomic development only; it is equally applicable to any area in which data mapping arises, for example, the topically significant area of global environmental pollution mitigation for arresting the critical phenomenon of global warming. Such issues have become more tractable as the impactful advances in GIS and GPS technologies have led to the concept of managing the "global village" in terms of geospatial meta-data. This fact has given the authors special motivation to produce this improved paper, containing a much more efficient strategy of confidence-interval estimation for decision-making teams of managers in any area of application.
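
The fixed-effects baseline mentioned above is the standard inverse-variance pooling of study effects; a minimal sketch with a normal-approximation 95% confidence interval is:

```python
import math

def fixed_effect_pool(effects, variances):
    """Inverse-variance fixed-effects pooling of study effect estimates.
    Returns the pooled effect and its 95% confidence interval (normal
    approximation). This is the textbook baseline, not the paper's more
    efficient estimator."""
    weights = [1.0 / v for v in variances]          # precision weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))              # standard error of the pooled effect
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)
```

A random-effects variant would inflate each study variance by an estimate of the between-study heterogeneity before weighting.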

Free

Deep Hybrid System of Computational Intelligence with Architecture Adaptation for Medical Fuzzy Diagnostics

Iryna Perova, Iryna Pliss

Research article

In this paper a deep hybrid system of computational intelligence with architecture adaptation for medical fuzzy diagnostics is proposed. This system makes it possible to increase the quality of medical information processing under the condition of overlapping classes, thanks to its special adaptive architecture and training algorithms. The deep hybrid system under consideration can tune its architecture in situations where the numbers of features and diagnoses are variable. Special algorithms for its training are developed and optimized for different system architectures without retraining the synaptic weights that were tuned at previous steps. The proposed system was used for processing three medical data sets (a dermatology dataset, the Pima Indians diabetes dataset, and a Parkinson's disease dataset) both under a fixed number of features and diagnoses and when these numbers increase. A number of conducted experiments have shown the high quality of the medical diagnostic process and confirmed the efficiency of the deep hybrid system of computational intelligence with architecture adaptation for medical fuzzy diagnostics.

Free

Deep Learning Based Traffic Management in Knowledge Defined Network

Tejas M. Modi, Kuna Venkateswararao, Pravati Swain

Research article

In recent Artificial Intelligence developments, large datasets as knowledge are a prime requirement for analysis and prediction. To manage the knowledge of the network, the Data Center Network (DCN) has been considered a global data storage facility on edge servers and cloud servers. In recent research trends, the knowledge-defined networking (KDN) architecture is considered, where the management plane works as the knowledge plane. The major network management task in the DCN is to control traffic congestion. To improve network management, i.e., to optimize resource management and enhance Quality of Service (QoS), we propose a path prediction technique that combines a convolution layer with RNN deep learning models: a Convolution-Long Short-Term Memory network (Convolution-LSTM) and a Convolution-Bidirectional Long Short-Term Memory network (Convolution-BiLSTM). The experimental results demonstrate that, in terms of several metrics, i.e., network latency, packet loss ratio, network throughput, and overhead, our proposed methodologies perform better than existing works, i.e., OSPF, FlowDCN, modified discrete PSO, ANN, CNN, and LSTM-based routing approaches. The proposed approach improves network throughput by approximately 30% and 12% compared to the existing CNN and LSTM-based routing approaches, respectively.

Free

Deep Learning Sign Language Recognition System Based on Wi-Fi CSI

Marwa R. M. Bastwesy, Nada M. El Shennawy, Mohamed T. Faheem Saidahmed

Research article

Many sensing gesture recognition systems based on Wi-Fi signals have been introduced because commercial off-the-shelf Wi-Fi devices require no additional equipment. In this paper, a deep learning-based sign language recognition system is proposed. Wi-Fi CSI amplitude and phase information is used as input to the proposed model. The proposed model uses three types of deep learning: CNN, LSTM, and ABLSTM, with a complete study of the impact of optimizers, the use of the amplitude and phase of CSI, and the preprocessing phase. Accuracy, F-score, precision, and recall are used as performance metrics to evaluate the proposed model. The proposed model achieves 99.855%, 99.674%, 99.734%, and 93.84% average recognition accuracy for the lab, home, lab + home, and 5 different users in a lab environment, respectively. Experimental results show that the proposed model can effectively detect sign gestures in complex environments compared with some deep learning recognition models.

Free

Deep Learning for Robust Facial Expression Recognition: A Resilient Defense Against Adversarial Attacks

Tinuk Agustin, Moch. Hari Purwidiantoro, Mochammad Luthfi Rahmadi

Research article

Adversarial attacks can be extremely dangerous, particularly in scenarios where the precision of facial expression identification is of utmost importance. Employing adversarial training methods proves effective in mitigating these threats, but the technique requires large computing resources. This study aims to strengthen deep learning model resilience against adversarial attacks while optimizing performance and resource efficiency. Our proposed method uses adversarial training techniques to create adversarial examples, which are permanently stored as a separate dataset. This strategy helps the model learn and enhances its resilience to adversarial attacks. This study also evaluates models by subjecting them to adversarial attacks, such as the One Pixel Attack and the Fast Gradient Sign Method, to identify any potential vulnerabilities. Moreover, we use two different model architectures to see how well they are protected against adversarial attacks, and we compare their performance to determine the best model for making systems more resistant while still maintaining good performance. The findings show that the combination of the proposed adversarial training technique and an efficient model architecture results in increased resistance to adversarial attacks. This also improves the reliability of the model and saves computational resources, as evidenced by the high accuracy of 98.81% achieved on the CK+ dataset. The adversarial training technique proposed in this study offers an efficient alternative that overcomes the limitations of computational resources, fortifying the model against adversarial attacks and yielding significant increases in model resilience without loss of performance.
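
The Fast Gradient Sign Method used in the evaluation can be sketched compactly. For clarity the model here is a linear-logistic classifier rather than the paper's CNNs, and the function names are our own:

```python
import numpy as np

def logistic_grad_wrt_input(x, w, b, y):
    """Gradient of the binary cross-entropy loss with respect to the input
    for a linear-logistic model (a minimal stand-in for a deep network)."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))   # predicted probability of class 1
    return (p - y) * w

def fgsm(x, w, b, y, eps=0.1):
    """Fast Gradient Sign Method: perturb the input along the sign of the
    loss gradient to craft an adversarial example."""
    return x + eps * np.sign(logistic_grad_wrt_input(x, w, b, y))
```

For a deep network the gradient would come from backpropagation through the whole model; the perturbation rule itself is unchanged.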

Free

Deep Learning in Character Recognition Considering Pattern Invariance Constraints

Oyebade K. Oyedotun, Ebenezer O. Olaniyi, Adnan Khashman

Research article

Character recognition is a field of machine learning that has been under research for several decades. The particular success of neural networks in pattern recognition, and therefore character recognition, is laudable. Research has long shown that a single-hidden-layer network has the capability to approximate any function, while the problems associated with training deep networks led to little attention being given to them. Recently, breakthroughs in training deep networks through various pre-training schemes have led to their resurgence and to massive interest in them, as they significantly outperform shallow networks in several pattern recognition contests; moreover, the more elaborate distributed representation of knowledge present in the different hidden layers concords with findings on the biological visual cortex. This research work reviews some of the most successful pre-training approaches to initializing deep networks, such as stacked autoencoders and deep belief networks, based on achieved error rates. More importantly, this research also investigates the performance of deep networks on some common problems associated with pattern recognition systems, such as translational invariance, rotational invariance, scale mismatch, and noise. To achieve this, Yoruba vowel character databases have been used in this research.

Free

Defect Analysis Using Artificial Neural Network

S. Bhuvaneswari, J. Sabarathinam

Research article

This paper deals with the detection of defects in manufactured ceramic tiles to ensure high quality. The problem concerns the automatic inspection of ceramic tiles using an Artificial Neural Network (ANN). The performance of the technique has been evaluated theoretically and experimentally on samples. The architecture of the system involves binary matrix processing and the use of an ANN to detect defects. The automatic inspection procedures have been implemented and tested on company floor tiles. The results obtained confirmed the efficiency of the methodology in detecting defects in raw tiles and its relevance as a promising matrix-based approach, suitable for inclusion in quality control and inspection programs.

Free

Defuzzification Index for Ranking of Fuzzy Numbers on the Basis of Geometric Mean

Nalla Veerraju, V. Lakshmi Prasannam, L. N. P. Kumar Rallabandi

Research article

The importance of fuzzy numbers for expressing uncertainty in applications concerned with decision making is observed in a large number of problems of different kinds. In decision-making problems, the best of the available alternatives must be chosen, and in the process of ordering the alternatives, the ranking of fuzzy numbers plays a key role. A large number of ranking methods, based on different features, are available in this domain. Owing to the complicated nature of fuzzy numbers, the methods introduced so far have suffered setbacks, posed difficulties, or shown drawbacks in one context or another; in addition, some methods are lengthy and complicated to apply to the problems concerned. In this article, a new ranking procedure based on defuzzification, stemming from the concepts of the geometric mean and the height of a fuzzy number, is proposed. Finally, numerical comparisons are made with other existing procedures to test and validate the proposed method with the support of some standard numerical examples.
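
To illustrate the general idea of a geometric-mean-based defuzzification index, the sketch below scores a positive triangular fuzzy number (a, b, c) by the geometric mean of its defining points and ranks alternatives by that score. This is a hypothetical index written for illustration only; the paper's exact formula, which also involves the height of the fuzzy number, differs.

```python
def geo_mean_defuzz(tfn):
    """Hypothetical defuzzification index for a positive triangular fuzzy
    number (a, b, c): the geometric mean of its three defining points."""
    a, b, c = tfn
    return (a * b * c) ** (1.0 / 3.0)

def rank_fuzzy(tfns):
    """Order fuzzy numbers from best to worst by decreasing index value."""
    return sorted(tfns, key=geo_mean_defuzz, reverse=True)
```

Any defuzzification index can be dropped into `rank_fuzzy` in place of `geo_mean_defuzz`, which is how competing ranking procedures are compared on standard examples.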

Free

Delay Computation Using Fuzzy Logic Approach

Pandey M.K., Dandotiya A., Trivedi M.K., Bhadoriya S.S., Ramasesh G. R.

Research article

The paper presents a practical application of fuzzy sets and system theory in predicting, with reasonable accuracy, delay arising from a wide range of factors pertaining to construction projects. In this paper we use fuzzy logic to predict delays on account of delayed supplies and labor shortage. It is observed that project scheduling software uses either deterministic or probabilistic methods for the computation of schedule durations, delays, lags, and other parameters. In other words, these methods use only quantitative inputs, leaving out the qualitative aspects associated with individual activities of work. A qualitative aspect, e.g., the expertise of the mason or a lack of experience, can have a significant impact on the assessed duration, yet such qualitative aspects do not find adequate representation in project scheduling software. A realistic project is considered, for which a PERT chart has been prepared showing all the major activities in reasonable detail. This project was periodically updated until its completion, and it was observed that some of the activities were delayed due to extraneous factors, resulting in an overall delay of the project. The software has the capability to calculate the overall delay through the Critical Path Method (CPM) when each of the activity delays is reported. We demonstrate that, by using fuzzy logic, these delays could have been predicted well in advance.
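
A toy Mamdani-style rule base shows the flavor of such a delay predictor: triangular memberships over supply delay (days) and labor shortage (percent), with a weighted average of rule consequents as the crisp predicted delay. The membership ranges and consequent values below are invented for illustration, not taken from the paper.

```python
def tri(x, a, b, c):
    """Triangular membership function with peak at b (requires a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def predict_delay(supply_delay, labor_shortage):
    """Tiny two-input fuzzy rule base: (firing strength, consequent delay
    in days) per rule, defuzzified by weighted average."""
    rules = [
        # both low -> small delay
        (min(tri(supply_delay, -1, 0, 5), tri(labor_shortage, -10, 0, 50)), 1.0),
        # both medium -> about a week
        (min(tri(supply_delay, 2, 5, 8), tri(labor_shortage, 30, 60, 90)), 7.0),
        # both high -> severe delay
        (min(tri(supply_delay, 5, 10, 15), tri(labor_shortage, 50, 100, 150)), 15.0),
    ]
    total = sum(w for w, _ in rules)
    return sum(w * d for w, d in rules) / total if total else 0.0
```

A realistic system would add rules for mixed conditions (e.g., low supply delay but high labor shortage) and calibrate the membership ranges against observed project data.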

Free

Journal