International Journal of Information Technology and Computer Science @ijitcs
Journal articles - International Journal of Information Technology and Computer Science
All articles: 1211

Covering Based Pessimistic Multigranular Rough Equalities and their Properties
Research article
The basic rough set theory, introduced by Pawlak as a model to capture imprecision in data, has been extended in many directions, and covering-based rough set models are among these extensions. From the granular computing point of view, basic rough sets are unigranular by nature. Two extensions to the multigranular setting, called optimistic and pessimistic multigranulation, were proposed by Qian et al. in 2006 and 2010 respectively. Combining the concepts of covering and multigranulation, Liu et al. introduced covering-based multigranular models in 2012. Rough equalities, which relax the stringent mathematical equality of sets, were introduced by Novotny and Pawlak in 1985. Three more types of such approximate equalities were introduced by Tripathy in 2011. In this paper we study the approximate equalities introduced by Novotny and Pawlak from the pessimistic multigranular computing point of view and establish several of their properties. These concepts and properties are shown to be useful in approximate reasoning.
Free
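To make the terminology in this abstract concrete, here is a minimal sketch of pessimistic multigranular approximations and the rough equalities built on them. It is a simplification, not the paper's construction: it assumes each granulation is a partition (the paper works with coverings), and all function names and the toy data are hypothetical.

```python
from typing import Hashable, List, Set

# Assumption: each granulation is a partition of the universe, given as a
# list of disjoint blocks. The paper itself uses coverings.
Partition = List[Set[Hashable]]


def block_of(partition: Partition, x: Hashable) -> Set[Hashable]:
    """Return the block of `partition` that contains x."""
    for block in partition:
        if x in block:
            return block
    raise ValueError(f"{x!r} is not covered by the partition")


def universe_of(partitions: List[Partition]) -> Set[Hashable]:
    """All elements of the first granulation (assumed to cover the universe)."""
    return set().union(*partitions[0])


def pessimistic_lower(partitions: List[Partition], target: Set[Hashable]) -> Set[Hashable]:
    # x is certainly in the target only if its block lies inside the target
    # under EVERY granulation.
    return {x for x in universe_of(partitions)
            if all(block_of(p, x) <= target for p in partitions)}


def pessimistic_upper(partitions: List[Partition], target: Set[Hashable]) -> Set[Hashable]:
    # Dual definition: x is possibly in the target if its block meets the
    # target under AT LEAST ONE granulation.
    return {x for x in universe_of(partitions)
            if any(block_of(p, x) & target for p in partitions)}


def bottom_rough_equal(partitions, a, b) -> bool:
    """Bottom rough equality: the lower approximations coincide."""
    return pessimistic_lower(partitions, a) == pessimistic_lower(partitions, b)


def top_rough_equal(partitions, a, b) -> bool:
    """Top rough equality: the upper approximations coincide."""
    return pessimistic_upper(partitions, a) == pessimistic_upper(partitions, b)


# Toy universe {1, ..., 6} with two granulations (hypothetical data).
P = [{1, 2}, {3, 4}, {5, 6}]
Q = [{1}, {2, 3}, {4}, {5, 6}]
X = {1, 2, 3}
Y = {1, 2, 3, 5}

print(pessimistic_lower([P, Q], X))      # {1, 2}
print(pessimistic_upper([P, Q], X))      # {1, 2, 3, 4}
print(bottom_rough_equal([P, Q], X, Y))  # True
print(top_rough_equal([P, Q], X, Y))     # False
```

In this toy case X and Y differ as sets yet share the same pessimistic lower approximation, so they are bottom rough equal without being top rough equal.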

Research article
Over the last few years, the amount of video data has increased significantly, and the need for video summarization has grown accordingly. Video summarization condenses a long video into a smaller number of frames while preserving its semantic content. In this paper, we propose an approach that takes all the frames of a video and detects shot boundaries using the color moment and SURF (Speeded Up Robust Features). Redundancy among similar frames is then eliminated using the color histogram. Finally, a summary slide is generated from the remaining frames, which are semantically similar to the overall content of the original video. Our experimental results are based on a questionnaire-based user survey, which shows on average a 78% positive response against a 3.5% negative response. This result is quite satisfactory in comparison with existing techniques.
Free

Credible Mechanism for More Reliable Search Engine Results
Research article
The number of websites on the Internet is growing randomly, thanks to the HTML language. Consequently, a wide variety of information is available on the Web, but its content is sometimes neither valuable nor trustworthy. This raises the problem of the credibility of the information on these websites. This paper investigates the aspects that affect website credibility and then uses them, along with the dominant meaning of the query, to improve information retrieval and to manage content effectively. It presents the design and development of a credibility mechanism that queries a Web search engine and then ranks sites according to their reliability. Our experiments show that credibility terms on websites can affect search engine ranking and greatly improve retrieval effectiveness.
Free

Credit Card Fraud Detection System Using Machine Learning
Research article
The security of any system is a key factor in its acceptance by the general public. We propose an intuitive machine learning approach to fraud detection in financial institutions by designing a Hybrid Credit Card Fraud Detection (HCCFD) system, which performs anomaly detection using a genetic algorithm and a multivariate normal distribution to identify fraudulent credit card transactions. An imbalanced dataset of credit card transactions, with a target variable indicating whether each transaction is fraudulent or not, was used with the HCCFD. Using the F-score as the performance metric, the model was tested and gave a prediction accuracy of 93.5%, against 84.2%, 80.0% and 68.5% for an artificial neural network, a decision tree and a support vector machine respectively, when trained on the same data set. The results show a significant improvement over these widely used algorithms.
Free
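As a rough illustration of anomaly detection with a normal density and an F-score, the sketch below fits a Gaussian to legitimate transactions and flags low-density points. It is not the HCCFD system: it assumes a diagonal covariance (independent features), replaces the genetic algorithm with a plain scan over candidate thresholds, and uses hypothetical function names and toy data.

```python
import math
from typing import List, Sequence, Tuple


def fit_diagonal_gaussian(rows: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-feature mean and variance (assumption: features are independent)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    variances = [sum((r[j] - means[j]) ** 2 for r in rows) / n for j in range(d)]
    return means, variances


def log_density(row: Sequence[float], means: List[float], variances: List[float]) -> float:
    """Log of the Gaussian density; low values indicate unusual transactions."""
    return sum(-0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
               for x, m, v in zip(row, means, variances))


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Scan every observed score as a threshold and keep the best F1.

    Stand-in for the genetic algorithm mentioned in the abstract.
    """
    best_eps, best_f1 = min(scores), -1.0
    for eps in sorted(set(scores)):
        preds = [1 if s <= eps else 0 for s in scores]
        score = f1_score(labels, preds)
        if score > best_f1:
            best_eps, best_f1 = eps, score
    return best_eps, best_f1


# Hypothetical toy data: (amount, hours since previous transaction).
legitimate = [(10.0, 1.0), (11.0, 1.2), (9.5, 0.9), (10.5, 1.1), (9.0, 0.8), (10.2, 1.0)]
fraudulent = [(50.0, 5.0)]

means, variances = fit_diagonal_gaussian(legitimate)
transactions = legitimate + fraudulent
labels = [0] * len(legitimate) + [1] * len(fraudulent)
scores = [log_density(t, means, variances) for t in transactions]

eps, f1 = best_threshold(scores, labels)
predictions = [1 if s <= eps else 0 for s in scores]
print(predictions, f1)
```

The F-score is used here because accuracy is misleading on imbalanced data: a model that flags nothing would still be correct on almost every transaction.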

Cuckoo Search Algorithm for Stellar Population Analysis of Galaxies
Research article
The cuckoo search algorithm (CS) is a simple and effective global optimization algorithm. It has been applied to a wide range of real-world optimization problems. In this paper, an improved Cuckoo Search Algorithm (ICS) is presented for determining the age and relative contribution of different stellar populations in galaxies. The results indicate that, in terms of solution quality, the proposed method performs better than, or at least comparably to, state-of-the-art methods from the literature. The proposed algorithm is applied to the integrated color of the galaxy NGC 3384. Simulation results further demonstrate that the proposed method is very effective. The study shows that cuckoo search can be applied successfully to a wide range of stellar population and space optimization problems.
Free

Current State and Future Trends in Location Recommender Systems
Research article
Technological developments in mobile devices have enabled the use of geographical data in social networks. Accordingly, location-based social networks have become very attractive, and their popularity has prompted researchers to study recommendation systems for location-based services. Many studies develop location recommendation systems using various variables and algorithms. However, articles that detail past and present studies and make suggestions for the future are limited. This study therefore aims to review thoroughly the research performed on location recommender systems. For this purpose, the topic pairs "location and recommender system" and "location and recommendation system" were searched in the Web of Knowledge database. The resulting articles were examined in detail with respect to the data sources and variables, algorithms, and evaluation techniques used. The current state of location recommender systems research is thus summarized, and future recommendations are provided for researchers and developers. It is expected that the issues presented in this paper will advance the discussion of next-generation location recommendation systems.
Free

Customer Credit Risk Assessment using Artificial Neural Networks
Research article
Because the granting of banking facilities in recent years has faced problems such as customer credit risk, which directly affects profitability, customer credit risk assessment has become imperative for banks; it is used to distinguish good applicants from those who will probably default on repayments. In credit risk assessment, a score is assigned to each customer and compared with a cut-off score that separates the two classes of applicants, so that each customer is classified as either a good or a bad applicant. Because of their good performance and their capacity for classification, generalization and pattern learning, Multi-layer Perceptron neural network models trained with various Back-Propagation (BP) algorithms were considered in designing the evaluation model in this study. The BP algorithms used were Levenberg-Marquardt (LM), gradient descent, conjugate gradient, resilient, BFGS quasi-Newton, and one-step secant. Each of these six networks was trained with different numbers of neurons in its hidden layer. Mean squared error (MSE) is used as the criterion for choosing the optimum number of neurons in the hidden layer. The results showed that the LM algorithm converges faster and achieves better performance than the other algorithms. Finally, when its classification performance was compared with that of other classification algorithms such as Logistic Regression and Decision Tree, the neural network model outperformed them in customer credit risk assessment. Because the cost that the Type II error rate imposes in credit models is very high, the Receiver Operating Characteristic curve is used to find an appropriate cut-off point for a model that, in addition to high accuracy, has a lower Type II error rate.
Free
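The cut-off selection described at the end of this abstract can be sketched as follows. The abstract does not say which rule is applied to the ROC curve, so this example assumes Youden's J statistic (true positive rate minus false positive rate) as the selection criterion; it also assumes that a higher score means higher default risk and that "default" is the positive class. The function names and data are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float, float]]:
    """Return (cutoff, TPR, FPR) for every distinct score used as a cut-off.

    A customer is predicted to default (label 1) when score >= cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((cutoff, tp / positives, fp / negatives))
    return points


def youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float, float]:
    """Pick the ROC point that maximizes J = TPR - FPR (assumed criterion)."""
    return max(roc_points(scores, labels), key=lambda p: p[1] - p[2])


# Hypothetical model outputs: risk score per customer, 1 = defaulted.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 0, 0]

cutoff, tpr, fpr = youden_cutoff(scores, labels)
print(cutoff, tpr, fpr)  # 0.6 1.0 0.2
```

With this cut-off every defaulter in the toy data is caught (a miss rate of 1 - TPR = 0) at the cost of rejecting one good applicant in five. A bank that weighs the two error types differently could replace the `key` function with a cost-weighted criterion.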

Cyclic Spectral Features Extracting of Complex Modulation Signal Based on ACP Method
Research article
Using the averaged cyclic periodogram (ACP) method of cyclic spectral density estimation, the cyclic spectral features of complex modulated signals are studied, their correspondence with the signal parameters is investigated, and feature extraction methods that require no prior knowledge are developed. First, the expression for complex modulated signals is described and the relationships between the signal parameters are given. Second, the cyclic spectral features of the signals are analyzed using ACP cyclic spectral density estimation, and the correspondence between the features and the signal parameters is obtained. On this basis, a parameter extraction method based on cyclic spectral features is proposed. The normalized RMS error (NRMSE) of parameter extraction for Frank-coded and Costas-coded signals is measured to verify the validity of the method.
Free

Data Cleaning In Data Warehouse: A Survey of Data Pre-processing Techniques and Tools
Research article
A Data Warehouse is a computer system designed for storing and analyzing an organization's historical data from the day-to-day operations of its Online Transaction Processing (OLTP) systems. Usually, an organization summarizes and copies information from its operational systems to the data warehouse on a regular schedule, and management performs complex queries and analysis on that information without slowing down the operational systems. Data need to be pre-processed to improve their quality before being stored in the data warehouse. This survey paper presents data cleaning problems and the approaches currently used for pre-processing. Its main goal is to determine which pre-processing technique works best in which scenario to improve the performance of the Data Warehouse. Many data cleansing techniques have been analyzed using certain evaluation attributes and tested on different kinds of data sets. Data quality tools such as YALE, ALTERYX, and WEKA have been used to obtain conclusive results, to prepare the data for the data warehouse, and to ensure that only cleaned data populates the warehouse, thus enhancing its usability. The results of the paper can be useful in many future activities such as cleansing, standardizing, correction, matching and transformation. This research can also help in data auditing and pattern detection in the data.
Free

Data Deduplication Methods: A Review
Research article
Cloud storage services are used to store intermediate and persistent data generated by various resources, including servers and IoT-based networks. As a result, data is duplicated and replicated rapidly, especially when a large number of cloud users work in a collaborative environment to solve large-scale problems in geo-distributed networks. The data becomes prone to privacy breaches and to a high incidence of duplication. As the dynamics of cloud services change over time, the ownership and proof-of-identity operations also need to change and work dynamically to maintain a high degree of security. In this work we study concepts, methods and schemes that, with the help of cryptographic mathematics, can make cloud services secure, reduce the incidence of data duplication, and increase potential storage capacity. The proposed scheme deduplicates data using arithmetic key validity operations that reduce the overhead and increase the complexity of the keys so that they are hard to break.
Free

Data Driven Fuzzy Modeling for Sugeno and Mamdani Type Fuzzy Model using Memetic Algorithm
Research article
Fuzzy modeling, or fuzzy model identification, is an arduous task. This paper presents the application of Memetic algorithms (MAs) to the identification of a complete fuzzy model, including membership function design for the input and output variables and rule base generation from a numerical data set. We have applied the algorithms to four benchmark data sets: a rapid Ni-Cd battery charger, the Box and Jenkins gas-furnace data, the Iris data classification problem and the wine data classification problem. Comparing the results obtained from MAs with those from Genetic algorithms (GAs) brings out the remarkable efficiency of MAs. The results suggest that, for these problems, the proposed approach is better than those suggested in the literature.
Free

Data Mining Methods for Detecting the Most Significant Factors Affecting Students’ Performance
Research article
The use of Data Mining (DM) techniques in educational environments is typically known as Educational Data Mining (EDM). EDM is rapidly becoming an important field of research because of its ability to extract valuable knowledge from various educational datasets. During the past decade, many practical studies have shown increasing interest in analyzing educational data, especially students' performance. The performance of students plays a vital role in higher education institutions, so there is a clear need to investigate the factors that influence it. This study was carried out to identify the factors affecting students' academic performance. K-means and X-means clustering techniques were applied to the data to find the relationship between students' performance and these factors. The findings include a set of the personal and social factors that most influence students' performance, such as parents' occupation, parents' qualification, and income rate. Furthermore, the study contributes to improving the quality of education and motivates educational institutions to discover and benefit from the unseen patterns of knowledge in their students' accumulated data.
Free

Data Mining for Cyberbullying and Harassment Detection in Arabic Texts
Research article
Cyberbullying is broadly viewed as a severe social danger that affects many individuals around the globe, particularly young people and teenagers. The Arab world has embraced technology and continues to use it in different ways to communicate on social media platforms. However, Arabic text poses difficulties because of its complexity, its challenges, and the scarcity of its resources. This paper investigates how to detect cyberbullying and harassment in Arabic text using information posted on Twitter. To answer this question, we collected an Arabic corpus covering topics with specific words, which we explain in detail. We devised experiments in which we investigated several learning approaches. Our results suggest that deep learning models such as LSTM perform better than other traditional cyberbullying classifiers, with an accuracy of 72%.
Free