Journal articles - International Journal of Information Technology and Computer Science
Total articles: 1195
Covering Based Pessimistic Multigranular Rough Equalities and their Properties
Research article
The basic rough set theory introduced by Pawlak as a model for capturing imprecision in data has been extended in many directions, and covering based rough set models are among these extensions. From the granular computing point of view, basic rough sets are unigranular by nature. Two extensions to the multigranular setting, called optimistic and pessimistic multigranulation, were introduced by Qian et al. in 2006 and 2010 respectively. Combining the two concepts of covering and multigranulation, covering based multigranular models were introduced by Liu et al. in 2012. Relaxing the stringent notion of mathematical equality of sets, rough equalities were introduced by Novotny and Pawlak in 1985, and three more types of such approximate equalities were introduced by Tripathy in 2011. In this paper we study the approximate equalities introduced by Novotny and Pawlak from the pessimistic multigranular computing point of view and establish several of their properties. These concepts and properties are shown to be useful in approximate reasoning.
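Below is a minimal Python sketch of the pessimistic multigranular approximations and the resulting bottom/top rough equalities referred to in this abstract. It assumes granulations given as partitions (equivalence classes) rather than the general coverings treated in the paper, and the data and helper names are illustrative, not the authors' formulation.
```python
# Sketch: pessimistic multigranular approximations over two partitions.
# Granulations are given as partitions (lists of blocks); the covering-based
# generalisation studied in the paper is not reproduced here.

def block_of(x, partition):
    """Return the block of the partition containing x."""
    return next(b for b in partition if x in b)

def pessimistic_lower(X, universe, partitions):
    """x is in the lower approximation iff its block is inside X
    under *every* granulation (pessimistic multigranulation)."""
    return {x for x in universe
            if all(block_of(x, P) <= X for P in partitions)}

def pessimistic_upper(X, universe, partitions):
    """Dual of the lower approximation: complement of the lower
    approximation of the complement of X."""
    return universe - pessimistic_lower(universe - X, universe, partitions)

U = set(range(8))
P1 = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]          # first granulation
P2 = [{0, 2}, {1, 3}, {4, 6}, {5, 7}]          # second granulation
X, Y = {0, 1, 2, 3}, {0, 1, 2, 3, 4}

# Bottom rough equality: equal lower approximations.
bottom_equal = pessimistic_lower(X, U, [P1, P2]) == pessimistic_lower(Y, U, [P1, P2])
# Top rough equality: equal upper approximations.
top_equal = pessimistic_upper(X, U, [P1, P2]) == pessimistic_upper(Y, U, [P1, P2])
print(bottom_equal, top_equal)
```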
Free
Research article
Over the last few years, the amount of video data has increased significantly, so the need for video summarization has reached a new level. Video summarization condenses a long video into a smaller number of frames while keeping the semantic content the same. In this paper, we propose an approach that takes all the frames of a video and detects shot boundaries using color moments and SURF (Speeded Up Robust Features). The redundancy among similar frames is then eliminated using color histograms. Finally, a summary slide is generated from the remaining frames, which are semantically representative of the total content of the original video. Our experimental results are based on a questionnaire-based user survey, which shows on average 78% positive and 3.5% negative responses. This result is quite satisfactory in comparison with existing techniques.
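The following OpenCV sketch illustrates the general pipeline described above using colour histograms only; the colour-moment and SURF matching steps of the paper (SURF is available via opencv-contrib) are omitted, and the thresholds are illustrative rather than the authors' values.
```python
# Sketch: histogram-based shot-boundary detection and redundant-frame removal.
# Thresholds are illustrative; the paper additionally uses colour moments and
# SURF matching, which are omitted here for brevity.
import cv2

def colour_hist(frame, bins=16):
    hist = cv2.calcHist([frame], [0, 1, 2], None, [bins] * 3, [0, 256] * 3)
    return cv2.normalize(hist, hist).flatten()

def summarize(video_path, shot_thresh=0.5, dup_thresh=0.9):
    cap = cv2.VideoCapture(video_path)
    keyframes, prev_hist = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hist = colour_hist(frame)
        # A large drop in histogram correlation suggests a shot boundary.
        if prev_hist is None or cv2.compareHist(
                prev_hist, hist, cv2.HISTCMP_CORREL) < shot_thresh:
            # Keep the frame only if it is not too similar to kept frames.
            if all(cv2.compareHist(colour_hist(k), hist,
                                   cv2.HISTCMP_CORREL) < dup_thresh
                   for k in keyframes):
                keyframes.append(frame)
        prev_hist = hist
    cap.release()
    return keyframes  # frames for the summary slide
```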
Free
Credible Mechanism for More Reliable Search Engine Results
Research article
The number of websites on the Internet is growing rapidly, thanks to the HTML language. Consequently, a diversity of information is available on the Web; however, its content may sometimes be neither valuable nor trustworthy, which raises the problem of the credibility of the information published on these websites. This paper investigates the aspects affecting website credibility and then uses them, along with the dominant meaning of the query, to improve information retrieval capabilities and to manage content effectively. It presents the design and development of a credibility mechanism that queries a Web search engine and then ranks sites according to their reliability. Our experiments show that credibility terms on websites can affect the ranking of the Web search engine and greatly improve retrieval effectiveness.
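As a rough illustration of re-ranking results by credibility, the sketch below combines hypothetical credibility features with a search engine's own rank; the features and weights are placeholders, not the mechanism developed in the paper.
```python
# Sketch: re-rank search-engine results by a simple credibility score.
# The feature names and weights are hypothetical placeholders for the
# credibility aspects discussed in the paper.

WEIGHTS = {"has_author": 0.3, "has_references": 0.3,
           "recently_updated": 0.2, "query_term_coverage": 0.2}

def credibility_score(features):
    """features is a dict of feature name -> value in [0, 1]."""
    return sum(WEIGHTS[f] * features.get(f, 0.0) for f in WEIGHTS)

def rerank(results):
    """Order by credibility first, then by the engine's original rank."""
    return sorted(results,
                  key=lambda r: (credibility_score(r["features"]),
                                 -r["engine_rank"]),
                  reverse=True)

results = [
    {"url": "a.example", "engine_rank": 1,
     "features": {"has_author": 0, "query_term_coverage": 1.0}},
    {"url": "b.example", "engine_rank": 2,
     "features": {"has_author": 1, "has_references": 1,
                  "recently_updated": 1, "query_term_coverage": 0.8}},
]
print([r["url"] for r in rerank(results)])
```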
Free
Credit Card Fraud Detection System Using Machine Learning
Research article
The security of any system is a key factor in its acceptance by the general public. We propose an intuitive approach to fraud detection in financial institutions using machine learning by designing a Hybrid Credit Card Fraud Detection (HCCFD) system, which applies anomaly detection using a genetic algorithm and the multivariate normal distribution to identify fraudulent credit card transactions. An imbalanced dataset of credit card transactions, with a target variable indicating whether a transaction is fraudulent, was used to train the HCCFD. Using the F-score as the performance metric, the model was tested and gave a prediction accuracy of 93.5%, against 84.2%, 80.0% and 68.5% for an artificial neural network, a decision tree and a support vector machine respectively, when trained on the same dataset. The results show a significant improvement over these widely used algorithms.
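A minimal sketch of the anomaly-detection idea follows: a multivariate Gaussian is fitted to legitimate transactions and the decision threshold is chosen by F-score on a validation set. A simple grid search over the threshold stands in for the genetic algorithm used in the paper, and the synthetic data is only a placeholder.
```python
# Sketch: Gaussian anomaly detection for fraud, with the decision threshold
# chosen by F-score on a validation set. The paper additionally tunes with a
# genetic algorithm; a simple grid search over epsilon stands in for it here.
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.metrics import f1_score

def fit_gaussian(X_legit):
    mu = X_legit.mean(axis=0)
    cov = np.cov(X_legit, rowvar=False)
    return multivariate_normal(mean=mu, cov=cov, allow_singular=True)

def choose_epsilon(model, X_val, y_val):
    """Pick the density threshold that maximises F1 on validation data
    (y_val = 1 for fraudulent transactions)."""
    p = model.pdf(X_val)
    best_eps, best_f1 = None, -1.0
    for eps in np.linspace(p.min(), p.max(), 1000):
        f1 = f1_score(y_val, (p < eps).astype(int), zero_division=0)
        if f1 > best_f1:
            best_eps, best_f1 = eps, f1
    return best_eps, best_f1

# Usage with synthetic stand-in data (5 injected "frauds" in the validation set).
rng = np.random.default_rng(0)
X_legit = rng.normal(0, 1, size=(500, 3))
X_val = np.vstack([rng.normal(0, 1, size=(95, 3)),
                   rng.normal(4, 1, size=(5, 3))])
y_val = np.array([0] * 95 + [1] * 5)
model = fit_gaussian(X_legit)
eps, f1 = choose_epsilon(model, X_val, y_val)
print(f"epsilon={eps:.3g}, F1={f1:.2f}")
```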
Free
Cuckoo Search Algorithm for Stellar Population Analysis of Galaxies
Research article
The cuckoo search (CS) algorithm is a simple and effective global optimization algorithm that has been applied to a wide range of real-world optimization problems. In this paper, an improved Cuckoo Search algorithm (ICS) is presented for determining the age and relative contribution of different stellar populations in galaxies. The results indicate that the proposed method performs better than, or at least comparably to, state-of-the-art methods from the literature in terms of the quality of the solutions obtained. The proposed algorithm is applied to the integrated color of the galaxy NGC 3384, and simulation results further demonstrate that the method is very effective. The study shows that cuckoo search can be applied successfully to a wide range of stellar population and space optimization problems.
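For reference, here is a minimal sketch of the standard cuckoo search skeleton with Levy flights; it is not the improved ICS of the paper, and a toy sphere function stands in for the stellar-population fitting objective.
```python
# Sketch: a basic cuckoo search with Levy flights on a toy objective.
# This is the standard CS skeleton, not the improved ICS of the paper, and a
# simple sphere function stands in for the stellar-population fitting problem.
import math
import numpy as np

def levy_step(dim, rng, beta=1.5):
    """Draw a Levy-flight step using Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2) /
             (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0, sigma, dim)
    v = rng.normal(0, 1, dim)
    return u / np.abs(v) ** (1 / beta)

def cuckoo_search(f, dim=5, n_nests=15, pa=0.25, iters=300, seed=0):
    rng = np.random.default_rng(seed)
    nests = rng.uniform(-5, 5, (n_nests, dim))
    fitness = np.apply_along_axis(f, 1, nests)
    for _ in range(iters):
        best = nests[fitness.argmin()]
        # New solutions by a Levy flight biased towards the current best nest.
        for i in range(n_nests):
            trial = nests[i] + 0.01 * levy_step(dim, rng) * (nests[i] - best)
            ft = f(trial)
            if ft < fitness[i]:
                nests[i], fitness[i] = trial, ft
        # A fraction pa of the worst nests is abandoned and rebuilt at random.
        abandon = rng.random(n_nests) < pa
        if abandon.any():
            nests[abandon] = rng.uniform(-5, 5, (abandon.sum(), dim))
            fitness[abandon] = np.apply_along_axis(f, 1, nests[abandon])
    return nests[fitness.argmin()], fitness.min()

best_x, best_f = cuckoo_search(lambda x: float(np.sum(x ** 2)))
print(best_f)
```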
Free
Current State and Future Trends in Location Recommender Systems
Research article
Technological developments in mobile devices have enabled the use of geographical data in social networks. Accordingly, location-based social networks have become very attractive, and their popularity has prompted researchers to study recommendation systems for location-based services. Many studies develop location recommendation systems using various variables and algorithms; however, articles detailing past and present studies and making suggestions for the future are limited. Therefore, this study aims to thoroughly review the research performed on location recommender systems. For this purpose, the topic pairs "location and recommender system" and "location and recommendation system" were searched in the Web of Knowledge database. The resulting articles were examined in detail with respect to the data sources and variables, algorithms, and evaluation techniques used. Thus, the current state of location recommender systems research is summarized and recommendations are provided for researchers and developers. It is expected that the issues presented in this paper will advance the discussion of next-generation location recommendation systems.
Free
Customer Credit Risk Assessment using Artificial Neural Networks
Research article
Since the granting of banking facilities has in recent years faced problems such as customer credit risk, which directly affects profitability, customer credit risk assessment has become imperative for banks; it is used to distinguish good applicants from those who will probably default on repayments. In credit risk assessment, a score is assigned to each customer and compared with a cut-off score that separates the two classes of applicants, so that each customer is classified as either a good or a bad applicant. Owing to their good performance and their ability in classification, generalization and pattern learning, Multi-Layer Perceptron neural networks trained with various Back-Propagation (BP) algorithms were considered for designing the evaluation model in this study. The BP algorithms Levenberg-Marquardt (LM), Gradient Descent, Conjugate Gradient, Resilient, BFGS Quasi-Newton, and One-Step Secant were utilized. Each of these six networks was run and trained with different numbers of neurons in the hidden layer, and the mean squared error (MSE) was used as the criterion to select the optimum number of hidden neurons. The results show that the LM algorithm converges faster and achieves better performance than the other algorithms. Finally, comparing the classification performance of the neural network with classification algorithms such as Logistic Regression and Decision Tree shows that the neural network model outperforms the others in customer credit risk assessment. Because the cost that the Type II error rate imposes on a credit model is very high, the Receiver Operating Characteristic curve is used to find an appropriate cut-off point that, in addition to high accuracy, yields a lower Type II error rate.
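The sketch below shows the ROC-based cut-off selection step on a synthetic credit dataset. scikit-learn does not provide Levenberg-Marquardt training, so the default solver stands in for the BP variants compared in the paper, and the data and network size are placeholders.
```python
# Sketch: ROC-based cut-off selection for an MLP credit scorer.
# scikit-learn does not expose Levenberg-Marquardt training, so the default
# 'adam' solver stands in for the BP variants compared in the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_curve

X, y = make_classification(n_samples=2000, n_features=10, weights=[0.8, 0.2],
                           random_state=0)   # y = 1: bad (defaulting) applicant
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(12,), max_iter=1000, random_state=0)
clf.fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

# Choose a cut-off that keeps the Type II error (bad applicants accepted as
# good, i.e. false negatives) low while preserving overall accuracy.
fpr, tpr, thresholds = roc_curve(y_te, scores)
type2 = 1 - tpr                      # fraction of bad applicants accepted
best = np.argmax(tpr - fpr)          # Youden's J as a simple compromise
print(f"cut-off={thresholds[best]:.3f}, Type II error={type2[best]:.3f}")
```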
Free
Cyclic Spectral Features Extracting of Complex Modulation Signal Based on ACP Method
Research article
Based on the averaged cyclic periodogram (ACP) method for estimating cyclic spectral density, the cyclic spectral features of complex modulated signals are studied, their correspondence with signal parameters is investigated, and feature extraction methods requiring no prior knowledge are developed. Firstly, the expression of complex modulated signals is described and the relationship between signal parameters is given. Secondly, the cyclic spectral features of the signals are analyzed using the ACP cyclic spectral density estimator, and their correspondence with signal parameters is obtained. On this basis, a method for parameter extraction based on cyclic spectral features is proposed. The normalized RMS error (NRMSE) of parameter extraction for Frank-coded and Costas-coded signals is measured to verify the validity of the method.
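For orientation, the sketch below is a simplified, asymmetric averaged cyclic periodogram estimate working in frequency bins; the windowing, overlap and normalisation details of the ACP method in the paper are omitted, and the BPSK-like test signal is illustrative only.
```python
# Sketch: a simplified averaged cyclic periodogram (ACP-style) estimate of
# the cyclic spectral density. Windowing, overlap and normalisation details
# of the method in the paper are omitted; frequency bins are used, not Hz.
import numpy as np

def averaged_cyclic_periodogram(x, seg_len=256):
    """Return S[a, f]: average over segments of X_k[f + a] * conj(X_k[f]),
    an (asymmetric) estimate of the spectral correlation at cyclic bin a."""
    n_seg = len(x) // seg_len
    segs = x[:n_seg * seg_len].reshape(n_seg, seg_len)
    X = np.fft.fft(segs, axis=1)
    S = np.zeros((seg_len, seg_len), dtype=complex)
    for a in range(seg_len):
        S[a] = np.mean(np.roll(X, -a, axis=1) * np.conj(X), axis=0) / seg_len
    return S

# Example: a BPSK-like signal shows cyclic features at multiples of the
# symbol rate (here 1/16 in normalised frequency, i.e. every 16th cyclic bin).
rng = np.random.default_rng(0)
symbols = rng.choice([-1.0, 1.0], size=256)
signal = np.repeat(symbols, 16) + 0.1 * rng.standard_normal(256 * 16)
S = averaged_cyclic_periodogram(signal, seg_len=256)
cyclic_profile = np.abs(S).sum(axis=1)       # energy per cyclic-frequency bin
print(cyclic_profile.argsort()[-5:])         # strongest cyclic bins
```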
Free
Data Cleaning In Data Warehouse: A Survey of Data Pre-processing Techniques and Tools
Research article
A data warehouse is a computer system designed for storing and analyzing an organization's historical data from day-to-day operations in Online Transaction Processing (OLTP) systems. Usually, an organization summarizes and copies information from its operational systems to the data warehouse on a regular schedule, and management performs complex queries and analysis on this information without slowing down the operational systems. Data need to be pre-processed to improve their quality before being stored in the data warehouse. This survey paper presents data cleaning problems and the approaches currently in use for pre-processing. The main goal of this paper is to determine which pre-processing technique is best suited to which scenario for improving the performance of a data warehouse. Many data cleansing techniques have been analyzed using certain evaluation attributes and tested on different kinds of data sets. Data quality tools such as YALE, ALTERYX, and WEKA have been used to obtain conclusive results, to prepare the data for the warehouse, and to ensure that only cleaned data populates it, thus enhancing the usability of the warehouse. The results of this paper can be useful in many future activities such as cleansing, standardizing, correction, matching and transformation, and can help in data auditing and pattern detection.
Free
Data Deduplication Methods: A Review
Research article
Cloud storage services are used to store intermediate and persistent data generated by various resources, including servers and IoT-based networks. The outcome of such developments is that data gets duplicated and replicated rapidly, especially when large numbers of cloud users work in a collaborative environment to solve large-scale problems over geo-distributed networks, and the data becomes prone to breaches of privacy and a high incidence of duplication. When the dynamics of cloud services change over time, the ownership and proof-of-identity operations also need to change and work dynamically to maintain a high degree of security. In this work we study concepts, methods and schemes that can make cloud services secure and reduce the incidence of data duplication with the help of cryptographic techniques, thereby increasing the potential storage capacity. The proposed scheme performs deduplication of data with arithmetic key-validity operations that reduce the overhead and increase the complexity of the keys, so that they are hard to break.
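As background, the sketch below shows convergent encryption, a common building block for secure deduplication in which the key is derived from the content itself; it is not the arithmetic key-validity scheme proposed in the paper, and it requires the third-party `cryptography` package.
```python
# Sketch: convergent encryption for deduplication. The key is derived from
# the content itself, so identical blocks encrypt to identical ciphertexts
# and can be stored once. This illustrates a common building block, not the
# key-validity scheme proposed in the paper.
import hashlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

store = {}   # fingerprint -> ciphertext (the deduplicated object store)

def put(block: bytes):
    key = hashlib.sha256(block).digest()           # convergent key
    fingerprint = hashlib.sha256(key).hexdigest()  # id safe to reveal
    if fingerprint not in store:                   # deduplication check
        nonce = hashlib.sha256(b"nonce" + key).digest()[:12]  # deterministic
        store[fingerprint] = AESGCM(key).encrypt(nonce, block, None)
    return fingerprint, key                        # client keeps the key

def get(fingerprint, key):
    nonce = hashlib.sha256(b"nonce" + key).digest()[:12]
    return AESGCM(key).decrypt(nonce, store[fingerprint], None)

fp1, k1 = put(b"same data")
fp2, k2 = put(b"same data")      # stored only once
assert fp1 == fp2 and len(store) == 1
print(get(fp1, k1))
```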
Free
Data Driven Fuzzy Modeling for Sugeno and Mamdani Type Fuzzy Model using Memetic Algorithm
Research article
The process of fuzzy modeling, or fuzzy model identification, is an arduous task. This paper presents the application of memetic algorithms (MAs) to the identification of a complete fuzzy model, including membership function design for input and output variables and rule-base generation from a numerical data set. We applied the algorithms to four benchmark data sets: a rapid Ni-Cd battery charger, the Box and Jenkins gas-furnace data, the Iris data classification problem and the wine data classification problem. The comparison of the results obtained with MAs against genetic algorithms (GAs) brings out the remarkable efficiency of MAs and suggests that for these problems the proposed approach is better than those reported in the literature.
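A generic memetic-algorithm skeleton (a genetic algorithm whose offspring are refined by local search) is sketched below; a toy quadratic objective stands in for the fuzzy-model fitness over membership-function parameters and rule base, so this is not the authors' encoding.
```python
# Sketch: a generic memetic algorithm skeleton, i.e. a genetic algorithm whose
# offspring are refined by a hill-climbing step. A toy quadratic stands in for
# the fuzzy-model fitness (membership-function parameters plus rule base).
import random

def local_search(x, f, step=0.05, iters=20):
    """Simple hill climbing used as the 'meme' (local refinement)."""
    for _ in range(iters):
        cand = [xi + random.uniform(-step, step) for xi in x]
        if f(cand) < f(x):
            x = cand
    return x

def memetic(f, dim=4, pop_size=20, gens=50):
    pop = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=f)
        parents = pop[:pop_size // 2]                      # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, dim)                 # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < 0.2:                      # mutation
                i = random.randrange(dim)
                child[i] += random.gauss(0, 0.5)
            children.append(local_search(child, f))        # memetic step
        pop = parents + children
    return min(pop, key=f)

best = memetic(lambda x: sum(xi ** 2 for xi in x))
print(best)
```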
Free
Data Mining Methods for Detecting the Most Significant Factors Affecting Students’ Performance
Research article
The field of applying Data Mining (DM) techniques in educational environments is typically identified as Educational Data Mining (EDM). EDM is rapidly becoming an important field of research due to its ability to extract valuable knowledge from various educational datasets. During the past decade, an increasing interest has arisen within many practical studies in analyzing educational data, especially students' performance. Since students' performance plays a vital role in higher education institutions, there is a clear need to investigate the factors influencing it. This study was carried out to identify the factors affecting students' academic performance. K-means and X-means clustering techniques were applied to the data to find the relationship between students' performance and these factors. The findings include a set of the personal and social factors that most influence students' performance, such as parents' occupation, parents' qualifications, and income level. Furthermore, the study contributes to improving education quality and motivates educational institutions to discover previously unseen patterns of knowledge in their accumulated student data.
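The sketch below clusters encoded student records with K-means and picks the number of clusters via the silhouette score (a stand-in for X-means, which scikit-learn does not provide); the feature names and values are hypothetical.
```python
# Sketch: clustering encoded student records with K-means; the number of
# clusters is chosen by silhouette score as a stand-in for X-means, which
# scikit-learn does not provide. Feature names and values are hypothetical.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import silhouette_score

students = pd.DataFrame({
    "parent_qualification": [1, 3, 2, 4, 1, 3, 2, 4, 1, 2],  # ordinal encoding
    "parent_occupation":    [0, 2, 1, 2, 0, 1, 1, 2, 0, 1],
    "income_rate":          [1, 4, 2, 5, 1, 3, 2, 5, 1, 2],
    "gpa":                  [2.1, 3.6, 2.8, 3.9, 2.0, 3.2, 2.7, 3.8, 2.2, 2.6],
})
X = StandardScaler().fit_transform(students)

best_k, best_score, best_labels = None, -1.0, None
for k in range(2, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)
    if score > best_score:
        best_k, best_score, best_labels = k, score, labels

students["cluster"] = best_labels
# Inspect which factors separate the clusters (e.g. mean income per cluster).
print(best_k, students.groupby("cluster").mean(), sep="\n")
```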
Free
Data Mining for Cyberbullying and Harassment Detection in Arabic Texts
Research article
Cyberbullying is broadly viewed as a severe social danger that affects many individuals around the globe, particularly young people and teenagers. The Arab world has embraced technology and continues to use it in different ways to communicate on social media platforms. However, Arabic text is difficult to process because of its complexity and the scarcity of its resources. This paper investigates how to detect cyberbullying and harassment in Arabic text through information posted on Twitter. To this end, we collected an Arabic corpus covering the relevant topics using specific keywords, which are explained in detail. We devised experiments in which we investigated several learning approaches. Our results suggest that deep learning models such as LSTM achieve better performance than traditional cyberbullying classifiers, with an accuracy of 72%.
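Below is a minimal Keras sketch of a binary LSTM text classifier of the kind the paper compares against traditional classifiers; the two example tweets, vocabulary size and all hyper-parameters are placeholders rather than the authors' settings.
```python
# Sketch: a minimal binary LSTM text classifier of the kind compared in the
# paper. The example tweets, vocabulary size and hyper-parameters are
# placeholders, not the authors' settings.
import tensorflow as tf
from tensorflow.keras import layers

texts = tf.constant(["مثال نص مسيء", "مثال نص عادي"])   # placeholder tweets
labels = tf.constant([1.0, 0.0])                        # 1 = bullying/harassment

vectorizer = layers.TextVectorization(max_tokens=20000, output_sequence_length=50)
vectorizer.adapt(texts)
X = vectorizer(texts)                                   # token-id sequences

model = tf.keras.Sequential([
    layers.Embedding(input_dim=20000, output_dim=64),   # word embeddings
    layers.LSTM(64),                                    # sequence encoder
    layers.Dense(1, activation="sigmoid"),              # bullying probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, labels, epochs=2, verbose=0)
print(model.predict(vectorizer(tf.constant(["مثال نص مسيء"])), verbose=0))
```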
Free
Data Mining in Intrusion Detection: A Comparative Study of Methods, Types and Data Sets
Research article
In the era of information and communication technology, security is an important issue, and a lot of effort and funding is being invested in this sector. Intrusion detection is one of the most prominent fields in this area, and data mining can automate network intrusion detection with greater efficiency. This paper presents a literature survey on intrusion detection systems covering research papers published from 2000 to 2012. Almost 67% of the surveyed papers focus on anomaly detection, 23% on both anomaly and misuse detection, and 10% on misuse detection only. The survey statistics also show that 42% of the researchers used the KDD Cup dataset, 20% the DARPA dataset, and 38% other datasets to test the effectiveness of their proposed methods for misuse detection, anomaly detection, or both.
Free
Database Performance Optimization–A Rough Set Approach
Research article
As the sizes of databases grow exponentially, the optimal design and management of both traditional database management systems and data mining processing techniques are of significant importance, and several approaches are being investigated in this direction. In this paper a novel approach to maintaining metadata based on rough sets is proposed, and it is observed that, with marginal changes in buffer sizes, faster query processing can be achieved.
Free
Database Semantic Interoperability based on Information Flow Theory and Formal Concept Analysis
Research article
As databases become widely used, there is a growing need to translate information between multiple databases. Semantic interoperability and integration have been a long-standing challenge for the database community and have now become a prominent area of database research. In this paper, we aim to answer the question of how semantic interoperability between two databases can be achieved using Formal Concept Analysis (FCA) and Information Flow (IF) theories. For this purpose, we first discover knowledge from different databases using FCA, and then align what is discovered using IF and FCA. The development of FCA has led to software systems such as TOSCANA and TUPLEWARE, which can be used as tools for discovering knowledge in databases. A prototype based on IF and FCA has been developed, and our method is tested and verified using this prototype and TUPLEWARE.
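For readers unfamiliar with FCA, the sketch below enumerates the formal concepts of a tiny object-attribute context using the two derivation operators; the IF-based alignment step described in the paper is not shown, and the example context is invented.
```python
# Sketch: deriving the formal concepts of a small context with the two
# derivation operators of FCA. The Information Flow based alignment step
# described in the paper is not shown; the example context is invented.
from itertools import combinations

objects = {"t1": {"name", "email"},
           "t2": {"name", "phone"},
           "t3": {"name", "email", "phone"}}
attributes = set().union(*objects.values())

def common_attributes(objs):                 # A': objects -> shared attributes
    return set.intersection(*(objects[o] for o in objs)) if objs else set(attributes)

def objects_having(attrs):                   # B': attributes -> objects having all
    return {o for o, a in objects.items() if attrs <= a}

def concepts():
    """A pair (A, B) is a formal concept iff A' = B and B' = A,
    i.e. B is a closed attribute set."""
    found = []
    for r in range(len(attributes) + 1):
        for attrs in map(set, combinations(attributes, r)):
            extent = objects_having(attrs)
            intent = common_attributes(extent)
            if intent == attrs:
                found.append((extent, intent))
    return found

for extent, intent in concepts():
    print(sorted(extent), sorted(intent))
```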
Free
Databases in Cloud Computing: A Literature Review
Research article
The Information Technology industry has been using traditional relational databases for about 40 years. In recent years, however, there has been a substantial shift in the IT industry in terms of commercial applications: stand-alone applications have been replaced with electronic applications, dedicated servers with shared servers, and dedicated storage with networked storage. Lower cost, flexibility and the pay-as-you-go model are the main reasons cloud computing has become a reality, and this is one of the most significant revolutions in Information Technology after the emergence of the Internet. Cloud databases such as BigTable, Sherpa and SimpleDB are becoming increasingly familiar to the community. They have highlighted the obstacles of current relational databases in terms of usability, flexibility and provisioning. Cloud databases are essentially employed for data-intensive applications, such as the storage and mining of huge or commercial data, and these applications are flexible and multipurpose in nature. Numerous transactional data management applications, such as banking, online reservation, e-trade and inventory management, have been produced. Databases supporting these types of applications have to provide four important properties: Atomicity, Consistency, Isolation, and Durability (ACID), although employing such databases in the cloud is not simple. The goal of this paper is to identify the advantages and disadvantages of databases widely employed in cloud systems and to review the challenges in developing cloud databases.
Free
Research article
Decentralized self-adaptive systems consist of multiple control loops that adapt some local and system-level global goals of each locally managed system or component in a decentralized setting. As the components work together in a decentralized environment, a control loop cannot take adaptation decisions independently; all the control loops need to exchange their adaptation decisions to infer global knowledge about the system. Decentralized self-adaptation approaches in the literature use this global knowledge to take decisions that optimize both local and global goals. However, coordinating in such an unbounded manner impairs scalability. This paper proposes a decentralized self-adaptation technique using reinforcement learning that incorporates partial knowledge in order to reduce coordination overhead. A Q-learning algorithm based on Interaction-Driven Markov Games is utilized to take adaptation decisions, as it enables coordination only when it is beneficial. Rather than coordinating with an unbounded number of peers, each adaptation control loop coordinates with a single peer control loop. The proposed approach was evaluated on a service-based Tele Assistance System and compared to random, independent and multi-agent learners that assume global knowledge. In all cases, the proposed approach conformed to both local and global goals while maintaining comparatively lower coordination overhead.
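A tabular Q-learning skeleton for a single adaptation control loop is sketched below; the Interaction-Driven-Markov-Game coordination of the paper is reduced to an explicit "interact with peer" action, and the states, actions and rewards are toy values, not the Tele Assistance System setup.
```python
# Sketch: a tabular Q-learning skeleton for one adaptation control loop.
# The Interaction-Driven-Markov-Game coordination of the paper is reduced to
# an "interact_with_peer" action that would trigger exchanging the decision
# with a single peer loop; states, actions and rewards are toy values.
import random
from collections import defaultdict

ACTIONS = ["keep_service", "switch_service", "interact_with_peer"]
Q = defaultdict(float)                      # Q[(state, action)]
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def choose_action(state):
    if random.random() < epsilon:           # exploration
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

def step(state, action):
    # Toy environment: reward reflects how well local + global goals are met.
    reward = {"keep_service": 0.5, "switch_service": 0.7,
              "interact_with_peer": 0.6}[action] + random.uniform(-0.1, 0.1)
    return reward, "normal_load"            # next state

state = "normal_load"
for _ in range(1000):
    action = choose_action(state)
    reward, next_state = step(state, action)
    update(state, action, reward, next_state)
    state = next_state
print(max(ACTIONS, key=lambda a: Q[(state, a)]))
```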
Free
Decoding Optimization Algorithms for Convolutional Neural Networks in Time Series Regression Tasks
Research article
Optimization algorithms play a vital role in training deep learning models effectively. This research paper presents a comprehensive comparative analysis of various optimization algorithms for Convolutional Neural Networks (CNNs) in the context of time series regression. The study focuses on the specific application of maximum temperature prediction, utilizing a dataset of historical temperature records. The primary objective is to investigate the performance of different optimizers and evaluate their impact on the accuracy and convergence properties of the CNN model. Experiments were conducted using different optimizers, including Stochastic Gradient Descent (SGD), RMSprop, Adagrad, Adadelta, Adam, and Adamax, while keeping other factors constant. Their performance was evaluated and compared based on metrics such as mean squared error (MSE), mean absolute error (MAE), root mean squared error (RMSE), R-squared (R²), mean absolute percentage error (MAPE), and explained variance score (EVS) to measure the predictive accuracy and generalization capability of the models. Additionally, learning curves are analyzed to observe the convergence behavior of each optimizer. The experimental results, indicating significant variations in convergence speed, accuracy, and robustness among the optimizers, underscore the research value of this work. By comprehensively evaluating and comparing various optimization algorithms, we aimed to provide valuable insights into their performance characteristics in the context of time series regression using CNN models. This work contributes to the understanding of optimizer selection and its impact on model performance, assisting researchers and practitioners in choosing the most suitable optimization algorithm for time series regression tasks.
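The comparison loop described above can be sketched as follows on a synthetic temperature-like series; the 1-D CNN architecture, window length and data are placeholders for the setup used in the paper.
```python
# Sketch: comparing Keras optimizers for a 1-D CNN on a synthetic temperature-
# like series. The architecture, window length and data are placeholders for
# the setup described in the paper.
import numpy as np
import tensorflow as tf
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Synthetic daily-maximum-temperature-like series and sliding windows.
t = np.arange(2000)
series = 25 + 10 * np.sin(2 * np.pi * t / 365) \
         + np.random.default_rng(0).normal(0, 1, t.size)
window = 30
X = np.stack([series[i:i + window] for i in range(len(series) - window)])[..., None]
y = series[window:]
split = int(0.8 * len(X))

def build(optimizer):
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(window, 1)),
        tf.keras.layers.Conv1D(32, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=optimizer, loss="mse")
    return model

for name in ["sgd", "rmsprop", "adagrad", "adadelta", "adam", "adamax"]:
    model = build(name)
    model.fit(X[:split], y[:split], epochs=5, batch_size=32, verbose=0)
    pred = model.predict(X[split:], verbose=0).ravel()
    print(name,
          "RMSE=%.2f" % np.sqrt(mean_squared_error(y[split:], pred)),
          "MAE=%.2f" % mean_absolute_error(y[split:], pred),
          "R2=%.2f" % r2_score(y[split:], pred))
```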
Free
Defending against Malicious Threats in Wireless Sensor Network: A Mathematical Model
Research article
Wireless sensor networks offer a powerful combination of distributed sensing, computing and communication. They lend themselves to countless applications but are constrained by limited battery life, processing capability, memory and bandwidth, which makes them a soft target for malicious objects such as viruses and worms. We study the potential threat of worm spread in wireless sensor networks using epidemic theory and propose a new model, Susceptible-Exposed-Infectious-Quarantined-Recovered with Vaccination (SEIQRS-V), to characterize the dynamics of worm spread in WSNs. The threshold, the equilibria and their stability are discussed. Numerical methods are employed to solve the system of equations and MATLAB is used to simulate the system. Quarantine isolates the most infected nodes from the network until they recover, and vaccination temporarily immunizes the network to reduce the spread of worms.
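An illustrative compartmental system of the SEIQRS-V type is sketched below with SciPy (rather than MATLAB); the transition structure follows the abstract, but the exact equations and rate values of the paper are not reproduced, so all rates here are placeholders.
```python
# Sketch: an illustrative SEIQRS-V compartmental system solved with SciPy.
# The transition structure follows the abstract (exposure, infection,
# quarantine, recovery, temporary vaccination and loss of immunity); the
# exact equations and rate values of the paper are not reproduced here.
import numpy as np
from scipy.integrate import solve_ivp

Lam, mu = 0.3, 0.003                   # node inclusion and failure rates
beta, sigma = 0.002, 0.25              # infection and incubation rates
gamma, delta, kappa = 0.2, 0.15, 0.1   # recovery, quarantine, release rates
phi, psi, eps = 0.05, 0.02, 0.03       # vaccination, waning, immunity-loss rates

def seiqrs_v(t, x):
    S, E, I, Q, R, V = x
    return [Lam - beta * S * I - (phi + mu) * S + eps * R + psi * V,
            beta * S * I - (sigma + mu) * E,
            sigma * E - (gamma + delta + mu) * I,
            delta * I - (kappa + mu) * Q,
            gamma * I + kappa * Q - (eps + mu) * R,
            phi * S - (psi + mu) * V]

sol = solve_ivp(seiqrs_v, (0, 200), [95, 0, 5, 0, 0, 0], dense_output=True)
S, E, I, Q, R, V = sol.y
print("peak infected nodes: %.1f at t=%.1f" % (I.max(), sol.t[I.argmax()]))
```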
Free