International Journal of Information Technology and Computer Science @ijitcs
Journal articles - International Journal of Information Technology and Computer Science
All articles: 1291
Optimized time efficient data cluster validity measures
Research article
The main task of any clustering algorithm is to produce compact and well-separated clusters. In practice, however, such ideal clusters cannot be achieved. Different types of clustering validation are used to evaluate the quality of the clusters that a clustering algorithm generates. These measures are key to the success of clustering. Different kinds of clustering require different validity measures. For example, unsupervised algorithms require different evaluation measures than supervised algorithms. Clustering validity measures fall into two categories: external and internal validation. The main difference is that external validity measures use information from outside the dataset, while internal validity measures use only information contained in the dataset. A well-known example of an external validation measure is entropy, which measures the purity of the clusters against the given class labels. Internal measures validate the quality of the clustering without using any external information. External measures require the exact number of clusters in advance. They are therefore used mainly to select the best clustering algorithm for a specific type of dataset. Internal validation measures are used not only to select the best clustering algorithm but also to select the optimal number of clusters. External validity measures depend on predefined class labels, which are often unavailable in practical applications. For these reasons, internal validation measures are the only option in applications where no external information is available. All clustering validity measures in current use are time-consuming and add extra computation time. No existing clustering validity measure can be used while the clustering process is still running.
This paper surveys existing and improved cluster validity measures. It then proposes time-efficient, optimized cluster validity measures based on cluster representatives and random sampling. The work proposes optimized measures for cluster compactness, separation and cluster validity. These three measures are simpler and more time-efficient than existing cluster validity measures, and they can be used to monitor clustering algorithms on large data while the clustering process is running.
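The abstract does not give the proposed formulas, so the following is only a minimal sketch of the general idea of combining cluster representatives with random sampling. It assumes that the centroid serves as the cluster representative, that compactness is the mean distance from sampled members to their representative, that separation is the smallest distance between representatives, and that validity is their ratio. All function names and parameters are hypothetical and are not taken from the paper.

```python
import math
import random


def centroid(points):
    """Mean point of a cluster, used here as its representative (an assumption)."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def sampled_compactness(clusters, sample_size, rng):
    """Average distance from randomly sampled members to their representative."""
    total, count = 0.0, 0
    for points in clusters:
        rep = centroid(points)
        # Sample at most `sample_size` members instead of scanning the whole cluster.
        sample = rng.sample(points, min(sample_size, len(points)))
        total += sum(math.dist(p, rep) for p in sample)
        count += len(sample)
    return total / count


def separation(clusters):
    """Smallest distance between any two cluster representatives."""
    reps = [centroid(points) for points in clusters]
    return min(
        math.dist(reps[i], reps[j])
        for i in range(len(reps))
        for j in range(i + 1, len(reps))
    )


def validity(clusters, sample_size=50, seed=0):
    """Ratio of separation to compactness; larger values suggest better clusters."""
    rng = random.Random(seed)
    return separation(clusters) / sampled_compactness(clusters, sample_size, rng)
```

Because only representatives and a bounded sample are touched, a score of this kind is cheap enough to recompute between iterations of a clustering algorithm. The ratio is undefined when every sampled point coincides with its representative, so a real implementation would need to guard against a zero compactness.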
Free
Optimizing PI Controller for SEPIC Converter with Optimization Algorithm
Research article
This paper evaluates the performance of PI controllers tuned with optimization techniques for the Single-Ended Primary-Inductance Converter (SEPIC). A SEPIC converter can maintain a constant output voltage over a range of DC input voltages. The PI controller is combined with the Artificial Bee Colony (ABC) algorithm, the Particle Swarm Optimization (PSO) algorithm and the Whale Optimization Algorithm (WOA). A performance analysis of the SEPIC DC-DC converter controller built with each of these optimization strategies is carried out. The statistics show that WOA provides the best stability with a fast response compared with the other optimization techniques.
Free
Optimizing QoS for Multimedia Services in Next Generation Network Based on ACO Algorithm
Research article
In the Next Generation Network (NGN), the backbone of the overall network architecture will be an IP network supporting different access network technologies and traffic types. NGN will provide advanced services, such as Quality of Service (QoS) guarantees, to users and their applications. Factors affecting QoS in NGN are speech encoders, delay, jitter, packet loss and echo. The negotiation and dynamic adaptation of QoS is currently considered one of the key features of the NGN concept. In this paper, I propose a novel Ant Colony Optimization algorithm to solve a model of optimal QoS for multimedia services in the NGN. Simulation results show that the new approach achieves near-optimal solutions. Comparison of the experimental results with recent research shows that the proposed algorithm performs better and can meet the demand for optimal QoS for multimedia services in NGN.
Free
Research article
Sinusitis is an inflammation of the paranasal sinus mucosa caused by a bacterial, fungal or viral infection. For early and accurate prediction of sinusitis from Computed Tomography (CT) images, this research introduces a novel Artificial Intelligence (AI) based technique. The pipeline begins with preprocessing using a Gabor filter to improve image quality. Next, segmentation with a Gaussian Mixture Model (GMM) isolates the sinus regions affected by inflammation. To obtain the crucial features from the segmented regions, Gray-Level Co-occurrence Matrix (GLCM) based feature extraction is used, which offers clinically meaningful features that improve transparency. A hybrid Harmony Search Algorithm (HSA)-Grey Wolf Optimizer (GWO) feature selection step then chooses the most relevant features. This hybrid method outperforms traditional selection techniques by effectively identifying the most discriminative and non-redundant features, enhancing classification accuracy while reducing computational complexity. A modified Artificial Neural Network (ANN) classifies sinusitis into severity levels. Unlike end-to-end deep learning models, this modular approach allows fine-grained control at each stage, ensuring that critical medical insights are not lost in abstraction. The structured pipeline allows each phase to be optimized individually, improving transparency, reliability and, ultimately, diagnostic performance. The method is evaluated in Python, and the developed classifier achieves an accuracy of 96.41%.
Free
PCA based Multimodal Biometrics using Ear and Face Modalities
Research article
Automatic person identification is an important task in computer vision and related applications. Multimodal biometrics involves two or more modalities. The proposed work implements person identification by fusing the face and ear biometric modalities. We use a PCA-based neural network classifier for feature extraction from the images. These features are fused and used for identification. The PCA method was found to perform better when the modalities were combined. Identification was performed using eigenfaces, eigenears and their features. The approach was tested on a database created by the authors.
Free
Research article
The Marksheet Generator is a flexible system for generating students' progress mark sheets. The system is based mainly on database technology and the credit-based grading system (CBGS). It is targeted at small enterprises, schools, colleges and universities. It can produce sophisticated mark sheets that are ready to print. The development of the mark sheet and gadget sheet focuses on describing tables with columns/rows and sub-columns/sub-rows, rules for selecting and summarizing data for a report, a particular table or a column/row, and formatting the report in the destination document. The data interface supports a popular data source (SQL Server) and report destination (PDF file). The project aims to develop a mark sheet generation system that universities can use to automate the distribution of digitally verifiable student result mark sheets. The system reads the students' exam information from the university database and generates the gadget sheet, which keeps track of student information in a properly listed manner. It also generates the mark sheets in Portable Document Format, which is tamper-proof and provides the authenticity of the document. The authenticity of the document can also be verified easily.
Free
Research article
A PID controller parameter tuning method based on the New Luus-Jaakola (NLJ) algorithm and the idea of satisfaction is proposed. According to the different requirements of each performance index, a satisfaction function with fuzzy constraint attributes is designed, and a comprehensive satisfaction function for PID tuning by the NLJ algorithm is then determined. The steps of PID controller parameter tuning based on the NLJ algorithm and satisfaction are given, and the tuning method is applied to the cascade control system for superheated steam temperature in a power station boiler. Finally, simulation and experimental results show that the proposed method achieves good dynamic and static control performance for this complicated superheated steam temperature control system.
Free
PINNs for Stochastic Dynamics: Modeling Brownian Motion via Verlet Integration
Research article
This study presents a Physics-Informed Neural Network (PINN) framework for modeling stochastic systems like Brownian motion, designed to overcome critical challenges in physical consistency and numerical stability that affect classical solvers and standard data-driven models. Traditional numerical methods often struggle with high-dimensional spaces or sparse data, while many machine learning approaches fail to enforce fundamental physical laws. To address this, our proposed PINN architecture integrates a multi-component loss function that explicitly enforces the Fokker-Planck equation, which describes the system’s governing physics, alongside boundary conditions and a global probability conservation law. This physics-informed approach is anchored by high-fidelity training data generated from Verlet-integrated trajectories of the underlying Langevin dynamics. We validate our model against the analytical solution for one-dimensional Brownian motion, demonstrating its ability to accurately recover the true probability density function (PDF). Rigorous comparisons using statistical metrics show superior accuracy over a canonical data-driven operator learning model, DeepONet. Specifically, our PINN achieves a relative L2 error of 5.66% and maintains probability normalization within a 0.03% tolerance, significantly outperforming DeepONet’s 32.46% error and 3.2% probability deviation. Furthermore, a recursive error-bounding technique provides quantifiable confidence in the model’s predictions. While validated in a low-dimensional system, our framework demonstrates a promising and robust methodology for problems in fields like soft matter physics and financial modeling, where both physical consistency and data-driven flexibility are crucial. We also provide a transparent analysis of the model’s computational trade-offs, positioning this physics-informed approach as a reliable tool for complex scientific applications.
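As a small illustration of the reference quantities mentioned above, the sketch below evaluates the textbook Gaussian solution for free one-dimensional Brownian motion, checks that it integrates to one, and compares its variance with simulated trajectories. It is not the paper's PINN. It assumes a constant diffusion coefficient and a particle starting at the origin, and it uses a plain Euler-Maruyama step for overdamped motion instead of the Verlet integrator used by the authors. All names and parameter values are hypothetical.

```python
import math
import random


def gaussian_pdf(x, t, diffusion):
    """Free-diffusion density at position x and time t for a start at x = 0."""
    return math.exp(-x * x / (4.0 * diffusion * t)) / math.sqrt(4.0 * math.pi * diffusion * t)


def total_probability(t, diffusion, half_width=10.0, steps=2000):
    """Trapezoidal integral of the density over [-half_width, half_width]."""
    dx = 2.0 * half_width / steps
    ys = [gaussian_pdf(-half_width + i * dx, t, diffusion) for i in range(steps + 1)]
    return dx * (sum(ys) - 0.5 * (ys[0] + ys[-1]))


def simulate_positions(n_particles, t, diffusion, n_steps=100, seed=0):
    """Euler-Maruyama sampling of overdamped Brownian motion up to time t."""
    rng = random.Random(seed)
    # Each step adds Gaussian noise with variance 2 * D * dt.
    step_std = math.sqrt(2.0 * diffusion * t / n_steps)
    positions = []
    for _ in range(n_particles):
        x = 0.0
        for _ in range(n_steps):
            x += rng.gauss(0.0, step_std)
        positions.append(x)
    return positions


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)
```

For this free-diffusion case the variance of the simulated positions should approach 2 * diffusion * t, which gives a simple sanity check on any learned density before more elaborate error metrics are applied.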
Free
PTSLGA: A Provenance Tracking System for Linked Data Generating Application
Research article
Tracking the provenance of RDF resources is an important task in Linked Data generating applications. It plays a central role in information gathering as well as in workflow. Various Linked Data generating applications have evolved for converting legacy data to RDF resources. These data belong to the bibliographic, geographic, government, publication and cross-domain categories. However, most of these applications do not support tracking data and workflow provenance for individual RDF resources. In such cases, the applications need to track, store and disseminate provenance information describing their source data and the operations involved. In this article, we introduce an approach for tracking the provenance of RDF resources. Provenance information is tracked during the conversion process and stored in the triple store. Thereafter, this information is disseminated using provenance URIs. The proposed framework has been analyzed using the Harvard Library Bibliographic Datasets. The evaluation was made by converting legacy data into RDF and Linked Data with provenance. The outcome has been quite promising in the sense that it enables data publishers to generate relevant provenance information with less time and effort.
Free
Pancreatic Cancer Prediction Using Machine Learning: An Investigation of Different Algorithms
Research article
Pancreatic cancer, characterized by its high mortality rate and scarce treatment options, poses a formidable challenge in the field of oncology. Immediate progress in diagnostic and prognostic methods is needed to detect pancreatic cancer early and determine its stage. This study addresses the pressing need for better diagnostic tools by evaluating and selecting suitable machine learning (ML) algorithms for detecting pancreatic cancer at an early stage. This work uses a publicly available dataset of 590 urine samples, which includes control, benign hepatobiliary disease and Pancreatic Ductal Adenocarcinoma (PDAC) samples. The primary objectives of the research were to develop a predictive model based on clinical data, to examine various ML algorithms for their diagnostic precision, and to improve early detection rates for pancreatic cancer. The study assessed the efficacy of a broad array of ML algorithms in forecasting outcomes associated with pancreatic cancer. The analysis systematically explored Random Forest, Support Vector Machine, Decision Trees, K-Nearest Neighbours, XGBoost, AdaBoost, CatBoost, and GradientBoost. The assessment focused on standard performance metrics: accuracy, precision (also known as positive predictive value or PPV), recall (sometimes called sensitivity or true positive rate), F1-score, and support. Notably, CatBoost achieved the highest accuracy of 75%, outperforming other models such as Random Forest (74%) and XGBoost (74%), demonstrating its superior classification performance in distinguishing between pancreatic cancer, benign conditions, and non-cancerous cases. In addition to performance evaluation, this study integrates SHAP (Shapley Additive Explanations) analysis to enhance model interpretability, ensuring transparency in feature contributions.
SHAP analysis revealed that Plasma CA19-9, LYVE1, and TFF1 were the most influential biomarkers across all classifications, reinforcing their diagnostic significance. This research emphasizes the critical importance of early detection, model interpretability, and clinical applicability, demonstrating that ML algorithms, particularly CatBoost, not only enhance diagnostic precision but also provide explainable predictions that support real-world medical decision-making.
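For readers unfamiliar with the metrics named above, the following minimal sketch computes per-class precision, recall, F1-score and support from true and predicted labels using their standard definitions. The function name and the example labels are hypothetical and are not taken from the study's data or code.

```python
def per_class_metrics(true_labels, predicted_labels, label):
    """Precision, recall, F1-score and support for one class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "support": tp + fn,  # number of true samples of this class
    }
```

Reporting these values per class matters in a three-class setting such as control, benign and PDAC, because a single accuracy figure can hide poor recall on the cancer class.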
Free
Parallel DBSCAN Clustering Algorithm Using Hadoop Map-reduce Framework for Spatial Data
Research article
Data clustering is the first step for future applications of big data analysis. It is a driving model for Artificial Intelligence and Machine Learning architectures. Processing large volumes of data quickly is a big challenge in these applications and requires fast, efficient algorithms for handling big data. Parallel clustering algorithms are one promising design for increasing the speed of handling such data. In this paper, a parallel algorithm for clustering a spatial dataset, called the P-DBSCAN algorithm, is implemented using the Hadoop MapReduce framework. The paper demonstrates an improvement for data clustering in data analytics applications. The new P-DBSCAN algorithm is executed on a generated dataset. The result of this parallel algorithm is compared with the existing DBSCAN algorithm to show the improvement in runtime performance. In addition, the outcome of P-DBSCAN shows how to resolve the scalability problem of a large dataset.
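For context, the serial DBSCAN baseline that the parallel version is compared against can be sketched as follows. This is a plain single-machine version of the standard algorithm with a brute-force neighbour search. It does not show the paper's MapReduce partitioning, and the function and parameter names are hypothetical.

```python
import math

NOISE = -1


def region_query(points, index, eps):
    """Indices of all points within eps of points[index], including itself."""
    return [j for j, q in enumerate(points) if math.dist(points[index], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels
```

The brute-force `region_query` makes this version quadratic in the number of points, which is the kind of cost that motivates spatial partitioning and parallel execution on large datasets.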
Free
Parallel Implementation of Color Based Image Retrieval Using CUDA on the GPU
Research article
Most image processing algorithms are inherently parallel, so multithreaded processors are well suited to such applications. On huge image databases, image processing takes a very long time to run on a single-core processor because the algorithms execute in a single thread. Graphics Processing Units (GPUs) are increasingly common in image processing applications because of their multithreaded execution, programmability and low cost. In this paper we implement a color-based image retrieval system in parallel using the Compute Unified Device Architecture (CUDA) programming model to run on a GPU. The main goal of this work is to parallelize color-based image retrieval through color moments so that the whole process runs much faster than the serial version. Our work makes extensive use of the highly multithreaded architecture of the multi-core GPU. Efficient use of shared memory is needed to optimize parallel reduction in CUDA. We evaluated the retrieval quality of the proposed technique using Recall, Precision, and Average Precision measures. Experimental results showed that the parallel implementation led to an average speedup of 6.305× over the serial implementation when running on an NVIDIA GeForce 610M GPU. The average Precision and average Recall of the presented method are 53.84% and 55.00% respectively.
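A serial reference for the color-moment features mentioned above can be sketched as follows. It assumes the common definition of color moments as the mean, standard deviation and cube root of the third central moment of each channel, and an unweighted L1 distance between feature vectors. The abstract does not state the exact formulation or weighting used by the authors, so the function names and the distance are hypothetical.

```python
import math


def channel_moments(values):
    """Mean, standard deviation and skewness (signed cube root) of one channel."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    # Take the cube root of the magnitude and restore the sign afterwards.
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, math.sqrt(variance), skew


def color_moments(pixels):
    """Nine-element feature vector for a list of (r, g, b) pixels."""
    features = []
    for channel in range(3):
        features.extend(channel_moments([p[channel] for p in pixels]))
    return features


def moment_distance(a, b):
    """Unweighted L1 distance between feature vectors; smaller is more similar."""
    return sum(abs(x - y) for x, y in zip(a, b))
```

Each moment is a sum over all pixels, which is why the computation maps naturally onto a parallel reduction on the GPU.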
Free
Parallel bat algorithm using mapreduce model
Research article
The Bat Algorithm is among the most popular meta-heuristic algorithms for optimization. The traditional bat algorithm is sequential and does not scale to optimization problems with a large search space, expensive fitness computation and a large number of dimensions, such as stock market strategies. Parallelizing meta-heuristics to run on parallel machines is therefore required to reduce runtime. In this paper, we propose two parallel variants of the Bat Algorithm (BA) using the MapReduce parallel programming model proposed by Google, and we use these two variants to solve the software development effort optimization problem. The experiment is conducted using the Apache Hadoop implementation of MapReduce on a cluster of 6 machines. These variants can be used to solve various complex optimization problems simply by adding more hardware resources to the cluster, without changing the code of the proposed variants.
Free
Parkinson's Brain Disease Prediction Using Big Data Analytics
Research article
In the healthcare industry, the demand for maintaining large amounts of patient data is growing steadily because of the rising population, which has increased the volume of detail about clinical and laboratory tests, imaging, prescriptions and medication. These data can be called "Big Data" because of their size, complexity and diversity. Big data analytics aims to improve patient care and identify preventive measures proactively, in order to save lives and recommend lifestyle changes for a peaceful, healthier life at low cost. The proposed predictive analytics framework combines a Decision Tree, a Support Vector Machine and an Artificial Neural Network to gain insights from patient data. The Parkinson's disease voice dataset from the UCI Machine Learning Repository is used as input. The experimental results show that early detection of the disease can facilitate clinical monitoring of elderly people and improve their chances of a longer life span and a better, more peaceful lifestyle.
Free
Part-of-speech Tagging for Marathi using Maximum Entropy Markov Model
Research article
Part-of-Speech (POS) tagging is an essential pre-processing activity for many Natural Language Processing (NLP) applications; this is particularly evident for morphologically rich languages such as Marathi. This research investigates POS tagging for Marathi using the Maximum Entropy Markov Model (MEMM). MEMM combines the strengths of conditional probability modelling and sequence prediction, allowing the integration of rich contextual features. The features used include word forms, suffixes, prefixes, and neighboring tags, which tackle the challenges presented by inflectional variation and ambiguity in Marathi. Experimental results demonstrate that the MEMM-based POS tagger achieves an accuracy of 83.72%. This performance marks a notable advancement in Marathi POS tagging, given the linguistic diversity and the scarcity of annotated data. Error analysis highlights issues such as ambiguity in homonyms and out-of-vocabulary words, and points to further improvement through enriched datasets and more sophisticated modelling techniques. This study benefits NLP applications such as machine translation, spell checking, and sentiment analysis for Indian languages and offers a solid foundation for future research in Marathi POS tagging.
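The kind of feature extraction described above can be sketched as a function that builds the context for one word, given the tag already predicted for the previous word. This is a minimal illustration only. The affix length, the feature names and the transliterated example tokens are assumptions, not details from the paper, and a real tagger would pass these features to a trained maximum entropy classifier.

```python
def memm_features(words, index, previous_tag, affix_length=3):
    """Feature dictionary for words[index], conditioned on the previous tag."""
    word = words[index]
    return {
        "word": word,                     # surface word form
        "prefix": word[:affix_length],    # leading characters
        "suffix": word[-affix_length:],   # trailing characters, useful for inflection
        "previous_tag": previous_tag,     # neighboring tag used by the Markov step
    }


# Hypothetical transliterated tokens, for illustration only.
example = memm_features(["mi", "shalet", "jato"], 1, "PRON")
```

For Devanagari input, slicing by code point can split a consonant from its vowel sign, so affixes would more sensibly be taken over grapheme clusters than over raw characters.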
Free
Performance Analysis in Bigdata
Research article
Big data technologies such as Hadoop, NoSQL and messaging queues help with big data analytics, drive business growth and support timely decisions. These big data environments are very dynamic and complex; they require performance validation, root cause analysis, and tuning to ensure success. In this paper we discuss how to analyse and test the performance of these systems. We present the important factors in a big data system that are primary candidates for performance testing, such as data ingestion capacity and throughput, data processing capacity, simulation of expected usage and MapReduce jobs, and we suggest measures to improve big data performance.
Free
Performance Analysis of 802.16 (WIMAX) Networks under Various Routing Protocols and Traffic Loads
Research article
The selection of an appropriate routing protocol is a key issue when designing scalable and efficient wireless networks. In this paper, we investigate different routing protocols and evaluate their performance on 802.16 WiMAX networks. Further, we present a comparison between 802.16 and 802.11 ad hoc networks based on the performance of various routing protocols. The simulation results show that the table-driven DSDV protocol achieves the best delivery fraction, outperforming the other protocols. However, the experiments also show that the packet delay experienced by the DSDV protocol is very high. Hence, there is a tradeoff between the various performance parameters when using the DSDV protocol in 802.16 networks.
Free
Research article
Among the various statistical and data mining discriminant analysis methods proposed so far for group classification, linear programming discriminant analysis has recently attracted researchers’ interest. This study evaluates multi-group discriminant linear programming (MDLP) for classification problems against well-known methods such as neural networks, support vector machines, and so on. MDLP is less complex than other methods and does not suffer from local optima. However, classification sometimes becomes infeasible because of insufficient data in databases, as in the case of the Internet Service Provider (ISP) small and medium-sized market considered in this research. This study proposes a fuzzy Delphi method to select and gather the required data. The results show that MDLP performs better than the other methods with respect to correct classification, at least for small and medium-sized datasets.
Free
Performance Analysis of MANET Routing Protocols in Different Mobility Models
Research article
A mobile ad hoc network (MANET) is a network without any central administration or fixed infrastructure. It consists of a number of mobile nodes that send data packets through a wireless medium. A good routing protocol is always needed to establish connections between mobile nodes, because the network topology changes dynamically. Further, in all existing routing protocols, node mobility has always been one of the important characteristics in determining the overall performance of the ad hoc network. Thus, it is essential to understand the various mobility models and their effect on routing protocols. In this paper, we compare different mobility models and provide an overview of their current research status. The main focus is on Random Mobility Models and Group Mobility Models. First, we present a survey of the characteristics, drawbacks and research challenges of mobility modeling. Finally, we present simulation results that illustrate the importance of choosing a mobility model when simulating an ad hoc network protocol. We also illustrate how the performance results of an ad hoc network protocol change drastically when the simulated mobility model is changed.
Free
Performance Analysis of Most Common Encryption Algorithms on Different Web Browsers
Research article
Hacking is the greatest problem in wireless local area networks (WLANs). Many algorithms, such as DES, 3DES, AES, UMARAM, RC6 and UR5, have been used to prevent outside attackers from eavesdropping or from stopping data being transferred correctly to the end user. We propose analyzing a Web programming language with five Web browsers in terms of their performance in processing encryption with the language’s script. This is followed by simulation tests to find the best pairing of encryption algorithm and Web browser. The results of the experimental analysis are presented in the form of graphs. We conclude that different algorithms perform differently on different Web browsers such as Internet Explorer, Mozilla Firefox, Opera and Netscape Navigator. Hence, we determine which algorithm works best and is most compatible with which Web browser. The encryption algorithms are compared at different settings for each algorithm, such as encryption/decryption speed in the different Web browsers. Experimental results are given to demonstrate the effectiveness of each algorithm.
Free