International Journal of Image, Graphics and Signal Processing @ijigsp
Journal articles - International Journal of Image, Graphics and Signal Processing
All articles: 1181
Visual object tracking by fusion of audio imaging in template matching framework
Research article
Audio imaging can play a fundamental role in computer vision, in particular in automated surveillance, boosting the accuracy of current systems based on standard optical cameras. We present a method for object tracking that fuses a visual image with an audio image in the template-matching framework. First, an improved template-matching tracker is presented that handles chaotic movements in the template-matching algorithm. Then a fusion scheme is presented that makes use of deviations in the pattern of correlation scores obtained across the individual frames in each imaging domain. The method is compared with various state-of-the-art trackers that perform track estimation using only visible imagery. Results highlight a significant improvement in object tracking with the assistance of audio imaging using the proposed method under severely challenging vision conditions such as occlusions, object shape deformations, clutter, and camouflage.
Free
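The correlation scores that the abstract above relies on can be illustrated with a minimal sketch of template matching by zero-mean normalized cross-correlation. This is only an assumption about the kind of score involved; the abstract does not specify the correlation measure, and the fusion step is not reproduced here. The names `ncc` and `match_template` and the toy image are hypothetical.

```python
import math

def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-sized 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mp, mt = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mp) * (b - mt) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mp) ** 2 for a in p) * sum((b - mt) ** 2 for b in t))
    # A flat patch or template has no defined correlation; score it as 0.
    return num / den if den > 0 else 0.0

def match_template(image, template):
    """Slide the template over the image; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    best = (-2.0, (0, 0))
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, (r, c))
    return best

# Hypothetical 4x5 frame containing the template at row 1, column 1.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[1, 2], [3, 9]]
score, location = match_template(image, template)
```

A tracker built on this would run the search once per frame and per imaging domain, then compare how the score maps behave before deciding which domain to trust.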
Voice Comparison Using Acoustic Analysis and Generative Adversarial Network for Forensics
Research article
Forensic Voice Comparison (FVC) is a scientific analysis that examines audio recordings to determine whether they come from the same or different speakers in digital forensics. In this work, the experiment uses three stages: preprocessing, feature extraction, and classification. In preprocessing, a stationary noise reduction algorithm removes unwanted background noise and increases the clarity of the speech, which improves the overall audio quality by reducing distractions. Next, acoustic features such as Mel Frequency Cepstral Coefficients (MFCC) are extracted from the audio signals to characterize and analyze the unique vocal patterns of different individuals. A Generative Adversarial Network (GAN) is then used to generate synthetic MFCC features and to augment the data samples. Finally, Logistic Regression (LR) is realized using the UK framework as the classifier, predicting whether the result is true or false. The accuracy achieved is 62% on 3899 samples and 85% on a set of 985 samples for the Australian English datasets.
Free
WOA Enabled Fuzzy-C-Means Segmentation for Accurate Detection of Polycystic Kidney Disease
Research article
Polycystic kidney disease (PKD) is usually caused by an inherited condition; many cysts form around the kidney, and the kidney is damaged as they grow. Accurate segmentation of PKD is crucial for consistent MRI diagnostics. Because many people have no symptoms, the cysts can lead to complications until surgery is done to remove them. Methods: For accurate detection of PKD, a large set of MRI images is considered. This work proposes a novel method that combines feature-based Fuzzy C-Means (FFCM) with the whale optimization algorithm (WOA) for accurate segmentation of kidney cysts. WOA is used to optimally set the cluster centroids of FCM. Conventional methods such as mountain models and fuzzy C-shells models are used to identify the regions of interest (ROI). Results: The outcomes of the FFCM and WOA based process are compared with the results of existing methods using IB-FCM, Fuzzy K-means, and the FCM model. Conclusion: An exact boundary of the region is obtained, and an experimental dispersal of the image is computed, by feature-extraction-based Fuzzy C-Means clustering segmentation. A detection process based on the FFCM and WOA segmentation discriminates normal cysts from kidney disease. The experimental evaluation uses the Ischemic Kidney Disease (IKD) database.
Free
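As a rough illustration of the clustering step named in the abstract above, the following is a minimal sketch of plain fuzzy C-means on scalar intensities. It is not the authors' FFCM: the feature extraction is omitted, and the `init_centers` argument merely stands in for centroids that an optimizer such as WOA would supply. All names and data values are hypothetical.

```python
def memberships(data, centers, m=2.0):
    """Standard FCM membership update: one row of cluster memberships per data point."""
    rows = []
    for x in data:
        d = [abs(x - c) for c in centers]
        if 0.0 in d:
            # A point lying exactly on a centroid belongs fully to it.
            hits = [1.0 if di == 0.0 else 0.0 for di in d]
            rows.append([h / sum(hits) for h in hits])
        else:
            p = 2.0 / (m - 1.0)
            rows.append([1.0 / sum((di / dk) ** p for dk in d) for di in d])
    return rows

def fcm_1d(data, init_centers, m=2.0, iters=50):
    """Alternate membership and centroid updates for a fixed number of iterations."""
    centers = list(init_centers)
    for _ in range(iters):
        u = memberships(data, centers, m)
        # Each centroid is the mean of the data weighted by membership^m.
        centers = [
            sum(u[j][i] ** m * x for j, x in enumerate(data))
            / sum(u[j][i] ** m for j in range(len(data)))
            for i in range(len(centers))
        ]
    return centers, memberships(data, centers, m)

# Hypothetical pixel intensities forming two groups.
intensities = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, u = fcm_1d(intensities, init_centers=[0.0, 6.0])
```

Because FCM only converges to a local optimum, the starting centroids matter, which is presumably why the abstract pairs it with a global search method.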
Wavelet Based Image Fusion for Detection of Brain Tumor
Research article
Brain tumors are one of the major causes of increased mortality among children and adults. Detecting the regions of the brain is the major challenge in tumor detection. In medical image processing, multi-sensor images are widely used as potential sources for detecting brain tumors. In this paper, a wavelet-based image fusion algorithm is applied to Magnetic Resonance (MR) images and Computed Tomography (CT) images, which are used as primary sources, to extract the redundant and complementary information and so enhance tumor detection in the resulting fused image. The main features taken into account for detecting a brain tumor are its location and size, which are further optimized through fusion of images using various wavelet transform parameters. We discuss and apply the principle of evaluating and comparing the performance of the algorithm with respect to the various wavelet types used for the wavelet analysis. The performance efficiency of the algorithm is evaluated on the basis of PSNR values. The results are compared on the basis of PSNR with gradient vector field and big bang optimization. The algorithms are analyzed in terms of accuracy in estimating the tumor region and computational efficiency.
Free
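The idea of fusing two modalities in the wavelet domain can be sketched on a single image row with a one-level Haar transform. The fusion rule used here (average the approximation coefficients, keep the detail coefficient with the larger magnitude) is a common textbook choice and an assumption, since the abstract above does not state which rule or wavelet the authors use. The row values and function names are hypothetical, and inputs must have even length.

```python
import math

S = math.sqrt(2.0)

def haar_forward(x):
    """One-level Haar transform of an even-length sequence."""
    approx = [(x[i] + x[i + 1]) / S for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / S for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / S, (a - d) / S])
    return out

def fuse(x, y):
    """Fuse two rows: mean of approximations, max-magnitude detail."""
    ax, dx = haar_forward(x)
    ay, dy = haar_forward(y)
    a = [(p + q) / 2.0 for p, q in zip(ax, ay)]
    d = [p if abs(p) >= abs(q) else q for p, q in zip(dx, dy)]
    return haar_inverse(a, d)

# Hypothetical rows: the first has a bright edge that the second lacks.
mr_row = [10.0, 10.0, 50.0, 10.0]
ct_row = [10.0, 10.0, 10.0, 10.0]
fused = fuse(mr_row, ct_row)
```

The max-magnitude rule carries the edge from the first row into the fused row, while averaging the approximations blends the overall brightness of both inputs.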
Wavelet Based Intentional Blurring Variance Scheme for Blur Detection in Barcode Images
Research article
Blur is an undesirable phenomenon and one of the most frequent causes of image degradation. Automatic blur detection, that is, assessing whether a given image is blurred or not, is highly desirable for restoring barcode images or simply using them. Many algorithms have been proposed to detect blur. These algorithms differ in their performance, time complexity, precision, and robustness in noisy environments. In this paper, we present an efficient method for blur detection in barcode images, with a no-reference perceptual blur metric using wavelets.
Free
Research article
Discrimination of protein coding regions, called exons, from noncoding regions, called introns or junk DNA, in eukaryotic cells is a computationally intensive task. The dimension of the DNA string is huge; hence it requires large computation time. Further, DNA sequences are inherently random and have vast redundancy, hidden regularities, long repeats and complementary palindromes, and therefore cannot be compressed efficiently. The objective of this study is to present an integrated signal processing algorithm that considerably reduces the computational load by compressing the DNA sequence effectively and aids the problem of searching for coding regions in DNA sequences. The presented algorithm is based on the Discrete Wavelet Transform (DWT), a very fast and effective method used for data compression, followed by a comb filter for effective prediction of protein coding period-3 regions in DNA sequences. The algorithm is validated using standard datasets such as HMR195, Burset and Guigo, and KEGG.
Free
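The period-3 property mentioned in the abstract above can be illustrated with a small sketch that measures spectral power at frequency 1/3 over the four nucleotide indicator sequences. This is a plain Fourier-based measure swapped in for illustration; it is not the DWT and comb filter pipeline the abstract describes. The function name and the two toy sequences are hypothetical.

```python
import cmath

def period3_power(seq):
    """Total spectral power at frequency 1/3 over the A, C, G, T indicator sequences."""
    total = 0.0
    for base in "ACGT":
        # Indicator sequence is 1 where the base occurs; project it onto exp(-2*pi*i*n/3).
        coeff = sum(
            cmath.exp(-2j * cmath.pi * n / 3)
            for n, b in enumerate(seq)
            if b == base
        )
        total += abs(coeff) ** 2
    return total

periodic = "ATG" * 4    # every base recurs with period 3
aperiodic = "ACGT" * 3  # every base recurs with period 4
strong = period3_power(periodic)
weak = period3_power(aperiodic)
```

In practice such a measure is evaluated in a sliding window along the sequence, and windows with high power are flagged as candidate coding regions.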
Wavelet Based Some Julia Sets of Rational Maps Having Zhukovskii Function
Research article
The dynamics of rational maps and their properties are interesting because of the presence of poles and zeros. In this paper we have computed Julia sets of rational maps having Zhukovskii Function for which the double of the first derivative has no Herman rings. The data points of the Julia sets in the MATLAB workspace were imported into the MATLAB Signal Processing Tool for analysis. We sampled the data points at a sampling frequency of 8192 Hz and obtained complex signals. We then applied a band-pass filter to these complex signals. Applying the band-pass filter generated complex analogue modulated signals.
Free
Wavelet Transform Techniques for Image Compression – An Evaluation
Research article
A vital problem in evaluating the picture quality of an image compression system is the difficulty of describing the amount of degradation in the reconstructed image. Wavelet transforms are a set of mathematical functions that have established their viability in image compression applications owing to the computational simplicity of their filter bank implementation. The choice of wavelet family depends on the application and the content of the image. The proposed work applies different hand-designed wavelet families, such as Haar, Daubechies, Biorthogonal, Coiflets, and Symlets, to a variety of benchmark images. The selected benchmark images are decomposed twice using the appropriate family of wavelets to produce the approximation and detail coefficients. The highly accurate approximation coefficients so produced are then quantized and Huffman encoded to eliminate the psychovisual and coding redundancies, while the less accurate detail coefficients are neglected. In this paper, the relative merits of different wavelet transform techniques are evaluated using the objective fidelity measures PSNR and MSE. The results provide a basis for application developers to choose the right wavelet family for image compression in their application.
Free
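The two fidelity measures named in the abstract above have standard definitions, sketched below for flattened lists of pixel values. The 8-bit peak of 255 is an assumption about the image format, and the sample values are hypothetical; the "reconstructed" row is what results when each pixel pair is replaced by its mean, which is the effect of discarding one level of Haar detail coefficients.

```python
import math

def mse(original, reconstructed):
    """Mean squared error between two equal-length pixel sequences."""
    n = len(original)
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / n

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(original, reconstructed)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)

original = [52.0, 55.0, 61.0, 59.0]
reconstructed = [53.5, 53.5, 60.0, 60.0]  # pairwise means of the original
error = mse(original, reconstructed)
quality_db = psnr(original, reconstructed)
```

Lower MSE and higher PSNR both indicate a reconstruction closer to the original, so comparing wavelet families at the same compression level reduces to comparing these two numbers.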
Wavelet and Blend maps for texture synthesis
Research article
Blending is now a popular technology for large real-time texture synthesis. Nevertheless, creating a blend map during rendering is time-consuming and computationally expensive. In this paper, we present a method to create a kind of blend tile that can be tiled together seamlessly. A blend map is in fact a kind of image, which is a Markov Random Field and contains multiresolution signals, and wavelets are a powerful way to process multiresolution signals, so we use wavelets to process the traditional blend tile. After our processing steps, the resulting blend tile becomes smooth and suitable for tiling, with no important features lost. Using this kind of blend tile saves much of the computation needed to compute the blend map during texture synthesis. The experimental results show that our method can successfully process many traditional blend tiles.
Free
Wavelet based multimodal biometrics with score level fusion using mathematical normalization
Research article
Biometric-based authentication plays a very important role in various security-related applications. This paper proposes a novel multimodal biometric verification system based on fingerprint, palmprint, and iris, with matching-score-level fusion using Mathematical Normalization. In the feature extraction stage of each unimodal system, features of each modality are extracted by applying wavelet decomposition using 6 different wavelet families and 35 respective wavelet family members. The three optimal combinations of unimodal systems, based on the equal error rate achieved by the wavelet(s), are then chosen for development of the multimodal biometric system. In matching-score-level fusion, along with the well-known normalization techniques Min-max, Tan-h, and Z-score, the performance of the multimodal systems is also analyzed using Mathematical Normalization (Math-norm) followed by the product, weighted product, sum, and average fusion rules. The experiments are conducted on a database of 100 different subjects from the publicly available FVC2006, CASIA V1, and IITD databases of fingerprint, palmprint, and iris, respectively. The experimental results clearly show that Mathematical Normalization followed by weighted product gives promising accuracy, with an equal error rate (EER) of 0.325%.
Free
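Score-level fusion as described in the abstract above can be sketched with two of the well-known normalizations it names (min-max and z-score) and three of its fusion rules (sum, product, weighted product). The abstract does not define Math-norm, so it is not implemented here. The raw scores, the equal weights, and all function names are hypothetical.

```python
import math

def min_max(scores):
    """Rescale scores linearly to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        raise ValueError("min-max normalization needs at least two distinct scores")
    return [(s - lo) / (hi - lo) for s in scores]

def z_score(scores):
    """Center scores on their mean and scale by their population standard deviation."""
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return [(s - mean) / std for s in scores]

def sum_rule(scores):
    return sum(scores)

def product_rule(scores):
    return math.prod(scores)

def weighted_product_rule(scores, weights):
    """Product of scores, each raised to its modality weight."""
    return math.prod(s ** w for s, w in zip(scores, weights))

# Hypothetical raw matching scores for three comparison attempts per modality.
fingerprint = min_max([20.0, 50.0, 80.0])
iris = min_max([10.0, 30.0, 90.0])

# Fuse the two modalities for the second attempt.
attempt = [fingerprint[1], iris[1]]
fused_sum = sum_rule(attempt)
fused_product = product_rule(attempt)
fused_weighted = weighted_product_rule(attempt, [0.5, 0.5])
```

Normalization matters because the matchers for different modalities emit scores on unrelated scales; without it, a product or sum would be dominated by whichever matcher happens to produce the largest numbers.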
Wavelet, Gabor Filters and Co-occurrence Matrix for Palmprint Verification
Research article
Authentication through the palmprint is a field of biometrics. Palmprint-based personal verification has quickly entered the biometric family. It has become increasingly popular in recent years due to its ease of acquisition, reliability, and high user acceptance. In this paper, we present an authentication system based on the palmprint. We are particularly interested in the feature extraction step. Three feature extraction techniques, based on the discrete wavelet transform, Gabor filters, and the co-occurrence matrix, are evaluated. A support vector machine is used for the classification step. The results have been validated on the PolyU database related to 400 users. The best results have been achieved with the wavelet decomposition.
Free
Wavelet-NARM Based Sparse Representation for Bio Medical Images
Research article
Sparse-representation-based super resolution deals with the problem of reconstructing a high resolution image from one or several of its low resolution counterparts. In this case the low resolution image is modelled as the down-sampled version of its high resolution counterpart after blurring. When the blurring kernel is the Dirac delta function, i.e. the low resolution image is directly down-sampled from its high resolution counterpart without blurring, the super-resolution problem becomes an image interpolation problem. In such cases, the conventional sparse representation models become less effective, because the data fidelity term fails to constrain the image local structures. In natural images, a given image patch can be modelled as a linear combination of its nonlocal similar neighbours. In this paper, image nonlocal self-similarity is introduced for image interpolation. More specifically, a wavelet-based nonlocal autoregressive model (NARM) is proposed and taken as the data fidelity term in the sparse representation model. Our experimental results on benchmark test images clearly demonstrate that the proposed wavelet-NARM based image interpolation method effectively reconstructs edge structures and suppresses jaggy/ringing artefacts, achieving the best image interpolation results so far in terms of PSNR as well as perceptual quality metrics such as structural similarity index and structural content. The proposed method is applied to biomedical images to emphasize diagnostic information.
Free
Wavelet-based Video Coding using Advanced Fractional Motion Estimation Technique
Research article
The purpose of this paper is to encode a color video by wavelet transformation. To that end, we propose a new hybrid approach that incorporates a fractional motion estimation technique. Several studies were carried out to reduce the spatial and temporal redundancies. At the level of spatial video coding, we use a new approach based on sub-band coding through a discrete wavelet transformation. This technique is based on the principle of Shapiro's EZW algorithm and proceeds by separating the encoding of the signs and the magnitudes of the wavelet coefficients. At the level of temporal compression, we study motion estimation at different accuracies based on image interpolation to improve the quality of the predicted frame. Next, we present a representation that reduces the size of the motion vector field, and we compress it with two entropy coding approaches, namely Huffman coding and arithmetic coding. The proposed video codec was applied to video sequences of different sizes (CIF and QCIF) and different dynamics. The results, in terms of objective assessment (PSNR, SSIM, and VQM), were satisfactory compared with other video coding standards. We also carried out a subjective evaluation, and the results are compared with those obtained by the H.264/AVC standard.
Free
Research article
Environmental pollution resulting from waste is a critical global challenge that significantly affects both the environment and public health, especially in countries like Indonesia. Effective waste management and recycling depend on accurately detecting and classifying different waste types. This study tackles this challenge by evaluating the YOLOv8s algorithm for object detection and conducting a comparative analysis of two mobile-optimized convolutional neural networks (CNNs), MobileNetV2 and EfficientNet, for waste classification. The YOLOv8s model established a promising baseline for detection, achieving a mean Average Precision (mAP@50) of 0.621 on the hold-out test set. MobileNetV2 proved to be the superior architecture in the classification task, attaining a higher accuracy of 94.4% compared to EfficientNet’s 87.8%. Additionally, MobileNetV2 demonstrated significantly greater computational efficiency, with a processing time of 229 ms per step, in contrast to EfficientNet’s 606 ms per step. These findings confirm that combining YOLOv8s for detection and MobileNetV2 for classification provides a robust and efficient pathway for developing automated waste management systems.
Free
Weighted Late Fusion based Deep Attention Neural Network for Detecting Multi-Modal Emotion
Research article
In affective computing research, multi-modal emotion detection has gained popularity as a way to boost recognition robustness and get around the constraints of processing multiple types of data. A variety of methodologies are used to characterize human emotions, including physiological indicators, facial expressions, and neuroimaging tactics. Here, a novel deep attention mechanism is used for detecting multi-modal emotions. Initially, the data are collected from audio and video features. For dimensionality reduction, the audio features are extracted using Constant-Q chromagram and Mel-Frequency Cepstral Coefficients (MM-FC2). After extraction, audio generation is carried out by a Convolutional Dense Capsule Network (Conv_DCN). For the video data, key frame extraction is carried out using Enhanced spatial-temporal and Second-Order Gaussian kernels. Second-Order Gaussian kernels are a powerful tool for extracting features from video data and converting them into a format suitable for image-based analysis. Next, DenseNet-169 is used for video generation. Finally, all the extracted features are fused, and emotions are detected using a Weighted Late Fusion Deep Attention Neural Network (WLF_DAttNN). The implementation uses Python, and the method achieved an accuracy of 97% on the RAVDESS dataset and 96% on the CREMA-D dataset.
Free
What is the Truth: A Survey of Video Compositing Techniques
Research article
Video compositing is considered one of the most important steps in the post-production process. The compositing process combines several videos, which may be recorded at different times or locations, into a final one. Computer-generated footage and visual effects are combined with real footage using video compositing techniques. Highly realistic shots in many movies have been shown to audiences who cannot tell that those shots are not real. Many techniques are used to achieve highly realistic video compositing results. This paper presents a survey of video compositing techniques, a comparison among them, and many examples of video compositing using existing techniques.
Free
Research article
Scene text detection from natural images has been a prime focus for the last few decades. Classification of foreground object components is an essential task in many scene text detection approaches under uncontrolled environments. As it relies heavily on robust and discriminating features, several features have been engineered for component-level text/non-text classification. The competency of such feature descriptors, particularly with respect to deep features, needs to be examined. In this paper, we present prospective feature descriptors applicable to component-level text/non-text classification and examine their performance alongside convolutional neural network based deep features. A series of experiments has been carried out on publicly available benchmark dataset(s) of multi-script document-type, scene-type, and combined text vs. non-text components. Interestingly, feature combination is found to put well-demonstrated deep features into tough competition on most datasets under consideration. For instance, on the combined text/non-text classification problem, CNN-based deep features yield 97.6%, whereas aggregated features produce an accuracy of 98.4%. Similar findings are obtained in other experiments as well. Along with the quantitative figures, the results are analyzed and discussed to support the conjectures drawn herein. This study may cater to the need for leveraging potentially strong handcrafted feature descriptors.
Free
White Colour Hues in Displays and Lighting Systems Based on RGB and RGBW LEDs
Research article
In this paper, aspects of obtaining white colour hues for displays/monitors and lighting using three- and four-component LED systems are discussed. Photometric equipment that we developed for multichannel LED control is used in an experimental study to verify theoretical calculations. Three-component RGB and four-component RGBW LED systems, which utilise the same RGB light sources and two white LEDs with warm and cold hues, are investigated. We present the results of testing the luminous efficacy of such systems at different values of light intensity, and a comparison of the corresponding circadian action factor, which measures the impact of the summed RGB and RGBW white light on human circadian rhythms. It is demonstrated that the four-component RGBW LED systems are preferable to the three-component RGB LED systems for lighting and displays because of significantly higher luminous efficacy and a slightly lower circadian factor over the entire range of correlated colour temperature studied, from 2500 K to 7000 K.
Free
Wiener filter based noise reduction algorithm with perceptual post filtering for hearing aids
Research article
This paper presents a filter bank summation method to perform spectral splitting of the input signal for binaural dichotic presentation, along with dynamic range compression, coupled with a noise reduction algorithm based on the Wiener filter. This helps to compensate for the effects of spectral masking and reduced dynamic range, and improves speech perception for moderate sensorineural hearing loss in adverse listening conditions. We consider a cascaded structure of noise reduction, Filter Bank Summation (FBS) based amplitude compression, and spectral splitting. The Wiener filter produces the enhanced signal by removing unwanted noise. The signal is split into eighteen frequency bands, covering 0-5 kHz, based on auditory critical bandwidths. To reduce the dynamic range, amplitude compression is carried out using a constant compression factor in each of the bands. Subjective and objective assessments based on Mean Opinion Score (MOS) and Perceptual Evaluation of Speech Quality (PESQ) scores, respectively, are used to test the perceived quality of speech for different Signal-to-Noise Ratio (SNR) conditions. The Vowel Consonant Vowel (VCV) syllable /aba/ and sentences were used as the test material. The listening tests showed MOS scores for the processed speech sentence “sky that morning was clear and bright blue” of 4.41, 4.2, 3.96, 3.6, 3.08, and 2.66, compared with unprocessed speech MOS scores of 4.53, 1.21, 1.16, 1.06, 0.8, and 0.483, for SNR values of ∞, +6, +3, 0, -3, and -6 dB, respectively, and PESQ values (Left Channel: 2.6192, 2.5355, 2.5646, 2.5513, 2.5221, and 2.4309; Right Channel: 2.5889, 2.3001, 2.3714, 2.4710, 2.3636, and 2.4712) for the same SNR values, indicating an improvement in the perceived quality for different SNR conditions.
To evaluate the intelligibility of the perceived speech, a listening test was carried out with hearing-impaired persons (moderate Sensorineural Hearing Loss (SNHL)) in the presence of background noise using the Modified Rhyme Test (MRT). The test material consists of 50 sets of monosyllabic words of consonant-vowel-consonant (CVC) form, with six words in each set. Each subject responded to a total of 1800 presentations (300 words x 6 different SNR conditions). Results of the listening tests (using MRT) showed maximum improvements of 27.299%, 23.95%, 24.503%, 23.602%, and 23.498% in the speech recognition scores at SNR values of -6, -3, 0, +3, and +6 dB, respectively, compared to unprocessed speech recognition scores. Reductions in response times compared to unprocessed speech were also observed at lower SNR values. The decreases in response time at SNR values of -6, -3, 0, +3, and +6 dB were 1.581, 1.41, 1.329, 1.279, and 1.01 s, respectively, indicating improvement in the intelligibility of the speech at lower SNR values.
Free
Wound Image Analysis Using Contour Evolution
Research article
The aim of the algorithm described in this paper is to segment wounds from the normal regions of an image and classify them according to wound type. The segmentation of wounds extravagates color representation, and is followed by a grayscale segmentation algorithm based on the stack mathematical approach. Accurate classification of wounds and analysis of the wound healing process are critical tasks for patient care and health cost reduction in hospitals. The tissue uniformity and flatness lead to a simplified approach but require multispectral imaging for enhanced wound delineation. The Contour Evolution method, which uses multispectral imaging, replaces more complex tools such as SVM supervised classification, as no training step is required. In Contour Evolution, classification can be done by clustering color information with a differential quantization algorithm: the color centroids of small squares taken from the segmented part of the wound image are clustered in the (C1, C2) plane, where C1 and C2 are two chrominance components. Wound healing is identified by measuring the size of the wound through contact and noncontact methods. The proportion of wound tissues is also estimated by a qualitative visual assessment based on the red-yellow-black code. Moreover, involving the full spectral response of the tissue, and not only the RGB components, provides higher discrimination for separating healed epithelial tissue from granulation tissue.
Free