Journal articles - International Journal of Image, Graphics and Signal Processing

All articles: 1146

Simultaneous Image Fusion and Denoising based on Multi-Scale Transform and Sparse Representation

Tahiatul Islam, Sheikh Md. Rabiul Islam, Xu Huang, Keng Liang Ou

Research article

Multi-scale transform (MST) and sparse representation (SR) techniques are widely used in image representation models. Image fusion is used especially in medical, military and remote sensing applications to obtain high-resolution imagery. In this paper, an image fusion technique based on the shearlet transform and sparse representation is proposed to overcome the inherent defects of both MST-based and SR-based methods. The proposed method is also applied with other transforms combined with SR for comparison purposes. This research also investigates denoising: additive white Gaussian noise is introduced into the source images, and thresholding is performed within the proposed method to denoise them. Image quality assessment metrics for the fused image are used to evaluate the performance of the proposed method and to compare it with others.

Free

Sine Cosine Taylor Like Technique for Connected Component Detector by ICNN Simulation

S.Senthilkumar, Abd Rahni Mt Piah

Research article

A sine cosine Taylor-like technique is employed to carry out connected component detector (CCD) simulation under an improved cellular neural network (ICNN) architecture to yield better accuracy for handwritten character and image recognition systems. The principal simulation results reveal that this technique performs well in comparison with other techniques.

Free

Skin Color Segmentation in YCBCR Color Space with Adaptive Fuzzy Neural Network (Anfis)

Mohammad Saber Iraji, Azam Tosinia

Research article

In this paper, an efficient and accurate method for human skin color recognition in color images with different light intensities is proposed. First, the input color image is transformed from the RGB color space to the YCbCr color space. Then, for each pixel of the image, a decision on whether it belongs to human skin is made in the YCbCr color space using fuzzy and adaptive fuzzy neural network (ANFIS) methods. In the proposed system, ANFIS has lower error, and the system works more accurately and appropriately than prior methods.
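
The first step described above, the RGB to YCbCr conversion, can be sketched as follows. This is a minimal illustration that assumes the full-range BT.601 (JPEG-style) coefficients; the abstract does not say which YCbCr variant the authors use, and the function name is hypothetical. The fuzzy/ANFIS decision stage is not reproduced here.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr.

    Assumption: full-range BT.601 coefficients (as used in JPEG/JFIF).
    The paper may use a different variant (e.g. studio-range scaling).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# Example: a pure white pixel carries no chroma, so Cb and Cr sit at mid-scale.
white = rgb_to_ycbcr(255, 255, 255)
```

Separating luminance (Y) from chrominance (Cb, Cr) is what makes this color space attractive for skin detection under varying light intensity: a per-pixel classifier can then weigh the chroma components more heavily than brightness.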

Free

Sliding Window Based High Utility Item-Sets Mining over Data Stream Using Extended Global Utility Item-Sets Tree

P. Amaranatha Reddy, MHM Krishna Prasad

Research article

High utility item-sets mining (HUIM) is a special topic in frequent item-sets mining (FIM). It gives better insights for business growth by focusing on the utility of items in a transaction. HUIM is evolving as a powerful research area due to its vast applications in many fields. Data stream processing, meanwhile, is an interesting and challenging problem, since processing a huge amount of rapidly generated data with limited resources demands high-performance algorithms. This paper presents an innovative idea to extract high utility item-sets (HUIs) from a dynamic data stream by applying sliding window control. Even though certain algorithms exist to solve the same problem, they allow redundant processing or reprocessing of data. To overcome this, the proposed algorithm uses a trie-like structure called the Extended Global Utility Item-sets tree (EGUI-tree), which flexibly stores and retrieves the mined information instead of reprocessing it. An experimental study on real-world datasets showed that the EGUI-tree algorithm is faster than the state-of-the-art algorithms.

Free

Software Implementation of CCSDS Recommended Hyperspectral Lossless Image Compression

Dharam Shah, Kuhelika Bera, Sanjay Joshi

Research article

Hyperspectral Imagers (HySI) are used on spacecraft or aircraft to capture minute characteristics of a target by imaging it in a large number of narrow, contiguous bands. HySI data, represented as a data cube with two dimensions giving the spatial distribution and a third giving band information, is huge in volume and challenging to handle. Hence onboard compression becomes necessary for optimal usage of onboard storage and downlink bandwidth. The CCSDS recommended standard 123.0-B-1 [2] has been released with an onboard compression scheme for hyperspectral data. The scheme is based on the Fast Lossless algorithm and consists of two main functional blocks, namely the Predictor and the Encoder. The Predictor algorithm can be implemented in two modes, 'Full Neighborhood Oriented' and 'Reduced Column Oriented'. The Encoder algorithm also defines two options, 'sample-adaptive' and 'block-adaptive'. We have developed a MATLAB-based model implementing the compression scheme with all options defined by the standard. A decompression model has also been developed to recover the original data and for end-to-end verification. Four sets of HySI data (AVIRIS, Hyperion, Chandrayaan-1 and FTIS) have been applied as input to the developed model for its evaluation. The compression ratio achieved is between 2 and 3, and lossless compression is ensured for each data set, as the Mean Square Error (MSE) is zero for all hyperspectral images. Visual reconstruction of the decompressed data also matches the original. In this paper we discuss the algorithm implementation methodology and results.
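
The end-to-end verification described above (compress, decompress, then check that the MSE is zero and report the compression ratio) can be sketched as below. This is only an illustration of the verification loop: `zlib` is used as a stand-in lossless coder and is not the CCSDS 123.0-B-1 predictor/encoder, and the sample data and function names are hypothetical.

```python
import zlib


def mean_square_error(original, restored):
    """Mean of squared sample differences; 0.0 means a bit-exact round trip."""
    if len(original) != len(restored):
        raise ValueError("sequences must have the same length")
    return sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)


def compression_ratio(original_bytes, compressed_bytes):
    """Uncompressed size divided by compressed size."""
    return len(original_bytes) / len(compressed_bytes)


# Hypothetical 8-bit samples standing in for one band of a data cube.
samples = [i // 16 for i in range(4096)]
raw = bytes(samples)

packed = zlib.compress(raw)               # stand-in coder, NOT CCSDS 123.0-B-1
restored = list(zlib.decompress(packed))  # decompression side of the round trip

ratio = compression_ratio(raw, packed)
mse = mean_square_error(samples, restored)
```

For any lossless scheme `mse` must be exactly zero; the ratio obtained here says nothing about the figures reported for the CCSDS scheme, since both the coder and the data are placeholders.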

Free

Solid Launcher Dynamical Analysis and Autopilot Design

Ping Sun

Research article

The dynamics of a small solid launch vehicle have been investigated. This launcher consists of a liquid upper stage and three fundamental solid rocket boosters aligned in series. During the ascent flight phase, lateral jets and grid fins are adopted by the flight control system to stabilize the attitude of the launcher. The launcher is a slender and aerodynamically unstable vehicle with sloshing tanks. A complete set of six-degrees-of-freedom dynamic models of the launcher, incorporating its rigid body, aerodynamics, gravity, sloshing, mass change, actuator, and elastic body, is developed. Dynamic analysis results for the structural modes and the bifurcation locus are calculated on the basis of the presented models. This complete set of dynamic models is used in flight control system design. A methodology for employing numerical optimization to develop the attitude filters is presented. The design objectives include attitude tracking accuracy and robust stability with respect to rigid body dynamics, propellant slosh, and flex. A control approach for the flight control system of the launcher is then presented, using both the State Dependent Riccati Equation (SDRE) method and the Fast Output Sampling (FOS) technique. The dynamics and kinematics of the attitude stabilization problem are typically nonlinear. The SDRE technique has been applied successfully to this kind of highly nonlinear control problem, but in practice the system states needed by the SDRE method are sometimes difficult to obtain. The FOS method, which makes use of only the output samples, is combined with SDRE to accommodate the incomplete system state information. Thus, the control approach is more practical and easier to implement. The resulting autopilot can provide stable control for the vehicle.

Free

Sound Source Localization Ability in Hearing Aids: A Survey

Jyoti M. Katagi, Pandurangarao N. Kulkarni

Research article

The ability to locate a sound source is a prime factor in the human auditory system. A sound has various spectral, temporal and strength characteristics depending on where its source is located. To identify the sound location, listeners analyze these characteristics arising from various directions in the horizontal and vertical planes. In a noisy background, it is very difficult for individuals with sensorineural hearing loss to understand speech. Binaural hearing is adopted in order to reliably distinguish various sound sources and increase speech intelligibility in noisy conditions. Diffraction induced by the pinnae, head, shoulders and torso changes the pressure waveform when sound waves travel from the audio source to the listener's eardrum. Two transfer functions that specify the relation between the sound pressures at the listener's right and left eardrums capture these propagation effects. These spectral changes are recorded by Head Related Transfer Functions (HRTFs). Different hearing aid algorithms are to be studied to measure their effectiveness in improving speech perception through a series of subjective evaluations involving subjects with sensorineural hearing loss, with different types of loss characteristics and under different listening conditions. We investigated the various proposed approaches, weighed their benefits and drawbacks and, most importantly, examined whether and how the perceptual validity of the resulting HRTFs is evaluated. This paper brings out current research efforts on sound source localization ability in hearing aids, which include the use of HRTFs for generating spatial sounds in the elevation and azimuth planes, evaluating the effect of monaural and binaural hearing aid algorithms on source localization under different listening conditions for subjects with different hearing losses, and assessing the effectiveness of localization by type of hearing aid.

Free

Sparse representation and face recognition

M. Khorasani, S. Ghofrani, M. Hazari

Research article

Nowadays, applications of sparse representation are spreading widely in many fields such as face recognition. For this usage, defining a dictionary and choosing a proper recovery algorithm play an important role in the accuracy of the method. In this paper, two types of dictionaries are constructed: one based on input face images, in the method named SRC, and one based on features extracted from the input, in the method named MKD-SRC. SRC fails for partial face recognition, whereas MKD-SRC overcomes the problem. Three extensions of MKD-SRC are introduced and their performance is presented for comparison. To recommend a proper recovery algorithm, we focus on three greedy algorithms, called MP, OMP and CoSaMP, and another called Homotopy. Three standard data sets named AR, Extended Yale-B and Essex University are used to assess which recovery algorithm responds efficiently for the proposed methods. The preferred recovery algorithm was chosen based on achieved accuracy and run time.

Free

Spatial-temporal shape and motion features for dynamic hand gesture recognition in depth video

Vo Hoai Viet, Nguyen Thanh Thien Phuc, Pham Minh Hoang, Liu Kim Nghia

Research article

Human-Computer Interaction (HCI) is one of the most interesting and challenging research topics in the computer vision community. Among different HCI methods, hand gesture is the natural way of human-computer interaction and is the focus of many researchers. It allows people to use hand movements to interact with machines easily and conveniently. With the birth of depth sensors, many new techniques have been developed and have achieved considerable success. In this work, we propose a set of features extracted from depth maps for dynamic hand gesture recognition. We extract HOG2 to represent the shape and appearance of the hand in a gesture. Moreover, to capture the movement of the hands, we propose a new feature named HOF2, which is extracted based on an optical flow algorithm. These spatial-temporal descriptors are easy to comprehend and implement, yet perform very well in multi-class classification. They also have a low computational cost, so they are suitable for real-time recognition systems. Furthermore, we apply Robust PCA to reduce the feature dimension and build robust and compact gesture descriptors. The results are evaluated with a cross-validation scheme using an SVM classifier, which shows good outcomes on the challenging MSR Hand Gestures Dataset and VIVA Challenge Dataset, with accuracies of 95.51% and 55.95%, respectively.

Free

Spatiotemporal Data Fusion using Dictionary Learning and Temporal Edge Primitives

J. Malleswara Rao, C. V. Rao, A. Senthil Kumar, B. Gopala Krishna, V. K. Dadhwal

Research article

Technological limitations make it difficult to acquire images at both high spatial and high temporal resolution with spaceborne global sensors. In this paper, we propose a novel technique to create such images in a ground-based data processing system. Resourcesat-2 is one of the Indian Space Research Organization (ISRO) global missions, and it carries Linear Imaging and Self-Scanning Sensors (LISS III and LISS IV) and an Advanced Wide-Field Sensor (AWiFS). The spatial resolution of LISS III is 23.5 m and that of AWiFS is 56 m. The temporal resolution of LISS III is 24 days and that of AWiFS is 5 days. The objective of the paper is to create a synthetic LISS III image at 23.5 m spatial and 5-day temporal resolution. A synthetic LISS III image at time tk is created from an AWiFS image at time tk and a single AWiFS–LISS III image pair at time t0, which is acquired before or after the prediction time tk (t0 ≠ tk). The proposed method involves three phases. The first is the super resolution phase, in which two transition images are obtained for times t0 and tk by improving the AWiFS spatial resolution. The second is the high pass modulation phase, in which the high frequency details obtained from the difference between the LISS III image and the transition image at time t0 are proportionally injected into the transition image at time tk. When multi-temporal images of different spatial resolutions are composed, spurious spatial discontinuities are inevitable. In the third phase, these spurious discontinuities are identified and smoothed with the spatial-profile-averaging method. The proposed method achieves better prediction accuracy than the state-of-the-art techniques.

Free

Speaker Emotion Recognition based on Speech Features and Classification Techniques

J. Sirisha Devi, Srinivas Yarramalle, Siva Prasad Nandyala

Research article

Speech processing has developed into one of the vital application areas of digital signal processing. Speaker recognition is the process of automatically identifying who is speaking on the basis of individual characteristics contained in speech waves. This technique makes it possible to use the speaker's voice to verify their identity and control access to services such as voice dialing, information services, voice mail, and security control for confidential information. A review of speaker recognition and emotion recognition is performed based on the past ten years of research work. So far, work has been done on text-independent and text-dependent speaker recognition. There are many prosodic features of the speech signal that depict the emotion of a speaker. A detailed study of these issues is presented in this paper.

Free

Speaker Identification using SVM during Oriya Speech Recognition

Sanghamitra Mohanty, Basanta Kumar Swain

Research article

In this research paper, we have developed a system that identifies users by their voices and helps them retrieve information using voice queries. The system takes into account speaker identification as well as speech recognition, i.e. two pattern recognition techniques in the speech domain. Combining the speaker identification task with the speech recognition task provides a multitude of facilities in comparison to an isolated approach. Speaker identification is achieved using SVM, whereas speech recognition is based on HMM. We have used two different types of corpora for training the system. Gammatone cepstral coefficients and mel frequency cepstral coefficients are extracted for speaker identification and speech recognition, respectively. The accuracy of the system is measured from two perspectives, i.e. the accuracy of speaker identification and the accuracy of the speech recognition task. The accuracy of speaker identification is enhanced by applying speech recognition at the initial stage of speaker identification.

Free

Speaker Recognition in Mismatch Conditions: A Feature Level Approach

Sharada V Chougule, Mahesh S. Chavan

Research article

Mismatch in speech data is one of the major reasons limiting the use of speaker recognition technology in real-world applications. Extracting speaker-specific features is a crucial issue in the presence of noise and distortions. The performance of a speaker recognition system depends on the characteristics of the extracted features. The devices used to acquire the speech, as well as the surrounding conditions in which the speech is collected, affect the extracted features and hence degrade the decision rates. In view of this, a feature level approach is used to analyze the effect of sensor and environment mismatch on speaker recognition performance. The goal here is to investigate the robustness of segmental features under speech data mismatch and degradation. A set of features derived from filter bank energies, namely Mel Frequency Cepstral Coefficients (MFCCs), Linear Frequency Cepstral Coefficients (LFCCs), Log Filter Bank Energies (LOGFBs) and Spectral Subband Centroids (SSCs), is used for evaluating the robustness in mismatch conditions. A novel feature extraction technique named Normalized Dynamic Spectral Features (NDSF) is proposed to compensate for the sensor and environment mismatch. A significant enhancement in recognition results is obtained with the proposed feature extraction method.

Free

Speckle Reduction with Edge Preservation in B-Scan Breast Ultrasound Images

Madan Lal, Lakhwinder Kaur, Savita Gupta

Research article

Speckle is a multiplicative noise that degrades the quality of ultrasound images, and its presence makes visual inspection difficult. In addition, it limits the professional application of image processing techniques such as automatic lesion segmentation. Speckle reduction is therefore an essential step before further processing of ultrasonic images. Numerous techniques have been developed to preserve the edges while reducing speckle noise, but these filters avoid smoothing near the edges to preserve fine details. The objective of this work is to suggest a new technique that enhances B-Scan breast ultrasound images by increasing the speckle reduction capability of an edge-sensitive filter. In the proposed technique, a local statistics based filter is applied in the non-homogeneous regions to the output of an edge preserving filter, and an edge map is used to retain the original edges. Experiments are conducted using a synthetic test image and real-time ultrasound images. The effectiveness of the proposed technique is evaluated qualitatively by experts and quantitatively in terms of various quality metrics. Results indicate that the proposed method can reduce more noise while preserving important diagnostic edge information in breast ultrasound images.

Free

Spectral Subtractive-Type Algorithms for Enhancement of Noisy Speech: An Integrative Review

Navneet Upadhyay, Abhijit Karmakar

Research article

The spectral subtraction method is a classical approach for enhancement of speech degraded by additive background noise. The basic principle of this method is to estimate the short-time spectral magnitude of speech by subtracting the estimated noise spectrum from the noisy speech spectrum. The same result can be achieved by multiplying the noisy speech spectrum by a gain function and then combining it with the phase of the noisy speech. Besides reducing the background noise, this method introduces an annoying, perceptible tonal artifact in the enhanced speech, known as remnant musical noise, which affects human listening. Several variations and implementations of this method have been adopted in past decades to address the limitations of the spectral subtraction method. These variations constitute a family of subtractive-type algorithms and operate in the frequency domain. The objective of this paper is to provide an extensive overview of spectral subtractive-type algorithms for enhancement of noisy speech. The paper concludes by pointing out a future direction for speech enhancement research from the spectral subtraction perspective.
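
The basic principle described above can be sketched for a single frame as follows. This is a minimal, unoptimized illustration of plain magnitude subtraction with the noisy phase reused, not any specific variant from the review; the naive DFT, the `floor` parameter and all function names are assumptions made for the example.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def spectral_subtract(noisy_frame, noise_mag, floor=0.0):
    """Subtract an estimated noise magnitude from each bin of one frame.

    noise_mag: per-bin noise magnitude estimate (hypothetical input, e.g.
    averaged over speech-free frames). floor: fraction of the noisy magnitude
    kept as a lower bound, so the result never goes negative.
    """
    enhanced = []
    for value, n_mag in zip(dft(noisy_frame), noise_mag):
        mag = abs(value)
        clean_mag = max(mag - n_mag, floor * mag)  # subtract, then clamp
        enhanced.append(cmath.rect(clean_mag, cmath.phase(value)))  # noisy phase
    return idft(enhanced)
```

The clamping step is where musical noise originates: bins that randomly exceed the noise estimate survive as isolated spectral peaks. A practical system would also process overlapping windowed frames with an FFT rather than this naive transform.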

Free

Spectral and Time Based Assessment of Meditative Heart Rate Signals

Ateke Goshvarpour, Mousa Shamsi, Atefeh Goshvarpour

Research article

The objective of this article was to study the effects of Chi meditation on heart rate variability (HRV). For this purpose, the statistical and spectral measures of HRV derived from the RR intervals were analyzed. In addition, the article is concerned with finding adequate Auto-Regressive Moving Average (ARMA) model orders for spectral analysis of the time series formed from RR intervals. Akaike's Final Prediction Error (FPE) was therefore taken as the basis for choosing the model order. The results showed that, overall, the model order chosen most frequently by FPE was p = 8 before meditation and p = 5 during meditation. The results suggested that the variety of orders in HRV models across different psychological states could be due to differences in intrinsic properties of the system.
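
Order selection by FPE, as described above, can be sketched as follows. The sketch assumes one common form of Akaike's criterion, FPE = V (1 + d/N) / (1 - d/N), where V is the residual variance of the fitted model, d the number of estimated parameters and N the number of samples; the abstract does not state which form the authors used, and the variances below are made-up placeholders rather than results from the paper.

```python
def final_prediction_error(residual_variance, n_samples, n_params):
    """Akaike's FPE in the form V * (1 + d/N) / (1 - d/N)."""
    if n_params >= n_samples:
        raise ValueError("need more samples than estimated parameters")
    penalty = (1 + n_params / n_samples) / (1 - n_params / n_samples)
    return residual_variance * penalty


def select_order(variance_by_order, n_samples):
    """Return the candidate order with the smallest FPE.

    variance_by_order maps each candidate order to the residual variance of
    the model fitted at that order (the fitting itself is outside this sketch).
    """
    return min(
        variance_by_order,
        key=lambda p: final_prediction_error(variance_by_order[p], n_samples, p),
    )


# Hypothetical residual variances for three candidate orders.
hypothetical_variances = {1: 2.0, 2: 1.0, 3: 0.99}
best_order = select_order(hypothetical_variances, n_samples=50)
```

The penalty factor grows with the number of parameters, so a higher order is chosen only when it reduces the residual variance enough to pay for the extra parameters. Here the parameter count is taken to be the order p; for a full ARMA(p, q) model it would include the moving-average terms as well.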

Free

Speech Emotion Recognition based on SVM as Both Feature Selector and Classifier

Amirreza Shirani, Ahmad Reza Naghsh Nilchi

Research article

The aim of this paper is to utilize the Support Vector Machine (SVM) as both a feature selection and a classification technique for audio signals to identify human emotional states. One of the major bottlenecks of common speech emotion recognition techniques is the use of a huge number of features per utterance, which can significantly slow down the learning process and may cause the problem known as "the curse of dimensionality". Consequently, to ease this challenge, this paper aims to achieve a high-accuracy system with a minimum set of features. The proposed model uses two methods, namely "SVM feature selection" and the common "Correlation-based Feature Subset Selection (CFS)", for the feature dimension reduction part. In addition, two different classifiers, a Support Vector Machine and a Neural Network, are separately adopted to identify the six emotional states of anger, disgust, fear, happiness, sadness and neutral. The method has been verified using Persian (Persian ESD) and German (EMO-DB) emotional speech databases, and yields high recognition rates on both. The results show that the SVM feature selection method provides better emotional speech recognition performance than CFS and the baseline feature set. Moreover, the new system achieves a recognition rate of 99.44% on the Persian ESD and 87.21% on the Berlin Emotion Database for speaker-dependent classification. A promising result of 76.12% is also obtained for the speaker-independent classification case, which is among the best accuracies reported on the mentioned database relative to its small number of features.

Free

Speech Enhancement Using Joint Time and DCT Processing for Real Time Applications

Ravi Kumar Kandagatla, V. Jayachandra Naidu, P.S. Sreenivasa Reddy, Sivaprasad Nandyala

Research article

Deep learning based speech enhancement approaches provide better perceptual quality and better intelligibility. However, most of the speech enhancement methods available in the literature estimate the enhanced speech using processed amplitude, energy, MFCC spectrum, etc. along with the noisy phase. Because it is difficult to estimate the clean speech phase from noisy speech, the noisy phase is still used in reconstructing the enhanced speech. Some methods have been developed for estimating the clean speech phase, and the estimation is observed to be complex. To avoid this difficulty and obtain better performance, convolutional neural networks based on the Discrete Cosine Transform (DCT) and Discrete Sine Transform (DST), rather than the Discrete Fourier Transform (DFT), are proposed for better intelligibility and improved performance. However, such algorithms work on either time domain features or frequency domain features. To gain the advantages of both the time domain and the frequency domain, a fusion of the DCT and time domain approaches is proposed here. In this work, the DCT Dense Convolutional Recurrent Network (DCTDCRN), DST Convolutional Gated Recurrent Neural Network (DSTCGRU), DST Convolution Long Short Term Memory (DSTCLSTM) and DST Dense Convolutional Recurrent Network (DSTDCRN) are proposed for speech enhancement. These methods provide superior performance and less processing difficulty when compared to the state-of-the-art methods. The proposed DCT-based methods are further used in developing a joint time and magnitude based speech enhancement method. Simulation results show performance superior to the baseline methods for joint time and frequency based processing. The results are also analyzed using objective performance measures such as Signal to Noise Ratio (SNR), Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI).

Free

Speech Enhancement based on Wavelet Thresholding the Multitaper Spectrum Combined with Noise Estimation Algorithm

P.Sunitha, K.Satya Prasad

Research article

This paper presents a method to reduce the musical noise encountered with most frequency domain speech enhancement algorithms. Musical noise is a phenomenon that occurs due to random spectral peaks in each speech frame, caused by the large variance and inaccurate estimates of the spectra of the noisy speech and noise signals. In order to obtain a low-variance spectral estimate, this paper uses a method based on wavelet thresholding of the multitaper spectrum combined with a noise estimation algorithm, which estimates the noise spectrum from the spectral average of past and present frames according to a predetermined weighting factor, to reduce the musical noise. To evaluate the performance of this method, sine multitapers were used and the spectral coefficients were thresholded using wavelet thresholding to obtain a low-variance spectrum. Both scale-dependent and scale-independent thresholding, with soft and hard thresholding using the Daubechies wavelet, were used to evaluate the proposed method in terms of objective quality measures under eight different types of real-world noise at three levels of input SNR. To predict the speech quality in the presence of noise, objective quality measures such as Segmental SNR, Weighted Spectral Slope Distance, Log Likelihood Ratio, Perceptual Evaluation of Speech Quality (PESQ) and composite measures are compared against wavelet de-noising techniques, Spectral Subtraction and Multiband Spectral Subtraction. The proposed method provides consistent performance across all eight noise types in most of the cases considered.
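
The soft and hard thresholding rules mentioned above can be sketched as follows. These are the standard textbook definitions applied to a list of coefficients; the threshold value and the way it would be chosen per scale are not given in the abstract, so `threshold` is simply a parameter here and the function names are illustrative.

```python
import math


def hard_threshold(coeffs, threshold):
    """Keep coefficients whose magnitude exceeds the threshold, zero the rest."""
    return [c if abs(c) > threshold else 0.0 for c in coeffs]


def soft_threshold(coeffs, threshold):
    """Zero small coefficients and shrink the others toward zero by the threshold."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


# Example on a few made-up coefficients with threshold 1.0.
coefficients = [-3.0, 0.5, 2.0]
hard = hard_threshold(coefficients, 1.0)   # large values pass through unchanged
soft = soft_threshold(coefficients, 1.0)   # large values are reduced by 1.0
```

Hard thresholding preserves the amplitude of retained coefficients but is discontinuous at the threshold, while soft thresholding is continuous at the cost of biasing large coefficients toward zero. A scale-dependent scheme would call these with a different threshold at each decomposition level.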

Free

Speech Enhancement through Implementation of Adaptive Noise Canceller Using FHEDS Adaptive Algorithm

Ch.D.Umasankar, M. Satya Sai Ram

Research article

Speech analysis is the modelling and estimation of different speech characteristics according to the criteria established for real-time applications. Speech enhancement is one such analytic task. This paper compares the performance of our proposed Fast Hybrid Euclidean Direction Search (FHEDS) algorithm with other adaptive algorithms such as the NHP and FEDS algorithms. These algorithms have been tested for adaptive noise cancellation of speech signals corrupted by different noises such as Babble, Factory, Destroyer Engine, Car, Fire Engine and Train noises. The design criteria, within the current limits of the database and its analysis, are accounted for in each design phase together with the noise model to improve performance. The comparison results are tabulated for each set of noise and clean speech data under the proposed filter operation. The proposed model effectively reduces the noise to achieve better speech enhancement, and it achieves a high Signal-to-Noise Ratio (SNR) when compared to traditional models.
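
The SNR figure used for such comparisons can be computed as sketched below. This assumes the usual definition, the ratio of clean-signal energy to the energy of the remaining error, expressed in decibels; the abstract does not specify the exact SNR variant (for example global versus segmental), and the function name and sample values are illustrative.

```python
import math


def snr_db(clean, processed):
    """Signal-to-noise ratio in dB between a clean reference and a processed signal."""
    if len(clean) != len(processed):
        raise ValueError("signals must have the same length")
    signal_energy = sum(s * s for s in clean)
    error_energy = sum((s - p) ** 2 for s, p in zip(clean, processed))
    if error_energy == 0:
        return math.inf  # processed signal matches the reference exactly
    return 10.0 * math.log10(signal_energy / error_energy)


# Made-up samples: each processed sample is off by 0.1 from the reference.
clean_signal = [1.0, -1.0, 1.0, -1.0]
filtered_signal = [1.1, -0.9, 1.1, -0.9]
result = snr_db(clean_signal, filtered_signal)
```

Evaluating the same function on the noise canceller's input and output gives the SNR improvement, which is the natural quantity for comparing adaptive algorithms across noise types.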

Free
