Speech Compression Based on Discrete Walsh Hadamard Transform

Authors: Noureddine Aloui, Souha Bousselmi, Adnane Cherif

Journal: International Journal of Information Engineering and Electronic Business (IJIEEB)

Issue: Vol. 5, No. 3, 2013.


This paper presents a new lossy compression algorithm for stationary signals based on the Discrete Walsh Hadamard Transform (DWHT). The compression algorithm consists in framing the original speech signal into stationary frames and applying the DWHT. The obtained coefficients are then thresholded in order to truncate all coefficients below a given threshold value. Compression is achieved by efficiently encoding the strings of zero values. A comparative performance study between the algorithms based on the DWHT and the Discrete Wavelet Transform (DWT) is carried out in terms of several objective criteria: compression ratio (CR), signal-to-noise ratio (SNR), peak signal-to-noise ratio (PSNR), normalized root mean square error (NRMSE) and CPU time. The simulation results show that the DWHT-based algorithm has a very low implementation complexity and improved CR, SNR, PSNR and NRMSE compared to the DWT algorithm for stationary frames.


Keywords: Speech Compression, Discrete Walsh Hadamard Transform, Discrete Wavelet Transform

Short address: https://sciup.org/15013187

IDR: 15013187

Article text: Speech Compression Based on Discrete Walsh Hadamard Transform

Published Online September 2013 in MECS DOI: 10.5815/ijieeb.2013.03.07

  • I.    Introduction

Speech compression is a mature research topic. It is necessary to meet the transmission requirements of speech signals over communication channels and the capacity constraints of storage devices. Speech compression has many applications, such as multimedia and satellite communications.

Compression techniques can be broadly divided into two classes: lossless and lossy. With lossless compression, the original data are restored without any modification when the speech is decompressed; with lossy compression, the exact original data cannot be reconstructed from the compressed speech.

Compression methods can be classified into three functional categories [1]. The first is compression by direct methods, in which the samples of the speech signal are handled directly to provide compression. The second is compression by transformation, such as the Discrete Cosine Transform (DCT), the Discrete Fourier Transform (DFT) and the Discrete Wavelet Transform (DWT). The third is compression by parameter extraction, in which the input speech signal is analyzed to extract some parameters that are later used to reconstruct the signal.

During the last decade, the DWT has emerged as a powerful mathematical tool for speech processing, such as audio/image compression [1], [2], [3], [4], [5] and denoising [6], [7]. The performance of speech compression by the DWT is very good compared with other techniques [2], [3], such as the DCT used in MPEG (Moving Picture Experts Group) and JPEG (Joint Photographic Experts Group), especially for non-stationary signals. However, for real-time processing the DWT presents a high implementation complexity.

In this context, this paper presents a speech compression algorithm for stationary signals based on the Discrete Walsh Hadamard Transform (DWHT). The proposed algorithm is characterized by a very low implementation complexity and improved performance compared to the DWT technique.

The paper is organized as follows. Section II covers the theory of the Discrete Walsh Hadamard Transform. Section III gives a brief introduction to wavelet theory. Section IV explains the methodology for speech compression using the DWHT and the DWT. Section V presents the tests and results and closes with concluding remarks.

  • II.    Discrete Walsh Hadamard Transform

The Discrete Walsh-Hadamard Transform (DWHT) is an orthogonal transformation that decomposes a signal into a set of orthogonal, rectangular waveforms called Walsh functions [8], [9]. The Hadamard transform takes only the binary values +1 or -1. The direct and inverse DWHT pair for a signal x of length N are respectively expressed as follows:

$$y(n) = \frac{1}{N}\sum_{i=0}^{N-1} x(i)\,WAL(n,i), \qquad n = 0,1,\ldots,N-1 \quad (1)$$

$$x(i) = \sum_{n=0}^{N-1} y(n)\,WAL(n,i), \qquad i = 0,1,\ldots,N-1 \quad (2)$$

Here x(n) denotes the original speech signal and y(n) its Walsh-Hadamard coefficients; Eq. (2) reconstructs the signal from the coefficients. The Walsh functions are given by:

$$WAL(n,i) = (-1)^{\sum_{i=0}^{r-1} b_i(n)\,p_i(k)} \quad (3)$$

where $b_i(n)$ is the i-th bit in the binary representation of n (written with r bits) and $p_i(k)$ is computed using [1]:

$$p_0(k) = b_{r-1}(k)$$
$$p_1(k) = b_{r-1}(k) + b_{r-2}(k)$$
$$p_2(k) = b_{r-2}(k) + b_{r-3}(k)$$
$$\vdots$$
$$p_{r-1}(k) = b_1(k) + b_0(k)$$
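To make Eqs. (1)-(3) concrete, the following minimal sketch implements a fast Walsh-Hadamard transform pair in Python/NumPy (the paper itself uses MATLAB, so this is an illustration rather than the authors' code). The butterfly recursion produces Hadamard-ordered (natural-ordered) coefficients; the sequency (Walsh) ordering of Eq. (3) is a permutation of the same values, so coefficient magnitudes, and hence thresholding, are unaffected.

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform (Hadamard order).
    The input length must be a power of two."""
    x = np.asarray(x, dtype=float).copy()
    n = x.size
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b          # butterfly: sums
            x[i + h:i + 2 * h] = a - b  # butterfly: differences
        h *= 2
    return x

def dwht(x):
    """Forward DWHT with the 1/N scaling of Eq. (1)."""
    x = np.asarray(x, dtype=float)
    return fwht(x) / x.size

def idwht(y):
    """Inverse DWHT of Eq. (2); the unnormalized transform is self-inverse up to N."""
    return fwht(y)

# quick check of the transform pair on a random 256-sample frame
frame = np.random.randn(256)
assert np.allclose(idwht(dwht(frame)), frame)
```

Because the unnormalized transform applied twice returns N times the input, the 1/N factor in `dwht` makes the pair an exact forward/inverse transform, mirroring Eqs. (1)-(2).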

  • III.    Discrete Wavelet Transform

The Discrete Wavelet Transform (DWT) is a powerful mathematical tool for the time-frequency analysis of non-stationary signals. It uses multi-resolution filter banks for signal analysis. The general form of the DWT at level L, written in terms of the detail coefficients d_j(k) at levels j = 1, ..., L and the Lth-level approximation coefficients c_L(k), can be expressed as [5]:

$$f(t) = \sum_{k} c_L(k)\,\phi_{L,k}(t) + \sum_{j=1}^{L}\sum_{k} d_j(k)\,\psi_{j,k}(t)$$

Where the functions φ(t) and ψ(t) are respectively known as the scaling function and the mother wavelet. The approximation and detail coefficients at level j + 1 are given by:

$$c_{j+1}(k) = \sum_{m} h_0(m-2k)\,c_j(m) \quad (6)$$

$$d_{j+1}(k) = \sum_{m} h_1(m-2k)\,c_j(m) \quad (7)$$

h_0(k) and h_1(k) are known as the wavelet filters (low-pass and high-pass, respectively).
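As a point of reference, one analysis stage of Eqs. (6)-(7) corresponds to low-pass/high-pass filtering followed by downsampling by two, which is what `pywt.dwt` from the PyWavelets package computes; repeating the stage on the approximation branch yields the multi-level decomposition of the expansion above. The snippet below is a sketch only (the random vector merely stands in for a real speech segment), not the authors' implementation.

```python
import numpy as np
import pywt

x = np.random.randn(2048)                  # stand-in for a speech segment

# one filter-bank stage: c_{j+1}(k) and d_{j+1}(k) from c_j(m), Eqs. (6)-(7)
cA, cD = pywt.dwt(x, 'db10')

# iterating the stage five times on the approximation branch gives the
# multi-level decomposition used later for the DWT baseline
coeffs = pywt.wavedec(x, 'db10', level=5)  # [cA5, cD5, cD4, cD3, cD2, cD1]
x_rec = pywt.waverec(coeffs, 'db10')       # reconstruction from the expansion above
assert np.allclose(x_rec[:len(x)], x)      # perfect reconstruction (up to rounding)
```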

  • IV.    Methodology for Speech Compression

In this research work, the transform-based speech compression algorithm is performed using the three most commonly used steps: applying the transformation (DWHT or DWT), truncating coefficients (thresholding), and quantization followed by entropy encoding (Figure 2).

Step 1: The input speech signal is divided into stationary frames, and the transformation (DWHT or DWT) is applied to each frame in order to extract its coefficients.

Step 2: After transforming each speech frame, compression involves truncating the obtained coefficients below a given threshold value. To truncate the small-valued coefficients, global thresholding is applied: the coefficients of each speech frame are examined and only the largest absolute-value coefficients are kept. In this case, the threshold value (thr) is adjusted manually and is chosen in the range 0 ≤ thr ≤ C_max, where C_max is the maximum value of the DWHT (or DWT) coefficients.

Step 3: Signal compression using the transformation is achieved by truncating the small-valued coefficients and encoding them efficiently. There are many ways to encode the coefficients; one is to store the thresholded coefficients together with their respective positions in the DWHT vector [1]. In this work, consecutive zero-valued coefficients are encoded with two bytes: one byte indicates the start of a sequence of zeros in the coefficient vector, and the other byte gives the number of consecutive zeros (example: Figure 1) [2]. After encoding the coefficients, a quantization process (uniform, scalar or vector quantization) is performed, followed by an entropy encoder (Huffman or arithmetic coding) to eliminate any redundancy introduced by quantization.

Fig. 2: Discrete Walsh Hadamard Transform based methodology for speech compression
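The three steps above can be sketched end-to-end as follows. This is an illustrative outline only: the helper names (`compress_frame`, `FRAME`) are assumptions, it reuses the `dwht` sketch from Section II, and the final quantization/entropy-coding stage of Step 3 is omitted.

```python
import numpy as np
# assumes dwht() from the DWHT sketch in Section II

FRAME = 256  # stationary frame length used in the experiments

def compress_frame(frame, thr):
    """Steps 1-3 on one frame: transform, global thresholding, and run-length
    encoding of the zeroed coefficients as (start, count) pairs [2]."""
    coeffs = dwht(frame)
    coeffs[np.abs(coeffs) < thr] = 0.0            # Step 2: global thresholding

    kept, zero_runs = [], []
    i = 0
    while i < coeffs.size:                        # Step 3: encode runs of zeros
        if coeffs[i] == 0.0:
            start = i
            while i < coeffs.size and coeffs[i] == 0.0:
                i += 1
            zero_runs.append((start, i - start))  # two "bytes": start index and run length
        else:
            kept.append(coeffs[i])
            i += 1
    return kept, zero_runs

# Step 1: split the signal into stationary frames and compress each one
# x = ...  (speech samples loaded elsewhere, e.g. a TIMIT utterance)
# frames = x[: len(x) // FRAME * FRAME].reshape(-1, FRAME)
# encoded = [compress_frame(f, thr=0.0025) for f in frames]
```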

  • V.    Tests and Results

In this section, a MATLAB program has been developed to implement the DWHT-based speech compression codec described in this paper. To evaluate the efficiency of the developed algorithm, a comparative performance study is made between the DWHT algorithm and the DWT algorithms given in [1], [2], [3] and [4] for speech compression, in terms of computation time (CPU time), Compression Ratio (CR), Signal to Noise Ratio (SNR), Peak Signal to Noise Ratio (PSNR) and Normalized Root Mean Square Error (NRMSE). All source waveform files used in the simulations are taken from the TIMIT database and are sampled at 16 kHz.

The obtained results are calculated using the following formulas:

Signal to Noise Ratio (SNR):

$$SNR = 10\log_{10}\left(\frac{\sum_{n} x(n)^{2}}{\sum_{n}\big(x(n)-y(n)\big)^{2}}\right)$$

Peak Signal to Noise Ratio (PSNR):

$$PSNR = 10\log_{10}\left(\frac{N\,X^{2}}{\lVert x(n)-y(n)\rVert^{2}}\right)$$

Normalized Root Mean Square Error (NRMSE):

$$NRMSE = \sqrt{\frac{\sum_{n}\big(x(n)-y(n)\big)^{2}}{\sum_{n}\big(x(n)-\mu_{x}(n)\big)^{2}}}$$

Compression Ratio (CR):

$$CR = \frac{\text{size of the original signal}}{\text{size of the compressed signal}}$$

Where x(n) and y(n) are respectively the original and the reconstructed speech signal, N is the length of the reconstructed speech signal, X is the maximum absolute value of the original signal x(n), and μ_x(n) is the mean of the speech signal.
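For completeness, the objective measures can be evaluated as in the sketch below. The normalizations follow the formulas as reconstructed above (in particular, X is taken as the maximum absolute value of the original signal in the PSNR); this is an assumption rather than a statement of the authors' exact code.

```python
import numpy as np

def snr_db(x, y):
    """Signal-to-noise ratio in dB between original x and reconstruction y."""
    return 10 * np.log10(np.sum(x ** 2) / np.sum((x - y) ** 2))

def psnr_db(x, y):
    """Peak SNR in dB; X is assumed to be max|x| and N the reconstruction length."""
    N = len(y)
    return 10 * np.log10(N * np.max(np.abs(x)) ** 2 / np.sum((x - y) ** 2))

def nrmse(x, y):
    """Normalized RMS error relative to the signal's deviation from its mean."""
    return np.sqrt(np.sum((x - y) ** 2) / np.sum((x - np.mean(x)) ** 2))

def compression_ratio(original_size, compressed_size):
    """CR = size of the original signal / size of the compressed signal."""
    return original_size / compressed_size
```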

Table 1 and Figure 3 illustrate the performance evaluation of the proposed DWHT-based speech compression algorithm.

Table 1: Performance evaluation using DWHT for speech compression

| Source waveform file | Threshold value | CR | SNR (dB) | PSNR (dB) | NRMSE |
|---|---|---|---|---|---|
| sx19.wav | 0.0027 | 4.5028 | 18.0160 | 39.1079 | 0.1257 |
| sx27.wav | 0.0028 | 4.2140 | 18.1416 | 39.0820 | 0.1239 |
| sx29.wav | 0.0009 | 8.0193 | 24.3324 | 51.3917 | 0.0607 |
| sx30.wav | 0.0022 | 3.7276 | 18.5287 | 39.1578 | 0.1185 |
| sx37.wav | 0.0027 | 4.2200 | 19.6431 | 38.5735 | 0.1042 |
| sx41.wav | 0.0016 | 3.5660 | 18.3578 | 42.7530 | 0.1208 |
| sx46.wav | 0.0028 | 4.0968 | 18.1045 | 39.1243 | 0.1244 |
| sx56.wav | 0.0020 | 3.0804 | 18.0620 | 40.6448 | 0.1250 |
| sx63.wav | 0.0025 | 3.6526 | 17.0268 | 38.6134 | 0.1408 |
| sx77.wav | 0.0027 | 4.1803 | 18.6579 | 38.8914 | 0.1167 |
| sx81.wav | 0.0025 | 5.4464 | 18.8741 | 40.2100 | 0.1138 |
| sx87.wav | 0.0028 | 3.4364 | 19.0717 | 37.9964 | 0.1113 |
| sx93.wav | 0.0030 | 4.3077 | 17.7421 | 38.1067 | 0.1297 |

In the simulations, the speech signal is divided into frames of 256 samples and global thresholding is applied. From Table 1, it is clear that the developed DWHT-based algorithm provides a substantial compression ratio while preserving good quality of the reconstructed speech signal. Figure 3 illustrates the original and reconstructed speech signal ("Critical equipment needs proper maintenance") using the DWHT algorithm.

Table 2 illustrates the comparative performance of the proposed algorithm and the DWT-based algorithm. For the DWT algorithm given in [2], [4], [5] and [6], the mother wavelet "db10" is used with five decomposition levels, and global thresholding is applied.
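A sketch of this DWT baseline configuration (db10, five decomposition levels, global thresholding) using PyWavelets is given below; it illustrates the stated setup and is not the authors' MATLAB code. PyWavelets may warn that five levels is high for 256-sample frames with the 20-tap db10 filters, but it still performs the decomposition.

```python
import numpy as np
import pywt

def dwt_compress_frame(frame, thr, wavelet='db10', level=5):
    """Decompose a frame, zero all coefficients below thr, and reconstruct."""
    coeffs = pywt.wavedec(frame, wavelet, level=level)
    thresholded = [np.where(np.abs(c) < thr, 0.0, c) for c in coeffs]
    rec = pywt.waverec(thresholded, wavelet)
    return thresholded, rec[: len(frame)]        # trim any boundary padding
```

The zero-valued coefficients in `thresholded` would then be run-length encoded exactly as in Step 3 of Section IV, so that the two codecs differ only in the transform they apply.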

Table 2: Comparison of performance between DWHT and DWT

| Source waveform file | Algorithm | Threshold value | CPU time (s) | CR | SNR (dB) | PSNR (dB) | NRMSE |
|---|---|---|---|---|---|---|---|
| sx19.wav | DWT | 0.0650 | 3.7969 | 3.9789 | 18.2507 | 39.3426 | 0.1223 |
| sx19.wav | DWHT | 0.0025 | 1.3750 | 4.2609 | 18.5368 | 39.6287 | 0.1183 |
| sx27.wav | DWT | 0.0600 | 2.5156 | 3.5111 | 18.6414 | 39.5818 | 0.1169 |
| sx27.wav | DWHT | 0.0025 | 0.8125 | 3.9672 | 19.0219 | 39.9624 | 0.1119 |
| sx30.wav | DWT | 0.0520 | 1.9375 | 3.1525 | 18.3430 | 38.9721 | 0.1210 |
| sx30.wav | DWHT | 0.0022 | 0.6719 | 3.7276 | 18.5287 | 39.1578 | 0.1185 |
| sx31.wav | DWT | 0.0430 | 2.6406 | 2.7657 | 17.6014 | 39.5243 | 0.1318 |
| sx31.wav | DWHT | 0.0022 | 0.8906 | 3.2498 | 18.0094 | 39.9323 | 0.1258 |
| sx34.wav | DWT | 0.0400 | 4.6094 | 2.5369 | 18.5023 | 40.4045 | 0.1188 |
| sx34.wav | DWHT | 0.0021 | 1.6250 | 2.9132 | 18.7166 | 40.6188 | 0.1159 |
| sx37.wav | DWT | 0.0620 | 2.6563 | 3.6009 | 19.7017 | 38.6321 | 0.1035 |
| sx37.wav | DWHT | 0.0027 | 0.9063 | 4.2200 | 19.6431 | 38.5735 | 0.1042 |
| sx38.wav | DWT | 0.0365 | 4.2188 | 3.1461 | 18.8678 | 42.0462 | 0.1139 |
| sx38.wav | DWHT | 0.0017 | 1.5156 | 3.4667 | 18.9621 | 42.1406 | 0.1127 |
| sx81.wav | DWT | 0.0650 | 2.3594 | 4.4344 | 18.7374 | 40.0733 | 0.1156 |
| sx81.wav | DWHT | 0.0025 | 0.7969 | 5.4464 | 18.8741 | 40.2100 | 0.1138 |

From Table 2, it is evident that the total CPU time (in seconds) used by MATLAB to run the DWHT algorithm is reduced by about 65.5% compared to the DWT algorithm. In addition, it can be seen that the overall performance measures (CR, SNR, PSNR and NRMSE) are significantly improved.

Table 3 and the figures below illustrate the performance of the developed DWHT-based algorithm for different frame sizes. From the figures, it can be seen that the quality of the reconstructed speech signal obtained with the proposed algorithm degrades for large frame sizes (non-stationary frames).

Table 3: Comparison of performance for different frame sizes

| Source waveform file | Measure | Algorithm | Frame 32 | Frame 64 | Frame 128 | Frame 256 | Frame 512 |
|---|---|---|---|---|---|---|---|
| sx19.wav | CR | DWT | 0.8370 | 1.5044 | 2.5510 | 3.9789 | 5.5215 |
| sx19.wav | CR | DWHT | 2.3230 | 2.7883 | 3.4113 | 4.2609 | 5.5861 |
| sx19.wav | SNR (dB) | DWT | 17.4444 | 17.9365 | 18.1304 | 18.2507 | 18.2806 |
| sx19.wav | SNR (dB) | DWHT | 26.6662 | 23.8570 | 21.0363 | 18.5368 | 15.8379 |
| sx19.wav | PSNR (dB) | DWT | 38.5438 | 39.0334 | 39.2223 | 39.3426 | 39.3726 |
| sx19.wav | PSNR (dB) | DWHT | 47.7656 | 44.9539 | 42.1282 | 39.6287 | 36.9299 |
| sx19.wav | NRMSE | DWT | 0.1342 | 0.1268 | 0.1240 | 0.1223 | 0.1219 |
| sx19.wav | NRMSE | DWHT | 0.0464 | 0.0641 | 0.0888 | 0.1183 | 0.1615 |
| sx31.wav | CR | DWT | 0.6683 | 1.1713 | 1.8898 | 2.7657 | 3.6109 |
| sx31.wav | CR | DWHT | 1.9829 | 2.2825 | 2.6595 | 3.2498 | 4.2784 |
| sx31.wav | SNR (dB) | DWT | 17.4669 | 17.4930 | 17.5950 | 17.6014 | 17.5889 |
| sx31.wav | SNR (dB) | DWHT | 27.2253 | 24.1645 | 21.0686 | 18.0094 | 14.9258 |
| sx31.wav | PSNR (dB) | DWT | 39.4159 | 39.4383 | 39.5328 | 39.5243 | 39.4818 |
| sx31.wav | PSNR (dB) | DWHT | 49.1743 | 46.1098 | 43.0064 | 39.9323 | 36.8186 |
| sx31.wav | NRMSE | DWT | 0.1339 | 0.1335 | 0.1319 | 0.1318 | 0.1320 |
| sx31.wav | NRMSE | DWHT | 0.0435 | 0.0619 | 0.0884 | 0.1258 | 0.1794 |

Figures (a)-(h): plots of the performance measures of the DWT and DWHT algorithms for the different frame sizes of Table 3.

References

[1] Hatem Elaydi. Speech Compression Using Wavelets. site.iugaza.edu.ps/helaydi/files/2010/02/Elaydi.pdf, 2010.
[2] A. Kumar, G.K. Singh, G. Rajesh, K. Ranjeet. The optimized wavelet filters for speech compression. International Journal of Speech Technology (Springer), 2012.
[3] G. Rajesh, A. Kumar, K. Ranjeet. Speech Compression using Different Transform Techniques. IEEE International Conference on Computer and Communication Technology (ICCCT), 2011, pp. 146-151.
[4] W. Kinsner, A. Langi. Speech and Image Signal Compression with Wavelets. IEEE Wescanex Conference Proceedings, IEEE, New York, NY, 1993, pp. 368-375.
[5] Yousef M. Hawwar, Ali M. Reza, Robert D. Turney. Filtering (Denoising) in the Wavelet Transform Domain. University of Wisconsin-Milwaukee, Department of Electrical Engineering and Computer Science, 2000.
[6] Xiang-Yang Wang, Hong-Ying Yang, Zhong-Kai Fu. A new wavelet-based image denoising using undecimated discrete wavelet transform and least squares support vector machine. Expert Systems with Applications, 2010, pp. 7040-7049.
[7] Mohammed Bahoura, Hassan Ezzaidi. FPGA-Implementation of Discrete Wavelet Transform with Application to Signal Denoising. Circuits, Systems, and Signal Processing (Springer), 2012, pp. 987-1015.
[8] K.G. Beauchamp. Applications of Walsh and Related Functions - With an Introduction to Sequency Theory. Academic Press, 1984.
[9] T. Beer. Walsh Transforms. American Journal of Physics, 49(5), May 1981.