American Journal of Signal Processing
p-ISSN: 2165-9354 e-ISSN: 2165-9362
2011; 1(1): 1-5
doi: 10.5923/j.ajsp.20110101.01
M. B. Dehkordi
Speech Processing Research Lab, Elec. and Comp. Eng. Dept., Yazd University, Yazd, Iran
Correspondence to: M. B. Dehkordi, Speech Processing Research Lab, Elec. and Comp. Eng. Dept., Yazd University, Yazd, Iran.
Copyright © 2012 Scientific & Academic Publishing. All Rights Reserved.
Microphone arrays are employed today to determine sound source locations in numerous real-time applications such as speech processing in large rooms and acoustic echo cancellation. Signal sources may lie in the near field or far field with respect to the microphones. Current Neural Network (NN) based source localization approaches assume far-field narrowband sources. One important limitation of these NN-based approaches is the trade-off between computational complexity and the size of the NN: an architecture that is too large or too small degrades performance in terms of generalization and computational cost. In previous work, saliency analysis has been employed to determine the most suitable structure; however, it is time-consuming and its performance is not robust. In this paper, a family of new algorithms for the compression of NNs is presented based on Compressive Sampling (CS) theory. The proposed framework makes it possible to find a sparse structure for NNs, and the designed neural network is then compressed using CS. The key difference between our algorithm and state-of-the-art techniques is that the mapping is continuously performed using the most effective features; therefore, the proposed method converges quickly. The empirical results demonstrate that the proposed algorithm is an effective alternative to traditional methods in terms of accuracy and computational complexity.
Keywords: Sampling, Sound Source, Neural Network, Pruning, Multilayer Perceptron, Greedy Algorithms
Cite this paper: M. B. Dehkordi, "Sound Source Localization with CS Based Compressed Neural Network", American Journal of Signal Processing, Vol. 1 No. 1, 2011, pp. 1-5. doi: 10.5923/j.ajsp.20110101.01.
A source is considered to be in the far field when its distance $r$ from the array satisfies $r \ge 2D^2/\lambda_{\min}$ [2], Fig. 1. In this equation $\lambda_{\min}$ is the minimum wavelength of the source signal, and $D$ is the microphone array length. Under this condition the incoming waves are approximately planar, so the time delay of the received signal between the reference microphone and the $m$-th microphone would be [15]:

$$\tau_m = (m-1)\,\frac{d\cos\theta}{c} \qquad (1)$$

where $d$ is the distance between two adjacent microphones, $\theta$ is the DOA, and $c$ is the velocity of sound in air. Therefore, $d\cos\theta/c$ is the amount of time the signal takes to traverse the distance between any two neighboring microphones; Figures 1 and 2 illustrate this fact.

Figure 1. Estimation of far-field source location.
Figure 2. Estimation of near-field source location.

When the source lies in the near field, the corresponding delay at the $m$-th microphone would be [15], Fig. 2:

$$\tau_m = \frac{\sqrt{r^2 + \big((m-1)d\big)^2 - 2r(m-1)d\cos\theta} \; - \; r}{c} \qquad (2)$$

where $r$ is the distance between the source and the first (reference) microphone [15].

Figure 3. Multilayer Perceptron neural network for sound source localization.
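For illustration, the following Python snippet evaluates the delays of Eqs. (1) and (2); the helper names, array geometry and source position are illustrative assumptions rather than values used in the paper.

```python
import numpy as np

C = 343.0  # speed of sound in air (m/s); a standard assumption

def far_field_delays(M, d, theta):
    """Eq. (1): plane-wave delay of mics m = 1..M relative to the reference mic.
    d: adjacent-microphone spacing (m); theta: DOA (rad)."""
    m = np.arange(M)  # m = 0 corresponds to the reference microphone
    return m * d * np.cos(theta) / C

def near_field_delays(M, d, theta, r):
    """Eq. (2): spherical-wave delay; r: source-to-reference distance (m)."""
    m = np.arange(M)
    return (np.sqrt(r**2 + (m * d)**2 - 2 * r * m * d * np.cos(theta)) - r) / C

# Illustrative geometry: 8 microphones, 5 cm spacing, DOA of 60 degrees.
print(far_field_delays(8, 0.05, np.deg2rad(60)))
print(near_field_delays(8, 0.05, np.deg2rad(60), r=1.0))
```

As $r$ grows, the near-field delays of Eq. (2) converge to the far-field delays of Eq. (1), which is exactly the planar-wave approximation.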
Let $x_m(t)$ denote the signal received at the $m$-th microphone and $x_1(t)$ the signal at the reference microphone ($m = 1$). We can write the signal at the $m$-th microphone in terms of the signal at the first microphone as follows:

$$x_m(t) = x_1(t - \tau_m) \qquad (3)$$

We can then define the cross-power spectrum between sensor 1 and sensor $m$ as below:

$$C_{1m}(\omega) = X_1(\omega)\,X_m^{*}(\omega) \qquad (4)$$

$$C_{1m}(\omega) = |X_1(\omega)|^{2}\, e^{\,j\omega\tau_m} \qquad (5)$$

Equation (5) relates the phase of the cross-power spectrum to $\tau_m$ (for $m = 2, \dots, M$) and thus to the DOA. Therefore, our aim is to use an MLP neural network to approximate this mapping. Our algorithm for computing a real-valued feature vector of length $(2(M-1)+1)K$, for $K$ dominant frequencies and $M$ sensors, is summarized below.

Preprocessing algorithm for computing a real-valued feature vector:
1. Calculate the N-point FFT of the signal at each sensor.
2. For $m = 2, \dots, M$:
2.1. Find the $K$ dominant FFT coefficients in absolute value for sensor $m$ with compressive sampling.
2.2. Multiply these FFT coefficients for sensor $m$ by the conjugates of the FFT coefficients at the same indices for sensor 1 to calculate the instantaneous estimate of the cross-power spectrum.
2.3. Normalize all the estimates by dividing them by their absolute values.
3. Construct a feature vector that contains the real and imaginary parts of the cross-power spectrum coefficients and their corresponding FFT indices (a code sketch of this procedure is given after Figure 4).

We utilized a two-layer Perceptron neural network and trained it with a fast back-propagation training algorithm [7]. To train the network we use a simulated dataset of received signals, modeling each received signal as a sum of cosines with random frequencies and phases. We write the received sampled signal at sensor $m$
as below:

$$x_m(n) = \sum_{i=1}^{P} \cos\!\big(2\pi f_i\,(nT_s - \tau_m) + \varphi_i\big) + v_m(n) \qquad (6)$$

where $P$ is the number of cosines (we assumed a fixed value of $P$), $f_i$ is the frequency of the $i$-th cosine, $\varphi_i$ is the initial phase of the $i$-th cosine, $\tau_m$ is the time delay between the reference microphone ($\tau_1 = 0$) and the $m$-th microphone, $v_m(n)$ is white Gaussian noise, and $T_s$ is the sampling period. Each $f_i$ is uniformly distributed over [200 Hz, 2000 Hz] and each $\varphi_i$ is uniformly distributed over $[0, 2\pi]$. We generate 100 independent sets of 128-sample vectors and then calculate the feature vectors; a total of 3600 input-output pairs are used to train the MLP.

In the training step, after building the learning dataset and calculating the feature vectors, we use the compressive sampling algorithm to reduce the feature-vector dimension. In the testing step, for a new received sampled signal, we calculate its feature vector and estimate the DOA of the sound source. Our experiments show that the classification and approximation errors are directly related to the number of hidden neurons; Figure 4 shows these relations for far-field and near-field sources.

Figure 4. Relation between number of hidden neurons and error.
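A minimal Python sketch of this simulation and preprocessing chain is given below. Two points are assumptions of the sketch rather than statements from the paper: the $K$ dominant bins are taken once from the reference sensor, which makes the feature length come out to $(2(M-1)+1)K$, and a plain top-$K$ magnitude selection stands in for the compressive-sampling selection of step 2.1. Sampling rate, noise level and geometry are illustrative.

```python
import numpy as np

FS = 8000.0   # sampling rate (Hz); illustrative assumption
C = 343.0     # speed of sound (m/s)

def simulate_sensors(M, N, delays, P=5, noise_std=0.1, rng=None):
    """Eq. (6): each sensor receives a sum of P cosines, delayed by tau_m, plus noise."""
    rng = rng or np.random.default_rng(0)
    f = rng.uniform(200.0, 2000.0, P)       # f_i ~ U[200 Hz, 2000 Hz]
    phi = rng.uniform(0.0, 2 * np.pi, P)    # phi_i ~ U[0, 2*pi]
    t = np.arange(N) / FS
    x = np.stack([np.cos(2 * np.pi * f[:, None] * (t - tau) + phi[:, None]).sum(axis=0)
                  for tau in delays])
    return x + noise_std * rng.standard_normal(x.shape)   # v_m(n): white Gaussian noise

def feature_vector(x, K):
    """Steps 1-3 of the preprocessing algorithm (dominant bins from the reference sensor)."""
    X = np.fft.rfft(x, axis=1)                # step 1: FFT at each sensor
    idx = np.argsort(np.abs(X[0]))[-K:]       # step 2.1: K dominant bins (top-K stand-in)
    feats = []
    for m in range(1, x.shape[0]):            # step 2: loop over sensors m = 2..M
        c = X[m, idx] * np.conj(X[0, idx])    # step 2.2: cross-power spectrum estimate
        c = c / np.abs(c)                     # step 2.3: normalize to unit magnitude
        feats += [c.real, c.imag]             # step 3: real and imaginary parts...
    feats.append(idx / len(X[0]))             # ...plus the (scaled) FFT indices
    return np.concatenate(feats)

# Far-field example: M = 4 mics, 5 cm spacing, DOA = 45 deg, 128-sample frames.
M, d, theta = 4, 0.05, np.deg2rad(45)
tau = np.arange(M) * d * np.cos(theta) / C    # Eq. (1)
v = feature_vector(simulate_sensors(M, N=128, delays=tau), K=8)
print(v.shape)   # -> (56,), i.e. (2*(M-1)+1)*K entries
```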
Given a signal $y$ and a dictionary $D$ (the columns of $D$ are referred to as the atoms), we seek a vector solution $x$ satisfying:

$$\min_{x} \|x\|_0 \quad \text{subject to} \quad y = Dx \qquad (7)$$

where $\|x\|_0$ (known as the $\ell_0$ norm) is the number of non-zero coefficients of $x$. Several iterative algorithms have been proposed to solve this minimization problem: greedy algorithms such as Orthogonal Matching Pursuit (OMP) and Matching Pursuit (MP), and non-convex local optimization such as the FOCUSS algorithm [16].
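As a concrete illustration of the greedy family, here is a compact, textbook Orthogonal Matching Pursuit for problem (7); it is a generic sketch, not the specific solver used in this paper.

```python
import numpy as np

def omp(D, y, S, tol=1e-9):
    """Greedy approximation of: min ||x||_0  s.t.  y = Dx  (Eq. (7)),
    using at most S atoms of the dictionary D."""
    residual = y.copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(S):
        j = int(np.argmax(np.abs(D.T @ residual)))   # atom best matching the residual
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)  # least-squares re-fit
        residual = y - D[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    x[support] = coef
    return x

# Sanity check: recover a 3-sparse vector from a 40 x 100 Gaussian dictionary.
rng = np.random.default_rng(1)
D = rng.standard_normal((40, 100))
D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms
x_true = np.zeros(100)
x_true[[5, 42, 77]] = [1.0, -2.0, 0.5]
print(np.allclose(omp(D, D @ x_true, S=3), x_true, atol=1e-6))  # typically True
```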
Two notions of matrix sparsity are used in what follows:
1. If the number of nonzero elements in a matrix is smaller than $S$, then this matrix is called an $S$-sparse matrix.
2. If the number of rows that contain nonzero elements in a matrix is smaller than $S$, then this matrix is called an $S$-row-sparse matrix.

We assume that the training input patterns are stored in a matrix $I$ and the desired output patterns are stored in a matrix $O$; then the mathematical model for training of the neural network can be written in the form of the following expansion:

$$O = f_2\big(W_2\, f_1(W_1 I + B_1) + B_2\big) \qquad (8)$$
where $O$ is the output matrix of the neural network, $H = f_1(W_1 I + B_1)$ is the output matrix of the hidden layer, $W_1$ and $W_2$ are the weight matrices of the two layers, and $B_1$ and $B_2$ are the bias terms.

In conclusion, our purpose is to design a neural network with the least number of hidden neurons (or weights) that incurs the minimum increase in the error given by (8). Minimizing a weight matrix ($W_1$ or $W_2$) acts, from a mathematical viewpoint, like setting the corresponding elements of $W_1$ or $W_2$ to zero. It follows that finding the smallest number of weights in the NN within a given accuracy range can be considered equivalent to finding an $S$-sparse matrix $\hat{W}_1$ or $\hat{W}_2$. So we can write the problem as below:

$$\min_{\hat{W}_1,\hat{W}_2}\ \|\hat{W}_1\|_0 + \|\hat{W}_2\|_0 \quad \text{subject to} \quad \big\|O - f_2\big(\hat{W}_2\, f_1(\hat{W}_1 I + B_1) + B_2\big)\big\| \le \epsilon \qquad (9)$$
Pruning whole hidden neurons yields a weight matrix most of whose rows are zeros, so with the definition of the $S$-row-sparse matrix we can rewrite the problem as below:

$$\min_{\hat{W}_1,\hat{W}_2}\ S \quad \text{subject to} \quad \hat{W}_1, \hat{W}_2 \ \text{being } S\text{-row-sparse and} \quad \big\|O - f_2\big(\hat{W}_2\, f_1(\hat{W}_1 I + B_1) + B_2\big)\big\| \le \epsilon \qquad (10)$$

Passing the (invertible) activation functions to the other side splits (10) into one linear problem per layer:

$$f_2^{-1}(O) - B_2 = \hat{W}_2 H \qquad (11)$$

$$H' - B_1 = \hat{W}_1 I \qquad (12)$$
where $H' = f_1^{-1}(H)$ is the input matrix of the hidden layer for the compressed neural network. Comparing these equations with (7), we can conclude that these minimization problems can be written as CS problems: in these CS equations $H$ and $I$ are used as the dictionary matrices, while $f_2^{-1}(O) - B_2$ and $H' - B_1$ play the role of the signal matrix. The process of compressing NNs can thus be regarded as finding different row-sparse solutions for the weight matrix $W_1$ or $W_2$.
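To make this correspondence concrete, the sketch below casts the output-layer problem (11) as a CS problem and solves it with a simultaneous (group) OMP over the rows of $W_2^{\mathsf T}$: columns of $H^{\mathsf T}$ act as dictionary atoms, and hidden units whose rows are never selected are pruned. The function names are illustrative, the target matrix $Y$ stands for $f_2^{-1}(O) - B_2$, and group-OMP is one reasonable stand-in for whichever sparse solver is used.

```python
import numpy as np

def compress_output_layer(H, Y, S):
    """Eq. (11) as a CS problem: find a row-sparse W2 with Y ~= W2 @ H,
    keeping at most S hidden units.
    H: (n_hidden, n_patterns) hidden-layer outputs; the columns of H.T
       (one per hidden unit) form the dictionary.
    Y: (n_out, n_patterns) linearized targets."""
    A, B = H.T, Y.T                  # dictionary atoms / one signal per output unit
    residual = B.copy()
    keep = []                        # indices of surviving hidden units
    for _ in range(S):
        scores = np.linalg.norm(A.T @ residual, axis=1)  # joint correlation score
        scores[keep] = -np.inf                           # do not reselect an atom
        keep.append(int(np.argmax(scores)))
        coef, *_ = np.linalg.lstsq(A[:, keep], B, rcond=None)
        residual = B - A[:, keep] @ coef
    W2 = np.zeros((Y.shape[0], H.shape[0]))
    W2[:, keep] = coef.T             # all other columns (= rows of W2^T) stay zero
    return W2, keep

# Illustrative usage: prune a 20-unit hidden layer down to 5 units.
rng = np.random.default_rng(2)
H = np.tanh(rng.standard_normal((20, 200)))        # hidden outputs on 200 patterns
W_true = np.zeros((3, 20))
W_true[:, [1, 4, 9, 12, 17]] = rng.standard_normal((3, 5))
W2, keep = compress_output_layer(H, W_true @ H, S=5)
print(sorted(keep))                                # typically [1, 4, 9, 12, 17]
```

The hidden-layer problem (12) is handled identically, with $I$ taking the place of $H$ as the dictionary.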
[1] R. Reed, "Pruning algorithms - a survey," IEEE Transactions on Neural Networks, vol. 4, pp. 740-747, May 1993.
[2] P. Lauret, E. Fock, T. A. Mara, "A node pruning algorithm based on a Fourier amplitude sensitivity test method," IEEE Transactions on Neural Networks, vol. 17, pp. 273-293, Mar. 2006.
[3] B. Hassibi, D. G. Stork, "Second-order derivatives for network pruning: optimal brain surgeon," Advances in Neural Information Processing Systems, vol. 5, pp. 164-171, 1993.
[4] L. Prechelt, "Proben1: a set of neural network benchmark problems and benchmarking rules," University of Karlsruhe, Germany, Tech. Rep. 21/94, 1994.
[5] M. Hagiwara, "Removal of hidden units and weights for back propagation networks," Proc. IEEE International Joint Conference on Neural Networks, vol. 1, pp. 351-354, Aug. 2002.
[6] Zhou Yan, Wu Xia, "Quantum neural network algorithm based on multi-agent in target fusion recognition system," International Conference on Computational and Information Sciences (ICCIS), Dec. 2010.
[7] Zhaozhao Zhang, Junfei Qiao, "A node pruning algorithm for feedforward neural network based on neural complexity," International Conference on Intelligent Control and Information Processing (ICICIP), Aug. 2010.
[8] E. J. Candes, J. Romberg, T. Tao, "Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information," IEEE Transactions on Information Theory, vol. 52, pp. 489-509, Jan. 2006.
[9] J. Haupt, R. Nowak, "Signal reconstruction from noisy random projections," IEEE Transactions on Information Theory, vol. 52, pp. 4036-4048, Aug. 2006.
[10] Y. H. Liu, S. W. Luo, A. J. Li, "Information geometry on pruning of neural network," International Conference on Machine Learning and Cybernetics, Shanghai, Aug. 2004.
[11] J. Yang, A. Bouzerdoum, S. L. Phung, "A neural network pruning approach based on compressive sampling," Proceedings of the International Joint Conference on Neural Networks, pp. 3428-3435, New Jersey, USA, Jun. 2009.
[12] H. Rauhut, K. Schnass, P. Vandergheynst, "Compressed sensing and redundant dictionaries," IEEE Transactions on Information Theory, vol. 54, pp. 2210-2219, May 2008.
[13] T. Xu, W. Wang, "A compressed sensing approach for underdetermined blind audio source separation with sparse representation," 2009.
[14] J. Laurent, P. Yand, P. Vandergheynst, "Compressed sensing: when sparsity meets sampling," Feb. 2010.
[15] G. Arslan, F. A. Sakarya, B. L. Evans, "Speaker localization for far field and near field wideband sources using neural networks," Proc. IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing, vol. 2, pp. 569-573, Antalya, Turkey, Jun. 1999.
[16] Y. Geng, J. Jung, D. Seol, "Sound-source localization system based on neural network for mobile robots," Proc. International Joint Conference on Neural Networks (IJCNN), Jun. 2008.
[17] B. Hassibi, D. G. Stork, "Second-order derivatives for network pruning: optimal brain surgeon," Advances in Neural Information Processing Systems, vol. 5, pp. 164-171, 1993.
[18] M. Hagiwara, "Removal of hidden units and weights for back propagation networks," Proc. IEEE International Joint Conference on Neural Networks, pp. 351-354, 1993.
[19] S. H. Chagas, L. L. de Oliveira, J. Baptista S. Martins, "Review of localization schemes using artificial neural networks in wireless sensor networks," 26th South Symposium on Microelectronics, Apr. 2011.
[20] SNNS software, available at http://www.ra.cs.unituebingen.de/SNNS