Generalized Cross Correlation With Phase Transform Information Technology Essay
The sound signal from a source is captured by a pair of microphones. The analog signal from each microphone must be amplified and converted to digital form for further processing, so an analog-to-digital converter external to the FPGA is used. Once digitized, the signal is fed into the FPGA for source localization. The digital signals are first stored in an input buffer; in our case, a first-in first-out (FIFO) memory is used for this purpose. The time-domain signals are then converted into the frequency domain using the Fast Fourier Transform (FFT).
Generalized cross correlation (GCC) is used to estimate the time difference of arrival (TDOA) of the sound signals at the microphone pair. Once the TDOA is estimated, we calculate the direction from which the sound is arriving, which we refer to as the direction of arrival (DOA). The resulting angle is passed to the stepper motor controller module, which controls the rotation and speed of the stepper motor. The output of the controller module is fed into the stepper motor driver circuit, which energizes the appropriate coils of the stepper motor in the proper sequence so that the motor rotates toward the sound source.
4.2 Time difference of arrival (TDOA) estimation:
Consider a sound source and a pair of microphones located at some distance from the source. As sound travels through air with constant speed, it takes time to arrive from source to the microphone location. The farther a microphone is located from the sound source, the longer it will take for the sound to reach the microphone. Two microphones located at different places will receive the same sound (plus noise) at a different time. The TDOA method takes advantage of this ‘delayed’ arrival to estimate the location of the source. [ref mediatech4]
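For a source signal $s(t)$ propagated through a noisy and reverberant environment, the sensor signals $x_i(t)$ and $x_j(t)$ of two spatially separated microphones i and j can be expressed as:
$$x_i(t) = \alpha_i\, s(t - \tau_i) + n_i(t) \qquad (4.1)$$
$$x_j(t) = \alpha_j\, s(t - \tau_j) + n_j(t) \qquad (4.2)$$
where $\alpha_i$ and $\alpha_j$ are the attenuation factors, $\tau_i$ and $\tau_j$ are the time delays, and the additive terms $n_i(t)$ and $n_j(t)$ include the channel noise in the microphone system as well as any ambient noise at the sensor. This noise is assumed to be uncorrelated with $s(t)$.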
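TDOA estimation calculates the relative delay $\Delta t$ between the time delays $\tau_i$ and $\tau_j$ of the two recorded microphone signals $x_i(t)$ and $x_j(t)$, defined as: [ref 10.1.1.9.3680]
$$\Delta t = \tau_i - \tau_j \qquad (4.3)$$
Now formulae (4.1) and (4.2) can be written as
$$x_i(t) = \alpha_i\, s(t - \tau_j - \Delta t) + n_i(t) \qquad (4.4)$$
$$x_j(t) = \alpha_j\, s(t - \tau_j) + n_j(t) \qquad (4.5)$$
The Fourier transforms of the signals captured by microphones i and j are expressed as:
$$X_i(\omega) = \alpha_i\, S(\omega)\, e^{-\mathrm{j}\omega(\tau_j + \Delta t)} + N_i(\omega) \qquad (4.6)$$
$$X_j(\omega) = \alpha_j\, S(\omega)\, e^{-\mathrm{j}\omega\tau_j} + N_j(\omega) \qquad (4.7)$$
where $S(\omega)$, $N_i(\omega)$ and $N_j(\omega)$ are the Fourier transforms of the source signal and of the two noise terms, and $\mathrm{j}$ in the exponent is the imaginary unit (not the microphone index).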
Fig 4.2 Block for estimating TDOA: xi(t), xj(t) -> Fourier Transform -> GCC-PHAT -> Inverse Fourier Transform -> Max value to give TDOA
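The cross correlation between the two signals is calculated to estimate the time difference of arrival.
The cross correlation of signals $x_i(t)$ and $x_j(t)$ is
$$R_{ij}(\tau) = \int_{-\infty}^{\infty} x_i(t)\, x_j(t - \tau)\, dt \qquad (4.8)$$
The cross correlation $R_{ij}(\tau)$ reaches its maximum at the time delay $\tau = \Delta t$.
$G_{ij}(\omega)$ is the Fourier transform of the cross correlation, that is, the cross-correlation spectrum, and is obtained from the individual spectra of $x_i(t)$ and $x_j(t)$:
$$G_{ij}(\omega) = X_i(\omega)\, X_j^*(\omega) \qquad (4.9)$$
where $X_i(\omega)$ is the Fourier transform of $x_i(t)$ and $X_j^*(\omega)$ is the complex conjugate of the Fourier transform of $x_j(t)$.
$G_{ij}(\omega)$ may be expanded using equations (4.6) and (4.7):
$$G_{ij}(\omega) = \alpha_i \alpha_j |S(\omega)|^2 e^{-\mathrm{j}\omega\Delta t} + \alpha_i S(\omega) e^{-\mathrm{j}\omega(\tau_j + \Delta t)} N_j^*(\omega) + \alpha_j S^*(\omega) e^{\mathrm{j}\omega\tau_j} N_i(\omega) + N_i(\omega) N_j^*(\omega) \qquad (4.10)$$
It is assumed that the average energy of the captured signals is significantly greater than that of the interfering noise, such that
$$|S(\omega)|^2 \gg |N_i(\omega)|^2,\ |N_j(\omega)|^2 \qquad (4.11)$$
Under this assumption, the last three terms of equation (4.10) are negligible compared to the first term. Expression (4.10) then reduces to:
$$G_{ij}(\omega) \approx \alpha_i \alpha_j |S(\omega)|^2 e^{-\mathrm{j}\omega\Delta t} \qquad (4.12)$$
The cross correlation $R_{ij}(\tau)$ can then be found by evaluating
$$R_{ij}(\tau) = \mathrm{IFT}\{G_{ij}(\omega)\} \qquad (4.13)$$
where IFT stands for inverse Fourier transform.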
Generalized Cross Correlation with Phase Transform (GCC-PHAT)
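A major limitation of the plain cross-correlation approach is that it is strongly affected by noise and reverberation. To minimize their influence on TDOA estimation, we whiten the cross-correlation spectrum using the Generalized Cross Correlation with Phase Transform (GCC-PHAT):
$$G_{PHAT}(\omega) = \frac{X_i(\omega)\, X_j^*(\omega)}{|X_i(\omega)\, X_j^*(\omega)|} \qquad (4.14)$$
Its time-domain counterpart is:
$$R_{PHAT}(\tau) = \mathrm{IFT}\left\{ \frac{\mathrm{FT}\{x_i(t)\}\, \mathrm{FT}\{x_j(t)\}^*}{|\mathrm{FT}\{x_i(t)\}\, \mathrm{FT}\{x_j(t)\}^*|} \right\} \qquad (4.15)$$
where FT stands for Fourier transform and IFT stands for inverse Fourier transform.
The output of the GCC-PHAT function is delta-like, with its peak at $\tau = \Delta t$.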
The above analysis assumes analog, stationary signals $x_i(t)$ and $x_j(t)$. For digital processing, the signals are sampled and become discrete sequences $x_i[n]$ and $x_j[n]$. The signals are also not stationary, since the sound source may change location. Finite frames are required as well because of computational constraints. Hence the usual windowing techniques are applied and the sampled signals are broken into analysis frames. [10.1.1.43.5850]
After conversion into the discrete finite sequence domain, equation (4.15) becomes:
$$R_{PHAT}[m] = \mathrm{IDFT}\left\{ \frac{X_i[k]\, X_j^*[k]}{|X_i[k]\, X_j^*[k]|} \right\} \qquad (4.16)$$
where $X_i[k]$ and $X_j[k]$ are the DFTs of one analysis frame of $x_i[n]$ and $x_j[n]$.
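The discrete GCC-PHAT computation can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the FPGA implementation described in this report: the function name `gcc_phat`, the zero-padding length and the small constant that guards against division by zero are all assumptions made for the example. The sign convention is that a positive result means the first signal lags the second.

```python
import numpy as np

def gcc_phat(x_i, x_j, max_delay=None):
    """Estimate the delay (in samples) of x_i relative to x_j using GCC-PHAT."""
    n = len(x_i) + len(x_j)                 # zero-pad so the circular correlation does not wrap
    X_i = np.fft.rfft(x_i, n=n)
    X_j = np.fft.rfft(x_j, n=n)
    G = X_i * np.conj(X_j)                  # cross-correlation spectrum
    G /= np.maximum(np.abs(G), 1e-12)       # PHAT weighting: keep only the phase
    r = np.fft.irfft(G, n=n)                # GCC-PHAT function over circular lags
    if max_delay is None:
        max_delay = n // 2
    # Reorder so that index 0 corresponds to lag -max_delay
    r = np.concatenate((r[-max_delay:], r[:max_delay + 1]))
    return int(np.argmax(r)) - max_delay

# Example: white noise delayed by 25 samples, with a little independent noise per channel
rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
x_j = s + 0.1 * rng.standard_normal(1024)
x_i = np.concatenate((np.zeros(25), s[:-25])) + 0.1 * rng.standard_normal(1024)
delay = gcc_phat(x_i, x_j)
```

Dividing the delay in samples by the sampling rate gives the TDOA in seconds.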
Direction of arrival (DOA) estimation
It is possible to compute the position of the source through geometrical calculation once TDOA is estimated. We consider a simple model based on far field assumption.
Figure 4.3 Computing source direction from TDOA (labels: microphone i, microphone j, source direction, angles Φ and α)
Consider Figure 4.3, which illustrates a two-microphone array with a source in the far field. Using the cosine law, we can state that:
$$\cos\Phi = \frac{\vec{u} \cdot \vec{x}_{ij}}{\|\vec{u}\|\, \|\vec{x}_{ij}\|} \qquad (4.16)$$
where $\vec{u}$ is a unit vector pointing in the direction of the source and $\vec{x}_{ij}$ is the vector that goes from microphone i to microphone j. It can also be stated that:
$$\cos\Phi = \frac{c\, \Delta t}{\|\vec{x}_{ij}\|} \qquad (4.17)$$
where c is the velocity of sound.
Taking microphone i as the reference, we place the stepper motor at the position of microphone i. The angle of arrival of the sound signal is given by (90° - Φ), as shown in Figure 4.4. Also, from the figure, (90° - Φ) = α.
Figure 4.4 Stepper motor position and estimation of direction of sound source (labels: microphone i, microphone j, stepper motor position, source direction, angles Φ, α and 90° - Φ)
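As a rough illustration of the geometry, the far-field relation cos Φ = cΔt/d, together with α = 90° - Φ, gives sin α = cΔt/d, where d is the microphone spacing. The sketch below assumes a speed of sound of 343 m/s and a hypothetical spacing passed in by the caller; neither value is taken from this report. The ratio is clamped to [-1, 1] because a noisy TDOA estimate can slightly exceed the physical limit.

```python
import math

def doa_degrees(delta_t, mic_spacing, c=343.0):
    """Angle alpha (degrees) between the array broadside and the source direction.

    delta_t     -- estimated TDOA in seconds
    mic_spacing -- distance between the two microphones in metres (assumed known)
    c           -- speed of sound in m/s (assumed value)
    """
    ratio = c * delta_t / mic_spacing
    ratio = max(-1.0, min(1.0, ratio))      # guard against |ratio| > 1 from noisy estimates
    return math.degrees(math.asin(ratio))

# Example: a path difference of half the microphone spacing corresponds to alpha = 30 degrees
alpha = doa_degrees(0.5 * 0.2 / 343.0, 0.2)
```

A zero TDOA maps to α = 0 (source straight ahead of the array), and the sign of Δt indicates which side of the array the source is on.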
Fast Fourier Transform:
Frequency analysis of discrete-time signals is usually, and most conveniently, performed on a digital signal processor, which may be specially designed digital hardware or a general-purpose digital computer. To perform frequency analysis on a discrete-time signal {x[n]}, we convert the time-domain sequence to an equivalent frequency-domain representation.
We consider the representation of a sequence {x[n]} by samples of its spectrum X(ω). Such a frequency-domain representation leads to the Discrete Fourier Transform (DFT), which is a powerful computational tool for performing frequency analysis of discrete-time signals.
In view of the importance of the DFT in various digital signal processing applications, such as correlation analysis, linear filtering and spectrum analysis, its efficient computation has received considerable attention from many mathematicians, engineers and applied scientists.
The number of complex multiplication and addition operations required by the direct forms of both the Discrete Fourier Transform (DFT) and the Inverse Discrete Fourier Transform (IDFT) is of order $N^2$, as there are N output points to calculate, each of which requires N complex arithmetic operations.
For a length-n input vector x, the DFT is a length-n vector X with elements:
$$X(k) = \sum_{m=1}^{n} x(m)\, e^{-\mathrm{j}2\pi(k-1)(m-1)/n}, \quad 1 \le k \le n \qquad (4.18)$$
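Basically, the DFT computes the sequence {X(k)} of N complex-valued numbers from another sequence of data {x(n)} of length N, according to the formula
$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{kn}, \quad 0 \le k \le N-1 \qquad (4.19)$$
where $W_N = e^{-\mathrm{j}2\pi/N}$.
Similarly, the Inverse Discrete Fourier Transform (IDFT) is given by
$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, W_N^{-kn}, \quad 0 \le n \le N-1 \qquad (4.20)$$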
The DFT and IDFT involve essentially the same type of computations. Therefore, efficient computational algorithms for the DFT also apply to the efficient computation of the IDFT.
For each value of k, direct computation of X(k) involves N complex multiplications (4N real multiplications) and N-1 complex additions (4N-2 real additions). Hence, computing all N values of the DFT requires $N^2$ complex multiplications and $N^2 - N$ complex additions. The direct method therefore has algorithmic complexity $O(N^2)$ and is not very efficient; the DFT would be of little use in the majority of practical DSP applications if we could not do better than this.
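To make the operation count concrete, the following sketch evaluates the DFT sum directly and counts the complex multiplications as it goes. It is an illustration only; the function name `dft_direct` and the counter are introduced here for the example.

```python
import cmath

def dft_direct(x):
    """Direct evaluation of X(k) = sum_n x(n) * exp(-j*2*pi*k*n/N).

    Returns the transform and the number of complex multiplications performed.
    """
    N = len(x)
    X = []
    mults = 0
    for k in range(N):
        acc = 0j
        for n in range(N):
            acc += x[n] * cmath.exp(-2j * cmath.pi * k * n / N)   # one complex multiply per term
            mults += 1
        X.append(acc)
    return X, mults

# A unit impulse transforms to a flat spectrum; an 8-point input costs 8 * 8 multiplications
X, mults = dft_direct([1, 0, 0, 0, 0, 0, 0, 0])
```

The two nested loops over k and n are what give the $N^2$ growth.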
However, there are a number of different 'Fast Fourier Transform' (FFT) algorithms that compute the DFT of a signal much faster than direct evaluation. FFT algorithms exploit the symmetry and periodicity properties of the phase factor $W_N$. In particular, these two properties are:
Symmetry property:
$$W_N^{k+N/2} = -W_N^{k} \qquad (4.21)$$
Periodicity property:
$$W_N^{k+N} = W_N^{k} \qquad (4.22)$$
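Both properties of the phase factor $W_N = e^{-\mathrm{j}2\pi/N}$, namely $W_N^{k+N/2} = -W_N^{k}$ (symmetry) and $W_N^{k+N} = W_N^{k}$ (periodicity), are easy to confirm numerically. The short check below uses N = 8 as an arbitrary example; the helper name `twiddle` is introduced only for this sketch.

```python
import cmath

def twiddle(N, k):
    """Phase factor W_N^k = exp(-j*2*pi*k/N)."""
    return cmath.exp(-2j * cmath.pi * k / N)

N = 8
# Symmetry: W_N^(k + N/2) = -W_N^k
symmetry_ok = all(abs(twiddle(N, k + N // 2) + twiddle(N, k)) < 1e-12 for k in range(N))
# Periodicity: W_N^(k + N) = W_N^k
periodicity_ok = all(abs(twiddle(N, k + N) - twiddle(N, k)) < 1e-12 for k in range(N))
```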
As the name suggests, FFTs are algorithms for the quick calculation of the discrete Fourier transform of a data vector. The FFT is a DFT algorithm that reduces the number of computations needed for N points from $O(N^2)$ to $O(N \log_2 N)$. If the signal to be transformed is not harmonically related to the sampling frequency, the response of an FFT looks like a 'sinc' function, (sin x)/x.
Radix 2 algorithm:
It is one of the most commonly used FFT algorithms. The 'Radix 2' algorithms are useful if N is a regular power of 2 ($N = 2^p$). If we assume that algorithmic complexity provides a direct measure of execution time and that the relevant logarithm base is 2, then, as shown in Table 4.1, the ratio of execution times for the DFT versus the Radix 2 FFT (denoted as 'Speed improvement factor') increases rapidly as N increases. [fft.pdf]
Number of points, N | Complex multiplications in direct computation, N^2 | Complex multiplications in FFT algorithm, (N/2) log2 N | Speed improvement factor
4    | 16      | 4    | 4.0
8    | 64      | 12   | 5.3
16   | 256     | 32   | 8.0
32   | 1024    | 80   | 12.8
64   | 4096    | 192  | 21.3
128  | 16384   | 448  | 36.6
256  | 65536   | 1024 | 64.0
512  | 262144  | 2304 | 113.8
1024 | 1048576 | 5120 | 204.8
Table 4.1 Comparison of execution times, DFT and Radix-2 FFT
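The entries of Table 4.1 follow directly from the two operation counts, $N^2$ for direct computation and $(N/2)\log_2 N$ for the radix-2 FFT. A short sketch that regenerates them (the function name `speed_table` is introduced here for illustration):

```python
def speed_table(exponents=range(2, 11)):
    """Rows of (N, N^2, (N/2) log2 N, speed improvement factor) for N = 2^p."""
    rows = []
    for p in exponents:
        N = 2 ** p
        direct = N * N                 # complex multiplications, direct DFT
        fft = (N // 2) * p             # complex multiplications, radix-2 FFT (log2 N = p)
        rows.append((N, direct, fft, round(direct / fft, 1)))
    return rows

for N, direct, fft, factor in speed_table():
    print(f"{N:5d} {direct:8d} {fft:6d} {factor:6.1f}")
```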
There are two different radix-2 algorithms, the "Decimation in Time" (DIT) and the "Decimation in Frequency" (DIF) algorithms. Both rely on the recursive decomposition of an N-point transform into two (N/2)-point transforms. This decomposition process can be applied to any composite (non-prime) N. The method is particularly simple if N is divisible by 2, and if N is a regular power of 2 the decomposition can be applied repeatedly until the trivial '1 point' transform is reached.
The divide-and-conquer method is used to obtain the radix-2 decimation-in-frequency FFT. Figure 4.5 shows the first stage of the 8-point DIF algorithm. The decimation causes shuffling of the data.
Figure 4.5 First stage of 8 point Decimation in Frequency Algorithm (labels: inputs x[0] to x[7], intermediate sequences g[0] to g[3] and h[0] to h[3], twiddle factors W_N^0 to W_N^3, two N/2 Point DFT blocks, outputs X[0] to X[7])
The entire process involves $v = \log_2 N$ stages of decimation, where each stage involves N/2 butterflies of the type shown in Figure 4.6. Here $W_N^r$ is the twiddle factor.
Figure 4.6 Butterfly scheme (inputs a and b; outputs A = a + b and B = (a - b) W_N^r)
Consequently, the computation of the N-point DFT via this algorithm requires $(N/2)\log_2 N$ complex multiplications. For illustrative purposes, the eight-point decimation-in-frequency algorithm is shown in Figure 4.7 below. We observe that the output sequence occurs in bit-reversed order with respect to the input.
Figure 4.7 8 point decimation in frequency algorithm (labels: inputs x[0] to x[7], three butterfly stages with twiddle factors W_N^0 to W_N^3, outputs X[0] to X[7])
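The stage structure described above can be sketched in software as follows. This is a plain-Python illustration of a radix-2 decimation-in-frequency FFT, not the hardware core used on the FPGA; the function name `fft_dif` and the explicit bit-reversal step at the end are choices made for this example.

```python
import cmath

def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT; len(x) must be a power of two."""
    a = [complex(v) for v in x]
    N = len(a)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of two")
    span = N // 2                              # distance between the two butterfly inputs
    while span >= 1:
        step = N // (2 * span)                 # twiddle exponent stride for this stage
        for start in range(0, N, 2 * span):
            for r in range(span):
                top, bottom = a[start + r], a[start + r + span]
                w = cmath.exp(-2j * cmath.pi * r * step / N)
                a[start + r] = top + bottom                  # A = a + b
                a[start + r + span] = (top - bottom) * w     # B = (a - b) * W_N^(r*step)
        span //= 2
    # The butterflies leave the outputs in bit-reversed order; restore natural order
    bits = N.bit_length() - 1
    out = [0j] * N
    for i in range(N):
        rev = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        out[rev] = a[i]
    return out

X = fft_dif([1, 2, 3, 4, 0, 0, 0, 0])
```

For N = 8 the three passes of the outer loop use twiddle exponents (0, 1, 2, 3), then (0, 2), then (0), which matches the stages of the eight-point flow graph.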
Matlab Simulation
Figure 4.8 Simulink model for verification of TDOA algorithm
A Simulink model is used to verify the algorithm for calculating the time difference of arrival (TDOA). Samples of the audio signal are read from a wave file, "hello_8000.wav", at 8 kHz. Windowing is applied so that a fixed number of samples is taken each time for further processing to estimate the TDOA. The audio signal is passed through two channels, each representing the signal from one microphone, with a fixed delay in samples added to one channel. From Figure 4.8 we can observe that a delay of 25 samples has been used.
Noise is added externally using the Gaussian Noise Generator block to check the robustness of the system against ambient noise. In tests with different noise conditions added to the two channels, the system showed good resistance to noise.
Figure 4.9 Plot of samples of time delay as viewed in scope
The plot of the estimated delay in samples for a simulation time of 20 sec is shown in Figure 4.9. The estimate stays on a straight line at a delay of 25 samples, although there are occasional deviations, with the estimate shooting up to a maximum of 128 samples. The most probable value obtained, however, is 25 delay samples.
Figure 4.10 Plot of time delay (secs) as viewed in scope
From the delay in samples, we then calculate the time delay in seconds between the signals in the two channels. For 25 delay samples at the 8 kHz sampling rate, the time delay is 25/8000 = 0.003125 sec. The plot of the time delay obtained is shown in Figure 4.10.
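The same experiment can be approximated outside Simulink. The sketch below is an assumption-laden stand-in for the model, not a reproduction of it: white noise replaces the wave file "hello_8000.wav", the frame length of 256 samples and the noise level of 0.3 are made-up values, and the most frequent per-frame estimate is taken as the final delay, mirroring the "maximum probability" reading of Figure 4.9.

```python
import numpy as np

FS = 8000       # sampling rate in Hz, as stated in the text
DELAY = 25      # delay in samples, as stated in the text
FRAME = 256     # assumed frame length (not given in the text)

def frame_delay(x_i, x_j):
    """GCC-PHAT delay of x_i relative to x_j for one frame, in samples."""
    n = 2 * len(x_i)                                   # zero-pad to avoid wrap-around
    G = np.fft.rfft(x_i, n) * np.conj(np.fft.rfft(x_j, n))
    r = np.fft.irfft(G / np.maximum(np.abs(G), 1e-12), n)
    lags = np.concatenate((np.arange(0, n // 2), np.arange(-n // 2, 0)))
    return int(lags[np.argmax(r)])

rng = np.random.default_rng(0)
s = rng.standard_normal(FS)                            # 1 s of white noise as the source
x_j = s + 0.3 * rng.standard_normal(FS)                # reference channel plus noise
x_i = np.roll(s, DELAY) + 0.3 * rng.standard_normal(FS)  # delayed channel plus noise

# One estimate per non-overlapping frame, then pick the most frequent value
estimates = [frame_delay(x_i[k:k + FRAME], x_j[k:k + FRAME])
             for k in range(0, FS - FRAME + 1, FRAME)]
values, counts = np.unique(estimates, return_counts=True)
most_likely = int(values[np.argmax(counts)])
delay_seconds = most_likely / FS
```

Voting across frames is one simple way to suppress the occasional outlier frame; other smoothing schemes would work as well.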
Figure 4.11 Stepper motor drive model
Figure 4.11 shows a Simulink model of a stepper motor drive using a two-phase hybrid stepper motor. The motor phases are fed by two H-bridge MOSFET PWM converters. A 28 V DC voltage source supplies the driver module. The movement of the stepper motor drive is controlled by the STEP and DIR signals generated by the Signal Builder block. Square-wave current references are generated using the current amplitude and step frequency parameters specified in the dialog window of the Signal Builder block.
The STEP signal from the Signal Builder block controls the movement of the stepper motor. A positive value of the STEP signal rotates the motor, whereas a zero value stops the rotation. The DIR signal controls the direction of rotation: a positive value of DIR rotates the motor in one direction, while a negative value imposes the reverse direction.
Consider a simulation time of 0.25 sec. The STEP and DIR inputs are shown in Figure 4.12 and the resulting waveforms in Figure 4.13. The stepper motor rotates in the positive direction for 0.1 sec, stops for 0.05 sec, rotates in the opposite direction for 0.05 sec and then stops.
Figure 4.12 Signal Builder for generating the STEP and DIR signals
Figure 4.13 Waveforms as viewed in the scope
STEPPER MOTOR
A stepper motor is an electromechanical device which converts electrical pulses into discrete mechanical movements. The shaft or spindle of a stepper motor rotates in discrete step increments when electrical command pulses are applied to it in the proper sequence. The motor's rotation has several direct relationships to these applied input pulses. The sequence of the applied pulses is directly related to the direction of the motor shaft's rotation. The speed of the shaft's rotation is directly related to the frequency of the input pulses, and the length of rotation is directly related to the number of input pulses applied. [motorbas.pdf]
Figure 3.13 Stepper motor
Whenever controlled movement is required, a stepper motor can be a good choice. Stepper motors are effective in applications that need control of rotation angle, position, speed and synchronism. They have found their place in applications such as printers, plotters, medical equipment, hard disk drives, automotive systems and many more. One of the most important advantages of a stepper motor is that it can be accurately controlled in an open-loop system. Open-loop control means that no feedback information about position is needed; the position is known simply by keeping track of the input step pulses. Taking these advantages into account, we have used a stepper motor to point in the direction of a sound source.
A magnetic flux is developed in the stator when a phase winding of a stepper motor is energized with current. The direction of this flux is determined by the right-hand rule. Consider Figure 3.14: when phase B is energized with winding current in the direction shown, the rotor aligns itself to minimize the flux opposition. In this case the motor would rotate clockwise so that its south pole aligns with the north pole of stator B at position 2 and its north pole aligns with the south pole of stator B at position 6. Hence, in order to rotate the motor, we must energize the stator windings in a sequence that produces a rotating magnetic flux, which the rotor follows due to magnetic attraction. [motorbas.pdf]
Figure 3.14 Stator and rotor of stepper motor showing the current direction
We have used a stepper motor in the "half step drive" stepping mode. In this mode, only one phase is energized on every second step, and during the other steps one phase on each stator is energized. The stator is energized according to the sequence:
and the rotor steps from position:
Table: Excitation sequences for different drive modes
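For illustration only, a half-step excitation pattern for a motor with four coil drive lines can be generated as below. The coil order (A, B, A', B'), the 1.8° full-step angle and all function names are assumptions made for this sketch; the sequence actually used in the project is not reproduced in this text. The pattern alternates between one and two energized coils, as described for half-step drive.

```python
# Assumed coil order (A, B, A', B'); a commonly used half-step pattern, not taken from this report
HALF_STEP = [
    (1, 0, 0, 0),
    (1, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 1, 0),
    (0, 0, 1, 0),
    (0, 0, 1, 1),
    (0, 0, 0, 1),
    (1, 0, 0, 1),
]

def coil_states(n_steps, start=0):
    """Yield coil patterns for n_steps half steps; a negative n_steps reverses direction."""
    direction = 1 if n_steps >= 0 else -1
    index = start
    for _ in range(abs(n_steps)):
        index = (index + direction) % len(HALF_STEP)
        yield HALF_STEP[index]

def steps_for_angle(angle_deg, full_step_deg=1.8):
    """Number of half steps for a rotation; full_step_deg is an assumed motor parameter."""
    return round(angle_deg / (full_step_deg / 2))

# Example: half steps needed to turn the pointer by 90 degrees
n = steps_for_angle(90)
patterns = list(coil_states(n))
```

Reversing the order in which the patterns are applied reverses the direction of rotation, which is how a signed DOA angle can be mapped to clockwise or counter-clockwise motion.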
Stepper motor driver circuit:
Figure. Driver circuit for stepper motor coil (labels: Q1, Q2, R1 to R4, D1 to D6, B1, GND1, GND2, stepper coil)
Components:
Resistors:
R1=330 Ω
R2=2.2 kΩ
R3=2.2 kΩ
R4=2.2 kΩ
Q1 is optoisolator, PC 817
Diodes D1, D4 are 1N 4007
Diodes D2, D3 are 15 volt zener diode
Diodes D5 and D6 are LEDs
Q2 is IRF540N n-channel MOSFET
B1 is 12 V supply for motor
GND1 is FPGA ground
GND2 is motor supply ground
Operation:
The MOSFET normally remains off because its gate is pulled down through the series connection of resistor R3 and LED D5. When the input from the FPGA is at logic high, the LED of the optoisolator turns on and biases its transistor, which raises the gate voltage and turns the MOSFET on. In this way the winding of the stepper motor is energized. When the input from the FPGA is at logic low, the MOSFET is turned off, which de-energizes the winding.
The MOSFET is highly sensitive to overvoltage and is damaged if the gate-to-source voltage exceeds 20 V. Hence, 15 V zener diodes are used for protection. Protection against high reverse voltage is provided by diodes D1 and D4.
MULTIPLIER
The Multiplier core can be configured in either a parallel architecture or a constant-coefficient architecture. In the parallel architecture, the multiplier accepts inputs on buses A and B and generates the product of these two values. In the constant-coefficient architecture, the multiplier accepts data on the A input bus and multiplies it by a user-defined constant value.
The Multiplier core generates fixed-point parallel multipliers and constant-coefficient multipliers for two's complement signed or unsigned data. It supports inputs ranging from 1 to 64 bits wide and outputs ranging from 1 to 128 bits wide.
Input signal:
A[17:0] : A operand input bus, 18 bits wide
B[17:0] : B operand input bus, 18 bits wide (parallel multipliers only)
CLK: Rising edge clock input
Output signal:
P[35:0]: Product output – bit 35 downto bit 0
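The behaviour of an 18 x 18 bit two's complement multiply with a 36-bit product can be modelled in software as a quick sanity check. This sketch is a behavioural model only and assumes signed operands; it says nothing about the latency or configuration of the actual core, and the function names are introduced for the example.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def multiply_18x18(a, b):
    """Signed 18 x 18 bit multiply; returns the raw 36-bit value on the P bus."""
    product = to_signed(a, 18) * to_signed(b, 18)
    return product & ((1 << 36) - 1)

# Example: 0x3FFFF is -1 in 18-bit two's complement, so the product with 2 is -2
p = multiply_18x18(0x3FFFF, 2)
p_signed = to_signed(p, 36)
```

The 36-bit output width is enough for any product of two 18-bit signed operands, including the extreme case of the most negative value multiplied by itself.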
6. EPILOGUE
6.1 System Required
To deploy any system, both the supporting hardware and the software must be adequate for the system to work properly. The major requirements of our system are an FPGA, microphones, audio amplifiers, a stepper motor, MOSFETs and the software needed for VHDL coding and simulation.
Hardware:
Microphones
Audio amplifiers
Connectors and cables
Proto board
Resistors, capacitors
MOSFETs
Power supply
Software:
Xilinx ISE
Modelsim
Matlab
Proteus
Windows XP, Vista
COST INCURRED
The costs incurred while carrying out the project are presented here.
Input section:
S.No. | Component | Quantity | Rate (NRs) | Cost (NRs)
1 | Microphone | 2 | 10 | 20
2 | ADC (0808 CCN) | 1 | 450 | 450
3 | Buffer (HEF4050BP) | 2 | 50 | 100
4 | Audio Amplifier (LM 386) | 2 | 25 | 50
5 | Resistor | 1 pack | 50 | 50
6 | Capacitor | 1 pack | 200 | 200
7 | Miscellaneous | | | 400
Total | | | | 1,270
Output section:
S.No. | Component | Quantity | Rate (NRs) | Cost (NRs)
1 | Stepper motor | 1 | 500 | 500
2 | 12 V battery | 1 | 1000 | 1,100
3 | Optoisolator (PC 817) | 4 | 25 | 100
4 | MOSFET (IRF540N) | 4 | 45 | 180
5 | 15 V zener diode | 8 | 5 | 40
6 | Resistor | 1 pack | 50 | 50
7 | Diode | 20 | 1 | 20
8 | Miscellaneous | | | 250
Total | | | | 2,240
TOTAL COST (NRs) = 3,510
Note:
Spartan 3E starter kit was provided by the Department of Electronics and Computer Engineering.
Communication and transportation cost has not been included.