# Generalized Cross Correlation With Phase Transform Information Technology Essay

The sound signal from a source is captured by a pair of microphones. The analog signal from each microphone must be amplified and converted to digital form for further processing, so an analog-to-digital converter external to the FPGA is used. Once digitized, the signal is fed into the FPGA for source localization. The digital signals are first saved in an input buffer; in our case, a First In First Out (FIFO) buffer serves as the memory controller. The time-domain signals are then converted into the frequency domain using the Fast Fourier Transform (FFT).

Generalized cross correlation (GCC) is used to estimate the time difference of arrival (TDOA) of the sound signals at the microphone pair. Once the TDOA is estimated, we calculate the direction from which the sound signal is arriving, which we refer to as the direction of arrival (DOA). The resulting angle is passed to the stepper motor controller module, which controls the rotation and speed of the stepper motor. The signal from the stepper motor controller module is fed into the stepper motor driver circuit, which is responsible for energizing the appropriate coils of the stepper motor in a particular sequence so as to rotate the motor towards the sound source.

4.2 Time difference of arrival (TDOA) estimation:

Consider a sound source and a pair of microphones located at some distance from the source. As sound travels through air with constant speed, it takes time to arrive from source to the microphone location. The farther a microphone is located from the sound source, the longer it will take for the sound to reach the microphone. Two microphones located at different places will receive the same sound (plus noise) at a different time. The TDOA method takes advantage of this ‘delayed’ arrival to estimate the location of the source. [ref mediatech4]

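For a source signal $s(t)$ propagated through a noisy and reverberant environment, the sensor signals $x_i(t)$ and $x_j(t)$ of two spatially separated microphones $i$ and $j$ can be expressed as:

$$x_i(t) = \alpha_i\, s(t - \tau_i) + n_i(t) \tag{4.1}$$

$$x_j(t) = \alpha_j\, s(t - \tau_j) + n_j(t) \tag{4.2}$$

where $\alpha_i$ and $\alpha_j$ are the attenuation factors, $\tau_i$ and $\tau_j$ are the time delays, and the additive terms $n_i(t)$ and $n_j(t)$ include the channel noise in the microphone system as well as any ambient noise for the sensor. This noise is assumed to be uncorrelated with $s(t)$.

The TDOA estimation calculates the relative delay $\Delta t$ of the time delays $\tau_i$ and $\tau_j$ of the two recorded microphone signals $x_i(t)$ and $x_j(t)$, defined as: [ref 10.1.1.9.3680]

$$\Delta t = \tau_i - \tau_j \tag{4.3}$$

Taking the arrival at microphone $i$ as the time origin, formulae (4.1) and (4.2) can be written as

$$x_i(t) = \alpha_i\, s(t) + n_i(t) \tag{4.4}$$

$$x_j(t) = \alpha_j\, s(t + \Delta t) + n_j(t) \tag{4.5}$$

The Fourier transforms of the captured signals from microphones $i$ and $j$ are expressed as:

$$X_i(\omega) = \alpha_i\, S(\omega) + N_i(\omega) \tag{4.6}$$

$$X_j(\omega) = \alpha_j\, S(\omega)\, e^{\mathrm{j}\omega \Delta t} + N_j(\omega) \tag{4.7}$$

Here the upright $\mathrm{j}$ in the exponent denotes the imaginary unit, not the microphone index.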

Fig 4.2 Block for estimating TDOA (block diagram: $x_i(t)$, $x_j(t)$ → Fourier Transform → GCC-PHAT → Inverse Fourier Transform → Max value to give TDOA)

Cross correlation between the two signals is calculated to estimate the time difference of arrival.

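The cross correlation of signals $x_i(t)$ and $x_j(t)$ is

$$R_{ij}(\tau) = \int_{-\infty}^{\infty} x_i(t)\, x_j(t - \tau)\, dt \tag{4.8}$$

The cross correlation $R_{ij}(\tau)$ reaches its maximum at the time delay $\tau = \Delta t$, the delay of $x_i(t)$ relative to $x_j(t)$.

$G_{ij}(\omega)$ is the Fourier transform of the cross correlation (the cross spectrum) and is obtained from the individual spectra of $x_i(t)$ and $x_j(t)$:

$$G_{ij}(\omega) = X_i(\omega)\, X_j^*(\omega) \tag{4.9}$$

where $X_i(\omega)$ is the Fourier transform of $x_i(t)$ and $X_j^*(\omega)$ is the complex conjugate of the Fourier transform of $x_j(t)$.

$G_{ij}(\omega)$ may be expanded using equations (4.6) and (4.7), where $S(\omega)$, $N_i(\omega)$ and $N_j(\omega)$ are the Fourier transforms of the source signal and of the two noise terms, $\alpha_i$ and $\alpha_j$ are the attenuation factors, and $\mathrm{j}$ in the exponent is the imaginary unit:

$$G_{ij}(\omega) = \alpha_i \alpha_j |S(\omega)|^2 e^{-\mathrm{j}\omega \Delta t} + \alpha_i S(\omega) N_j^*(\omega) + \alpha_j S^*(\omega)\, e^{-\mathrm{j}\omega \Delta t} N_i(\omega) + N_i(\omega) N_j^*(\omega) \tag{4.10}$$

It is assumed that the average energy of the captured signals is significantly greater than that of the interfering noise, such that

$$|\alpha_i S(\omega)|^2 \gg |N_i(\omega)|^2, \qquad |\alpha_j S(\omega)|^2 \gg |N_j(\omega)|^2 \tag{4.11}$$

Based on this assumption, the last three terms of equation (4.10) are negligible compared to the first term. Expression (4.10) now reduces to:

$$G_{ij}(\omega) \approx \alpha_i \alpha_j |S(\omega)|^2 e^{-\mathrm{j}\omega \Delta t} \tag{4.12}$$

The delay $\Delta t$ can be found by evaluating

$$\Delta t = \arg\max_{\tau} R_{ij}(\tau) = \arg\max_{\tau}\, \mathrm{IFT}\{ G_{ij}(\omega) \} \tag{4.13}$$

where IFT stands for inverse Fourier transform.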
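As a rough illustration of the idea, the following sketch estimates the delay between two sampled channels by searching for the lag that maximizes their time-domain cross correlation. It is not the report's implementation: the function name, the test pulse and the `max_lag` search range are hypothetical, and the sign convention (a positive lag means channel i lags channel j) is an assumption.

```python
def cross_correlation_delay(x_i, x_j, max_lag):
    """Return the lag (in samples) that maximises sum_n x_i[n] * x_j[n - lag].

    A positive result means x_i is a delayed copy of x_j (assumed convention).
    """
    best_lag, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        value = 0.0
        for n in range(len(x_i)):
            m = n - lag
            if 0 <= m < len(x_j):  # skip samples that fall outside the frame
                value += x_i[n] * x_j[m]
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag


# Hypothetical test signal: a short pulse, and a copy delayed by 3 samples.
pulse = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
delayed = [0.0] * 3 + pulse[:-3]
estimated_lag = cross_correlation_delay(delayed, pulse, max_lag=5)
```

The direct search costs on the order of N operations per lag, which is one reason the frequency-domain route through the FFT is attractive on an FPGA.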

## Generalized Cross Correlation with Phase Transform (GCC-PHAT)

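A major limitation of plain cross correlation is that it is strongly affected by noise and reverberation. To reduce their influence on TDOA estimation, the cross spectrum is whitened using the Generalized Cross Correlation with Phase Transform (GCC-PHAT), which normalizes every frequency component to unit magnitude so that only the phase is retained:

$$G_{ij}^{\mathrm{PHAT}}(\omega) = \frac{X_i(\omega)\, X_j^*(\omega)}{\left| X_i(\omega)\, X_j^*(\omega) \right|} \tag{4.14}$$

where $X_i(\omega)$ and $X_j(\omega)$ are the Fourier transforms of the microphone signals $x_i(t)$ and $x_j(t)$. Its time domain counterpart is:

$$R_{ij}^{\mathrm{PHAT}}(\tau) = \mathrm{IFT}\left\{ \frac{\mathrm{FT}\{x_i(t)\}\; \mathrm{FT}\{x_j(t)\}^*}{\left| \mathrm{FT}\{x_i(t)\}\; \mathrm{FT}\{x_j(t)\}^* \right|} \right\} \tag{4.15}$$

where FT stands for Fourier transform and IFT stands for inverse Fourier transform.

The output of the GCC-PHAT function is delta-like, with its peak at $\tau = \Delta t$.

The above analysis is based on analog signals and a stationary $\Delta t$. For digital processing, the signals $x_i(t)$ and $x_j(t)$ are sampled and become discrete sequences $x_i[n]$ and $x_j[n]$. Also, $\Delta t$ is not stationary, as the source of sound is liable to change location. Finite frames are also required because of computational constraints. Hence the usual windowing techniques are applied and the sampled signals are broken into analysis frames. [10.1.1.43.5850]

After conversion into the discrete finite sequence domain, equation (4.15) becomes:

$$R_{ij}^{\mathrm{PHAT}}[m] = \mathrm{IDFT}\left\{ \frac{X_i[k]\, X_j^*[k]}{\left| X_i[k]\, X_j^*[k] \right|} \right\}, \qquad X_i[k] = \mathrm{DFT}\{x_i[n]\},\; X_j[k] = \mathrm{DFT}\{x_j[n]\} \tag{4.16}$$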
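The frame-based procedure can be sketched in a few lines: transform both frames, whiten the cross spectrum so that only its phase remains, transform back, and take the index of the peak. This is a minimal sketch under stated assumptions, not the FPGA design: it uses a direct DFT instead of the FFT core for brevity, the helper names `dft` and `gcc_phat_delay` are hypothetical, the small `eps` guarding against division by zero is an addition, and the conjugation convention (channel j is conjugated, so a positive result means channel i lags channel j) is an assumption.

```python
import cmath


def dft(x, inverse=False):
    """Direct O(N^2) DFT, or inverse DFT when inverse=True."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(x_i, x_j, eps=1e-12):
    """Estimate the delay of x_i relative to x_j, in samples, for one frame."""
    n = len(x_i)
    X_i, X_j = dft(x_i), dft(x_j)
    # Cross spectrum X_i[k] * conj(X_j[k]), normalised to unit magnitude.
    cross = [a * b.conjugate() for a, b in zip(X_i, X_j)]
    whitened = [g / (abs(g) + eps) for g in cross]
    correlation = [v.real for v in dft(whitened, inverse=True)]
    peak = max(range(n), key=lambda m: correlation[m])
    # The correlation is circular: indices above N/2 are negative lags.
    return peak if peak <= n // 2 else peak - n


# Hypothetical 32-sample frames: a decaying pulse and a copy delayed by 5 samples.
reference = [1.0, 0.5, 0.25] + [0.0] * 29
delayed = [0.0] * 5 + [1.0, 0.5, 0.25] + [0.0] * 24
delay = gcc_phat_delay(delayed, reference)
```

Because the whitened spectrum has unit magnitude in every bin, the inverse transform concentrates into a single sharp peak, which is the delta-like behaviour described above.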

Direction of arrival (DOA) estimation

It is possible to compute the position of the source through geometrical calculation once TDOA is estimated. We consider a simple model based on far field assumption.

Figure 4.3 Computing source direction from TDOA (diagram: microphone i, microphone j, the source direction, and the angles $\phi$ and $\theta$)

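Consider Figure 4.3, which illustrates the case of a two-microphone array with a source in the far field. Using the cosine law, we state that:

$$\cos\phi = \frac{\vec{u} \cdot \vec{x}_{ij}}{\|\vec{u}\|\, \|\vec{x}_{ij}\|} \tag{4.16}$$

where $\vec{u}$ is a unit vector pointing in the direction of the source, $\vec{x}_{ij}$ is the vector that goes from microphone i to microphone j, and $\phi$ is the angle between them. It can also be stated that:

$$\cos\phi = \sin\theta = \frac{c\, \Delta t}{\|\vec{x}_{ij}\|} \tag{4.17}$$

where c is the velocity of sound and $\Delta t$ is the estimated TDOA.

Taking microphone i as reference, we place the stepper motor at the position of microphone i. The angle of sound signal arrival is given by $(90^\circ - \phi)$, as shown in Figure 4.4. Also, from the figure, $(90^\circ - \phi) = \theta$.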
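Under the far-field model, the arrival angle follows from the TDOA through the relation sin(angle) = c × TDOA / d, where d is the microphone spacing and the angle is measured from the broadside of the microphone pair. The sketch below applies it. The function name is hypothetical, and the speed of sound of 343 m/s and the 0.2 m spacing in the example are assumed values, not figures from this report.

```python
import math


def doa_angle_deg(tdoa_s, mic_spacing_m, speed_of_sound=343.0):
    """Far-field arrival angle in degrees from sin(angle) = c * tdoa / spacing.

    speed_of_sound defaults to an assumed 343 m/s (air at about 20 C).
    """
    ratio = speed_of_sound * tdoa_s / mic_spacing_m
    # Estimation noise can push the ratio slightly outside [-1, 1]; clamp it
    # so that asin stays defined.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


# Hypothetical example: microphones 0.2 m apart, TDOA chosen so the ratio is 0.5.
angle = doa_angle_deg(0.1 / 343.0, 0.2)
```

A TDOA of zero corresponds to a source directly in front of the pair, and the largest usable TDOA is the spacing divided by the speed of sound.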

Figure 4.4 Stepper motor position and estimation of direction of sound source. (diagram: microphone i with the stepper motor at its position, microphone j, the source direction, and the angles $\phi$, $\theta$ and $90^\circ - \phi$)

Fast Fourier Transform:

Frequency analysis of discrete time signals is usually and most suitably performed on a digital signal processor, which may be a specially designed digital hardware or a general purpose digital computer. We convert the time-domain sequence to an equivalent frequency domain representation to perform frequency analysis on a discrete-time signal {x[n]}.

We consider the representation of a sequence {x[n]} by samples of its spectrum X(ω). Such a frequency domain representation leads to the Discrete Fourier Transform (DFT), which is a powerful computational tool for performing frequency analysis of discrete time signals.

In view of the importance of the DFT in various digital signal processing applications, such as correlation analysis, linear filtering and spectrum analysis, its efficient computation is a topic that has received considerable attention from many mathematicians, engineers, and applied scientists[ ].

The number of complex multiplication and addition operations required by the direct forms of both the Discrete Fourier Transform (DFT) and the Inverse Discrete Fourier Transform (IDFT) is of order $N^2$, as there are N data points to calculate, each of which requires N complex arithmetic operations.

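For a length-n input vector x, the DFT is a length-n vector X, with n elements:

$$X_k = \sum_{m=0}^{n-1} x_m\, e^{-2\pi \mathrm{j}\, k m / n}, \qquad k = 0, 1, \ldots, n-1 \tag{4.18}$$

Basically, the DFT computes the sequence {X(k)} of N complex valued numbers given another sequence of data {x(n)} of length N, according to the formula

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{kn}, \qquad 0 \le k \le N-1 \tag{4.19}$$

where $W_N = e^{-\mathrm{j} 2\pi / N}$ and $\mathrm{j}$ is the imaginary unit.

Similarly, the Inverse Discrete Fourier Transform (IDFT) is given by

$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, W_N^{-kn}, \qquad 0 \le n \le N-1 \tag{4.20}$$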

The DFT and IDFT involve essentially the same type of computations. Therefore, efficient computational algorithms for the DFT also apply to the efficient computation of the IDFT.

For each value of k, direct computation of X(k) involves N complex multiplications (4N real multiplications) and N-1 complex additions (4N-2 real additions). Hence computing all N values of the DFT requires $N^2$ complex multiplications and $N^2 - N$ complex additions. The direct method therefore has algorithmic complexity $O(N^2)$ and is not very efficient; the DFT would be of little use in most practical DSP applications if we could not do any better than this.
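The operation count can be checked with a direct implementation. The sketch below evaluates the DFT sum as written, with the phase factor defined as exp(-j2π/N), and counts one complex multiplication per product of a sample with a phase factor. The function name and the test input are hypothetical.

```python
import cmath


def direct_dft(x):
    """Direct DFT: X(k) = sum over n of x(n) * W_N**(k*n), W_N = exp(-j*2*pi/N).

    Returns the spectrum and the number of sample-by-phase-factor products.
    """
    n_points = len(x)
    w_n = cmath.exp(-2j * cmath.pi / n_points)
    spectrum, multiplications = [], 0
    for k in range(n_points):
        acc = 0j
        for n in range(n_points):
            acc += x[n] * w_n ** (k * n)
            multiplications += 1
        spectrum.append(acc)
    return spectrum, multiplications


# A unit impulse has a flat spectrum: every X(k) equals 1.
spectrum, count = direct_dft([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```

For the 8-point input the counter reaches 64, that is, N squared.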

However, there are a number of different ‘Fast Fourier Transform’ (FFT) algorithms that enable the Fourier transform of a signal to be calculated much faster than by direct evaluation of the DFT. The FFT exploits the symmetry and periodicity properties of the phase factor $W_N$. In particular, these two properties are:

Symmetry property:

$$W_N^{k + N/2} = -W_N^{k} \tag{4.21}$$

Periodicity property:

$$W_N^{k + N} = W_N^{k} \tag{4.22}$$

As the name suggests, FFTs are algorithms for quick calculation of the discrete Fourier transform of a data vector. The FFT is a DFT algorithm which reduces the number of computations needed for N points from $O(N^2)$ to $O(N \log_2 N)$. The response of an FFT looks like a ‘sinc’ function, $(\sin x)/x$, if the function to be transformed is not harmonically related to the sampling frequency.

Radix 2 algorithm:

It is one of the most commonly used FFT algorithms. The ‘Radix 2’ algorithms are useful if N is a regular power of 2 ($N = 2^p$). If we assume that algorithmic complexity provides a direct measure of execution time and that the relevant logarithm base is 2, then, as shown in Table 4.1, the ratio of execution times for the DFT vs. the Radix 2 FFT (denoted as ‘Speed Improvement Factor’) increases tremendously with increase in N. [fft.pdf]

| Number of points, N | Complex multiplications in direct computation, $N^2$ | Complex multiplications in FFT algorithm, $(N/2)\log_2 N$ | Speed improvement factor |
|---|---|---|---|
| 4 | 16 | 4 | 4.0 |
| 8 | 64 | 12 | 5.3 |
| 16 | 256 | 32 | 8.0 |
| 32 | 1024 | 80 | 12.8 |
| 64 | 4096 | 192 | 21.3 |
| 128 | 16384 | 448 | 36.6 |
| 256 | 65536 | 1024 | 64.0 |
| 512 | 262144 | 2304 | 113.8 |
| 1024 | 1048576 | 5120 | 204.8 |

Table 4.1 Comparison of execution times, DFT and Radix-2 FFT
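The entries of Table 4.1 follow directly from the two operation counts, N squared for direct computation and (N/2)·log2(N) for the radix-2 FFT. A short sketch that regenerates them (the function name is hypothetical):

```python
import math


def speed_improvement(n_points):
    """Return (direct multiplications, FFT multiplications, ratio) for N points."""
    direct = n_points ** 2
    fft = (n_points // 2) * int(math.log2(n_points))
    return direct, fft, direct / fft


rows = [(n, *speed_improvement(n)) for n in (4, 8, 16, 32, 64, 128, 256, 512, 1024)]
for n, direct, fft, factor in rows:
    print(f"{n:5d} {direct:8d} {fft:5d} {factor:6.1f}")
```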

There are two different radix 2 algorithms, the “Decimation in Time” (DIT) and the “Decimation in Frequency” (DIF) algorithms. They both rely on the recursive decomposition of an N-point transform into two (N/2)-point transforms. This decomposition process can be applied to any composite (non-prime) N. The method is particularly simple if N is divisible by 2, and if N is a regular power of 2 the decomposition can be applied repeatedly until the trivial ‘1 point’ transform is reached.

The divide-and-conquer method is used to obtain the radix-2 decimation-in-frequency FFT. Figure 4.5 shows the first stage of the 8-point DIF algorithm. The decimation causes shuffling of the data.

Figure 4.5 First stage of 8 point Decimation in Frequency Algorithm (signal-flow graph: inputs x[0] to x[7], intermediate sequences g[0] to g[3] and h[0] to h[3] with twiddle factors $W_N^0$ to $W_N^3$, two N/2-point DFTs, outputs X[0] to X[7])

The entire process involves $v = \log_2 N$ stages of decimation, where each stage involves N/2 butterflies of the type shown in Figure 4.6. Here $W_N^r$ is the twiddle factor.

Figure 4.6 Butterfly scheme (inputs a and b; outputs $A = a + b$ and $B = (a - b)\, W_N^r$)

Consequently, the computation of the N-point DFT via this algorithm requires $(N/2)\log_2 N$ complex multiplications. For illustrative purposes, the eight-point decimation-in-frequency algorithm is shown in Figure 4.7. We observe that the output sequence occurs in bit-reversed order with respect to the input.
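The stage structure, the butterfly and the final bit reversal can be made concrete with a small software model. This is a sketch for illustration, not the FPGA FFT core used in the project: the function name is hypothetical, and each butterfly computes a + b on the upper branch and (a - b) times a twiddle factor on the lower branch, as in the butterfly scheme.

```python
import cmath


def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT.

    Returns (spectrum in natural order, number of butterflies computed).
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    data = [complex(v) for v in x]
    butterflies = 0
    span = n  # size of the sub-transform handled at the current stage
    while span > 1:
        half = span // 2
        for start in range(0, n, span):
            for r in range(half):
                a = data[start + r]
                b = data[start + r + half]
                twiddle = cmath.exp(-2j * cmath.pi * r / span)
                data[start + r] = a + b
                data[start + r + half] = (a - b) * twiddle
                butterflies += 1
        span = half
    # The in-place result is in bit-reversed order; undo the shuffle.
    bits = n.bit_length() - 1
    spectrum = [0j] * n
    for index in range(n):
        reversed_index = int(format(index, "b").zfill(bits)[::-1], 2) if bits else 0
        spectrum[reversed_index] = data[index]
    return spectrum, butterflies


spectrum, butterflies = fft_dif([1, 2, 3, 4, 5, 6, 7, 8])
```

For N = 8 the loop runs three stages of four butterflies each, so `butterflies` is 12, matching (N/2)·log2(N).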

Figure 4.7 8 point decimation in frequency algorithm (signal-flow graph: inputs x[0] to x[7], three butterfly stages with twiddle factors $W_N^0$ to $W_N^3$, outputs X[0] to X[7])

Matlab Simulation

Figure 4.8 Simulink model for verification of TDOA algorithm

A Simulink model is used to verify the algorithm for calculating the time difference of arrival (TDOA). Samples of an audio signal are read from a wave file “hello_8000.wav” at 8 kHz. Windowing is applied to take a fixed number of samples of the signal at a time for further processing in order to estimate the TDOA. The audio signal is taken through two channels, where each channel stands in for the signal from one microphone, with a certain number of delay samples added to one channel. From Figure 4.8 we can observe that a delay of 25 samples has been used.

Noise has been added externally using the Gaussian Noise Generator block to check the resistance of the system against ambient noise. In tests with different noise conditions added to the two channels, the system showed good resistance to noise.

Figure 4.9 Plot of samples of time delay as viewed in scope

The plot of the delay in samples obtained over a simulation time of 20 sec is shown in Figure 4.9. The estimate settles on a straight line at a delay of 25 samples. There are occasional variations in between, with the estimated delay shooting up to a maximum of 128 samples, but the most frequently obtained value is 25 delay samples.

Figure 4.10 Plot of time delay (secs) as viewed in scope

Corresponding to the delay samples, we then calculate the time delay in seconds between the signals in the two channels. For 25 delay samples, we obtain the time delay as 0.003125 sec. The plot for the time delay obtained is shown in figure 4.10.
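One simple way to turn the per-frame estimates into a single delay is to take the most frequent value, which discards occasional outliers such as the spikes towards 128 samples, and then divide by the sampling rate. The sketch below illustrates this. The report does not state how the Simulink model selects the final value, so the mode-based selection, the function names and the list of per-frame estimates are assumptions.

```python
from collections import Counter


def most_frequent_delay(frame_delays):
    """Return the delay (in samples) that occurs most often across frames."""
    return Counter(frame_delays).most_common(1)[0][0]


def samples_to_seconds(delay_samples, sample_rate_hz):
    """Convert a delay in samples to seconds."""
    return delay_samples / sample_rate_hz


# Hypothetical per-frame estimates: mostly 25 samples, with a few outliers.
frame_delays = [25, 25, 128, 25, 25, 25, 97, 25]
delay = most_frequent_delay(frame_delays)
delay_s = samples_to_seconds(delay, 8000)  # 25 samples at 8 kHz
```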

Figure 4.11 Stepper motor drive model

Figure 4.11 shows a Simulink model of a stepper motor drive using a two-phase hybrid stepper motor. The motor phases are fed by two H-bridge MOSFET PWM converters. A 28 V DC voltage source provides the voltage required for the driver module. The movement of the stepper motor drive is controlled by the STEP and DIR signals generated by the Signal Builder block. Square-wave current references are generated using the current amplitude and the step frequency parameters specified in the dialog window of the Signal Builder block.

The STEP signal from the Signal Builder block controls the movement of the stepper motor. A positive value of the STEP signal rotates the motor, whereas a zero value stops the rotation. The DIR signal controls the direction in which the motor rotates. A positive value of DIR rotates the motor in one direction, while a negative value imposes the reverse direction.

Consider simulation time of 0.25 sec. The STEP and DIR input is shown in figure 4.12. The resulting waveforms are given in figure 4.13. The stepper motor rotates in positive direction for 0.1 sec, stops for 0.05 sec and then rotates in the opposite direction for 0.05 sec and then stops.

Figure 4.12 Signal Builder for generating the STEP and DIR signals

Figure 4.13 Waveforms as viewed in the scope

STEPPER MOTOR

A stepper motor is an electromechanical device which converts electrical pulses into discrete mechanical movements. The shaft or spindle of a stepper motor rotates in discrete step increments when electrical command pulses are applied to it in the proper sequence. The motor's rotation has several direct relationships to these applied input pulses. The sequence of the applied pulses is directly related to the direction of the motor shaft's rotation. The speed of the motor shaft's rotation is directly related to the frequency of the input pulses, and the length of rotation is directly related to the number of input pulses applied. [motorbas.pdf]

Figure 3.13 Stepper motor

Whenever controlled movement is required, a stepper motor can be a good choice. Stepper motors are effective in applications where you need to control rotation angle, position, speed and synchronism, and they have found their place in applications such as printers, plotters, medical equipment, hard disk drives, automotive systems and many more. One of the most important advantages of a stepper motor is that it can be accurately controlled in an open loop system. Open loop control means that no feedback information about position is needed; the position is known simply by keeping track of the input step pulses. Taking these advantages into account, we have used a stepper motor for pointing in the direction of a sound source.
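Open loop positioning amounts to converting the required change in angle into a number of step pulses and a direction. The sketch below shows the arithmetic. The function name is hypothetical and the step angle of 0.9 degrees (a 1.8 degree motor driven in half steps) is an assumed value, since the report does not give the step angle of the motor used.

```python
def steps_for_angle(target_deg, current_deg, step_angle_deg=0.9):
    """Return (number_of_steps, direction) to move from current to target angle.

    step_angle_deg is an assumed value; direction is +1 or -1.
    """
    delta = target_deg - current_deg
    steps = round(abs(delta) / step_angle_deg)
    direction = 1 if delta >= 0 else -1
    return steps, direction


# Hypothetical example: turn from 0 degrees to a source located at 45 degrees.
steps, direction = steps_for_angle(45.0, 0.0)
```

After issuing the pulses, the controller updates its stored position by direction × steps × step angle, which is all the position tracking that open loop control requires.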

A magnetic flux is developed in the stator when a phase winding of a stepper motor is energized with current. The direction of this flux is determined by the right hand rule. In the figure, when phase B is energized with winding current in the direction shown, the rotor aligns itself so as to minimize the flux opposition. In this case the motor would rotate clockwise so that its south pole aligns with the north pole of stator B at position 2 and its north pole aligns with the south pole of stator B at position 6. Hence, in order to rotate the motor, we must energize the stator windings in a sequence that produces a rotating magnetic flux, which the rotor follows due to magnetic attraction. [motorbas.pdf]

Figure 3.14 Stator and rotor of stepper motor showing the current direction

We have used the stepper motor in the “half step drive” stepping mode. In this mode, only one phase is energized on every second step, and during the other steps one phase on each stator is energized. The stator is energized according to the sequence:

and the rotor steps from position:

Table: Excitation sequences for different drive modes
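For illustration, a commonly used half-step excitation pattern for a motor driven through four coil switches can be written as an eight-entry lookup table in which single-coil and two-coil states alternate. The coil ordering, the table contents and the function name below are assumptions made for the sketch; they are not taken from the report's excitation table.

```python
# Assumed coil order: (coil 1, coil 2, coil 3, coil 4); 1 = energized.
HALF_STEP_SEQUENCE = [
    (1, 0, 0, 0),
    (1, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 1, 0),
    (0, 0, 1, 0),
    (0, 0, 1, 1),
    (0, 0, 0, 1),
    (1, 0, 0, 1),
]


def coil_pattern(step_index, direction=1):
    """Return the coil states for a given step count; direction is +1 or -1."""
    return HALF_STEP_SEQUENCE[(direction * step_index) % len(HALF_STEP_SEQUENCE)]


# Walking the index forwards rotates one way, backwards the other way.
forward = [coil_pattern(i) for i in range(3)]
backward = [coil_pattern(i, direction=-1) for i in range(3)]
```

Reversing the order in which the table is traversed reverses the direction of rotation, which is how a direction input maps onto the coil outputs.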

Stepper motor driver circuit:

Figure. Driver Circuit for stepper motor coil (schematic: Q1, Q2, R1 to R4, D1 to D6, B1, GND1, GND2 and the stepper coil)

Components:

Resistors:

R1 = 330 Ω

R2 = 2.2 kΩ

R3 = 2.2 kΩ

R4 = 2.2 kΩ

Q1 is optoisolator, PC 817

Diodes D1, D4 are 1N 4007

Diodes D2, D3 are 15 volt zener diode

Diodes D5 and D6 are LEDs

Q2 is IRF540N n-channel MOSFET

B1 is 12 V supply for motor

GND1 is FPGA ground

GND2 is motor supply ground

Operation:

The MOSFET generally remains off due to the pull down of the gate voltage by the series connection of the resistor R3 and LED D5. When the input from FPGA is of high logic the diode of the optoisolator is turned on biasing the transistor, increasing the gate voltage which turns the MOSFET on. This way the winding of the stepper motor is energized. When the input from the FPGA is of low logic, the MOSFET is turned off, which de-energizes the winding.

The MOSFET is highly sensitive to high voltage and gets damaged if the gate-to-source voltage exceeds 20 V. Hence, a 15 V zener diode is used for protection. Protection against high reverse voltage is obtained by using diodes D1 and D4.

MULTIPLIER


The Multiplier core can be configured in either parallel architecture or constant-coefficient architecture. In parallel architecture, the multiplier accepts inputs on buses A and B and generates the product of these two values. While in constant-coefficient architecture, the multiplier accepts the data on the A input bus and multiplies it by a user defined constant value.

The multiplier core generates fixed point parallel multipliers and constant coefficient multipliers for two’s complement signed or unsigned data. It supports input ranging from 1 to 64 bits wide and outputs ranging from 1 to 128 bits wide.

Input signal:

A[17:0] : A operand input bus, 18 bits wide

B[17:0] : B operand input bus, 18 bits wide (parallel multipliers only)

CLK: Rising edge clock input

Output signal:

P[35:0]: Product output – bit 35 downto bit 0
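The bus widths can be checked with a small software model: two 18-bit operands produce a product that always fits in 36 bits. The sketch below assumes the two's complement signed configuration (the core also supports unsigned data); the function names are hypothetical and this is a behavioural model, not the core's interface.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def multiply_18x18(a_bus, b_bus):
    """Return the 36-bit product bus P[35:0] for signed 18-bit inputs A and B."""
    product = to_signed(a_bus, 18) * to_signed(b_bus, 18)
    return product & ((1 << 36) - 1)


# -1 on an 18-bit bus is 0x3FFFF; (-1) * 2 = -2 on the 36-bit product bus.
p = multiply_18x18(0x3FFFF, 2)
```

The largest magnitude, obtained by multiplying the most negative 18-bit value by itself, is 2 to the power 34, which still fits in the 36-bit signed output.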

6. EPILOGUE

6.1 System Required

To deploy any system, both supporting hardware and software must be good enough to make the system work properly. The major requirements in our system are mainly FPGA, microphones, audio amplifiers, stepper motor, MOSFETs and the necessary software for VHDL coding and simulation.

Hardware:

Microphones

Audio amplifiers

Connectors and cables

Proto board

Resistors, capacitors

MOSFETs

Power supply

Software:

Xilinx ISE

Modelsim

Matlab

Proteus

Windows XP, Vista

COST INCURRED

The costs incurred during the project are presented here.

Input section:

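| S.No. | Component | Quantity | Rate (NRs) | Cost (NRs) |
|---|---|---|---|---|
| 1 | Microphone | 2 | 10 | 20 |
| 2 | ADC (0808 CCN) | 1 | 450 | 450 |
| 3 | Buffer (HEF4050BP) | 2 | 50 | 100 |
| 4 | Audio Amplifier (LM 386) | 2 | 25 | 50 |
| 5 | Resistor | 1 pack | 50 | 50 |
| 6 | Capacitor | 1 pack | 200 | 200 |
| 7 | Miscellaneous | | | 400 |
| | Total | | | 1,270 |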

Output section:

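| S.No. | Component | Quantity | Rate (NRs) | Cost (NRs) |
|---|---|---|---|---|
| 1 | Stepper motor | 1 | 500 | 500 |
| 2 | 12 V battery | 1 | 1000 | 1,100 |
| 3 | Optoisolator (PC 817) | 4 | 25 | 100 |
| 4 | MOSFET (IRF540N) | 4 | 45 | 180 |
| 5 | 15 V zener diode | 8 | 5 | 40 |
| 6 | Resistor | 1 pack | 50 | 50 |
| 7 | Diode | 20 | 1 | 20 |
| 8 | Miscellaneous | | | 250 |
| | Total | | | 2,240 |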

TOTAL COST (NRs) = 3,510

Note:

Spartan 3E starter kit was provided by the Department of Electronics and Computer Engineering.

Communication and transportation cost has not been included.
