Simon Grimm examines new multi-microphone signal processing strategies that aim to achieve noise reduction and dereverberation. To this end, narrow-band signal enhancement approaches are combined with broad-band processing in the form of directivity-based beamforming. Previously introduced formulations of the multichannel Wiener filter rely on the second-order statistics of the speech and noise signals. The author analyses how additional knowledge about the location of a speaker and the microphone arrangement can be used to achieve further noise reduction and dereverberation.

This work studies a wind noise reduction approach for communication applications in a car environment. An endfire array consisting of two microphones is considered as a substitute for an ordinary cardioid microphone capsule of the same size. Using the decomposition of the multichannel Wiener filter (MWF), a suitable beamformer and a single-channel post filter are derived. Because the array geometry and the location of the speech source are known, assumptions about the signal properties can be made to simplify the MWF beamformer and to estimate the speech and noise power spectral densities required for the post filter. Even for closely spaced microphones, the different signal properties at the microphones can be exploited to achieve a significant reduction of wind noise. The proposed beamformer improves the signal-to-noise ratio of the speech signal while keeping the linear speech distortion low. The derived post filter performs on par with known approaches but reduces the effort for noise estimation.
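The MWF decomposition mentioned above can be checked numerically. The following is a minimal sketch, not the simplified beamformer derived in the paper: it assumes a rank-one speech model with a known steering vector `a` (normalized so that `a[0] == 1` for the reference microphone), a known speech power spectral density `phi_s`, and a known noise covariance matrix `Phi_n` for a single frequency bin. All function and variable names are illustrative.

```python
import numpy as np

def mwf_decomposition(a, phi_s, Phi_n):
    """Split the rank-one MWF into an MVDR beamformer and a scalar post filter.

    a     : steering vector, shape (M,), assumed normalized so a[0] == 1
    phi_s : speech PSD at the reference microphone (real, > 0)
    Phi_n : Hermitian noise covariance matrix, shape (M, M)
    """
    Phi_n_inv_a = np.linalg.solve(Phi_n, a)       # Phi_n^{-1} a
    denom = np.real(np.vdot(a, Phi_n_inv_a))      # a^H Phi_n^{-1} a
    w_mvdr = Phi_n_inv_a / denom                  # distortionless beamformer
    phi_n_out = 1.0 / denom                       # residual noise PSD after MVDR
    post_gain = phi_s / (phi_s + phi_n_out)       # single-channel Wiener gain
    return w_mvdr, post_gain

def mwf_direct(a, phi_s, Phi_n):
    """Direct MWF solution (Phi_s + Phi_n)^{-1} Phi_s e_ref for comparison."""
    Phi_s = phi_s * np.outer(a, a.conj())         # rank-one speech covariance
    e_ref = np.zeros(len(a))
    e_ref[0] = 1.0                                # select reference microphone 0
    return np.linalg.solve(Phi_s + Phi_n, Phi_s @ e_ref)
```

Under these assumptions, `post_gain * w_mvdr` equals the direct solution, which is why the post filter only needs the speech and noise power spectral densities at the beamformer output.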

In this paper, we propose a method to determine the active speaker for each time-frequency point in the noisy signals of a microphone array. The detection is based on a statistical model in which the speech and noise signals are assumed to be multivariate Gaussian random variables in the Fourier domain. Based on this model, we derive a maximum-likelihood detector for the active speaker. The decision is based on the a posteriori signal-to-noise ratio (SNR) of a speaker-dependent max-SNR beamformer.
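The decision rule described above can be illustrated as follows. This is a rough sketch under stated assumptions and does not reproduce the paper's derivation: each speaker is assumed to have a known steering vector, the noise covariance `Phi_n` is assumed known, and the max-SNR beamformer is taken in its rank-one form, proportional to `Phi_n^{-1} a`. The function name and arguments are hypothetical.

```python
import numpy as np

def detect_active_speaker(x, steering_vectors, Phi_n):
    """Pick the speaker whose beamformer yields the largest a posteriori SNR.

    x                : microphone signals at one time-frequency point, shape (M,)
    steering_vectors : list of assumed steering vectors, one per speaker
    Phi_n            : Hermitian noise covariance matrix, shape (M, M)
    """
    snrs = []
    for a in steering_vectors:
        w = np.linalg.solve(Phi_n, a)                  # max-SNR weights (rank-one case)
        out_power = abs(np.vdot(w, x)) ** 2            # |w^H x|^2
        noise_power = np.real(np.vdot(w, Phi_n @ w))   # w^H Phi_n w
        snrs.append(out_power / noise_power)
    return int(np.argmax(snrs)), snrs
```

The function returns the index of the detected speaker together with the per-speaker SNR values, so a caller could also apply a threshold to reject points where no speaker is active.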

This paper studies suitable models for the identification of nonlinear acoustic systems. A cascaded structure of nonlinear filters is proposed that contains several parallel branches, each consisting of a polynomial function followed by a linear filter, one branch per order of nonlinearity. The second order of nonlinearity is additionally modelled by a parallel branch containing a Volterra filter. These branches are followed by a long linear FIR filter that is able to model the room acoustics. The model is applied to the identification of a tube power amplifier feeding a guitar loudspeaker cabinet in an acoustic room. The adaptive identification is performed by the normalized least mean square (NLMS) algorithm. Compared with a generalized polynomial Hammerstein (GPH) model, the proposed structure improves the modelling accuracy for this real-world system to a greater extent than increasing the order of nonlinearity in the GPH model does.
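To make the identification procedure more concrete, here is a minimal sketch of NLMS adaptation for a small generalized polynomial Hammerstein model, that is, the baseline named above rather than the proposed cascaded structure with a Volterra branch and a long room filter. The function name, the branch orders, the filter length, and the step size `mu` are illustrative assumptions.

```python
import numpy as np

def nlms_hammerstein(x, d, orders=(1, 2, 3), taps=4, mu=0.5, eps=1e-8):
    """Identify parallel branches x**p -> FIR filter with the NLMS algorithm.

    x : input signal, shape (N,)
    d : desired (measured) output signal, shape (N,)
    Returns the branch filter coefficients, shape (len(orders), taps),
    and the a priori error signal, shape (N,).
    """
    branches = np.stack([x ** p for p in orders])   # polynomial branch inputs
    w = np.zeros((len(orders), taps))               # one FIR filter per branch
    err = np.zeros(len(x))
    for n in range(taps - 1, len(x)):
        # Most recent sample first: u[:, 0] is x[n]**p, u[:, 1] is x[n-1]**p, ...
        u = branches[:, n - taps + 1:n + 1][:, ::-1]
        y = np.sum(w * u)                           # model output
        err[n] = d[n] - y                           # a priori error
        w += mu * err[n] * u / (np.sum(u * u) + eps)  # normalized update
    return w, err
```

Because the model is linear in its coefficients, all branch filters can be stacked into one regressor and adapted jointly, which is what the single update line does.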

The multichannel Wiener filter (MWF) is a well-established noise reduction technique for speech processing. Most commonly, the speech component in a selected reference microphone is estimated. The choice of this reference microphone influences the broadband output signal-to-noise ratio (SNR) as well as the speech distortion. Recently, a generalized formulation of the MWF (G-MWF) was proposed that uses a weighted sum of the individual transfer functions from the speaker to the microphones to form a better speech reference, resulting in an improved broadband output SNR. For the MWF, the influence of the phase reference is often neglected because it has no impact on the narrow-band output SNR. The G-MWF allows an arbitrary choice of the phase reference, especially in the context of spatially distributed microphones.
In this work, we demonstrate that the phase reference determines the overall transfer function and hence has an impact on both the speech distortion and the broadband output SNR. We propose two speech references that achieve a better signal-to-reverberation ratio (SRR) and an improvement in the broadband output SNR. Both proposed references are based on the phase of a delay-and-sum beamformer. Hence, the time-difference-of-arrival (TDOA) of the speech source is required to align the signals. The different techniques are compared in terms of SRR and SNR performance.
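The role of the TDOA in aligning the signals can be sketched with a frequency-domain delay-and-sum beamformer. This is only an illustration of the alignment step, assuming a pure-delay propagation model and known TDOAs relative to a reference microphone; it is not the paper's construction of the two speech references. The function name and argument layout are hypothetical.

```python
import numpy as np

def delay_and_sum(X, tdoas, freqs):
    """Align microphone spectra by their TDOAs and average them.

    X     : microphone spectra, shape (M, F)
    tdoas : delay of each microphone relative to the reference, in seconds, shape (M,)
    freqs : frequency of each bin in Hz, shape (F,)
    """
    # Assumed pure-delay model: microphone m sees S(f) * exp(-j 2 pi f tau_m)
    steering = np.exp(-2j * np.pi * np.outer(tdoas, freqs))
    # Compensate the delays, then average over the microphones
    return np.mean(np.conj(steering) * X, axis=0)
```

For signals that follow the pure-delay model exactly, the output equals the source spectrum, so its phase can serve as a common phase reference across microphones.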