Acoustic Echo Cancellation Software Windows
Acoustic Echo Cancellation Software and Noise Reduction Solutions. SoliCall is a leading provider of software products focused on improving sound quality in VoIP. SoliCall specializes in acoustic echo cancellation, noise reduction, and noise suppression, with the added value of speaker identification. In addition to improving subjective quality, removing echo increases the capacity achieved through silence suppression by preventing echo from traveling across a network. These methods are commonly called acoustic echo suppression (AES) and acoustic echo cancellation (AEC), and more rarely line echo cancellation.
I used to have a motherboard with Realtek HD Audio drivers. Although the driver and bundled software were clunky, annoying, and redundant, it surprisingly had features for noise suppression and echo cancellation when using microphones. I had been taking this for granted, as I work in data centers throughout the week and need to make calls over Skype/Google. Unfortunately, the receiving end hears the blasting noise of air conditioners in the background. I later switched to a computer that uses VIA drivers, also with a bundle of clunky software.
Unfortunately, they don't have any of these features. So I've been looking hard, but can't find any software that can perform the noise cancellation. It seems like it would need to hook in the driver level, but maybe this can be a generic filter that is used by applications that need to interface with the microphone. Any information would be useful.
There are basically two scenarios here, noise reduction and echo cancellation, which are very different beasts. I am more familiar with echo cancellation (AEC). To do it properly, it needs to be done at the lowest level possible, as close to the captured mic input and the playback hardware as possible. The device driver would be the ideal place to put it, but few Windows drivers support it, probably because most good AEC algorithms cost money to license.
It can be done at the program/application level if you can access the buffer immediately after it has been recorded, along with the buffer that has just been played. This is usually done in the application, and each application does it slightly differently, if at all. So the answer for echo cancellation is no: I do not believe there is generic software or a filter that can do AEC. Noise suppression is a different issue. It requires applying some well-understood algorithms to the recorded microphone data before passing it on to the application, so in this case I think it would be possible to apply a filter to the mic, and generic application-level software (or drivers) should be available. Unfortunately, I am not aware of any.
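To illustrate why AEC needs both the buffer that was just played and the buffer that was just recorded, here is a minimal sketch of a normalized LMS (NLMS) adaptive filter, one textbook approach to echo cancellation. It is not what any particular driver or application does. The function name `nlms_echo_cancel`, the three-tap `room` response, and the step size are all assumptions chosen for the demonstration.

```python
import math
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the echo of far_end from mic."""
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate                      # what is left after cancellation
        norm = eps + sum(h * h for h in history)   # normalizes the step size
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual, weights


def energy(samples):
    return sum(s * s for s in samples)


# Simulated playback signal and a made-up echo path (hypothetical values).
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
room = [0.6, 0.3, -0.1]
mic = [
    sum(room[k] * far_end[n - k] for k in range(len(room)) if n - k >= 0)
    for n in range(len(far_end))
]

residual, weights = nlms_echo_cancel(far_end, mic)

# Echo return loss enhancement over the last 200 samples, in dB.
erle_db = 10 * math.log10(energy(mic[-200:]) / (energy(residual[-200:]) + 1e-30))
```

The filter can only learn the echo path because it sees the played samples time-aligned with the recorded ones, which is why the answer above stresses getting as close to the capture and playback hardware as possible.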
Device.Audio.Acoustics.MicDistortion

The mic distortion and noise limit is important to meet for a couple of reasons:
• Ensure the voice is relatively undistorted before entering the speech recognizer
• Keep non-linearities on the echo path minimal for good echo cancellation performance

The distortion is recommended to be measured using SDNR (pulsed noise signal-to-distortion-and-noise ratio), although THD targets are also given. More information about the SDNR test method can be found in IEEE 269-2010 Annex L.

                 Premium              Standard
Frequency (Hz)   THD      SDNR (dB)   THD      SDNR (dB)
250              2.50%    >=32        3.20%    >=30
1000             2.50%    >=32        3.20%    >=30
4000             2.50%    >=32        3.20%    >=30
5000             4.00%    >=28        4.00%    >=28
6000             6.30%    >=24        6.30%    >=24

Note: This requirement extends up to ½ the effective bandwidth, at which point the first harmonic is beyond the Nyquist frequency.

Device.Audio.Acoustics.MicBandwidth

The sample rate of the capture signal is the primary factor in determining the effective bandwidth of the speech signal. As the speech platform uses 16 kHz acoustic models in the speech recognizer, a 16 kHz minimum sample rate is recommended. 300 Hz is the effective lower end of the speech recognizer; however, 200 Hz is the recommended acoustical limit for devices also targeting voice communications. Filtering can also alter the effective bandwidth of the device, such as an analog FIR lowpass filter in the ADC, a digital band pass filter at a later stage in the pipeline, or even attenuation due to the response of the microphone element or electrical system. These factors should be considered during design. The speech platform utilizes 8 kHz acoustic models only to provide support for legacy Bluetooth audio devices.

Device.Audio.Acoustics.RenderDistortion

The loudspeaker distortion limit is important to meet for the following reason:
• Keep non-linearities on the echo path minimal for good echo cancellation performance

The distortion is recommended to be measured using SDNR (pulsed noise signal-to-distortion-and-noise ratio), although THD targets are also given. More information about the SDNR test method can be found in IEEE 269-2010 Annex L.

                 Premium                            Standard
                 -22 dBFS         -16 dBFS          -22 dBFS         -16 dBFS
Frequency (Hz)   THD    SDNR      THD    SDNR       THD    SDNR      THD    SDNR
300              6.3%   >=24      6.3%   >=24       N/A    N/A       N/A    N/A
500              6.3%   >=24      6.3%   >=24       N/A    N/A       N/A    N/A
600              5%     >=26      5%     >=26       10%    >=20      10%    >=20
800              5%     >=26      5%     >=26       8%     >=22      8%     >=22
1000             4%     >=28      5%     >=26       6.3%   >=24      6.3%   >=24
1500             4%     >=28      5%     >=26       5%     >=26      6.3%   >=24
3000             4%     >=28      5%     >=26       5%     >=26      6.3%   >=24
4000             5%     >=26      5%     >=26       5%     >=26      6.3%   >=24
5000             5%     >=26      5%     >=26       6.3%   >=24      6.3%   >=24
6000             5%     >=26      5%     >=26       6.3%   >=24      6.3%   >=24

Note: Only applies to devices with built-in loudspeakers.

Device.Audio.Acoustics.RenderPlacement

To enable the acoustic echo canceller to work well, the device speakers should be placed at a maximum distance from the microphones, or directivity nulls should be placed towards the loudspeakers.

Appendix A Calculations

Jitter and Drift

Jitter

We define jitter as the absolute range of observed samples (or reported timestamps) about the nominal sample (or timestamp).
For example, in the case of a normal distribution of samples about the nominal sample, the absolute jitter is defined as the following:

(equation not preserved in source)

Drift

We define drift as the percent difference between the nominal clock rate and the actual clock rate over a period of time sufficient to observe the drift.

Ambient Noise Gain

The isotropic ambient noise gain for a given frequency is the volume of the microphone array beam:

(equation not preserved in source)

Where:
• V is the microphone array work volume, that is, the set of all coordinates (direction, elevation, distance).
• (symbol not preserved) is the microphone array beam directivity pattern, that is, the gain as a function of the frequency and incident angle. An example for one frequency is shown in Figure 1. An example in one plane is shown in Figure 2.

The total ambient noise gain NG in decibels is given by:

(equation not preserved in source)

Where:
• (symbol not preserved) is the noise spectrum.
• (symbol not preserved) is the preamplifier frequency response (ideally flat between 200 and 7,000 Hz, with falling slopes from both sides going to zero at 80 and 7,500 Hz respectively).
• (symbol not preserved) is the sampling rate (typically 16 kHz for voice applications).

Ambient noise gain gives the proportion of the noise floor RMS in relation to the output of the microphone array and to the output of an omnidirectional microphone. A lower value is better, and 0 dB means that the microphone array does not suppress ambient noise at all.

A-Weighted Ambient Noise Gain

Because humans hear different frequencies differently, many acoustic parameters are weighted by using a standardized A-weighting function. The A-weighted total ambient noise gain NGA in decibels is given by:

(equation not preserved in source)

Where (symbol not preserved) is the standard A-weighting function; other parameters are the same as above.

A-weighted ambient noise gain gives the proportion of the noise floor in relation to the output of the microphone array and to the output of an omnidirectional microphone as they would be compared by a human. In this case, -6 dB NGA means that a human would say that the noise on the output of a microphone array is half that of an omnidirectional microphone.

Directivity Index

Another parameter to characterize the beamformer is the directivity index, DI. In considering the formulas for calculating DI, note that cos θ is used when θ is defined to be -π/2 and π/2 at the poles, and 0 at the equator. These limits match the definitions of φ and θ in Appendix B of 'How to Build and Use Microphone Arrays for Windows Vista,' a companion document. They also match the definitions for wHorizontalAngle (φ) and wVerticalAngle (θ) in the kernel streaming interface definitions.

The calculation uses three quantities, whose formulas are not preserved in the source:
• The power function for a given frequency f and direction (φ, θ), with a fixed radius.
• The average power over all directions (the whole sphere).
• The power in the 'best' direction, called the Main Response Axis (MRA).

Dividing the power in the 'best' direction by the average power gives an indication of directionality for a particular frequency. Averaging this ratio over all frequencies gives the Directivity Index.

The directivity index characterizes how well the microphone array detects sound in the direction of the MRA while suppressing sounds that come from other directions, such as additional sound sources and reverberation. The DI is measured in decibels, where 0 dB means no directivity at all (omnidirectional microphone). A higher number means better directivity. An ideal cardioid microphone should have a DI of 4.8 dB, but in practice cardioid microphones have a DI below 4.5 dB.
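The single-frequency ratio described above can be checked numerically. The following is a sketch under stated assumptions, not the formula from the original document. It assumes the average power is the sphere integral of the power function weighted by cos θ and divided by 4π, that the MRA is at φ = 0, θ = 0, and that an ideal cardioid has amplitude (1 + cos α)/2, where α is the angle from the MRA. The function names are hypothetical.

```python
import math


def cardioid_power(phi, theta):
    """Power response of an ideal cardioid aimed along phi = 0, theta = 0."""
    cos_alpha = math.cos(theta) * math.cos(phi)  # cosine of the angle from the MRA
    amplitude = 0.5 * (1.0 + cos_alpha)
    return amplitude * amplitude


def omni_power(phi, theta):
    """Power response of an omnidirectional microphone."""
    return 1.0


def directivity_index_db(power, mra=(0.0, 0.0), n_phi=360, n_theta=180):
    """Single-frequency DI: power on the MRA over the sphere-averaged power."""
    d_phi = 2.0 * math.pi / n_phi
    d_theta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = -math.pi / 2 + (i + 0.5) * d_theta   # midpoint rule in elevation
        weight = math.cos(theta) * d_theta * d_phi   # area element on the unit sphere
        for j in range(n_phi):
            phi = -math.pi + (j + 0.5) * d_phi       # midpoint rule in azimuth
            total += power(phi, theta) * weight
    average = total / (4.0 * math.pi)
    return 10.0 * math.log10(power(*mra) / average)


cardioid_di = directivity_index_db(cardioid_power)
omni_di = directivity_index_db(omni_power)
```

Under these assumptions the cardioid's average power works out to one third of its on-axis power, so the DI is 10·log10(3), about 4.77 dB. That is consistent with the 4.8 dB quoted for an ideal cardioid, and the omnidirectional case comes out at 0 dB.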
Appendix B References

• Useful definitions and metrics: Microphone Array Support in Windows
• ITU-T P.10 Reference Terms
• 3GPP Terminal acoustic characteristics
• ETSI UMTS Speech telephony terminal acoustic test specification (3GPP TS 26.132 version 11.4.0 Release 11)
• ETSI EG 202 396-1

Appendix C Mic Array Geometry

This section describes how to develop a suitable microphone array geometry descriptor, with a worked example. The content in this section is based on MSDN topics.

Note that good mic array design is a function of many parameters other than just the number of mics, and is highly dependent on the device integration and usage. For design considerations and implementation guidelines (and many other very informative best practices), refer to Microphone Array Support in Windows.

The mic array descriptor is used to parameterize beamformer and sound source localizer behavior in the Microsoft and third-party speech enhancement pipelines. The audio driver must implement the corresponding geometry property. Then, the System.Devices.MicrophoneArray.Geometry property can be accessed via the Windows.Devices.Enumeration API.
The USB audio driver will support this property for USB microphone arrays that have the appropriate fields set in the USB descriptor. The Driver Configuration Verification tool can be used for verification on the device (OEMVerification tool provided in the toolchain).

Example Application: Laptop with Front Facing, 2-Channel Mic Array

In this example, a laptop has two microphone channels on the screen, near the top bezel and facing (ported) forwards.

Details:
• Mic0 is the leftmost microphone when facing the device and appears as Channel 1 in a multi-channel waveform. Note that this is applicable for this example and not necessarily required for all products.
• Mic1 is the rightmost microphone when facing the device.
• The microphones are separated by 90 mm physically on the device.
• The desired 'virtual microphone' is located at the camera position in this example (desired virtual microphone locations are at the discretion of IHVs). The virtual microphone is the origin around which beamforming and signal processing are defined. A general guideline would suggest placing this in the middle of the microphone array.
• The microphone module is omni-directional; however, when integrated and ported in this device, its polar response is closer to a sub-cardioid microphone type.

Value Name                            Polar Response    Value for Geometry Descriptor
KSMICARRAY_MICTYPE_OMNIDIRECTIONAL    Omni directional  0
KSMICARRAY_MICTYPE_SUBCARDIOID        Sub cardioid      1
KSMICARRAY_MICTYPE_CARDIOID           Cardioid          2
KSMICARRAY_MICTYPE_SUPERCARDIOID      Super cardioid    3
KSMICARRAY_MICTYPE_HYPERCARDIOID      Hyper cardioid    4
KSMICARRAY_MICTYPE_8SHAPED            8-shaped          5
KSMICARRAY_MICTYPE_VENDORDEFINED      0x0F              0x0F

• The microphones are all ported parallel to the x-axis (i.e., not pointing left or right, nor pointing up nor down). In the coordinate system used here, X points directly towards the user.

Relevant Information for Geometry Descriptor:

The KSAUDIO_MICROPHONE_COORDINATES structure would appear as follows:

              Mic Type   x         y         z         Elevation Angle   Direction Angle
Member Name   usType     wXCoord   wYCoord   wZCoord   wVerticalAngle    wHorizontalAngle
Mic0          1          0         -45       0         0                 0
Mic1          1          0         45        0         0                 0

• The coordinates for Mic0 and Mic1 are relative to the virtual microphone, i.e., [0,0,0]. Therefore, the coordinates of these channels define where the origin, or virtual microphone, would appear on the device.
• Mic0 is 45 mm to the left of the desired Y-origin (Y = -45), and Mic1 is 45 mm to the right of the desired Y-origin (Y = 45). Their porting is also located at the same point along the x-axis, which coincides with the desired virtual microphone location (therefore X = 0 for both Mic0 and Mic1).
• The microphones are both pointing forwards to the user (parallel to the x-axis, perpendicular to the y and z axes). Therefore, vertical and horizontal angles are both zero (like the virtual microphone). Note that the values used for angles are expressed in 1/10000th of a radian, e.g., +45 degrees = 0.7854 rad × 10000 = 7854.
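The worked example can be expressed as a short script. This is a sketch only: the class `MicCoordinates` and the helpers `degrees_to_descriptor_angle` and `two_mic_array` are hypothetical names, and the binary layout used in `pack` (one unsigned 16-bit field followed by five signed 16-bit fields, little-endian) is an assumption for illustration rather than something stated in the text above. The field names come from the table.

```python
import math
import struct
from dataclasses import dataclass

SUBCARDIOID = 1  # value for KSMICARRAY_MICTYPE_SUBCARDIOID in the table above


def degrees_to_descriptor_angle(degrees):
    """Convert degrees to the descriptor's unit of 1/10000 radian."""
    return round(math.radians(degrees) * 10000)


@dataclass
class MicCoordinates:
    usType: int
    wXCoord: int            # millimetres, relative to the virtual microphone
    wYCoord: int
    wZCoord: int
    wVerticalAngle: int     # 1/10000 radian
    wHorizontalAngle: int   # 1/10000 radian

    def pack(self):
        # Assumed layout: unsigned 16-bit type, then five signed 16-bit values.
        return struct.pack(
            "<Hhhhhh",
            self.usType, self.wXCoord, self.wYCoord, self.wZCoord,
            self.wVerticalAngle, self.wHorizontalAngle,
        )


def two_mic_array(spacing_mm, mic_type):
    """Two forward-facing mics centred on the virtual microphone at the origin."""
    half = spacing_mm // 2
    forward = degrees_to_descriptor_angle(0)
    return [
        MicCoordinates(mic_type, 0, -half, 0, forward, forward),  # Mic0, left
        MicCoordinates(mic_type, 0, half, 0, forward, forward),   # Mic1, right
    ]


mics = two_mic_array(spacing_mm=90, mic_type=SUBCARDIOID)
angle_45 = degrees_to_descriptor_angle(45)
packed_mic0 = mics[0].pack()
```

With a 90 mm spacing this reproduces the Y coordinates of -45 and 45 from the table, and the conversion helper returns 7854 for +45 degrees, matching the note on angle units.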