Overview [Top]
This app note takes a look at an acoustic camera demonstration program using the miniDSP UMA-16 USB microphone array. The program detects the primary source of sound and overlays it on a live video display.
The demo application we are using is on GitHub at https://github.com/rabeaifeanyi/acoustic-camera.
A technical paper describing this work is located here:
Special kudos to the team at TU Berlin for their great work in leveraging the powerful Acoular SDK together with the UMA-16 in a ready-to-go application.
Contents
- What you will need
- A few notes
- Install the software
- Running the software
- Modify the code!
- Wrapping up
What you will need [Top]
- A miniDSP UMA-16 microphone array. This is a 16-microphone rectangular array with a USB interface.
- A USB webcam. We have a low-cost 1080p USB camera available as an optional purchase on the UMA-16 web page.
- A Linux or macOS computer. While the application was developed for Linux, we were able to run it on macOS.
A few notes [Top]
This application is a proof of concept. The processing rate is not fast enough, for example, to follow a moving object. The detected frequency band is also quite narrow.
The UI offers two modes: Deep Learning and Beamforming. Deep Learning mode does not work due to some missing code, so stick to Beamforming mode.
The code was developed on Linux, and we were able to run it on macOS. We do not know if it will run on Windows.
However, all the code is there! So you can change the code, adjust things, and learn how this application works. Have fun!
Install the software [Top]
To install, you will first need the conda package manager. If you don't already have it installed, go to https://www.anaconda.com/download/success and download the Miniconda installer for your computer. (Miniconda is much smaller than the full Anaconda distribution.)
In a terminal window (Linux or macOS), enter the following commands:
conda create -n acoustic_camera python=3.11
conda activate acoustic_camera
Now clone the GitHub repository and install the necessary dependencies:
cd [your directory of choice]
git clone https://github.com/rabeaifeanyi/acoustic-camera.git
cd acoustic-camera
pip install -r requirements.txt
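At this point you can optionally check that Python sees your audio devices. This quick sketch is our own suggestion, not part of the application; it uses the sounddevice package that the application relies on and lists every device with input channels. The UMA-16 should appear with 16 inputs:
import sounddevice as sd

# Print every audio device that has input channels.
for index, device in enumerate(sd.query_devices()):
    if device["max_input_channels"] > 0:
        print(f"{index}: {device['name']} ({device['max_input_channels']} in)")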
Running the software [Top]
To run the program (Linux or macOS), change back into the acoustic-camera directory if you're not already there, then enter:
cd acoustic-camera
export OPENBLAS_NUM_THREADS=1
python start.py
The program will start running and then open a browser window:

Note: if you're on macOS, Safari may not work well. Open this URL in Brave or Chrome instead: http://127.0.0.1:5000.
Here are the parameters you can set:

- Deep Learning / Beamforming: Switches between two sound source localization methods:
  - Deep Learning: uses a neural network model for more accurate localization
  - Beamforming: uses traditional acoustic beamforming algorithms
  However, Deep Learning didn't work for us due to some missing program code, so leave this on the Beamforming setting.
- Save Time Data / Discard: Controls whether raw audio data is saved to disk during measurements.
- Real X, Real Y, Real Z: The expected true coordinates of the sound source (in meters), for comparison/validation. See the section Matching microphone to camera display for additional notes on the Z coordinate.
- Frequency (Hz): The target frequency to analyze (default: 4000 Hz). In Beamforming mode, this is the center frequency of a third-octave filter, so at the default setting of 4000 Hz it analyzes roughly 3550 to 4500 Hz (see the sketch after this list for how the band edges are computed). The beamforming algorithm then processes this filtered signal in the time domain.
- Threshold: The minimum signal level threshold for detection (default: 40).
- Show Microphone Geometry: Toggles visibility of the microphone array positions on the plot.
- Show Origin: Shows/hides the coordinate system origin markers.
- Start: Begins acoustic measurement and real-time processing. While a measurement is running, this button changes to Stop.
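For reference, the edges of a third-octave band lie one sixth of an octave on either side of the center frequency, which is where the figures above come from. Here is a minimal sketch of the calculation (our own illustration, not code from the application):
fc = 4000.0                # center frequency in Hz
f_lo = fc / 2 ** (1 / 6)   # lower band edge, approx. 3564 Hz
f_hi = fc * 2 ** (1 / 6)   # upper band edge, approx. 4490 Hz
print(f"Band: {f_lo:.0f} Hz to {f_hi:.0f} Hz")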
When you're ready, click the Start button. Position a sound source in the camera's field of view and see how well it works! Here are a couple of examples from our test setup:


Note: If the application appears to work but the beamforming does not, check the terminal where you started the program for the message "Could not find the UMA-16 device." If you see that, modify the code as described in Better microphone search below and restart the program.
Modify the code [Top]
You can modify the code to experiment and to tailor the application as you wish.
Better microphone search [Top]
In some cases, the application may not find your UMA-16 microphone. When it doesn't, it falls back to the device at index 0, so sometimes it will work anyway by chance.
To be more certain of finding the UMA-16, we modified the function get_uma16_index in the file config/funcs_devices.py. This modified version searches for more microphone names, and if it does not find a name match for the UMA-16, it searches for a device with 16 input channels. Locate the function and replace it with this version:
def get_uma16_index():
    """
    Get the index of the UMA-16 microphone array.

    Returns:
        int: Index of the UMA-16 microphone array if found, otherwise None.
    """
    devices = sd.query_devices()
    device_index = None

    # Look for UMA-16 by various possible names
    uma16_names = ["nanoSHARC micArray16", "UMA16v2", "UMA16", "UMA-16"]
    for index, device in enumerate(devices):
        device_name = device["name"]
        for uma_name in uma16_names:
            if uma_name in device_name:
                device_index = index
                print(f"\nUMA-16 device found: {device_name} at index {device_index}\n")
                return device_index

    # If not found by name, look for devices with exactly 16 input channels
    for index, device in enumerate(devices):
        if device.get('max_input_channels', 0) == 16:
            device_index = index
            print(f"\nUMA-16 device detected by channel count: {device['name']} at index {device_index} (16 channels)\n")
            return device_index

    print("Could not find the UMA-16 device.")
    return device_index
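To confirm that the modified function finds your array, you can exercise it on its own. A hypothetical quick test, run from the acoustic-camera directory and assuming the config directory is importable as a package:
# Hypothetical standalone check of the modified function.
from config.funcs_devices import get_uma16_index

index = get_uma16_index()
print(f"Returned index: {index}")  # None means the device was not found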
Changing microphone orientation [Top]
We wanted to have the UMA-16 in a different orientation. In the file data_processing/process.py, we changed the definitions of the functions _get_maximum_coordinates and _beamforming_generator to this:
def _get_maximum_coordinates(self, data):
    """ Get the maximum coordinates of the beamforming results
    """
    max_val_index = np.argmax(data)
    max_x, max_y = np.unravel_index(max_val_index, data.shape)
    x_coord = self.x_min + max_x * self.increment
    x_coord = -x_coord
    y_coord = self.y_min + max_y * self.increment
    return [y_coord], [-x_coord]  # <<= Rotate

def _beamforming_generator(self):
    """ Beamforming-Generator """
    gen = self.bf_out.result(num=1)
    count = 0
    while not self.beamforming_stop_event.is_set():
        try:
            res = ac.L_p(next(gen))
            res = res.reshape(self.grid_dim)[::-1, ::-1]  # <<= Flip both axes
            res = np.rot90(res)  # <<= Rotate 90 degrees counterclockwise
            count += 1
            with self.beamforming_result_lock:
                self.beamforming_results['results'] = res
                self.beamforming_results['max_x'], self.beamforming_results['max_y'] = self._get_maximum_coordinates(res)
                self.beamforming_results['max_s'] = np.max(res)
        except StopIteration:
            print("Generator has been stopped.")
            break
        except Exception as e:
            print(f"Exception in _beamforming_generator: {e}")
            break
    print(f"Beamforming: Calculated {count} results.")
Displaying peak signal dot [Top]
You can display a dot at the location of maximum sound intensity (in addition to the spectral overlay showing the strength of the detected sound). To do this, in the file data_processing/dashboard.py, change the definition of the function update_beamforming to:
def update_beamforming(self):
    beamforming_data = self.processor.get_beamforming_results()
    self.acoustic_camera_plot.update_plot_beamforming(beamforming_data)
    self.acoustic_camera_plot.update_plot_beamforming_dots(beamforming_data)  # <<= Add this
    x_val = beamforming_data['max_x'][0]
    y_val = beamforming_data['max_y'][0]
    self.coordinates_display.text = f"X: {x_val}<br>Y: {y_val}"
    self.level_display.text = f"Level: {beamforming_data['max_s']}"
There's now a small dot visible. To make it larger, in ui/plotting.py, change:
self.beamforming_plot = fig.scatter(
    x='x',
    y='y',
    marker='circle',
    source=self.beamforming_dot_cds,
)
to:
self.beamforming_plot = fig.scatter(
    x='x',
    y='y',
    marker='circle',
    source=self.beamforming_dot_cds,
    size=16,  # <<= make the dot larger
)
Matching microphone to camera display [Top]
Finally, you may need to adjust for the angle of view of your camera so that the visual camera display matches the beamforming overlay and the peak intensity dot.
A simple way is to just set the Real Z parameter until everything lines up. In the example screenshot above, we have set Real Z to 5, even though the actual distance is about 1.7 meters.
An alternative, if you want the Real Z parameter to match your actual physical distance, is to change two lines in data_processing/process.py. Both lines are identical, and the factor to apply is the ratio found above: 5 / 1.7 ≈ 3. In our case, we changed:
self.beamforming_grid = ac.RectGrid(x_min=self.x_min, x_max=self.x_max, y_min=self.y_min, y_max=self.y_max, z=z, increment=self.increment)
to:
self.beamforming_grid = ac.RectGrid(x_min=self.x_min, x_max=self.x_max, y_min=self.y_min, y_max=self.y_max, z=z*3, increment=self.increment)
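If you'd rather estimate the factor from the camera geometry than by trial and error, note that a camera with horizontal field of view θ sees a plane of width 2·z·tan(θ/2) at distance z. The sketch below shows the calculation; the 90-degree field of view and the 2-meter grid width are assumptions to replace with your camera's specification and the actual x_min/x_max values from process.py, and the resulting ratio is only a starting point for the scale factor:
import math

fov_deg = 90.0   # assumed horizontal field of view; use your camera's spec
z = 1.7          # actual distance to the source plane, in meters

# Width of the scene the camera sees at distance z:
view_width = 2 * z * math.tan(math.radians(fov_deg / 2))

# Compare with the beamforming grid width (x_max - x_min in process.py):
grid_width = 2.0  # assumed value; substitute self.x_max - self.x_min
print(f"View width: {view_width:.2f} m, ratio: {view_width / grid_width:.2f}")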
Wrapping up [Top]
And that's it! Have fun experimenting with the acoustic camera, and let us know how you get on in our forum.