Sound is a versatile medium. In addition to being one of the primary means of communication for us humans, it serves numerous purposes for organisms across the animal kingdom. In particular, many animals use sound to localize themselves and navigate their environment. Bats, for example, emit ultrasonic pulses to move around and find food in the dark. Beluga whales show similar behavior, using sound to avoid obstacles and locate one another.
Various animals also have a tendency to cluster together into swarms, forming a unit greater than the sum of its parts. Famously, bees agglomerate into swarms to more efficiently search for a new colony. Birds flock to evade predators. These behaviors have caught the attention of scientists for quite some time, inspiring a handful of models for crowd control, optimization and even robotics.
A key challenge in building robot swarms for practical purposes is localization: the robots must know their positions not just within the swarm, but also relative to other important landmarks. Consider a scenario where a robot swarm must autonomously deploy across the surface of a table, perform some function, and then return to its original location, for example to recharge.
We can approach this in many ways, but each has its drawbacks. We could mount cameras on the robots, for example, but that would likely incur significant power consumption and raise privacy concerns. Alternatively, we could use dedicated, external localization infrastructure, such as a table-mounted projector or a special surface, but then a user would have to set up this infrastructure every time the swarm is used. Can we build a swarm localization solution that uses no external infrastructure and is less intrusive to the user’s privacy?
To address these questions, we introduce the acoustic swarm: a swarm of seven centimeter-scale robots (3.0 cm × 2.6 cm × 3.0 cm each), which use only sound to distribute themselves and navigate across a surface, without any external infrastructure. This swarm can enable downstream distributed sensing tasks, such as concurrent speaker localization and separation in smart homes.
How it works
As the name suggests, the robots in the acoustic swarm use only sound to localize themselves and navigate across a common table. Each robot is equipped with one speaker and two microphones. Internally, the robots exchange timing information to agree on a common global clock, which never drifts by more than 16 microseconds. The robots emit high-frequency chirps in a coordinated manner; other robots hear these chirps and use them to estimate the distances between pairs of robots. The swarm then combines many such 1-dimensional distance estimates into 2-dimensional position estimates for each robot. Combined with data from an on-board inertial measurement unit (IMU), these estimates let the robots move out of and back to a 3D-printed base station.
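The article does not spell out the synchronization protocol, but one common way to estimate a peer’s clock offset is a two-way, NTP-style timestamp exchange. A minimal sketch with hypothetical timestamps (in microseconds), not the robots’ actual protocol:

```python
# Hypothetical sketch: estimating a peer robot's clock offset via a
# two-way (NTP-style) timestamp exchange. t1/t4 are local timestamps;
# t2/t3 are the peer's timestamps on receive and reply.
def clock_offset_us(t1, t2, t3, t4):
    """Estimated offset of the peer's clock, in microseconds."""
    # Average the apparent offsets in each direction; a symmetric
    # link delay cancels out.
    return ((t2 - t1) + (t3 - t4)) / 2.0

# Example: the peer's clock runs 10 us ahead, with a 100 us one-way delay.
offset = clock_offset_us(t1=0.0, t2=110.0, t3=160.0, t4=250.0)  # -> 10.0
```

Repeating such exchanges lets each robot track and correct its drift relative to the shared clock.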
Suppose the robotic swarm is distributed across a surface. To localize, a robot emits a 32 ms-long frequency-modulated continuous wave (FMCW) chirp sweeping 15–30 kHz. In that time frame, each robot records with its microphones and uses channel estimation to obtain the time-of-flight of the chirp. Since the speed of sound in air is known, this time difference yields the distance between the chirping robot and every other robot in the swarm.
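The core idea behind the distance estimate can be sketched with a simple matched filter: correlate the recording against the known chirp and convert the delay of the correlation peak into meters. This is a simplified simulation (illustrative sample rate, no noise, reflections, or fractional delays), not the robots’ actual channel-estimation pipeline:

```python
import numpy as np

fs = 62_500   # sample rate in Hz (illustrative, not the robots' actual rate)
c = 343.0     # speed of sound in air, m/s
dur = 0.032   # 32 ms chirp

# Linear FMCW chirp sweeping 15-30 kHz, as described above.
t = np.arange(int(fs * dur)) / fs
f0, f1 = 15_000.0, 30_000.0
chirp = np.sin(2 * np.pi * (f0 * t + (f1 - f0) / (2 * dur) * t**2))

# Simulate a clean recording in which the chirp arrives after a 2 m flight.
delay_samples = int(round(2.0 / c * fs))
rx = np.zeros(delay_samples + len(chirp))
rx[delay_samples:] = chirp

# Matched filter: the correlation peak marks the time-of-flight.
corr = np.correlate(rx, chirp, mode="valid")
tof = np.argmax(corr) / fs
distance = tof * c  # ~2.0 m
```

In practice the estimate must also contend with multipath reflections and loud interferers, which is where the more careful channel estimation comes in.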
To obtain 2D positions from these distance estimates, the robots coordinate to chirp one by one. Once we have the distance between every pair of robots, we use the outlier-aware SMACOF algorithm to find the 2D geometry that best fits the observed distances. However, such a method is only accurate up to a rotation, reflection, and translation of the positions, i.e. there is no global reference frame to disambiguate the overall orientation of the swarm.
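The full system uses an outlier-aware variant of SMACOF; as a simpler illustration of the same idea (recovering coordinates from pairwise distances alone), here is classical multidimensional scaling in NumPy:

```python
import numpy as np

def classical_mds(D, dim=2):
    """Recover coordinates from a pairwise-distance matrix D, up to
    rotation, reflection, and translation (classical MDS)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D**2) @ J             # double-centered Gram matrix
    w, V = np.linalg.eigh(B)
    top = np.argsort(w)[::-1][:dim]       # largest `dim` eigenpairs
    return V[:, top] * np.sqrt(np.maximum(w[top], 0.0))

# Example: four robots at the corners of a unit square.
P = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
D = np.linalg.norm(P[:, None] - P[None, :], axis=-1)
X = classical_mds(D)

# The recovered geometry reproduces the input distances.
D_hat = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
```

Note that `X` can be an arbitrarily rotated or mirrored copy of `P` — exactly the ambiguity the base station is used to resolve.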
To establish a global reference frame, we turn to the base station. When the robots are deployed, we ensure that one robot stays inside the base station, which has checkpoints with known locations. To bind the swarm’s orientation to this global reference frame, the robot inside the base station moves along the checkpoints, emitting chirps along the way. Since the robot’s position at these checkpoints is known and fixed, we can resolve the absolute positions of all other robots. This mechanism also makes the 2D positioning more robust, since it provides additional distance estimates to work with. In fact, this localization mechanism works well even in the presence of strong nearby acoustic reflectors, such as walls and objects on the surface.
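Binding the relative geometry to the checkpoints amounts to finding the rigid transform that maps the estimated checkpoint positions onto their known coordinates. A minimal sketch using orthogonal Procrustes (reflections allowed, since a distance-only solution can also come out mirrored); the function and variable names here are hypothetical:

```python
import numpy as np

def align_to_anchors(X, anchors_est, anchors_true):
    """Map the relative solution X onto the global frame defined by
    known checkpoint coordinates (orthogonal Procrustes)."""
    mu_e, mu_t = anchors_est.mean(axis=0), anchors_true.mean(axis=0)
    A, B = anchors_est - mu_e, anchors_true - mu_t
    U, _, Vt = np.linalg.svd(A.T @ B)
    R = U @ Vt                        # best orthogonal transform
    return (X - mu_e) @ R + mu_t

# Example: the swarm's relative geometry, offset by an unknown
# rigid transform (rotation by 0.7 rad plus a translation).
P_true = np.array([[0, 0], [1, 0], [0, 1], [2, 2]], dtype=float)
th = 0.7
R_unk = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
X = P_true @ R_unk.T + np.array([3.0, -1.0])

# The first three positions act as known checkpoints; aligning to them
# recovers the absolute positions of every robot.
aligned = align_to_anchors(X, X[:3], P_true[:3])
```

With three or more non-collinear checkpoints, the transform — and therefore the swarm’s global orientation — is fully determined.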
We can use this localization mechanism to navigate robots across the surface. Specifically, once all robots have known positions, a single robot can move to a target position by repeatedly emitting chirps and trilaterating its current position from the resulting distance estimates. The robots start out inside the base station, then use the 2D localization mechanism to maneuver out of it and outwards along equally-partitioned angles. While moving outwards, they use a pair of proximity-sensing photointerrupters to avoid falling off edges, and their IMU to detect collisions with objects.
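Given known positions for the other robots, solving for a moving robot’s own position from its distance estimates can be sketched as a linearized least-squares problem (names hypothetical; the robots’ actual estimator is more involved):

```python
import numpy as np

def trilaterate(anchors, dists):
    """Least-squares 2D position from distances to known anchor
    positions, linearized by subtracting the first range equation."""
    a0, d0 = anchors[0], dists[0]
    A = 2.0 * (anchors[1:] - a0)
    b = (d0**2 - dists[1:]**2
         + np.sum(anchors[1:]**2, axis=1) - np.sum(a0**2))
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Example: a robot at (1, 2) hears chirps from three robots with
# known positions.
anchors = np.array([[0, 0], [3, 0], [0, 4]], dtype=float)
x_true = np.array([1.0, 2.0])
dists = np.linalg.norm(anchors - x_true, axis=1)
x_est = trilaterate(anchors, dists)
```

With more than three anchors the system is overdetermined, and the least-squares solution averages out individual ranging errors.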
To return to the base station, the robots once again localize themselves and move back to the base station entry. Once there, they perform a small “dance” to calibrate their current rotation estimate against the acoustic positioning mechanism, and enter the base station via its entry ramp. Finally, they use their photointerrupters to dock, concluding their excursion.
The acoustic swarm gives us a platform with which we can deploy various exciting distributed sensing applications in common, cluttered environments. For example, we can use the microphones on the robots to form a giant distributed microphone array with known microphone locations. This setup, which would autonomously deploy and retract, can be used to enable new applications such as speaker localization and/or separation. Alternatively, one could also leverage the speakers on these robots as a large, distributed speaker array, opening up new possibilities in personalized sound zones, or dividing up areas in a room, such that listeners in different areas hear different sounds.
Additionally, the mobility of the robots opens up further possibilities. Different acoustic scenarios may call for different microphone arrangements to perform optimally. Such a dynamic microphone array could therefore automatically reconfigure its shape on the fly, adapting to information about the current setting, such as the speaker locations or the room’s acoustic profile.