Spatial audio is already in use in sectors such as cinema and video games. However, bringing it into everyday mobile calls means overcoming a series of technical difficulties. These include processing spatial sound in real time, working within the hardware limitations of common smartphones, and ensuring compatibility between devices.
The IVAS codec, short for Immersive Voice and Audio Services, is set to become a key part of mobile communication in the coming years. That is the view of Nokia, the Finnish technology company, which foresees a near future in which spatial audio, or 3D audio, is built into everyday calls, raising the quality standard of telephone communication.
Although several technical limitations make it difficult to bring this technology to mobile calling, recent advances in IVAS may make it possible to overcome them.
The IVAS codec was standardized last June by the 3GPP (a partnership of telecommunications standards organizations), which is already an important step toward different companies and brands working together on its adoption.
The technical difficulties
To make standardization possible, a new parametric audio format was created, called Metadata-Assisted Spatial Audio (MASA). MASA was designed specifically for devices with limited capabilities, such as smartphones.
The IVAS codec integrates a renderer that supports binaural audio with head tracking and multi-speaker playback using the MASA format. Together, these make it possible for the smartphones we already use to work with 3D audio.
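To give a rough idea of what head tracking adds to binaural playback, here is a minimal sketch in Python. It is not the IVAS renderer: a real binaural renderer filters the signal with head-related transfer functions, while this example only uses constant-power left/right gains. The function names `relative_azimuth` and `binaural_gains` are hypothetical. The angle convention (degrees, positive to the listener's left) is also an assumption made for illustration.

```python
import math


def relative_azimuth(source_az_deg: float, head_yaw_deg: float) -> float:
    """Source direction relative to the listener's nose, wrapped to (-180, 180].

    Assumed convention: angles in degrees, positive values to the left.
    """
    rel = (source_az_deg - head_yaw_deg) % 360.0
    return rel - 360.0 if rel > 180.0 else rel


def binaural_gains(rel_az_deg: float) -> tuple[float, float]:
    """Very simplified stand-in for binaural rendering.

    Returns constant-power (left, right) gains. It ignores elevation and
    cannot tell front from back, which real HRTF-based rendering can.
    """
    pan = math.sin(math.radians(rel_az_deg))   # -1 (right) .. +1 (left)
    theta = (pan + 1.0) * math.pi / 4.0        # 0 .. pi/2
    return math.sin(theta), math.cos(theta)


# A voice placed 30 degrees to the left stays fixed in the room:
# once the listener turns 30 degrees toward it, it is heard dead ahead.
left, right = binaural_gains(relative_azimuth(30.0, 30.0))
```

The point of the sketch is the subtraction in `relative_azimuth`: the head orientation is removed from the source direction before rendering, so the sound appears to stay put when the listener moves.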
Another challenge in bringing spatial audio to smartphones is the network congestion that sometimes occurs. The 3GPP IVAS codec supports bit rates ranging from 13.2 to 512 kbit/s, so it can fall back to lower rates and still deliver good immersive audio quality on congested networks. The problems of acoustic echo and ambient noise have also been tackled by applying machine learning.
The next step, according to Nokia, is integrating the IVAS codec into 5G networks.
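The idea of adapting to the network can be sketched as choosing the highest bit rate that fits the bandwidth currently available. The Python example below is an illustration under stated assumptions, not the codec's actual rate-control logic. Only the 13.2 and 512 kbit/s endpoints come from the text above. The intermediate steps in `LADDER_KBPS` and the function `pick_bitrate` are hypothetical.

```python
from bisect import bisect_right

# Endpoints (13.2 and 512 kbit/s) are taken from the article;
# the intermediate steps are illustrative assumptions.
LADDER_KBPS = [13.2, 24.4, 48.0, 96.0, 192.0, 384.0, 512.0]


def pick_bitrate(available_kbps: float) -> float:
    """Return the highest ladder rate not exceeding the available bandwidth.

    If even the lowest rate does not fit, the lowest rate is returned,
    since the codec cannot go below its minimum.
    """
    idx = bisect_right(LADDER_KBPS, available_kbps)
    return LADDER_KBPS[max(idx - 1, 0)]


congested = pick_bitrate(20.0)    # falls back to the lowest step
healthy = pick_bitrate(1000.0)    # capped at the highest step
```

With a ladder like this, a call can keep going at a low rate when the network is busy and move up again when capacity returns.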
What does spatial audio provide on mobile phones?
But would spatial audio in calls or voice notes on your phone really make that much difference? Is it actually necessary? In essence, spatial audio would let us hear the other person in a way that is closer to reality.
Combined with head tracking, 3D audio would give us more sound information about what is happening on the other party's side, making the experience more immersive. It would also complement the virtual reality technologies that big technology companies and video game developers are working on. It could also improve group calls and the experience in messaging apps.
According to Nokia, the innovation that the IVAS codec represents is comparable to the advance made by the EVS codec, introduced in 2014, which brought us the feature known as HD Voice+.