Audio and Video Software Industry News: 2023 Q1 Review

It’s been an exciting start to the year in the audio and video software industry. As we reflect on the first quarter of 2023, we’re eager to bring you up to speed on the latest developments that have occurred since the release of our 2022 review.

Our team has been keeping a watchful eye on the industry’s happenings, and we’re thrilled to share our curated list of the most interesting events with you. From updates and trends discussed at conferences like ISE, Streaming Media Connect, RTC@Scale and Understanding Latency, to the latest software news and releases from AV companies, we’ve got you covered. We’re particularly intrigued by the AI craze that continues to dominate the software industry and by the push towards interoperability between the industry’s main competitors. Some AV solutions have even wound up in our Mercs and Teslas. So sit back, relax, and enjoy our 2023 Q1 review of the audio and video software industry.

ISE 2023 (31 January–3 February)

Integrated Systems Europe (ISE) is a global event for AV and systems integration professionals. Held from 31 January to 3 February in Barcelona, Spain, ISE 2023 showcased the world’s leading technology innovators and solutions providers. Here are some highlights from the event.

Sennheiser product showcase | Greater inclusivity with MobileConnect

Sennheiser has developed a hardware and software solution called MobileConnect, which lets users take part in large meetings by using their phone as both a listening device and a microphone to communicate with each other. The communication between the participant’s phone and Sennheiser’s hardware happens over a wireless network.

AVI SPL Q&A | Shifting Focus Towards Designing Around User Experience

An AVI SPL representative states that the AV industry is shifting from platform-centric design towards human-centric design.

The Wrap | A great ISE or the greatest ISE?

Editors from Inavate and Hiddenwires analyze, discuss, and summarize the technology and overall atmosphere at this year’s event.

Streaming Media Connect (14–17 February)

An online event dedicated to sharing innovative approaches used by the world’s leading organizations and experts in live streaming, OTT, content delivery, next-gen TV, and much more. Here are some of the key takeaways from the event.

The State of Codecs 2023

Igor Oreper from Bitmovin, Alex Zambelli from Warner Bros Discovery, Kevin Yao from AWS, and Andrey Norkin from Netflix discussed the current state of codecs, ongoing trends, and the outlook for future codecs together with moderator Jan Ozer from Streaming Media. Here are the main things we learned:

  • Netflix recommends H.264 as the most widely supported codec, while VP9 is used mainly for Android or native video playback on PCs, and HEVC is used for 4K and HDR.

  • AWS’s goal is to balance codec efficiency with image quality.

  • Google Chrome recently announced HEVC support, but it is still in the early stages of adoption.

  • AV1 hardware decoders are becoming more common on Android, but software decoding for AV1 is resource-intensive.

  • Experts predict AV1 could become mainstream within the next year.

  • VVC is promising but currently lacks hardware and software support, and may take three to five years to become popular.

  • AWS is exploring ARM-based encoders to reduce energy consumption.

  • LCEVC enhances existing codecs to improve image quality, while EVC offers an attractive licensing strategy.

  • Netflix is working on AV2 or AVM codecs, which are expected to improve performance by about 20% compared to AV1.

deadmau5 on the Future of Live Concert Streaming

Warning: This talk contains strong language and smoking.

deadmau5 talks about his views and experiences of live streaming together with Eric Schumacher-Rasmussen from Streaming Media.

Approaching Zero Latency: How Close Can We Get

Magnus Svensson from Eyevinn Technology speaks about the obstacles to reaching near-zero latency and how low we can actually get with Tim Siglin from Help Me Stream Research Foundation and Loke Dupont from TV2. The topics he covers include:

  • How is latency born?

Latency in stream playback devices can introduce unexpected delays. Caching to avoid buffering increases latency. Camera switching and ad selection also cause latency. Some countries have a mandatory “dump period” of 7 to 10 seconds to mute inappropriate language. Adding graphic layers also creates latency. Large-scale events can slow down streams.

  • Methods and obstacles to achieving ultra-low latency

Reducing caching and fine-tuning settings can lower latency, but this takes effort and testing, and the results depend on the whole content delivery chain.

  • HTTP streaming – drawbacks and alternatives

HTTP replaced RTMP and RTSP for on-demand streaming, but it adds latency to live content. WebRTC offers a promising alternative, although it prioritizes low latency over quality, and it is increasingly being used like traditional broadcast tech. Proprietary protocols lack hardware and software support.

  • What do viewers like or hate?

Viewers hate buffering and prefer stable audio quality over video.

  • Stream synchronization – next big issue to consider

Stream synchronization is expected to gain more traction than ultra-low latency.

Latency, Quality, Privacy and Simplicity: Redefining Video Production in the Cloud

Marcelo Moyano and Richard Smrt from StreamVoodoo talk about their product StreamVoodoo and discuss its advantages over existing solutions in terms of achieving ultra-low latency (by their own account, 60 ms from NY to LA) without compromising the quality of the stream.

They went on to describe their product. They stated that StreamVoodoo is completely independent from other providers (e.g. AWS) and they do their own thing. They drew inspiration from Apple in terms of product simplicity. You can share an invite link and users can join your stream via that link, with no need for accounts and all that hassle. StreamVoodoo is backed by deadmau5 and Alex Lindsay.

You can check out their page on StreamVoodoo.
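
As context for the 60 ms NY to LA figure quoted above, a back-of-the-envelope check of the physical floor can be useful. The sketch below is ours, not something shown in the talk. It assumes a great-circle distance of roughly 3,940 km between the two cities and a typical fiber refractive index of about 1.47; real fiber routes are longer than the great-circle path, so the true floor is somewhat higher.

```python
# Rough lower bound on one-way network delay between two cities.
# Assumptions (not from the talk): the distance below and a typical
# refractive index for optical fiber.

SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.47      # assumed typical value for optical fiber
NY_TO_LA_KM = 3_940.0              # assumed approximate great-circle distance


def propagation_delay_ms(distance_km: float) -> float:
    """One-way propagation delay through fiber, in milliseconds."""
    speed_in_fiber_km_s = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    return distance_km / speed_in_fiber_km_s * 1000.0


one_way_ms = propagation_delay_ms(NY_TO_LA_KM)
print(f"one-way: {one_way_ms:.1f} ms, round trip: {2 * one_way_ms:.1f} ms")
```

Under these assumptions the one-way floor comes out a little under 20 ms, which leaves the rest of a 60 ms budget for capture, encoding, routing, decoding, and rendering.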

The Future of Interactive Streaming

Nadine Krefetz from Reality Software leads a talk with Alex Lindsay from 090 Media, Brian Miller from LG Electronics, Jennifer Kent from Parks Associates and Oliver Lietz from nanocosmos.

They all went back and forth discussing their experiences with user interactivity and how it is perceived in their field of work. Everyone agreed that interactivity keeps users more engaged because they take part in or influence the event itself. One thing they noted is that users prefer not to leave the stream while doing something else in the background (chatting, checking other TV channels, etc.).

Understanding Latency (6–8 March)

Understanding Latency is a web series and knowledge sharing event that invites world-renowned experts on network latency to discuss the importance of understanding and improving latency, and showcase the increasing interest among Application/OS Developers, Network Vendors, and Operators in improving latency and overall network quality. Here are some of the main topics discussed during the event:

Quality of Outcome (QoO) by Magnus Olden

An interesting quality measuring approach, QoO tries to measure the likelihood of a perfect video session depending on the network quality. Unfortunately, the speaker doesn’t go into great detail on how the probability is calculated, specifically what factors influence it and so on.

A Real-Time Video Perspective on Network Quality by Varun Singh

The speaker discusses the importance of balancing quality and interactivity in video conferencing, and explains how bandwidth and latency affect this balance. Depending on the application, you may prioritize quality or interactivity. Managing large group calls can be challenging, so it’s important to consider worst-case scenarios when building systems. Understanding these trade-offs is crucial for providing the best user experience.

RTC@Scale (29 March)

The event featured talks themed around four main topic areas: Core AV in RTC, Scaling RTC for Large Calls, ML & Simulations for Large Calls, and RTC for Metaverse. Here are the key topics covered:

Blind Quality Assessment for Real-Time Visual Communications | Dr. C.-C. Jay Kuo

Dr. C.-C. Jay Kuo from the University of Southern California shared his team’s breakthrough in video quality assessment for real-time communication. He explained that video quality assessment is important in controlling the quality of video communication and that there are two applications: professional video streaming and user-generated content. The former uses high-quality video as a reference, while the latter doesn’t have a reference, so a blind quality assessment is required.

  • Jay Kuo’s team used two machine learning methodologies: traditional feature extraction and a deep learning neural network approach.

  • The blind image quality assessment pipeline uses an unsupervised stage and a supervised feature selection and regression prediction stage.

  • Their pipeline outperformed traditional methods and provides accurate predictions with high correlation to human scores.

Scaling for Large Group Calls | Yun Zhang and Bin Liu

Yun and Bin from MITRE Remote Present Networking Team discuss the challenges of audio and video experience for large group calls in this presentation. The main challenges they identify are:

  • Supporting many participants with multiple audio and video streams.

  • Introducing a server-side scenario for audio scaling, including Dominant Speaker Detection and volume capping.

  • Optimizing audio capping through Dominant Speaker Detection.

  • Collaborating with clients to inspect audio content and determine stream volume.

  • Considering critical factors such as end-to-end encryption, real-time decision making, and handling DTX, frame size, and stabilized audio for decision and PLC send.
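
To make the idea of Dominant Speaker Detection with audio capping more concrete, here is a minimal sketch of one possible approach. It is not the method presented in the talk: the class name `AudioCapper`, its parameters, and the smoothing formula are all hypothetical. It assumes each client reports its own audio level (which also fits an end-to-end encrypted call, where the server cannot inspect audio content), and the server forwards only the loudest few streams.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AudioCapper:
    """Keeps a smoothed loudness score per participant and picks the
    loudest ones to forward. All names and defaults are illustrative."""

    max_forwarded: int = 3   # cap on simultaneously forwarded audio streams
    smoothing: float = 0.8   # weight given to the previous score (0..1)
    scores: Dict[str, float] = field(default_factory=dict)

    def update(self, participant: str, level: float) -> None:
        """Feed one client-reported level (0.0 = silence, 1.0 = loudest)."""
        previous = self.scores.get(participant, 0.0)
        self.scores[participant] = (
            self.smoothing * previous + (1.0 - self.smoothing) * level
        )

    def forwarded(self) -> List[str]:
        """Participants whose audio is forwarded, loudest first."""
        ranked = sorted(self.scores, key=self.scores.get, reverse=True)
        return ranked[: self.max_forwarded]

    def dominant_speaker(self) -> Optional[str]:
        """The single loudest participant, or None if nobody reported yet."""
        ranked = self.forwarded()
        return ranked[0] if ranked else None


capper = AudioCapper(max_forwarded=2)
for _ in range(5):
    capper.update("alice", 0.9)
    capper.update("bob", 0.1)
    capper.update("carol", 0.4)
print(capper.dominant_speaker(), capper.forwarded())
```

Smoothing the reported levels keeps the selection from flapping on short noises; a production system would also need to handle the DTX and frame-size factors listed above, which this sketch ignores.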

Real-time audio at Meta scale | Sriram Srinivasan

Sriram discusses challenges related to audio quality in VoIP (Voice over Internet Protocol) calls. Although VoIP is not a new technology, ensuring high audio quality remains a challenge, especially when dealing with nonlinearities and low bandwidth connections. Sriram covers various components of an audio pipeline, such as echo cancellation, noise suppression, and audio device management. Sriram also highlights the importance of addressing the no-audio problem and reliability issues in audio pipelines. Additionally, Sriram discusses the challenges of scaling large group calls while maintaining quality, such as using proximity-based approaches to strike a balance between scalability and quality.

Using Machine Learning to Enrich and Analyze Real-time Communications | Sid Rao

Sid discusses how the Amazon Chime SDK uses machine learning to enhance audio and video quality in real-time communication networks. He explains how machine learning can be used to upgrade audio sessions on existing telecommunications platforms, and how a redundancy mechanism can be used to solve last mile lossy network problems. Sid provides examples of how machine learning can help users derive insights from their audio and video sessions, including identifying users in a multi-party audio session and using identity data to enrich other machine learning applications. He concludes with an invitation to collaborate on the Opus Codec in an open-source manner.

RTC@Metaverse – Bringing RTC to Metaverse | Raman Walia and Shreyas Basarge

Raman describes the technical details of Metaverse, a three-dimensional virtual reality platform, and its challenges. There are three types of avatars (human representations) and each comes with its own tradeoffs in terms of quality, fidelity, realism, and computational resources. Real-time data, including skeleton movements and joint positions, is transferred to create these avatars, and machine learning techniques are used to compress facial expressions. Volumetric video, the most realistic form of human representation, is difficult to transfer due to the large amount of data required. Raman also discusses the challenges of transferring the real-time world state, including object identification and ownership, replication, and serialization. His goal is to innovate and operate at scale with a hard latency bound of around 50 milliseconds.

Top stories

Reviews

Microsoft

Zoom

When this view is enabled, an active speaker view will be displayed when a remote participant speaks. Once the participant stops speaking for 15 seconds, the meeting will revert to gallery view or the background screen until another participant starts talking.

Google Meet

Chime

You can now identify a speaker by providing a short speech sample from the call audio, which makes it possible to identify and label speakers in real time.

Agora

RingCentral

BlueJeans

Gateway for Microsoft Teams – Custom Features Update

  • Now has a setting where users may bypass the lobby. (Feb 3)

  • Users now may switch to a Large Gallery view. (Feb 3)

  • Fit-to-frame support by default. (Feb 3)

Web Client SDK

  • Custom layouts and individual stream support (Feb 17)

  • Control the individual video streams and build customized layouts

  • Choose the resolution for a participant

  • Prioritize the active speaker

  • Show up to 25 video tiles

  • Build support for pinning video tiles of particular participants in the meeting

BlueJeans Mobile

  • Retain audio focus while in a meeting (Mar 3)

  • Set your join via Phone settings (Mar 16)

Daily.co

Daily.co now automatically switches to SFU topology once a third participant joins the call.
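
The reasoning behind such a switch can be illustrated with a little arithmetic: in a peer-to-peer mesh every client sends its media to every other participant, while with an SFU each client sends a single copy to the server. The sketch below is only an illustration. The function names are hypothetical and are not part of Daily.co's API; the threshold of three participants is taken from the announcement above.

```python
def choose_topology(participants: int, sfu_threshold: int = 3) -> str:
    """Return "p2p" below the threshold and "sfu" at or above it."""
    return "sfu" if participants >= sfu_threshold else "p2p"


def uplink_streams_per_client(participants: int, topology: str) -> int:
    """How many copies of its own media each client has to send."""
    if participants < 2:
        return 0  # nobody to send to
    return 1 if topology == "sfu" else participants - 1


for n in (2, 3, 6):
    topology = choose_topology(n)
    print(
        n,
        topology,
        uplink_streams_per_client(n, topology),
        "vs mesh:",
        uplink_streams_per_client(n, "p2p"),
    )
```

With six participants a mesh would require five outgoing copies per client, while the SFU keeps it at one, which is why upload bandwidth and encoding load stay flat as the call grows.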

Dialpad

Jan 2023 Dialpad Meeting News

  • Guest Mode

Mobile users can now join a Dialpad meeting without the fuss of creating an account or signing in by joining the meeting as a Guest on the mobile apps.

Mar 2023 Dialpad Meeting News

  • Self-view option on all layouts

Self-view options are now available across all layouts, not only in dynamic views.

  • Faster join time

They claim to have reduced it by a whole second!

Dolby.io

Hopin

You can now change the stream layout freely. They explain it very well with videos.

Kaltura Meetings

Attendees can now enjoy a full-screen presentation during a session by switching to the content-focused view. After the session ends, the content will automatically break into chapters, allowing viewers to customize their viewing experience.

Lucent Rainbow

  • Bubbles & web conferences

It’s now possible to add a Rainbow participant into a bubble conference by calling them directly, without any invitation. This Rainbow user is added to the bubble with the Member role and will be visible in the bubble for the duration of the conference only.

Webex

Vonage

TikTok

Rocket Chat

  • Video conferences now have guest mode.

GoTo

TrueConf

Telegram

Tata

Voiceflex

Antmedia

Antmedia describes how you can use DuetMaster to play a MIDI instrument together with someone in a P2P video call.

2023 Q1 Wrap-Up

Looking back at the first quarter of 2023 in the audio and video software industry, we’ve witnessed a flurry of discussions and events that shed light on the industry’s latest advancements. From talks surrounding video codecs and ultra-low latency to debates about the future of AV products, this quarter has been marked by exciting developments. Alongside new product releases and features from established competitors, we’ve seen a surge of features utilizing AI.

Looking ahead to the next quarter, we’re eager to see what new conferences and events will emerge and how they’ll shape the industry’s trajectory. Above all, we’re intrigued by the widespread use of AI and its potential to transform the market. As always, we’ll continue to closely follow the industry’s developments and bring you the latest updates. Thank you for joining us for our Q1 review, and we look forward to exploring the future of AV software with you.

Want to find out more about our audio and video quality testing services—or have any questions for our team? Reach out to us now!

Teodors Drazlovskis & Nikolajs Varlamovs