Shauli Zacks
Published on: August 4, 2025
Sandro Gauci, CEO of Enable Security, has been at the forefront of VoIP and real-time communications security for over two decades. His journey began in Malta during the early 2000s, where he was likely the island’s only full-time security researcher at the time. Frustrated by the limited opportunities for hands-on penetration testing and determined to dig deeper into the overlooked risks of voice protocols like SIP and RTP, Sandro launched Enable Security in 2008.
Today, Enable Security is best known for SIPVicious PRO and the ESAP (Enable Security Attack Platform), both of which are purpose-built to uncover and test vulnerabilities in real-time communications systems—where generic security tools often fall short. In this SafetyDetectives interview, Sandro shares insights from years of specialized security assessments, explains why traditional defenses don’t work for voice and video protocols, and explores the new wave of threats posed by AI and deepfake technology in real-time comms.
Can you tell us about your journey in cybersecurity and what led you to found Enable Security?
My cybersecurity journey began straight out of school in 2000, when I was fortunate to be recruited as a security researcher for a software company that was shifting its focus toward information security. My foundation was shaped by the hacker e-zines and full-disclosure culture of the 90s – that’s essentially how I trained myself in security. At the time, I was based in Malta, a small island nation where I believe I was the only security researcher.
While that early role provided valuable experience, I eventually wanted greater control over my career trajectory and the opportunity to perform penetration testing across diverse organizations rather than serving as an internal security researcher for a single company. Since there weren’t many companies in Malta that matched this vision, I decided to go freelance, ultimately founding Enable Security in 2008. This journey took me from Malta to the UK and eventually to Germany.
Throughout this transition, I kept encountering the same critical blind spots across organizations: communication systems like VoIP and SIP were consistently treated as afterthoughts in security assessments, despite their protocol complexity creating unique and exploitable attack surfaces. Generic security tools barely scratched the surface of these specialized protocols.
While we had been conducting VoIP security work for some time, founding Enable Security allowed us to pursue a focused mission: specializing deeply in real-time communications security and developing both tooling (like SIPVicious) and methodologies that expose vulnerabilities conventional assessments miss, before adversaries discover them.
Enable Security is widely recognized for SIPVicious PRO. What specific gaps in VoIP or SIP security does it address that traditional security tools might miss?
SIPVicious PRO was specifically built to understand voice protocols on their own terms, natively speaking protocols like SIP and RTP that generic scanners simply cannot engage with effectively. Rather than trying to force traditional network or web application security approaches onto these specialized protocols, we developed tools that work within the unique constraints and characteristics of real-time communications.
For us, SIPVicious PRO functions as a Swiss Army knife for various VoIP protocol-specific attacks and represents one key component in our broader security arsenal. Today, we integrate it within what we call the ESAP (Enable Security Attack Platform), where we prepare comprehensive security tests, particularly for VoIP and real-time communications, as well as DoS tests for other protocols.
The key differentiator lies in customization and repeatability. Unlike inflexible traditional scanners, ESAP allows us to reproduce vulnerabilities discovered during penetration tests and security consultancy work, tailored to each client’s specific setup and requirements, such as custom SIP headers or unique protocol implementations. This approach ensures that once vulnerabilities are identified and fixed, we can verify they don’t reoccur through repeatable, targeted testing.
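As an editorial illustration of what a repeatable, client-specific test input might look like, the sketch below builds a SIP OPTIONS request carrying a custom header. It is not SIPVicious PRO or ESAP code; the function name `build_options`, the header `X-Tenant-ID`, and all hosts and addresses are hypothetical. Pinning the Call-ID, branch, and tag is one simple way to make a test case byte-for-byte reproducible.

```python
import uuid


def build_options(target_host, target_port=5060, from_user="tester",
                  local_host="192.0.2.10", local_port=5060,
                  extra_headers=None, call_id=None, branch=None, tag=None):
    """Return a SIP OPTIONS request as bytes (hypothetical helper).

    Random identifiers are generated unless fixed values are passed in,
    which lets the same request be replayed exactly in a regression test.
    """
    call_id = call_id or uuid.uuid4().hex
    # RFC 3261 requires the Via branch to start with the magic cookie z9hG4bK.
    branch = branch or "z9hG4bK" + uuid.uuid4().hex[:16]
    tag = tag or uuid.uuid4().hex[:8]

    lines = [
        f"OPTIONS sip:{target_host}:{target_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch={branch};rport",
        "Max-Forwards: 70",
        f"From: <sip:{from_user}@{local_host}>;tag={tag}",
        f"To: <sip:{target_host}>",
        f"Call-ID: {call_id}@{local_host}",
        "CSeq: 1 OPTIONS",
        f"Contact: <sip:{from_user}@{local_host}:{local_port}>",
    ]
    # Client-specific headers (assumed names) are appended before the length.
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    lines.append("Content-Length: 0")

    # SIP uses CRLF line endings and a blank line to end the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


if __name__ == "__main__":
    request = build_options(
        "pbx.example.test",
        extra_headers={"X-Tenant-ID": "acme"},
        call_id="regression-001", branch="z9hG4bKfixed0001", tag="abc123",
    )
    print(request.decode("ascii"))
```

Such a request should only ever be sent to systems one is authorized to test; the sketch deliberately stops at message construction and leaves transport out.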
VoIP systems are increasingly integrated into cloud services and unified communication platforms. What are the most common mistakes organizations make when deploying these technologies?
The most significant mistake we observe is organizations expecting cloud protection mechanisms (such as CDNs, network firewalls and DDoS scrubbers) to provide adequate protection for VoIP and WebRTC protocols. This fundamental misunderstanding stems from treating voice protocols like web traffic, when they have completely different requirements and characteristics.
VoIP and WebRTC protocols are extremely sensitive to latency, making them incompatible with inline security mechanisms that might not understand their specific protocol requirements. Traditional network or web security appliances often do more harm than good, particularly when it comes to DoS protection, because they add latency and processing delays that can render voice communications unusable.
Organizations need to recognize that real-time communications require security approaches designed specifically for these protocols, rather than attempting to retrofit security solutions that fundamentally conflict with the low-latency requirements essential for quality voice and video communications.
You’ve conducted security assessments for a wide range of communication products. What patterns or recurring issues tend to come up across vendors?
The most persistent and challenging problems in real-time communications products relate to Denial of Service vulnerabilities. Due to the inherent complexity of signaling and media protocols involved in these systems, it’s extremely difficult to avoid application-level DoS issues entirely.
This challenge is compounded by the unforgiving nature of voice and video communications when it comes to availability and quality. Customers will quickly abandon providers that deliver even slightly degraded service (choppy audio or dropped calls), making DoS vulnerabilities not just a security concern but a critical business risk.
The combination of protocol complexity, the difficulty of implementing robust DoS protection without impacting performance, and the zero-tolerance environment for service degradation makes this a recurring area of concern across virtually every vendor and platform we assess.
Real-time communications require low latency and high availability. How do you balance security hardening with the performance demands of systems like SIP or WebRTC?
The key to balancing security with performance lies in leveraging flexible systems, often open-source applications, that allow for precise customization and optimization. These platforms provide the granular control necessary to implement security hardening without introducing latency or compromising high availability requirements.
Given the internal complexity of various VoIP platforms, having access to the underlying architecture and the ability to fine-tune security implementations becomes crucial. Proprietary, black-box solutions rarely offer the flexibility needed to achieve this balance effectively.
The approach requires deep understanding of both the security landscape and the specific performance characteristics of real-time protocols. Security measures must be implemented with full awareness of their impact on latency, jitter, and overall system responsiveness – because in real-time communications, security controls that degrade user experience ultimately undermine the business value of the entire system.
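To make the jitter point concrete, here is a minimal editorial sketch of the interarrival jitter estimate defined in RFC 3550 (the RTP specification), which could be computed before and after a security control is introduced to see its effect on media. The function name and the sample values are hypothetical; arrival times are assumed to be expressed in the same units as the RTP timestamps.

```python
def interarrival_jitter(arrivals, rtp_timestamps):
    """Running interarrival jitter estimate as defined in RFC 3550.

    arrivals        -- packet arrival times, in RTP timestamp units
    rtp_timestamps  -- RTP timestamps carried by the same packets
    """
    jitter = 0.0
    for i in range(1, len(arrivals)):
        # D: change in transit time between consecutive packets.
        d = (arrivals[i] - arrivals[i - 1]) - (
            rtp_timestamps[i] - rtp_timestamps[i - 1]
        )
        # Exponential smoothing with gain 1/16, per the RFC.
        jitter += (abs(d) - jitter) / 16.0
    return jitter


if __name__ == "__main__":
    # 20 ms packets at 8 kHz advance the RTP timestamp by 160 per packet.
    sent = [0, 160, 320, 480]
    received = [0, 160, 330, 480]  # third packet arrives 10 units late
    print(interarrival_jitter(received, sent))
```

A perfectly paced stream yields zero; any device that queues or inspects packets unevenly pushes the estimate up, which is the kind of side effect worth measuring when hardening a real-time system.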
Equally important is having the right people who can debug security incidents in a SIP or WebRTC environment. Finding or training personnel with this specialized expertise isn’t easy, but it makes a huge difference to the outcome of any incident. When something goes wrong, you need people who understand both the security implications and the protocol-specific nuances to respond effectively.
Looking ahead, what do you see as the biggest challenges for securing voice and video communications—especially with the rise of AI, deepfakes, and real-time voice synthesis?
It’s going to be really interesting to watch this unfold. We’re entering an era where attacks that previously couldn’t scale due to human and software limitations suddenly have the potential to scale dramatically through multi-modal LLMs and voice AI capabilities.
The implications go far beyond traditional protocol security concerns. When attackers can leverage AI to conduct sophisticated social engineering attacks at scale, or when deepfake technology can convincingly impersonate trusted individuals in real-time voice communications, we’re dealing with a completely new threat environment.
I’m actively monitoring this space because the combination of AI capabilities with real-time communications creates both serious risks and interesting opportunities. In fact, this is exactly the type of emerging threat we explore in our monthly RTCSec newsletter at enablesecurity.com/newsletter, where we dive deep into VoIP and WebRTC security developments, providing commentary and fostering discussion about how these evolving technologies impact real-time communications security.
The challenge will be developing security frameworks that can detect and mitigate AI-enhanced attacks while preserving the authentic, low-latency communication experiences that users expect. The traditional boundaries between technical security vulnerabilities and social engineering attacks are blurring, requiring security professionals to think about protecting not just the infrastructure, but the integrity of the communications themselves. These are topics we will be analyzing and discussing with our community as these threats continue to evolve.