VoIP Explained: How Voice Over Internet Protocol Actually Works

VoIP turns speech into digital packets and sends them across the internet using protocols such as SIP and codecs such as G.711. This guide breaks down how a VoIP call is built, routed and kept clear.

Chandraketu Tripathi
Finance Editor, Kaeltripton
Published 5 Jun 2026
Last reviewed 5 Jun 2026
✓ Fact-checked
BROADBAND & TELECOMS
KEY FACTS
  • VoIP stands for Voice over Internet Protocol and carries calls as data packets rather than over a dedicated voice circuit.
  • Session Initiation Protocol (SIP) is the signalling standard most commonly used to set up, manage and end VoIP calls.
  • The G.711 codec defined by the ITU encodes voice at 64 kilobits per second per direction without compression.
  • The G.729 codec compresses voice to around 8 kilobits per second, using far less bandwidth than G.711.
  • Jitter, packet loss and insufficient bandwidth are the main causes of degraded VoIP call quality.
TL;DR

VoIP converts speech into digital packets, uses SIP to set up the call and a codec such as G.711 to encode the audio, then routes the packets over the internet to the other party in near real time.

What VoIP Is

Every time a call is placed over a broadband line rather than a traditional copper circuit, Voice over Internet Protocol is doing the work in the background. VoIP is the family of technologies that lets the human voice travel across the same internet that carries web pages and video. Rather than holding open a dedicated electrical path between two telephones for the duration of a call, VoIP breaks the conversation into small digital packets and sends them across a shared network, reassembling them at the far end fast enough that the two people hear each other in something very close to real time.

This matters because the United Kingdom is in the middle of retiring its analogue telephone network. Openreach is moving every line onto an internet protocol platform, with the switch-off of the old public switched telephone network (PSTN) now scheduled for the end of January 2027, which means VoIP is becoming the default rather than a specialist alternative. Understanding how it works helps explain why call quality depends on the broadband connection and why a VoIP line behaves differently from the old copper service.

How Audio Becomes Digital Packets

A telephone call starts as sound: pressure waves that a microphone turns into a continuously varying electrical signal. VoIP cannot send that continuous signal directly, so it samples the audio many times per second and converts each sample into a number, a process known as analogue-to-digital conversion. The stream of numbers is then grouped into packets, each carrying a small slice of the conversation along with addressing information that tells the network where the packet should go.
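The sampling and packetising steps can be sketched in a few lines of Python. This is an illustration rather than how any particular handset does it: the 8 kHz sample rate and 20 ms of audio per packet are assumed values (both are common in telephony), and the function names are made up for the example.

```python
import math

SAMPLE_RATE_HZ = 8000   # assumed narrowband telephony rate
FRAME_MS = 20           # assumed amount of audio carried per packet
SAMPLES_PER_FRAME = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 160 samples


def sample_tone(freq_hz: float, duration_ms: int) -> list[int]:
    """Sample a sine wave and quantise each sample to a signed 16-bit integer."""
    count = SAMPLE_RATE_HZ * duration_ms // 1000
    return [
        round(32767 * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE_HZ))
        for n in range(count)
    ]


def packetise(samples: list[int]) -> list[dict]:
    """Group samples into fixed-size frames, each tagged for later reassembly."""
    packets = []
    for seq, start in enumerate(range(0, len(samples), SAMPLES_PER_FRAME)):
        packets.append({
            "seq": seq,              # position of this packet in the stream
            "timestamp": start,      # index of the first sample in the frame
            "payload": samples[start:start + SAMPLES_PER_FRAME],
        })
    return packets


samples = sample_tone(freq_hz=440.0, duration_ms=100)
packets = packetise(samples)
print(len(samples), "samples ->", len(packets), "packets")
```

A real device would pass each frame through a codec before sending it; here the raw samples stand in for the encoded payload.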

Those packets travel independently across the internet and may even take different routes to reach the destination. At the far end the receiving device places them back in order, converts the numbers back into an electrical signal and drives a speaker so the listener hears the original voice. Because each packet carries a sequence number and a timestamp, the receiver can detect when packets arrive late, out of sequence or not at all, which is central to keeping the conversation intelligible. The whole cycle of capture, packetise, transmit and reassemble happens continuously throughout the call.
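A minimal sketch of the header fields that make this ordering and timing check possible, assuming the media stream uses RTP (RFC 3550), the usual carrier for VoIP audio. The code packs and unpacks only the fixed 12-byte RTP header; the function names and the SSRC value are hypothetical, and header extensions, padding and the marker bit are left out.

```python
import struct

RTP_VERSION = 2
RTP_HEADER_FORMAT = "!BBHII"   # network byte order, 12 bytes in total


def build_rtp_packet(payload_type: int, seq: int, timestamp: int,
                     ssrc: int, payload: bytes) -> bytes:
    """Prefix an audio payload with a fixed RTP header."""
    first_byte = RTP_VERSION << 6        # no padding, no extension, no CSRC list
    second_byte = payload_type & 0x7F    # marker bit left clear
    header = struct.pack(
        RTP_HEADER_FORMAT,
        first_byte,
        second_byte,
        seq & 0xFFFF,                    # 16-bit sequence number
        timestamp & 0xFFFFFFFF,          # 32-bit timestamp in sample units
        ssrc & 0xFFFFFFFF,               # identifier of the sending stream
    )
    return header + payload


def parse_rtp_header(packet: bytes) -> dict:
    """Read back the fields the receiver needs to order and time the audio."""
    first_byte, second_byte, seq, timestamp, ssrc = struct.unpack(
        RTP_HEADER_FORMAT, packet[:12]
    )
    return {
        "version": first_byte >> 6,
        "payload_type": second_byte & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Payload type 0 is G.711 mu-law; 160 zero bytes stand in for 20 ms of audio.
packet = build_rtp_packet(0, seq=7, timestamp=1120, ssrc=0x1234ABCD,
                          payload=bytes(160))
print(parse_rtp_header(packet))
```

The sequence number lets the receiver spot gaps and reordering, while the timestamp tells it when each frame should be played relative to the others.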

The Role of SIP

Sending audio packets is only half of a telephone call. Something has to set the call up, make the other phone ring, agree how the audio will be encoded and tear the call down at the end. That signalling job is most commonly handled by the Session Initiation Protocol, usually shortened to SIP. SIP is the language two systems use to say who is calling whom, to negotiate the parameters of the session and to manage events such as answering, holding, transferring and hanging up.

It helps to separate the two streams. SIP carries the control messages, the equivalent of dialling and ringing, while the actual voice travels in a separate media stream once the call is connected. This separation is why a VoIP system can advertise which codecs it supports during call setup and then settle on one both ends understand before any speech is exchanged. SIP also underpins SIP trunking, the method by which businesses connect their telephone systems to the wider network over IP.
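The codec agreement described above can be sketched as a simple matching step. This is an assumption-laden simplification: real SIP endpoints exchange session descriptions with many more parameters, and the `choose_codec` helper below is hypothetical. The payload type numbers are the static ones assigned in RFC 3551, where PCMU and PCMA are the two variants of G.711.

```python
from typing import Optional

# Static RTP payload type numbers for three common codecs (RFC 3551).
PAYLOAD_TYPES = {"PCMU": 0, "PCMA": 8, "G729": 18}


def choose_codec(offered: list[str], supported: set[str]) -> Optional[str]:
    """Return the first codec in the caller's preference order that the
    called party also supports, or None if there is no common codec."""
    for codec in offered:
        if codec in supported:
            return codec
    return None


caller_offer = ["PCMA", "PCMU", "G729"]    # caller's preference order
callee_supports = {"PCMU", "G729"}

agreed = choose_codec(caller_offer, callee_supports)
if agreed is None:
    print("No common codec: the call cannot carry audio")
else:
    print("Agreed codec:", agreed, "payload type", PAYLOAD_TYPES[agreed])
```

Because the choice is settled during signalling, both ends know how to decode the media stream before the first audio packet arrives.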

Codecs: G.711, G.729 and the Trade-Offs

A codec is the component that encodes the sampled audio into digital form and decodes it again at the other end, and the choice of codec shapes both quality and bandwidth. The G.711 codec, standardised by the International Telecommunication Union, samples the voice 8,000 times per second and stores each sample in 8 bits, which gives 64 kilobits per second per direction with no further compression and delivers clear audio at the cost of higher bandwidth. The G.729 codec compresses the same voice to roughly 8 kilobits per second, using far less of the connection but applying more processing to do so.

The practical trade-off is straightforward. Where bandwidth is plentiful, an uncompressed codec such as G.711 keeps the audio faithful. Where bandwidth is constrained, a compressed codec such as G.729 fits more simultaneous calls into the same link at the price of some audio fidelity and extra processing. The table below sets out the main technical components of a VoIP call and what each one does.

Component     | Role in a VoIP call
SIP           | Sets up, manages and ends the call session
G.711 codec   | Encodes voice at 64 kbit/s per direction, no compression
G.729 codec   | Compresses voice to around 8 kbit/s to save bandwidth
Media stream  | Carries the encoded audio packets between parties
Jitter buffer | Smooths out packets that arrive at uneven intervals
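
The codec rates in the table cover the audio payload only. A rough sketch of what a call occupies on the wire once packet headers are added, assuming 20 ms of audio per packet and IPv4, UDP and RTP headers totalling 40 bytes (link-layer framing such as Ethernet is ignored, so real figures are a little higher). The function names are illustrative.

```python
HEADER_BYTES = 20 + 8 + 12   # IPv4 + UDP + RTP headers (assumed, no link layer)


def call_bandwidth_kbps(codec_kbps: float, packet_ms: int = 20) -> float:
    """Bandwidth of one direction of a call, including per-packet headers."""
    payload_bytes = codec_kbps * 1000 / 8 * packet_ms / 1000
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + HEADER_BYTES) * 8 * packets_per_second / 1000


def calls_per_link(link_kbps: float, codec_kbps: float) -> int:
    """How many simultaneous calls fit in one direction of a link."""
    return int(link_kbps // call_bandwidth_kbps(codec_kbps))


for name, rate in (("G.711", 64), ("G.729", 8)):
    per_call = call_bandwidth_kbps(rate)
    print(f"{name}: {per_call:.0f} kbit/s per direction, "
          f"{calls_per_link(1000, rate)} calls on a 1 Mbit/s uplink")
```

Under these assumptions G.711 comes to 80 kbit/s and G.729 to 24 kbit/s per direction. The headers are a fixed cost per packet, which is why they take up a far larger share of a G.729 call than of a G.711 call.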

How Calls Are Routed

Once SIP has agreed the session and a codec has been chosen, the audio packets are addressed and pushed onto the network. They pass through the home or office router, across the broadband provider's network and on through the internet or a provider's voice network until they reach the destination. The packets carry the addressing needed for each network device to forward them towards their target, in the same way that any other internet traffic is routed, which is why VoIP does not need a dedicated circuit reserved end to end.

Because the packets share the network with everything else, the route they take and the conditions along it can vary moment to moment. A jitter buffer at the receiving end holds incoming packets briefly so that small variations in arrival time can be smoothed before the audio is played back. This buffering is a deliberate compromise: a little delay is traded for steadier, more intelligible sound. The receiver also handles packets that arrive out of order or not at all, which leads directly to the factors that determine quality.
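The delay-versus-steadiness compromise can be illustrated with a toy fixed-size jitter buffer. This is a sketch under simple assumptions: one packet is sent every 20 ms, playback starts a fixed `buffer_ms` after the first packet arrives, and any packet that misses its playback slot is discarded. Real buffers usually adapt their depth during the call, and the arrival times here are invented.

```python
PACKET_MS = 20   # assumed spacing between packets at the sender


def playout_schedule(arrivals: dict[int, float], buffer_ms: float) -> dict[int, str]:
    """Classify each packet as 'played', 'late' or 'lost'.

    arrivals maps sequence number -> arrival time in milliseconds.
    Playback of the first packet is held back by buffer_ms; every later
    packet is due PACKET_MS after the one before it.
    """
    first_seq, last_seq = min(arrivals), max(arrivals)
    base = arrivals[first_seq] + buffer_ms
    result = {}
    for seq in range(first_seq, last_seq + 1):
        deadline = base + (seq - first_seq) * PACKET_MS
        if seq not in arrivals:
            result[seq] = "lost"      # never arrived
        elif arrivals[seq] <= deadline:
            result[seq] = "played"    # arrived in time for its slot
        else:
            result[seq] = "late"      # arrived after its slot had passed
    return result


# Packet 2 is delayed in the network and packet 3 never arrives.
arrivals = {0: 100.0, 1: 125.0, 2: 175.0, 4: 182.0}

print("40 ms buffer:", playout_schedule(arrivals, buffer_ms=40))
print("10 ms buffer:", playout_schedule(arrivals, buffer_ms=10))
```

With the deeper buffer the delayed packet is still played, at the cost of 40 ms of added delay. With the shallow buffer the same packet misses its slot and is heard as a gap, just like the packet that was lost outright.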

What Affects Quality

Three technical factors dominate VoIP call quality. Jitter is the variation in the time between packets arriving; high jitter makes audio sound choppy because the receiver struggles to reassemble a steady stream. Packet loss occurs when packets fail to arrive at all, producing gaps or clipped words. Insufficient bandwidth, where the connection cannot carry the call alongside other traffic, forces packets to queue or be dropped and degrades the conversation. Latency, the overall delay end to end, can also make a call feel awkward even when the audio itself is clear.
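Jitter and packet loss can both be put into numbers. The sketch below uses the running interarrival jitter estimate defined for RTP in RFC 3550, which smooths each new timing difference into the previous value with a gain of 1/16. Treating the media stream as RTP is an assumption, and the send and receive times are invented sample data in milliseconds.

```python
def interarrival_jitter(send_ms: list[float], recv_ms: list[float]) -> float:
    """Running jitter estimate in the style of RFC 3550.

    For each pair of consecutive packets, compare the gap between their
    arrival times with the gap between their send times, then smooth the
    absolute difference into the running estimate.
    """
    jitter = 0.0
    for i in range(1, len(send_ms)):
        d = (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        jitter += (abs(d) - jitter) / 16
    return jitter


def loss_percent(received_seqs: list[int], expected_count: int) -> float:
    """Share of expected packets that never arrived, ignoring duplicates."""
    missing = expected_count - len(set(received_seqs))
    return 100.0 * missing / expected_count


send_times = [0.0, 20.0, 40.0, 60.0]
steady = [50.0, 70.0, 90.0, 110.0]     # constant 50 ms transit time
uneven = [50.0, 75.0, 90.0, 110.0]     # second packet held up by 5 ms

print("steady jitter:", interarrival_jitter(send_times, steady))
print("uneven jitter:", round(interarrival_jitter(send_times, uneven), 3))
print("loss:", loss_percent([0, 1, 2, 4], expected_count=5), "%")
```

A constant delay produces no jitter at all, which is why latency and jitter are treated as separate problems: the first stream is 50 ms late but perfectly steady.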

These factors explain why a VoIP line is only as good as the connection beneath it. Prioritising voice traffic on the local network, ensuring enough headroom in the broadband link and using a wired path rather than a congested wireless one all help keep jitter and loss low. The codec choice interacts with this too: a compressed codec needs less bandwidth, but it tends to suffer more from each lost packet, so it leaves less margin for error if the connection is already strained.
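One common way to prioritise voice traffic is for the sending application to mark its packets with a DSCP value that routers can use to put them in a faster queue. The sketch below assumes the Expedited Forwarding class (DSCP 46), which is conventionally used for voice. The helper names are hypothetical, the `IP_TOS` socket option is available on Linux and macOS but may be ignored on other platforms, and whether any router honours the marking depends entirely on how the network is configured.

```python
import socket

DSCP_EF = 46   # Expedited Forwarding, the class conventionally used for voice


def dscp_to_tos(dscp: int) -> int:
    """DSCP occupies the top six bits of the IP TOS byte, so shift left by two."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


def open_voice_socket() -> socket.socket:
    """Create a UDP socket whose outgoing packets are marked as voice.

    Not called here: the marking only matters once packets are actually
    sent, and some platforms restrict or ignore this option.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(DSCP_EF))
    return sock


print("TOS byte for voice:", hex(dscp_to_tos(DSCP_EF)))
```

On a home network the same effect is often achieved in the router's quality of service settings rather than in the application.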

Frequently Asked Questions

What is VoIP?

VoIP stands for Voice over Internet Protocol, a family of technologies that carries telephone calls as digital data packets over the internet rather than over a dedicated analogue circuit. The voice is digitised, packetised, transmitted and reassembled at the far end in near real time. It is becoming the standard as the UK retires its analogue network by 2027.

How does a VoIP call travel from one phone to another?

The speaker's voice is sampled and converted into digital packets, which are addressed and sent across the router, the broadband network and the wider internet to the destination. Each packet may take its own route and is reassembled in order at the far end. A jitter buffer smooths arrival times before the audio is played back.

What is SIP?

SIP, the Session Initiation Protocol, is the signalling standard most commonly used to set up, manage and end VoIP calls. It handles the control messages, such as making the phone ring and negotiating the codec, while the voice itself travels in a separate media stream. SIP also underpins SIP trunking used by businesses.

What bandwidth does VoIP need?

The bandwidth depends on the codec: the uncompressed G.711 codec uses 64 kilobits per second per direction, while the compressed G.729 codec uses around 8 kilobits per second. Additional overhead from packet headers adds to these figures in practice. Enough headroom is needed so voice traffic is not squeezed by other use of the connection.

What causes poor VoIP call quality?

The main causes are jitter (the uneven arrival of packets), packet loss (packets that fail to arrive at all) and insufficient bandwidth on the connection. High latency can also make a call feel awkward even when the audio is otherwise clear. Prioritising voice traffic and ensuring a stable, uncongested connection help keep quality high.

DISCLAIMER: Kael Tripton Ltd is not authorised or regulated by the Financial Conduct Authority. This article is for informational purposes only and does not constitute financial, legal, or professional advice. Always seek independent professional advice before making financial decisions. Kael Tripton Ltd, registered in England and Wales (No. 17177071), is registered with the ICO under ZC135439.

Editorial Disclaimer

The content on Kaeltripton.com is for informational and educational purposes only and does not constitute financial, investment, tax, legal or regulatory advice. Kaeltripton.com is not authorised or regulated by the Financial Conduct Authority (FCA) and is not a financial adviser, mortgage broker, insurance intermediary or investment firm. Nothing on this site should be construed as a personal recommendation. Rates, figures and product details are indicative only, subject to change without notice, and should always be verified directly with the relevant provider, HMRC, the FCA register, the Bank of England, Ofgem or other appropriate authority before any financial decision is made. Past performance is not a reliable indicator of future results. If you require regulated financial advice, please consult a qualified adviser authorised by the FCA.

Chandraketu (CK) Tripathi, founder and lead editor of Kael Tripton. 22 years in finance and marketing across 23 markets. Writes on UK personal finance, tax, mortgages, insurance, energy, and investing. Sources: HMRC, FCA, Ofgem, BoE, ONS.
