Precision Time Protocol

High precision clock synchronization that computes latency and offset

Paul Krzyzanowski

February 15, 2021

Goal: Synchronize clocks on a local-area network to sub-microsecond precision.

Precision Time Protocol

The Precision Time Protocol, PTP, is designed to synchronize clocks on a local area network (LAN) to sub-microsecond precision. The underlying assumption is that the accurate time source is on the same LAN as the systems that are synchronizing their clocks. That differs from NTP, which assumes that the time server is remote. An advantage of synchronizing via the LAN is that latency becomes far more predictable. There are no queuing delays due to routers (recall that routers have to store an entire message before forwarding it to the next router) and physical distances tend to be far smaller. Ethernet switches do create some latency, of course, and there is a chance that a packet may be queued if a destination is receiving other messages. Switch latency, however, tend to be on the order of a few microseconds and queuing can be nonexistent if there isn’t much traffic.

To further minimize delay and jitter, high-precision PTP implementations will try to generate timestamps at the very lowest levels of the network stack such as at the MAC layer (Ethernet transceiver) right before sending the packet out.

Best master clock

In a network of computers running PTP, one system has to be elected as the master clock. This system is deemed to have the most accurate clock and other systems are slaves that synchronize from the master.

A best master clock selection process is used to determine which system has the best clock to use for synchronization. This is done via an election where systems present information about their clocks and the best clock is selected from the following ranked attributes:

  1. Priority 1 (an admin-defined hint, allowing the administrator to force the use of a certain system as the master)
  2. Clock class (type of clock)
  3. Clock accuracy
  4. Clock variance: an estimate of stability based on past syncs
  5. Priority 2 (another admin-defined hint, allowing the preference of one system over another if all other values are equivalent)
  6. Unique ID (serves as a tie-breaker on the chance that all other values are equivalent among systems)

PTP messages

PTP synchronizes a clock by exchanging three messages:

  1. The master initiates the sync by sending a sync message.

  2. The slave sends a delay request message back to the master.

  3. The master responds with a delay response message.

A delay request message isn’t quite what it sounds: it is not a query asking the master what the network delay is. The master has no clue. Rather, it is an additional message that allows the slave to get an accurate estimate of the network delay.

PTP synchronization
PTP synchronization

Let’s look at an example. We’ll use hours and minutes instead of microseconds to make the numbers more tangible and feel like time values.

In our example, the clocks on the two systems are not synchronized. In the figure, we note the times for each event with each system’s local clock.

Sync message

The master initiates synchronization at T1, which is 10:00 at the master, by sending a sync message to the slave. That time happens to be 8:00 on the slave’s clock. The slave receives the message at time T2, which is 8:15 on the slave’s clock. It happens to be 10:15 on the master’s clock but the slave doesn’t know that.

The slave now has T1 from the master (10:00) and knows its value of T2 (8:15). The difference between them (ms_difference) is the clock offset plus the delay of sending the message from the master to the slave. We can express this as:

ms_difference = T2 - T1 = offset + ms_delay

In our example, we know that the clock offset is -2:00 (the client is behind by two hours) and the propagation time, ms_delay, is 0:15. The value of offset + ms_delay is -2:00 + 0:15, or -1:45. The slave does not know the delay or offset yet. All it can compute is

ms_difference = T2 - T1 = 8:15 - 10:00 = -1:45

Delay request and response messages

Now the slave sends a delay request message back to the master. This message simply asks the master to tell the slave what time it received this message. Suppose the slave sends the message at T3 (8:45) and the master receives it at T4 (10:55). The master sends back the value of T4 (10:55).

The math is similar to that of the sync message. The slave has T4 from the master (10:55) and knows its value of T3 (8:45). The difference between them (sm_difference) is the negative clock offset plus the delay of sending the message from the master to the slave. The offset is negative because the message went in the opposite direction. We can express this as:

sm_difference = T4 - T3 = -offset + sm_delay

Again, let’s look at the numbers. The clock offset is still -2:00. The message propagation time here was now 0:10. The value of -offset + sm_delay is 2:00 + 0:10 = 2:10. Again, the slave does not know the delay or offset but it can compute:

sm_difference = T4 - T3 = 10:55 - 8:45 = 2:10

Two equations, three unknowns

With these messages, the client has two equations and three unknowns to work with:

ms_difference: T2 - T1 = offset + ms_delay
sm_difference: T4 - T3 = -offset + sm_delay

That isn’t solvable, of course. However, LANs have symmetric bandwidth and we can assume that the latency of sending a message from the master to the slave is the same as that of sending a message from the slave to the master. There will be variations due to process and packet scheduling as well as possible queuing delays in the Ethernet switch, but these will usually be small - a few microseconds.

Hence, if we assume that ms_delay and sm_delay are equivalent, we have:

ms_difference: T2 - T1 = offset + delay
sm_difference: T4 - T3 = -offset + delay

Now, with two equations and two unknowns, we can solve for the offset:

offset = (ms_difference - sm_difference) ÷ 2 = ((T2 - T1) - (T4 - T3)) ÷ 2
offset = (T2 - T1 - T4 + T3) ÷ 2

In our example, with slightly asymmetric delays, we get:

offset = (8:15 - 10:00 - 10:55 + 8:45) ÷ 2 = ()-1:45 - 2:10) ÷ 2 = -3:55 ÷ 2 = -1:57.5

With the asymmetric delay in this example, PTP gave averaged out the message latencies, resulting in an offset of -1:57.5 instead of the correct value of -2:00.

The way the PTP equations were set up, the offset is subtracted from the current time, so the slave will have to decrease its clock by -1:57.5, or advance it by 1:57.5. This results in the system setting its clock to

new_time = current_time - offset new_time = 8:55 - -1:57.5 8:55 + 1:57.5 = 10:52.5

We can also compute the one-way message delay:

delay = (ms_difference + sm_difference) ÷ 2 = ((T2 - T1) + (T4 - T3)) ÷ 2
offset = (T2 - T1 + T4 - T3) ÷ 2

PTP vs. NTP

NTP and PTP are similar in some ways. The vast majority of systems use SNTP (the simple network time protocol) to synchronize their clocks. This is a simplified form of NTP that generally used by systems that will not provide clock synchronization services to others. NTP encourages a client to probe multiple servers to find the best clock, which includes taking into account the latency and jitter of the connection to the clock server. With SNTP, systems are configured to communicate with a single NTP server. NTP clock servers use a full version of the protocol, in which they send messages back and forth and keep history of past synchronizations. This allows them keep a history of clock drift (to estimate the growing inaccuracy of a clock) and to try to estimate the one-way latency much like PTP does. When operating in SMTP mode, servers don’t maintain any state (history) of past synchronizations.

NTP and SNTP usually enables clients to compute a clock value that is accurate to a few milliseconds. PTP has the same goal (clock synchronization that includes taking latency into account). However, there are a few key differences in its implementation.

  1. To obtain nano- or picosecond accuracy, PTP is usually implemented in system firmware (e.g., a dedicated microcontroller or in the network adapter). This avoids the added variable latency of the operating system’s scheduler as data moves through the network stack and is eventually received by the time server.
  2. PTP assumes that all communications occur on the same local area network to avoid the overhead of packet routing.
  3. All the systems in a PTP network are considered peers, any of which may contain an accurate time source. There’s an election process to choose a master - the system that will determine the time for all other clocks in this system.
  4. With NTP, each client decides whether to synchronize and how frequently to do so. With PTP, the elected master controls the synchronization; it decides when to send messages to each system in the network for that system to synchronize its clock.

References

Last modified November 1, 2022.
recycled pixels