How NTP Works

Last update: 20-Apr-2012 20:48 UTC

Related Links

Table of Contents


Introduction and Overview

Note: This page and its dependencies contain a technical description of the Network Time Protocol (NTP) architecture and operation. It is intended for administrators, operators and monitoring personnel. Additional information for nontechnical readers can be found in the white paper Executive Summary: Computer Network Time Synchronization. While this page and its dependencies are primarily concerned with NTP, additional information on related protocols can be found in the white paper IEEE 1588 Precision Time Protocol (PTP).

NTP time synchronization services are widely available in the public Internet. The public NTP subnet currently includes several thousand servers in most countries and on every continent of the globe, including Antarctica, and sometimes in space and on the sea floor. These servers support a total population estimated at over 25 million computers in the global Internet.

The NTP subnet operates with a hierarchy of levels, where each level is assigned a number called the stratum. Stratum 1 (primary) servers at the lowest level are directly synchronized to national time services via satellite, radio or telephone modem. Stratum 2 (secondary) servers at the next higher level are synchronized to stratum 1 servers and so on. Normally, NTP clients and servers with a relatively small number of clients do not synchronize to public primary servers. There are several hundred public secondary servers operating at higher strata and are the preferred choice.

This page presents an overview of the NTP implementation included in this software distribution. We refer to this implementation as the reference implementation only because it was used to test and validate the NTPv4 specification RFC-5905. It is best read in conjunction with the briefings and white papers on the Network Time Synchronization Research Project page. An executive summary suitable for management and planning purposes is on the white paper Executive Summary: Computer Network Time Synchronization.

gif

Figure 1. NTP Daemon Processes and Algorithms

The overall organization of the NTP architecture is shown in Figure 1. It is useful in this context to consider the implementation as both a client of upstream (lower stratum) servers and as a server for downstream (higher stratum) clients. It includes a pair of peer/poll processes for each reference clock or remote server used as a synchronization source. Packets are exchanged between the client and server using the on-wire protocol described in the white paper Analysis and Simulation of the NTP On-Wire Protocols. The protocol is resistant to lost, replayed or spoofed packets.

The poll process sends NTP packets at intervals ranging from 8 s to 36 hr. The intervals are managed as described on the Poll Process page to maximize accuracy while minimizing network load. The peer process receives NTP packets and performs the packet sanity tests described on the Event Messages and Status Words page and flash status word. The flash status word reports in addition the results of various access control and security checks described in the white paper NTP Security Analysis. A sophisticated traffic monitoring facility described on the Rate Management and the Kiss-o'-Death Packet page protects against denial-of-service (DoS) attacks.

Packets that fail one or more of these tests are summarily discarded. Otherwise, the peer process runs the on-wire protocol that uses four raw timestamps: the origin timestamp T1 upon departure of the client request, the receive timestamp T2 upon arrival at the server, the transmit timestamp T3 upon departure of the server reply, and the destination timestamp T4 upon arrival at the client. These timestamps, which are recorded by the rawstats option of the filegen command, are used to calculate the clock offset and roundtrip delay samples:

offset = [(T2 - T1) + (T3 - T4)] / 2,
delay = (T4 - T1) - (T3 - T2).

In this description the transmit timestamps T1 and T3 are softstamps measured by the inline code. Softstamps are subject to various queuing and processing delays. A more accurate measurement uses drivestamps, as described on the NTP Interleaved Modes page.

algorithm described on the Clock Filter Algorithm page selects the offset and delay samples most likely to produce accurate results. Those servers that have passed the sanity tests are declared selectable. From the selectable population the statistics are used by the algorithm described on the Clock Select Algorithm page to determine a number of truechimers according to Byzantine agreement and correctness principles. From the truechimer population the algorithm described on the Clock Cluster Algorithm page determines a number of survivors on the basis of statistical clustering principles. The algorithms described on the Mitigation Rules and the prefer Keyword page combine the survivor offsets, designate one of them as the system peer and produces the final offset used by the algorithm described on the Clock Discipline Algorithm page to adjust the system clock time and frequency. The clock offset and frequency, are recorded by the loopstats option of the filegen command. For additional details about these algorithms, see the Architecture Briefing on the Network Time Synchronization Research Project page.

NTP Timescale and Data Formats

NTP clients and servers synchronize to the Coordinated Universal Time (UTC) timescale used by national laboratories and disseminated by radio, satellite and telephone modem. This is a global timescale independent of geographic position. There are no provisions for local time zone or daylight savings time; however, these functions can be performed by the operating system on a per-user basis.

The UT1 timescale, upon which UTC is based, is determined by the rotation of the Earth about its axis, which is gradually slowing down relative to International Attomic Time (TAI). In order to rationalize UTC with respect to TAI, a leap second is inserted at intervals of about 18 months, as determined by the International Earth Rotation Service (IERS). The historic insertions are documented in the leap-seconds.list file, which can be downloaded from the NIST FTP server. This file is updated at intervals not exceeding six months. Leap second warnings are disseminated by the national laboratories in the broadcast timecode format. These warnings are propagated from the NTP primary servers via other server to the clients by the NTP on-wire protocol. The leap second is implemented by the operating system kernel, as described in the white paper The NTP Timescale and Leap Seconds.

There are two NTP time formats, a 64-bit timestamp format and a 128-bit date format. The date format is used internally, while the timestamp format is used in packet headers exchanged between clients and servers. The timestamp format spans 136 years, called an era. The current era began on 1 January 1900, while the next one begins in 2036. Details on these formats and conversions between them are in the white paper The NTP Era and Era Numbering. However, the NTP protocol will synchronize correctly, regardless of era, as long as the system clock is set initially within 68 years of the correct time. Further discussion on this issue is in the white paper NTP Timestamp Calculations. Ordinarily, these formats are not seen by application programs, which convert these NTP formats to native Unix or Windows formats.

Statistics Budget

Each NTP synchronization source is characterized by the offset and delay samples measured by the on-wire protocol using the equations above. The dispersion sample is initialized with the sum of the server precision and the client precision as each update is received. The dispersion increases at a rate of 15 μs/s after that. For this purpose, the precision is equal to the latency to read the system clock. The offset, delay and dispersion are called the sample statistics.

In a window of eight (offset, delay, dispersion) samples, the algorithm described on the Clock Filter Algorithm page selects the sample with minimum delay, which generally represents the most accurate offset statistic. The selected offset sample determines the peer offset and peer delay statistics. The peer dispersion is a weighted average of the dispersion samples in the window. These quantities are recalculated as each update is received from the source. Between updates, both the sample dispersion and peer dispersion continue to grow at the same rate, 15 μs/s. Finally, the peer jitter is determined as the root mean square (RMS) of the offset samples in the window relative to the selected offset sample. The peer statistics are recorded by the peerstats option of the filegen command. Peer variables are displayed by the rv command of the ntpq program.

The clock filter algorithm continues to process updates in this way until the source is no longer reachable. Reachability is determined by an eight-bit shift register, which is shifted left by one bit as each poll packet is sent, with 0 replacing the vacated rightmost bit. Each time a valid update is received, the rightmost bit is set to 1. The source is considered reachable if any bit is set to 1 in the register; otherwise, it is considered unreachable. When a source becomes unreachable, a dummy sample with "infinite" dispersion is inserted in the shift register at each poll, thus displacing old samples. This causes the peer dispersion to increase eventually to infinity.

The composition of the source population and the system peer selection is redetermined as each update from each source is received. The system peer and system variables are determined as described on the Mitigation Rules and the prefer Keyword page. The system variables are updated from the system peer variables of the same name and the system stratum set one greater than the system peer stratum. The system statistics are recorded by the loopstats option of the filegen command. System variables are displayed by the rv command of the ntpq program.

Although it might seem counterintuitive, a cardinal rule in the selection process is, once a sample has been selected by the clock filter algorithm, older samples are no longer selectable. This applies also to the clock select algorithm. Once the peer variables for a source have been selected, older variables of the same or other sources are no longer selectable. The reason for these rules is to limit the time delay in the clock discipline algorithm. This is necessary to preserve the optimum impulse response and thus the risetime and overshoot.

This means that not every sample can be used to update the peer variables, and up to seven samples can be ignored between selected samples. This fact has been carefully considered in the discipline algorithm design with due consideration for feedback loop delay and minimum sampling rate. In engineering terms, even if only one sample in eight survives, the resulting sample rate is twice the Nyquist rate at any time constant and poll interval.

Quality of Service

Of interest in the following discussion is how an NTP client determines the system performance using a source population including reference clocks and remote servers. This is determined for each source from two statistics, peer jitter and peer distance. Peer jitter is determined from various jitter components as described above. It represents the nominal error in determining the clock offset. Peer distance represents the maximum error of the estimate due to all causes.

The peer distance statistic is computed as one-half the root delay to the primary source of time; i.e., the primary reference clock, plus the root dispersion. The root variables are included in the NTP packet header received from each source. When calculating peer delay , the root delay is the sum of the root delay in the packet and the peer delay, while the root dispersion is the sum of the root dispersion in the packet and the peer dispersion.

A source is considered selectable only if its peer distance is less than the select threshold, by default 1.5 s, but can be changed according to client preference using the maxdist option of the tos command. When an upstream server loses all sources, its peer distance apparent to dependent clients continues to increase. The clients are not aware of this condition and continue to accept synchronization as long as the peer distance is less than the select threshold.

The -peer distance statistic is also used by the select, cluster and mitigation algorithms. In this respect, it is called the root synchronization distance shortened to root distance. The root distance is used in the following ways.

The root distance functions as a metric in the selection and weighting of the various available sources. The strategy using the metric is to minimize the root distance, or equivalently to minimize the maximum error in cases of multiple sources. As used in the NTP design, this is described as a minimax strategy.

The algorithms described on the Mitigation Rules and the prefer Keyword page deliver several important statistics, including system offset and system jitter. These statistics are determined by the clock combine algorithm. System offset is best interpreted as the maximum-likelihood estimate of the system clock offset, while system jitter is best interpreted as the expected error of this estimate.

Maximum system error is determined from delay and dispersion contributions and represents the worst-case error due to all causes. In order to simplify discussion, certain minor contributions to the maximum error statistic are ignored. If the precision time kernel support is available, both the estimated error and maximum error are reported to user programs via the ntp_adjtime() kernel system call. See the Kernel Model for Precision Timekeeping page for further information.