- We have extracted the NTP codebase from BitKeeper
This was not a trivial achievement. One of our team members has more experience doing messy repository conversions than anyone else alive, including conversions of gnarly and venerable projects like groff and Emacs. Producing a really high-quality conversion took him ten weeks of concentrated effort. You can read more about how to spot a high-quality repository conversion; our NTP history has all the traits of a good one.
- We have significantly hardened the code against buffer overruns
To prevent buffer overruns, we have replaced all unsafe string functions (strcpy/strcat/strtok/sprintf/vsprintf/gets) with safe versions that take a buffer bound.
- We have removed about a hundred seventy thousand lines of obsolete code
Less code means less attack surface. Here are some things we’ve found to remove:
| Size (share of original) | Item |
|---|---|
| 45.0 KLOC (20%) | In-tree copy of libevent2 |
| 29.0 KLOC (13%) | Undocumented, deprecated, obsolete, or broken refclocks |
| 24.0 KLOC (11%) | Unused ISC library code |
| 13.0 KLOC (5%) | |
| 10.0 KLOC (4%) | Removal of legacy Windows code |
| 9.0 KLOC (4%) | Code obsolete because we assume a POSIX.1-2001/C99 base |
| 9.0 KLOC (4%) | Removal of ntpdc |
| 3.0 KLOC (1%) | Move of sntp/ntpdig to Python |
| 2.0 KLOC (2%) | Replacement of ntpdate by a shell wrapper around ntpdig |
| 4.0 KLOC (1%) | Removal of Autokey |
| 7.0 KLOC (3%) | Replacement of C ntpq in Python |
| 15.0 KLOC (7%) | |
| 2.0 KLOC (1%) | Simplification of async-DNS lookup |
| 173.0 KLOC (74%) | Total removed |
| 58.0 KLOC (25%) | Remaining |
| 231.0 KLOC (100%) | Total at conversion time |
(Totals will not sum exactly due to rounding errors. Numbers generated using loccount.)
These removals include most of the bulk of ntpd, the most security-critical code. And the numbers understate the size of the removals slightly because the way we counted "remaining" code includes some test code and features added since the repository conversion.
- We have improved the accuracy of time stepping with real hardware by a factor of 10
When NTP was originally written, computer clocks only delivered microsecond precision. Now they deliver nanosecond precision (though not all of that precision is accurate). By changing some internal representations we have made NTPsec able to use the full precision of modern clocks, which yields an accuracy improvement of a factor of 10 or more with real hardware such as GPSDOs and dedicated time radios.
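To see why the internal representation matters, here is a minimal sketch in Python. It is not NTPsec's actual code (which is C); the helper names and the sample clock reading are hypothetical. It only shows how much of a nanosecond-resolution reading survives a round trip through a (seconds, microseconds) pair versus a (seconds, nanoseconds) pair.

```python
NS_PER_SEC = 1_000_000_000


def to_usec_pair(total_ns):
    """Store a time as (seconds, microseconds); sub-microsecond digits are dropped."""
    sec, rem_ns = divmod(total_ns, NS_PER_SEC)
    return sec, rem_ns // 1000


def to_nsec_pair(total_ns):
    """Store a time as (seconds, nanoseconds); nothing is dropped."""
    return divmod(total_ns, NS_PER_SEC)


def usec_pair_to_ns(sec, usec):
    """Convert a (seconds, microseconds) pair back to nanoseconds."""
    return sec * NS_PER_SEC + usec * 1000


def nsec_pair_to_ns(sec, nsec):
    """Convert a (seconds, nanoseconds) pair back to nanoseconds."""
    return sec * NS_PER_SEC + nsec


# Hypothetical clock reading, in nanoseconds since some epoch.
reading_ns = 1_700_000_000_123_456_789

lost_with_usec = reading_ns - usec_pair_to_ns(*to_usec_pair(reading_ns))
lost_with_nsec = reading_ns - nsec_pair_to_ns(*to_nsec_pair(reading_ns))

print(lost_with_usec)  # 789 (nanoseconds lost to truncation)
print(lost_with_nsec)  # 0
```

A microsecond-based representation can discard up to 999 ns per reading. The sketch demonstrates only that representation loss; the factor-of-10 figure above is a measured result and is not derived here.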
- We have implemented a simpler refclock configuration syntax
The ntp.conf configuration syntax for declaring local reference clocks was notoriously odd and opaque. While we continue to support it for backward compatibility, we have implemented and documented a simpler and more comprehensible syntax. No more magic 127.127.t.u addresses or numeric driver types!
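As an illustration of what the old pseudo-addresses encode, the following Python sketch splits a 127.127.t.u address into its driver type t and unit u, then renders a name-based declaration. The function names are hypothetical, the driver table is a small illustrative subset, and the `refclock <name> unit <n>` output form is an assumption about the new syntax that should be checked against the ntp.conf documentation.

```python
# Illustrative subset of classic driver type numbers; not a complete table.
DRIVER_NAMES = {20: "nmea", 22: "pps", 28: "shm"}


def decode_magic_address(addr):
    """Split a classic 127.127.t.u pseudo-address into (driver type, unit)."""
    octets = addr.split(".")
    if len(octets) != 4 or octets[:2] != ["127", "127"]:
        raise ValueError("not a refclock pseudo-address: %r" % addr)
    return int(octets[2]), int(octets[3])


def describe(addr):
    """Render a name-based declaration; the output form is an assumption."""
    driver_type, unit = decode_magic_address(addr)
    # "typeN" is a made-up fallback for drivers missing from the subset above.
    name = DRIVER_NAMES.get(driver_type, "type%d" % driver_type)
    return "refclock %s unit %d" % (name, unit)


print(decode_magic_address("127.127.28.0"))  # (28, 0)
print(describe("127.127.28.0"))              # refclock shm unit 0
```

The point of the sketch is that the old form hides two small integers inside something that looks like an IP address, while a name-based form states the driver and unit directly.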
- We have added a statistics-file visualizer to help diagnose problems
Tuning the configuration of a time server is a subtle and sometimes tricky process. The causes of problems can range from misconfiguration through power outages to bad network weather and beyond. NTP keeps logs of operating statistics that can be very helpful - if you know how to interpret them. NTPsec supplies a visualization tool that makes those numbers into graphs so you can (literally) see what’s going on.
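For a sense of the raw material such a tool works from, here is a minimal Python sketch that parses a few loopstats-style records and draws a crude text plot of the clock offset. It is not the visualizer's code. It assumes the conventional loopstats field order (modified Julian day, seconds past midnight, offset in seconds, frequency offset in ppm, jitter, wander, time constant). The sample records and all function names are made up for illustration.

```python
def parse_loopstats_line(line):
    """Parse one whitespace-separated loopstats record into a dict."""
    fields = line.split()
    return {
        "mjd": int(fields[0]),
        "seconds": float(fields[1]),
        "offset_s": float(fields[2]),
        "freq_ppm": float(fields[3]),
        "jitter_s": float(fields[4]),
    }


def text_bar(value_us, scale_us=1.0):
    """Render an offset as a signed bar, one character per scale_us microseconds."""
    length = int(round(abs(value_us) / scale_us))
    return ("-" if value_us < 0 else "+") * length


# Made-up sample records in the assumed loopstats layout.
SAMPLE = """\
50935 75440.031 0.000006019 13.778190 0.000351733 0.013380 6
50935 75504.031 -0.000004000 13.778200 0.000300000 0.013000 6
50935 75568.031 0.000010000 13.778300 0.000320000 0.012000 6
"""

records = [parse_loopstats_line(ln) for ln in SAMPLE.splitlines() if ln.strip()]
offsets_us = [rec["offset_s"] * 1e6 for rec in records]
worst_us = max(abs(off) for off in offsets_us)

for rec, off in zip(records, offsets_us):
    print("%10.3f %8.3f us %s" % (rec["seconds"], off, text_bar(off)))
print("worst offset: %.3f us" % worst_us)
```

Even this crude plot makes a sign flip or a sudden jump in offset easier to spot than a column of numbers, which is the motivation for a proper graphing tool.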
- We have applied Coverity scanning and fixed bugs where they were revealed
NTP Classic had occasionally been Coverity-checked in the past, but those warnings were not systematically fixed. We found 40 unresolved defect reports and dealt with them appropriately.
- We have fixed all compiler warnings
All compiler warnings have been either fixed or suppressed for principled reasons (like "that one is caused by a known GCC optimizer bug"). This may seem like a small thing, but it can have a significant effect on maintainability and downstream defect rates. Noise warnings are a kind of undergrowth in which real warnings and the bugs that trigger them can lurk undetected.
- We have increased the visibility of potential defects to static analyzers
By well-defined transformations of the code, including improving its type discipline, we can give static analyzers and verification tools better traction to find defects and potential vulnerabilities. We’ve got a good running start on this already, changing the code to use C99 bools wherever possible and safe.
- We have moved the build to waf
The NTP Classic build system was 31,000 lines of impacted autotools cruft, a Lovecraftian nightmare of complexity that reached out from sunken R’lyeh to trouble the dreams of the living. We switched to waf, an alternative single-phase build system also used (among other places) by the Samba project. Our builds got an order of magnitude faster, and the new build recipe is about 1 KLOC!
- We have updated and reorganized the documentation
The NTP in-tree documentation was in terrible shape - incomplete where it was not actively misleading, with some parts of it that needed updating (such as the WHERE-TO-START file) actually unmodified since 1998. It had a serious problem with different documents telling different stories. We’ve fixed all that; every command-line switch and every configuration entity now has a single point of truth, and the manual pages and Web documentation are generated from the same master files. Archaisms have been removed and much has been updated.
- We have reduced the set of languages used in the suite to C and Python
The code we inherited used way too many languages - there were scripts lurking in Perl, awk, and S. By discarding or replacing this stuff we have made the suite more maintainable, leaving fewer dark corners for bugs to lurk in. All the scripting we install is now Python, creating opportunities for code reuse across programs that didn’t exist before.
- We have added an improved reporting tool for time service operators
We have collected on the gains from Pythonizing all our scripting by adding ntpmon, an administrator’s tool that can watch the status of a time server in near real time. It is especially useful for catching startup problems and traffic spikes that might indicate a DoS attempt.
- We have earned trust in the InfoSec research community
NTPsec was the first NTP implementation to respond to and develop a fix for CVE-2015-7704, the KoD off-path attack bug that received news coverage in Ars Technica and elsewhere in October 2015. Even before we first shipped code we participated in the mitigation and disclosure process on over a dozen CVEs, and developed good working relationships with some key players in the security community. They have already learned to trust us to respond rapidly and effectively to vulnerability reports.
- We have achieved measurable gains in security
Since early 2016 and our 0.1 release it has become routine that, in public disclosures of batches of CVEs against NTP Classic, around 75% fail to affect NTPsec at all because we had pre-hardened the code or removed the relevant attack surface.
You can read more details about differences from NTP Classic.
You should probably read What we plan to do next.