A presentation at the recent RIPE79 gives us some initial insight into the recursive resolution of Domain Name System (DNS) queries. Who performs recursive resolution, and therefore has access to DNS query data, is at the center of the DNS over HTTPS (DoH) debate. But as we’ll explain below, the data presented provide only one part of the analysis needed to answer complex questions about market concentration.
For the uninitiated, a recursive resolver performs an intermediate step in a DNS query. A DNS query might, for example, ask “at what address can the domain example.com be found?” A recursive resolver determines the response to a user’s DNS query by querying authoritative name servers and then responding to the user with the answer. In performing this role, a recursive resolver learns a lot about what content a user is accessing.
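To make the privacy point concrete, here is a toy model in Python. It is not a real DNS implementation and sends no network traffic: a hypothetical `ToyRecursiveResolver` answers lookups from an in-memory stand-in for the authoritative side and records every query it handles. The class name, the client label, and the address (taken from the 192.0.2.0/24 documentation range) are all illustrative assumptions.

```python
from collections import defaultdict


class ToyRecursiveResolver:
    """Toy stand-in for a recursive resolver; no real DNS traffic is sent."""

    def __init__(self, authoritative_data):
        # authoritative_data plays the role of the authoritative name servers.
        self.authoritative_data = authoritative_data
        self.cache = {}
        # Everything the resolver operator could observe, keyed by client.
        self.query_log = defaultdict(list)

    def resolve(self, client, name):
        # The resolver sees who asked and what they asked for.
        self.query_log[client].append(name)
        if name not in self.cache:
            # Cache miss: "ask" the authoritative side and remember the answer.
            self.cache[name] = self.authoritative_data.get(name)
        return self.cache[name]


resolver = ToyRecursiveResolver({"example.com": "192.0.2.10"})
answer = resolver.resolve("client-a", "example.com")
seen = dict(resolver.query_log)
```

After one lookup, `seen` already maps the client to the domain it asked about, which is the kind of query data the rest of this post is concerned with.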
Using their Google Ads-based measurement system, APNIC’s Geoff Huston examined how users’ DNS queries were resolved, importantly distinguishing between “open” or public resolvers (see RFC 8499 for a full discussion of DNS terminology), and what we’ll call “closed” recursive resolvers. The data collected so far covers the period from February 2019 to September 2019. This time frame is important because it provides baseline measures before deployment of DoH by Mozilla and Google.
Open vs. closed recursive resolution
Open recursive resolvers are available to anyone who configures their DNS resolver settings to reach them. There are multiple open resolvers operated by organizations across different industry sectors including, but not limited to, search, advertising, internet service, network security, domain name, and content delivery. Some of the more recognizable organizations operating open resolvers include Cisco’s OpenDNS, Cloudflare, and Google. But some ISPs run open resolvers, including Level 3, China Telecom, and China Unicom. Recently, Comcast launched a resolver as part of Google’s DoH beta testing.
Closed resolvers are typically made available only to users on a specific network. This usually happens by pre-configuring equipment, such as a router or a mobile device, to point to the resolver. ISPs usually operate closed resolvers for their customers (e.g., Comcast, Vodafone, Reliance Jio, Uninet, etc.). While it is possible to configure equipment to use one or more resolvers, or even to perform recursive resolution directly, most end users presumably rely on the default resolver setting provided by the network operator.
APNIC’s data and observations
Huston’s presentation looked at both users’ initial and “full set” (i.e., more than one resolver is typically configured) of recursive resolvers responding to a DNS query. Data was aggregated by Autonomous System (AS), that is, the network operating the resolver, and the top 25 ASes were highlighted, with networks operating open resolvers shown in red (below, left). One issue in the presentation seemed to be that it lumped together all types of organizations operating ASes. For example, Huston referred to “same ASes” as ISPs. But it’s unclear whether the category “same ASes” includes only ISPs. It probably also includes enterprises that run their own closed recursive resolver (e.g., to implement “split DNS”), use their ISP’s resolver, or use a managed DNS service.
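The open versus same-AS distinction can be sketched as a small classification step. This is an assumption about how such labeling might work, not a description of APNIC’s actual pipeline: the function name, the “other” category, and the short list of open resolver ASNs (Google’s AS15169 and Cloudflare’s AS13335) are illustrative, and the user-side ASNs in the examples come from the range reserved for documentation.

```python
# Illustrative set of ASNs treated as open resolver operators
# (Google and Cloudflare); a real study would need a fuller list.
OPEN_RESOLVER_ASNS = {15169, 13335}


def classify_resolver(user_asn, resolver_asn, open_asns=OPEN_RESOLVER_ASNS):
    """Label one (user AS, resolver AS) observation."""
    if resolver_asn in open_asns:
        return "open"
    if resolver_asn == user_asn:
        return "same AS"
    # Hypothetical catch-all, e.g. a managed DNS service in another AS.
    return "other"


labels = [
    classify_resolver(64500, 64500),  # user and resolver in one network
    classify_resolver(64500, 15169),  # user points at an open resolver
    classify_resolver(64500, 64501),  # resolver in some third network
]
```

Note that a “same AS” label says nothing about whether the AS is an ISP or an enterprise, which is exactly the ambiguity raised above.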
Unsurprisingly, the data show that ISPs in countries with many users do a lot of DNS resolution, e.g., Reliance Jio in India or ChinaNet. Among the top 25 ASes observed, Huston found that seven networks performing open resolution accounted for 16% of use, while networks with closed resolution accounted for roughly twice that, 31%. Looking at the overall cumulative distribution of resolver sets to users, he found three resolver farms (networks that run multiple resolvers to load-balance queries) used by around 30% of users, and a much larger number of resolver sets (450) handling the query load of 90% of users.
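The cumulative-distribution reading (“how many resolver sets does it take to cover X% of users?”) can be sketched as follows. The function name and the share values are invented for illustration and are not APNIC data; shares are whole percentages to keep the arithmetic exact.

```python
def resolver_sets_to_cover(user_shares, target):
    """Count how many of the largest resolver sets are needed to reach
    `target` percent of users. Shares are percentages of all users."""
    covered = 0
    ranked = sorted(user_shares, reverse=True)
    for count, share in enumerate(ranked, start=1):
        covered += share
        if covered >= target:
            return count
    return None  # target not reachable with the given shares


# Made-up shares for illustration only (not APNIC data).
shares = [15, 10, 5, 5, 5, 4, 3, 2, 1]
sets_for_30 = resolver_sets_to_cover(shares, 30)
sets_for_40 = resolver_sets_to_cover(shares, 40)
```

A steep start followed by a long tail, as in the figures quoted above, shows up as a small count for the first 30% and a much larger one for 90%.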
Figure 1: Huston presentation at RIPE79
Huston also mapped levels of same AS and open resolver use by country. Most countries exhibited high levels of users’ resolution occurring within the same AS (i.e., their ISP). Huston attributed this to DNS resolution being a default setting that most users never touch. However, open resolver use was observably higher in some African countries, Iran, and North Korea, among others. He speculated that African operators might offload DNS resolution to open resolvers (specifically Google’s public DNS) to simplify operations. He also showed that the greatest number of users of Google’s public DNS were in China. But his main conclusion was that the data suggested little centralization of open resolver use so far.
Where is DNS recursive resolution actually “concentrated”?
The focus of the APNIC presentation and much of the DoH press coverage has been on the hypothetical, and factually challenged, threat of centralization of recursive resolution in open resolver operator(s) like Google or Cloudflare. Determining where resolution is occurring and how it changes over time is a straightforward technical assessment. However, evaluating and understanding the market concentration of DNS recursive resolution is more complicated. It depends on two factors: the degree to which resolution occurs in an AS’s open or closed resolvers (i.e., market share), and how many competitors that AS has within the relevant market. This latter factor is neglected in Huston’s study.
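One common way to fold both factors into a single number is the Herfindahl-Hirschman Index (HHI), the sum of squared market shares. Neither this post’s analysis nor the APNIC presentation uses the index; the sketch below only illustrates how share and number of competitors interact, and the share figures are invented.

```python
def hhi(shares_percent):
    """Herfindahl-Hirschman Index: sum of squared market shares in percent.
    Ranges from near 0 (many tiny providers) to 10000 (a single provider)."""
    return sum(share ** 2 for share in shares_percent)


# Hypothetical local market where three ISPs do all recursive resolution.
three_isps = hhi([50, 30, 20])             # 2500 + 900 + 400

# Same hypothetical market after some users move to three open resolvers.
with_open = hhi([35, 20, 15, 15, 10, 5])   # 1225 + 400 + 225 + 225 + 100 + 25
```

In this made-up case the index falls from 3800 to 2200 as resolution spreads across more providers, which is why the number of competitors matters as much as any single provider’s share.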
With the APNIC data and some work, we can begin to explore this. Figure 2 below shows, at the country level, the percentage of users using resolvers within the same AS, with countries grouped by region. You can see that, in the Americas and Europe, the use of these resolvers is over 60% in three-quarters of the countries observed, and that those regions show the smallest dispersion and overall range of values. This starts to explain why “anti-DoH” advocacy has been so vociferous in countries like the United States and the United Kingdom.
Figure 2: Country-level % of users resolving DNS queries with recursive resolver in same AS
A clearer picture emerges by identifying the ASes. Figure 3 drills down into the United States, ranking the top 25 AS operators by the number of samples collected (a proxy for their relative size). They are almost exclusively major ISPs, with user DNS resolution in the same AS averaging around 84%, and open resolution use only around 14%. This is consistent with Huston’s suggestion and others’ claims that as many as 97% of users in some countries rely on the default resolver settings provided by their ISP. The next step in understanding market concentration would be to identify the number of competing ISPs in the relevant market, e.g., in Atlanta, there are 3-4 fixed broadband providers; in many other cities there are fewer. Policymakers could then evaluate how concentrated DNS recursive resolution within ISPs is today, versus a potentially expanded number of open recursive resolution competitors available to users with deployment of DoH. Regardless, it seems clear that the market share of DNS query resolution is currently relatively high among ISPs, and with that comes the control and benefits of access to users’ DNS query data.
Figure 3: % of resolver resolution occurring in same AS among top 25 ASes in the United States
Exploring alternative explanations of open resolver use
As mentioned, Huston speculated about why higher rates of open resolver use were observed in certain places. Of course, without speaking directly to network operators it’s difficult to know exactly why they rely on open resolvers. But we can examine some country-level measurements as proxies and see how they relate to resolver use. For instance, if, as Huston suggested, African operators are using open resolvers because it’s easier operationally, we might expect to see associated decreases in network performance, e.g., higher broadband and mobile network latency levels, in those countries as a result. Indeed, we found some evidence of this in the African region, with higher mobile latency levels correlated with higher levels of open resolver use, r(31) = .375, p < .038. Variables that capture overall network operational costs, or staffing and capacity levels (e.g., DNS expertise), might be more appropriate to examine.
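For readers less familiar with the notation, r(31) is a Pearson correlation with 31 degrees of freedom, which implies 33 paired country observations (df = n - 2). The sketch below shows how such a coefficient and its t statistic are computed. It is a generic textbook calculation with hypothetical function names, not the analysis code behind the figures in this post.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, df):
    """t value for testing r against zero, where df = n - 2."""
    return r * math.sqrt(df / (1 - r ** 2))


# t statistic implied by the reported r(31) = .375.
t_reported = t_statistic(0.375, 31)
```

The p-value then comes from comparing that t value against a t distribution with the same degrees of freedom, for example with a statistics package.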
But what about observations of higher open resolver use in places like Iran, Afghanistan, North Korea, and China? Proponents of DoH argue that DNS query data should be confidential. Confidentiality of DNS query data is particularly important to users in countries where authoritarian governments and/or state-controlled ISPs might conduct surveillance, censorship, or filtering. What does the data say? Is there any relationship between the use of open resolvers and countries where citizens are less free or governments less supportive of civil liberties, conditions presumably associated with a greater likelihood of surveillance, censorship, and filtering? Yes, there is. Using Freedom House (FH) scores, we found that lower levels of individual political rights and civil liberties were associated with higher open resolver use, r(175) = -.221, p < .003. We found further support in the Reporters Sans Frontières (RSF) data (reversed scale for easy comparison), which measures levels of pluralism, media independence, media environment and self-censorship, as well as the legislative framework, transparency, and the quality of the infrastructure that supports the production of news and information. Again, higher levels of open resolver use were associated with lower RSF scores, r(197) = -.199, p < .005. It appears that this relationship is entirely driven by open resolver use in the Asia region. Based on both data sources, Internet censorship/surveillance correlates with the use of open resolvers at around r = -.2, which corresponds to roughly 4 to 5% of the variation in open resolver use.
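For reference, the share of variation associated with a Pearson correlation is its square. Applying that standard relationship to the two all-country coefficients reported above gives the following; region-specific coefficients, which are not reported here, could differ.

```latex
r^2_{\mathrm{FH}} = (-0.221)^2 \approx 0.049, \qquad
r^2_{\mathrm{RSF}} = (-0.199)^2 \approx 0.040
```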
Figure 4: Open resolver use and FH/RSF scores
Measuring users’ recursive resolution, and determining whether it occurs in “open” or “closed” (same AS) resolvers, is extraordinarily useful. But instead of speculating, we should (and can) empirically understand market concentration of DNS recursive resolution and DNS query data today. Understanding market concentration requires not only data on DNS recursive resolution market share, but also analysis of the competitive landscape among ISPs or other DNS providers.
The APNIC dataset, combined with other national-level indicators, offers some initial insight. DNS recursive resolution occurs in the same AS (usually ISPs) in the Americas and Europe at higher levels than in other regions. In the largest networks, users’ recursive resolution appears to occur overwhelmingly within that AS. The benefits associated with DNS recursive resolution and access to DNS query data are already concentrated in a relatively small number of actors. Knowing this helps explain why we see such strong advocacy against DoH by ISPs in certain countries. Outside of those regions, users rely less on ISPs’ default recursive resolver settings, particularly in more authoritarian countries, supporting claims by proponents of DoH that confidentiality of DNS query data matters.