The ICANN policy process for bringing Whois into compliance with privacy law has just about reached the end of its first phase (we will report more on that when it releases the final report). The next phase will focus on what is sometimes called “access” but is more accurately described as the problem of how or whether to disclose the redacted data to third parties with a legitimate interest. The Canadian domain name registrar Tucows has released some data that is highly relevant to that problem. Tucows is one of the world’s largest registrars, serving about 23 million domains.
Before GDPR, Whois allowed anyone in the world to access the contact information, including email addresses and phone numbers, of a domain name registrant. There was no accountability in this regime: we had no idea who was requesting the data, why they wanted it or how they were using it.
Once the law became effective, registrars (aided by an emergency ICANN policy designed to quickly bring ICANN’s Whois requirements into GDPR compliance) began to redact the sensitive personal data. That meant that anyone who wanted to see the sensitive personally identifiable information (PII) had to request it from the registrar.
For the past nine months, Tucows has been keeping track of who is requesting that data from it as it implemented a “tiered access” or “gated Whois” system. Its data about the use of this system is important because it bears on many of the controversies surrounding disclosure policy. The business interests who mined Whois data for years have contended that closing off indiscriminate viewing was a disaster for cybersecurity and Internet hygiene. Open access to the PII, they maintained, was a highly utilized and important part of the Internet ecosystem. They also contended that registrars could not be relied on to disclose the private data to legitimate third parties.
Now these claims can be tested with facts. We can get a sense of how many times third parties actually need the data, who is making those requests, some sense of what their purpose is, and how often the registrar discloses.
Tucows reports that it has received about 2,100 data access requests since its Tiered Access system started last May. That’s roughly one request for every 10,000 domains, or 0.0000932 of the domains under management. But nearly 1,400 (65%) of those requests came from a single company, a brand protection firm named Appdetex. As it happens, Facebook is one of Appdetex’s main clients; the Executive Vice President of Appdetex is Ben Milam, who is the spouse of Facebook’s representative in ICANN’s Expedited Policy Development Process (EPDP), Margie Milam. This connection is relevant because in the early post-GDPR days registrars were reporting very few requests for redacted Whois data, and this was working against the data miners’ arguments in the ICANN policy process. So voila, right before the last two ICANN meetings, Appdetex unloaded about 1,400 disclosure requests on Tucows (and a bunch of other registrars) to make it seem as if the Internet masses were teeming with urgent demands to disclose Whois data. As Tucows put it, “These spikes and the prevalence of certain requestors strongly suggests an attempt to skew the data to create an argument against the loss of public Whois data.”
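For readers who want to check the proportions themselves, here is a minimal sketch of the arithmetic. It uses only the rounded figures quoted in this post (about 2,100 requests, about 23 million domains, nearly 1,400 requests from one company); the variable names are ours, and because the inputs are approximations the results will differ slightly from the exact fractions cited above.

```python
# Rounded figures quoted in the post; all three are approximations ("about", "nearly").
total_requests = 2_100               # disclosure requests received by Tucows
domains_under_management = 23_000_000
single_company_requests = 1_400      # requests attributed to one brand protection firm

# Requests as a fraction of domains under management.
request_rate = total_requests / domains_under_management

# The same ratio expressed as "one request per N domains".
domains_per_request = domains_under_management / total_requests

# Share of all requests that came from the single company.
single_company_share = single_company_requests / total_requests

print(f"Requests per domain:      {request_rate:.7f}")
print(f"Domains per request:      {domains_per_request:,.0f}")
print(f"Share from one requestor: {single_company_share:.0%}")
```

With these rounded inputs the rate works out to a little over nine requests per 100,000 domains, which is consistent with the "roughly one in 10,000" characterization.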
Even more interesting is Tucows’ categorization of the source of these requests. No less than 92% of the requests came from “commercial litigation interests” and relate to “a suspected intellectual property (copyright or trademark) infringement.” As IGP has always maintained, open Whois was primarily a surveillance tool for trademark and copyright interests; it subsidized their policing costs at the expense of domain name registrants’ privacy and registrars’ infrastructure.
According to the Tucows data, the use of Whois by law enforcement and cybersecurity researchers is tiny in comparison. Requests made by law enforcement constituted 2% of the total, and security researchers accounted for only 1%. Registries, the registrants themselves, and third parties interested in purchasing specific domains made up the rest.
Other registrars have had similar experiences. One small registrar reported that they had 48 requests in all: 44 from Appdetex, 3 from IP lawyers, and 1 official law enforcement request.
Also of interest is Tucows’ classification of how they handled requests. Just over 25% of the requests resulted in relevant registration data being provided to the requestor, and only 4% of the requests were denied. The reason most requests didn’t get data is that the requestor did not respond to requests for additional information. As Tucows’ blog put it, “some requests failed to include the requestor’s own identity, their legal basis to access the information, or even which specific domain name they’re asking about. In all cases, we reply promptly to ask for the missing information but, so far, for 70% of the requests we have received, that information was never provided.”
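To put those percentages in terms of request counts, here is a small sketch. It assumes the roughly 2,100 total requests reported earlier in this post and treats “just over 25%” as exactly 25%; the category labels and variable names are ours, and the resulting counts are estimates, not figures Tucows published.

```python
# Approximate total from the post; the true count is "about 2,100".
total_requests = 2_100

# Outcome percentages as quoted in the post ("just over 25%" is rounded down to 25).
outcome_percent = {
    "data disclosed": 25,
    "request denied": 4,
    "no reply to follow-up questions": 70,
}

# Convert each percentage into an estimated number of requests (integer arithmetic).
estimated_counts = {
    outcome: total_requests * pct // 100
    for outcome, pct in outcome_percent.items()
}

# Whatever the quoted percentages leave unaccounted for (rounding, other outcomes).
unaccounted_percent = 100 - sum(outcome_percent.values())

for outcome, count in estimated_counts.items():
    print(f"{outcome}: about {count} requests")
print(f"not covered by the quoted percentages: {unaccounted_percent}%")
```

On these assumptions, the requests abandoned by the requestor outnumber the outright denials by more than seventeen to one.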
Thanks to Tucows for providing a factual basis for continuing discussions of Whois, privacy and data disclosure.
While this is interesting, it doesn’t account for the chilling effect that putting this information behind closed doors has caused. I regularly used WHOIS data when it was available, but not being able to easily automate and access the information means I just simply go without it. I can’t make a script that individually begs for access each and every time I need it.
Milton, thank you for picking up this post. I’d like to make a few clarifications, as in our post we were careful not to call out any specific party. We said in our post, “65% of all requests came on behalf of a single requestor”. That requestor was the Facebook family of trademarks, but not all Facebook-related requests came from AppDetex. 62% of all requests came from AppDetex; 65% of all requests came on behalf of Facebook. We have received requests from AppDetex on behalf of other companies and we have received duplicative requests on behalf of Facebook, both internally duplicative (AppDetex requesting the same data multiple times) and domains that were requested by AppDetex as well as other parties.
Thanks, Reg, for the clarification – that distinction between Facebook requests and Appdetex wasn’t broken down in the blog! Helpful detail, but the basic story is the same. I must say, duplicate, multiple requests from both FB and its brand protection agent must drive you nuts.