Performing whois on a list of domain names
You can also use Linux commandtool whois
. The following code opens a subprocess and looks for a domain.
But you have to be careful with many queries in a short time. The servers will end up blocking you after a while.;)
import subprocess
def find_whois(domain):
# Linux 'whois' command wrapper
#
# Executes a whois lookup with the linux command whois.
# Returncodes from: https://github.com/rfc1036/whois/blob/master/whois.c
domain = domain.lower().strip()
d = domain.split('.')
if d[0] == 'www': d = d[1:]
# Run command with timeout
proc = subprocess.Popen(['whois', domain], stderr=subprocess.PIPE, stdout=subprocess.PIPE)
ans,err = proc.communicate(input)
if err == 1: raise WhoisError('No Whois Server for this TLD or wrong query syntax')
elif err == 2: raise WhoisError('Whois has timed out after ' + str(whois_timeout) + ' seconds. (try again later or try higher timeout)')
ans = ans.decode('UTF-8')
return ans
with open('domains.txt') as input:
with open('out.txt','a') as output:
for line in input:
output.write(find_whois(line))
The operator with open as
processes the stream. The "A" in the output file means the file is opened in append mode.
source to share
Assuming the domains are in a file named domains.txt
and you have it installed pywhois
, then something like this should do the trick:
import whois
infile = "domains.txt"
# get domains from file
with open(infile, 'rb') as f:
domains = [line.rstrip() for line in f if line.rstrip()]
for domain in domains:
print domain
record = whois.whois(domain)
# write each whois record to a file {domain}.txt
with open("%s.txt" % domain, 'wb') as f:
f.write(record.text)
This will output each whois record to a file named {domain}.txt
Without pywhois
:
import subprocess
infile = "domains.txt"
# get domains from file
with open(infile, 'rb') as f:
domains = [line.rstrip() for line in f if line.rstrip()]
for domain in domains:
print domain
record = subprocess.check_output(["whois", domain])
# write each whois record to a file {domain}.txt
with open("%s.txt" % domain, 'wb') as f:
f.write(record)
source to share
It sounds like you already had some helpful answers, but I thought it would be nice to say a little more about the problems associated with bulk WHOIS lookups (and in general) and provide some alternative solutions.
WHOIS Lookup
Finding a single domain name usually involves finding the appropriate WHOIS server for that domain and then querying the information on port 43. If you have access to a unix-like shell (like Bash), you can use whois
to do this easily (as others have noted ):
$ whois example.com
Very similar WHOIS tools were also available as modules for a wide variety of programming languages. One example is the pywhois module for Python.
In its simplest form, bulk WHOIS search simply iterates over the domains list, issuing a whois query for each domain and writing an exit record.
Here's an example in Bash that reads domains from a file domains.txt
and writes each WHOIS record to separate files (if you're using Windows, try Cygwin).
#!/bin/bash
domain_list="domains.txt"
while read line
do
name=$line
echo "Looking up ${line}..."
whois $name > ${line}.txt
sleep 1
done < $domain_list
Beware of the following complications of bulk WHOIS searches:
-
Some WHOIS servers may not give you a complete WHOIS record. This is especially true for country-specific domains (such as .de and .fr) and domains registered with certain registrars (such as GoDaddy).
If you want the most complete entry, you often need to go to the registry website or a third party service that can cache the entry (like DomainTools). This is much more difficult to automate and may have to be done manually. Even then, the record may not contain what you want (such as contact details for the registrant).
-
Some WHOIS servers impose limits on the number of requests you can make in a given period of time. If you hit the limit, you may find that you have to come back a few hours later to request records again. For example, with .org domains, you are limited to no more than three searches per minute, and several registrars will block you for 24 hours.
Your best bet is to pause for a few seconds between searches, or try to shuffle the list of domains by TLD so you don't bother the same server too many times in a row.
-
Some whois servers are often dropped and the request will time out, which means you may need to go back and rerun those requests. ICANN states that whois servers should have a pretty decent uptime , but I've found one or two servers that give out bad records.
Parsing a record
Parsing WHOIS records (e.g. for registrar contact information) can be a challenge because:
-
The records are not always in a consistent format. You will find this with .com domains in particular. A .com record can hold any of the thousands of registrars around the world (not .com, Verisign), and not everyone wants to submit records in the easy-to-parse format recommended by ICANN.
-
Again, the information you want to retrieve may not be in the record you are returning from the search.
As mentioned, pywhois is one option for analyzing WHOIS data. Here's a very simple Python script that looks at the WHOIS record for each domain and extracts the registrar name (* where applicable), writing the results to a CSV file. You can include other fields if you like:
import whois
import csv
with open("domains.txt", "r") as f:
domains = f.readlines()
with open("output_file.csv", "wb") as csvfile:
writer = csv.writer(csvfile)
writer.writerow(["Domain", "Registrant Name"])
for domain in domains:
domain = domain.rstrip()
record = whois.whois(domain)
try:
r_name = record.registrant_name
except AttributeError:
r_name = "error"
writer.writerow([domain, r_name])
* When I quickly tested this script, pywhois was not very reliable in extracting the logger name. Another similar library you could try is pythonwhois
.
source to share
Download and install the Microsoft whois tool from http://technet.microsoft.com/en-us/sysinternals/bb897435.aspx
Create a text file with a list of domain names with a title bar.
name
google.com
yahoo.com
stackoverflow.com
Create a powershell script file:
$domainname = Import-Csv -Path "C:\domains.txt"
foreach($domain in $domainname)
{
.\whois.exe $domain.name Export-Csv -Path "C:\domain-info.csv" -Append
}
Run powershell script.
source to share