Performing whois on a list of domain names

I have a domain names file for example. equivalent to 2500.

I would like to do a whois for these domain names.

The problem is, I've never done this and I don't know where to start. If you have any ideas, I am all ears.

TIA.

+3


source to share


4 answers


You can also use Linux commandtool whois

. The following code opens a subprocess and looks for a domain.

But you have to be careful with many queries in a short time. The servers will end up blocking you after a while.;)



import subprocess

def find_whois(domain):
    # Linux 'whois' command wrapper
    # 
    # Executes a whois lookup with the linux command whois.
    # Returncodes from: https://github.com/rfc1036/whois/blob/master/whois.c

    domain = domain.lower().strip()
    d = domain.split('.')
    if d[0] == 'www': d = d[1:]

    # Run command with timeout
    proc = subprocess.Popen(['whois', domain], stderr=subprocess.PIPE, stdout=subprocess.PIPE)
    ans,err = proc.communicate(input)

    if err == 1: raise WhoisError('No Whois Server for this TLD or wrong query syntax') 
    elif err == 2: raise WhoisError('Whois has timed out after ' + str(whois_timeout) + ' seconds. (try again later or try higher timeout)')
    ans = ans.decode('UTF-8')
    return ans


with open('domains.txt') as input:
    with open('out.txt','a') as output:
        for line in input:
            output.write(find_whois(line))

      

The operator with open as

processes the stream. The "A" in the output file means the file is opened in append mode.

+6


source


Assuming the domains are in a file named domains.txt

and you have it installed pywhois

, then something like this should do the trick:

import whois

infile = "domains.txt"

# get domains from file
with open(infile, 'rb') as f:
    domains = [line.rstrip() for line in f if line.rstrip()]

for domain in domains:
    print domain
    record = whois.whois(domain)

    # write each whois record to a file {domain}.txt
    with open("%s.txt" % domain, 'wb') as f:
        f.write(record.text)

      

This will output each whois record to a file named {domain}.txt




Without pywhois

:

import subprocess

infile = "domains.txt"

# get domains from file
with open(infile, 'rb') as f:
    domains = [line.rstrip() for line in f if line.rstrip()]

for domain in domains:
    print domain
    record = subprocess.check_output(["whois", domain])

    # write each whois record to a file {domain}.txt
    with open("%s.txt" % domain, 'wb') as f:
        f.write(record)

      

+2


source


It sounds like you already had some helpful answers, but I thought it would be nice to say a little more about the problems associated with bulk WHOIS lookups (and in general) and provide some alternative solutions.

WHOIS Lookup

Finding a single domain name usually involves finding the appropriate WHOIS server for that domain and then querying the information on port 43. If you have access to a unix-like shell (like Bash), you can use whois

to do this easily (as others have noted ):

$ whois example.com

      

Very similar WHOIS tools were also available as modules for a wide variety of programming languages. One example is the pywhois module for Python.

In its simplest form, bulk WHOIS search simply iterates over the domains list, issuing a whois query for each domain and writing an exit record.

Here's an example in Bash that reads domains from a file domains.txt

and writes each WHOIS record to separate files (if you're using Windows, try Cygwin).

#!/bin/bash

domain_list="domains.txt"

while read line 
do
    name=$line
    echo "Looking up ${line}..."
    whois $name > ${line}.txt
    sleep 1
done < $domain_list

      

Beware of the following complications of bulk WHOIS searches:

  • Some WHOIS servers may not give you a complete WHOIS record. This is especially true for country-specific domains (such as .de and .fr) and domains registered with certain registrars (such as GoDaddy).

    If you want the most complete entry, you often need to go to the registry website or a third party service that can cache the entry (like DomainTools). This is much more difficult to automate and may have to be done manually. Even then, the record may not contain what you want (such as contact details for the registrant).

  • Some WHOIS servers impose limits on the number of requests you can make in a given period of time. If you hit the limit, you may find that you have to come back a few hours later to request records again. For example, with .org domains, you are limited to no more than three searches per minute, and several registrars will block you for 24 hours.

    Your best bet is to pause for a few seconds between searches, or try to shuffle the list of domains by TLD so you don't bother the same server too many times in a row.

  • Some whois servers are often dropped and the request will time out, which means you may need to go back and rerun those requests. ICANN states that whois servers should have a pretty decent uptime , but I've found one or two servers that give out bad records.

Parsing a record

Parsing WHOIS records (e.g. for registrar contact information) can be a challenge because:

  • The records are not always in a consistent format. You will find this with .com domains in particular. A .com record can hold any of the thousands of registrars around the world (not .com, Verisign), and not everyone wants to submit records in the easy-to-parse format recommended by ICANN.

  • Again, the information you want to retrieve may not be in the record you are returning from the search.

As mentioned, pywhois is one option for analyzing WHOIS data. Here's a very simple Python script that looks at the WHOIS record for each domain and extracts the registrar name (* where applicable), writing the results to a CSV file. You can include other fields if you like:

import whois
import csv

with open("domains.txt", "r") as f:
    domains = f.readlines()

with open("output_file.csv", "wb") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["Domain", "Registrant Name"])
    for domain in domains:
        domain = domain.rstrip()
        record = whois.whois(domain)
        try:
            r_name = record.registrant_name
        except AttributeError:
            r_name = "error"
        writer.writerow([domain, r_name])

      

* When I quickly tested this script, pywhois was not very reliable in extracting the logger name. Another similar library you could try is pythonwhois

.

+2


source


Download and install the Microsoft whois tool from http://technet.microsoft.com/en-us/sysinternals/bb897435.aspx

Create a text file with a list of domain names with a title bar.

name
google.com
yahoo.com
stackoverflow.com

      

Create a powershell script file:

$domainname = Import-Csv -Path "C:\domains.txt"
foreach($domain in $domainname) 
{
   .\whois.exe $domain.name Export-Csv -Path "C:\domain-info.csv" -Append
}

      

Run powershell script.

0


source







All Articles