Whois parsing shell + PHP

While I was developing a web service (to look for drop domains), I wanted:

  • to get the domain’s whois information
  • to parse the plain-text whois response and save each field separately to the database.

Getting whois information

I looked for solutions on GitHub and found several repos. I tried them and concluded that it would be risky to rely on repositories with a small number of active contributors. The reason is that new domain zones appear around the world every month, and it is difficult for one person to keep the list of zones and their whois servers up to date.

Eventually, I decided against the solutions written in PHP in favor of the whois client available on Ubuntu. Its GitHub repo, https://github.com/rfc1036/whois, is actively maintained and developed, so the risk that it will be abandoned is quite low.

If you don’t have it on your machine, install it:

apt-get install whois

It’s easy to use: just run whois followed by the domain name.

whois thisis-blog.ru


You will get the whois information as plain text, so the next step is to parse it and save it to the DB.
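To call the whois client from PHP, something like the following could work. This is only a minimal sketch: the function names buildWhoisCommand and fetchWhois are made up for illustration, it assumes a Linux host where the whois binary is on the PATH, and it assumes shell_exec is not disabled in php.ini.

```php
<?php
// Hypothetical helper: validates the domain and builds the shell command.
function buildWhoisCommand(string $domain): string
{
    // Reject anything that does not look like a host name before it reaches the shell.
    $pattern = '/^(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z0-9-]{2,63}$/iD';
    if (!preg_match($pattern, $domain)) {
        throw new InvalidArgumentException("Invalid domain: {$domain}");
    }

    // escapeshellarg() quotes the value so it is passed as a single argument.
    return 'whois ' . escapeshellarg($domain);
}

// Hypothetical helper: runs the command and returns the raw text, or null if nothing came back.
function fetchWhois(string $domain): ?string
{
    $output = shell_exec(buildWhoisCommand($domain) . ' 2>/dev/null');

    return is_string($output) && trim($output) !== '' ? $output : null;
}

// Usage (needs network access and the whois package installed):
// $raw = fetchWhois('thisis-blog.ru');
```

Validating the domain before building the command matters here because the value usually comes from user input and ends up in a shell command line.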

Whois text parsing

I didn’t find a nice package for this purpose. Many of the repos are abandoned. So I “reinvented the wheel”: https://github.com/shapito27/whois.

At first sight, the task of text parsing can look trivial. The difficulty lies in the different response formats: the whois tool sends requests to different whois servers depending on the domain zone (for example, .ru and .co.uk), and each server answers in its own format.

For example, look at the result of running the command: whois car.co.uk

And now compare it with the first screenshot.

You can get a completely different format for other domain zones. I think you got it :)
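One way to cope with differing formats is to map the various labels onto a single set of field names. The sketch below is not the code of the tool described here; the alias table, the function name parseWhoisText and the choice of fields are assumptions made for illustration, and the labels listed are only a few examples of what registries use.

```php
<?php
// Hypothetical alias table: canonical field => labels used by different whois servers.
const FIELD_ALIASES = [
    'creationDate'   => ['creation date', 'created', 'registered on'],
    'expirationDate' => ['registry expiry date', 'paid-till', 'expiry date'],
    'registrar'      => ['registrar'],
];

// Hypothetical parser: reads "label: value" lines and returns the canonical fields.
function parseWhoisText(string $text): array
{
    // Every known field starts as null, so missing data is explicit.
    $result = array_fill_keys(array_keys(FIELD_ALIASES), null);

    foreach (preg_split('/\R/', $text) as $line) {
        // Split on the first colon only, so timestamps keep their own colons.
        $parts = explode(':', $line, 2);
        if (count($parts) !== 2) {
            continue;
        }

        $label = strtolower(trim($parts[0]));
        $value = trim($parts[1]);
        if ($value === '') {
            continue; // the value is on another line or missing
        }

        foreach (FIELD_ALIASES as $field => $aliases) {
            // Keep the first match for each field.
            if ($result[$field] === null && in_array($label, $aliases, true)) {
                $result[$field] = $value;
            }
        }
    }

    return $result;
}

// Usage:
// $fields = parseWhoisText($rawWhoisText);
// $fields['creationDate'] holds the raw date string, or null if no known label matched.
```

This only handles responses where the label and the value share a line. Responses that put the value on the following line need extra handling, which is part of why a dedicated parser is more work than it first seems.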

I tried to solve this challenge with the new tool I mentioned above. The tool parses the text and returns structured data:

array (
  'status' => 1,
  'creationDate' => '1997-03-29 05:00:00',
  'updateDate' => '2020-03-10 18:53:59',
  'expirationDate' => '2028-03-30 04:00:00',
  'nameServers' =>
  array (
    0 => 'a.ns.facebook.com',
    1 => 'b.ns.facebook.com',
    2 => 'c.ns.facebook.com',
    3 => 'd.ns.facebook.com',
  ),
  'registrar' =>
  array (
    'id' => '3237',
    'name' => 'RegistrarSafe, LLC',
    'abuseContactEmail' => '[email protected]',
    'abuseContactPhone' => '+1.6503087004',
  ),
  'registryDomainId' => '2320948_DOMAIN_COM-VRSN',
  'errorMessage' => NULL,
)
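The dates in this output all share one format (year-month-day hour:minute:second), which is convenient for a database column. How the tool produces them is not shown here; a minimal sketch of such a normalization step, with the hypothetical function name normalizeWhoisDate and the assumption that dates should be stored in UTC, might look like this:

```php
<?php
// Hypothetical helper: converts a raw whois date string to "Y-m-d H:i:s" in UTC.
function normalizeWhoisDate(string $raw): ?string
{
    $raw = trim($raw);
    if ($raw === '') {
        return null; // an empty string would otherwise be parsed as "now"
    }

    $utc = new DateTimeZone('UTC');
    try {
        // Strings without a time zone are interpreted as UTC.
        $date = new DateTimeImmutable($raw, $utc);
    } catch (Exception $e) {
        return null; // PHP could not parse the string
    }

    // Strings that carry their own zone or offset are converted to UTC here.
    return $date->setTimezone($utc)->format('Y-m-d H:i:s');
}

// Usage:
// normalizeWhoisDate('1997-03-29T05:00:00Z'); // '1997-03-29 05:00:00'
```

Returning null for unparseable input keeps bad values out of the database instead of silently storing a wrong date.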
