While developing a web service that looks for dropped domains, I needed to:
- get a domain’s whois information
- parse the plain-text whois response and save each field separately to a database.
Getting whois information
I looked for solutions on GitHub and found several repos. After trying them, I concluded that it would be risky to rely on repositories with a small number of active contributors. New domain zones appear around the world every month, and it is hard for one person to keep the list of zones and their whois servers up to date.
In the end I decided not to use a solution written in PHP and went with the whois client available in Ubuntu instead. Its GitHub repo, https://github.com/rfc1036/whois, is actively maintained and developed, so the risk that it will be abandoned is quite low.
If you don’t have it on your machine, install it:
apt-get install whois
It’s easy to use: just run whois followed by the domain name.
You will get the whois information as plain text, so the next step is to parse it and save it to the DB.
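Since the service itself is written in PHP, the client has to be called from PHP code. Below is a minimal sketch of one way to do that, assuming the whois binary is on the PATH. The helper name fetchWhois() is made up for this illustration and is not part of any package mentioned here.

```php
<?php
declare(strict_types=1);

/**
 * Runs the system whois client and returns its plain-text output.
 * fetchWhois() is a hypothetical helper, used here only for illustration.
 */
function fetchWhois(string $domain): string
{
    // Accept only letters, digits, dots and hyphens before touching the shell.
    // Non-ASCII (IDN) names would have to be converted to punycode first.
    if (preg_match('/^[a-z0-9.-]+$/i', $domain) !== 1) {
        throw new InvalidArgumentException("Invalid domain: {$domain}");
    }

    // exec() fills $lines with one array entry per line of output.
    $lines = [];
    exec('whois ' . escapeshellarg($domain), $lines);

    if ($lines === []) {
        throw new RuntimeException("whois returned no output for {$domain}");
    }

    return implode("\n", $lines);
}
```

The validation step and escapeshellarg() matter because the domain usually comes from user input and ends up in a shell command.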
Whois text parsing
I didn’t find a good package for this purpose; many of the repos are abandoned. So I “reinvented the wheel”: https://github.com/shapito27/whois.
At first sight, parsing the text may look trivial. The difficulty lies in the different response formats: the whois tool sends requests to different whois servers depending on the domain zone, for example .ru and .co.uk, and each server answers in its own format.
For example, look at the result of running the command:
And now compare it with the first screenshot.
You can get a completely different format for other domain zones. I think you got it :)
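To make the problem concrete, here is a rough sketch of one common way to deal with differing labels: map every known label to a normalized field name. This is only an illustration of the idea and not necessarily how shapito27/whois works internally. The names FIELD_LABELS and parseWhois() are hypothetical, the label list is far from complete, and the sample responses use made-up dates for example.com and example.ru.

```php
<?php
declare(strict_types=1);

// Hypothetical label map: each normalized field lists labels that
// different whois servers use for it (.com and .ru style shown here).
const FIELD_LABELS = [
    'creationDate'   => ['creation date', 'created'],
    'expirationDate' => ['registry expiry date', 'paid-till'],
    'nameServers'    => ['name server', 'nserver'],
];

function parseWhois(string $text): array
{
    $result = ['creationDate' => null, 'expirationDate' => null, 'nameServers' => []];

    foreach (preg_split('/\R/', $text) as $line) {
        // Split on the first colon only, because values such as dates contain colons.
        $parts = explode(':', $line, 2);
        if (count($parts) !== 2) {
            continue;
        }
        $label = strtolower(trim($parts[0]));
        $value = trim($parts[1]);
        if ($value === '') {
            continue;
        }

        foreach (FIELD_LABELS as $field => $labels) {
            if (!in_array($label, $labels, true)) {
                continue;
            }
            if ($field === 'nameServers') {
                // Normalize case and drop the trailing dot some servers add.
                $result[$field][] = strtolower(rtrim($value, '.'));
            } elseif ($result[$field] === null) {
                // Keep the first occurrence of a scalar field.
                $result[$field] = $value;
            }
        }
    }

    return $result;
}

// Made-up sample responses in the .com and .ru label styles.
$com = "Domain Name: EXAMPLE.COM\nCreation Date: 2001-01-01T00:00:00Z\n"
     . "Registry Expiry Date: 2030-01-01T00:00:00Z\nName Server: NS1.EXAMPLE.COM\n";
$ru  = "domain:        EXAMPLE.RU\nnserver:       ns1.example.ru.\n"
     . "created:       2001-01-01T00:00:00Z\npaid-till:     2030-01-01T00:00:00Z\n";

var_export(parseWhois($com));
var_export(parseWhois($ru));
```

Both calls return an array with the same keys, even though the input labels differ. Responses that put values on separate lines below a label would need extra handling, and date formats would still have to be normalized.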
I tried to solve this challenge in the tool I mentioned above. It parses the text and returns structured data:
Shapito27\Whois\Whois::__set_state(array(
    'status' => 1,
    'creationDate' => '1997-03-29 05:00:00',
    'updateDate' => '2020-03-10 18:53:59',
    'expirationDate' => '2028-03-30 04:00:00',
    'nameServers' => array (
        0 => 'a.ns.facebook.com',
        1 => 'b.ns.facebook.com',
        2 => 'c.ns.facebook.com',
        3 => 'd.ns.facebook.com',
    ),
    'registrar' => Shapito27\Whois\Registrar::__set_state(array(
        'id' => '3237',
        'name' => 'RegistrarSafe, LLC',
        'abuseContactEmail' => 'firstname.lastname@example.org',
        'abuseContactPhone' => '+1.6503087004',
    )),
    'registryDomainId' => '2320948_DOMAIN_COM-VRSN',
    'errorMessage' => NULL,
))