

| | #1 (permalink) |
| Platinum Lifetime Member Name: That shouldn't be too hard to figure out... Last Online: 10-29-2009 08:46 AM iTrader: (2) Join Date: May 2006
Posts: 377
DNF$: 4,437 Location: Tampa
Country: | .com Zone File Query - ZFbot.com

Ok - I wanted to share my latest concoction with you all. I just finished building an app/website that lets you search the entire .com zone file (I had to get approval from Verisign to access it). The file contains over 185 million records (including duplicates for domains with multiple name servers)...

It's such a huge file that I had to build the app so that it continually loops through and updates two-character combination sets (you can see which one it's currently updating on the lower left of the app/website). Once it finishes the entire loop (0 through zz), it downloads a new zone file (around 7 GB) and starts the whole process again.

Searching is pretty quick considering the magnitude of the data it has to crunch through. I suggest you type at least 3 characters or it will take longer than you want. You can download the results to a spreadsheet as well...

There are some interesting stats in the grid on the right - domain counts/percentages/changes up or down for each 2-character set (for numbers, I just made one set per digit). Keep in mind that the change values are not yet accurate since it hasn't looped through twice yet... some have been updated... some not yet. The change is interesting in that you'll be able to see which 2-character combinations are being dropped and which are being picked up... You can sort any of the columns too...

If you are in the business of selling domains, it works great to search on your own domains - I found a whole bunch of similar domains that I wasn't even aware were registered - and they turned out to be companies with live websites... so they become potential customers for my domain portfolio. Just one use for the app.

Let me know what you think - Enjoy! http://www.ZFBot.com

Ken

Last edited by kengreenwood; 02-18-2009 at 03:37 PM.. |
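
For readers who want to picture the stats grid described in this post, here is a minimal Python sketch of per-prefix counts, percentages and day-over-day changes. It is not Ken's code: the function names (`prefix_key`, `prefix_stats`) are made up for illustration, and the thread does not say how names with a digit or hyphen in the second position are bucketed, so this sketch simply keys on the first two characters.

```python
from collections import Counter


def prefix_key(domain):
    """Hypothetical bucketing rule: one bucket per leading digit,
    otherwise the first two characters of the name (TLD stripped)."""
    label = domain.lower().split(".")[0]
    return label[0] if label[0].isdigit() else label[:2]


def prefix_stats(today, yesterday):
    """Return one row per prefix with count, percent of total, and change."""
    now = Counter(prefix_key(d) for d in today)
    before = Counter(prefix_key(d) for d in yesterday)
    total = sum(now.values())
    rows = []
    for key in sorted(set(now) | set(before)):
        count = now[key]
        percent = round(100.0 * count / total, 2) if total else 0.0
        rows.append({
            "prefix": key,
            "count": count,
            "percent": percent,
            "change": count - before[key],  # positive = picked up, negative = dropped
        })
    return rows


if __name__ == "__main__":
    today = ["juice.com", "jump.com", "1abc.com", "apple.com"]
    yesterday = ["juice.com", "apple.com", "apply.com"]
    for row in prefix_stats(today, yesterday):
        print(row)
```

The change column only means something once two full snapshots exist, which matches the caveat in the post about the first loop.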
| | |
| | #2 (permalink) |
| Platinum Lifetime Member Last Online: 11-06-2009 06:34 PM iTrader: (0) Join Date: Oct 2002
Posts: 67
DNF$: 221 Location: Audubon, PA
Country: | Ken, sounds great -- thanks for sharing; will check it out. One question: when I applied for zone file access from Verisign a while back, I remember there being a clause in the agreement that stated you were only allowed to download the zone file once in a 24-hour period. Is this not the case any longer? |
| | |
| | #3 (permalink) |
| Platinum Lifetime Member Name: That shouldn't be too hard to figure out... Last Online: 10-29-2009 08:46 AM iTrader: (2) Join Date: May 2006
Posts: 377
DNF$: 4,437 Location: Tampa
Country: | Audobon - Yes, that is still the case. But there really isn't a need to download it more than once a day. Unless you have a monster of a server, you won't be able to process the data more than once a day. My app takes about 12 to 15 hours to loop through all of the combinations and split the master file up (686 unique tables get built... this is for performance. I couldn't build an index on a 185 million record table... don't have the disk space and it would take 3 days to build the index!). You couldn't possibly be 100% accurate on the data unless you had direct access to the source. But once a day gives you probably 99% accuracy... Last edited by kengreenwood; 02-18-2009 at 03:30 PM.. |
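
The figure of 686 tables is consistent with one table per digit plus one per two-letter pair (10 + 26 x 26 = 686), running from 0 through zz. The short Python sketch below enumerates those keys under that assumption; the `zone_` table-name prefix is hypothetical and not taken from the thread.

```python
import string


def bucket_keys():
    """Assumed key set: '0'..'9' followed by 'aa'..'zz'."""
    digits = list(string.digits)
    pairs = [a + b for a in string.ascii_lowercase for b in string.ascii_lowercase]
    return digits + pairs


def table_name(key):
    """Hypothetical naming scheme for the per-bucket tables."""
    return "zone_" + key


if __name__ == "__main__":
    keys = bucket_keys()
    print(len(keys), keys[0], keys[-1])
    print(table_name("ju"))
```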
| | |
| | #4 (permalink) |
| Bloody lovely Last Online: Yesterday 10:43 PM iTrader: (393) Join Date: Feb 2004
Posts: 23,730
DNF$: 3,407 Location: USA
Country: | A great idea and some programming ingenuity at work (the 2-letter table matrix). Now I can cut down on my research time when looking for potential buyers.
__________________ DomainGang.com - Domainers' Most Awesome News Source Acroplex - Web & Graphics Acro.net - My Blog |
| | |
| | #6 (permalink) |
| Name: Dale Hubbard Last Online: Yesterday 12:09 PM iTrader: (45) Join Date: Jan 2003
Posts: 5,868
DNF$: 5,845 Location: Exeter, England
Country: | Ken, you might find that MySQL is too resource hungry for this particular application. Did you consider stripping out all the duplicate NS data from the zone? You can use awk and grep with flat files, and that takes minutes instead of 12-15 hours. If you look at the zone file structure, you can see the distinct points in each line (A, NS) where awk and grep will work to separate out the individual domains. I wrote a bash script that does the main donkey work if you'd like me to dig it out. Anyway, just a thought. I have to say it's a very well presented site indeed!
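
Dale's bash script is not shown in the thread, so the following is only a rough Python stand-in for the awk/grep idea, not his method. It assumes whitespace-separated zone lines such as `EXAMPLE NS NS1.EXAMPLE.COM.`, and it assumes that all NS records for one domain sit on adjacent lines, so duplicates can be dropped the way `uniq` would without holding every name in memory.

```python
def distinct_domains(lines):
    """Yield each owner name once, reading only NS records.

    Assumes records for the same domain are adjacent in the file.
    """
    previous = None
    for line in lines:
        fields = line.split()
        # Skip blanks, comments, glue A records and anything that is not an NS line.
        if len(fields) < 3 or "NS" not in (f.upper() for f in fields[1:-1]):
            continue
        owner = fields[0].rstrip(".").lower()
        if owner != previous:
            previous = owner
            yield owner


if __name__ == "__main__":
    sample = [
        "EXAMPLE NS NS1.EXAMPLE.COM.",
        "EXAMPLE NS NS2.EXAMPLE.COM.",
        "NS1.EXAMPLE A 192.0.2.1",
        "FOO NS NS1.BAR.NET.",
        "FOO NS NS2.BAR.NET.",
    ]
    print(list(distinct_domains(sample)))
```

For a real zone file you would pass an open file object instead of a list, so the data streams through one line at a time.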
__________________ UK Drop Catching Services: Dropsystem.co.uk New! Canada TBR Drop Catching: Dropping.ca New! QUALITY MiniSites: NOTsoMINI.com |
| | |
| | #7 (permalink) | |
| Platinum Lifetime Member Name: That shouldn't be too hard to figure out... Last Online: 10-29-2009 08:46 AM iTrader: (2) Join Date: May 2006
Posts: 377
DNF$: 4,437 Location: Tampa
Country: | Quote:
FYI - I am currently loading the .net domains in as I'm typing this... total between the .com and .net domains will be around 92 million. Once I load all of the .net domains in and split them off for the first time, I'll upload the new .swf front end to the app that allows you to select .com, .net or both in the query. | |
| | |
| | #9 (permalink) |
| Platinum Lifetime Member Name: That shouldn't be too hard to figure out... Last Online: 10-29-2009 08:46 AM iTrader: (2) Join Date: May 2006
Posts: 377
DNF$: 4,437 Location: Tampa
Country: | Thanks Adam... You'll also notice that any of the two letter/digit combos that have yet to be updated today won't have an extension on them... that's due to the fact that I now have .com and .net domains in the database... once it's finished today (and going forward), you'll see the extension. Last edited by kengreenwood; 02-28-2009 at 06:04 PM.. |
| | |
| | #10 (permalink) |
| Bloody lovely Last Online: Yesterday 10:43 PM iTrader: (393) Join Date: Feb 2004
Posts: 23,730
DNF$: 3,407 Location: USA
Country: | Awesome. I miss the days of the unified com/net/org Registry though. Nowadays don't expect to extract similar data from PIR :(
__________________ DomainGang.com - Domainers' Most Awesome News Source Acroplex - Web & Graphics Acro.net - My Blog |
| | |
| | #12 (permalink) |
| Platinum Lifetime Member Name: That shouldn't be too hard to figure out... Last Online: 10-29-2009 08:46 AM iTrader: (2) Join Date: May 2006
Posts: 377
DNF$: 4,437 Location: Tampa
Country: | FYI - I loaded up the new front end with the drop down for selecting .com, .net or both in your query. http://www.zfbot.com I also added a column with a link to the archive.org information for the domain - if there is any, that is... (the wayback machine). Not a major deal, but a nice little feature. Last edited by kengreenwood; 03-01-2009 at 06:40 PM.. Reason: Automerged Doublepost |
| | |
| | #13 (permalink) | ||
| Platinum Lifetime Member Name: John McCormac Last Online: 11-06-2009 10:19 AM iTrader: (0) Join Date: Oct 2006
Posts: 30
DNF$: 1,110 Location: Ireland
Country: | Quote:
Quote:
MySQL could handle the total .com list. The number of distinct .com domains (as of yesterday's zone) was only around 79.5 million. Loading it into a single table on a desktop PC took 1 hour 8 min 56.55 sec. The query time for a two character count was 38.17 seconds with a simple domain based index. A single table is a very inefficient method of doing this kind of work. It all comes down to computability. It is far more efficient to run a set of queries on smaller tables (alphanumerical) and use these results to build your stats table. You can use various tricks such as limiting the number of characters used to build the index or even a number of indexes. It would then be simply a case of running the stats query on each smaller table to update your stats table. I don't know how far back historically you are running your stats table but you are effectively creating a spreadsheet with it. If it is a simple two set historical (today's and yesterday's figure) then it is a lot simpler. You did a good job getting this far. Regards...jmcc
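
The "smaller tables feeding a stats table" approach can be sketched as follows. This uses Python's built-in `sqlite3` module instead of MySQL purely so the example runs anywhere; the table names (`zone_<prefix>`, `stats`) and the function `build_stats` are assumptions for illustration, not anything described in the thread.

```python
import sqlite3


def build_stats(conn, domains_by_prefix):
    """Load each prefix into its own small table, then record its count in `stats`."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS stats "
        "(prefix TEXT PRIMARY KEY, domain_count INTEGER)"
    )
    for prefix, domains in domains_by_prefix.items():
        # Table names cannot be bound as parameters, so validate before formatting.
        if not prefix.isalnum():
            raise ValueError("unexpected prefix: %r" % prefix)
        table = "zone_" + prefix
        cur.execute(f'CREATE TABLE IF NOT EXISTS "{table}" (domain TEXT PRIMARY KEY)')
        cur.executemany(
            f'INSERT OR IGNORE INTO "{table}" (domain) VALUES (?)',
            [(d,) for d in domains],
        )
        # The stats query runs against the small table only.
        (count,) = cur.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
        cur.execute(
            "INSERT OR REPLACE INTO stats (prefix, domain_count) VALUES (?, ?)",
            (prefix, count),
        )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_stats(conn, {"ju": ["juice", "jump", "juice"], "ap": ["apple"]})
    print(conn.execute("SELECT prefix, domain_count FROM stats ORDER BY prefix").fetchall())
```

The primary key on `domain` also absorbs duplicate rows, which is one simple way to deal with the repeated NS entries mentioned earlier in the thread.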
__________________ http://www.hosterstats.com Hoster Stats on 2.9M+ hosters and Domain DNS History Database. Tracks over 236 Million active and deleted domains. | ||
| | |
| | #14 (permalink) | |
| Platinum Lifetime Member Name: That shouldn't be too hard to figure out... Last Online: 10-29-2009 08:46 AM iTrader: (2) Join Date: May 2006
Posts: 377
DNF$: 4,437 Location: Tampa
Country: | Quote:
I got a suggestion from Acro to add a warning on the www link when a site may be "adult" in nature... which was a good suggestion... but determining whether a site is "adult" would be difficult, so now if you mouse over the www button of any domain, you'll see a snapshot of the website, if there is one. You can pretty easily see whether the domain is parked or has a legitimate site up and running... Last edited by kengreenwood; 03-04-2009 at 10:24 AM.. | |
| | |
| | #15 (permalink) | |
| Platinum Lifetime Member Name: John McCormac Last Online: 11-06-2009 10:19 AM iTrader: (0) Join Date: Oct 2006
Posts: 30
DNF$: 1,110 Location: Ireland
Country: | Quote:
I ran a simple test on the .com domain list, breaking it down, loading it into a set of tables and then generating stats subtables. The process of generating the schema, loading the data and generating the stats data took approximately two hours. That was on an old Sempron 3G box running MySQL with a barely tweaked configuration. The whole thing, including parsing and formatting the zonefile data, shouldn't take more than three hours. Breaking it down into smaller tables should only take about an hour and a half - faster if the breakdown was done first and the stats second. Preprocessing the zonefile data will remove the bottleneck that causes your process to take 12 to 15 hours to complete. Regards...jmcc
__________________ http://www.hosterstats.com Hoster Stats on 2.9M+ hosters and Domain DNS History Database. Tracks over 236 Million active and deleted domains. | |
| | |
| | #16 (permalink) | |
| Platinum Lifetime Member Name: That shouldn't be too hard to figure out... Last Online: 10-29-2009 08:46 AM iTrader: (2) Join Date: May 2006
Posts: 377
DNF$: 4,437 Location: Tampa
Country: | Quote:
| |
| | |
| | #17 (permalink) |
| Platinum Lifetime Member Name: That shouldn't be too hard to figure out... Last Online: 10-29-2009 08:46 AM iTrader: (2) Join Date: May 2006
Posts: 377
DNF$: 4,437 Location: Tampa
Country: | Line chart trend... Okay - I've just added a neat little feature to the ZFBot. If you click on any of the rows in the right-hand grid, a line chart will pop up showing the trend. Obviously it doesn't mean much with only a few days of trend stored, but it's gonna look pretty cool after 30, 60, 90 days... or a year or more. I'm going to add either a radio button or a drop down at the bottom of the chart that will allow you to select different date ranges like last 3 months, last year, etc... Check out the 'ju' chart for an example (fictitious history right now). I'll be adding a couple of pie charts as well that show the breakdown of domains... Also - I'm working on adding the ability to search on name server as well. If you wanted to see all of the domains on your name server, for example, you could find them all easily. http://www.zfbot.com |
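
The name server search mentioned at the end of this post had not been released at the time, so the following is only a guess at the general idea: a reverse index from name server to the domains that delegate to it. The function name and the three-field `OWNER NS TARGET` line format are assumptions.

```python
from collections import defaultdict


def domains_by_name_server(lines):
    """Map each name server to the set of domains that list it in an NS record."""
    index = defaultdict(set)
    for line in lines:
        fields = line.split()
        # Assumed record shape: OWNER NS TARGET
        if len(fields) == 3 and fields[1].upper() == "NS":
            domain = fields[0].rstrip(".").lower()
            server = fields[2].rstrip(".").lower()
            index[server].add(domain)
    return index


if __name__ == "__main__":
    sample = [
        "FOO NS NS1.BAR.NET.",
        "FOO NS NS2.BAR.NET.",
        "BAZ NS NS1.BAR.NET.",
    ]
    index = domains_by_name_server(sample)
    print(sorted(index["ns1.bar.net"]))
```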
| | |
| | #19 (permalink) |
| Platinum Lifetime Member Name: That shouldn't be too hard to figure out... Last Online: 10-29-2009 08:46 AM iTrader: (2) Join Date: May 2006
Posts: 377
DNF$: 4,437 Location: Tampa
Country: | One tip for the charts - once the chart is displayed, you can just click or hold down any letter and it will automatically scroll through and find the appropriate value in the grid to display the associated chart. |
| | |
| | #20 (permalink) | |
| CrossLogix.com Last Online: Yesterday 04:44 PM iTrader: (65) Join Date: Mar 2006
Posts: 2,237
DNF$: 2,163 Location: Matthews, NC. U | Quote:
I already used it many times. Didn't know it was yours. But... what did you just say?
__________________ Domain Names For Sale | |
| | |