Welcome to DNF.com™ - Domain Sales, Domain Forum, Domain Appraisals, Domain Registrars

Results 1 to 20 of 143
  1. #1
    Platinum Lifetime Member
    kengreenwood's Avatar
    Join Date
    May 2006
    Location
    Tampa
    Posts
    400

    .com Zone File Query - http://www.ZFbot.com

    Ok - I wanted to share my latest concoction with you all. I just finished building an app/website that will allow you to search the entire .com zone file (had to get approval from Verisign to get access to it). That file contains over 185 million records (duplicates for multi name server domains)... It's such a huge file that I had to build the app so that it is continually looping through and updating two character combination sets (you can see what it's currently updating on the lower left of the app/website). Once it finishes the entire loop (0 through zz), it downloads a new zone file (around 7 gig) and starts the whole process again.

    Searching is pretty quick considering the magnitude of the data it's got to crunch through. I suggest you type at least 3 characters or it will take longer than you want.

    You can download the results to a spreadsheet as well...

    There are some interesting stats in the grid on the right - domain counts/percentages/changes up or down for each 2 character set (in the case of numbers, I just made 1 for each number). Keep in mind that the change values are not yet accurate since it hasn't looped through twice yet... some have been updated...some not yet. The change is interesting in that you'll be able to see what 2 character combinations are being dropped and which ones are being picked up...

    You can sort any of the columns too...

    If you are in the business of selling domains, it works great to search on your own domains - I found a whole bunch of other similar domains that I wasn't even aware were registered - and they turned out to be companies with live websites... sooooo... they become potential customers for my domain portfolio. Just one use for the app.

    Let me know what you think - Enjoy!

    http://www.ZFBot.com

    Ken
    Last edited by kengreenwood; 02-18-2009 at 03:37 PM.

  2. #2
    Exclusive Lifetime Member

    Join Date
    Oct 2002
    Location
    Audubon, PA
    Posts
    95
    Ken, sounds great -- thanks for sharing; will check it out. One question: when I applied for zone file access from Verisign a while back, I remember there being a clause in the agreement that stated you were only allowed to download the zone file once in a 24-hour period. Is this not the case any longer?

  3. #3
    Platinum Lifetime Member
    kengreenwood's Avatar
    Join Date
    May 2006
    Location
    Tampa
    Posts
    400
    Audubon - Yes, that is still the case. But there really isn't a need to download it more than once a day. Unless you have a monster of a server, you won't be able to process the data more than once a day. My app takes about 12 to 15 hours to loop through all of the combinations and split the master file up. 686 unique tables get built, for performance: I couldn't build an index on a 185 million record table because I don't have the disk space, and it would take 3 days to build the index! You couldn't possibly be 100% accurate on the data unless you had direct access to the source, but once a day probably gives you 99% accuracy...
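
    For anyone wondering where 686 comes from: one table per leading digit (10) plus one per two-letter pair (26 x 26 = 676) adds up to 686. Here is a minimal Python sketch of that bucketing. The function names and the "other" fallback are assumptions made for illustration; the thread does not say how names like a1.com or x-ray.com are bucketed, and this is not ZFBot's actual code.

```python
import string


def bucket_names():
    """Return the 686 bucket labels: one per digit plus every two-letter pair."""
    digits = list(string.digits)                  # "0".."9"   -> 10 buckets
    pairs = [a + b
             for a in string.ascii_lowercase
             for b in string.ascii_lowercase]     # "aa".."zz" -> 676 buckets
    return digits + pairs


def bucket_for(domain):
    """Pick the bucket for a domain such as 'example.com'.

    Assumption: anything that is not digit-led or led by two letters
    (for example 'a1.com' or 'x-ray.com') goes to a hypothetical
    catch-all bucket named 'other'.
    """
    label = domain.lower().split(".", 1)[0]
    if not label:
        return "other"
    if label[0] in string.digits:
        return label[0]
    if (len(label) >= 2
            and label[0] in string.ascii_lowercase
            and label[1] in string.ascii_lowercase):
        return label[:2]
    return "other"
```

    With buckets like these, each table stays small enough to index quickly, which is the point of the split.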
    Last edited by kengreenwood; 02-18-2009 at 03:30 PM.

  4. #4
    Bloody Hell
    Acro's Avatar
    Join Date
    Feb 2004
    Location
    USA
    Posts
    28,170
    Country

    Holy See
    A great idea and some programming ingenuity at work (the 2-letter table matrix). Now I can cut down on my research time when looking for potential buyers.

    DomainGang.com - Digital Entertainment for Domainers
    Acroplex - Web & Graphics
    Acro.net - My Blog

  5. #5
    Platinum Lifetime Member
    Coward's Avatar
    Join Date
    Oct 2007
    Posts
    155
    Wonderful!

  6. #6
    Formerly 'aZooZa'
    Dale Hubbard's Avatar
    Join Date
    Jan 2003
    Location
    UK
    Posts
    6,178
    Country

    England
    Ken, you might find that MySQL is too resource hungry for this particular application. Did you consider stripping out all the duplicate NS data from the zone? You can use awk and grep with flat files, and that takes minutes instead of 12-15 hours. If you look at the zone file structure, you can see the distinct points in each line (A, NS) where awk and grep will work to separate out the individual domains. I wrote a bash script that does the main donkey work if you'd like me to dig it out.

    Anyway, just a thought. I have to say it's a very well presented site indeed!
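
    To illustrate the idea (this is not Dale's bash script, which isn't posted in the thread), here is a rough Python sketch of pulling one entry per domain out of the NS lines. It assumes whitespace-separated records of the form NAME [TTL] [CLASS] NS NAMESERVER, with names relative to the zone origin unless they end in a dot; extract_domains and its tld parameter are made-up names.

```python
def extract_domains(lines, tld="com"):
    """Yield each delegated domain once, in first-seen order.

    Assumed line shape: NAME [TTL] [CLASS] NS NAMESERVER.
    A records (glue), comments (;) and directives ($) are skipped.
    """
    seen = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith((";", "$")):
            continue
        fields = line.split()
        # The record type sits between the owner name and the target.
        if len(fields) < 3 or "NS" not in (f.upper() for f in fields[1:-1]):
            continue
        name = fields[0].lower()
        # A trailing dot means fully qualified; otherwise append the TLD.
        name = name[:-1] if name.endswith(".") else f"{name}.{tld}"
        # Skip the zone apex itself and anything already emitted.
        if name in (tld, f"@.{tld}") or name in seen:
            continue
        seen.add(name)
        yield name


sample = [
    "; sample zone fragment",
    "$ORIGIN COM.",
    "EXAMPLE NS NS1.EXAMPLE.COM.",
    "EXAMPLE NS NS2.EXAMPLE.COM.",
    "NS1.EXAMPLE A 192.0.2.1",
    "FOO NS NS1.EXAMPLE.COM.",
]
print(list(extract_domains(sample)))
```

    Holding every name in a set costs a lot of memory at tens of millions of domains, so a real run would more likely stream the names out and de-duplicate with a sort, which is what the awk/grep approach amounts to.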

  7. #7
    Platinum Lifetime Member
    kengreenwood's Avatar
    Join Date
    May 2006
    Location
    Tampa
    Posts
    400
    Quote Originally Posted by aZooZa View Post
    Ken, you might find that MySQL is too resource hungry for this particular application. Did you consider stripping out all the duplicate NS data from the zone? You can use awk and grep with flat files, and that takes minutes instead of 12-15 hours. If you look at the zone file structure, you can see the distinct points in each line (A, NS) where awk and grep will work to separate out the individual domains. I wrote a bash script that does the main donkey work if you'd like me to dig it out.

    Anyway, just a thought. I have to say it's a very well presented site indeed!
    Dale - if you have the script that greps out the domains to a file, that would help a bit... What I'm currently doing is bulk loading the entire file into a table and then deleting any record where it's not a domain... but I still have the NS field and the actual name server name, so it's making the table much bigger than it needs to be, which slows down the rest of the process. I'd love to use Oracle for this but I don't have the time to install Oracle on my server right now and I also don't want any Oracle cronies jumping down my throat about license issues. Could use the express version I guess... anyway, I'd appreciate that script if you can find it... thanks...

    FYI - I am currently loading the .net domains in as I'm typing this... total between the .com and .net domains will be around 92 million. Once I load all of the .net domains in and split them off for the first time, I'll upload the new .swf front end to the app that allows you to select .com, .net or both in the query.

  8. #8
    Administrator
    Adam Dicker's Avatar
    Join Date
    Feb 2003
    Location
    Toronto, Canada
    Posts
    15,750
    Blog Entries
    1
    Country

    Canada
    Excellent Tool!

    -=DCG=-

  9. #9
    Platinum Lifetime Member
    kengreenwood's Avatar
    Join Date
    May 2006
    Location
    Tampa
    Posts
    400
    thanks Adam...

    You'll also notice that any of the two letter/digit combos that have yet to be updated today won't have an extension on them... that's due to the fact that I now have .com and .net domains in the database... once it's finished today (and going forward), you'll see the extension.
    Last edited by kengreenwood; 02-28-2009 at 06:04 PM.

  10. #10
    Bloody Hell
    Acro's Avatar
    Join Date
    Feb 2004
    Location
    USA
    Posts
    28,170
    Country

    Holy See
    Awesome. I miss the days of the unified com/net/org Registry though. Nowadays don't expect to extract similar data from PIR :(


  11. #11
    Dn Guru©
    -ET-'s Avatar
    Join Date
    Nov 2006
    Location
    Neighbourhood
    Posts
    596
    Awesome tool! Will use it for sure in the coming days.

  12. #12
    Platinum Lifetime Member
    kengreenwood's Avatar
    Join Date
    May 2006
    Location
    Tampa
    Posts
    400
    FYI - I loaded up the new front end with the drop down for selecting .com, .net or both in your query.

    http://www.zfbot.com

    I also added a column with a link to the archive.org information for the domain - if there is any, that is... (the wayback machine). Not a major deal, but a nice little feature.
    Last edited by kengreenwood; 03-01-2009 at 06:40 PM. Reason: Automerged Doublepost

  13. #13
    Platinum Lifetime Member
    jmcc's Avatar
    Join Date
    Oct 2006
    Location
    Ireland
    Posts
    87
    Quote Originally Posted by kengreenwood View Post
    Dale - if you have the script that greps out the domains to a file, that would help a bit... What I'm currently doing is bulk loading the entire file into a table and then deleting any record where it's not a domain... but I still have the NS field and the actual name server name, so it's making the table much bigger than it needs to be, which slows down the rest of the process.
    It is a very messy way of doing it. As Dale suggested, it is far quicker to parse the zonefile using scripts. Doing it this way is essential if you are going to mechanise or automate the process. It only takes a few minutes to parse the domains from the zonefile.

    I'd love to use Oracle for this but I don't have the time to install Oracle on my server right now and I also don't want any Oracle cronies jumping down my throat about license issues.
    You don't need to use Oracle. MySQL can handle this kind of thing easily. Crunchwise, you are doing too much too early.

    MySQL could handle the total .com list. The number of distinct .com domains (as of yesterday's zone) was only around 79.5 million. Loading it into a single table on a desktop PC took 1 hour 8 min 56.55 sec. The query time for a two character count was 38.17 seconds with a simple domain based index. Even so, a single table is a very inefficient method of doing this kind of work. It all comes down to computability. It is far more efficient to run a set of queries on smaller tables (alphanumerical) and use these results to build your stats table. You can use various tricks such as limiting the number of characters used to build the index or even a number of indexes.

    It would then be simply a case of running the stats query on each smaller table to update your stats table. I don't know how far back historically you are running your stats table but you are effectively creating a spreadsheet with it. If it is a simple two set historical (today's and yesterday's figure) then it is a lot simpler.
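
    As a rough illustration of the simple two set case (today's figure against yesterday's), the stats rows could be built along these lines. This is a Python sketch rather than SQL; prefix_counts and stats_rows are hypothetical names, and keying digit-led names on the digit alone follows what Ken described for his grid.

```python
from collections import Counter


def prefix_counts(domains):
    """Count domains per prefix: one key per leading digit, else first two characters."""
    counts = Counter()
    for domain in domains:
        label = domain.lower().split(".", 1)[0]
        if not label:
            continue
        key = label[0] if label[0].isdigit() else label[:2]
        counts[key] += 1
    return counts


def stats_rows(today, yesterday):
    """Build (prefix, count, percent of total, change since yesterday) rows."""
    total = sum(today.values())
    rows = []
    for key in sorted(set(today) | set(yesterday)):
        count = today.get(key, 0)
        percent = round(100.0 * count / total, 2) if total else 0.0
        rows.append((key, count, percent, count - yesterday.get(key, 0)))
    return rows


yesterday = prefix_counts(["juice.com", "jump.com", "4u.com"])
today = prefix_counts(["juice.com", "jump.com", "just.com", "exam.com"])
for row in stats_rows(today, yesterday):
    print(row)
```

    A prefix that disappears entirely still gets a row with a zero count and a negative change, so drops show up alongside pickups.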

    You did a good job getting this far.

    Regards...jmcc
    http://www.hosterstats.com
    Hoster Stats on 2.9M+ hosters and Domain DNS History Database.
    Tracks over 236 Million active and deleted domains.

  14. #14
    Platinum Lifetime Member
    kengreenwood's Avatar
    Join Date
    May 2006
    Location
    Tampa
    Posts
    400
    Quote Originally Posted by jmcc View Post
    It is far more efficient to run a set of queries on smaller tables (alphanumerical) and use these results to build your stats table. You can use various tricks such as limiting the number of characters used to build the index or even a number of indexes.
    I'm already doing all of what you stated above. The queries are not occurring against a single table. That would be foolish. I've been working with Oracle/MySQL databases for over 15 years... trust me, I know what I'm doing. But I appreciate the input!

    I got a suggestion from Acro to perhaps add a warning on the www link when it may be "adult" in nature... which was a good suggestion... but determining if it's an "adult" site would be difficult so now if you mouse over the www button of any domains, you'll see a snapshot of the website, if there is one. And you can pretty easily see if the domain is parked or if there is a legitimate site up and running...
    Last edited by kengreenwood; 03-04-2009 at 10:24 AM.

  15. #15
    Platinum Lifetime Member
    jmcc's Avatar
    Join Date
    Oct 2006
    Location
    Ireland
    Posts
    87
    Quote Originally Posted by kengreenwood View Post
    I'm already doing all of what you stated above. The queries are not occurring against a single table. That would be foolish. I've been working with Oracle/MySQL databases for over 15 years... trust me, I know what I'm doing. But I appreciate the input!
    The key to handling large datasets such as zonefiles is preparing the data before inserting it into the database rather than throwing it all into the database and then sorting it out.

    I ran a simple test on the .com domain list, breaking it down, loading it into a set of tables and then generating stats subtables. The process of generating the schema, loading the data and generating the stats data took approximately two hours. That was on an old Sempron 3G box running MySQL with a barely tweaked configuration. The whole thing, including parsing and formatting the zonefile data, shouldn't take more than three hours. Breaking it down into smaller tables should only take about an hour and a half - faster if the breakdown was done first and the stats second. Preprocessing the zonefile data will remove the bottleneck that causes your process to take 12 to 15 hours to complete.
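
    A minimal sketch of what "preparing the data first" can look like, assuming the domain list has already been parsed out of the zonefile: split it into one flat file per prefix, so that each small table can be bulk loaded on its own (for example with MySQL's LOAD DATA INFILE). The function name, file naming and "other" bucket are assumptions for illustration, not the scripts used in the test above.

```python
import os


def split_into_bucket_files(domains, out_dir):
    """Write each domain to out_dir/<bucket>.txt and return per-bucket counts.

    Buckets (an assumed scheme): one per leading digit, one per
    two-letter prefix, and 'other' for everything else.
    """
    os.makedirs(out_dir, exist_ok=True)
    handles = {}
    counts = {}
    try:
        for domain in domains:
            domain = domain.lower()
            label = domain.split(".", 1)[0]
            if not label:
                continue
            prefix = label[:2]
            if label[0].isdigit():
                bucket = label[0]
            elif len(prefix) == 2 and prefix.isascii() and prefix.isalpha():
                bucket = prefix
            else:
                bucket = "other"
            if bucket not in handles:
                path = os.path.join(out_dir, bucket + ".txt")
                handles[bucket] = open(path, "w", encoding="utf-8")
            handles[bucket].write(domain + "\n")
            counts[bucket] = counts.get(bucket, 0) + 1
    finally:
        # Close every file even if a write fails part-way through.
        for handle in handles.values():
            handle.close()
    return counts
```

    Each output file then maps to one table, and the database only ever sees clean, de-duplicated rows.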

    Regards...jmcc

  16. #16
    Platinum Lifetime Member
    kengreenwood's Avatar
    Join Date
    May 2006
    Location
    Tampa
    Posts
    400
    Quote Originally Posted by jmcc View Post
    The key to handling large datasets such as zonefiles is preparing the data before inserting it into the database rather than throwing it all into the database and then sorting it out.

    I ran a simple test on the .com domain list, breaking it down, loading it into a set of tables and then generating stats subtables. The process of generating the schema, loading the data and generating the stats data took approximately two hours. That was on an old Sempron 3G box running MySQL with a barely tweaked configuration. The whole thing, including parsing and formatting the zonefile data, shouldn't take more than three hours. Breaking it down into smaller tables should only take about an hour and a half - faster if the breakdown was done first and the stats second. Preprocessing the zonefile data will remove the bottleneck that causes your process to take 12 to 15 hours to complete.

    Regards...jmcc
    Couldn't agree with you more - I've been chatting with Dale about this as well. My forte is the database work - the pre-processing of the data within Unix, using command line stuff is not my forte. Soooo, if either you or Dale have a script that cleans up the data first, it would speed up my process.

  17. #17
    Platinum Lifetime Member
    kengreenwood's Avatar
    Join Date
    May 2006
    Location
    Tampa
    Posts
    400

    Line chart trend...

    Okay - I've just added a neat little feature to the ZFBot. If you click on any of the rows in the right-hand grid, a line chart will pop up showing the trend. Obviously it doesn't mean much with only a few days of trend stored but it's gonna look pretty cool after 30, 60, 90 days... or a year or more. I'm going to add either a radio button or drop down at the bottom of the chart that will allow you to select different date ranges like last 3 months, last year, etc...

    check out the 'ju' chart for an example (fictitious history right now)

    I'll be adding a couple pie charts as well that show the breakdown of domains...

    Also - I'm working on adding the ability to search on name server as well. If you wanted to see all of the domains on your name server for example, you could find them all easily.
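
    One way a name server lookup like that could be built from the same NS lines is a reverse index from name server to domains. The Python sketch below is only an illustration under assumptions: it expects records shaped NAME [TTL] [CLASS] NS NAMESERVER with relative names lacking a trailing dot, and index_by_nameserver is a made-up name rather than anything in ZFBot.

```python
from collections import defaultdict


def index_by_nameserver(lines, tld="com"):
    """Map each name server to the set of domains delegated to it."""
    index = defaultdict(set)
    for line in lines:
        fields = line.split()
        # Skip blanks, comments (;), directives ($) and short lines.
        if len(fields) < 3 or fields[0].startswith((";", "$")):
            continue
        # Only NS records matter; glue A records are ignored.
        if "NS" not in (f.upper() for f in fields[1:-1]):
            continue
        name = fields[0].lower()
        domain = name[:-1] if name.endswith(".") else f"{name}.{tld}"
        server = fields[-1].lower()
        server = server[:-1] if server.endswith(".") else f"{server}.{tld}"
        index[server].add(domain)
    return index


zone_lines = [
    "EXAMPLE NS NS1.HOST.NET.",
    "FOO NS NS1.HOST.NET.",
    "FOO NS NS2.HOST.NET.",
    "NS1.EXAMPLE A 192.0.2.1",
]
by_server = index_by_nameserver(zone_lines)
print(sorted(by_server["ns1.host.net"]))
```

    In a database the same thing would be a second table keyed on the name server column, which is exactly the data the domain-only tables throw away.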

    http://www.zfbot.com

  18. #18
    Success Is My Only Option
    Carter's Avatar
    Join Date
    Jul 2008
    Location
    Italy
    Posts
    4,249
    Country

    Italy
    Fantastic tool, congrats!!

  19. #19
    Platinum Lifetime Member
    kengreenwood's Avatar
    Join Date
    May 2006
    Location
    Tampa
    Posts
    400
    One tip for the charts - once the chart is displayed, you can just click or hold down any letter and it will automatically scroll through and find the appropriate value in the grid to display the associated chart.

  20. #20
    CrossLogix.com
    copper's Avatar
    Join Date
    Mar 2006
    Location
    Matthews, NC. U
    Posts
    2,548
    Quote Originally Posted by kengreenwood View Post
    One tip for the charts - once the chart is displayed, you can just click or hold down any letter and it will automatically scroll through and find the appropriate value in the grid to display the associated chart.
    Thanks for the Great Tool.
    I already used it many times.
    Didn't know it was yours

    But...
    What did you just say
