Quote:
Originally Posted by aZooZa
Ken, you might find that MySQL is too resource hungry for this particular application. Did you consider stripping out all the duplicate NS data from the zone? You can use awk and grep with flat files, and that takes minutes instead of 12-15 hours. If you look at the zone file structure, you can see the distinct points in each line (A, NS) where awk and grep will work to separate out the individual domains. I wrote a bash script that does the main donkey work if you'd like me to dig it out.
Anyway, just a thought. I have to say it's a very well presented site indeed!
Dale - if you have the script that greps out the domains to a file, that would help a bit. What I'm currently doing is bulk loading the entire file into a table and then deleting any record that isn't a domain. But I still have the NS field and the actual name server name, so the table ends up much bigger than it needs to be, which slows down the rest of the process. I'd love to use Oracle for this, but I don't have the time to install Oracle on my server right now, and I also don't want any Oracle cronies jumping down my throat about license issues. Could use the Express version, I guess. Anyway, I'd appreciate that script if you can find it. Thanks.
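For reference, here is a rough sketch of the pre-filter idea: pull the domain names out of the flat file before anything touches the database, so only one column ever gets bulk loaded. This is not Dale's bash script. It is written in Python as a stand-in, and it assumes a zone file where each delegation line looks like `EXAMPLE NS NS1.EXAMPLE.COM.` (owner name relative to the TLD, optional TTL/class before the type) and where the NS lines for one domain sit next to each other. The function name `extract_domains` and the `tld` parameter are made up for illustration.

```python
from typing import Iterable, Iterator


def extract_domains(lines: Iterable[str], tld: str = "com") -> Iterator[str]:
    """Yield each delegated domain once from zone-file style lines.

    Assumes NS records for the same domain are adjacent, so duplicates
    can be dropped by comparing with the previous name only.
    """
    previous = None
    for line in lines:
        fields = line.split()
        # Skip blanks, short lines, comments and directives ($ORIGIN, $TTL).
        if len(fields) < 3 or fields[0].startswith((";", "$")):
            continue
        # Keep NS records only; glue A records and everything else are dropped.
        # The record type may follow an optional TTL and class.
        if "NS" not in (f.upper() for f in fields[1:-1]):
            continue
        name = fields[0].lower().rstrip(".")
        # Skip records that belong to the zone apex itself.
        if name in ("@", tld):
            continue
        # Owner names are assumed relative to the TLD unless already qualified.
        if not name.endswith("." + tld):
            name = f"{name}.{tld}"
        if name != previous:
            yield name
            previous = name
```

Writing the yielded names to a plain text file, one per line, would give a single-column file for the bulk load, with no name server data left to delete afterwards. If the NS lines for a domain are not adjacent in your file, the output would need an extra sort and de-duplicate pass.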
FYI - I am loading the .net domains in as I type this. The total across the .com and .net domains will be around 92 million. Once all of the .net domains are loaded and split off for the first time, I'll upload the new .swf front end to the app, which lets you select .com, .net or both in the query.