Welcome to DNF.com™ - Domain Sales, Domain Forum, Domain Appraisals, Domain Registrars

  1. #1
    stevey (DNF Regular)
    Join Date: Aug 2004; Location: Button Moon; Posts: 887; DNF$: 8,887

    spiders, how to block?

    Hi, I was wondering if there is a way to stop spiders picking up URLs such as:

    www.mysite.com/index.php?url=fdsf.sdfer,sader
    http://www.mysite.com/index.php?url=jjj.tbbtyr.pbz

    (There are billions of combinations of /index.php?url=*code*, so blocking each one individually isn't a realistic option.)

    Yet I still want the spiders to crawl www.mysite.com/index.php

    Does anyone have any suggestions or ideas that would work?

    Thanks,
    steve

    I'll give 1,000 DNF$ to anyone who can solve this problem.
    http://www.goodridgeelec.com
    Electrical Contractors, West Midlands, UK

  2. #2
    DogFaceBoy (Platinum Lifetime Member)
    Join Date: Sep 2005; Location: Canada; Posts: 1,038; DNF$: 4,789

    Re: spiders, how to block?

    In your robots.txt just disallow:

    http://www.mysite.com/index.php?url=*

  3. #3
    stevey (DNF Regular)

    Re: spiders, how to block?

    thanks for the help

  4. #4
    DogFaceBoy (Platinum Lifetime Member)

    Re: spiders, how to block?

    Np, thanks for the payment.

  5. #5
    kokopelli (Platinum Lifetime Member)
    Join Date: Jul 2004; Location: USA; Posts: 1,063; DNF$: 5,278

    Re: spiders, how to block?

    I'm not too sure your suggested robots syntax is correct. I did a test robots.txt file as per your suggestion and ran it through an online validator and this is what I got:
    Disallow: http://www.mysite.com/index.php?url=*
    The "*" wildchar in file names is not supported by (all) the user-agents addressed by this block of code. You should use the wildchar "*" in a block of code exclusively addressed to spiders that support the wildchar (Eg. Googlebot).
    You can't use an absolute URL. Please remove the "http://" and the domain name and insert just a file/directory full path, starting from the root directory (Example: /pagename.html).
    The Disallow field has an inherent wildcard nature. The standard dictates that /bob would disallow /bob.html and /bob/index.html (both the file bob and files in the bob directory will not be indexed). Another example, Disallow: /help disallows both /help.html and /help/index.html, whereas Disallow: /help/ would disallow /help/index.html but allow /help.html
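
The prefix behaviour described in the validator output above can be sketched in a few lines of Python. This is only a simplified illustration of plain prefix matching, not how any particular crawler is implemented; the function name `is_disallowed` is made up for this example, and wildcard and Allow handling are deliberately left out.

```python
def is_disallowed(path: str, disallow_rules: list[str]) -> bool:
    """Return True if `path` starts with any non-empty Disallow prefix.

    Simplified sketch: plain prefix matching only, with no wildcard
    or Allow support. An empty Disallow value blocks nothing.
    """
    return any(bool(rule) and path.startswith(rule) for rule in disallow_rules)


# The examples from the validator output above.
print(is_disallowed("/bob.html", ["/bob"]))           # True
print(is_disallowed("/bob/index.html", ["/bob"]))     # True
print(is_disallowed("/help/index.html", ["/help/"]))  # True
print(is_disallowed("/help.html", ["/help/"]))        # False
```

The trailing slash is what makes the difference in the last two cases: "/help.html" does not start with "/help/", so it is left alone.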

    So perhaps your robots.txt file should rather just read:
    User-agent: *
    Disallow: /index.php?
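
As a rough check of the suggested file against the URLs from the original question, here is a small Python sketch. It assumes a crawler that does plain prefix matching on the path plus query string, and it only understands a single `User-agent: *` block with `Disallow` lines. `parse_disallows` and `is_blocked` are hypothetical helper names for this example, not part of any library.

```python
from urllib.parse import urlsplit

ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php?
"""


def parse_disallows(robots_txt: str) -> list[str]:
    """Collect Disallow prefixes from the `User-agent: *` block only."""
    rules = []
    in_star_block = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            in_star_block = value == "*"
        elif field == "disallow" and in_star_block and value:
            rules.append(value)
    return rules


def is_blocked(url: str, rules: list[str]) -> bool:
    """Prefix-match the URL's path (plus '?query', if any) against the rules."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return any(target.startswith(rule) for rule in rules)


rules = parse_disallows(ROBOTS_TXT)
for url in (
    "http://www.mysite.com/index.php",
    "http://www.mysite.com/index.php?url=jjj.tbbtyr.pbz",
    "http://www.mysite.com/index.php?url=fdsf.sdfer,sader",
):
    print(url, "->", "blocked" if is_blocked(url, rules) else "allowed")
```

Under those assumptions the bare /index.php stays crawlable while both ?url= addresses are blocked. Real spiders can differ, so it is still worth running the finished file through a validator.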
