The search engine will simply assume that /this/that/that/this is a directory and request that file from your web server.
A search engine WILL NOT submit "GET" data, such as this=that&that=this.
When I submit my site to search engines...what should the url be that I submit?
Basically, submit whatever pages you want indexed. It's best to have an 'index.php' or 'index.html' file with links to the REST of your site; that way, when you submit your 'index.*' page, the Googlebot will simply follow the links you provided there and spider the rest of your site for you.
BTW: When submitting to search engines, it's best to submit just your domain name (ie, Blitzweb.org). Your web server will automatically serve up your index page for you just like it would for any ordinary visitor. The Googlebot will find your links and take it from there automatically. You don't have to submit all your links by hand.
Yes, and yes.
Your main page should be index.php or index.html (or whatever file name your web server is configured to serve as its default document); if you use anything else, your visitors may get a 404 error when they try to access your web page using just your domain name!
You can use ROBOTS.TXT to control which individual URLs the search engines crawl, but that's completely unrelated to the current issue - explore this link at your leisure.
If my index.php makes use of the include("myfile.php"); function, will the search engine "build" the page before indexing it?
Your current "index.php" can be left alone, so long as its default is to produce the main "home" page output (ie, no "id=1" or other parameters). Then, change all the LINKS GENERATED to point to "show.php"; ie, http://domain.com/show.php/id/1.
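To make the link change concrete, here is a minimal sketch of a helper that builds the "show.php/name/value" style links from an array of parameters. The function name build_show_link is made up for illustration and is not part of the snippet mentioned in this thread; it also assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the original thread): turns
// ['id' => 1] into "show.php/id/1" so no "?name=value" part is needed.
function build_show_link(array $params, string $script = 'show.php'): string
{
    $segments = [];
    foreach ($params as $name => $value) {
        // rawurlencode keeps each name and value safe inside a path segment.
        $segments[] = rawurlencode((string) $name);
        $segments[] = rawurlencode((string) $value);
    }
    // With no parameters, link to the script itself.
    return $segments ? $script . '/' . implode('/', $segments) : $script;
}

// Example: the "about" link that index.php would print.
echo '<a href="/' . build_show_link(['link' => 'about']) . '">About</a>', "\n";
```

You would call this wherever index.php currently prints a link such as "index.php?link=about".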
Show.php, which is the code snippet I included, will redirect the web server BACK TO the index.php file, with the proper parameters to produce your desired output.
1.) index.php :: "about" link points to "show.php/link/about"
2.) show.php :: Receives a user's click (or Googlebot's click) for /link/about, sends server back to index.php?link=about
The redirection which occurs in step #2 is COMPLETELY TRANSPARENT TO GOOGLEBOT! Your output will be generated by "index.php?link=about", but Googlebot (or any of your users, for that matter) will think they're seeing "show.php/link/about".
I understand that this whole thread will be highly confusing. The best way to understand how it works is to create test files and see how it behaves. The key to understanding why you have to go through all this is to realize that variables passed as "GET" variables (ie, "link=news&view=summary" or whatever) WILL NOT BE USED BY GOOGLEBOT, so if your "index.php?link=news&view=summary" is your news link, Google will never see it, because it will never submit the {variable1}={value1}&{variable2}={value2} arguments to the web server.
Converting these arguments to directory names tricks Google into thinking it's accessing "show.php/link/news/view/summary"; show.php translates this to index.php?link=news&view=summary on the SERVER side, so Google doesn't even know it's happening.
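For anyone who wants to test the translation step, the following is a rough sketch of what a show.php could look like. It is not the snippet posted earlier in this thread. It assumes PHP 7 or later, that your web server passes the part after "show.php" in $_SERVER['PATH_INFO'], and that index.php reads its parameters from $_GET. The function name path_to_params is made up for illustration.

```php
<?php
// Hypothetical helper: "/link/news/view/summary" becomes
// ['link' => 'news', 'view' => 'summary'].
function path_to_params(string $pathInfo): array
{
    // Split on "/" and drop the empty pieces (leading or doubled slashes).
    $parts = array_values(array_filter(explode('/', $pathInfo), 'strlen'));
    $params = [];
    // Walk the segments in name/value pairs; a trailing unpaired name is ignored.
    for ($i = 0; $i + 1 < count($parts); $i += 2) {
        $params[$parts[$i]] = $parts[$i + 1];
    }
    return $params;
}

// Only runs when the server actually supplied a path after "show.php".
if (isset($_SERVER['PATH_INFO'])) {
    // Fill $_GET as if "index.php?link=news&view=summary" had been requested,
    // then let index.php build the page. No HTTP redirect is sent, so the
    // visitor (or Googlebot) never sees the index.php URL.
    $_GET = path_to_params($_SERVER['PATH_INFO']) + $_GET;
    include 'index.php';
}
```

Because the hand-off happens with include rather than a Location header, the address the client sees stays "show.php/link/news/view/summary".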
Britisch
Blitzweb.org