I wasn't sure which section to post this in, so feel free to move it if this is the wrong one.

I have a question about search engines and included files.

I have my main index.php file which links to the 'pages' as index.php?page=etc.

These included pages are PHP files without any design whatsoever; each one is just a plain file.
All design is applied through a CSS stylesheet loaded by the index.php file.

My basic knowledge of search engines comes from my HTML days: they can find pages linked through plain .html links. But what about included files reached through page=...?

Here's my question, plainly:

  1. Can search engines find included files as separate files?
  2. If so, what can I do about it?

    Search engines generally don't index pages with variables in the query string, so they will probably not index all of your pages. If you must build your pages like that, you can use Apache mod_rewrite to map clean URLs onto your index.php?page=xxx setup.
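
    A minimal sketch of that mapping, assuming mod_rewrite is enabled and this .htaccess sits in the same directory as index.php. The URL shape (/about mapping to ?page=about) is only an illustration, not something from the original site:

    ```apache
    # .htaccess next to index.php (assumes mod_rewrite is available)
    RewriteEngine On

    # Leave requests for real files and directories untouched
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d

    # Hypothetical clean URL: /about is served by index.php?page=about
    RewriteRule ^([A-Za-z0-9_-]+)/?$ index.php?page=$1 [L,QSA]
    ```

    The character class keeps slashes and dots out of the page name, and QSA passes along any extra query string the visitor sent.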

      The major search engines will index the pages fine, as long as the query strings don't have tons of variables in them. If it's just ?page=something, you'll be fine.

      For example, I'm sure we have all used a search engine and found a thread on a message board. Look at those URLs: they are dynamic.

        OK, thanks for the info. I looked into the mod_rewrite thingie; I think I'll need it anyway to 'convert' the old site's bookmarked links to the new one.
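
        For the bookmarked links, a rough sketch of a permanent redirect with mod_rewrite could look like this. The file name about.html and the page name about are made up for illustration; the real old and new names would go in their place:

        ```apache
        # .htaccess in the site root (assumes mod_rewrite is available)
        RewriteEngine On

        # Hypothetical old bookmark /about.html is sent to the new dynamic URL
        RewriteRule ^about\.html$ /index.php?page=about [R=301,L]
        ```

        The 301 status marks the move as permanent, so browsers and search engines can update to the new address.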

        And indeed I hadn't thought about similar systems such as message boards. I think that will work.

        However, I am still somewhat worried about the separate files.

        If they are found (in any way whatsoever), visitors will stumble upon a white page with nothing but text.

        Correct me if I'm wrong:
        pages cannot be found when they are placed in a separate directory and aren't linked to from anywhere.

        (I hope so, because that's how my admin folder is stored...)

          Google spiders your site by following links, starting from the main page, so with no links to the admin section it won't find it. You could also make a robots.txt file to tell Google not to look there anyway.

          User-agent: Googlebot
          Disallow: /admin/

            Yeah, that brings back some memories of some research I did quite a while back on robots.txt files.

            The only thing I don't like about it is that, because it's a .txt file, everyone can read it through their browser and thus learn the location of my 'restricted areas'...

              Security by not linking to your files is a bad idea,
              especially if you have the files in a directory named "admin/", for example.

              Use .htaccess to prevent people from viewing any files in that directory via the URL.

              Put this in your admin folder:

              <Limit GET POST>
              Order Allow,Deny
              deny from all
              </Limit>
              
              

                Being funny? How am I supposed to get in then?
                (please forgive my stupidity...)

                  Oh, I was thinking you were including the files for some reason and wouldn't ever need to access them directly via HTTP.

                  You could also use .htaccess to just password-protect the whole directory.

                  There are plenty of tutorials for that, easily found via Google.
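
                  Roughly, a password-protected directory with HTTP Basic authentication looks like the sketch below. The path /home/example/.htpasswd and the realm name are placeholders; the password file is assumed to have been created with Apache's htpasswd tool and to live outside the web root:

                  ```apache
                  # .htaccess inside the admin folder
                  AuthType Basic
                  AuthName "Admin area"
                  # Placeholder path; point this at your own password file
                  AuthUserFile /home/example/.htpasswd
                  # Any user listed in the password file may log in
                  Require valid-user
                  ```

                  Basic authentication sends the password with only light encoding, so it is best combined with HTTPS if the host offers it.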

                    Oh yeah, I can do that through cPanel too.

                    Thanks all!
                    And sorry for the off-topic part...
