
How to apply apache rewrite rules to whole domain space?

Gytis

New Pleskian
Hello,

we are trying to create a "development" domain space:
- project1.example.com
- project2.example.com
- project3.example.com

Our goal is to prevent these websites from being indexed. The simplest way would be to create a robots.txt file for each website; however, we would like to use an Apache RewriteRule to redirect requests like:

project1.example.com/robots.txt >> example.com/robots.txt

example.com is an empty website. example.com/robots.txt contains crawler rules (disallows all).

Ideally, the redirect would take place for all subdomains, regardless of whether they have their own robots.txt file.
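
To illustrate, the kind of rule we have in mind would look roughly like the sketch below. It assumes mod_rewrite is enabled and that the rule is placed somewhere Apache applies it to every subdomain, which is exactly the part we can't figure out. The host pattern and the 302 status are our own assumptions, not something taken from Plesk.

```apache
RewriteEngine On
# Match any direct subdomain of example.com (not example.com itself),
# allowing an optional port in the Host header
RewriteCond %{HTTP_HOST} ^[^.]+\.example\.com(:[0-9]+)?$ [NC]
# Redirect robots.txt requests to the shared copy; there is no file-exists
# check, so a local robots.txt in the subdomain would be bypassed
RewriteRule ^/?robots\.txt$ http://example.com/robots.txt [R=302,L]
```

The leading `/?` in the pattern is there so the same rule could be tried in either an .htaccess file or a virtual host context.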

We couldn't find any option to accomplish this. Placing an .htaccess file in the top-level domain doesn't apply the rules to subdomains. I checked Plesk with a regular account; should I check with an admin account?

Is it even possible to accomplish this in Plesk 11.0.9?

The question is also described here:

http://stackoverflow.com/questions/...for-domain-and-its-subdomains-in-plesk-11-0-9
 
Hi Gytis,

one solution would be to use a "Custom Virtual Host Template":

(Apart from that, please keep in mind that uploading a single file like ".htaccess" and/or "robots.txt" to a webspace folder is done within seconds, while a custom virtual host template causes the whole skeleton content to be copied to each newly created domain. If you don't want these files to exist for a new domain/subdomain, you have to modify or delete them there, or switch back to the default skeleton before you create the domain/subdomain.)

You should also be aware that some bots/spiders ignore robots.txt entirely. You could consider using password protection for domains/subdomains that are still in development.
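
As a rough sketch of what password protection amounts to on the Apache side, the directives below enable HTTP Basic authentication for a directory. The realm name and the password file path are placeholders and would have to be adapted; the password file itself would be created separately, for example with Apache's htpasswd utility.

```apache
# Require a valid login for everything in this directory
AuthType Basic
AuthName "Development area"
# Hypothetical path; ideally a location outside the document root
AuthUserFile /path/to/.htpasswd
Require valid-user
```

Unlike robots.txt, this does not rely on the crawler's cooperation, since unauthenticated requests are answered with a 401 response.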

 
The problem is that our server holds both production and development domains. Therefore a custom template won't work (as far as I know, only one template can be used).
The perfect solution would be to "mask/rewrite" robots.txt depending on the environment:
- the robots.txt file should only be rewritten on this dev server, no matter whether the site contains its own copy of robots.txt
- copying a blocking robots.txt to each subdomain wouldn't be ideal, because we might accidentally move it to production
- the subdomain may generate its own robots.txt or .htaccess

Another solution would be to block crawler IPs for the whole domain space, but that raises the same problem. Using user accounts isn't an option because the sites have to be available to the testers.
 