October 23

Essential .htaccess Tips and Tricks


If your web host runs Apache and allows overrides, you should be able to use a .htaccess file: a plain text file of configuration rules that apply to your web server on a per-directory basis.

This is a good starting .htaccess file that tightens your web server’s security, with rules from a post on 0x000000.com and from SigSiu.net:

[code lang="apache"]
# Disable server signature
ServerSignature Off
# deny folder listing
IndexIgnore *
# deny directory browsing
Options -Indexes
# enable symbolic links
Options +FollowSymLinks
# enable basic rewriting
RewriteEngine on
# Prevent use of specified methods in HTTP Request
# Block out use of illegal or unsafe characters in the HTTP Request
RewriteCond %{THE_REQUEST} ^.*(\\r|\\n|%0A|%0D).* [NC,OR]
# Block out use of illegal or unsafe characters in the Referer Variable of the HTTP Request
RewriteCond %{HTTP_REFERER} ^(.*)(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]
# Block out use of illegal or unsafe characters in any cookie associated with the HTTP Request
RewriteCond %{HTTP_COOKIE} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]
# Block out use of illegal characters in URI or use of malformed URI
RewriteCond %{REQUEST_URI} ^/(,|;|:|<|>|">|"<|/|\\\.\.\\).{0,9999}.* [NC,OR]
# Block out use of empty User Agent Strings
# NOTE – disable this rule if your site is integrated with Payment Gateways such as PayPal
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
# Block out use of illegal or unsafe characters in the User Agent variable
RewriteCond %{HTTP_USER_AGENT} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]
# Measures to block out SQL injection attacks
RewriteCond %{QUERY_STRING} ^.*(;|<|>|'|"|\)|%0A|%0D|%22|%27|%3C|%3E|%00).*(/\*|union|select|insert|cast|set|declare|drop|update|md5|benchmark).* [NC,OR]
# Block out reference to localhost/loopback/ in the Query String
RewriteCond %{QUERY_STRING} ^.*(localhost|loopback|127\.0\.0\.1).* [NC,OR]
# Block out use of illegal or unsafe characters in the Query String variable
RewriteCond %{QUERY_STRING} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]
# proc/self/environ? no way!
# No [OR] on the last condition: a trailing [OR] would make the rule below match every request
RewriteCond %{QUERY_STRING} proc\/self\/environ [NC]
RewriteRule .* - [F]
########## Begin - File injection protection, by SigSiu.net
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=http:// [OR]
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=http%3A%2F%2F [OR]
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=(\.\.//?)+ [OR]
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=/([a-z0-9_.]//?)+ [NC]
RewriteRule .* - [F]
########## End - File injection protection
[/code]
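
The note above about payment gateways suggests disabling the empty User Agent rule entirely. A narrower option, sketched here under assumptions, is to exempt only the gateway's callback script before the blocking conditions run. The file name ipn-listener.php is a hypothetical placeholder, not something taken from the sources above.

[code lang="apache"]
RewriteEngine On
# Hypothetical callback script: "-" leaves the URL unchanged and [L] stops
# rule processing, so the blocking rules placed below never see this request
RewriteRule ^ipn-listener\.php$ - [L]
# ... the blocking RewriteCond lines and their RewriteRule would follow here ...
[/code]

This only helps if the exemption sits above the blocking conditions, since mod_rewrite evaluates rules in the order they appear.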

Regex Character Definitions For htaccess

In addition, I found the following write-up useful for understanding the flags in the square brackets after each rule, and the regex characters used in the patterns (from StopMalvertising).

The # instructs the server to ignore the line, and is used for including comments. Each line of comments requires its own #. When including comments, it is good practice to use only letters, numbers, dashes, and underscores. This practice helps avoid potential server parsing errors.

[F] Forbidden: instructs the server to return a 403 Forbidden to the client.

[L] Last rule: instructs the server to stop processing further rules once this rule has been applied.

[N] Next: instructs Apache to restart the rewriting process from the first rule, using the result of the rewriting so far as the new input.

[G] Gone: instructs the server to deliver a 410 Gone (no longer exists) status message.

[P] Proxy: instructs the server to hand the request to mod_proxy.

[C] Chain: instructs the server to chain the current rule with the next rule. If the current rule does not match, the rules chained after it are skipped.

[R] Redirect: instructs Apache to issue a redirect (302 by default, or for example R=301 for a permanent one), causing the browser to request the rewritten/modified URL.

[NC] No Case: defines any associated argument as case-insensitive. i.e., “NC” = “No Case”.

[PT] Pass Through: instructs mod_rewrite to pass the rewritten URL back to Apache for further processing.

[OR] Or: specifies a logical “or” that ties two conditions together such that either one proving true will cause the associated rule to be applied. Without it, conditions are combined with an implicit “and”.

[NE] No Escape: instructs the server to parse output without escaping characters.

[NS] No Subrequest: instructs the server to skip the rule if the request is an internal sub-request.

[QSA] Append Query String: directs the server to append the original query string to the one in the substitution URL, instead of discarding it.

[S=x] Skip: instructs the server to skip the next “x” number of rules if a match is detected.

[E=variable:value] Environment Variable: instructs the server to set the environment variable “variable” to “value”.

[T=MIME-type] MIME Type: declares the MIME type of the target resource.
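
To make the flag definitions above more concrete, here is a minimal sketch that combines several of them. All paths, script names, and user agent strings in it (old-page.html, retired/, badbot, search.php, and so on) are hypothetical placeholders chosen for illustration, not rules from the sources above.

[code lang="apache"]
RewriteEngine On

# [R=301,L]: send a permanent redirect, then stop processing further rules
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]

# [G]: answer 410 Gone for anything under a retired directory
RewriteRule ^retired/ - [G]

# [NC,OR]: case-insensitive match; either condition is enough to trigger the rule
RewriteCond %{HTTP_USER_AGENT} badbot [NC,OR]
# The last condition in the chain carries no OR flag
RewriteCond %{HTTP_USER_AGENT} evilscraper [NC]
# [F]: answer 403 Forbidden; "-" leaves the URL untouched
RewriteRule ^ - [F]

# [QSA,L]: append the visitor's own query string to the one set here
RewriteRule ^search/([a-z]+)$ /search.php?category=$1 [QSA,L]
[/code]

Flags are combined by separating them with commas inside a single pair of square brackets.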

[ ]: specifies a character class, in which any character within the brackets will be a match. e.g., [xyz] will match either an x, y, or z.

[ ]+: character class in which any combination of items within the brackets will be a match. e.g., [xyz]+ will match any number of x’s, y’s, z’s, or any combination of these characters.

[^ ]: specifies not within a character class. e.g., [^xyz] will match any character that is neither x, y, nor z.

[a-z]: a dash (-) between two characters within a character class ([ ]) denotes the range of characters between them. e.g., [a-zA-Z] matches all lowercase and uppercase letters from a to z.

{n}: specifies an exact number, n, of the preceding character. e.g., x{3} matches exactly three x’s.

{n,}: specifies n or more of the preceding character. e.g., x{3,} matches three or more x’s.

{n,m}: specifies a range of numbers, between n and m, of the preceding character. e.g., x{3,7} matches three, four, five, six, or seven x’s.

( ): used to group characters together, thereby considering them as a single unit. e.g., (perishable)?press will match press, with or without the perishable prefix.

^: denotes the beginning of a regex (regex = regular expression) test string. i.e., begin argument with the following character.

$: denotes the end of a regex test string. i.e., end argument with the previous character.

?: declares as optional the preceding character. e.g., monzas? will match monza or monzas, while mon(za)? will match either mon or monza. i.e., x? matches zero or one of x.

!: declares negation. e.g., “!string” matches everything except “string”.

.: a dot (or period) indicates any single arbitrary character.

-: instructs “not to” rewrite the URL, as in “…domain.com.* - [F]”.

+: matches one or more of the preceding character. e.g., G+ matches one or more G’s, while “.+” will match one or more characters of any kind.

*: matches zero or more of the preceding character. e.g., use “.*” as a wildcard.

|: declares a logical “or” operator. For example, (x|y) matches x or y.

\: escapes special characters ( ^ $ ! . * | ). e.g., use “\.” to indicate/escape a literal dot.

\.: indicates a literal dot (escaped).

/*: zero or more slashes.

.*: zero or more arbitrary characters.

^$: defines an empty string.

^.*$: the standard pattern for matching everything.

[^/.]: defines one character that is neither a slash nor a dot.

[^/.]+: defines one or more characters, none of which is a slash or a dot.

http://: this is a literal statement. In this case, the literal character string, “http://”.

^domain.*: defines a string that begins with the term “domain”, which may then be followed by any number of any characters.

^domain\.com$: defines the exact string “domain.com”.

-d: tests if string is an existing directory.

-f: tests if string is an existing file.

-s: tests if file in test string has a non-zero size.
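
The regex characters and file tests described above can be seen working together in a short sketch like the following. The domain example.com and the scripts archive.php, page.php, and index.php are hypothetical placeholders, and the rules are meant as an illustration under those assumptions rather than a drop-in configuration.

[code lang="apache"]
RewriteEngine On

# ^$ matches an empty Referer; "!" negates it, so this passes only when a Referer is present
RewriteCond %{HTTP_REFERER} !^$
# Literal "http://" followed by the site's own domain, with escaped dots
RewriteCond %{HTTP_REFERER} !^http://(www\.)?example\.com/ [NC]
# Block hotlinked images: (gif|jpe?g|png) groups alternatives, e? makes the "e" optional
RewriteRule \.(gif|jpe?g|png)$ - [NC,F]

# ^ and $ anchor the match; [0-9]{4} is exactly four digits; \. is a literal dot
RewriteRule ^archive/([0-9]{4})\.html$ /archive.php?year=$1 [L]

# [^/.]+ captures one path segment containing neither a slash nor a dot
RewriteRule ^([^/.]+)/?$ /page.php?slug=$1 [L]

# !-f and !-d: continue only if the request is not an existing file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Everything else goes to a front controller script
RewriteRule ^ index.php [L]
[/code]

Because these rules live in a .htaccess file, the patterns are matched against the path relative to the directory, without a leading slash.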

