
Essential .htaccess Tips and Tricks

October 23, 2014

If your site runs on a decent web host, you should be able to use a .htaccess file: a plain-text file of configuration rules that the Apache web server applies on a per-directory basis.

This is a good starting .htaccess file that tightens your web server’s security, with rules drawn from a post on 0x000000.com and from SigSiu.net:

[code lang="apache"]
# Disable server signature
ServerSignature Off
# deny folder listing
IndexIgnore *
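# deny directory browsing; every option carries a +/- prefix because Apache 2.4 rejects mixing prefixed options with bare ones such as All
Options -Indexes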
# enable symbolic links
Options +FollowSymLinks
# enable basic rewriting
RewriteEngine on
# Prevent use of specified methods in HTTP Request
RewriteCond %{REQUEST_METHOD} ^(HEAD|TRACE|DELETE|TRACK) [NC,OR]
# Block out use of illegal or unsafe characters in the HTTP Request
RewriteCond %{THE_REQUEST} ^.*(\\r|\\n|%0A|%0D).* [NC,OR]
# Block out use of illegal or unsafe characters in the Referer Variable of the HTTP Request
RewriteCond %{HTTP_REFERER} ^(.*)(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]
# Block out use of illegal or unsafe characters in any cookie associated with the HTTP Request
RewriteCond %{HTTP_COOKIE} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]
# Block out use of illegal characters in URI or use of malformed URI
RewriteCond %{REQUEST_URI} ^/(,|;|:|<|>|">|"<|/|\\\.\.\\).{0,9999}.* [NC,OR]
# Block out use of empty User Agent Strings
# NOTE: disable this rule if your site is integrated with Payment Gateways such as PayPal
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
# Block out use of illegal or unsafe characters in the User Agent variable
RewriteCond %{HTTP_USER_AGENT} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]
# Measures to block out SQL injection attacks
RewriteCond %{QUERY_STRING} ^.*(;|<|>|'|"|\)|%0A|%0D|%22|%27|%3C|%3E|%00).*(/\*|union|select|insert|cast|set|declare|drop|update|md5|benchmark).* [NC,OR]
# Block out reference to localhost/loopback/127.0.0.1 in the Query String
RewriteCond %{QUERY_STRING} ^.*(localhost|loopback|127\.0\.0\.1).* [NC,OR]
# Block out use of illegal or unsafe characters in the Query String variable
RewriteCond %{QUERY_STRING} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]
#proc/self/environ? no way!
# Last condition in the chain: no OR flag here, otherwise the rule below would match every request
RewriteCond %{QUERY_STRING} proc\/self\/environ [NC]
RewriteRule .* - [F]
########## Begin - File injection protection, by SigSiu.net
RewriteCond %{REQUEST_METHOD} GET
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=http:// [OR]
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=http%3A%2F%2F [OR]
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=(\.\.//?)+ [OR]
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=/([a-z0-9_.]//?)+ [NC]
RewriteRule .* - [F]
########## End - File injection protection
[/code]

Regex Character Definitions For htaccess

In addition, I found the following write-up (from StopMalvertising) useful for understanding the flags in the square brackets after each rule, as well as the regex characters used in the patterns.

#
the # instructs the server to ignore the line. Used for including comments. Each line of comments requires its own #. When including comments, it is good practice to use only letters, numbers, dashes, and underscores. This practice will help avoid potential server parsing errors.

[F]
Forbidden: instructs the server to return a 403 Forbidden to the client.

[L]
Last rule: instructs the server to stop processing further rewrite rules once the current rule has been applied.

[N]
Next: instructs Apache to restart the rewriting process from the first rule, using the result of the rewriting so far as the new input.

[G]
Gone: instructs the server to deliver Gone (no longer exists) status message.

[P]
Proxy: instructs server to handle requests by mod_proxy

[C]
Chain: instructs the server to chain the current rule with the next rule. If the current rule does not match, the chained rules that follow it are skipped.

[R]
Redirect: instructs Apache to issue a redirect, causing the browser to request the rewritten/modified URL.

[NC]
No Case: defines any associated argument as case-insensitive. i.e., “NC” = “No Case”.

[PT]
Pass Through: instructs mod_rewrite to pass the rewritten URL back to Apache for further processing.

[OR]
Or: specifies a logical “or” that ties two expressions together such that either one proving true will cause the associated rule to be applied.

[NE]
No Escape: instructs the server to parse output without escaping characters.

[NS]
No Subrequest: instructs the server to skip the directive if internal sub-request.

[QSA]
Append Query String: directs the server to append the original request’s query string to the query string of the rewritten URL instead of discarding it.

[S=x]
Skip: instructs the server to skip the next “x” number of rules if a match is detected.

[E=variable:value]
Environment Variable: instructs the server to set the environment variable “variable” to “value”.

[T=MIME-type]
Mime Type: declares the mime type of the target resource.
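
To see how a few of these flags combine in practice, here is a minimal sketch. The hostnames (example.org, example.com) and the file names (retired-page.html, article.php) are placeholders I made up for illustration, not part of the rules above.

[code lang="apache"]
RewriteEngine On

# [NC] ignores case; [OR] joins the two host conditions.
# The last condition in the chain carries no [OR].
RewriteCond %{HTTP_HOST} ^example\.org$ [NC,OR]
RewriteCond %{HTTP_HOST} ^www\.example\.org$ [NC]
# [R=301] issues a permanent redirect; [L] stops processing further rules
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# [G] answers 410 Gone for a retired page (hypothetical file name)
RewriteRule ^retired-page\.html$ - [G]

# [QSA] keeps the visitor's original query string on the rewritten URL
RewriteRule ^articles/([0-9]+)$ article.php?id=$1 [QSA,L]
[/code]

Flags are comma-separated inside a single pair of square brackets, and a rule that only needs to return a status (such as [F] or [G]) uses "-" as its target so the URL itself is left alone.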

[]
specifies a character class, in which any character within the brackets will be a match. e.g., [xyz] will match either an x, y, or z.

[]+
character class in which any combination of items within the brackets will be a match. e.g., [xyz]+ will match one or more x’s, y’s, z’s, or any combination of these characters.

[^]
specifies not within a character class. e.g., [^xyz] will match any character that is neither x, y, nor z.

[a-z]
a dash (-) between two characters within a character class ([]) denotes the range of characters between them. e.g., [a-zA-Z] matches all lowercase and uppercase letters from a to z.

a{n}
specifies an exact number, n, of the preceding character. e.g., x{3} matches exactly three x’s.

a{n,}
specifies n or more of the preceding character. e.g., x{3,} matches three or more x’s.

a{n,m}
specifies a range of numbers, between n and m, of the preceding character. e.g., x{3,7} matches three, four, five, six, or seven x’s.

()
used to group characters together, thereby considering them as a single unit. e.g., (perishable)?press will match press, with or without the perishable prefix.

^
denotes the beginning of a regex (regex = regular expression) test string. i.e., begin argument with the following character.

$
denotes the end of a regex (regex = regular expression) test string. i.e., end argument with the preceding character.

?
declares as optional the preceding character. e.g., monzas? will match monza or monzas, while mon(za)? will match either mon or monza. i.e., x? matches zero or one of x.

!
declares negation. e.g., “!string” matches everything except “string”.

.
a dot (or period) matches any single arbitrary character.

-
instructs “not to” rewrite the URL, as in “…domain.com.* - [F]”.

+
matches one or more of the preceding character. e.g., G+ matches one or more G’s, while “.+” will match one or more characters of any kind.

*
matches zero or more of the preceding character. e.g., use “.*” as a wildcard.

|
declares a logical “or” operator. for example, (x|y) matches x or y.

\
escapes special characters ( ^ $ ! . * | ). e.g., use “\.” to indicate/escape a literal dot.

\.
indicates a literal dot (escaped).

/*
zero or more slashes.

.*
zero or more arbitrary characters.

^$
defines an empty string.

^.*$
the standard pattern for matching everything.

[^/.]
defines one character that is neither a slash nor a dot.

[^/.]+
defines one or more characters, none of which is a slash or a dot.

http://
this is a literal statement — in this case, the literal character string, “http://”.

^domain.*
defines a string that begins with the term “domain”, which may then be followed by any number of any characters.

^domain\.com$
defines the exact string “domain.com”.
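
As a rough illustration of how these regex pieces fit together in rewrite patterns, here is a short sketch. The script names (profile.php, archive.php) and URL layouts are hypothetical examples of mine.

[code lang="apache"]
RewriteEngine On

# ^ and $ anchor the whole path; [^/.]+ is one or more characters
# that are neither a slash nor a dot; /? makes the trailing slash optional
RewriteRule ^profile/([^/.]+)/?$ profile.php?user=$1 [L]

# [0-9]{4} is exactly four digits, [0-9]{2} exactly two;
# each () group is available in the target as $1, $2
RewriteRule ^archive/([0-9]{4})/([0-9]{2})/?$ archive.php?year=$1&month=$2 [L]

# \. is a literal dot; (bak|old|orig) matches any one of the alternatives;
# "-" leaves the URL untouched and [F] returns 403 Forbidden
RewriteRule \.(bak|old|orig)$ - [F]
[/code]

In a .htaccess file the RewriteRule pattern is matched against the path relative to the directory, without a leading slash, which is why these patterns start with ^profile/ rather than ^/profile/.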

-d
tests if string is an existing directory

-f
tests if string is an existing file

-s
tests if the file in the test string exists and has a non-zero size
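
These file tests are most often seen negated with ! in a RewriteCond. A minimal sketch, assuming a hypothetical index.php that handles every request not served by a real file or directory:

[code lang="apache"]
RewriteEngine On

# Only rewrite when the request does not map to an existing file...
RewriteCond %{REQUEST_FILENAME} !-f
# ...and does not map to an existing directory
RewriteCond %{REQUEST_FILENAME} !-d
# Hand everything else to index.php (hypothetical), keeping the query string
RewriteRule ^(.*)$ index.php?path=$1 [QSA,L]
[/code]

Without an [OR] flag, consecutive conditions are combined with a logical "and", so both tests must pass before the rule applies.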
