Fedora iTOps Tube

Thursday, November 17, 2011

Conclusion

Web sites both personal and commercial can be very rewarding exercises as they share your interests with the world and allow you to meet new people with whom to develop friendships or transact business.


Unfortunately, even the best Web sites can be impersonal as they frequently only provide information that the designer expects the visitor to need. E-mail, although ancient in comparison to newer personalized interactive Internet technologies, such as IP telephony and instant messaging, has the advantage of being able to relay documents and other information without interrupting the addressee. This allows them to schedule a response when they are better prepared to answer, a valuable quality when replies need to be complex.


Chapter 21, "Configuring Linux Mail Servers", explains how to configure a Linux e-mail server to reduce spam and provide personalized addresses across multiple domains. No Web site should be without one.

The Apache Error Log Files

The /var/log/httpd/error_log file is a good source for error information. Unlike the /var/log/httpd/access_log file, there is no standardized formatting.


Typical errors that you'll find here are HTTP queries for files that don't exist or forbidden requests for directory listings. The file also includes Apache startup errors, which can be very useful.


The /var/log/httpd/error_log file also is the location where CGI script errors are written. Many times CGI scripts fail with a blank screen on your browser; the /var/log/httpd/error_log file most likely lists the cause of the problem.

The Apache Status Log Files

The /var/log/httpd/access_log file is updated after every HTTP query and is a good source of general purpose information about your website. There is a fixed formatting style with each entry being separated by spaces or quotation marks. Table 20-3 lists the layout.


Table 20-3 Apache Log File Format


Field Number   Description                                                  Separator
1              IP address of the remote web surfer                          Spaces
2              Time stamp                                                   Square brackets []
3              HTTP query including the web page served                     Quotes ""
4              HTTP result code                                             Spaces
5              The amount of data in bytes sent to the remote web browser   Spaces
6              The web page that contained the link to the page served      Quotes ""
7              The version of the web browser used to get the page          Quotes ""


Upon examining the entry below, you can determine that someone at IP address 67.119.25.115 looked at the web page /dns-static.htm on February 15, 2003, and received a successful 200 status code. The amount of data sent was 15190 bytes, and the surfer got to the site by clicking on the link http://www.itopstube.com/sendmail.htm using Microsoft Internet Explorer version 5.5.


67.119.25.115 - - [15/Feb/2003:23:06:51 -0800] "GET /dns-static.htm HTTP/1.1" 200 15190 "http://www.itopstube.com/sendmail.htm" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0; AT&T CSM6.0; YComp 5.0.2.6)"

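If you want to pull these fields out of the log yourself, a short Python script can do it. Treat this as a rough sketch only: the regular expression and the field names (ip, timestamp, request and so on) are my own labels for illustration, and it assumes every entry follows the layout described in Table 20-3.

```python
import re

# Pattern for one access_log entry in the format described in Table 20-3.
# The group names are illustrative labels, not Apache terminology.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '          # remote IP, then two fields shown as "-"
    r'\[(?P<timestamp>[^\]]+)\] '    # time stamp in square brackets
    r'"(?P<request>[^"]*)" '         # HTTP query in quotes
    r'(?P<status>\d{3}) '            # HTTP result code
    r'(?P<size>\d+|-) '              # bytes sent ("-" when nothing was sent)
    r'"(?P<referrer>[^"]*)" '        # page that contained the link
    r'"(?P<agent>[^"]*)"'            # browser identification string
)

def parse_entry(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

sample = (
    '67.119.25.115 - - [15/Feb/2003:23:06:51 -0800] '
    '"GET /dns-static.htm HTTP/1.1" 200 15190 '
    '"http://www.itopstube.com/sendmail.htm" '
    '"Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0; AT&T CSM6.0; YComp 5.0.2.6)"'
)

fields = parse_entry(sample)
print(fields["ip"], fields["status"], fields["size"], fields["referrer"])
```

Lines that don't fit the pattern come back as None, so you can skip them rather than have the script stop.
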
The HTTP status code can provide some insight into the types of operations surfers are attempting and may help to isolate problems with your pages rather than with the operation of Apache itself. For example, 404 errors are generated when someone tries to access a web page that doesn't exist anymore. This could be caused by incorrect URL links in other pages on your site. Table 20-4 has some of the more common examples.


Table 20-4 HTTP Status Codes


HTTP Code   Description
200         Successful request.
304         Successful request, but the web page requested hasn't been modified since the version in the remote web browser's cache. The web page will not be sent to the remote browser, which will use its cached version instead. Frequently occurs when a surfer is browsing back and forth on a site.
401         Unauthorized access. Someone entered an incorrect username / password on a password protected page.
403         Forbidden. File permissions or contexts prevent Apache from reading the file. Often occurs when the web page file is owned by user "root" even though it has universal read access.
404         Not found. Page requested doesn't exist.
500         Internal server error. Frequently generated by CGI scripts that fail due to bad syntax. Check your error_log file for further details on the script's error message.
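
To see at a glance which of these codes your server is handing out, you could tally them with a few lines of Python. This is a minimal sketch, not a full log analyzer: the function name count_status_codes and the sample entries are made up for illustration, and it assumes the status code is the three-digit field right after the quoted HTTP query.

```python
import re
from collections import Counter

# The status code is the three-digit field that follows the quoted HTTP query.
STATUS_PATTERN = re.compile(r'" (\d{3}) ')

def count_status_codes(lines):
    """Count how often each HTTP status code appears in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Made-up entries for illustration; in practice you would pass
# open("/var/log/httpd/access_log") instead of this list.
sample_lines = [
    '10.0.0.1 - - [15/Feb/2003:23:06:51 -0800] "GET /index.html HTTP/1.1" 200 1024 "-" "test"',
    '10.0.0.2 - - [15/Feb/2003:23:07:02 -0800] "GET /old-page.htm HTTP/1.1" 404 512 "-" "test"',
    '10.0.0.2 - - [15/Feb/2003:23:07:09 -0800] "GET /gone.htm HTTP/1.1" 404 512 "-" "test"',
]

for code, total in count_status_codes(sample_lines).most_common():
    print(code, total)
```

A sudden jump in 404 or 500 counts is usually a good hint to go and look at the error_log file.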

Server Name Errors

All ServerName directives must list a domain that is resolvable in DNS, or else you'll get an error similar to these when starting httpd.


Starting httpd: httpd: Could not determine the server's fully qualified domain name, using 127.0.0.1 for ServerName

Starting httpd: [Wed Feb 04 21:18:16 2004] [error] (EAI 2)Name or service not known: Failed to resolve server name for 192.16.1.100 (check DNS) -- or specify an explicit ServerName

You can avoid this by adding a default generic ServerName directive at the top of the httpd.conf file that references localhost instead of the default new.host.name:80.


#ServerName new.host.name:80
ServerName localhost


Incompatible httpd.conf Files When Upgrading

Your old configuration files will be incompatible when upgrading from Apache version 1.3 to Apache 2.X. In Redhat / Fedora, the new version 2.X default configuration file is stored in /etc/httpd/conf/httpd.conf.rpmnew. For the simple virtual hosting example above, it would be easiest to:

  1. Save the old httpd.conf file with another name, httpd.conf-version-1.x for example. Copy the ServerName, NameVirtualHost, and VirtualHost containers from the old file and place them in the new httpd.conf.rpmnew file.
  2. Copy the httpd.conf.rpmnew file and name it httpd.conf
  3. Restart Apache

With other distributions, the procedure is similar; just place your containers in the new default configuration file and restart Apache.

Only The Default Apache Page Appears

When only the default Apache page appears, there are two main causes. The first is the lack of an index.html file in your Web site's DocumentRoot directory. The second cause is usually related to an incorrect security context for the Web page's file. Please refer to the "General Configuration Steps" section for further details.

Browser 403 Forbidden Messages

Browser 403 Forbidden messages are usually caused by file permissions and security context issues. Please refer to the "General Configuration Steps" section for further details.

A sure sign of problems related to security context is the presence of "avc: denied" messages in your /var/log/messages log file.


Nov 21 20:41:23 bigboy kernel: audit(1101098483.897:0): avc:  denied  { getattr } for  pid=1377 exe=/usr/sbin/httpd path=/home/www/index.html dev=hda5 ino=12 scontext=root:system_r:httpd_t tcontext=root:object_r:home_root_t tclass=file
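
These messages are long, so it can help to pick out just the parts that matter: the permission that was denied, the file involved, and the source and target contexts. Here is a small Python sketch that does so. The function name parse_avc is hypothetical, and the parsing simply assumes the key=value layout seen in the message above.

```python
import re

def parse_avc(message):
    """Extract the denied permission and the key=value pairs from an avc message."""
    info = dict(re.findall(r'(\w+)=(\S+)', message))
    denied = re.search(r'denied\s+\{\s*([^}]*?)\s*\}', message)
    info["denied"] = denied.group(1) if denied else None
    return info

sample = (
    'Nov 21 20:41:23 bigboy kernel: audit(1101098483.897:0): avc:  denied  '
    '{ getattr } for  pid=1377 exe=/usr/sbin/httpd path=/home/www/index.html '
    'dev=hda5 ino=12 scontext=root:system_r:httpd_t '
    'tcontext=root:object_r:home_root_t tclass=file'
)

avc = parse_avc(sample)
print(avc["denied"], avc["path"], avc["scontext"], avc["tcontext"])
```

In this case the target context of /home/www/index.html is home_root_t, which tells you the file's security context is what httpd is being denied access to.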


Testing Basic HTTP Connectivity

The very first step is to determine whether your web server is accessible on TCP port 80 (HTTP).

Lack of connectivity could be caused by a firewall with incorrect permit, NAT, or port forwarding rules to your Web server. Other sources of failure include Apache not being started at all, the server being down, or network-related failures.


If you can connect on port 80 but no pages are being served, then the problem is usually due to a bad Web application, not the Web server software itself.


It is best to test this from both inside your network and from the Internet. Troubleshooting with TELNET is covered in Chapter 4, "Simple Network Troubleshooting".
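
If TELNET isn't installed on the machine you are testing from, a few lines of Python can perform the same basic check. This is a minimal sketch: port_is_open is a name I made up, and it only confirms that a TCP connection can be opened, not that pages are being served correctly.

```python
import socket

def port_is_open(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False

# 127.0.0.1 tests Apache from the server itself; substitute your site's
# name or IP address when testing from inside your network or the Internet.
print(port_is_open("127.0.0.1", 80))
```

A result of False from the Internet but True from the server itself usually points to the firewall rules rather than to Apache.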

Troubleshooting Apache

Troubleshooting a basic Apache configuration is fairly straightforward; you'll find errors in the /var/log/httpd/error_log file during normal operation or displayed on the screen when Apache starts up. Most of the errors you'll encounter will probably be related to incorrect syntax in the <VirtualHost> statements caused by typing errors.

The conf.d Directory

Files in the /etc/httpd/conf.d (Redhat / Fedora) or the /etc/apache*/conf.d (Debian / Ubuntu) directory are read and automatically appended to the configuration in the httpd.conf file every time Apache is restarted. In complicated configurations, in which a Web server has to host many Web sites, you can create one configuration file per Web site each with its own set of <VirtualHost> and <Directory> containers. This can make Web site management much simpler. To do this correctly:


  1. Back up your httpd.conf file, in case you make a mistake.
  2. Create files in this directory that contain the required Apache <VirtualHost> and <Directory> containers and directives.
  3. If each site has a dedicated IP address, then place the NameVirtualHost statements in the corresponding conf.d directory file. If the IP address is shared, the statement needs to remain in the main httpd.conf file.
  4. Remove the corresponding directives from the httpd.conf file.
  5. Restart Apache, and test.

The files located in the conf.d directory don't have to have any special names, and you don't have to refer to them in the httpd.conf file.

How To Protect Web Page Directories With Passwords

You can password protect content in both the main and subdirectories of your DocumentRoot fairly easily. I know people who allow normal access to their regular Web pages, but require passwords for directories or pages that show MRTG or Webalizer data. This example shows how to password protect the /home/www directory.


1) Use Apache's htpasswd password utility to create username/password combinations independent of your system login password for Web page access. You have to specify the location of the password file, and if it doesn't yet exist, you have to include a -c, or create, switch on the command line. I recommend placing the file in your /etc/httpd/conf directory, away from the DocumentRoot tree where Web users could possibly view it. Here is an example for a first user named peter and a second named paul:


[root@bigboy tmp]# htpasswd -c /etc/httpd/conf/.htpasswd peter
New password:
Re-type new password:
Adding password for user peter
[root@bigboy tmp]#

[root@bigboy tmp]# htpasswd /etc/httpd/conf/.htpasswd paul
New password:
Re-type new password:
Adding password for user paul
[root@bigboy tmp]#

2) Make the .htpasswd file readable by all users.


[root@bigboy tmp]# chmod 644 /etc/httpd/conf/.htpasswd


3) Create a .htaccess file in the directory to which you want password control with these entries.


AuthUserFile /etc/httpd/conf/.htpasswd
AuthGroupFile /dev/null
AuthName EnterPassword
AuthType Basic
require user peter

Remember, this password-protects the directory and all its subdirectories. The AuthUserFile statement tells Apache to use the .htpasswd file. The require user statement tells Apache that only user peter in the .htpasswd file should have access. If you want all .htpasswd users to have access, replace this line with require valid-user. AuthType Basic instructs Apache to accept basic unencrypted passwords from the remote users' Web browser.

4) Set the correct file protections on your new .htaccess file in the directory /home/www.


[root@bigboy tmp]# chmod 644 /home/www/.htaccess


5) Make sure your /etc/httpd/conf/httpd.conf file has an AllowOverride statement in a <Directory> directive for any directory in the tree above /home/www. In the example below, all directories below /home/www/ require password authorization.


<Directory /home/www/*>
   AllowOverride AuthConfig
</Directory>

6) Make sure that you have a <VirtualHost> directive that defines access to /home/www or another directory higher up in the tree.


<VirtualHost *>
   ServerName 97.158.253.26
   DocumentRoot /home/www
</VirtualHost>

7) Restart Apache.

Try accessing the web site and you'll be prompted for a password.
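
It is worth seeing what "basic unencrypted passwords" means in practice. With AuthType Basic the browser sends the username and password joined by a colon and merely base64 encoded, which anyone who can see the traffic can reverse. The Python sketch below shows both directions; the function names and the password "secret" are made up for illustration.

```python
import base64

def basic_auth_header(username, password):
    """Build the Authorization header value a browser sends for Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

def decode_basic_auth(header_value):
    """Recover (username, password) from a Basic Authorization header value."""
    encoded = header_value.split(" ", 1)[1]
    username, _, password = base64.b64decode(encoded).decode("utf-8").partition(":")
    return username, password

# "secret" is a made-up password for user peter, for illustration only.
header = basic_auth_header("peter", "secret")
print(header)
print(decode_basic_auth(header))
```

Because decoding is this easy, Basic authentication on its own is best kept for low-risk pages unless the site is also served over HTTPS.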

Configure DNS "Views"

Step 2: Configure DNS "Views"

You now need to fix the DNS problem that NAT creates. Users on the Internet need to access IP address 97.158.253.26 when visiting www.my-site.com and users on your home network need to access IP address 192.168.1.100 when visiting the same site.


You can configure your DNS server to use views, which make it give different results depending on the source IP address of the Web surfer's PC doing the query. Chapter 18, "Configuring DNS", explains how to do this in detail.


Note: If you have to rely on someone else to do the DNS change, then you can edit your PC's hosts file as a quick and dirty temporary solution to the problem. Remember that this will fix the problem on your PC alone.

Configure Virtual Hosting on Multiple IPs

Step 1: Configure Virtual Hosting on Multiple IPs

You can configure Apache to serve the correct content when accessing www.my-site.com or www.another-site.com from the outside, and also when accessing the specific IP address 192.168.1.100 from the inside. Fortunately, Apache allows you to specify multiple IP addresses in the <VirtualHost> statements to help you overcome this problem.

Here is an example:


NameVirtualHost 192.168.1.100
NameVirtualHost 97.158.253.26

<VirtualHost 192.168.1.100 97.158.253.26>
   DocumentRoot /www/server1
   ServerName www.my-site.com
   ServerAlias bigboy www.my-site-192-168-1-100.com
</VirtualHost>


Apache Running On A Server Behind A NAT Firewall

If your Web server is behind a NAT firewall and you are logged in to a machine behind the firewall as well, then you may encounter problems when trying to access www.my-site.com or www.another-site.com. Because of NAT (network address translation), firewalls frequently don't allow access from their protected network to IP addresses that they masquerade on the outside.


For example, Linux Web server bigboy has an internal IP address of 192.168.1.100, but the firewall presents it to the world with an external IP address of 97.158.253.26 via NAT/masquerading. If you are on the inside, 192.168.1.X network, you may find it impossible to hit URLs that resolve in DNS to 97.158.253.26.


There is a two part solution to this problem:

Compression Configuration Example

You can insert these statements just before your virtual hosting section of your httpd.conf file to activate the compression of static pages. Remember to restart Apache when you do.

Note: Fedora's version of httpd.conf loads the compression module mod_deflate by default. This means that the LoadModule line (the first line of the example snippet) is not required for Fedora. The location statements are required, however.


LoadModule deflate_module modules/mod_deflate.so
 
<Location />
 
     # Insert filter
     SetOutputFilter DEFLATE
 
     # Netscape 4.x has some problems...
     BrowserMatch ^Mozilla/4 gzip-only-text/html
 
     # Netscape 4.06-4.08 have some more problems
     BrowserMatch ^Mozilla/4\.0[678] no-gzip
 
     # MSIE masquerades as Netscape, but it is fine
     BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
 
     # Don't compress images
     SetEnvIfNoCase Request_URI \
       \.(?:gif|jpe?g|png)$ no-gzip dont-vary
 
 
     # Make sure proxies don't deliver the wrong content
     Header append Vary User-Agent env=!dont-vary
 
</Location>


Using Data Compression On Web Pages

Apache also has the ability to dynamically compress static Web pages into gzip format and then send the result to the remote Web surfers' Web browser. Most current Web browsers support this format, transparently uncompressing the data and presenting it on the screen. This can significantly reduce bandwidth charges if you are paying for Internet access by the megabyte.

First you need to load Apache version 2's deflate module in your httpd.conf file and then use Location directives to specify which type of files to compress. After making these modifications and restarting Apache, you will be able to verify from your /var/log/httpd/access_log file that the sizes of the transmitted HTML pages have shrunk.

Compare the file sizes in this Apache log.



[root@bigboy tmp]# grep dns-static /var/log/httpd/access_log
...
...
67.119.25.115 - - [15/Feb/2003:23:06:51 -0800] "GET /dns-static.htm HTTP/1.1" 200 15190 "http://www.itopstube.com/sendmail.htm" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0; AT&T CSM6.0; YComp 5.0.2.6)"
...
...
[root@bigboy tmp]#


And the corresponding directory listing:



[root@bigboy tmp]# ll /web-dir/dns-static.htm
-rw-r--r--    1 user      group       78350 Feb 15 00:53 /home/www/ccie/dns-static.htm
[root@bigboy tmp]#


As you can see, 78,350 bytes shrank to 15,190 bytes; that's just over 80% compression.
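
The arithmetic is simple enough to check by hand, but if you want to repeat it for other pages, a tiny Python helper will do. The function name compression_savings is just something I made up; the two numbers are the file size from the directory listing and the bytes sent from the log entry.

```python
def compression_savings(original_bytes, transmitted_bytes):
    """Return the percentage of bytes saved by compression."""
    return (original_bytes - transmitted_bytes) / original_bytes * 100

# File size from the directory listing and bytes sent from the access_log entry.
savings = compression_savings(78350, 15190)
print(f"{savings:.1f}% smaller")
```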

Handling Missing Pages

You can tell Apache to display a predefined HTML file whenever a surfer attempts to access a non-index page that doesn't exist. You can place this statement in the httpd.conf file, which will make Apache display the contents of missing.htm instead of a generic "404 file Not Found" message:


ErrorDocument 404 /missing.htm


Remember to put a file with this name in each DocumentRoot directory. You can see the missing.htm file I use by trying the nonexistent link.


http://www.itopstube.com/bogus-file.htm


Notice that this gives the same output as


http://www.itopstube.com/missing.htm

Disabling Directory Listings

Be careful to include an index.html page in each subdirectory under your DocumentRoot directory, because if one isn't found, Apache will default to giving a listing of all the files in that subdirectory.

Say, for example, you create a subdirectory named /home/www/site1/example under www.my-site.com's DocumentRoot of /home/www/site1/. Now you'll be able to view the contents of the file my-example.html in this subdirectory if you point your browser to:



http://www.my-site.com/example/my-example.html


If curious surfers decide to see what the index page is for www.my-site.com/example, they would type the link:


http://www.my-site.com/example


Apache lists all the files in the example directory if it can't find the index.html file. You can disable the directory listing by using a -Indexes option in the <Directory> directive for the DocumentRoot like this:


<Directory "/home/www/*">
 ...
 ...
 ...
 Options MultiViews -Indexes SymLinksIfOwnerMatch IncludesNoExec
</Directory>

Remember to restart Apache after the changes. Users attempting to access the nonexistent index page will now get a "403 Access denied" message.
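
Before you turn listings off, you may want to know which directories would have been exposed. The Python sketch below walks a DocumentRoot and reports every directory that has no index page. It is only an illustration: dirs_missing_index is a hypothetical helper, and it assumes index.html is the only index file name your server uses.

```python
import os

def dirs_missing_index(document_root, index_name="index.html"):
    """Return the directories under document_root that lack an index file."""
    missing = []
    for dirpath, dirnames, filenames in os.walk(document_root):
        if index_name not in filenames:
            missing.append(dirpath)
    return sorted(missing)

# /home/www/site1 is the example DocumentRoot used above; change it to yours.
for path in dirs_missing_index("/home/www/site1"):
    print("no index.html in", path)
```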


Note: When setting up a yum server it's best to enable directory listings for the RPM subdirectories. This allows web surfers to double check the locations of files through their browsers.

Testing Your Website Before DNS Is Fixed

You may not be able to wait for DNS to be configured correctly before starting your project. The easiest way to temporarily bypass this is to modify the hosts file on the Web developer's client PC or workstation (not the Apache server). By default, PCs and Linux workstations query the hosts file first before checking DNS, so if a value for www.my-site.com is listed in the file, that's what the client will use.

The Windows equivalent of the Linux /etc/hosts file is named C:\WINDOWS\system32\drivers\etc\hosts. You need to open and edit it with a text editor, such as Notepad. Here you could add an entry similar to:


97.158.253.26          www.my-site.com

Do not remove the localhost entry in this file.
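
If you want to double-check what your edited hosts file will hand back for a name, you could read it with a short Python script. This is a rough sketch under my own assumptions: parse_hosts is a made-up name, the sample text stands in for the real file, and it treats the first entry for a name as the one that wins.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its IP address (first entry wins)."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)
    return table

# Sample contents for illustration; in practice read /etc/hosts or the
# Windows hosts file and pass its text to parse_hosts().
sample_hosts = """
127.0.0.1              localhost
97.158.253.26          www.my-site.com
"""

print(parse_hosts(sample_hosts).get("www.my-site.com"))
```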

Web Hosting Scenario Summary

Table 20-2 Web Hosting Scenario Summary

Domain                                               IP Address      Directory   Type of Virtual Hosting
www.my-site.com, my-site.com, www.my-cool-site.com   97.158.253.26   Site2       Name Based
www.test-site.com                                    97.158.253.27   Site3       Name Based (Wild card)
www.another-site.com                                 97.158.253.27   Site4       Name Based
www.default-site.com, all other domains              97.158.253.26   Site1       Name Based


How do these requirements translate into code? Here is a sample snippet of a working httpd.conf file:



ServerName localhost
NameVirtualHost 97.158.253.26
NameVirtualHost 97.158.253.27

#
# Match a webpage directory with each website
#
<VirtualHost *>
   DocumentRoot /home/www/site1
</VirtualHost>

<VirtualHost 97.158.253.26>
   DocumentRoot /home/www/site2
   ServerName www.my-site.com
</VirtualHost>

<VirtualHost 97.158.253.27>
   DocumentRoot /home/www/site3
   ServerName www.test-site.com
</VirtualHost>

<VirtualHost 97.158.253.27>
   DocumentRoot /home/www/site4
   ServerName www.another-site.com
</VirtualHost>

#
# Make sure the directories specified above
# have restricted access to read-only.
#
<Directory "/home/www/*">
   Order allow,deny
   Allow from all

   AllowOverride FileInfo AuthConfig Limit
   Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec
   <Limit GET POST OPTIONS>
      Order allow,deny
      Allow from all
   </Limit>
   <LimitExcept GET POST OPTIONS>
      Order deny,allow
      Deny from all
   </LimitExcept>

</Directory>

These statements would normally be found at the very bottom of the file where the virtual hosting statements reside. The last section of this configuration snippet has some additional statements to ensure read-only access to your Web pages with the exception of Web-based forms using POSTs (pages with "submit" buttons). Remember to restart Apache every time you update the httpd.conf file for the changes to take effect on the running process.

Note: You will have to configure your DNS server to point to the correct IP address used for each of the Web sites you host. Chapter 18, "Configuring DNS", shows you how to configure multiple domains, such as my-site.com and another-site.com, on your DNS server.

A Note On Virtual Hosting And SSL

Because it makes configuration easier, system administrators commonly replace the IP address in the <VirtualHost> and NameVirtualHost directives with the * wildcard character to indicate all IP addresses.

If you installed Apache with support for secure HTTPS/SSL, which is used frequently in credit card and shopping cart Web pages, then wild cards won't work. The Apache SSL module demands at least one explicit <VirtualHost> directive for IP-based virtual hosting. When you use wild cards, Apache interprets it as an overlap of name-based and IP-based <VirtualHost> directives and gives error messages because it can't make up its mind about which method to use:


Starting httpd: [Sat Oct 12 21:21:49 2002] [error] VirtualHost _default_:443 -- mixing * ports and non-* ports with a NameVirtualHost address is not supported, proceeding with undefined results

If you try to load any Web page on your web server, you'll see the error:


Bad request!

Your browser (or proxy) sent a request that this server could not understand.
If you think this is a server error, please contact the webmaster

The best solution to this problem is to use wild cards more sparingly. Don't use virtual hosting statements with wild cards except for the very first <VirtualHost> directive that defines the web pages to be displayed when matches to the other <VirtualHost> directives cannot be found. Here is an example.


NameVirtualHost *

<VirtualHost *>
   # Directives for other sites
</VirtualHost>

<VirtualHost 97.158.253.28>
   # Directives for the site that also runs on SSL
</VirtualHost>


IP Virtual Hosting Example: Single Wild Card

In this example, Apache listens on all interfaces, but gives the same content. Apache displays the content in the first <VirtualHost *> directive even if you add another right after it. Apache also seems to enforce the single <VirtualHost> container per IP address requirement by ignoring any ServerName directives you may use inside it.


<VirtualHost *>
   DocumentRoot /home/www/site1
</VirtualHost>

IP Virtual Hosting Example: Wild Card and IP addresses

In this example, Apache listens on all interfaces, but gives different content for addresses 97.158.253.26 and 97.158.253.27. Web surfers get the site1 content if they try to access the web server on any of its other IP addresses.


<VirtualHost *>
   DocumentRoot /home/www/site1
</VirtualHost>

<VirtualHost 97.158.253.26>
   DocumentRoot /home/www/site2
</VirtualHost>

<VirtualHost 97.158.253.27>
   DocumentRoot /home/www/site3
</VirtualHost>