Fedora iTOps Tube

Sunday, July 10, 2011

Fedora Simple Network Troubleshooting

You will eventually find yourself trying to fix a network-related problem, which usually appears in one of two forms. The first is slow response times from the remote server, and the second is a complete lack of connectivity. These symptoms can be caused by:

Sources of Network Slowness

  • NIC duplex and speed incompatibilities
  • Network congestion
  • Poor routing
  • Bad cabling
  • Electrical interference
  • An overloaded server at the remote end of the connection
  • Misconfigured DNS (Covered in Chapter 18, "Configuring DNS" and Chapter 19, "Dynamic DNS")

Sources of a Lack of Connectivity

All sources of slowness can become so severe that connectivity is lost. Additional sources of disconnections are:

  • Power failures
  • The remote server or an application on the remote server being shut down.

We discuss how to isolate these problems and more in the following sections.

Doing Basic Cable and Link Tests
Your server won't be able to communicate with any other device on your network unless the NIC's "link" light is on. This indicates that the connection between your server and the switch/router is functioning correctly.

In most cases a lack of link is due to the wrong cable type being used. As described in Chapter 2, "Introduction to Networking", there are two types of Ethernet cables: crossover and straight-through. Always make sure you are using the correct type.

Other sources of link failure include:

The cables are bad.
The switch or router to which the server is connected is powered down.
The cables aren't plugged in properly.
If you have an extensive network, investment in a battery-operated cable tester for basic connectivity testing is invaluable. More sophisticated models on the market will be able to tell you the approximate location of a cable break and whether an Ethernet cable is too long to be used.

Fedora Networking

Now that you have a firm grasp of many of the most commonly used networking concepts, it is time to apply them to the configuration of your server. Some of these activities are automatically covered during a Linux installation, but you will often find yourself having to know how to modify these initial settings whenever you need to move your server to another network, add a new network interface card or use an alternative means of connecting to the Internet.

In Chapter 2, "Introduction to Networking", we started with an explanation of TCP/IP, so we'll start this Linux networking chapter with a discussion on how to configure the IP address of your server.


Additional Introductory Topics

The last few topics of this chapter may not appear to be directly related to networking, but they cover Linux help methods that you'll use extensively and the File Transfer Protocol (FTP) package, which enables you to download all the software you need to get your Linux server operational as quickly as possible.

The File Transfer Protocol

FTP is one of the most popular applications used to copy files between computers via a network connection. Knowledge of FTP is especially important because it is a primary method of downloading software for Linux systems.
There are a number of commercially available GUI-based clients you can load on your PC to do this, such as WS_FTP and CuteFTP. You can also use FTP from the command line, as shown in Chapter 6, "Installing RPM Software".
From the remote user's perspective, there are two types of FTP. The first is regular FTP, which is used primarily to allow specific users to download files to their systems. The remote FTP server prompts you for a specific username and password to gain access to the data.
The second method, anonymous FTP, is used primarily to allow any remote user to download files to their systems. The remote FTP server prompts you for a username, at which point you type anonymous or ftp, with the password being any valid e-mail address.
From the systems administrator's perspective, there are another two categories. These are "active" and "passive" FTP, which are covered in more detail in Chapter 15, "Linux FTP Server Setup".
It is good to remember that FTP isn't very secure, as usernames, passwords, and data are sent across the network unencrypted. More secure forms such as SFTP (Secure FTP) and SCP (Secure Copy) are available as a part of the Secure Shell package (covered in Chapter 17, "Secure Remote Logins and File Copying") that is normally installed by default with Fedora.

Linux Help

Linux help files are accessed using the man or manual pages. From the command line you issue the man command followed by the Linux command or file about which you want to get information. If you want to get information on the ssh command, then you'd use the command man ssh.
[root@bigboy tmp]# man ssh
If you want to search all the man pages for a keyword, then use the man command with the -k switch, for example, man -k ssh which will give a list of all the man pages that contain the word ssh.
[root@bigboy tmp]# man -k ssh
...
...
ssh                  (1)  - OpenSSH SSH client (remote login program)
ssh [slogin]         (1)  - OpenSSH SSH client (remote login program)
ssh-agent            (1)  - authentication agent
ssh-keyscan          (1)  - gather ssh public keys
ssh_config           (5)  - OpenSSH SSH client configuration files
sshd                 (8)  - OpenSSH SSH daemon
sshd_config          (5)  - OpenSSH SSH daemon configuration file
...
...
[root@bigboy tmp]#
This book is targeted at proficient Linux beginners and above, so I'll be using a wide variety of commands without detailed explanations to help keep the flow brisk. If you need more help on a command, use its man page to get more details on what it does and the syntax it needs. Linux help can sometimes be cryptic, but with a little practice the man pages can become your friend.

Networking Equipment Terminology

Physical and Link Layers

TCP/IP can be quite interesting, but a knowledge of the first two layers of the OSI model is important too, because without them, even the most basic communication would be impossible.
There are many standards that define the physical, electrical, and error-control methodologies of data communication. One of the most popular ones in departmental networks is Ethernet, which is available in a variety of cable types and speed capabilities, but the data transmission and error correction strategy is the same in all.
Ethernet used to operate primarily in a mode where every computer on a network section shared the same Ethernet cable. Computers would wait until the line was clear before transmitting. They would then send their data while comparing what they wanted to send with what they actually sent on the cable as a means of error detection. If a mathematical comparison, or cyclic redundancy check (CRC), detected any differences between the two, the server would assume that it transmitted data simultaneously with another server on the cable. It would then wait some random time and retransmit at some later stage when the line was clear again.
Transmitting data only after first sensing whether the cable, which was strung between multiple devices, had the correct signaling levels is a methodology called carrier sense, multiple access or CSMA. The ability to detect garbling due to simultaneous data transmissions, also known as collisions, is called collision detect or CD. You will frequently see references to Ethernet being a CSMA/CD technology for this reason, and similar schemes are now being used in wireless networks.
Ethernet devices are now usually connected via a dedicated cable, using more powerful hardware capable of simultaneously transmitting and receiving without interference, thereby making it more reliable and inherently faster than its predecessor versions. The original Ethernet standard has a speed of 10 Mbps; the most recent versions can handle up to 40 Gbps!
The 802.11 specifications that define many wireless networking technologies are another example of commonly used layer 1 and 2 components of the OSI model. DSL, cable modem standards, and T1 circuits are all parts of these layers.
The next few sections describe many physical and link layer concepts and the operation of the devices that use them to connect the computers in our offices and homes.

How Subnet Masks Group IP Addresses into Networks

Subnet masks are used to tell which part of the IP address represents the network on which the computer is connected (network portion) and which part represents the computer's unique identifier on that network (host portion). The term netmasks is often used interchangeably with the term subnet masks; this book will use the latter term for the sake of consistency.
A simple analogy would be a phone number, such as (808) 225-2468. The (808) represents the area code, and the 225-2468 represents the telephone within that area code. Subnet masks allow you to specify how long you want the area code to be (network portion) at the expense of the number of telephones that are in the area code (host portion).
Most home networks use a subnet mask of 255.255.255.0. Each 255 means this octet is for the area code (network portion). So if your server has an IP address of 192.168.1.25 and a subnet mask of 255.255.255.0, the network portion would be 192.168.1 and the server or host would be device #25 on that network.
In all cases, the first IP address in a network is reserved as the network's base address and the last one is reserved for broadcast traffic that is intended to be received by all devices on the network. In our example, 192.168.1.0 would be the network address and 192.168.1.255 would be used for broadcasts. This means you can then use IP addresses from #1 to #254 on your private network.
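
The split between the network and host portions is simply a bitwise AND of the IP address with the subnet mask. Here is a minimal shell sketch of that idea. The function names ip_to_int, int_to_ip, network_portion, and host_portion are my own, chosen for illustration, and are not part of any script in this book.

```shell
# Convert a dotted decimal IPv4 address into one 32-bit number.
# The function body is in ( ) so the IFS change stays inside it.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Convert a 32-bit number back into dotted decimal notation.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Network portion: keep only the bits selected by the subnet mask.
network_portion() {
    int_to_ip $(( $(ip_to_int "$1") & $(ip_to_int "$2") ))
}

# Host portion: keep only the bits the subnet mask leaves out.
host_portion() {
    echo $(( $(ip_to_int "$1") & ~$(ip_to_int "$2") & 4294967295 ))
}

network_portion 192.168.1.25 255.255.255.0   # prints 192.168.1.0
host_portion 192.168.1.25 255.255.255.0      # prints 25
```

The sketch assumes a well-formed address and mask and does no input validation.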

Calculating The Number of Addresses Assigned to a Subnet

Most office and home networks use networks with 256 IP addresses or fewer, in which the subnet mask starts with the numbers 255.255.255. This is not a pure networking text, so I'll not discuss larger networks because that can become complicated, but in cases where 256 or fewer IP addresses are required a few simple rules apply. There are only seven possible values for the last octet of a subnet mask: 0, 128, 192, 224, 240, 248, and 252. You can calculate the number of IP addresses for each of these by subtracting the value from 256.
In many cases the subnet mask isn't referred to by the dotted decimal notation, but rather by the actual number of bits in the mask. So for example a mask of 255.255.255.0 may be called a /24 (slash 24) mask instead. A list of the most commonly used masks in the office or home environment is presented in Table 2-2.

Table 2-2: The "Dotted Decimal" And "Slash" Subnet Mask Notations

Dotted Decimal Format    Slash Format    Available Addresses
255.255.255.0            /24             256
255.255.255.128          /25             128
255.255.255.192          /26             64
255.255.255.224          /27             32
255.255.255.240          /28             16
255.255.255.248          /29             8
255.255.255.252          /30             4

So for example, if you have a subnet mask of 255.255.255.192, then you have 64 IP addresses in your subnet (256 - 192).
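
The subtraction rule and the slash notation can be checked with a few lines of shell. This is only a sketch that assumes the mask starts with 255.255.255, as in Table 2-2. The helper names subnet_size and slash_format are hypothetical.

```shell
# Number of addresses in the subnet: 256 minus the last octet of the mask.
# The body is in ( ) so the IFS change stays inside the function.
subnet_size() (
    IFS=.
    set -- $1
    echo $(( 256 - $4 ))
)

# Slash format: 32 minus the number of host bits, where the subnet
# size is 2 raised to the number of host bits.
slash_format() (
    size=$(subnet_size "$1")
    bits=0
    while [ "$size" -gt 1 ]; do
        size=$(( size / 2 ))
        bits=$(( bits + 1 ))
    done
    echo "/$(( 32 - bits ))"
)

# Print the mask, its slash format, and its size for a few masks.
for mask in 255.255.255.0 255.255.255.192 255.255.255.252; do
    echo "$mask $(slash_format "$mask") $(subnet_size "$mask")"
done
```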

Calculating the Range of Addresses on Your Network

If someone gives you an IP address of 97.158.253.28 and a subnet mask of 255.255.255.248, how do you determine the network address and the broadcast address, in other words the boundaries, of your network? The following section outlines the steps to do this using both a manual and programmed methodology.

Manual Calculation

Take out your pencil and paper; manual calculation can be tricky. Here we go!

  1. Subtract the last octet of the subnet mask from 256 to give the number of IP addresses in the subnet. (256 - 248) = 8
  2. Divide the last octet of the IP address by the result of step 1; don't bother with the remainder (for example 28 / 8 = 3). This gives you the theoretical number of subnets of the same size that are below this IP address.
  3. Multiply this result by the result of step 1 to get the network address (8 x 3 = 24). Think of it as the third subnet with 8 addresses in it. The network address is therefore 97.158.253.24
  4. The broadcast address is the result of step 3 plus the result of step 1 minus 1 (24 + 8 - 1 = 31). Think of it as the broadcast address being the network address plus the number of IP addresses in the subnet minus 1. The broadcast address is 97.158.253.31

Let's do this for 192.168.3.56 with a mask of 255.255.255.224:

  1. 256 - 224 = 32
  2. 56/32 = 1
  3. 32 x 1 = 32. Therefore the network base address is 192.168.3.32
  4. 32 + 32 - 1 = 63. Therefore the broadcast address is 192.168.3.63

Let's do this for 10.0.0.75 with a mask of 255.255.255.240:

  1. 256 - 240 = 16
  2. 75/16 = 4
  3. 16 x 4 = 64. Therefore the network base address is 10.0.0.64
  4. 64 + 16 -1 = 79. Therefore the broadcast address is 10.0.0.79

Note: As a rule of thumb, the last octet of your network base address must be divisible by 256 minus the last octet of your subnet mask, leaving no remainder. If you are sub-netting a large chunk of IP addresses it's always a good idea to lay it out on a spreadsheet to make sure there are no overlapping subnets. Once again, this calculation exercise only works with subnet masks that start with "255.255.255".
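
The four manual steps translate directly into shell arithmetic. The sketch below is my own illustration, not the Appendix II script. The function name subnet_range is hypothetical, and like the manual method it only handles masks that start with 255.255.255.

```shell
# Print the network base address and broadcast address for an IP
# address and a subnet mask that starts with 255.255.255.
# The body is in ( ) so the variables stay inside the function.
subnet_range() (
    last_ip_octet=${1##*.}      # text after the final dot of the address
    last_mask_octet=${2##*.}    # text after the final dot of the mask
    prefix=${1%.*}              # first three octets of the address

    size=$(( 256 - last_mask_octet ))      # step 1: addresses in the subnet
    index=$(( last_ip_octet / size ))      # step 2: integer division, no remainder
    network=$(( index * size ))            # step 3: last octet of the network address
    broadcast=$(( network + size - 1 ))    # step 4: last octet of the broadcast address

    echo "$prefix.$network $prefix.$broadcast"
)

subnet_range 97.158.253.28 255.255.255.248   # prints 97.158.253.24 97.158.253.31
subnet_range 192.168.3.56 255.255.255.224    # prints 192.168.3.32 192.168.3.63
subnet_range 10.0.0.75 255.255.255.240       # prints 10.0.0.64 10.0.0.79
```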

Calculation Using a Script

There is a BASH script in Appendix II, "Codes, Scripts, and Configurations", that will do this for you. Here is an example of how to use it: just provide the IP address followed by the subnet mask as arguments. It will accept subnet masks in dotted decimal format or /value format.

[root@bigboy tmp]# ./subnet-calc.sh 216.151.193.92 /28
IP Address           : 216.151.193.92
Network Base Address : 216.151.193.80
Broadcast Address    : 216.151.193.95
Subnet Mask          : 255.255.255.240
Subnet Size          : 16 IP Addresses
[root@bigboy tmp]#

Subnet Masks for the Typical Business DSL Line

If you purchased a DSL service from your ISP that gives you fixed IP addresses, they will most likely provide you with a subnet mask of 255.255.255.248 that defines 8 IP addresses. For example, if the ISP provides you with a public network address of 97.158.253.24, a subnet mask of 255.255.255.248, and a gateway of 97.158.253.25, then your IP addresses will be:
97.158.253.24 - Network base address
97.158.253.25 - Gateway
97.158.253.26 - Available
97.158.253.27 - Available
97.158.253.28 - Available
97.158.253.29 - Available
97.158.253.30 - Available
97.158.253.31 - Broadcast

How IP Addresses Are Used To Access Network Devices

All TCP/IP enabled devices connected to the Internet have an Internet Protocol (IP) address. Just like a telephone number, it helps to uniquely identify a user of the system. The Internet Assigned Numbers Authority (IANA) is the organization responsible for assigning IP addresses to Internet Service Providers (ISPs) and deciding which ones should be used for the public Internet and which ones should be used on private networks.
IP addresses are in reality a string of 32 binary digits or bits. For ease of use, network engineers often divide these 32 bits into four sets of 8 bits (or octets), each representing a number from 0 to 255. Each number is then separated by a period (.) to create the familiar dotted decimal notation. An example of an IP address that follows these rules is 97.65.25.12.
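
To see the 32 bits behind a dotted decimal address, you can convert each octet to eight binary digits. This is a small sketch for illustration only. The function names octet_to_bits and ip_to_bits are my own.

```shell
# Convert a number from 0 to 255 into an 8-digit binary string.
octet_to_bits() (
    n=$1
    bits=""
    i=0
    while [ "$i" -lt 8 ]; do
        bits="$(( n % 2 ))$bits"    # prepend the lowest remaining bit
        n=$(( n / 2 ))
        i=$(( i + 1 ))
    done
    echo "$bits"
)

# Split the address on the dots and convert each of the four octets.
ip_to_bits() (
    IFS=.
    set -- $1
    echo "$(octet_to_bits "$1") $(octet_to_bits "$2") $(octet_to_bits "$3") $(octet_to_bits "$4")"
)

ip_to_bits 97.65.25.12   # prints 01100001 01000001 00011001 00001100
```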
Note: Chapter 3, "Linux Networking", which covers Linux specific networking topics, explains how to configure the IP address of your Linux box.

Private IP Addresses

Some groups of IP addresses are reserved for use only in private networks and are not routed over the Internet. These are called private IP addresses and have the following ranges:
10.0.0.0 - 10.255.255.255
172.16.0.0 - 172.31.255.255
192.168.0.0 - 192.168.255.255
Home networking equipment/devices usually are configured in the factory with an IP address in the range 192.168.1.1 to 192.168.1.254.
You may be wondering how devices using private addresses could ever access the Internet if private addresses are not routed over it. The situation gets even more confusing when you consider that hundreds of thousands of office and home networks use these same addresses, which would seem to guarantee networking confusion. Don't worry, this problem is overcome by NAT.
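
If you want a quick way to tell whether an address falls in one of the three private ranges, a shell case statement is enough. This is a sketch that assumes a well-formed dotted decimal address. The function name is_private is hypothetical.

```shell
# Succeed (exit status 0) if the address is in a private range.
is_private() {
    case "$1" in
        10.*)                                   return 0 ;;  # 10.0.0.0 - 10.255.255.255
        172.1[6-9].*|172.2[0-9].*|172.3[01].*)  return 0 ;;  # 172.16.0.0 - 172.31.255.255
        192.168.*)                              return 0 ;;  # 192.168.0.0 - 192.168.255.255
        *)                                      return 1 ;;
    esac
}

for ip in 10.0.0.75 172.16.0.1 192.168.1.25 97.158.253.28; do
    if is_private "$ip"; then
        echo "$ip is private"
    else
        echo "$ip is public"
    fi
done
```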

The localhost IP Address

Whether or not your computer has a network interface card it will have a built-in IP address with which network-aware applications can communicate with one another. This IP address is defined as 127.0.0.1 and is frequently referred to as localhost. This concept is important to understand, and will be revisited in many later chapters.

Network Address Translation (NAT) Makes Private IPs Public

Your router/firewall will frequently be configured to give the impression to other devices on the Internet that all the servers on your home/office network have a valid public IP address, and not a "private" IP address. This is called network address translation (NAT) and is often also called IP masquerading in the Linux world. There are many good reasons for this; the two most commonly stated are:
  • No one on the Internet knows your true IP address. NAT protects your home PCs by assigning them IP addresses from "private" IP address space that cannot be routed over the Internet. This prevents hackers from directly attacking your home systems because packets sent to the "private" IP will never pass over the Internet.
  • Hundreds of PCs and servers behind a NAT device can masquerade as a single public IP address. This greatly increases the number of devices that can access the Internet without running out of "public" IP addresses.
You can configure NAT to be one to one, in which case you ask your ISP to assign you a number of public IP addresses for the Internet-facing interface of your firewall and then pair each of these addresses with a corresponding server on your protected private IP network. You can also use many to one NAT, in which the firewall maps a single public IP address to multiple servers on the network.
As a general rule, you won't be able to access the public NAT IP addresses from servers on your home network. Basic NAT testing requires you to ask a friend to try to connect to your home network from the Internet.
Examples of NAT may be found in the IP masquerade section of Chapter 14, "Linux Firewalls Using iptables", that covers the Linux iptables firewall. Some of the terms mentioned here may be unfamiliar to you but they will be explained in later sections of this chapter.

Port Forwarding with NAT Facilitates Home-Based Web sites

In a simple home network, all servers accessing the Internet will appear to have the single public IP address of the router/firewall because of many to one NAT. Because the router/firewall is located at the border crossing to the Internet, it can easily keep track of all the various outbound connections to the Internet by monitoring:
  • The IP addresses and TCP ports used by each home based server and mapping them to
  • The TCP ports and IP addresses of the Internet servers with which they want to communicate.
This arrangement works well with a single NAT IP trying to initiate connections to many Internet addresses. The reverse isn't true.
New connections initiated from the Internet to the public IP address of the router/firewall face a problem. The router/firewall has no way of telling which of the many home PCs behind it should receive the relayed data because the mapping mentioned earlier doesn't exist beforehand. In this case the data is usually discarded.
Port forwarding is a method of counteracting this. For example, you can configure your router/firewall to forward TCP port 80 (Web/HTTP) traffic destined to the outside NAT IP to be automatically relayed to a specific server on the inside home network.
As you may have guessed, port forwarding is one of the most common methods used to host Web sites at home with DHCP DSL.
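
On a Linux firewall, port forwarding is commonly done with an iptables DNAT rule in the nat table. The sketch below only prints such a rule so you can review it before running it as root. The function name port_forward_rule, the interface eth0, and the inside address 192.168.1.100 are assumptions made for illustration, and a real firewall will usually also need a matching FORWARD rule that permits the relayed traffic.

```shell
# Print (do not run) an iptables rule that relays a TCP port arriving
# on the outside interface to the same port on an inside server.
#   $1 = outside interface, $2 = TCP port, $3 = inside server IP
port_forward_rule() {
    echo "iptables -t nat -A PREROUTING -i $1 -p tcp --dport $2 -j DNAT --to-destination $3:$2"
}

# Hypothetical example: relay Web traffic to an inside Web server.
port_forward_rule eth0 80 192.168.1.100
```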

DHCP

The Dynamic Host Configuration Protocol (DHCP) is a protocol that automates the assignment of IP addresses, subnet masks, default routers, and other IP parameters.
The assignment usually occurs when the DHCP configured machine boots up, or regains connectivity to the network. The DHCP client sends out a query requesting a response from a DHCP server on the locally attached network. The DHCP server then replies to the client PC with its assigned IP address, subnet mask, DNS server and default gateway information.
The assignment of the IP address usually expires after a predetermined period of time, at which point the DHCP client and server renegotiate a new IP address from the server's predefined pool of addresses. Configuring firewall rules to accommodate access from machines that receive their IP addresses via DHCP is therefore more difficult because the remote IP address will vary from time to time. You'll probably have to allow access for the entire remote DHCP subnet for a particular TCP/UDP port.
Most home router/firewalls are configured in the factory to be DHCP servers for your home network. You can also make your Linux box into a DHCP server, once it has a fixed IP address.
The most commonly used form of DSL will also assign the outside interface of your router/firewall with a single DHCP provided IP address.
You can check Chapter 3, "Linux Networking", for how to configure your Linux box to get its IP address via DHCP. You can also look at Chapter 8, "Configuring the DHCP Server", to make your Linux box provide the DHCP addresses for the other machines on your network.

How DNS Links Your IP Address To Your Web Domain

The domain name system (DNS) is a worldwide server network used to help translate easy to remember domain names like www.linuxhomenetworking.com into an IP address that can be used behind the scenes by your computer. Here is a step-by-step description of what happens with a DNS lookup.
  1. Most home computers will get the IP address of their DNS server via DHCP from their router/firewall.
  2. A home router/firewall providing DHCP services often provides its own IP address as the DNS name server address for home computers.
  3. The router/firewall then redirects the DNS queries from your computer to the DNS name server of your Internet service provider (ISP).
  4. Your ISP's DNS server then probably redirects your query to one of the 13 "root" name servers.
  5. The root server then redirects your query to one of the Internet's ".com" DNS name servers which will then redirect the query to the "linuxhomenetworking.com" domain's name server.
  6. The linuxhomenetworking.com domain name server then responds with the IP address for www.linuxhomenetworking.com
As you can imagine, this process can cause a noticeable delay when you are browsing the Web. Each server in the chain will store the most frequent DNS name to IP address lookups in a memory cache, which helps to speed up the response. Chapter 18, "Configuring DNS", explains how you can make your Linux box into a caching or regular DNS server for your network or Web site if your ISP provides you with fixed IP addresses. Chapter 19, "Dynamic DNS", explains how to configure DNS for a Web site housed on a DHCP DSL circuit where the IP address constantly changes. It explains the auxiliary DNS standard called dynamic DNS (DDNS) that was created for this type of scenario.

IP Version 6 (IPv6)

Most Internet-capable networking devices use version 4 of the Internet Protocol (IPv4) which I have described here. You should also be aware that there is now a version 6 (IPv6) that has recently been developed as a replacement.
With only 32 bits, the pool of version 4 addresses available for allocation to the world's ISPs will soon be exhausted. Version 6, which uses a much larger 128-bit address, offers eighty billion, billion, billion times more IP addresses, which it is hoped should last for most of the 21st century.
IPv6 packets are also labeled to provide quality-of-service information that can be used in prioritizing real-time applications, such as video and voice, over less time-sensitive ones such as regular Web surfing and chat. IPv6 also inherently supports the IPSec protocol suite used in many forms of secured networks, such as virtual private networks (VPNs).
Most current operating systems support IPv6 even though it isn't currently being used extensively within corporate or home environments. Expect it to become an increasingly bigger part of your network planning in years to come.

Introduction to TCP/IP

TCP/IP is a universal standard suite of protocols used to provide connectivity between networked devices. It is part of the larger OSI model upon which most data communications is based.
One component of TCP/IP is the Internet Protocol (IP) which is responsible for ensuring that data is transferred between two addresses without being corrupted.
For manageability, the data is usually split into multiple pieces or packets each with its own error detection bytes in the control section or header of the packet. The remote computer then receives the packets and reassembles the data and checks for errors. It then passes the data to the program that expects to receive it.
How does the computer know what program needs the data? Each IP packet also contains a piece of information in its header called the protocol field (sometimes loosely called the type field). This informs the computer receiving the data about the type of layer 4 transportation mechanism being used.
The two most popular transportation mechanisms used on the Internet are Transmission Control Protocol (TCP) and User Datagram Protocol (UDP).
When the type of transport protocol has been determined, the TCP/UDP header is then inspected for the "port" value, which is used to determine which network application on the computer should process the data. This is explained in more detail later.

TCP Is a Connection-Oriented Protocol

TCP opens up a virtual connection between the client and server programs running on separate computers so that multiple and/or sporadic streams of data can be sent over an indefinite period of time between them. TCP keeps track of the packets sent by giving each one a sequence number with the remote server sending back acknowledgment packets confirming correct delivery. Programs that use TCP therefore have a means of detecting connection failures and requesting the retransmission of missing packets. TCP is a good example of a connection-oriented protocol.

How TCP Establishes A Connection

Any form of communication requires some form of acknowledgement for it to become meaningful. Someone knocks on the door to a house, the person inside asks "Who is it?", to which the visitor replies, "It's me!" Then the door opens. Both persons knew who was on the other side of the door before it opened, and a conversation can now begin.
TCP acts in a similar way. The server initiating the connection sends a segment with the SYN bit set in the TCP header. The target replies with a segment with the SYN and ACK bits set, to which the originating server replies with a segment with the ACK bit set. This SYN, SYN-ACK, ACK mechanism is often called the "three-way handshake".
The communication then continues with a series of segment exchanges, each with the ACK bit set. When one of the servers needs to end the communication, it sends a segment to the other with the FIN and ACK bits set, to which the other server replies with a FIN-ACK segment of its own. The communication terminates with a final ACK from the server that wanted to end the session.
This is the equivalent of ending a conversation by saying "I really have to go now, I have to go for lunch", to which the reply is "I think I'm finished here too, see you tomorrow..." The conversation ends with a final "bye" from the hungry person.
Here is a modified packet trace obtained from the tethereal program discussed in Chapter 4, "Simple Network Troubleshooting". You can clearly see the three-way handshakes used to connect and disconnect the session.
hostA -> hostB TCP 1443 > http [SYN] Seq=9766 Ack=0 Win=5840 Len=0
hostB -> hostA TCP http > 1443 [SYN, ACK] Seq=8404 Ack=9767 Win=5792 Len=0
hostA -> hostB TCP 1443 > http [ACK] Seq=9767 Ack=8405 Win=5840 Len=0
hostA -> hostB HTTP HEAD / HTTP/1.1
hostB -> hostA TCP http > 1443 [ACK] Seq=8405 Ack=9985 Win=54 Len=0
hostB -> hostA HTTP HTTP/1.1 200 OK
hostA -> hostB TCP 1443 > http [ACK] Seq=9985 Ack=8672 Win=6432 Len=0
hostB -> hostA TCP http > 1443 [FIN, ACK] Seq=8672 Ack=9985 Win=54 Len=0
hostA -> hostB TCP 1443 > http [FIN, ACK] Seq=9985 Ack=8673 Win=6432 Len=0
hostB -> hostA TCP http > 1443 [ACK] Seq=8673 Ack=9986 Win=54
In this trace, the sequence number represents the serial number of the first byte of data in the segment. So in the first line, a random value of 9766 was assigned to the first byte and all subsequent bytes for the connection from this host will be sequentially tracked. This makes the second byte in the segment number 9767, the third number 9768 etc. The acknowledgment number or Ack, not to be confused with the ACK bit, is the byte serial number of the next segment it expects to receive from the other end, and the total number of bytes cannot exceed the Win or window value that follows it. If data isn't received correctly, the receiver will re-send the requesting segment asking for the information to be sent again. The TCP code keeps track of all this along with the source and destination ports and IP addresses to ensure that each unique connection is serviced correctly.
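
You can check the numbers in the trace with simple arithmetic: the acknowledgment number sent back is the sequence number received, plus the number of data bytes, plus 1 if the SYN or FIN bit was set, because each of those bits uses up one sequence number. The sketch below uses a hypothetical function name, next_ack. The 218 byte length of the HTTP request is not shown in the trace, and I inferred it from the jump in Ack values (9985 - 9767).

```shell
# Acknowledgment number the other end should send back.
#   $1 = Seq, $2 = data bytes in the segment, $3 = 1 if SYN or FIN is set, else 0
next_ack() {
    echo $(( $1 + $2 + $3 ))
}

next_ack 9766 0 1     # SYN from hostA,       hostB answers Ack=9767
next_ack 8404 0 1     # SYN, ACK from hostB,  hostA answers Ack=8405
next_ack 9767 218 0   # HTTP request,         hostB answers Ack=9985
next_ack 8672 0 1     # FIN, ACK from hostB,  hostA answers Ack=8673
next_ack 9985 0 1     # FIN, ACK from hostA,  hostB answers Ack=9986
```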

UDP, TCP's "Connectionless" Cousin

UDP is a connectionless protocol. Data is sent on a "best effort" basis with the machine that sends the data having no means of verifying whether the data was correctly received by the remote machine. UDP is usually used for applications in which the data sent is not mission-critical. It is also used when data needs to be broadcast to all available servers on a locally attached network where the creation of dozens of TCP connections for a short burst of data is considered resource-hungry.

TCP and UDP Ports

The data portion of the IP packet contains a TCP or UDP segment sandwiched inside. Only the TCP segment header contains sequence information, but both the UDP and the TCP segment headers track the port being used. The source/destination port and the source/destination IP addresses of the client and server computers are then combined to uniquely identify each data flow.
Certain programs are assigned specific ports that are internationally recognized. For example, port 80 is reserved for HTTP Web traffic, and port 25 is reserved for SMTP e-mail. Ports below 1024 are reserved for privileged system functions, and those numbered 1024 and above are generally reserved for non-system third-party applications.
Usually when a connection is made from a client computer requesting data to the server that contains the data:
  • The client selects a random, previously unused "source" port numbered 1024 or higher and queries the server on the "destination" port specific to the application. If it is an HTTP request, the client will use a source port of, say, 2049 and query the server on port 80 (HTTP).
  • The server recognizes the port 80 request as an HTTP request and passes on the data to be handled by the Web server software. When the Web server software replies to the client, it tells the TCP application to respond back to port 2049 of the client using a source port of port 80.
  • The client keeps track of all its requests to the server's IP address and will recognize that the reply on port 2049 isn't a request initiation for "NFS", but a response to the initial port 80 HTTP query.
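The exchange described above can be observed from a program. The following Python sketch, which uses only the standard socket module, opens a TCP connection over the loopback interface and reports the address and port pairs at each end. The function name show_flow is hypothetical, and the listener uses whatever free port the operating system hands out instead of port 80, since binding to ports below 1024 normally requires root privileges.

```python
import socket


def show_flow():
    """Open a loopback TCP connection and return the addresses that identify it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))       # stand-in for a well-known port such as 80
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())  # the OS picks the client's source port
    connection, peer = listener.accept()
    try:
        return {
            "client": client.getsockname(),        # (source IP, source port)
            "server": client.getpeername(),        # (destination IP, destination port)
            "client_as_seen_by_server": peer,      # where the server sends replies
        }
    finally:
        connection.close()
        client.close()
        listener.close()


for name, address in show_flow().items():
    print(name, address)
```

The server sees the same source port the client chose, which is how its replies find their way back to the right request.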

The TCP/IP "Time To Live" Feature

Each IP packet has a Time to Live (TTL) section that keeps track of the number of network devices the packet has passed through to reach its destination. The server sending the packet sets the initial TTL value, and each network device that the packet passes through then reduces this value by 1. If the TTL value reaches 0, the network device will discard the packet.
This mechanism helps to ensure that bad routing on the Internet won't cause packets to aimlessly loop around the network without being removed. TTLs therefore help to reduce the clogging of data circuits with unnecessary traffic.
Remember this concept, as it will be helpful in understanding the traceroute troubleshooting technique outlined in Chapter 4, "Simple Network Troubleshooting".
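The TTL countdown is easy to model. The short Python simulation below follows the description above literally: every device on the path subtracts 1, and a device that brings the value to 0 discards the packet. The function name forward_packet and the hop counts are made up for illustration; no real packets are sent.

```python
def forward_packet(initial_ttl, devices_on_path):
    """Simulate a packet crossing a number of network devices.

    Returns ("delivered", remaining_ttl) or ("discarded", hop_number).
    """
    ttl = initial_ttl
    for hop in range(1, devices_on_path + 1):
        ttl -= 1                      # each device reduces the TTL by 1
        if ttl == 0:
            return ("discarded", hop)  # this device drops the packet
    return ("delivered", ttl)


# A generous TTL easily survives a 10-device path.
print(forward_packet(64, 10))
# Deliberately small TTLs expire at the first, second, and third device in turn.
print([forward_packet(ttl, 10) for ttl in (1, 2, 3)])
```

The second call hints at how traceroute works: by sending packets with TTLs of 1, 2, 3 and so on, it makes each device along the path reveal itself when it discards a packet.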

The ICMP Protocol and Its Relationship to TCP/IP

There is another commonly used protocol called the Internet Control Message Protocol (ICMP). It is not strictly a TCP/IP protocol, but TCP/IP-based applications use it frequently.
ICMP provides a suite of error, control, and informational messages for use by the operating system. For example, IP packets will occasionally arrive at a server with corrupted data due to any number of reasons, including a bad connection, electrical interference, or even misconfiguration. The server will usually detect this by examining the packet and correlating the contents to what it finds in the IP header's error control section. It will then issue an ICMP reject message to the original sending machine saying that the data should be re-sent because the original transmission was corrupted.
ICMP also includes echo and echo reply messages used by the Linux ping command to confirm network connectivity. ICMP TTL expired messages are also sent by network devices back to the originating server whenever the TTL in a packet is decremented to zero. More information on ICMP messages can be found in both Appendix 1, "Miscellaneous Linux Topics", and Chapter 4, "Simple Network Troubleshooting".
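As a rough illustration of what ping puts on the wire, the Python sketch below builds an ICMP echo request (message type 8, code 0) and fills in its error-checking field using the standard Internet checksum. The function names, identifier, sequence number, and payload are all arbitrary choices for this example. The sketch only constructs the bytes; actually sending them would need a raw socket, which normally requires root privileges.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMP message type used by ping; echo reply is type 0


def internet_checksum(data):
    """16-bit ones' complement checksum used by IP and ICMP headers."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold any carry back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier, sequence, payload):
    """Return the bytes of an ICMP echo request with a valid checksum."""
    # Type, code, checksum (zero for now), identifier, sequence number
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload


packet = build_echo_request(identifier=1, sequence=1, payload=b"ping")
# A receiver re-runs the checksum over the whole message; 0 means "not corrupted".
print(len(packet), internet_checksum(packet) == 0)
```

If any byte of the message is altered in transit, the receiver's checksum no longer comes out to zero, which is how corruption is detected.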

The OSI Networking Model

The Open Systems Interconnection (OSI) model, developed by the International Organization for Standardization, defines how the various hardware and software components involved in data communication should interact with each other.

A good analogy would be a traveler who prepares herself to return home through many dangerous kingdoms by obtaining permits to enter each country at the very beginning of the trip. At each frontier our friend has to hand over a permit to enter the country. Once inside, she asks the border guards for directions to reach the next frontier and displays the permit for that new kingdom as proof that she has a legitimate reason for wanting to go there.
In the OSI model each component along the data communications path is assigned a layer of responsibility, in other words, a kingdom over which it rules. Each layer extracts the permit, or header information, it needs from the data and uses this information to correctly forward what's left to the next layer. This layer also strips away its permit and forwards the data to the next layer, and so the cycle continues for seven layers.
The very first layer of the OSI model describes the transmission attributes of the cabling or wireless frequencies used at each "link" or step along the way. Layer 2 describes the error correction methodologies to be used on the link; layer 3 ensures that the data can hop from link to link on the way to the final destination described in its header. When the data finally arrives, the layer 4 header is used to determine which locally installed software application should receive it. The application uses the guidelines of layer 5 to keep track of the various communications sessions it has with remote computers and uses layer 6 to verify that the communication or file format is correct. Finally, layer 7 defines what the end user will see in the form of an interface, be it graphical on a screen or otherwise. A description of the functions of each layer in the model can be seen in Table 2-1.

Table 2-1: The Seven OSI Layers

Layer 7: Application
  • The user interface to the application
  Application: telnet, FTP, sendmail

Layer 6: Presentation
  • Converts data from one presentation format to another. For example, e-mail text entered into Outlook Express being converted into SMTP mail formatted data.

Layer 5: Session
  • Manages continuing requests and responses between the applications at both ends over the various established connections.

Layer 4: Transport
  • Manages the establishment and tearing down of a connection. Ensures that unacknowledged data is retransmitted. Correctly re-sequences data packets that arrive in the wrong order.
  • After the packet's overhead bytes have been stripped away, the resulting data is said to be a segment.
  Application: TCP, UDP

Layer 3: Network
  • Handles the routing of data between links that are not physically connected together.
  • After the link's overhead bytes have been stripped away, the resulting data is said to be a packet.
  Application: IP, ARP

Layer 2: Link
  • Error control and timing of bits speeding down the wire between two directly connected devices.
  • Data sent on a link is said to be structured in frames.
  Application: Ethernet, ARP

Layer 1: Physical
  • Defines the electrical and physical characteristics of the network cabling and interfacing hardware
  Application: Ethernet
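The "permit" idea from the analogy can be sketched in a few lines of Python. In this toy model each layer's header is just a text tag; real Ethernet, IP, and TCP headers are binary structures with many fields, so the tags and function names here are placeholders for illustration only. The sender wraps the data from layer 4 down to layer 2, and the receiver strips the headers off in the opposite order.

```python
# Placeholder headers, listed in the order the sender adds them (layer 4 first).
LAYER_HEADERS = [b"[TCP]", b"[IP]", b"[ETH]"]


def encapsulate(data):
    """Wrap application data in a segment, then a packet, then a frame."""
    for header in LAYER_HEADERS:
        data = header + data
    return data


def decapsulate(frame):
    """Strip each layer's header in turn, outermost (layer 2) first."""
    for header in reversed(LAYER_HEADERS):
        if not frame.startswith(header):
            raise ValueError("missing header: " + header.decode())
        frame = frame[len(header):]   # hand what's left to the next layer up
    return frame


frame = encapsulate(b"HEAD / HTTP/1.1")
print(frame)               # the frame as it would travel on the link
print(decapsulate(frame))  # what the application finally receives
```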

Introduction to Networking on Fedora

Installing the Linux operating system is only the first step toward creating a fully functional departmental server or Web site. Almost all computers are now networked in some way to other devices, so a basic understanding of networking and related issues is essential to feeling comfortable with Linux servers.
This introductory chapter forms the foundation on which the following network configuration and troubleshooting chapters will be built. These chapters will then introduce the remaining chapters that cover Linux troubleshooting, general software installation and the configuration of many of the most popular Linux applications used in corporate departments and Small Office/Home Office (SOHO) environments.
Familiarity with the concepts explained in the following sections will help answer many of the daily questions often posed by coworkers, friends, and even yourself. It will help make the road to Linux mastery less perilous, a road that begins with an understanding of the OSI networking model and TCP/IP.




Hosting Your Own Web Site

Introduction

Web sites have proliferated greatly over the years to become a part of everyday life for many people. People use them to create Web logs of their daily lives, provide family members with a place to store their memories or to tell people of their experiences in getting things to work. The following is a typical Web site address:
www.example.com

Businesses originally used them primarily as a marketing tool, but later expanded them to become an important part of their operations. Many companies rely almost exclusively on their Web sites to sell their products and provide both customer and supplier support services.
The decision as to whether or not to host your own Web site can be difficult. You have to consider factors of cost and convenience as well as service and support. This chapter briefly addresses the most common issues and outlines the simple network architecture for a small office or home on which the rest of the book is based.
Not all businesses, departments, and homes require a Web site, but the process of establishing one touches many aspects of not only Linux, but information technology as well. This book is about setting up Linux servers to do the things that most businesses and homes need. It's about getting the job done.
With this in mind, the book is divided into three sections of gradually increasing complexity to make this process easier. After this chapter, the first section introduces you to networking, software installation and troubleshooting before the first major project of using Linux as a main departmental file server for Windows PCs. The next section expands upon this knowledge to show you how to create, manage and monitor your own Linux-based Web site on this network using a simple DSL or cable modem Internet connection. Finally, the third section covers more advanced topics that will become invaluable as your Linux administration role expands.

Our Network

The typical small office or home network is usually quite simple with a router/firewall, connected to a broadband Internet connection, protecting a single network on which all servers and PCs are connected as seen in Figure 1-1.
As stated before, the rest of the book shows you how to turn a simple layout such as this into a functional low-volume Web site, but before you do, it would be best to weigh the pros and cons.

Figure 1-1: Wireless home network topology

Alternatives To In-House Web Hosting

There are two broad categories of hosting options for small Web sites. Companies that host multiple Web sites on the same server are called virtual hosting providers. Those that allow you to use servers completely dedicated to your site are called dedicated hosting providers. Dedicated providers might provide you with only a network connection for a server you purchase and install in their data center, or they might offer a menu of services, from monitoring to backups, from which to choose.

Virtual Hosting

It is easy to find virtual hosting companies on the Web that offer to host a simple Web site for about $10 per month.
The steps are fairly straightforward:
  1. Sign up for the virtual hosting service. They will provide you with a login name and password, the IP address of your site, and the name of a private directory on a shared Web server in which you'll place your Web pages.
  2. Register your domain name, such as www.my-site.com, with companies like Register.com, Verisign or RegisterFree.com. You must make sure your new domain name's DNS records point to the DNS server of the virtual hosting company.
  3. Upload your Web pages to your private virtual hosting directory.
  4. Start testing your site using your IP address in your Web browser. It takes about 3-4 days for DNS to propagate across the Web, so you'll probably have to wait at least that long before you'll be able to view your site using your domain, www.my-site.com.
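While waiting for step 4 to complete, you can check from a script whether your domain name already resolves to your site's IP address. The following Python sketch is only one way to do this: the function name dns_points_to is made up, and 192.0.2.10 is a documentation-only address standing in for the IP address your provider gives you. The resolver is passed in as a parameter so the example below can run without touching the network; by default it uses the standard library's socket.gethostbyname.

```python
import socket


def dns_points_to(hostname, expected_ip, resolver=socket.gethostbyname):
    """Return True if hostname currently resolves to expected_ip."""
    try:
        return resolver(hostname) == expected_ip
    except OSError:
        # Name lookup failures (socket.gaierror) are a kind of OSError:
        # the name doesn't resolve yet from where we are asking.
        return False


# Stand-in resolver so this example runs offline; drop the third argument
# to perform a real DNS lookup instead.
def fake_resolver(hostname):
    return "192.0.2.10"


print(dns_points_to("www.my-site.com", "192.0.2.10", fake_resolver))
print(dns_points_to("www.my-site.com", "192.0.2.99", fake_resolver))
```

Keep in mind that a lookup only reflects what your own DNS server currently believes; visitors using other DNS servers may see the change earlier or later.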
The virtual hosting provider will also offer free backups of your site, technical support, a number of e-mail addresses and an easy-to-use Web based GUI to manage your settings. For an additional charge, many will also provide an e-commerce feature which allows you to have a shopping cart and customer loyalty programs.
The disadvantage of virtual hosting is that, though it is cheap, you often have no control over the operation of the server and have to rely on the staff and operational procedures of the hosting company to get your changes implemented. These may not necessarily be to your liking.

Dedicated Hosting

In this scenario, you typically have to make contact with a live sales representative who represents an Internet data center. At a minimum, you have to pay for the amount of space your server occupies in the data center, the amount of power it consumes and the amount of Internet bandwidth you expect to use. Additional services such as backups, monitoring, on-call engineering time, firewall management and bandwidth graphing information can often be purchased as extra line items on your bill.
As you can imagine, these services can be fairly expensive. A 3 cm slot in a computer rack for a Web server can easily cost $200 per month for 1 Mbps of bandwidth. The advantage over virtual hosting is that you can customize the server to your needs.
Despite the relative merits of external hosting, you may also want to consider doing it yourself.

Factors To Consider Before Hosting Yourself

Hosting your Web site externally, especially virtual hosting, is the ideal solution for many small Web sites but there are a number of reasons why you may want to move your Web site to your home or small office. Some factors to consider are listed in Table 1-1.

Table 1-1: The Pros and Cons of Web Hosting In-House

Savings
  • Monthly outsourced Web hosting fee
  • Elimination of the cost of delays to implement desired services.

Costs
  • New hardware and software
  • Possible new application development.
  • Training
  • The percentage of IT staff's time installing and maintaining the site
  • Potential cost of the risks (% likelihood of failure per month X cost of failure)

Risks
  • Likelihood of a failure and its expected duration
  • The cost of both the failure and post-failure recovery (hardware, software, data restoration, time)
  • Irregular procedures that could increase the vulnerability of your site to failure.
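The "potential cost of the risks" entry in Table 1-1 is a simple multiplication, and it can help to put numbers on it before deciding. The Python sketch below does so; the function names and every figure in it (a 5% monthly chance of failure, a $400 recovery bill, $30 of monthly running costs) are invented purely for illustration and should be replaced with your own estimates.

```python
def expected_monthly_risk_cost(failure_likelihood_per_month, cost_of_failure):
    """Likelihood of failure per month (0.0 to 1.0) times the cost of a failure."""
    return failure_likelihood_per_month * cost_of_failure


def net_monthly_benefit(monthly_savings, monthly_costs, monthly_risk_cost):
    """Savings minus ongoing costs minus the expected cost of the risks."""
    return monthly_savings - monthly_costs - monthly_risk_cost


# Hypothetical estimates, not figures from the text:
risk = expected_monthly_risk_cost(0.05, 400)
print(round(risk, 2))
print(round(net_monthly_benefit(monthly_savings=200, monthly_costs=30,
                                monthly_risk_cost=risk), 2))
```

A negative result from net_monthly_benefit would suggest that, on these estimates, staying with the external provider is the cheaper choice.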

Is In-House Hosting Preferred?

There are a number of advantages and disadvantages to hosting Web sites that are physically under your own control.
Pros
  • Increased Control and Flexibility: You will be able to manage all aspects of your Web site if it is hosted on a server based either in-house or within your control at a remote data center. You won't have to wait before changes are made and you can select the IT solution that best meets your needs, not those of the hosting provider. You can install the software you need, not what the ISP dictates.
There is also the possibility of offsetting the cost of your server by subleasing space on it to other companies in your community, so that you can become a small virtual hosting service yourself.
  • Cost: It is possible to host a Web site on most DSL connections. A Web site can be hosted on this data circuit for only the additional hardware cost of a network switch and a Web server. You should be able to buy this equipment second hand for about $100. If your home already has DSL there would be no additional network connectivity costs. So with a savings of $10 per month, the project should pay for itself in less than a year.
The cost of using an external Web hosting provider will increase as you purchase more systems administration services. You will eventually be able to justify hosting your Web site in-house based on this financial fact.
  • New Skills: An additional benefit is learning the new skills required to set up the site. Changes can be made with little delay.
  • Availability: Reliable virtual hosting facilities may not be available in your country and/or you may not have access to the foreign currency to host your site abroad.
  • Language: ISPs often provide technical support in only a few languages. If you can't get adequate support for billing, engineering, and customer care services, then an in-house solution may be better.
Cons
  • Lost Services: You lose the convenience of many of the services such as backups, security audits, load balancing, DNS, redundant hardware, database services and technical support offered by the virtual hosting company.
  • Security: One important factor to consider is the security of your new server. Hosting providers may provide software patches to fix security vulnerabilities on your Web servers and may even provide a firewall to protect it. These services may be more difficult to implement in-house. Sharing your external web and internal home or corporate systems on the same server or network increases the risk of hackers or automated malware accessing or corrupting your data. Consider a comprehensive security audit of any options you choose. Always weigh the degree of security maintained by your hosting provider against the security you expect to provide in-house. Proceed with the server migration only if you feel your staff can handle the job.
  • Scalability: Your home is not a purpose-built data center and it will be difficult to expand your business if visitor volumes become high. Adequate additional Internet bandwidth, space, power and cooling could become difficult or costly to provision.
  • Technical Ability: Your service provider may have more expertise in setting up your site than you do. You may also have to incur additional training costs to ensure that your IT staff has the necessary knowledge to do the job internally.
  • Availability: In many cases the reliability of a data center's Internet connectivity is higher than that of your broadband connection.
  • Cost: Though you may be able to save money on a data circuit, there are other costs to consider. You may not have access to cheap real estate in which to host your servers. Commercial office space is often more expensive than basic data center space. You may have to purchase additional equipment and services to support your servers, such as UPSs, backup systems, software patch management, maintenance contracts, monitoring systems and additional power feeds, all of which may be already bundled in with the services of an external data center.
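The cost argument in the list of pros can be checked with a quick payback calculation. The Python sketch below uses the figures quoted there, about $100 of second-hand equipment against a $10 monthly hosting fee; the function name payback_months is made up, and the calculation deliberately ignores the extra costs listed in the cons, such as power, UPSs and your own time.

```python
import math


def payback_months(upfront_cost, monthly_savings):
    """Whole months needed for the monthly savings to cover the upfront cost."""
    if monthly_savings <= 0:
        raise ValueError("monthly_savings must be positive")
    return math.ceil(upfront_cost / monthly_savings)


# $100 of hardware paid off by saving a $10 per month hosting fee
print(payback_months(100, 10))
```

Adding any of the extra costs from the cons to upfront_cost, or subtracting them from monthly_savings, shows how quickly the break-even point can move out.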

How to Migrate from an External Provider

Chapter 18, "Configuring DNS", has a detailed explanation of the steps involved in migrating your Web site from an external hosting provider to your home or small office. You should also read Chapter 20, "The Apache Web Server", and Chapter 21, "Configuring Linux Mail Servers", to help provide a more rounded understanding of the steps involved.

In-House Server Considerations

For small Web sites without a great deal of database activity and where "hot standby" hardware isn't a great need, a basic desktop system will work fine. The linuxhomenetworking.com site, which was the inspiration for this book, receives over half a million page views per month and runs on a 1 GHz Intel Celeron with 1GB of RAM. A secondhand PC is adequate in this case.
Purpose-built Web servers tend to use multiple CPUs, dual redundant power supplies, high-speed redundant SCSI disks that can be replaced while the system is running without affecting performance, special error-correcting ECC RAM, multiple PCI buses, special built-in diagnostic tools and slimline cases only a few inches high. They cost significantly more, but you pay for the peace of mind when your only source of income is your Web site.
Try to have a dedicated area for your server that's clean, cool, and dry, and uses UPS-protected power. Label all your cables at both ends and try to create an updated network diagram that you can show anyone who will provide you assistance.
Another good idea is to color-code your cables. Some companies use one color for networks using private IP addresses and another for Internet-facing networks; others use one color for straight-through and another for crossover cables.
Wireless technology for a home-based Web site can be extremely convenient. You can place your small wireless router near your DSL/Cable modem and the server anywhere in the house. In my little lab, I have one server behind a bookcase, another behind the TV, one under a bed, and a couple around my desk. When you live in an apartment, there may be no other choice, but the risk is that a book falling behind a bookcase or a bounce from a vacuum cleaner could take your site off the air.
Selecting an Internet connection for your Web site may not be as easy as it first seems. There are many data circuit technologies such as cable modem, DSL, and wireless links, but they may not be available in your area or the installation times may not be acceptable. High-speed links are usually marketed to businesses and their cost per megabyte of data transfer is usually higher as the service may be combined with data center space, be more reliable, offer more bandwidth and provide better customer support. Some technologies, such as T1 links, can optionally provide a dedicated circuit between two locations external to the Internet, but the service also has a per-kilometer monthly distance charge.
DSL and wireless services are sometimes asymmetrical, in that the downstream data rate from the Internet is different from the upstream rate in the reverse direction. You should be most concerned about the upstream speed from your Web site to the Internet. Inbound Web browser queries don't use a lot of data bandwidth, but the Web pages that contain the outbound replies do. Internet service providers (ISPs) provide asymmetric services for residential users and the downstream rate is almost always higher than the upstream. They reserve symmetrical data circuits for businesses, which usually need high bandwidth to both surf the Web and serve Web pages. The ISP will usually provide the business with a fixed range of Internet addresses as part of the service; residential customers usually get a dynamic address allocation. This can have an impact on your Web site preparation and will be discussed in more detail in later chapters.
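A back-of-the-envelope calculation shows why the upstream rate matters. The Python sketch below estimates how many pages per second an upstream link can deliver and compares that with an average visitor load. The function names, the 512 kbps upstream rate and the 50 kilobyte page size are assumptions chosen for illustration; only the half a million page views per month comes from the text above. Protocol overhead and traffic peaks are ignored, so treat the result as an optimistic upper bound.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # using a 30-day month


def pages_per_second(upstream_kbps, page_size_kbytes):
    """Upper bound on pages served per second over the upstream link."""
    upstream_bits_per_second = upstream_kbps * 1000
    page_bits = page_size_kbytes * 1000 * 8   # 8 bits per byte
    return upstream_bits_per_second / page_bits


def average_views_per_second(views_per_month):
    """Average request rate if views were spread evenly over the month."""
    return views_per_month / SECONDS_PER_MONTH


capacity = pages_per_second(upstream_kbps=512, page_size_kbytes=50)  # assumed values
demand = average_views_per_second(500_000)  # "half a million page views per month"
print(round(capacity, 2), round(demand, 2), capacity > demand)
```

Real traffic arrives in bursts, so a link that merely matches the average demand would still feel slow at busy times.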
Another source of concern would be deciding on the operating system to use. A popular one is Windows, which may be the only product your Web or business application will work with and the one with which your staff is most familiar. These issues are becoming less important as software vendors are increasingly porting their applications to Linux, an increasingly strong rival to Windows which also has a lower overall total cost of ownership, especially for smaller companies.
This book focuses on Fedora Linux with some references to RedHat Linux, its popular corporate cousin. What's the difference? Until Version 9, RedHat Linux was a free product. The company then decided to create enterprise and desktop versions that had paid service contracts bundled with them and these maintained the RedHat brand. At the same time RedHat decided to create Fedora Linux as a support-free product with an aggressive development cycle, which is generally unsuitable for businesses that often require more stability and support. New versions of Fedora are released every 6 months. Though the original applications may be developed by volunteers, the Fedora versions are maintained by RedHat. Once Fedora updates are proven stable they are incorporated into the RedHat Linux releases which are updated every 12-18 months. Constant communication between RedHat and the developers helps to keep the updates synchronized.
I chose Fedora because it's free. You don't have to get a purchase order to play with Fedora. When you become comfortable with it and have proven the concept to yourself, your peers and management, you can then consider the more stable RedHat equivalent.
I also chose Fedora Linux because it's popular and it's the Linux flavor I've worked with most frequently at home and at work. This may not be the one suitable for you, and other Linux distributions should also be considered.