From: Herbert Poetzl (herbert_at_13thfloor.at)
Date: Wed 31 Mar 2004 - 17:09:23 BST
Hi Community!
this is the last part of the (IMHO necessary) explanations
to actually start discussing the options we have for future
linux-vserver development (and their implications, advantages
and disadvantages)
this time we take a short look at the netfilter (iptables)
stuff and (briefly) at how packets are handled in the kernel
-------------
here is a simple overview of how packets from the network traverse
the tables (strictly speaking, netfilter calls these 'chains'). on
the left side is the network, and on the right the local process
receiving and sending packets
     +------------+    .-------.      +--------+     +-----+
 --->| PREROUTING |--->| route |----->| INPUT  |---->|     |
     +------------+    '-------'      +--------+     |   P |
                           |                         | L R |
                           V                         | O O |
                      +---------+                    | C C |
                      | FORWARD |                    | A E |
                      +---------+                    | L S |
     +-------------+       |          +--------+     |   S |
 <---| POSTROUTING |<------+----------| OUTPUT |<----|     |
     +-------------+                  +--------+     +-----+
important things to mention here are:
- packets destined for a local process (routing decision)
do not pass the FORWARD, OUTPUT, or POSTROUTING table
- packets originating from a local process do not pass
the FORWARD, INPUT, or PREROUTING table
- packets 'routed' through the host will not pass the
INPUT or OUTPUT table
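the three rules above can also be written down as a tiny model. this is only an illustrative python sketch: the names `CHAINS`, `chains_for` and `skipped` are made up here and are not a kernel or iptables interface, and the three packet kinds simply mirror the three cases listed above.

```python
# toy model of the diagram above; all names are illustrative, not a kernel API
ALL_CHAINS = ["PREROUTING", "INPUT", "FORWARD", "OUTPUT", "POSTROUTING"]

CHAINS = {
    # network -> local process
    "terminating": ["PREROUTING", "INPUT"],
    # local process -> network
    "originating": ["OUTPUT", "POSTROUTING"],
    # network -> network, 'routed' through the host
    "routed": ["PREROUTING", "FORWARD", "POSTROUTING"],
}


def chains_for(kind):
    """Return the chains a packet of the given kind passes, in order."""
    return CHAINS[kind]


def skipped(kind):
    """Return the chains a packet of the given kind never sees."""
    return [chain for chain in ALL_CHAINS if chain not in CHAINS[kind]]


for kind in CHAINS:
    print(kind, "passes", chains_for(kind), "and skips", skipped(kind))
```

the `skipped` lists are exactly the chains named in the three bullet points, which makes the sketch a quick way to double check a rule set against the diagram.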
here is a simple example illustrating the above (again with QEMU)
on the host:
  ifconfig tun0 10.0.0.1 netmask 255.255.255.0
  route add -net 192.168.0.0/24 gw 10.0.0.2
on the (QEMU) client:
  ifconfig eth0 10.0.0.2 netmask 255.255.255.0
  ifconfig dummy0 192.168.0.1
  iptables -A INPUT -j LOG --log-prefix INPUT:
  iptables -A FORWARD -j LOG --log-prefix FORWARD:
  iptables -A OUTPUT -j LOG --log-prefix OUTPUT:
after this setup, we can simulate terminating, originating,
and routed packets, with two simple pings ...
H# ping -c 1 10.0.0.2
INPUT: IN=eth0 OUT= MAC=.. SRC=10.0.0.1 DST=10.0.0.2
LEN=84 PROTO=ICMP ID=5665
OUTPUT: IN= OUT=eth0 SRC=10.0.0.2 DST=10.0.0.1
LEN=84 PROTO=ICMP ID=5665
we see that the INPUT table is consulted when the echo
request arrives, and the OUTPUT table when the echo reply
is sent, but there is no forwarding involved
C# echo 1 >/proc/sys/net/ipv4/ip_forward
H# ping -c 1 192.168.0.2
FORWARD:IN=eth0 OUT=dummy0 SRC=10.0.0.1 DST=192.168.0.2
LEN=84 PROTO=ICMP ID=5921
this on the other hand is a ping request forwarded by the
kernel from one interface (eth0) to the other (dummy0)
(it doesn't matter that this request is never answered)
and now for the last piece of information, how packets
and sockets are related (very simplified version)
applications like apache, sendmail, ssh and others communicate
via sockets. the server has to bind its socket(s) to a specific
ip and port, and the client uses another socket to send to this
ip/port. depending on the type of protocol, a connection is
established or 'just' a message is sent.
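to make the bind/send relation concrete, here is a minimal python sketch of the 'just a message' case, using UDP on the loopback interface. the address 127.0.0.1, the port (0 lets the kernel pick a free one) and the payload are assumptions chosen so the example runs on any machine without root; they are not taken from the setup above.

```python
import socket

# server side: bind a datagram (UDP) socket to a specific ip and port.
# port 0 asks the kernel to pick a free port (assumption for this demo)
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
server.settimeout(2.0)  # do not block forever if nothing arrives
ip, port = server.getsockname()

# client side: another socket, sending 'just' a message to that ip/port
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"hello", (ip, port))

# the kernel queued the datagram on the bound socket; fetch it
message, sender = server.recvfrom(1024)
print("server bound to", (ip, port), "got", message, "from", sender)

client.close()
server.close()
```

with `SOCK_STREAM` (TCP) instead of `SOCK_DGRAM`, the server would additionally `listen()` and `accept()`, and the client would `connect()` first, which is the 'connection is established' case.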
basically, when a packet is received by the host's network card,
the nic driver allocates a buffer (skb) and puts the data into
this buffer, then the packet is passed on to the network stack.
after some routing decisions, firewalling etc, when the stack
decides that the packet is destined for the host, the kernel
starts checking each bound socket (qualifying for the ip) and
sends a copy of the buffer to that socket, which, according to
the protocol, unwraps the packet and delivers the message to
the application waiting on that socket. a similar but simpler
process works in the other direction.
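the socket check described above can be sketched as a toy model in python. this follows the very simplified description given here, not the real kernel lookup (which uses hash tables and protocol specific rules); `BoundSocket`, `deliver` and the dict standing in for the skb are hypothetical names, and the port comparison and the 0.0.0.0 wildcard are added only for illustration.

```python
from dataclasses import dataclass, field

ANY = "0.0.0.0"  # wildcard address: the socket qualifies for every local ip


@dataclass
class BoundSocket:
    ip: str
    port: int
    # messages waiting for the application that owns this socket
    queue: list = field(default_factory=list)

    def qualifies(self, dst_ip, dst_port):
        """True if this socket is bound to the packet's destination."""
        return self.port == dst_port and self.ip in (ANY, dst_ip)


def deliver(bound_sockets, skb):
    """Check each bound socket and give a copy of the payload to every match."""
    hits = 0
    for sock in bound_sockets:
        if sock.qualifies(skb["dst_ip"], skb["dst_port"]):
            # bytes() of a bytearray makes a copy, the buffer stays untouched
            sock.queue.append(bytes(skb["payload"]))
            hits += 1
    return hits


sockets = [
    BoundSocket("10.0.0.2", 80),
    BoundSocket(ANY, 22),
    BoundSocket("192.168.0.1", 80),
]
skb = {
    "dst_ip": "10.0.0.2",
    "dst_port": 80,
    "payload": bytearray(b"GET / HTTP/1.0\r\n\r\n"),
}
print(deliver(sockets, skb), "socket(s) received a copy")
```

note that every bound socket is looked at for every packet, and that a socket bound to the wildcard address matches packets for any of the host's addresses.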
so interesting things to keep in mind are:
- sockets are checked for each packet destined for the host
- sockets are created and bound by userspace applications
- there is a buffer (skb) travelling the kernel's ip stack
best,
Herbert
additional documentation:
http://iptables-tutorial.frozentux.net/iptables-tutorial.html
http://gnumonks.org/ftp/pub/doc/packet-journey-2.4.html
http://gnumonks.org/ftp/pub/doc/skb-doc.html
http://www.linuxsecurity.com/resource_files/firewalls/netfilter-hacking-HOWTO/