networking
This guide goes top-down, so you can stop at any point and still come away with some understanding.
Big Picture
The internet is a global computer network¹ made of many local networks.
The websites you visit are stored on servers² elsewhere on the planet. Each time you visit one, you request its contents and the server delivers them. The World Wide Web (WWW), where websites are hosted, is only one of the services built on top of the internet.
IP Protocol
IP addresses identify network interfaces (devices), much like home addresses identify houses. Before visiting a domain, your device must find out the IP address of its server, and the Domain Name System (DNS) provides it. Your device asks a recursive DNS resolver, which queries other DNS servers in turn (root, then top-level domain) until it reaches the authoritative DNS server, which knows the IP address of the website. The result is usually cached, both on your device and in your router.
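To make the lookup chain more concrete, here is a minimal sketch of it as plain dictionaries. Nothing here touches the network: the server names, the `resolve` function and the address are all invented for illustration (192.0.2.0/24 is a range reserved for documentation), and real DNS has many more details.

```python
# Toy model of DNS resolution. All names, zones and addresses are hypothetical.
ROOT = {"com": "tld-com"}                          # root servers know the TLD servers
TLD = {"tld-com": {"example.com": "ns-example"}}   # TLD servers know the authoritative servers
AUTHORITATIVE = {"ns-example": {"www.example.com": "192.0.2.10"}}

cache = {}  # what your device or router would remember


def resolve(name):
    """Return (ip, steps) for name, following root -> TLD -> authoritative."""
    if name in cache:
        return cache[name], ["cache"]
    labels = name.split(".")
    tld_server = ROOT[labels[-1]]                          # 1. ask a root server
    auth_server = TLD[tld_server][".".join(labels[-2:])]   # 2. ask the TLD server
    ip = AUTHORITATIVE[auth_server][name]                  # 3. ask the authoritative server
    cache[name] = ip
    return ip, ["root", "tld", "authoritative"]


first = resolve("www.example.com")
second = resolve("www.example.com")
print(first)   # ('192.0.2.10', ['root', 'tld', 'authoritative'])
print(second)  # ('192.0.2.10', ['cache'])
```

The second call never leaves the cache, which is why repeat visits to a site skip the whole chain.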
IPv4 addresses are 32 bits long, so there can be at most about 4.29 billion of them (2^32). IPv6 addresses are 128 bits long, which allows about 3.4×10³⁸ (2^128). IPv6 is newer and was designed to tackle the global shortage of IPv4 addresses. This guide assumes IPv4, but the two work very similarly.
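The two figures follow directly from the address lengths (32 and 128 bits). A quick check with Python's standard `ipaddress` module, as a sketch:

```python
import ipaddress

ipv4_total = 2 ** 32
ipv6_total = 2 ** 128

print(f"IPv4: {ipv4_total:,}")     # IPv4: 4,294,967,296
print(f"IPv6: {ipv6_total:.2e}")   # IPv6: 3.40e+38

# The ipaddress module reports the same sizes for the whole address space.
assert ipaddress.ip_network("0.0.0.0/0").num_addresses == ipv4_total
assert ipaddress.ip_network("::/0").num_addresses == ipv6_total
```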
Map of the Internet
Your home network also uses IP addresses, and they are managed by a router. It assigns a local IP address to each device using DHCP,³ and it also performs Network Address Translation (NAT), relaying all of their traffic through a single public IP address assigned by your Internet Service Provider (ISP), the company you pay your internet bills to. With fiber-optic connections, every request is then handed to your Optical Network Terminal (ONT), which converts electrical signals into light.⁴ The ISP forwards your request to the internet backbone, the high-capacity links (such as the fiber-optic cables under the ocean) run by large network operators. From there the process is roughly symmetrical until the request reaches the server, where it gets processed. The response then returns to you, not necessarily along the same path: since the internet is a web of independent networks, each hop decides where it’s best to route the traffic.
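The NAT step can be sketched as a small lookup table. This assumes the common port-based flavour of NAT, where many devices share one public IP and are told apart by port. The `Nat` class, the addresses and the starting port are all made up for illustration.

```python
# Toy NAT table. 192.168.1.x is a private range; 203.0.113.5 is a documentation address.
PUBLIC_IP = "203.0.113.5"


class Nat:
    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet to the shared public IP."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_in(self, public_port):
        """Find which local device a reply on this public port belongs to."""
        return self.inbound.get(public_port)  # None means no mapping exists


nat = Nat(PUBLIC_IP)
laptop = nat.translate_out("192.168.1.10", 51000)
phone = nat.translate_out("192.168.1.11", 51000)
print(laptop, phone)                  # ('203.0.113.5', 40000) ('203.0.113.5', 40001)
print(nat.translate_in(laptop[1]))    # ('192.168.1.10', 51000)
```

Both devices appear to the outside world as the same IP address; only the router knows which reply goes to which device.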
Local
Your device and your router each have a Network Interface Card (NIC) that gives them network capabilities. Every NIC has a unique hardware identifier called a MAC address.⁵ Inside your Local Area Network (LAN), i.e. your home network, MAC addresses identify the physical devices, and the Address Resolution Protocol (ARP) finds the MAC address that belongs to a given local IP address. Once the MAC address is known, IP packets can be delivered to that device. The data can travel over radio waves (Wi-Fi) or cables (Ethernet), the latter usually being faster and more stable.
LANs may be split into multiple sub-networks (subnets) to segment traffic for convenience or security, since traffic between subnets can easily be restricted. Of the 32 bits of an IPv4 address, a chosen number of leading bits identifies the subnet and the remaining bits identify the devices (hosts) within it. Although subnets are separate networks, they can be managed by the same router.
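As a rough sketch of what ARP does, the snippet below models it as a cache in front of a broadcast question ("who has this IP?"). The device table, the `arp_lookup` function and the MAC addresses are invented; real ARP works with frames on the wire, not dictionaries.

```python
# Toy ARP cache. The IP and MAC addresses below are invented for illustration.
LAN_DEVICES = {
    "192.168.1.1": "aa:bb:cc:00:00:01",   # router
    "192.168.1.10": "aa:bb:cc:00:00:10",  # laptop
}

arp_cache = {}
stats = {"broadcasts": 0}


def arp_lookup(ip):
    """Return the MAC for a local IP, 'broadcasting' only on a cache miss."""
    if ip not in arp_cache:
        stats["broadcasts"] += 1       # "who has <ip>?" goes to every device
        mac = LAN_DEVICES.get(ip)      # only the owner of the IP answers
        if mac is None:
            return None                # nobody answered
        arp_cache[ip] = mac
    return arp_cache[ip]


router_mac = arp_lookup("192.168.1.1")
router_mac_again = arp_lookup("192.168.1.1")   # answered from the cache
print(router_mac, stats["broadcasts"])
```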
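The split between subnet bits and host bits is easy to try out with Python's standard `ipaddress` module. The sketch below assumes a hypothetical home network, 192.168.1.0/24, divided into two halves; the names `trusted` and `guests` are just labels.

```python
import ipaddress

# Hypothetical home setup: one /24 split into two /25 subnets.
lan = ipaddress.ip_network("192.168.1.0/24")
trusted, guests = lan.subnets(prefixlen_diff=1)

print(trusted, guests)              # 192.168.1.0/25 192.168.1.128/25
print(trusted.netmask)              # 255.255.255.128
print(trusted.num_addresses - 2)    # 126 usable hosts (network and broadcast excluded)

laptop = ipaddress.ip_address("192.168.1.20")
camera = ipaddress.ip_address("192.168.1.200")
print(laptop in trusted, camera in trusted)   # True False
```

Here 25 bits identify the subnet and the remaining 7 bits identify the host, which is why each half holds 128 addresses.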
Packets
All data is sent in packets. Like letters, they contain the message plus metadata (headers) such as the destination or the size. Applications send and receive packets through sockets, a logical abstraction identified by the combination of an IP address and a port. Each IP address has many logical doors (ports); each one delivers or expects a specific type of content and serves to match traffic to the proper application. In its transport header, the packet specifies the port the data is sent from (source) and the port it is meant to arrive at (destination). For instance, when you browse the web, your operating system⁶ picks an unused port to send the request from, while the request goes to port 443 (HTTPS) on the server.
Packets can be structured in many ways (protocols); for transport between IP addresses, TCP or UDP is usually used. TCP is very reliable: it sets up a connection first, and the receiver acknowledges the data it gets so that anything lost can be sent again. UDP does none of this and is therefore faster (and lighter⁷), which makes it suitable for real-time applications such as live video streaming or gaming.
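You can watch the operating system pick a source port without leaving your machine. The sketch below opens a TCP connection over the loopback address (127.0.0.1), so it assumes nothing about your network; the port numbers you see will differ on every run.

```python
import socket

# A listening socket on the loopback address; port 0 asks the OS to pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_ip, server_port = server.getsockname()

# The client does not choose its own port: the OS assigns one on connect.
client = socket.create_connection((server_ip, server_port))
conn, peer = server.accept()

client_ip, client_port = client.getsockname()
print("client socket:", (client_ip, client_port))   # source IP and port
print("server socket:", (server_ip, server_port))   # destination IP and port

client.sendall(b"hello")
received = conn.recv(1024)
print(received)

for s in (client, conn, server):
    s.close()
```

The address the server sees as `peer` is exactly the client's socket: the same IP and the port the operating system chose.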
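The trade-off can be illustrated with a toy simulation. This is not real TCP or UDP: the "reliable" sender below is a heavily simplified stop-and-wait scheme, and the channel is a made-up class that drops every third transmission so the result is the same on every run.

```python
# Simulated lossy channel: every third transmission is lost.
class LossyChannel:
    def __init__(self):
        self.attempts = 0

    def deliver(self, packet):
        """Return the packet if it 'arrives', or None if it is lost."""
        self.attempts += 1
        return None if self.attempts % 3 == 0 else packet


def send_unreliable(channel, packets):
    # UDP-like: no acknowledgements, lost packets are simply gone.
    return [p for p in packets if channel.deliver(p) is not None]


def send_reliable(channel, packets):
    # TCP-like: keep retransmitting each packet until it gets through.
    received = []
    for p in packets:
        while channel.deliver(p) is None:
            pass
        received.append(p)
    return received


data = ["a", "b", "c", "d", "e", "f"]
udp_like = LossyChannel()
tcp_like = LossyChannel()
print(send_unreliable(udp_like, data), udp_like.attempts)  # ['a', 'b', 'd', 'e'] 6
print(send_reliable(tcp_like, data), tcp_like.attempts)    # ['a', 'b', 'c', 'd', 'e', 'f'] 8
```

The reliable sender delivers everything but needs more transmissions, which is the extra cost that real-time applications prefer to avoid.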
HTTP
HyperText Transfer Protocol (HTTP) is the protocol used for requesting and serving web pages. The client⁸ uses methods like GET and POST to request or send information. The server answers with a status code, such as 200 if the request succeeded or 404 if the resource was not found.
HTTPS is a secure and now widely used version of HTTP that runs over TLS. This enables encrypted communication between the client and the server.
TLS (and its predecessor SSL) uses certificates to authenticate the site and to help set up the encryption. The server sends its certificate to the client, which includes a signature from a trusted Certificate Authority (CA). The client verifies the signature with the CA’s public key, which comes pre-installed in the client. If the signature is valid and the certificate names the requested domain, the client knows that the server it reached really belongs to that domain.
The session key is usually agreed on with the Diffie-Hellman (DH) key exchange, in which both sides share public values unencrypted and each independently computes the same shared key, called the session key. Encrypting with a shared key (symmetric encryption) is more efficient and faster for data transfer than the asymmetric cryptography used during the TLS handshake.
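In HTTP/1.1, requests and responses are plain text, which makes them easy to look at. The sketch below builds a request and parses a response by hand; the helper names, host and path are placeholders, and the response is a hand-written example rather than something a real server sent.

```python
def build_request(method, host, path):
    """Assemble the text of a minimal HTTP/1.1 request."""
    return (
        f"{method} {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )


def parse_response(raw):
    """Split a raw HTTP response into (status_code, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return int(code), headers, body


request = build_request("GET", "example.com", "/index.html")
response = "HTTP/1.1 404 Not Found\r\nContent-Type: text/plain\r\n\r\nno such page"
status, headers, body = parse_response(response)
print(request)
print(status, headers, body)   # 404 {'Content-Type': 'text/plain'} no such page
```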
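The key exchange is easier to believe once you see the arithmetic. Below is a toy version of classic Diffie-Hellman with a tiny prime so the numbers stay readable. The values of `P` and `G` are textbook examples chosen for illustration; real TLS uses far larger groups or elliptic curves.

```python
import secrets

P = 23   # public prime modulus (toy size)
G = 5    # public generator

client_secret = secrets.randbelow(P - 2) + 1   # private, never sent
server_secret = secrets.randbelow(P - 2) + 1   # private, never sent

client_public = pow(G, client_secret, P)       # sent in the clear
server_public = pow(G, server_secret, P)       # sent in the clear

# Each side combines its own secret with the other side's public value.
client_key = pow(server_public, client_secret, P)
server_key = pow(client_public, server_secret, P)

assert client_key == server_key
print("shared session key:", client_key)
```

An eavesdropper sees `P`, `G` and both public values, but not the secrets, and with realistically large numbers that is not enough to recover the key.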
Other Components
Firewalls, used mainly for security purposes, restrict network traffic by matching each packet and its headers against predefined rules.
Proxy servers are middlemen that sit between you and the end server. Forward proxies work for the client, managing its outgoing requests to servers, whereas reverse proxies work for the server, handling incoming client requests. Forward proxies may be used to hide your IP address, and reverse proxies are usually used to balance the load across several servers.
VPNs (Virtual Private Networks) act like a sort of forward proxy, with the difference that the connection between you and the VPN server is encrypted.
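A rule-matching firewall can be sketched in a few lines. The example assumes a "first matching rule wins, otherwise deny" policy, which is a common design but not the only one. The `Rule` class, the rule set and the addresses are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    action: str                # "allow" or "deny"
    source: str                # CIDR block the packet must come from
    dest_port: Optional[int]   # None matches any port


RULES = [
    Rule("deny", "203.0.113.0/24", None),   # block a whole (documentation) range
    Rule("allow", "0.0.0.0/0", 443),        # HTTPS from anywhere
    Rule("allow", "192.168.1.0/24", 22),    # SSH only from the LAN
]


def decide(src_ip, dest_port, rules=RULES, default="deny"):
    """First matching rule wins; unmatched packets get the default action."""
    src = ipaddress.ip_address(src_ip)
    for rule in rules:
        in_range = src in ipaddress.ip_network(rule.source)
        port_ok = rule.dest_port is None or rule.dest_port == dest_port
        if in_range and port_ok:
            return rule.action
    return default


print(decide("198.51.100.7", 443))   # allow
print(decide("203.0.113.9", 443))    # deny
print(decide("198.51.100.7", 22))    # deny
```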
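As an illustration of how a reverse proxy spreads load, here is a minimal round-robin sketch. Round-robin is only one possible strategy, and the `ReverseProxy` class and backend addresses are made up; nothing is actually forwarded over the network.

```python
from itertools import cycle

# Hypothetical backend servers behind a reverse proxy.
BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


class ReverseProxy:
    def __init__(self, backends):
        self._next_backend = cycle(backends)

    def route(self, request_path):
        """Pick the next backend in turn; the client only ever sees the proxy."""
        return next(self._next_backend), request_path


proxy = ReverseProxy(BACKENDS)
assignments = [proxy.route("/")[0] for _ in range(5)]
print(assignments)
# ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1', '10.0.0.2']
```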
- Wireshark
- traceroute
- nmcli
- dig
- ifconfig
- arp