Client-Server Communication
Here we are going to learn about a process commonly asked of frontend developers: what actually happens under the hood when we type a URL into the browser's address bar and hit Enter?
To keep it simple, let's divide the whole process into two steps:
1. Request and Response Process with insights
2. Browser Rendering
Let’s understand our first process (Request and Response)
The client-server model describes how a server provides resources and services to one or more clients. Examples of servers include web servers, mail servers, and file servers. A single server can provide resources to multiple clients at one time.
Client : In simple terms, a client is the computer/device at which the end user is sitting. The machine we use for web browsing is the client machine, and the browser is the client (it sends the Request).
Server : A server can be considered a machine, kept somewhere remote, from which the client loads files (the Response) that are then displayed to you, allowing for user interaction.
What is a DNS Server?
- The DNS recursor (also referred to as the DNS resolver) is a server that receives the query from the DNS client, and then interacts with other DNS servers to hunt down the correct IP. Once the resolver receives the request from the client, the resolver then actually behaves as a client itself, querying the other three types of DNS servers in search of the right IP.
- First the resolver queries the Root Nameserver. The root server is the first step in translating (resolving) human-readable domain names into IP addresses. The root server then responds to the resolver with the address of a Top Level Domain (TLD) DNS server (such as .com or .net) that stores the information for its domains.
- Next the resolver queries the TLD server. The TLD server responds with the IP address of the domain’s authoritative nameserver (for google.com in our case, or a subdomain’s). The recursor then queries the authoritative nameserver, which responds with the IP address of the origin server.
- The resolver will finally pass the origin server IP address back to the client. Using this IP address, the client can then initiate a query directly to the origin server, and the origin server will respond by sending website data that can be interpreted and displayed by the web browser.
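The resolution chain above can be sketched as a toy simulation. All the server names and the returned IP below are made up for illustration; a real resolver speaks the DNS wire protocol and caches aggressively at every step:

```python
# Toy model of recursive DNS resolution: resolver -> root -> TLD -> authoritative.
# All names and the IP (a documentation-range address) are hypothetical.
ROOT = {"com": "tld-com-server"}                            # root knows the TLD servers
TLD = {"tld-com-server": {"google.com": "auth-google"}}     # TLD knows authoritative servers
AUTH = {"auth-google": {"www.google.com": "203.0.113.7"}}   # authoritative knows the final IP

def resolve(hostname: str) -> str:
    tld = hostname.rsplit(".", 1)[-1]             # step 1: ask root for the TLD server
    tld_server = ROOT[tld]
    domain = ".".join(hostname.split(".")[-2:])   # step 2: ask TLD for the authoritative server
    auth_server = TLD[tld_server][domain]
    return AUTH[auth_server][hostname]            # step 3: ask authoritative for the IP

print(resolve("www.google.com"))  # -> 203.0.113.7
```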
Protocols involved in communication (HTTP and TCP/IP)
The numbered lines in the DNS server diagram are a combination of both these protocols.
- TCP/IP stands for Transmission Control Protocol/Internet Protocol.
- TCP and IP are two separate computer network protocols, but each is meaningless without the other.
The IP address is like the phone number assigned to your smartphone. TCP is all the technology that makes the phone ring, and that enables you to talk to someone on another phone.
- To understand the difference, we need the idea of a layered networking model (the OSI model). The layer model is conceptual, but it is nonetheless important for understanding networking principles.
- At the bottom of the network stack is the physical layer. This is where electrical signals or radio waves actually transmit information from place to place. The physical layer doesn’t really have protocols, but instead has standards for voltages, frequencies, and other physical properties.
- The next layer is the link layer. This layer covers communication with devices that share a physical communications medium. Here, protocols like Ethernet and 802.11a/b/g/n specify how to handle multiple concurrent accesses to the physical medium and how to direct traffic to the right device. The router in our home networks is a familiar example.
- The third layer is the network layer (Internet layer). This is dominated by the Internet Protocol (IP). Here the magic of the Internet happens: we get to talk to computers around the world without knowing exactly where they are. A router directs the traffic from our local network toward the network where the other computer lives, where its own link layer handles getting the packets to the right machine.
- Now that we can talk to a computer somewhere in the world, note that this computer is running lots of different programs. How does the network know which program to deliver our message to? The transport layer takes care of this, usually with port numbers. The two most popular transport-layer protocols are TCP and UDP. TCP does a lot of work on top of network-layer communication, such as reordering packets and retransmitting lost ones.
- Now we've connected our browser to the web server software on the other end, but how does the server know what page we want? These are the things that application-layer protocols handle. For web traffic, this is the HyperText Transfer Protocol (HTTP). There are thousands of application-layer protocols: SMTP, IMAP, and POP3 for email; XMPP, IRC, ICQ for chat; Telnet, SSH, RDP for remote administration; etc.
- Some protocols work between layers (seven in total), or operate at multiple layers. TLS/SSL, for instance, provides encryption and session information between the transport and application layers. Above the application layer, Application Programming Interfaces (APIs) handle communication with web applications like Twitter and Facebook.
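To see the transport and application layers working together, here is a minimal sketch: a throwaway TCP server on localhost that speaks just enough HTTP to answer one request. This is not a real web server, only an illustration of a TCP socket (transport layer, identified by a port number) carrying HTTP text (application layer):

```python
import socket
import threading

def serve(sock):
    # Accept one connection, read the HTTP request text, send a fixed reply.
    conn, _ = sock.accept()
    conn.recv(1024)                               # the application-layer request
    body = "hello"
    conn.sendall((
        "HTTP/1.1 200 OK\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n" + body
    ).encode())
    conn.close()

server = socket.socket()                          # TCP (SOCK_STREAM) by default
server.bind(("127.0.0.1", 0))                     # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=serve, args=(server,), daemon=True).start()

client = socket.socket()
client.connect(("127.0.0.1", port))               # TCP three-way handshake happens here
client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
response = client.recv(1024).decode()
client.close()
print(response.splitlines()[0])  # -> HTTP/1.1 200 OK
```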
Content Delivery Network (CDN)
- A content delivery network (CDN) refers to a geographically distributed group of servers that work together to provide fast delivery of Internet content. Nearly every firm uses a CDN today to serve data quickly.
- This doesn’t mean the site is hosted on multiple servers. Rather, when a CDN is in place, the site content is not always delivered from the server where it is hosted.
- The content will be served from the closest server to the user, depending on the user’s geographic location.
- These nearest servers, or CDN servers, are called Edge Locations.
For reference, we can consider the CDN server to sit between lines 9 and 10 of our DNS diagram.
A CDN caches data (content like HTML pages, JavaScript files, stylesheets, images, and videos) from the origin server at CDN servers. When a client makes a request, the data is served from the closest CDN server, provided that data was already requested by someone in that proximity before this user’s request. Otherwise, the request is redirected to the origin server.
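The cache-then-fallback behavior can be sketched with a toy in-memory model (the regions, paths, and content below are all made up):

```python
# Toy edge-cache lookup: serve from the nearest edge if cached, otherwise
# fetch from the origin and cache the result at that edge for the next user.
origin = {"/index.html": "<html>home</html>"}
edge_caches = {"us-east": {}, "eu-west": {}}
origin_hits = 0

def fetch(path, nearest_edge):
    global origin_hits
    cache = edge_caches[nearest_edge]
    if path in cache:
        return cache[path], "edge"    # cache hit: short trip to the edge
    origin_hits += 1                  # cache miss: full trip back to origin
    cache[path] = origin[path]
    return cache[path], "origin"

print(fetch("/index.html", "eu-west"))  # first request in the region: origin
print(fetch("/index.html", "eu-west"))  # repeat from the same region: edge
```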
Optimizations in Client Server Model
- Though one cannot control the network weather between the client and server, nor the client’s hardware or device configuration, there is still a lot that can be done to improve network performance by taking steps across the rest of the networking stack.
- As we already discussed with the OSI model, performance can be improved at every layer. While we cannot make the bits travel any faster, it is crucial to apply all possible optimizations at the transport and application layers: eliminate unnecessary roundtrips and requests, and minimize the distance traveled by each packet, i.e., position the servers closer to the client (CDN).
However, there are some evergreen Performance Best Practices:
- Reduce DNS lookups : Every hostname resolution requires a network roundtrip, imposing latency on the request and blocking the request while the lookup is in progress. Things that can be done here are to reduce the number of requests to different domains (hostnames) and to increase the Time To Live (TTL) of hostnames in the DNS cache.
- Minimize HTTP redirects : HTTP redirects impose high latency overhead; for example, a single redirect to a different origin can trigger DNS, TCP, TLS, and request-response roundtrips that add hundreds to thousands of milliseconds of delay. The optimal number of redirects is zero.
- Eliminate unnecessary resources : No request is faster than a request not made. Be vigilant about auditing and removing unnecessary resources.
- Reduce roundtrip times : Locating servers closer to the user improves protocol performance by reducing roundtrip times (e.g., faster TCP and TLS handshakes), and improves the transfer throughput of static and dynamic content. One example we discussed is a CDN.
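A back-of-the-envelope calculation shows why roundtrip time dominates connection setup. The breakdown below assumes a cold HTTPS connection with TLS 1.2 (TLS 1.3 needs one fewer roundtrip), and the RTT values are illustrative guesses:

```python
# Rough setup cost of a fresh HTTPS connection, in roundtrips (TLS 1.2).
def connection_setup_ms(rtt_ms):
    dns = 1 * rtt_ms       # hostname lookup
    tcp = 1 * rtt_ms       # TCP three-way handshake
    tls = 2 * rtt_ms       # TLS 1.2 handshake
    request = 1 * rtt_ms   # HTTP request/response itself
    return dns + tcp + tls + request

print(connection_setup_ms(100))  # distant origin server -> 500
print(connection_setup_ms(20))   # nearby CDN edge       -> 100
```

The same five roundtrips cost 500 ms against a distant origin but only 100 ms against a nearby edge, which is exactly the saving a CDN buys.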
HTTP also provides some add-on mechanisms, such as caching and compression, along with some version-specific quirks.
- Cache Resources On Client : The fastest network request is a request not made. Maintaining a cache of previously downloaded data allows the client to use a local copy of the resource, hence eliminating the request. For resources delivered over HTTP, the appropriate cache headers are:
* The Cache-Control header can specify the cache lifetime (max-age) of the resource.
* The Last-Modified and ETag headers provide validation mechanisms.
Finally, note that you need to specify both the cache lifetime and the validation method! A common mistake is to provide only one of the two, which results in redundant transfers.
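A minimal sketch of the two mechanisms working together, assuming an MD5 digest as the ETag (any stable fingerprint of the body would do):

```python
import hashlib

# Validation caching sketch: the server fingerprints the resource as an ETag;
# a repeat request carrying If-None-Match gets a bodyless 304 if still fresh.
body = b"<html>hello</html>"
etag = hashlib.md5(body).hexdigest()

def respond(if_none_match=None):
    headers = {
        "Cache-Control": "max-age=3600",  # cache lifetime: one hour
        "ETag": etag,                     # validation token
    }
    if if_none_match == etag:
        return 304, headers, b""          # client's cached copy is still valid
    return 200, headers, body

status1, headers1, _ = respond()                              # first fetch
status2, _, body2 = respond(if_none_match=headers1["ETag"])   # revalidation
print(status1, status2, len(body2))  # -> 200 304 0
```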
- Compress Transferred Data : The size of assets, such as HTML, CSS, and JavaScript, can be reduced by 60%–80% on average when compressed with Gzip. Images, on the other hand, require more nuanced consideration:
* Images often carry a lot of metadata that can be stripped — e.g., EXIF.
* Images should be sized to their display width to minimize transferred bytes.
* Images can be compressed with different lossy and lossless formats.
Images account for over half of the transferred bytes of an average page, which makes them a high-value optimization target: the simple choice of an optimal image format can yield dramatically improved compression ratios; lossy compression methods can reduce transfer sizes by orders of magnitude; and sizing the image to its display width reduces the number of transferred bytes.
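A quick demonstration of the text-asset savings, using a deliberately repetitive HTML snippet (real pages compress less dramatically, but markup is repetitive by nature):

```python
import gzip

# Text assets like HTML/CSS/JS compress very well with gzip because their
# structure repeats; this synthetic snippet exaggerates the effect.
html = ("<div class='item'><span>hello world</span></div>\n" * 200).encode()
compressed = gzip.compress(html)
ratio = 1 - len(compressed) / len(html)
print(f"{len(html)} -> {len(compressed)} bytes ({ratio:.0%} smaller)")
```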
- Eliminate Unnecessary Request Bytes : HTTP is a stateless protocol, which means that the server is not required to retain any information about the client between different requests.
* But many applications require state for session management, analytics, etc. To allow this, the HTTP State Management Mechanism (RFC 2965) extension allows any website to associate and update “cookie” metadata for its origin.
* Sites are allowed to associate many cookies per origin. As a result, it is possible to associate tens to hundreds of kilobytes of arbitrary metadata, split across multiple cookies, for each origin. This can have significant performance implications for your application.
* Associated cookie data is automatically sent by the browser on each request, which, in the worst case, can add entire roundtrips of network latency by exceeding the initial TCP congestion window.
* So, cookie size should be monitored carefully: transfer the minimum amount of required data, such as a secure session token, and leverage a shared session cache on the server to look up other metadata.
* Eliminate cookies entirely wherever possible; chances are, you do not need client-specific metadata when requesting static assets such as images, scripts, and stylesheets.
- Parallel Requests : Without connection keep-alive, a new TCP connection is required for each HTTP request, which incurs significant overhead due to the TCP handshake and slow-start. Optimize server and proxy connection timeouts to avoid closing connections prematurely. For the best performance, use HTTP/2 to allow the client and server to reuse the same connection for all requests. If HTTP/2 is not an option, use multiple TCP connections to achieve request parallelism with HTTP/1.x.
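To get a feel for the cookie overhead described above, here is a rough back-of-the-envelope sketch; the cookie size and per-page request count are assumptions, not measurements:

```python
# The browser attaches the full Cookie header to every request to the origin,
# including requests for static assets. Numbers below are illustrative.
cookie = "session=" + "x" * 4000   # a bloated ~4 KB cookie
requests_per_page = 50             # images, scripts, stylesheets, ...
overhead = len(cookie) * requests_per_page
print(f"{overhead / 1024:.0f} KB of repeated cookie bytes per page load")
```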
Key Benefits of HTTP/2
HTTP/2 enables more efficient use of network resources and reduced latency through request and response multiplexing, header compression, and prioritization. Some key benefits it provides over HTTP/1.x are:
- Elimination of Domain Sharding : HTTP/2 achieves the best performance by multiplexing requests over the same TCP connection, which enables effective request and response prioritization, flow control, and header compression. As a result, the optimal number of connections is exactly one and domain sharding is an anti-pattern.
- Minimize Concatenation : Bundling resources was a critical optimization in HTTP/1.x due to its lack of parallelism and high protocol overhead. With HTTP/2, however, multiplexing removes the parallelism problem, and header compression dramatically reduces the metadata overhead of each HTTP request. Some cons of bundling are:
* Bundled resources may result in unnecessary data transfers: the user might not need all the assets on a particular page, or at all.
* Bundled resources may result in expensive cache invalidations: a single updated byte in one component forces a full fetch of the entire bundle.
* Bundled resources may delay execution: many content-types cannot be processed and applied until the entire response is transferred.
* Bundled resources may require additional infrastructure at build or delivery time to generate the associated bundle.
* Bundled resources may, on the other hand, provide better compression if the resources contain similar content.
- Server Push : Server push is a powerful new feature of HTTP/2 that enables the server to send multiple responses for a single client request. HTTP/2 server push offers many performance benefits over inlining: pushed resources can be cached individually, reused across pages, and canceled by the client.
Server push acts as a latency optimization that removes a full request-response roundtrip between the client and server — e.g., if, after sending a particular response, we know that the client will always come back and request a specific sub-resource, we can eliminate the roundtrip by pushing the sub-resource to the client.
Critical resources (those that build the DOM and CSSOM, plus blocking JS) that block page construction and rendering are prime candidates for server push, as they are often known or can be specified upfront. Eliminating a full roundtrip from the critical path can yield savings of tens to hundreds of milliseconds, especially for users on mobile networks where latencies are often both high and highly variable.
These were some of the optimizations available, but applying them depends on the overall design and capacity of the system. Before applying any mechanism, one should fully understand the application's architecture.
Keep Learning .. ! ✋