What Is HTTP and How It Works + Client Implementation
This semester I am taking an Internet Engineering course, and one assignment was to write a simple HTTP client that sends a request to a server, reads the response, and handles it based on the response type. So I decided to write a Medium post that explains how HTTP works and how to implement such a client.
If you already know HTTP and only want the implementation, scroll down to the Implementation section.
The implementation is written in Java (because I had to!).
About HTTP
What Is HTTP?
HTTP is an application-layer protocol used for communication between a client and a server. It was originally meant to transfer hypertext only, but today it can carry almost anything, such as text documents, multimedia, or applications. It runs over TCP, which provides reliable delivery, and its default port is 80.
HTTP is stateless: the server does not keep any history of the client between requests. In some situations this is a problem, but it can be solved with cookies.
HTTP stands for Hypertext Transfer Protocol.
HTTP Client and HTTP Server
An HTTP server is software that understands URLs and the HTTP protocol and provides resources to clients. An HTTP client (usually a web browser) is a program or API that sends requests to an HTTP server and receives its responses.
Structure of HTTP Request and Response
Let’s learn some terms used to describe the structure:
- Method: The desired action to be performed for a given resource.
- Path: Path of a resource on the host.
- Query: Key-value pairs that come after the path in the URL, used to pass information from client to server.
- Header: Additional key-value pairs that describe the request.
- Resource: Rest of URL that comes after domain until #frag (Path+Query)
- CRLF: Carriage Return Line Feed (\r\n)
You can read more on the HTTP Wikipedia page.
Structure of a URL
<protocol(scheme)>://<user>:<pass>@<host>:<port>/<path>?<query>#<frag>
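To see these parts on a concrete URL, here is a small sketch that uses the standard java.net.URI class. The URL, the class name UrlParts, and the helper resource() are made up for illustration; nothing is sent over the network.

```java
import java.net.URI;

final class UrlParts {

    // Resource = path + query, i.e. what goes into the request line (hypothetical helper).
    static String resource(URI uri) {
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // an empty path is requested as "/"
        }
        String query = uri.getRawQuery();
        return query == null ? path : path + "?" + query;
    }

    public static void main(String[] args) {
        // Invented example URL; it is only parsed, never contacted.
        URI uri = URI.create("http://alice:secret@example.com:8080/docs/index.html?lang=en&page=2#intro");

        System.out.println("scheme:   " + uri.getScheme());
        System.out.println("userinfo: " + uri.getUserInfo());
        System.out.println("host:     " + uri.getHost());
        System.out.println("port:     " + uri.getPort()); // -1 when the URL has no explicit port
        System.out.println("path:     " + uri.getPath());
        System.out.println("query:    " + uri.getQuery());
        System.out.println("fragment: " + uri.getFragment());
        System.out.println("resource: " + resource(uri));
    }
}
```

Note that the fragment stays on the client side; only the path and query end up in the request.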
An HTTP request message consists of three main parts:
- Request Line: [<Request Method> <Resource> <HTTP Version>]
- Request Headers: List of [<Header-Label>: <Header-Value>]
- Body: the data bytes that will be sent to the server
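Remember that the request line and each header line end with a CRLF.
The headers section ends with an empty line, that is, one extra CRLF on its own.
If the request has a body, it comes right after that empty line; its length is given by the Content-Length header, not by a closing CRLF.
An HTTP/1.x request is plain text (although the body may carry binary data).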
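To make the layout concrete, here is a minimal sketch that assembles a request message by hand. The host, resource, and body are invented, and RawRequestDemo and buildPost() are hypothetical names. The sketch relies on Content-Length to tell the server where the body ends.

```java
import java.nio.charset.StandardCharsets;

final class RawRequestDemo {
    static final String CRLF = "\r\n";

    // Builds: request line, header lines, an empty line, then the body.
    static String buildPost(String host, String resource, String body) {
        int length = body.getBytes(StandardCharsets.UTF_8).length; // length in bytes, not chars
        return "POST " + resource + " HTTP/1.1" + CRLF
             + "Host: " + host + CRLF
             + "Content-Type: text/plain; charset=utf-8" + CRLF
             + "Content-Length: " + length + CRLF
             + CRLF          // empty line: end of the headers section
             + body;
    }

    public static void main(String[] args) {
        String message = buildPost("example.com", "/echo?verbose=1", "hello");
        // Make the CRLFs visible when printing.
        System.out.print(message.replace("\r\n", "\\r\\n\n"));
    }
}
```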
Necessary Information in an HTTP request
- Every request must have a request line. (An HTTP request without a request line is like asking for a favor without saying what the favor is.)
- Host header: domain of the server (IP address or DNS name). It is required in HTTP/1.1.
- Accept: media types the client accepts in the response (optional in the protocol, but commonly sent)
- Accept-Encoding: content encodings, such as gzip, that the client accepts (also optional, but commonly sent)
- Content-Length (if the request has a body): length of the body data in bytes
Implementation
Before I implemented the client, it seemed complicated, but I will show you that it is not!
Data Classes
Method is an enum that represents some of the HTTP methods.
Request is the class that puts the request line, headers, and body together and converts them to a String object. The HttpClient class, which we will see soon, uses the Request#get() method to get the request as a text message.
We will use the RequestBuilder class to create a Request object. Check it on GitHub.
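For readers who do not have the repository open, here is a rough sketch of how these three pieces could fit together. Only the names Method, Request, Request#get(), and RequestBuilder come from this article; every field, constructor, and builder method below is an assumption of the sketch and may differ from the code in the repository.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class RequestDemo {
    public static void main(String[] args) {
        Request request = new RequestBuilder()
                .method(Method.POST)
                .resource("/echo")
                .header("Host", "example.com") // invented host
                .body("hi")
                .build();
        System.out.print(request.get());
    }
}

enum Method { GET, HEAD, POST, PUT, DELETE }

final class Request {
    private static final String CRLF = "\r\n";
    private final Method method;
    private final String resource;
    private final Map<String, String> headers;
    private final String body;

    Request(Method method, String resource, Map<String, String> headers, String body) {
        this.method = method;
        this.resource = resource;
        this.headers = new LinkedHashMap<>(headers); // keeps insertion order
        this.body = body == null ? "" : body;
    }

    // Serializes request line, headers, empty line, and body into one text message.
    String get() {
        StringBuilder sb = new StringBuilder();
        sb.append(method).append(' ').append(resource).append(" HTTP/1.1").append(CRLF);
        for (Map.Entry<String, String> header : headers.entrySet()) {
            sb.append(header.getKey()).append(": ").append(header.getValue()).append(CRLF);
        }
        return sb.append(CRLF).append(body).toString();
    }
}

final class RequestBuilder {
    private Method method = Method.GET;
    private String resource = "/";
    private final Map<String, String> headers = new LinkedHashMap<>();
    private String body;

    RequestBuilder method(Method method) { this.method = method; return this; }
    RequestBuilder resource(String resource) { this.resource = resource; return this; }
    RequestBuilder header(String name, String value) { headers.put(name, value); return this; }

    // Setting a body also sets Content-Length, counted in UTF-8 bytes.
    RequestBuilder body(String body) {
        this.body = body;
        headers.put("Content-Length", String.valueOf(body.getBytes(StandardCharsets.UTF_8).length));
        return this;
    }

    Request build() { return new Request(method, resource, headers, body); }
}
```

The builder keeps the call site readable and makes it hard to forget Content-Length when a body is present.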
After getting a response from the server, we convert the bytes to a Response object so that it is easy to work with.
ResponseBody is the data class that stores the content type of the body, the content length, and an array of data bytes received from the server. By knowing the content type (e.g. an HTML file), you can decide what to do with the bytes.
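As an illustration of "decide what to do with the bytes", here is one possible shape for such a data class. The class name follows this article, but the constructor and the isText() and asText() helpers are assumptions of this sketch, and decoding as UTF-8 is a simplification (a real client would look at the charset parameter).

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class ResponseBody {
    private final String contentType;
    private final int contentLength;
    private final byte[] data;

    ResponseBody(String contentType, byte[] data) {
        this.contentType = contentType;
        this.contentLength = data.length;
        this.data = data.clone(); // defensive copy
    }

    String getContentType() { return contentType; }
    int getContentLength() { return contentLength; }
    byte[] getData() { return data.clone(); }

    // True for text/html, text/plain, and other text/* types.
    boolean isText() {
        return contentType != null && contentType.toLowerCase(Locale.ROOT).startsWith("text/");
    }

    // Simplification: always decodes as UTF-8.
    String asText() {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ResponseBody body = new ResponseBody("text/html; charset=utf-8",
                "<h1>Hi</h1>".getBytes(StandardCharsets.UTF_8));
        if (body.isText()) {
            System.out.println(body.asText());
        } else {
            System.out.println(body.getContentLength() + " bytes of " + body.getContentType());
        }
    }
}
```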
HTTP Input Stream
The class consists of two parts: one reads bytes and converts them to strings, and the other extracts information from them.
In the reading part, the available() method waits until a new byte is available, and this repeats until the headers section is finished. For the body, we already know the content length from the headers that we just read, so all we have to do is wait until all of the body bytes are available. This is exactly what happens in the available() method.
As I mentioned earlier, each line ends with a CRLF, so the readLine() method treats a \r byte followed by a \n byte as the end of a line and returns that line as a String; otherwise it keeps reading and appending each byte to the builder as a char until a CRLF is detected.
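The CRLF detection can be sketched as follows. This is a simplified stand-in, not the readLine() from the repository: it relies on InputStream.read(), which blocks until a byte arrives, instead of polling available(), and the class name CrlfLineReader is made up.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class CrlfLineReader {

    // Returns the next line without its CRLF, or null if the stream ended before any byte was read.
    static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int previous = -1;
        int current;
        while ((current = in.read()) != -1) {       // read() blocks until a byte is available
            if (previous == '\r' && current == '\n') {
                line.setLength(line.length() - 1);  // drop the '\r' appended one step earlier
                return line.toString();
            }
            line.append((char) current);            // each byte becomes one char
            previous = current;
        }
        return line.length() == 0 ? null : line.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = "HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new ByteArrayInputStream(raw);
        String line;
        while ((line = readLine(in)) != null) {
            System.out.println("[" + line + "]");
        }
    }
}
```

An empty string returned from readLine() marks the empty line that ends the headers section.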
The extraction part is pretty straightforward and happens in the readResponseFromStream() method. First, it reads the status line, which consists of the HTTP version, the status code, and the response message (reason phrase). Second, it reads the headers one by one and stores them in a map. Finally, for the body, it waits until all bytes are available and then reads them as a byte array.
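Those three steps can be sketched like this. It is an approximation under assumptions, not the repository's readResponseFromStream(): ParsedResponse is a hypothetical holder class, header names are lower-cased for lookup, and only bodies announced with Content-Length are handled (no chunked transfer encoding).

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class ResponseParserSketch {

    // Hypothetical holder for the parsed pieces.
    static final class ParsedResponse {
        String version;
        int statusCode;
        String reason;
        final Map<String, String> headers = new LinkedHashMap<>();
        byte[] body = new byte[0];
    }

    static ParsedResponse parse(InputStream in) throws IOException {
        ParsedResponse response = new ParsedResponse();

        // 1. Status line: "<version> <status code> <reason phrase>"
        String statusLine = readLine(in);
        if (statusLine == null) throw new IOException("Empty response");
        String[] parts = statusLine.split(" ", 3);
        if (parts.length < 2) throw new IOException("Malformed status line: " + statusLine);
        response.version = parts[0];
        response.statusCode = Integer.parseInt(parts[1]);
        response.reason = parts.length == 3 ? parts[2] : "";

        // 2. Headers, one per line, until the empty line
        String line;
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                response.headers.put(line.substring(0, colon).trim().toLowerCase(Locale.ROOT),
                                     line.substring(colon + 1).trim());
            }
        }

        // 3. Body: exactly Content-Length bytes (0 if the header is missing)
        int length = Integer.parseInt(response.headers.getOrDefault("content-length", "0"));
        byte[] body = new byte[length];
        int offset = 0;
        while (offset < length) {
            int n = in.read(body, offset, length - offset); // blocks until some bytes arrive
            if (n == -1) throw new EOFException("Body ended after " + offset + " of " + length + " bytes");
            offset += n;
        }
        response.body = body;
        return response;
    }

    // Reads one CRLF-terminated line; returns null at end of stream.
    private static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int prev = -1, cur;
        while ((cur = in.read()) != -1) {
            if (prev == '\r' && cur == '\n') {
                sb.setLength(sb.length() - 1);
                return sb.toString();
            }
            sb.append((char) cur);
            prev = cur;
        }
        return sb.length() == 0 ? null : sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String raw = "HTTP/1.1 404 Not Found\r\nContent-Type: text/plain\r\nContent-Length: 4\r\n\r\nnope";
        ParsedResponse r = parse(new ByteArrayInputStream(raw.getBytes(StandardCharsets.US_ASCII)));
        System.out.println(r.statusCode + " " + r.reason + " " + r.headers);
    }
}
```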
HttpClient Class
The HttpClient class opens a socket on port 80 of the given host and writes the request message to the socket's output stream; flushing the stream sends the message. Finally, it uses HttpInputStream to read the response from the socket's input stream.
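Here is a stripped-down sketch of that flow. It is not the HttpClient from the repository: the names SimpleHttpClient, exchange(), and send() are invented, and instead of HttpInputStream it simply reads until the server closes the connection, which is why the example request sends "Connection: close". The host example.com is only a placeholder.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class SimpleHttpClient {

    // Writes the request, flushes it, then collects every byte until end of stream.
    static byte[] exchange(OutputStream out, InputStream in, String request) throws IOException {
        out.write(request.getBytes(StandardCharsets.UTF_8));
        out.flush(); // pushes the buffered message onto the wire

        ByteArrayOutputStream response = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            response.write(buffer, 0, n);
        }
        return response.toByteArray();
    }

    // Opens a plain TCP socket on port 80 and performs one exchange.
    static byte[] send(String host, String request) throws IOException {
        try (Socket socket = new Socket(host, 80)) {
            return exchange(socket.getOutputStream(), socket.getInputStream(), request);
        }
    }

    public static void main(String[] args) throws IOException {
        String request = "GET / HTTP/1.1\r\n"
                       + "Host: example.com\r\n"
                       + "Connection: close\r\n" // ask the server to close, so reading to EOF terminates
                       + "\r\n";
        byte[] raw = send("example.com", request);
        System.out.println(new String(raw, StandardCharsets.UTF_8));
    }
}
```

Keeping exchange() separate from the socket makes it possible to try the logic with in-memory streams before touching the network.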
So, my friend, that is pretty much it. You can play around with it and add some useful methods, such as an asynchronous send method.
In the GitHub repository, there is an example of using the client under the app module. I put the link down below.
How About HTTPS Requests?
The process is similar in most parts, but HTTPS uses SSL/TLS for the connection between client and server, so you need to use SSL sockets. The default port also changes from 80 to 443, and there may be some other small differences. I found the Baeldung website helpful for more explanation.
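As a starting point, here is a hedged sketch of how the socket could be chosen by scheme, using the JDK's default SSLSocketFactory. The class and method names are hypothetical, and I turn on hostname verification explicitly, since as far as I know a raw SSLSocket does not check the hostname unless asked to.

```java
import java.io.IOException;
import java.net.Socket;
import javax.net.SocketFactory;
import javax.net.ssl.SSLParameters;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

final class SchemeAwareSockets {

    static int defaultPort(String scheme) {
        return "https".equalsIgnoreCase(scheme) ? 443 : 80;
    }

    static SocketFactory factoryFor(String scheme) {
        return "https".equalsIgnoreCase(scheme)
                ? SSLSocketFactory.getDefault() // TLS sockets with the JDK's default trust store
                : SocketFactory.getDefault();   // plain TCP sockets
    }

    // Opens a connected socket; for https it also completes the TLS handshake.
    static Socket open(String scheme, String host) throws IOException {
        Socket socket = factoryFor(scheme).createSocket(host, defaultPort(scheme));
        if (socket instanceof SSLSocket) {
            SSLSocket ssl = (SSLSocket) socket;
            SSLParameters params = ssl.getSSLParameters();
            params.setEndpointIdentificationAlgorithm("HTTPS"); // verify the certificate matches the host
            ssl.setSSLParameters(params);
            ssl.startHandshake();
        }
        return socket;
    }

    public static void main(String[] args) {
        // Offline demo: only shows which port and factory each scheme would use.
        for (String scheme : new String[] {"http", "https"}) {
            System.out.println(scheme + " -> port " + defaultPort(scheme)
                    + ", factory " + factoryFor(scheme).getClass().getName());
        }
    }
}
```

Once the socket is open, writing the request and reading the response work the same way as with a plain socket.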
You can find the repository here.
Thank you for reading! Write down what you think, and let me know if you're writing an HTTP client. Good luck 🙌🏻