CGI is one of the oldest components of Internet infrastructure. Although newer alternatives have largely superseded it, it is still widely used.
Web server software was traditionally limited to serving static web pages. CGI scripts allow dynamic responses to be generated at the moment a request is received.
What is the Common Gateway Interface (CGI)?
The Common Gateway Interface (CGI) is a standard that defines how external programs can provide information to a web server. CGI provides web servers such as Apache with a mechanism for exchanging data with programming languages such as Perl.
Interfacing with the HTTP server
CGI is designed to provide a standard way for programming languages to access HTTP server information. Any HTTP server can work with any programming language as long as both comply with the CGI specification.
CGI-enabled servers handle requests using the following process:
- A new request is received for /example.pl.
- The web server recognizes example.pl as an executable CGI script and runs it.
- The CGI Perl script receives all the data related to the request, such as its URL and HTTP headers.
- The example.pl script executes; its output is passed back to the web server, which sends it as the HTTP response.
The flow described above differs markedly from the normal operation of a web server. Ordinarily, a request to /example.pl would simply return the contents of that file. If the file did not exist, the client would receive a 404 response instead.
When using CGI, it is not necessary to map the request to the actual file on disk. Instead, a user-defined program is run.
The program is responsible for generating the output to send to the client. The web server is no longer interested in the actual content of the response.
What information is shared through the Common Gateway Interface?
A program run through the Common Gateway Interface can access various data about the incoming HTTP request. This includes the URL, the headers, the query string, and the HTTP method, as well as the IP address of the remote client.
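As a rough illustration of the paragraph above, the sketch below gathers those request details inside a CGI program. It is written in Python rather than Perl purely for illustration. The variable names (REQUEST_METHOD, SCRIPT_NAME, PATH_INFO, QUERY_STRING, REMOTE_ADDR and the HTTP_ prefix for headers) come from the CGI specification rather than from this article, and the helper name describe_request is hypothetical.

```python
import os
from urllib.parse import parse_qs


def describe_request(environ=None):
    """Collect the request details a CGI server typically exposes."""
    # Default to the real process environment; a plain dict can be
    # passed instead, which makes the function easy to try out.
    environ = os.environ if environ is None else environ

    # Request headers arrive as HTTP_* variables, e.g. HTTP_USER_AGENT.
    headers = {
        name[5:].replace("_", "-").title(): value
        for name, value in environ.items()
        if name.startswith("HTTP_")
    }

    return {
        "method": environ.get("REQUEST_METHOD", ""),
        "path": environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", ""),
        "query": parse_qs(environ.get("QUERY_STRING", "")),
        "remote_addr": environ.get("REMOTE_ADDR", ""),
        "headers": headers,
    }
```

Called with no arguments inside a real CGI script, the function reads from the process environment. Passing a dictionary makes it possible to experiment without a web server.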
Server software is not required to provide all of this data as-is. The CGI specification allows servers to exclude headers from the environment variables. This makes it possible to withhold sensitive information, such as authentication header values, or to avoid redundancy when the same information is already available through a dedicated variable.
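The filtering described above could look something like the following on the server side. This is a minimal sketch, not how any particular server implements it. The function name headers_to_cgi_env and the two sets are hypothetical, and the exact list of headers a real server withholds is an assumption.

```python
# Headers withheld from scripts because they carry credentials (assumed list).
SENSITIVE = {"AUTHORIZATION", "PROXY_AUTHORIZATION"}

# Headers that already have a dedicated CGI variable without the HTTP_ prefix.
DEDICATED = {"CONTENT_TYPE", "CONTENT_LENGTH"}


def headers_to_cgi_env(headers):
    """Translate request headers into CGI-style environment variables."""
    env = {}
    for name, value in headers.items():
        key = name.upper().replace("-", "_")
        if key in SENSITIVE:
            # Do not leak authentication data to the script.
            continue
        if key in DEDICATED:
            # Exposed once under its dedicated name, not duplicated as HTTP_*.
            env[key] = value
        else:
            env["HTTP_" + key] = value
    return env
```

Here a User-Agent header becomes HTTP_USER_AGENT, Content-Type is exposed only as CONTENT_TYPE, and Authorization is dropped entirely.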
In addition to request data, CGI-aware servers must also provide various details about themselves. These include the server's host name and its software version. Scripts can use these details as they see fit.
Information is sent from the server to the CGI program in the form of environment variables. The program accesses it just like any other environment variable. The server executes the program as a child process, setting environment variables before calling the executable.
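To make the server's side of this concrete, here is a minimal sketch of launching a CGI program as a child process with request data placed in its environment. The function name run_cgi and its parameters are hypothetical. Copying the server's whole environment into the child is a simplification made for this example, and a real server would likely be more selective.

```python
import os
import subprocess
import sys


def run_cgi(command, cgi_env, body=b""):
    """Run a CGI program as a child process and return its raw output."""
    # Start from the current environment, then add the request variables.
    env = dict(os.environ)
    env.update(cgi_env)

    result = subprocess.run(
        command,
        input=body,              # fed to the child's standard input
        env=env,                 # environment set before the program starts
        stdout=subprocess.PIPE,  # capture whatever the program prints
        check=True,              # raise if the program exits with an error
    )
    return result.stdout
```

For example, `run_cgi([sys.executable, "-c", "import os; print(os.environ['REQUEST_METHOD'])"], {"REQUEST_METHOD": "GET"})` starts a small stand-in program that prints the method it was given.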
Some data is not passed as an environment variable. The body of the request gets special treatment, as it may be too long. It is delivered to the script on its standard input stream instead. The amount of data available is reported to the script through the CONTENT_LENGTH environment variable.
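A script might read the request body along these lines. This is a sketch under the assumption that CONTENT_LENGTH is either absent, empty, or a decimal number. The helper name read_body is hypothetical.

```python
import os
import sys


def read_body(environ=None, stream=None):
    """Read the request body from standard input, guided by CONTENT_LENGTH."""
    environ = os.environ if environ is None else environ
    stream = sys.stdin.buffer if stream is None else stream

    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        # Treat a malformed length as "no body" in this sketch.
        length = 0

    # Read only the advertised number of bytes instead of waiting for
    # end-of-file, which the server may never signal.
    return stream.read(length) if length > 0 else b""
```

Reading exactly CONTENT_LENGTH bytes avoids blocking on input that never ends.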
Once processing is complete, the CGI script hands its response to the server. The response must include headers and, optionally, a body. The script writes it to its standard output stream. The server then sends the response to the client over the HTTP connection.
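The output side can be sketched as follows. In the common case, a script prints header lines, a blank line, and then the body, and leaves the rest of the HTTP exchange to the server. The function names build_response and write_response are hypothetical, and the default content type is only an example.

```python
import sys


def build_response(body, content_type="text/html; charset=utf-8"):
    """Return header lines, a blank line, and the body as bytes."""
    header = f"Content-Type: {content_type}\r\n\r\n".encode("ascii")
    return header + body.encode("utf-8")


def write_response(body, out=None, content_type="text/html; charset=utf-8"):
    """Write the response to standard output (or another binary stream)."""
    out = sys.stdout.buffer if out is None else out
    out.write(build_response(body, content_type))
    out.flush()
```

Calling `write_response("<h1>Hello from CGI</h1>")` as the last step of a script would emit a small HTML page for the server to relay.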
Where does CGI stand today?
CGI helped create the modern web. It provided a very easy way to create dynamic server-side scripts using technologies from the mid-1990s. Web pages no longer had to be static HTML files.
The simplicity of the Common Gateway Interface has helped it endure for decades. CGI scripts remain in use, especially in older applications built on older languages. However, technology has not stood still; CGI has been superseded by modern alternatives that are better suited to today’s web.
Traditional CGI incurs overhead that becomes increasingly problematic as traffic grows. A CGI script is launched afresh for each request, and creating a new process every time drains resources on high-traffic sites.
CGI is also limited in the control it gives scripts. A script can only determine the content of the response sent back to the client. It cannot affect any other part of the HTTP exchange, such as authentication or session management.
Finally, there are security issues. CGI scripts are usually executed as a child process of the server. This means that the server must be protected from interference by scripts. Misconfiguration can give scripts unwanted access to other server-managed resources, such as configuration and log files.
Many of CGI's problems have been solved by newer interface technologies. FastCGI was created to reduce CGI's overhead. It works similarly to CGI but does not create a new process for each request. Instead, the FastCGI server operates independently of the web server, running its own persistent processes that host the scripts.
Elsewhere, individual programming languages have implemented their own server interfaces. These are usually integrated directly into the web server via optional modules. Examples are Apache's mod_php and mod_perl, which provide native support for their languages without using CGI (both languages can also be used via CGI).
Despite the existence of these mechanisms, CGI is still relevant. The simplicity of its design influenced subsequent attempts to improve on the overall approach. Although you may not encounter CGI every day on modern web systems, major web servers still support it, and that is unlikely to change.