Common Gateway Interface. Web Technologies Piero Fraternali. Outline. Architectures for dynamic content publishing CGI Java Servlet Server-side scripting JSP tag libraries. Motivations.
Common Gateway Interface Web Technologies Piero Fraternali
Outline • Architectures for dynamic content publishing • CGI • Java Servlet • Server-side scripting • JSP tag libraries
Motivations • Creating pages on the fly based on the user’s request and from structured data (e.g., database content) • Client-side scripting & components do not suffice • They manipulate an existing document/page; they do not create a new one from structured content • Solution: • Server-side architectures for dynamic content production
Common Gateway Interface • An interface that allows the Web Server to launch external applications that create pages dynamically • A kind of «double client-server loop»
What CGI is/is not • It is not • A programming language • A telecommunication protocol • It is • An interface between the web server and the applications that defines some standard communication variables • The interface is implemented through system (environment) variables, a universal mechanism present in all operating systems • A CGI program can be written in any programming language
Invocation • The client specifies in the URI the name of the program to invoke • The program must be deployed in a specified location at the web server (e.g., the cgi-bin directory) • http://my.server.web/cgi-bin/xyz.exe
Execution • The server recognizes from the URI that the requested resource is an executable • Permissions must be set in the web server to allow program execution • E.g., the extensions of executable files must be explicitly specified • http://my.server.web/cgi-bin/xyz.exe
Execution • The web server decodes the parameters sent by the client and initializes the CGI variables • REQUEST_METHOD, QUERY_STRING, CONTENT_LENGTH, CONTENT_TYPE • http://my.server.web/cgi-bin/xyz.exe?par=val
Execution • The server launches the program in a new process
Execution • The program executes and «prints» the response on the standard output
Execution • The server builds the response from the content emitted to the standard output and sends it to the client
Handling request parameters • Client parameters can be sent in two ways • With the HTTP GET method • Parameters are appended to the URL (1) • http://www.myserver.it/cgi-bin/xyz?par=val • With the HTTP POST method • Parameters are inserted as an HTTP entity in the body of the request (suitable when their size is substantial) • Requires HTML forms to let users enter data into the body of the request • (1) The HTTP specification does not set a maximum URI length; practical limits are imposed by web browser and server software
HTML Form
<HTML>
<BODY>
<FORM action="http://www.mysrvr.it/cgi-bin/xyz.exe" method="post">
  <P>Tell me your name:</P>
  <P><INPUT type="text" NAME="whoareyou"></P>
  <INPUT type="submit" VALUE="Send">
</FORM>
</BODY>
</HTML>
Structure of a CGI program • Read environment variables • Execute business logic • Print MIME header "Content-type: text/html" • Print HTML markup
Parameter decoding • Read variable REQUEST_METHOD • If the method is GET: read variable QUERY_STRING • If the method is POST: read variable CONTENT_LENGTH, then read CONTENT_LENGTH bytes from the standard input
CGI development • A CGI program can be written in any programming language: • C/C++ • Fortran • Perl • Tcl • Unix shell • Visual Basic • If a compiled programming language is used, the source code must be compiled • Normally source files are in cgi-src • Executable binaries are in cgi-bin • If instead an interpreted scripting language is used, the source files themselves are deployed • Normally in the cgi-bin folder
Overview of CGI variables • Clustered per type: • server • request • headers
Server variables • These variables are always available, i.e., they do not depend on the request • SERVER_SOFTWARE: name and version of the server software • Format: name/version • SERVER_NAME: hostname or IP of the server • GATEWAY_INTERFACE: supported CGI version • Format: CGI/version
Request variables • These variables depend on the request • SERVER_PROTOCOL: name and version of the protocol used for the request (e.g., HTTP/1.1) • Format: protocol/version • SERVER_PORT: port to which the request is sent • REQUEST_METHOD: HTTP request method • PATH_INFO: extra path information • PATH_TRANSLATED: translation of PATH_INFO from virtual to physical • SCRIPT_NAME: invoked script URL • QUERY_STRING: the query string
Other request variables • REMOTE_HOST: client hostname • REMOTE_ADDR: client IP address • AUTH_TYPE: authentication type used by the protocol • REMOTE_USER: username used during the authentication • CONTENT_TYPE: content type in case of POST and PUT request methods • CONTENT_LENGTH: content length
Environment variables: headers • The HTTP headers contained in the request are stored in the environment with the prefix HTTP_ • HTTP_USER_AGENT: browser used for the request • HTTP_ACCEPT_ENCODING: encoding type accepted by the client • HTTP_ACCEPT_CHARSET: charset accepted by the client • HTTP_ACCEPT_LANGUAGE: language accepted by the client
CGI script for inspecting variables
#include <stdio.h>
#include <stdlib.h>

/* Return the value of a CGI variable, or a placeholder when it is unset
   (passing NULL to printf's %s is undefined behavior). */
static const char *var(const char *name) {
    const char *value = getenv(name);
    return value != NULL ? value : "(not set)";
}

int main(void) {
    /* The header line and the blank line must precede the markup. */
    printf("Content-Type: text/html\n\n");
    printf("<html><head><title>Request variables</title></head>");
    printf("<body><h1>Some request variables:</h1>");
    fflush(stdout);
    /* Values are printed unescaped: acceptable for a demo only. */
    printf("SERVER_SOFTWARE: %s<br>\n", var("SERVER_SOFTWARE"));
    printf("GATEWAY_INTERFACE: %s<br>\n", var("GATEWAY_INTERFACE"));
    printf("REQUEST_METHOD: %s<br>\n", var("REQUEST_METHOD"));
    printf("QUERY_STRING: %s<br>\n", var("QUERY_STRING"));
    printf("HTTP_USER_AGENT: %s<br>\n", var("HTTP_USER_AGENT"));
    printf("HTTP_ACCEPT_ENCODING: %s<br>\n", var("HTTP_ACCEPT_ENCODING"));
    printf("HTTP_ACCEPT_CHARSET: %s<br>\n", var("HTTP_ACCEPT_CHARSET"));
    printf("HTTP_ACCEPT_LANGUAGE: %s<br>\n", var("HTTP_ACCEPT_LANGUAGE"));
    printf("HTTP_REFERER: %s<br>\n", var("HTTP_REFERER"));
    printf("REMOTE_ADDR: %s<br>\n", var("REMOTE_ADDR"));
    printf("</body></html>");
    return 0;
}
Problems with CGI • Performance and security issues in web server to application communication • When the server receives a request, it creates a new process in order to run the CGI program • This requires time and significant server resources • A CGI program cannot interact back with the web server • The process of the CGI program is terminated when the program finishes • No sharing of resources between subsequent calls (e.g., reuse of database connections) • No main memory preservation of the user’s session (database storage is necessary if session data are to be preserved) • Exposing to the web the physical path to an executable program can breach security
References • CGI reference: http://www.w3.org/CGI/ • Security and CGI: http://www.w3.org/Security/Faq/index.html
Complete example • 1. First request: Form.html • 2. Retrieve resource Form.html • 3. Response • 4. Second request: Mult.cgi • 5. Set environment variables and invoke the program • 6. Compute the response • 7. Send the response • Mult.c was previously compiled into Mult.cgi
The form (form.html)
<HTML>
<HEAD><TITLE>Multiplication form</TITLE></HEAD>
<BODY>
<FORM ACTION="http://www.polimi.it/cgi-bin/run/mult.cgi">
  <P>Enter the numbers to multiply</P>
  <INPUT NAME="m" SIZE="5"><BR/>
  <INPUT NAME="n" SIZE="5"><BR/>
  <INPUT TYPE="SUBMIT" VALUE="Multiply">
</FORM>
</BODY>
</HTML>
• The ACTION attribute holds the URL that is invoked
(Figure: the form as viewed in a browser)
The script (mult.c)
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *data;
    long m, n;

    /* Statements that print the response on the standard output */
    printf("%s%c%c\n", "Content-Type:text/html;charset=iso-8859-1", 13, 10);
    printf("<HTML>\n<HEAD>\n<TITLE>Multiplication result</TITLE>\n</HEAD>\n");
    printf("<BODY>\n<H3>Multiplication result</H3>\n");

    /* Retrieval of values from the environment variables */
    data = getenv("QUERY_STRING");
    if (data == NULL)
        printf("<P>Error! Could not receive the data from the form.</P>\n");
    else if (sscanf(data, "m=%ld&n=%ld", &m, &n) != 2)
        printf("<P>Error! Invalid data. Values must be numeric.</P>\n");
    else
        printf("<P>Result: %ld * %ld = %ld</P>\n", m, n, m * n);

    printf("</BODY>\n</HTML>\n");
    return 0;
}

Compilation and local test
Compilation:
$ gcc -o mult.cgi mult.c
Local test (the environment variable containing the query string is set manually):
$ export QUERY_STRING="m=2&n=3"
$ ./mult.cgi
Result:
Content-Type:text/html;charset=iso-8859-1

<HTML>
<HEAD>
<TITLE>Multiplication result</TITLE>
</HEAD>
<BODY>
<H3>Multiplication result</H3>
<P>Result: 2 * 3 = 6</P>
</BODY>
</HTML>
Considerations on CGI • Possible security problems • Performance (overhead) • Creating and terminating processes takes time • Context switches take time • CGI processes: • Are created at each invocation • Do not inherit process state from previous invocations (e.g., database connections)
References • CGI reference: http://hoohoo.ncsa.uiuc.edu/cgi/overview.html • Security and CGI: http://www.w3.org/Security/Faq/wwwsf4.html