Let’s ‘C’ how to talk to the web server

We are developing some new functionality which requires the IBM i to talk to a web server and extract information it uses to decide whether certain actions are required. The current thrust of the Let’s ‘C’ series is the delivery of an IBM i server which can talk to the ASCII-based world, so this is a bit of a side step; we will get back to the server in the near future.

There are plenty of posts by various people about how to connect to a web server and retrieve data, but I did not find any which are IBM i specific. The biggest problem is that the world tends to be ASCII based, so the conversation with the web server has to be done in ASCII. The existing work done for the IBM i server allows us to use the same technology in reverse to send the request off to the web server. We also built some new functions which split the URL being passed into segments so we can build the ‘GET’ string which is sent to the server. In the sample we provide we expect the address to be IPv4 and in a set format; you may want to consider IPv6 addresses, and some of the formatting could differ based on additional parameters being passed in (check the RFC for more information).
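
For anyone who needs the IPv6 case, one protocol-independent option is the POSIX getaddrinfo() call, which returns IPv4 or IPv6 addresses through the same interface. The sketch below is not part of our service program: the name resolve_host() is made up for illustration, and it assumes your release provides getaddrinfo() in <netdb.h>.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

// hypothetical helper, not part of IPFUNC
// resolves host (name, IPv4 or IPv6 notation) and a port string
// copies the first address found into out
// returns 1 on success, -1 on failure
int resolve_host(const char *host,
                 const char *port,
                 struct sockaddr_storage *out,
                 socklen_t *out_len) {
struct addrinfo hints;                      // lookup criteria
struct addrinfo *res = NULL;                // result list

assert(host != NULL && port != NULL && out != NULL && out_len != NULL);
memset(&hints, 0, sizeof(hints));
hints.ai_family = AF_UNSPEC;                // accept IPv4 or IPv6
hints.ai_socktype = SOCK_STREAM;            // TCP
if(getaddrinfo(host, port, &hints, &res) != 0) {
   return -1;
   }
memcpy(out, res->ai_addr, res->ai_addrlen);
*out_len = res->ai_addrlen;
freeaddrinfo(res);
return 1;
}
```

The address family for the socket() call would then come from out->ss_family rather than being fixed at AF_INET.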

The first module forms a new service program; we updated the relevant objects to allow the service program to be built and bound to. We named the module IPFUNC. It has 4 functions, all of which are exported for use in other programs. Here is the source code.
#include <H/IPFUNC>                         // IP functions
#include <H/MSGFUNC>                        // message functions
#include <H/COMMON>                         // common header
#include <ctype.h>                          // C Types
#include <errno.h>                          // error number

//
// function Get_Host_Addr()
// Purpose: get the Host address.
// @parms
//      string server name
//      struct socket address
//      int server port
// returns 1 on success, -1 on failure


int Get_Host_Addr(char *server,
                  struct sockaddr_in *addr,
                  int Server_Port)  {
struct hostent *hostp;                      // host struct pointer
char msg_dta[_MAX_MSG];                     // msg array

addr->sin_family = AF_INET;
addr->sin_port = htons(Server_Port);
if((addr->sin_addr.s_addr  = inet_addr(server)) == (unsigned long) INADDR_NONE) {
   hostp = gethostbyname(server);
   if(hostp == (struct hostent *)NULL) {
      sprintf(msg_dta,"%s",hstrerror(h_errno));
      snd_msg("GEN0001",msg_dta,strlen(msg_dta));
      return -1;
      }
   memcpy(&addr->sin_addr,hostp->h_addr,sizeof(addr->sin_addr));
   }
return 1;
}


// (function) rmt_connect
// Connect to the remote system
// @parms
//     string server name
//     int server port
//     int pointer to socket descriptor
// returns 1 connected, socket set to connection, -1 on failure

int rmt_connect(char *server,
                int server_port,
                int *sockfd) {
int rc = 0;                                 // return value
char msg_dta[_MAX_MSG] = {'\0'};            // msg data
struct sockaddr_in addr;                    // socket struct

memset(&addr, 0, sizeof(addr));
*sockfd = socket(AF_INET, SOCK_STREAM, 0);
if(*sockfd < 0) {
   sprintf(msg_dta,"%s",strerror(errno));
   snd_msg("GEN0001",msg_dta,strlen(msg_dta));
   return -1;
   }
// get correct IP address
if(Get_Host_Addr(server,&addr,server_port) != 1) {
   //close the socket
   close(*sockfd);
   return -1;
   }
rc = connect(*sockfd, (struct sockaddr *) &addr, sizeof(addr));
if(rc < 0) {
   sprintf(msg_dta,"Failed to connect to socket : %s",strerror(errno));
   snd_msg("GEN0001",msg_dta,strlen(msg_dta));
   close(*sockfd);
   return -1;
   }
return 1;
}

// (function) parse_url
// parse the URL into parts
// @parms
//     Url
//     url parts struct
// returns 1 on OK -1 on failure

int parse_url(const char *url,
              url_t *parsed_url) {
const char *tmpstr;
const char *curstr;
int len;
int i;
int bracket_flag;

// set up start ptr
curstr = url;
// get scheme
tmpstr = strchr(curstr, ':');
if(tmpstr == NULL) {
   return -1;
   }
// Get the scheme length
len = tmpstr - curstr;
// the port number is also preceded by a ':' so make sure it's not too long, http(s) is a maximum of 5
if(len > 5) {
   return -1;
   }
// Copy the scheme to the storage
parsed_url->scheme = malloc(sizeof(char) * (len + 1));
if(parsed_url->scheme == NULL) {
   return -1;
   }
memcpy(parsed_url->scheme, curstr, len);
parsed_url->scheme[len] = '\0';
// Make each character lower case if it is upper case.
for(i = 0; i < len; i++) {
   parsed_url->scheme[i] = tolower(parsed_url->scheme[i]);
   }
// Skip ':'
tmpstr++;
curstr = tmpstr;
// skip the '//' if not there bad url
for(i = 0; i < 2; i++) {
   if (*curstr !=  '/') {
      free_mem(parsed_url);
      return -1;
      }
   curstr++;
   }
// get the host
tmpstr = curstr;
while(*tmpstr != '\0') {
   // could be a port number or a '/'
   if((*tmpstr == ':') || (*tmpstr == '/')) {
      break;
      }
   tmpstr++;
   }
len = tmpstr - curstr;
parsed_url->host = malloc(sizeof(char) * (len + 1));
if((parsed_url->host == NULL) || (len <= 0)) {
   free_mem(parsed_url);
   return -1;
   }
memcpy(parsed_url->host, curstr, len);
parsed_url->host[len] = '\0';
curstr = tmpstr;
// Is port number specified?
if(*curstr == ':') {
   curstr++;
   tmpstr = curstr;
   while((*tmpstr != '\0') && (*tmpstr != '/')) {
      tmpstr++;
      }
   len = tmpstr - curstr;
   parsed_url->port = malloc(sizeof(char) * (len + 1));
   if(parsed_url->port == NULL) {
      free_mem(parsed_url);
      return -1;
      }
   memcpy(parsed_url->port, curstr, len);
   parsed_url->port[len] = '\0';
   curstr = tmpstr;
   }
// End of the string
if(*curstr == '\0') {
   return 1;
   }
// Skip '/'
if(*curstr != '/') {
  free_mem(parsed_url);
  return -1;
  }
curstr++;
// Parse path
tmpstr = curstr;
while((*tmpstr != '\0') && (*tmpstr != '#')  && (*tmpstr != '?')) {
   tmpstr++;
   }
len = tmpstr - curstr;
parsed_url->path = malloc(sizeof(char) * (len + 1));
if(parsed_url->path == NULL) {
   free_mem(parsed_url);
   return -1;
   }
memcpy(parsed_url->path, curstr, len);
parsed_url->path[len] = '\0';
curstr = tmpstr;
// Is query specified?
if(*curstr == '?') {
   // Skip '?'
   curstr++;
   // Read query
   tmpstr = curstr;
   while((*tmpstr != '\0') && (*tmpstr != '#')) {
      tmpstr++;
      }
   len = tmpstr - curstr;
   parsed_url->query = malloc(sizeof(char) * (len + 1));
   if(parsed_url->query == NULL) {
      free_mem(parsed_url);
      return -1;
      }
   memcpy(parsed_url->query, curstr, len);
   parsed_url->query[len] = '\0';
   curstr = tmpstr;
   }
// Is fragment specified?
if(*curstr == '#') {
   // Skip '#'
   curstr++;
   // Read fragment
   tmpstr = curstr;
   while(*tmpstr != '\0') {
      tmpstr++;
      }
   len = tmpstr - curstr;
   parsed_url->fragment = malloc(sizeof(char) * (len + 1));
   if(parsed_url->fragment == NULL) {
      free_mem(parsed_url);
      return -1;
      }
   memcpy(parsed_url->fragment, curstr, len);
   parsed_url->fragment[len] = '\0';
   curstr = tmpstr;
   }
return 1;
}

/*
 * Free memory of parsed url
 */
void free_mem(url_t *parsed_url) {
if(parsed_url != NULL) {
   if(parsed_url->scheme != NULL) {
      free(parsed_url->scheme);
      }
   if(parsed_url->host != NULL) {
      free(parsed_url->host);
      }
   if(parsed_url->port != NULL) {
      free(parsed_url->port);
      }
   if(parsed_url->path != NULL) {
      free(parsed_url->path);
      }
   if(parsed_url->query != NULL) {
      free(parsed_url->query);
      }
   if(parsed_url->fragment != NULL) {
      free(parsed_url->fragment);
      }
   }
return;
} 
We also need a new header file which was included in the above and allows the programs to understand the call requirements.
#ifndef IPFUNC_h
   #define IPFUNC_h
   #include <sys/socket.h>                     // sockets header
   #include <netinet/in.h>                     // host address
   #include <netdb.h>                          // network Db func
   #include <arpa/inet.h>                      // inet_addr header
   #include <resolv.h>                         // hstrerror header
   #include <sys/types.h>                      // types header
   #include <netinet/tcp.h>                    // TCP Options

   typedef struct url_x {
                  char *scheme;
                  char *host;
                  char *port;
                  char *path;
                  char *query;
                  char *fragment;
                  } url_t;

   int Get_Host_Addr(char *,struct sockaddr_in *,int);
   int rmt_connect(char *,int,int *);
   int parse_url(const char *, url_t *);
   void free_mem(url_t *);
   #endif                             
We made some changes to the COMMON header file adding a new definition for _MAX_MSG 1024.
Because the connection is being done over TCP/IP we created 2 new functions to handle the connection. The first takes the address we enter (either IP notation such as 192.168.1.1 or a resolvable name such as www.shieldadvanced.com) and converts it to a known structure which is passed to the connection request. The next is the connection request itself; we have kept it pretty simple in the example as we have not set any of the port attributes, we just rely on the defaults but will set them when we productionize the code. The other 2 functions deal with the URL: parse_url() takes a string such as ‘https://www.shieldadvanced.com/index.php?id=12’ and breaks it down into its relevant parts, and free_mem() releases the memory allocated for those parts. This allows us to connect to the server using the host part, use the scheme to pick the port, and then build the GET string which is passed to the server. You will notice we default to port 80 in the code.
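
The port choice described above can be isolated into a small function. This is only a sketch under our own assumptions: select_port() is a hypothetical name, and unlike the atoi() call in the calling program it rejects port text that is not a clean number between 1 and 65535.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// hypothetical helper: pick the TCP port from the parsed URL parts
// port may be NULL when the URL did not carry one
// returns the port number, or -1 if the port text is not valid
int select_port(const char *scheme, const char *port) {
char *end = NULL;                           // first unconverted character
long value;                                 // converted port

assert(scheme != NULL);
if(port != NULL) {
   value = strtol(port, &end, 10);
   // reject empty text, trailing junk and out of range values
   if((end == port) || (*end != '\0') || (value < 1) || (value > 65535)) {
      return -1;
      }
   return (int)value;
   }
// no port given, https uses 443, everything else falls back to 80
if(strcmp(scheme, "https") == 0) {
   return 443;
   }
return 80;
}
```

With the parsed structure from parse_url() the call would look like select_port(parsed_url.scheme, parsed_url.port).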

We define a structure which contains a set of pointers; the calling program initializes them to NULL when it declares the structure (url_t parsed_url = {NULL};). We rely on that fact when we clean up: the memory for each segment is reached through the relevant pointer in the structure, so we only ‘free’ memory which has actually been allocated. The process is pretty simple to follow. We walk through the string expecting it to be in a set format, which allows us to segment the relevant parts as we walk and allocate memory which is only big enough to hold the data (plus 1 byte for the NUL terminator); the address of that memory is stored in our structure. If we sense the string is incorrectly formatted we abandon the request and call the free_mem() function to clean up any memory we have allocated.
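
The allocate-and-copy step is repeated for every segment in parse_url(), so it could be pulled into one helper. The sketch below shows the idea; dup_segment() is a hypothetical name and is not in the IPFUNC source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// hypothetical helper: copy len bytes starting at start into new storage
// returns NULL if the allocation fails, the caller owns the memory
char *dup_segment(const char *start, size_t len) {
char *seg;                                  // new storage

assert(start != NULL);
seg = malloc(len + 1);                      // data plus NUL terminator
if(seg == NULL) {
   return NULL;
   }
memcpy(seg, start, len);
seg[len] = '\0';
return seg;
}
```

Each segment would then be stored with something like parsed_url->host = dup_segment(curstr, len); followed by the same NULL check as before. As a side note, standard C defines free(NULL) as a no-op, so the NULL tests in free_mem() are a safety net rather than a requirement.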

The following program calls the service program. We are only showing the concepts, so this program is far from ready for release into the wild. We have our own internal server which we use for the testing so none of this ever gets outside of our internal network. Here is the code.
#include <H/IPFUNC>                         // IP functions
#include <H/COMMON>                         // common header
#include <H/MSGFUNC>                        // message functions
#include <H/SRVFUNC>                        // Server functions
#include <iconv.h>                          // conversion header
#include <qtqiconv.h>                       // iconv header
#include <errno.h>                          // error number

int main(int argc, char **argv) {
int sockfd = 0;                             // socket
int server_port = 80;                       // server port defaults to http
int rc = 0;                                 // return counter
char msg_dta[_MAX_MSG];                     // message data
char req[2048];                             // maximum allowed request 2048 bytes
char recv_buf[_32K];                        // receive buffer
char convBuf[_32K];                         // conversion buffer
char _LF[2] = {0x0d,0x25};                  // CRLF pair (EBCDIC CR and LF)
QtqCode_T jobCode = {0,0,0,0,0,0};          // (Job) CCSID to struct
QtqCode_T asciiCode = {819,0,0,0,0,0};      // (ASCII) CCSID from struct
iconv_t a_e_ccsid;                          // convert table struct
iconv_t e_a_ccsid;                          // convert table struct
url_t parsed_url = {NULL};                  // parsed url structure

// we need the conversion tables for talking to the web server
a_e_ccsid = QtqIconvOpen(&jobCode,&asciiCode);
if(a_e_ccsid.return_value == -1) {
   sprintf(msg_dta,"QtqIconvOpen Failed %s",strerror(errno));
   snd_msg("GEN0001",msg_dta,strlen(msg_dta));
   return -1;
   }
// EBCDIC to ASCII
e_a_ccsid = QtqIconvOpen(&asciiCode,&jobCode);
if(e_a_ccsid.return_value == -1) {
   iconv_close(a_e_ccsid);
   sprintf(msg_dta,"QtqIconvOpen Failed %s",strerror(errno));
   snd_msg("GEN0001",msg_dta,strlen(msg_dta));
   return -1;
   }
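// check the url passed in
if(argc < 2) {
   sprintf(msg_dta,"No URL passed");
   snd_msg("GEN0001",msg_dta,strlen(msg_dta));
   return -1;
   }
if(parse_url(argv[1],&parsed_url) != 1) {
   sprintf(msg_dta,"URL %s is incorrectly formatted",argv[1]);
   snd_msg("GEN0001",msg_dta,strlen(msg_dta));
   return -1;
   }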
//printf("scheme %s\nhost %s\nport %s\npath %s\nquery %s\nsegment %s\n",
//       parsed_url.scheme, parsed_url.host,parsed_url.port,parsed_url.path,parsed_url.query,parsed_url.fragment);
// check if defined port is used
if(parsed_url.port != NULL) {
   server_port = atoi(parsed_url.port);
   }
else {
   // default is 80 so only change if https
   if(strcmp(parsed_url.scheme,"https") == 0) {
      server_port = 443;
      }
   }
// connect to the server
if(rmt_connect(parsed_url.host,server_port,&sockfd) != 1) {
   printf("Failed to connect\n");
   free_mem(&parsed_url);
   return -1;
   }
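// build the request, path and query are NULL when not present in the URL
sprintf(req,"GET /%s%s%s HTTP/1.1%.2sHost: %s%.2sConnection: close%.2s%.2s",
        (parsed_url.path != NULL) ? parsed_url.path : "",
        (parsed_url.query != NULL) ? "?" : "",
        (parsed_url.query != NULL) ? parsed_url.query : "",
        _LF,parsed_url.host,_LF,_LF,_LF);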
//printf("Request : %s\n",req);
// convert to ascii
convert_buffer(req,convBuf,strlen(req),_32K,e_a_ccsid);
// send off the request
rc = send(sockfd,convBuf,strlen(req),0);
if(rc != strlen(req)) {
   sprintf(msg_dta,"Failed to send request %s",req);
   snd_msg("GEN0001",msg_dta,strlen(msg_dta));
   close(sockfd);
   free_mem(&parsed_url);
   return -1;
   }
// receive the response
do {
   rc = recv(sockfd,recv_buf,_32K,0);
   if(rc <= 0) {
      break;
      }
   // convert back to ebcdic
   memset(convBuf,'\0',_32K);
   convert_buffer(recv_buf,convBuf,rc,_32K,a_e_ccsid);
   printf("returned data %s\n",convBuf);
   } while(rc > 0);
// close the socket
close(sockfd);
// free any allocated memory
free_mem(&parsed_url);
return 1;
}
The program is called with the URL passed in the first parameter (argv[1]). It sets up the translation tables and then parses the input into parts before it connects to the remote server. Once connected, it builds the GET string, converts it to ASCII, sends it to the web server and then waits for a response. On receipt the data is translated back to EBCDIC and sent to STDOUT. It only exits after it has cleaned up the socket and the memory allocated for the URL parts.

IMPORTANT NOTE:
The web server requires that each line of the request is terminated with a ‘\r\n’ (CRLF) pair. In EBCDIC those escape sequences do not convert correctly, so we have to send the hex values which represent those characters (0x0d and 0x25). If you do not send a correctly formatted string the web server will simply drop the request and no data will be returned.
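
One way to keep the terminator issue in a single place is to pass the line ending into the function that builds the request. The following is a sketch only: build_get() is a hypothetical name, it assumes snprintf() is available on your release, and it also guards against a missing path or query. On the IBM i the eol argument would be the bytes 0x0d, 0x25 followed by a NUL (before conversion to ASCII); on an ASCII system it would simply be "\r\n".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// hypothetical helper: build a GET request into out
// path and query may be NULL, eol is the line terminator to use
// returns the request length, or -1 if it does not fit in out
int build_get(char *out, size_t out_len,
              const char *host, const char *path,
              const char *query, const char *eol) {
int len;                                    // formatted length

assert(out != NULL && host != NULL && eol != NULL);
len = snprintf(out, out_len,
               "GET /%s%s%s HTTP/1.1%sHost: %s%sConnection: close%s%s",
               (path != NULL) ? path : "",
               (query != NULL) ? "?" : "",
               (query != NULL) ? query : "",
               eol, host, eol, eol, eol);
// snprintf reports the length it wanted, so check for truncation
if((len < 0) || ((size_t)len >= out_len)) {
   return -1;
   }
return len;
}
```

The returned length can be used for the conversion and the send() call instead of calling strlen() on the request again.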

To compile the above code you will need to run the following commands.
CRTCMOD MODULE(OSLIB/IPFUNC) SRCFILE(OSLIB/QCSRC) SRCMBR(IPFUNC) OUTPUT(*PRINT) DBGVIEW(*SOURCE) REPLACE(*YES) TGTRLS(V7R1M0)
RTVBNDSRC MODULE(OSLIB/IPFUNC) SRCFILE(OSLIB/QSRVSRC)
CRTSRVPGM SRVPGM(OSPGM/IPFUNC) MODULE(OSLIB/IPFUNC) SRCFILE(OSLIB/QSRVSRC) BNDDIR(OSLIB/OS) TGTRLS(V7R1M0)
ADDBNDDIRE BNDDIR(OSLIB/OS) OBJ((IPFUNC *SRVPGM *DEFER))
CRTCMOD MODULE(OSLIB/HTTPGET) SRCFILE(OSLIB/QCSRC) SRCMBR(HTTPGET) OUTPUT(*PRINT) DBGVIEW(*SOURCE) REPLACE(*YES) TGTRLS(V7R1M0)
CRTPGM PGM(OSPGM/HTTPGET) MODULE(OSLIB/HTTPGET) BNDDIR(OSLIB/OS) TGTRLS(V7R1M0)
You should now have the required objects to run the test. We created a simple script that just returns the ID parameter that is passed to the script in the $_REQUEST variable. I have added the script to show how simple the test we ran was.
<?php

echo("This was the request ID: " .$_REQUEST['id']);
exit(0);
We placed the script in the relevant path on our internal web server and called the program with the relevant path. If you have problems with the returned data, uncomment the ‘printf’ statements to verify the data being passed.

Here is a sample output. You will notice that some header information is returned before the data we require, so you will need to strip off the header to get at the data.
CALL PGM(HTTPGET) PARM('http://shieldresponsive.shield.local/scripts/testret.php?id=ABCDEFGHIJ')
returned data HTTP/1.1 200 OK            
Date: Mon, 04 Jun 2018 15:51:15 GMT     
Server: Apache/2.4.10 (Debian)          
Content-Length: 35                      
Connection: close                        
Content-Type: text/html; charset=UTF-8  
                                        
This was the request ID: ABCDEFGHIJ 
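
Stripping the header comes down to finding the blank line that separates it from the body. Here is a minimal sketch, assuming the whole header has arrived in one NUL-terminated buffer; find_body() is a hypothetical name. The separator is passed in because it depends on where you look: in the raw ASCII data it is "\r\n\r\n", while in the converted buffer on the IBM i we would expect the bytes 0x0d 0x25 0x0d 0x25 (assuming the job CCSID maps the ASCII line feed to 0x25).

```c
#include <assert.h>
#include <string.h>

// hypothetical helper: return a pointer to the body of a response
// sep is the blank line separator (two line terminators back to back)
// returns NULL if the separator is not in the buffer
const char *find_body(const char *response, const char *sep) {
const char *pos;                            // start of the separator

assert(response != NULL && sep != NULL);
pos = strstr(response, sep);
if(pos == NULL) {
   return NULL;
   }
return pos + strlen(sep);
}
```

A NULL return means the end of the header has not been seen yet, so more data would have to be received and appended before trying again.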

That is it, we have seen how easily we can connect to a web server in ‘C’ and receive the data it returns for a given request. We built the request with a query attached as we wanted to test the ability to run a script and receive some data back; if this is a simple page request you will see the entire page content returned.

That’s it for this week’s post. We will be looking at expanding the server code in the next post (unless we get another idea we want to share) so stay tuned. The updated code will be placed on the GitHub repository later today.

Chris..