Slow Response with i5_pconnect().

While experimenting with the latest version of our DR4i PHP interface we came across an issue with the i5 connection routines. The problem only appeared after we moved the code from our PC testing environment to the iAMP install, so at first we put it down to a simple slowdown in moving from the PC to the IBM i. Unfortunately that was only part of the problem. As soon as we found the issue we contacted Aura and asked them for support. They came back asking how the problem was manifesting itself, as they had not seen it elsewhere and were not sure what could be causing it.

We asked Aura what could have changed in the code to cause such a significant slowdown. They said that nothing had changed, and because they could not recreate the issue on their network they could not understand why we were seeing it. After some further discussion and discovery they let us know that they had moved from the gethostbyname() API to the getaddrinfo() API in preparation for IPv6 support. getaddrinfo() is the API that should be used in place of gethostbyname() where IPv6 support is required.
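To illustrate why getaddrinfo() is the call to use once IPv6 is in the picture, here is a minimal sketch (not Aura's code) showing that a single call handles both address families. The helper name first_family() is our own invention for this example, and it uses numeric addresses so that no name server is involved.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper: returns the address family (AF_INET or AF_INET6)
 * of the first entry getaddrinfo() returns for host, or -1 on failure. */
int first_family(const char *host)
{
    struct addrinfo hints;
    struct addrinfo *result = NULL;
    int family;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 answers */
    hints.ai_socktype = SOCK_STREAM; /* avoid one entry per socket type */

    if (getaddrinfo(host, NULL, &hints, &result) != 0 || result == NULL)
        return -1;

    family = result->ai_family;
    freeaddrinfo(result);
    return family;
}
```

Calling first_family("127.0.0.1") should give AF_INET and first_family("::1") should give AF_INET6, which is something gethostbyname() was never designed to do in one call.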

We scoured the internet and found a number of entries which discussed slow lookups when getaddrinfo() was used. It was obviously a known problem, and we needed to understand why it was playing a part in our environment but not in Aura’s. So our first action was to write a test program which would take a host name and try to resolve it using the getaddrinfo() API. Here is the code we started off with.


#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <leawi.h> /* CEE date functions */

#ifndef NI_MAXHOST
#define NI_MAXHOST 1025
#endif

int main(int argc,char **argv) {
    int error;
    int junkl; /* Int holder */
    double secs; /* Secs holder */
    char Time_Stamp[18]; /* Time Stamp holder */
    char hostname[NI_MAXHOST] = ""; /* Host name returned */
    unsigned char junk2[23]; /* Junk char string */
    struct addrinfo *result;
    struct addrinfo *res;

    /* a host name must be passed as the first parameter */
    if(argc < 2) {
        fprintf(stderr, "usage: %s hostname\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    /* time before the lookup */
    CEELOCT(&junkl, &secs,junk2,NULL);
    CEEDATM(&secs,"YYYYMMDDHHMISS999",Time_Stamp,NULL);
    printf("Start = %s\n",Time_Stamp);
    error = getaddrinfo(argv[1], NULL, NULL, &result);
    /* time now */
    CEELOCT(&junkl, &secs,junk2,NULL);
    CEEDATM(&secs,"YYYYMMDDHHMISS999",Time_Stamp,NULL);
    printf("After getaddrinfo = %s\n",Time_Stamp);
    if(error != 0) {
        fprintf(stderr, "error in getaddrinfo: %s\n", gai_strerror(error));
        exit(EXIT_FAILURE);
    }
    /* loop over all returned results and do inverse lookup */
    for(res = result; res != NULL; res = res->ai_next) {
        error = getnameinfo(res->ai_addr,
                            res->ai_addrlen,
                            hostname,
                            NI_MAXHOST,
                            NULL,
                            0,
                            0);
        if(error != 0) {
            fprintf(stderr, "error in getnameinfo: %s\n", gai_strerror(error));
        }
        if(*hostname != '\0')
            printf("hostname: %s\n", hostname);
        /* time after the inverse lookup */
        CEELOCT(&junkl, &secs,junk2,NULL);
        CEEDATM(&secs,"YYYYMMDDHHMISS999",Time_Stamp,NULL);
        printf("After getnameinfo = %s\n",Time_Stamp);
    }
    freeaddrinfo(result);
    return 0;
}

When we ran this test program on our network with a simple host name that is defined in our HOST file, this is a sample of the output.

Start = 20130205095444797
After getaddrinfo = 20130205095453000
hostname: SHIELD3.SHIELD.LOCAL
After getnameinfo = 20130205095453000
Press ENTER to end terminal session.

This showed a response time of just over 8 seconds for the getaddrinfo() API! Obviously this would not be acceptable, as the API is called each time a connection is made. It was an issue for us because we do not have a DNS server to resolve our local names and rely on the HOST table entries instead. Our default search is set to *LOCAL, so we expected getaddrinfo() to look up the address in the HOST table first and resolve it there. But because of the way the API has been coded, it always went out to the DNS server asking for an IPv6 address before looking for the IPv4 address in the HOST table.
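For anyone who wants to check the arithmetic on the two stamps in the sample output, here is a small sketch that turns the YYYYMMDDHHMISS999 strings into an elapsed time. The helper names ms_of_day() and elapsed_ms() are hypothetical, and the sketch assumes both stamps fall on the same day and that the trailing 999 positions hold milliseconds.

```c
#include <stdio.h>

/* Hypothetical helper: milliseconds since midnight taken from a
 * 17-character "YYYYMMDDHHMISS999" stamp; -1 if it cannot be parsed. */
long ms_of_day(const char *stamp)
{
    int hh, mi, ss, ms;

    /* skip the 8 date characters, then read HH, MI, SS and milliseconds */
    if (sscanf(stamp, "%*8c%2d%2d%2d%3d", &hh, &mi, &ss, &ms) != 4)
        return -1;
    return ((hh * 60L + mi) * 60L + ss) * 1000L + ms;
}

/* Hypothetical helper: elapsed milliseconds between two stamps taken on
 * the same day; -1 if either stamp cannot be parsed. */
long elapsed_ms(const char *start, const char *end)
{
    long a = ms_of_day(start);
    long b = ms_of_day(end);

    if (a < 0 || b < 0)
        return -1;
    return b - a;
}
```

With the Start and After getaddrinfo values shown above, elapsed_ms("20130205095444797", "20130205095453000") works out to 8203 milliseconds, so the lookup took about 8.2 seconds.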

We then looked at the documentation a lot more closely, and after some experimentation found that if we removed the Domain Information from the TCP/IP setup (option 12 on the CFGTCP menu) we could get immediate responses to a request for a server name. But as soon as we added Domain Information such as ‘shield3.shield.local’ the response time would instantly creep back up to over 8 seconds. Again this was not acceptable, as the environment we needed the fix for uses named virtual hosting, which always passes in a FQDN.

This is when we raised a PMR with IBM, supplied them with all of the data we had been using and asked for support. They came back with a link to a document which described the problem exactly; it only affects the operating system from V6R1 onwards. From V6R1 onwards IBM implemented the getaddrinfo() API to do IPv6 lookups first, so it always goes out to the DNS for name resolution even if an IPv4 address can be resolved from the HOST file! It only drops back to an IPv4 lookup after the IPv6 lookup has failed!

The answer in the end was very simple. We just had to code the AI_ADDRCONFIG flag in the hints passed to the getaddrinfo() request, so that it only does an IPv6 lookup if an IPv6 address has been configured on the system (the loopback address ::1 is not considered a configured IPv6 address). Now we see immediate responses from the API and everything works as it should, even with the Domain Information configured.
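As a rough sketch of that change (our own illustration, not the actual Easycom or DR4i code), the flag goes into the hints structure that is passed as the third parameter of getaddrinfo(), where the test program earlier passed NULL. The helper names init_addrconfig_hints() and resolve_addrconfig() are hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper: set up hints so that getaddrinfo() only queries
 * an address family that has an address configured on this system. */
void init_addrconfig_hints(struct addrinfo *hints)
{
    memset(hints, 0, sizeof *hints);
    hints->ai_family = AF_UNSPEC;    /* still allow IPv4 and IPv6 */
    hints->ai_flags = AI_ADDRCONFIG; /* skip families with no configured address */
}

/* Hypothetical wrapper: resolve host using the hints above. Returns the
 * getaddrinfo() error code; on success the caller must release *result
 * with freeaddrinfo(). */
int resolve_addrconfig(const char *host, struct addrinfo **result)
{
    struct addrinfo hints;

    init_addrconfig_hints(&hints);
    return getaddrinfo(host, NULL, &hints, result);
}
```

In the test program this amounts to replacing getaddrinfo(argv[1], NULL, NULL, &result) with resolve_addrconfig(argv[1], &result); the rest of the loop stays as it was.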

If you are seeing a dramatic slowdown in your TCP/IP connections after migrating to V6R1 and you or your application vendor are using the getaddrinfo() API, you may want to consider the above. The Easycom connection routines are affected at the moment, but a fix is being developed to resolve the issue.

Chris…
