11.19 gethostbyname_r and gethostbyaddr_r Functions
There are two ways to make a nonre-entrant function such as gethostbyname re-entrant.
1. Instead of filling in and returning a static structure, the caller allocates the structure and the re-entrant function fills in the caller's structure. This is the technique used in going from the nonre-entrant gethostbyname to the re-entrant gethostbyname_r. But, this solution gets more complicated because not only must the caller provide the hostent structure to fill in, but this structure also points to other information: the canonical name, the array of alias pointers, the alias strings, the array of address pointers, and the addresses (e.g., Figure 11.2). The caller must provide one large buffer that is used for this additional information and the hostent structure that is filled in then contains numerous pointers into this other buffer. This adds at least three arguments to the function: a pointer to the hostent structure to fill in, a pointer to the buffer to use for all the other information, and the size of this buffer. A fourth additional argument is also required: a pointer to an integer in which an error code can be stored, since the global integer h_errno can no longer be used. (The global integer h_errno presents the same re-entrancy problem that we described with errno.) This technique is also used by getnameinfo and inet_ntop.

2. The re-entrant function calls malloc and dynamically allocates the memory. This is the technique used by getaddrinfo. The problem with this approach is that the application calling this function must also call freeaddrinfo to free the dynamic memory. If the free function is not called, a memory leak occurs: Each time the process calls the function that allocates the memory, the memory use of the process increases. If the process runs for a long time (a common trait of network servers), the memory usage just grows and grows over time.
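The two techniques can be contrasted with a minimal sketch. The helper names addr_to_text and text_to_addr are hypothetical, chosen only for this illustration. The first wraps inet_ntop, where the caller supplies the output buffer. The second wraps getaddrinfo, where the library allocates the result and the caller must release it with freeaddrinfo. The sketch assumes IPv4 and uses AI_NUMERICHOST so that no name server is consulted.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Technique 1 (hypothetical helper): the caller owns the storage.
 * inet_ntop writes into the caller's buffer, so no static data is shared.
 * Returns 0 on success, -1 on error. */
int addr_to_text(const struct in_addr *addr, char *out, size_t outlen)
{
    assert(addr != NULL && out != NULL);
    return inet_ntop(AF_INET, addr, out, (socklen_t)outlen) != NULL ? 0 : -1;
}

/* Technique 2 (hypothetical helper): the library allocates the result.
 * getaddrinfo returns dynamically allocated memory, which must be
 * released with freeaddrinfo or the process leaks on every call.
 * Returns 0 on success, -1 on error. */
int text_to_addr(const char *text, struct in_addr *out)
{
    struct addrinfo hints, *res = NULL;

    assert(text != NULL && out != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;        /* assumption: IPv4 only */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;  /* numeric string only, no DNS query */

    if (getaddrinfo(text, NULL, &hints, &res) != 0)
        return -1;                    /* nothing was allocated on failure */

    /* Copy what we need out of the allocated list before freeing it. */
    *out = ((struct sockaddr_in *)res->ai_addr)->sin_addr;
    freeaddrinfo(res);
    return 0;
}
```

A caller would declare a char array of INET_ADDRSTRLEN bytes for addr_to_text. With text_to_addr, the address is copied out before freeaddrinfo runs, so nothing allocated by the library outlives the call.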
We will now discuss the Solaris 2.x re-entrant functions for name-to-address and address-to-name resolution.
#include <netdb.h>

struct hostent *gethostbyname_r(const char *hostname, struct hostent *result, char *buf, int buflen, int *h_errnop);

struct hostent *gethostbyaddr_r(const char *addr, int len, int type, struct hostent *result, char *buf, int buflen, int *h_errnop);

Both return: non-null pointer if OK, NULL on error
Four additional arguments are required for each function. result points to a hostent structure allocated by the caller, which the function fills in. On success, this pointer is also the return value of the function.
buf is a buffer allocated by the caller and buflen is its size. This buffer will contain the canonical hostname, the alias pointers, the alias strings, the address pointers, and the actual addresses. All the pointers in the structure pointed to by result point into this buffer. How big should this buffer be? Unfortunately, all that most man pages say is something vague like, "The buffer must be large enough to hold all of the data associated with the host entry." Current implementations of gethostbyname can return up to 35 alias pointers and 35 address pointers, and internally use an 8192-byte buffer to hold alias names and addresses. So, a buffer size of 8192 bytes should be adequate.
If an error occurs, the error code is returned through the h_errnop pointer, not through the global h_errno.
Unfortunately, this problem of re-entrancy is even worse than it appears. First, there is no standard regarding re-entrancy and gethostbyname and gethostbyaddr. The POSIX specification says that gethostbyname and gethostbyaddr need not be re-entrant. Unix 98 just says that these two functions need not be thread-safe.

Second, there is no standard for the _r functions. What we have shown in this section (for example purposes) are two of the _r functions provided by Solaris 2.x. Linux provides similar _r functions, except that instead of returning the hostent as the return value of the function, the hostent is returned using a value-result parameter as the next-to-last function argument. The success of the lookup is reported both as the return value of the function and through the h_errno argument. Digital Unix 4.0 and HP-UX 10.30 have versions of these functions with different arguments. The first two arguments for gethostbyname_r are the same as the Solaris version, but the remaining three arguments for the Solaris version are combined into a new hostent_data structure (which must be allocated by the caller), and a pointer to this structure is the third and final argument. The normal functions gethostbyname and gethostbyaddr in Digital Unix 4.0 and HP-UX 10.30 are made re-entrant by using thread-specific data (Section 26.5). An interesting history of the development of the Solaris 2.x _r functions is in [Maslen 1997].

Lastly, while a re-entrant version of gethostbyname may provide safety from different threads calling it at the same time, this says nothing about the re-entrancy of the underlying resolver functions.
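As an illustration of the Linux variant described above, here is a minimal sketch that assumes the six-argument gethostbyname_r found in glibc. It will not compile against the Solaris prototype shown earlier. The helper name first_ipv4 is hypothetical, and the 8192-byte buffer follows the sizing suggested in this section. The hostent is filled in through the caller's structure and buffer, the pointer to it comes back through the next-to-last argument, and the resolver error code comes back through the last.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes gethostbyname_r */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Hypothetical helper: resolve name and write its first IPv4 address,
 * in dotted-decimal form, into the caller's out buffer.
 * Returns 0 on success, -1 on failure. On a resolver failure, *herr
 * holds the code that the global h_errno would otherwise carry. */
int first_ipv4(const char *name, char *out, size_t outlen, int *herr)
{
    struct hostent he;        /* caller-allocated structure to fill in */
    struct hostent *hp = NULL;/* value-result: set to &he on success */
    char buf[8192];           /* holds names, aliases, and addresses */
    int rc;

    assert(name != NULL && out != NULL && herr != NULL);

    /* glibc form: returns 0 on success and stores the result in hp. */
    rc = gethostbyname_r(name, &he, buf, sizeof(buf), &hp, herr);
    if (rc != 0 || hp == NULL)
        return -1;

    if (hp->h_addrtype != AF_INET || hp->h_addr_list[0] == NULL)
        return -1;

    /* h_addr_list[0] points into buf, so convert before buf goes away. */
    if (inet_ntop(AF_INET, hp->h_addr_list[0], out, (socklen_t)outlen) == NULL)
        return -1;
    return 0;
}
```

Because every pointer in the filled-in hostent refers to the local buffer, the helper converts the address to text before returning instead of handing the hostent back to its caller.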