< Day Day Up > |
12.6. Run Application and Performance ToolsNow we can finally begin to analyze the performance characteristics of the different modes of prelink. As you just saw, prelink does not spend much time using the CPU; instead, it spends all of its time on disk I/O. Because prelink must call the kernel for disk I/O, we should be able to trace its execution using the strace performance tool. Because the quick mode of prelink does not appear to be that much faster than the standard full-run mode, we compare both runs using strace to see whether any suspicious behavior shows up. At first, we ask strace to trace the slower full run of prelink. This is the run that creates the initial cache that is used when prelink is running in quick mode. Initially, we ask strace to show us the summary of the system calls that prelink made and see how long each took to complete. The command to do this is shown in Listing 12.7. Listing 12.7.[root@localhost prelink]# strace -c -o af_sum /usr/sbin/prelink -af .... /usr/sbin/prelink: /usr/libexec/autopackage/luau-downloader.bin: Could not parse '/usr/libexec/autopackage/luau-downloader.bin: error while loading shared libraries: libuau.so.2: cannot open shared object file: No such file or directory' ... /usr/sbin/prelink: /usr/lib/mozilla-1.6/regchrome: Could not parse '/usr/lib/mozilla-1.6/regchrome: error while loading shared libraries: libxpcom.so: cannot open shared object file: No such file or directory' ... Listing 12.7 is also a sample of prelink's output. prelink is struggling when trying to prelink some of the system executables and libraries. This information becomes valuable later, so remember it. Listing 12.8 shows the summary output file that the strace command in Listing 12.7 generated. Listing 12.8.[root@localhost prelink] # cat af_sum execve("/usr/sbin/prelink", ["/usr/sbin/prelink", "-af"], [/* 31 vars */]) = 0 % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 77.87 151.249181 65 2315836 read 11.93 23.163231 55 421593 pread 3.59 6.976880 63 110585 pwrite 1.70 3.294913 17 196518 mremap 1.02 1.977743 32 61774 lstat64 0.97 1.890977 40 47820 1 open 0.72 1.406801 249 5639 vfork 0.35 0.677946 11 59097 close ... ------ ----------- ----------- --------- --------- ---------------- 100.00 194.230415 3351032 5650 total As you can see in Listing 12.8, a significant amount of time is spent in the read system call. This is expected. prelink needs to figure out what shared libraries are linked into the application, and this requires that part of the executable be read in to be analyzed. The prelink documentation mentions that when generating the list of libraries that an application requires, that application is actually started by the dynamic loader in a special mode, and then the information is read from the executable using a pipe. This is why pread is also high in the profile. In contrast, we would expect the quick version to have very few of these calls. To see how the profile of the quick version is different, we run the same strace command on the quick version of prelink. We can do that with the strace command shown in Listing 12.9. Listing 12.9.[root@localhost prelink]# strace -c -o aq_sum /usr/sbin/prelink -aq Listing 12.10 shows the strace profile of prelink running in quick mode. Listing 12.10.[root@localhost prelink] # cat aq_sum execve("/usr/sbin/prelink", ["/usr/sbin/prelink", "-aq"], [/* 31 vars*/]) = 0 % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 47.42 3.019337 70 43397 read 26.74 1.702584 28 59822 lstat64 10.35 0.658760 163 4041 getdents64 5.52 0.351326 30 11681 pread 3.26 0.207800 21 9678 1 open 1.99 0.126593 21 5980 10 stat64 1.98 0.126243 12 10155 close 0.62 0.039335 165 239 vfork .... ------ ----------- ----------- ----------- --------- ------------ 100.00 6.367230 154681 250 total As expected, Listing 12.10 shows that the quick mode executes a significant number of lstat64 system calls. These are the system calls that return the mtime and ctime for each executable. prelink looks in its cache to compare the saved mtime and ctime with the executable's current mtime and ctime. If the executable has changed, it starts to prelink it; if it has not changed, it continues to the next executable. The fact that prelink calls lstat64 a large number of times is a good sign, because that means that prelink's cache is working. However, the fact that prelink still calls read a large number of times is a bad sign. The cache should remember that the executables have already been prelinked and should not try to analyze them. We have to figure out why prelink is trying to analyze them. The easiest way to do that is to run strace in normal mode. strace will show all the system calls that prelink makes, and will hopefully clarify which files are being read and explain why read is being called so often. Listing 12.11 shows the command strace used on the quick prelink. Listing 12.11.[root@localhost prelink]# strace -o aq_run /usr/sbin/prelink -aq The output of strace is a 14MB text file, aq_run. Browsing through it shows that prelink uses lstat64 to check many of the libraries and executables. However, it reveals a few different types cases where read() is used. The first, shown in Listing 12.12, is where prelink reads a file that is a shell script. Because this shell script is not a binary ELF file, it can't be prelinked. These shell scripts were unchanged since the original full-system prelink was run, so it would be nice if prelink's cache would record the fact that this file cannot be prelinked. If the ctime and mtime do not change, prelink should not even try to read them. (If it was a shell script during the last full prelink and we haven't touched it, it still cannot be prelinked.) Listing 12.12.[root@localhost prelink] # cat aq_run ... open("/bin/unicode_stop", O_RDONLY|O_LARGEFILE) = 5 read(5, "#!/bin/sh\n# stop u", 18) = 18 close(5) = 0 .... open("/bin/unicode_start", O_RDONLY|O_LARGEFILE) = 5 read(5, "#!/bin/bash\n# Enab", 18) = 18 close(5) = 0 .... Next, in Listing 12.13, we watch as prelink tries to operate on a statically linked application. Because this application does not rely on any shared libraries, it makes no sense to try to prelink it. The initial run of prelink should have caught the fact that this application could not be prelinked and stored that information in the prelink cache. In quick mode, it should not have even tried to prelink this binary. Listing 12.13.[root@localhost prelink] # cat aq_run ... open("/bin/ash.static", O_RDONLY| O_LARGEFILE) = 5 read(5, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0", 18) = 18 fcntl64(5, F_GETFL) = 0x8000 (flags O_RDONLY| O_LARGEFILE) pread(5, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0", 16, 0) = 16 pread(5, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0", 16, 0) = 16 pread(5, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0\3\0\1\0\0\0\0\201\4"..., 52, 0) = 52 pread(5, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0\3\0\1\0\0\0\0\201\4"..., 52, 0) = 52 pread(5, "\1\0\0\0\0\0\0\0\0\200\4\10\0\200\4\10\320d\7\0\320d\7"..., 128, 52) = 128 pread(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 920, 488632) = 920 close(5) ... Finally, in Listing 12.14, we see prelink reading a binary that it had trouble prelinking in the original full system run. We saw an error regarding this binary in the original prelink output. When it starts to read this file, it pulls in other libraries and begins to operate on each of those and their dependencies. This triggers an enormous amount of reading. Listing 12.14.[root@localhost prelink] # cat aq_run ... lstat64("/usr/lib/mozilla-1.6/regchrome", {st_mode=S_IFREG|0755, st_size=14444, ...}) = 0 open("/usr/lib/mozilla-1.6/regchrome", O_RDONLY|O_LARGEFILE) = 6 read(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0", 18) = 18 fcntl64(6, F_GETFL) = 0x8000 (flags O_RDONLY|O_LARGEFILE) pread(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0", 16, 0) = 16 ... open("/usr/lib/mozilla-1.6/libldap50.so", O_RDONLY|O_LARGEFILE) = 6 read(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0", 18) = 18 close(6) = 0 lstat64("/usr/lib/mozilla-1.6/libgtkxtbin.so", {st_mode=S_IFREG|0755, st_size=14268, ...}) = 0 open("/usr/lib/mozilla-1.6/libgtkxtbin.so", O_RDONLY|O_LARGEFILE) = 6 read(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0", 18) = 18 close(6) = 0 lstat64("/usr/lib/mozilla-1.6/libjsj.so", {st_mode=S_IFREG|0755, st_size=96752, ...}) = 0 open("/usr/lib/mozilla-1.6/libjsj.so", O_RDONLY|O_LARGEFILE) = 6 read(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0", 18) = 18 close(6) = 0 lstat64("/usr/lib/mozilla-1.6/mozilla-xremote-client", {st_mode=S_IFREG|0755, st_size=12896, ...}) = 0 lstat64("/usr/lib/mozilla-1.6/regxpcom", {st_mode=S_IFREG|0755, st_size=55144, ...}) = 0 lstat64("/usr/lib/mozilla-1.6/libgkgfx.so", {st_mode=S_IFREG|0755, st_size=143012, ...}) = 0 open("/usr/lib/mozilla-1.6/libgkgfx.so", O_RDONLY|O_LARGEFILE) = 6 read(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0", 18) = 18 close(6) = 0 ... Optimizing this case is tricky. Because this binary was not actually the problem (rather the library that it was linking to, libxpcom.so), we cannot just mark the executable as bad in the cache. However, if we store the errant library name libxpcom.so with the failing executable, it may be possible to check the times of the binary and the library, and only try to prelink again if one of them has changed. |
< Day Day Up > |