Previous Section  < Day Day Up >  Next Section

12.6. Run Application and Performance Tools

Now we can finally begin to analyze the performance characteristics of the different modes of prelink. As you just saw, prelink does not spend much time using the CPU; instead, it spends all of its time on disk I/O. Because prelink must call the kernel for disk I/O, we should be able to trace its execution using the strace performance tool. Because the quick mode of prelink does not appear to be that much faster than the standard full-run mode, we compare both runs using strace to see whether any suspicious behavior shows up.

At first, we ask strace to trace the slower full run of prelink. This is the run that creates the initial cache that is used when prelink is running in quick mode. Initially, we ask strace to show us the summary of the system calls that prelink made and see how long each took to complete. The command to do this is shown in Listing 12.7.

Listing 12.7.

[root@localhost prelink]# strace -c -o af_sum  /usr/sbin/prelink -af

....

/usr/sbin/prelink: /usr/libexec/autopackage/luau-downloader.bin: Could

not parse '/usr/libexec/autopackage/luau-downloader.bin: error while

loading shared libraries: libuau.so.2: cannot open shared object file: No

such file or directory'

...

/usr/sbin/prelink: /usr/lib/mozilla-1.6/regchrome: Could not parse

'/usr/lib/mozilla-1.6/regchrome: error while loading shared libraries:

libxpcom.so: cannot open shared object file: No such file or  directory'

...


Listing 12.7 is also a sample of prelink's output. prelink is struggling when trying to prelink some of the system executables and libraries. This information becomes valuable later, so remember it.

Listing 12.8 shows the summary output file that the strace command in Listing 12.7 generated.

Listing 12.8.

[root@localhost prelink] # cat af_sum

execve("/usr/sbin/prelink", ["/usr/sbin/prelink", "-af"], [/* 31 vars

*/]) = 0

% time     seconds  usecs/call     calls     errors syscall

------ ----------- ----------- --------- --------- ----------------

 77.87  151.249181          65   2315836            read

 11.93   23.163231          55    421593            pread

  3.59    6.976880          63    110585            pwrite

  1.70    3.294913          17    196518            mremap

  1.02    1.977743          32     61774            lstat64

  0.97    1.890977          40     47820         1  open

  0.72    1.406801         249      5639            vfork

  0.35    0.677946          11     59097            close

...

------ ----------- ----------- --------- --------- ----------------

100.00  194.230415               3351032      5650 total


As you can see in Listing 12.8, a significant amount of time is spent in the read system call. This is expected. prelink needs to figure out what shared libraries are linked into the application, and this requires that part of the executable be read in to be analyzed. The prelink documentation mentions that when generating the list of libraries that an application requires, that application is actually started by the dynamic loader in a special mode, and then the information is read from the executable using a pipe. This is why pread is also high in the profile. In contrast, we would expect the quick version to have very few of these calls.

To see how the profile of the quick version is different, we run the same strace command on the quick version of prelink. We can do that with the strace command shown in Listing 12.9.

Listing 12.9.

[root@localhost prelink]# strace -c -o aq_sum /usr/sbin/prelink -aq


Listing 12.10 shows the strace profile of prelink running in quick mode.

Listing 12.10.

[root@localhost prelink] # cat aq_sum

execve("/usr/sbin/prelink", ["/usr/sbin/prelink", "-aq"], [/* 31 vars*/]) = 0

% time     seconds  usecs/call     calls     errors syscall

------ ----------- ----------- --------- --------- ----------------

 47.42    3.019337          70     43397            read

 26.74    1.702584          28     59822            lstat64

 10.35    0.658760         163      4041            getdents64

  5.52    0.351326          30     11681            pread

  3.26    0.207800          21      9678         1  open

  1.99    0.126593          21      5980        10  stat64

  1.98    0.126243          12     10155            close

  0.62    0.039335         165       239            vfork

....

------ ----------- ----------- ----------- --------- ------------

100.00    6.367230                  154681        250 total


As expected, Listing 12.10 shows that the quick mode executes a significant number of lstat64 system calls. These are the system calls that return the mtime and ctime for each executable. prelink looks in its cache to compare the saved mtime and ctime with the executable's current mtime and ctime. If the executable has changed, it starts to prelink it; if it has not changed, it continues to the next executable. The fact that prelink calls lstat64 a large number of times is a good sign, because that means that prelink's cache is working. However, the fact that prelink still calls read a large number of times is a bad sign. The cache should remember that the executables have already been prelinked and should not try to analyze them. We have to figure out why prelink is trying to analyze them. The easiest way to do that is to run strace in normal mode. strace will show all the system calls that prelink makes, and will hopefully clarify which files are being read and explain why read is being called so often. Listing 12.11 shows the command strace used on the quick prelink.

Listing 12.11.

[root@localhost prelink]# strace -o aq_run /usr/sbin/prelink -aq


The output of strace is a 14MB text file, aq_run. Browsing through it shows that prelink uses lstat64 to check many of the libraries and executables. However, it reveals a few different types cases where read() is used. The first, shown in Listing 12.12, is where prelink reads a file that is a shell script. Because this shell script is not a binary ELF file, it can't be prelinked.

These shell scripts were unchanged since the original full-system prelink was run, so it would be nice if prelink's cache would record the fact that this file cannot be prelinked. If the ctime and mtime do not change, prelink should not even try to read them. (If it was a shell script during the last full prelink and we haven't touched it, it still cannot be prelinked.)

Listing 12.12.

[root@localhost prelink] # cat aq_run



...



open("/bin/unicode_stop", O_RDONLY|O_LARGEFILE) = 5



read(5, "#!/bin/sh\n# stop u", 18)        = 18

close(5)                                  = 0



....



open("/bin/unicode_start", O_RDONLY|O_LARGEFILE) = 5



read(5, "#!/bin/bash\n# Enab", 18)       = 18



close(5)                                 = 0



....


Next, in Listing 12.13, we watch as prelink tries to operate on a statically linked application. Because this application does not rely on any shared libraries, it makes no sense to try to prelink it. The initial run of prelink should have caught the fact that this application could not be prelinked and stored that information in the prelink cache. In quick mode, it should not have even tried to prelink this binary.

Listing 12.13.

[root@localhost prelink] # cat aq_run



...



open("/bin/ash.static", O_RDONLY| O_LARGEFILE) = 5





read(5, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0", 18) = 18



fcntl64(5, F_GETFL)                     = 0x8000 (flags

O_RDONLY| O_LARGEFILE)



pread(5, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0", 16, 0) = 16



pread(5, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0", 16, 0) = 16



pread(5, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0\3\0\1\0\0\0\0\201\4"...,

52, 0) = 52



pread(5, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0\3\0\1\0\0\0\0\201\4"...,

52, 0) = 52



pread(5, "\1\0\0\0\0\0\0\0\0\200\4\10\0\200\4\10\320d\7\0\320d\7"...,

128, 52) = 128



pread(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,

920, 488632) = 920



close(5)



...


Finally, in Listing 12.14, we see prelink reading a binary that it had trouble prelinking in the original full system run. We saw an error regarding this binary in the original prelink output. When it starts to read this file, it pulls in other libraries and begins to operate on each of those and their dependencies. This triggers an enormous amount of reading.

Listing 12.14.

[root@localhost prelink] # cat aq_run



...



lstat64("/usr/lib/mozilla-1.6/regchrome", {st_mode=S_IFREG|0755, st_size=14444,



...}) = 0



open("/usr/lib/mozilla-1.6/regchrome", O_RDONLY|O_LARGEFILE) = 6



read(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0", 18) = 18



fcntl64(6, F_GETFL)                     = 0x8000 (flags O_RDONLY|O_LARGEFILE)



pread(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0", 16, 0) = 16



...



open("/usr/lib/mozilla-1.6/libldap50.so", O_RDONLY|O_LARGEFILE) = 6



read(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0", 18) = 18



close(6)                                 = 0



lstat64("/usr/lib/mozilla-1.6/libgtkxtbin.so", {st_mode=S_IFREG|0755, st_size=14268, ...}) = 0



open("/usr/lib/mozilla-1.6/libgtkxtbin.so", O_RDONLY|O_LARGEFILE)  = 6



read(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0", 18) = 18



close(6)                                = 0



lstat64("/usr/lib/mozilla-1.6/libjsj.so", {st_mode=S_IFREG|0755, st_size=96752,



...}) = 0



open("/usr/lib/mozilla-1.6/libjsj.so", O_RDONLY|O_LARGEFILE) = 6



read(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0", 18) = 18



close(6)                                 = 0



lstat64("/usr/lib/mozilla-1.6/mozilla-xremote-client", {st_mode=S_IFREG|0755,

 st_size=12896, ...}) = 0



lstat64("/usr/lib/mozilla-1.6/regxpcom", {st_mode=S_IFREG|0755, st_size=55144, ...}) = 0



lstat64("/usr/lib/mozilla-1.6/libgkgfx.so", {st_mode=S_IFREG|0755, st_size=143012, ...}) = 0



open("/usr/lib/mozilla-1.6/libgkgfx.so", O_RDONLY|O_LARGEFILE) = 6



read(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0", 18) = 18



close(6)                                 = 0



...


Optimizing this case is tricky. Because this binary was not actually the problem (rather the library that it was linking to, libxpcom.so), we cannot just mark the executable as bad in the cache. However, if we store the errant library name libxpcom.so with the failing executable, it may be possible to check the times of the binary and the library, and only try to prelink again if one of them has changed.

    Previous Section  < Day Day Up >  Next Section