12.7. Simulating a Solution

The information revealed by strace shows that prelink spends a lot of time trying to open and analyze binaries that it cannot possibly prelink. The best way to test whether caching these unprelinkable binaries would improve prelink's performance is to modify prelink so that it adds them to its initial cache. Unfortunately, adding that caching code could be a complicated process that requires a good deal of knowledge about prelink's internals. An easier method is to simulate the cache by replacing all the unprelinkable binaries with a known prelinkable binary. This causes all the formerly unprelinkable binaries to be ignored when quick mode is run. This is exactly what would happen if we had a working cache, so we can use it to estimate the performance increase we would see if prelink were able to cache and ignore unprelinkable binaries.

To start the experiment, we copy all the files in /usr/bin/ to the sandbox directory. This directory includes normal binaries as well as shell scripts and libraries that cannot be prelinked. We then run prelink on the sandbox directory and tell it to create a new cache rather than rely on the system cache. This is shown in Listing 12.15.

Listing 12.15.
    /usr/sbin/prelink -C new_cache -f sandbox/

Next, in Listing 12.16, we time how long the quick mode of prelink takes to run. We had to run this several times before it gave a consistent result. (The first run warmed the cache for each of the succeeding runs.) The baseline time in Listing 12.16 is 0.983 seconds. We have to beat this time for our optimization (improving the cache) to be worth investigating.

Listing 12.16.
    time /usr/sbin/prelink -C new_cache -q sandbox/
    real 0m0.983s
    user 0m0.597s
    sys 0m0.386s

Next, in Listing 12.17, we run strace on this prelink command to record which files prelink opens in the sandbox directory.

Listing 12.17.
    strace -o strace_prelink_sandbox /usr/sbin/prelink -C new_cache -q sandbox/

Next we create a new directory, sandbox2, into which we once again copy all the binaries in the /usr/bin directory. This time, however, we overwrite every file that prelink "opened" in the preceding strace output with a known good binary, less, which can be prelinked. We copy less over the problem binaries rather than just deleting them, so that both sandboxes contain the same number of files. After we set up the second sandbox, we run the full version of prelink on this new directory using the command in Listing 12.18.

Listing 12.18.
    [root@localhost prelink]# /usr/sbin/prelink -C new_cache2 -f sandbox2/

Finally, we time the run of the quick mode and compare it to our baseline. Again, we had to run it several times, with the first run warming the cache. Listing 12.19 shows that we did, indeed, get a performance increase: the time to execute prelink dropped from ~0.98 seconds to ~0.29 seconds.

Listing 12.19.
    [root@localhost prelink]# time /usr/sbin/prelink -C new_cache2 -q sandbox2/
    real 0m0.292s
    user 0m0.158s
    sys 0m0.134s

Next, we compare the strace output of the two runs to verify that the number of reads did, in fact, decrease. Listing 12.20 shows the strace summary information from sandbox, which contained binaries that prelink could not link.

Listing 12.20.
    execve("/usr/sbin/prelink", ["/usr/sbin/prelink", "-C", "new_cache", "-q", "sandbox/"], [/* 20 vars */]) = 0
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     62.06    0.436563          48      9133           read
     13.87    0.097551          15      6504           lstat64
      6.20    0.043625          18      2363        10 stat64
      5.62    0.039543          21      1922           pread
      3.93    0.027671         374        74           vfork
      1.78    0.012515           9      1423           getcwd
      1.65    0.011594         644        18           getdents64
      1.35    0.009473          15       623         1 open
      0.90    0.006300           8       770           close
    .....
    100.00    0.703400                 24028        85 total

Listing 12.21 shows the strace summary from sandbox2, where prelink could link all the binaries.

Listing 12.21.
    execve("/usr/sbin/prelink", ["/usr/sbin/prelink", "-C", "new_cache2", "-q", "sandbox2/"], [/* 20 vars */]) = 0
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     54.29    0.088766          15      5795           lstat64
     26.53    0.043378          19      2259        10 stat64
      8.46    0.013833           8      1833           getcwd
      6.95    0.011363         631        18           getdents64
      2.50    0.004095        2048         2           write
      0.37    0.000611         611         1           rename
      0.26    0.000426          39        11         1 open
    ...
    100.00    0.163515                  9973        11 total

As the differences between Listing 12.20 and Listing 12.21 show, we have dramatically reduced the number of reads done in the directory: read, which led the first summary with 9,133 calls and 62 percent of the system call time, no longer appears among the top entries of the second, and the total number of system calls falls from 24,028 to 9,973. In addition, we have significantly reduced the time that quick mode needs to process the directory, from 0.983 to 0.292 seconds. Caching and avoiding unprelinkable executables looks like a promising optimization.
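To make the comparison of the two summaries less manual, the following is a minimal Python sketch that parses the per-syscall rows of two strace summary tables and reports how the call counts changed. The names parse_summary, call_deltas, and Row are made up for this illustration and are not part of strace or prelink. The sample text reuses a few rows from Listings 12.20 and 12.21. Because those listings are truncated, a system call that is missing from one excerpt is counted as zero, which may understate calls hidden in the elided rows.

```python
from typing import Dict, NamedTuple


class Row(NamedTuple):
    seconds: float
    calls: int
    errors: int


def parse_summary(text: str) -> Dict[str, Row]:
    """Parse the per-syscall rows of an strace summary table."""
    rows: Dict[str, Row] = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-syscall rows have 5 fields, or 6 when the errors column is filled.
        # The "total" row has no usecs/call column, so it is skipped here.
        if len(fields) not in (5, 6) or fields[-1] == "total":
            continue
        try:
            seconds, calls = float(fields[1]), int(fields[3])
            errors = int(fields[4]) if len(fields) == 6 else 0
        except ValueError:
            continue  # separator or other non-data line
        rows[fields[-1]] = Row(seconds, calls, errors)
    return rows


def call_deltas(before: Dict[str, Row], after: Dict[str, Row]) -> Dict[str, int]:
    """Return calls_before - calls_after per syscall (missing rows count as 0)."""
    deltas = {}
    for name in set(before) | set(after):
        calls_before = before[name].calls if name in before else 0
        calls_after = after[name].calls if name in after else 0
        deltas[name] = calls_before - calls_after
    return deltas


# Excerpt of Listing 12.20 (sandbox, with unprelinkable files).
SANDBOX = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 62.06    0.436563          48      9133           read
 13.87    0.097551          15      6504           lstat64
  6.20    0.043625          18      2363        10 stat64
  1.78    0.012515           9      1423           getcwd
  1.35    0.009473          15       623         1 open
100.00    0.703400                 24028        85 total
"""

# Excerpt of Listing 12.21 (sandbox2, everything prelinkable).
SANDBOX2 = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 54.29    0.088766          15      5795           lstat64
 26.53    0.043378          19      2259        10 stat64
  8.46    0.013833           8      1833           getcwd
  0.26    0.000426          39        11         1 open
100.00    0.163515                  9973        11 total
"""

before = parse_summary(SANDBOX)
after = parse_summary(SANDBOX2)

# A positive delta means fewer calls in sandbox2 than in sandbox.
for name, delta in sorted(call_deltas(before, after).items(),
                          key=lambda item: item[1], reverse=True):
    print(f"{name:<12} {delta:+d}")
```

Sorting by the size of the change puts the system calls that the simulated cache eliminated at the top, which is the same observation the text draws from reading the two listings side by side.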