This is a place where I hope to post some results from experimenting with tuning my builds (via CFLAGS, CXXFLAGS) to (perhaps) be faster and less vulnerable to certain attacks, and to form an idea of the impact on my build times. They are specific to this machine (an i7 haswell) using my normal desktop build, and are specific to the package versions shown.

The first set of files is really there to establish my baseline:

tuning-1-packages-and-notes.txt
-------------------------------

For many years I have tended to pass '-O2' in my CFLAGS and CXXFLAGS, primarily to save space by not creating debug information, and in some cases (particularly when my hardware was older and less powerful) to avoid spending time optimizing at -O3. Of course, not every package accepts the CFLAGS in the environment.

I recently read various suggestions which looked interesting, although I suspect that at the end of the day any tuning to speed things up will only have minimal effects.

A year or so ago I started to add '-march=native' to see if it would bring any benefits. I expect it mostly does, but they are small and I'm not sure that it is worth recommending this. However, it does show clearly that a package is using my flags.

I read a bit further (LTO looks interesting, but I'm not sure if I'll have the time or the patience to try it in an LFS context), and I was also reminded of various hardening options, most of which tend to have a runtime penalty.

As a first step, it became clear that I ought to try to adjust every package I build so that it will use the CFLAGS/CXXFLAGS I choose to pass to it. So, I've now run a few builds to check that my flags were being used, or to adjust things to try to get them used, and also to confirm what flags the packages used if left to their own devices.

This list of packages shows which packages I build, and in what order.
I will often build some packages differently from the books, and in BLFS I normally only run tests for a few packages (but for all the perl modules I build) and usually skip adding documentation.

Since finding out that cmake and meson release builds use -O3, I have changed my own builds to do that. I have also decided that where a package uses -O3 by default, I should do the same. So, for many packages I'm now either stripping out -O options from my flags (where they get added after any defaults), or changing any -O? in my flags to -O3 where my flags replace the defaults. This leaves me with the option of trying -Os in the future, or even -O1 for general packages if a machine seems underpowered.

My initial conclusions re using my own flags:

(i) My builds are smaller. This definitely helps my availability of space for backups.

(ii) In most cases, forcing my optimizations (i.e. -O2 -march=native) makes no measurable difference except to compile times, which are typically a little quicker (and that might be just the smaller files from the lack of debug symbols).

(iii) For programs which use rust, or which mix rust and C/C++, my impression is that using the rust equivalent of -march slows down the build and seems to slow down the runtime. Unfortunately, my experience building rustc and firefox is that build times vary unrepeatably.

(iv) Anything which uses qmake basically uses the flags passed to qt itself (qtwebengine has some extras for hardening).

Rather than repeat the list of packages, I have now updated this for new notes from Stage 2 ("Cheap Hardening").

tuning-notes-1.txt
------------------

These are my notes on what I had to do to various packages. I have made these into a separate file so that you can open them in a separate browser tab while reading the main details.

Replaced by tuning-notes-2.txt and then by tuning-notes-2A.txt.
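
The two flag adjustments described under tuning-1-packages-and-notes.txt (stripping -O options where my flags get added after a package's defaults, or changing any -O? to -O3 where my flags replace the defaults) could be sketched in shell roughly as follows. The helper names strip_opt and force_o3 are hypothetical and are not taken from the scripts or notes on this page; this is a minimal sketch, and what an individual package needs may well differ.

```shell
# Hypothetical helpers for illustration only.

# strip_opt FLAGS: print FLAGS with every -O option removed
# (-O, -O0..-O9, -Os, -Og, -Ofast); everything else is kept in order.
strip_opt() {
    out=""
    for flag in $1; do
        case "$flag" in
            -O|-O[0-9sg]|-Ofast) ;;              # drop optimization levels
            *) out="${out:+$out }$flag" ;;       # keep any other flag
        esac
    done
    printf '%s\n' "$out"
}

# force_o3 FLAGS: print FLAGS with any existing -O option replaced by -O3.
force_o3() {
    rest=$(strip_opt "$1")
    printf '%s\n' "-O3${rest:+ $rest}"
}

# Case 1: the package appends my flags after its own defaults,
# so leave the optimization level to the package.
CFLAGS=$(strip_opt "-O2 -march=native")    # -march=native

# Case 2: my flags replace the package defaults.
CXXFLAGS=$(force_o3 "-O2 -march=native")   # -O3 -march=native
```

The flag string is split on whitespace, so this assumes plain space-separated flags with no quoting or glob characters in them.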

tuning-2-cheap-hardening.txt
----------------------------

For the next stages, I decided to try some hardening flags which are described as cheap (i.e. low runtime cost). This file summarises my experience, and offers a view on the runtime, or compile-time, cost.

tuning-notes-2.txt
------------------

Updated to include some new notes (53-55). Replaced by tuning-notes-2A.txt.

tuning-3-mtune-not-march.txt
----------------------------

Exploring the benefits of -march=native on my haswell, by comparing builds using -mtune=native instead (both with the cheap hardening and then without).

tuning-notes-2A.txt
-------------------

I've updated the note on rustc and firefox to reflect some experimental builds I did using -Copt-level=3 (and -O3) instead of the defaults.

tuning-notes-2B.txt
-------------------

Updated note 33 re firefox: if using an optimization level other than -O2, use --disable-optimize to prevent -O2 getting added afterwards.

The story of how I found that was odd: I started to look at the possible details for using -flto and searched re LTO for firefox. That pointed me to an lfs-dev thread from November 2014 started by xinglp, where a response from William Harrington pointed to Honza Hubička's blog entry from April 2014:

http://hubicka.blogspot.com/2014/04/linktime-optimization-in-gcc-2-firefox.html

and within that he mentioned using --disable-optimize to get his own CFLAGS and CXXFLAGS working. My thanks to all concerned.

tuning-4-alignment-tests.txt
----------------------------

Looking to see if -falign-functions=32 and -malign-data=cacheline would help. TL;DR: case not proven.

tuning-5-O3.txt
---------------

The final set of experiments: a build using -O3 throughout, then another with binutils and gcc detuned to -O2 (not a clever idea), and some other changes. Then back to the -O3 build to recompile texlive detuned to -O2. In general -O3 seems beneficial, but there is an awful lot of noise in my runtime measurements.
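
Given that noise, a single run says very little, and comparisons only start to mean something over repeated runs. A minimal sketch of that kind of measurement, assuming bash (for its 'time' keyword and TIMEFORMAT) and awk: run a short command several times, average the wall-clock times, and express a new build's time as a percentage change from a baseline, so that a negative figure means faster. The helper names time_runs, mean and pct_change are hypothetical, and these are not the scripts shipped in scripts/.

```shell
# Hypothetical helpers for illustration only; time_runs needs bash.

# time_runs N CMD...: run CMD N times, printing the elapsed wall-clock
# seconds of each run on its own line. The command's own output is discarded.
time_runs() {
    n=$1; shift
    i=0
    while [ "$i" -lt "$n" ]; do
        # bash's 'time' reports on stderr; %R keeps only the real time
        ( TIMEFORMAT=%R; { time "$@" >/dev/null 2>&1; } 2>&1 )
        i=$((i + 1))
    done
}

# mean: average the numbers read from stdin, one per line.
mean() {
    awk '{ s += $1 } END { if (NR) printf "%.3f\n", s / NR }'
}

# pct_change BASE NEW: percentage change from BASE to NEW.
# Negative means NEW is faster (smaller) than BASE.
pct_change() {
    awk -v b="$1" -v n="$2" 'BEGIN { printf "%.1f\n", (n - b) * 100 / b }'
}

# Example (the command name is a placeholder):
#   base=$(time_runs 5 ./short-test | mean)
#   pct_change 10 9     # prints -10.0, i.e. 10% faster
```

The mean is used here only because it is the simplest thing to show; with very noisy timings a median, or simply more runs, may be a better choice.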

scripts/ and fftw/
------------------

My scripts for calculating the runtime tests, and also my hacked fftw example code, are included. The media files I used are not shareable.

desktop-runtime-comparisons.ods
-------------------------------

Having discovered that the only semi-reliable way to measure runtime changes was by repeatedly running a shortish test which I could instrument with bash's 'time' command, I provided this with the -2- files.

During the tuning-3 builds I noticed that some *compiles* such as LibRaw were faster using -mtune=native instead of -march (and that 'sox' runtimes were faster like that), so I added an extra test to use LibRaw from ImageMagick. The runtime was slower, as I'd hoped. I also changed the labels for the systems, and reworked the percentage calculations to allow for negative (i.e. faster) variations.

Updated again after the alignment tests, which I've inserted as 'E'.

Updated for the -O3 tests as 'F'; the tests for kerneldocs, biblatex and luaxindy were then rerun with texlive built using -O2 and labelled as F'. The tests for the build with binutils and gcc using -O2 and then -O3 for everything else have not been included in the results: they were slooow.

Original versions: 2019-05-31.
Errata-1.txt added 2019-06-04.
Updated for -2- files 2019-06-27.
Updated for -2A notes, tuning-3, and updated version of desktop-runtime-comparisons.ods 2019-07-06.
tuning-4-alignment-tests.txt added 2019-07-09.
Updated for -2B notes, tuning-5-O3.txt, updated version of desktop-runtime-comparisons.ods, and test sources 2019-07-23.