This is a place where I hope to post some results from experimenting with
tuning my builds (via CFLAGS, CXXFLAGS) to (perhaps) be faster and less
vulnerable to certain attacks, and to form an idea of the impact on my
build times.  They are specific to this machine (an i7 haswell) using my
normal desktop build, and are specific to the package versions shown.

The first set of files is really to establish my baseline:


tuning-1-packages-and-notes.txt
-------------------------------

For many years I have tended to pass '-O2' in my CFLAGS and CXXFLAGS,
primarily to save space by not creating debug information, and in some
cases (particularly when my hardware was older and less powerful) to
not spend time optimizing as -O3.  Of course, not every package accepts
the CFLAGS in the environment.  I recently read various suggestions
which looked interesting, although I suspect that at the end of the day
any tuning to speed things up will only have minimal effects.

A year or so ago I started to add '-march=native' to see if it would
bring any benefits - I expect it mostly does, but they are small and
I'm not sure that it is worth recommending this.  However, it does show
clearly that a package is using my flags.

I read a bit further (LTO looks interesting, but I'm not sure if I'll
have the time or the patience to try it in an LFS context), and I also
was reminded of various hardening options - most of which tend to have
a runtime penalty.

As a first step, it became clear that I ought to try to adjust every
package I build so that it will use the CFLAGS/CXXFLAGS I choose to
pass to it.  So, I've now run a few builds to check that my flags were
being used, or to adjust things to try to get them used, and also to
confirm what flags the packages used if left to their own devices.
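
One way to check whether a package picked up the flags is to search a
verbose build log for a marker flag such as -march=native.  The sketch
below is only an illustration: the function name count_marker and the
log file name are hypothetical, and it assumes the log contains the full
compiler command lines (for example from 'make V=1' with automake
silent rules, 'make VERBOSE=1' with cmake makefiles, or 'ninja -v').

```shell
# count_marker LOG [MARKER]
# Hypothetical helper: report how many compile commands in a build log
# contain a marker flag (default: -march=native).
count_marker() {
    local log="$1" marker="${2:--march=native}"
    local pattern total with
    # Lines that look like a gcc/g++/cc/c++ invocation with -c
    pattern='(^|[ /])(gcc|g\+\+|cc|c\+\+) (.* )?-c '
    # grep -c prints 0 but exits non-zero when nothing matches
    total=$(grep -cE "$pattern" "$log" || true)
    with=$(grep -E "$pattern" "$log" | grep -cF -- "$marker" || true)
    printf '%s of %s compile lines contain %s\n' "$with" "$total" "$marker"
}

# Example use (the log name is an assumption):
#   make V=1 2>&1 | tee build.log
#   count_marker build.log
```

A result well below the total suggests that at least part of the
package is ignoring, or overriding, the flags in the environment.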

This list of packages shows which packages I build, and in what order.
I will often build some packages differently from the books, and in
BLFS I only normally run tests for a few packages (but for all the perl
modules I build) and usually ignore adding documentation.

Since finding out that cmake and meson release builds use -O3, I have
changed my own builds to do that.  I have also decided that where a
package uses -O3 by default, I should do the same.  So, for many
packages I'm now either stripping out -O options from my flags (where
they get added after any defaults), or changing any -O? in my flags
to -O3 where my flags replace the defaults.  This leaves me with the
option of trying -Os in the future, or even -O1 for general packages
if a machine seems underpowered.
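
The two adjustments described above (dropping any -O option, or
forcing a particular level) could be sketched as small shell functions.
The names strip_opt and force_opt are hypothetical and this is only an
illustration; it assumes the flags are a simple space-separated string
with no quoting or glob characters.

```shell
# strip_opt FLAGS
# Remove every -O level so that a package's own default survives.
strip_opt() {
    local out="" f
    for f in $1; do                 # deliberate word splitting
        case "$f" in
            -O|-O[0-3sgz]|-Ofast) ;;            # drop optimization levels
            *) out="${out:+$out }$f" ;;         # keep everything else
        esac
    done
    printf '%s\n' "$out"
}

# force_opt LEVEL FLAGS
# Replace whatever -O level is present with LEVEL (e.g. -O3).
force_opt() {
    local level="$1" rest
    rest=$(strip_opt "$2")
    printf '%s\n' "${level}${rest:+ $rest}"
}

# Example use:
#   CFLAGS=$(strip_opt "$CFLAGS")       # flags get added after defaults
#   CFLAGS=$(force_opt -O3 "$CFLAGS")   # flags replace the defaults
```

Changing the first argument of force_opt to -Os or -O1 would cover the
future options mentioned above.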

My initial conclusions re using my own flags:

(i) My builds are smaller.  This definitely helps with the space
    available for backups.

(ii) In most cases, forcing my optimizations (i.e. -O2 -march=native)
    makes no measurable difference except to compile times which are
    typically a little quicker (and that might be just the smaller
    files from the lack of debug symbols).

(iii) For programs which use rust, or which mix rust and C/C++, my
    impression is that using the rust equivalent of -march slows down
    the build and seems to slow down the runtime.  Unfortunately, my
    experience building rustc and firefox is that build times vary
    unrepeatably.

(iv) Anything which uses qmake basically uses the flags passed to qt
    itself (qtwebengine has some extras for hardening).

Rather than repeat the list of packages, I have now updated this for
new notes from Stage 2 ("Cheap Hardening").


tuning-notes-1.txt
------------------

These are my notes on what I had to do to various packages.  I have made
these into a separate file so that you can open them in a separate
browser tab while reading the main details.

Replaced by tuning-notes-2.txt and then by tuning-notes-2A.txt.


tuning-2-cheap-hardening.txt
----------------------------

For the next stages, I decided to try some hardening flags which are
described as cheap (i.e. low runtime cost).  This file summarises my
experience, and offers a view on the runtime, or compile-time, cost.


tuning-notes-2.txt
------------------

Updated to include some new notes (53-55).

Replaced by tuning-notes-2A.txt


tuning-3-mtune-not-march.txt
----------------------------

Exploring the benefits of -march=native on my haswell, by comparing
builds using -mtune=native instead (both with the cheap hardening
and then without).


tuning-notes-2A.txt
-------------------

I've updated the note on rustc and firefox to reflect some
experimental builds I did using -Copt-level=3 (and -O3) instead of
the defaults.


tuning-notes-2B.txt
-------------------

Update note 33 re firefox: If using an optimization level other than
-O2, use --disable-optimize to prevent -O2 getting added afterwards.

The story of how I found that was odd: I started to look at the
possible details for using -flto and searched re LTO for firefox.  That
pointed me to an lfs-dev thread from November 2014 started by xinglp,
where a response from William Harrington pointed to Honza Hubička's
blog entry from April 2014:
http://hubicka.blogspot.com/2014/04/linktime-optimization-in-gcc-2-firefox.html
and within that he mentioned using --disable-optimize to get his own
CFLAGS and CXXFLAGS working.  My thanks to all concerned.
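
As a rough illustration of the note on --disable-optimize, a mozconfig
fragment might look like the following.  The flag values are
assumptions chosen only for the example, and a real mozconfig needs
many more options than are shown here.

```shell
# Sketch only: write a minimal mozconfig fragment in the current directory.
# The flag values are assumptions for illustration, not a recommendation.
cat > mozconfig << "EOF"
# Example flags to pass to the firefox build
export CFLAGS="-O3 -march=native"
export CXXFLAGS="-O3 -march=native"

# Prevent -O2 from being added after the flags above
ac_add_options --disable-optimize
EOF
```

Checking a verbose build log afterwards is still the only way to be
sure which optimization level was really used.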

tuning-4-alignment-tests.txt
----------------------------

Looking to see if -falign-functions=32 and -malign-data=cacheline
would help.  TL;DR: case not proven.


tuning-5-O3.txt
---------------

The final set of experiments: a build using -O3 throughout, then
another with binutils and gcc detuned to -O2 (not a clever idea),
and some other changes.  Then back to the -O3 build to recompile
texlive detuned to -O2.  In general -O3 seems beneficial, but
there is an awful lot of noise in my runtime measurements.


scripts/ and fftw/
------------------

My scripts for calculating the runtime tests, and also my hacked
fftw example code, are included.  The media files I used are not
shareable.


desktop-runtime-comparisons.ods
-------------------------------

Having discovered that the only semi-reliable way to measure runtime
changes was by repeatedly running a shortish test which I could
instrument with bash's 'time' command, I provided this with the -2-
files.  During the tuning-3 builds I noticed that some *compiles*
such as LibRaw were faster using -mtune=native instead of -march
(and that 'sox' runtimes were faster like that), so I added an extra
test to use LibRaw from ImageMagick.  The runtime was slower, as I'd
hoped.

I also changed the labels for the systems, and reworked the percentage
calculations to allow for negative (i.e. faster) variations.
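
The general approach (repeat a short test under bash's 'time', then
express the change as a signed percentage) could be sketched as below.
The function names mean_time and pct_change are hypothetical, and this
is not the spreadsheet's actual formula, only an illustration of a
calculation that allows negative (faster) results.

```shell
# mean_time N CMD [ARGS...]
# Run CMD N times under bash's 'time' keyword and print the mean
# elapsed (real) time in seconds.  Requires bash.
mean_time() {
    local n="$1" i=0 t total=0
    local TIMEFORMAT=%R LC_ALL=C    # report only elapsed seconds
    shift
    while [ "$i" -lt "$n" ]; do
        # The command's own output is discarded; the outer 2>&1
        # captures what 'time' reports.
        t=$( { time "$@" > /dev/null 2>&1 ; } 2>&1 )
        total=$(awk -v a="$total" -v b="$t" 'BEGIN { print a + b }')
        i=$((i + 1))
    done
    awk -v s="$total" -v n="$n" 'BEGIN { printf "%.3f\n", s / n }'
}

# pct_change BASE NEW
# Signed percentage difference of NEW relative to BASE:
# negative means faster, positive means slower.
pct_change() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%+.1f\n", (new - base) * 100 / base }'
}

# Example use (the command and input file are placeholders):
#   a=$(mean_time 10 some-command input-file)   # on the baseline system
#   b=$(mean_time 10 some-command input-file)   # on the tuned system
#   pct_change "$a" "$b"
```

With a baseline of 10.0 seconds and a new time of 9.0 seconds,
pct_change reports -10.0.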

Updated again after the alignment tests, which I've inserted as 'E'.

Updated for the -O3 tests as 'F', and the tests for kerneldocs,
biblatex and luaxindy were then rerun with texlive built using -O2 and
labelled as F'.

The tests for the build with binutils and gcc using -O2 and then -O3
for everything else have not been included in the results because they
were slooow.


Original versions : 2019-05-31.

Errata-1.txt added 2019-06-04.

Updated for -2- files 2019-06-27.

Updated for -2A notes, tuning-3, and updated version of
desktop-runtime-comparisons.ods 2019-07-06.

tuning-4-alignment-tests.txt added 2019-07-09.

Updated for -2B notes, tuning-5-O3.txt, updated version of
desktop-runtime-comparisons.ods, and test sources 2019-07-23.