For the past 25 years, most (probably all) computers have performed some operations in anticipation of instruction processing proceeding in certain possible ways. These operations include prefetching sequential instructions, fetching instructions along alternate branch paths, and decoding and fetching operands for instructions before it is known for certain whether they will be executed. Any experienced designer could list many such design features. While these features take many forms, each exists to improve performance, and each has its own unique set of characteristics.
Also, during the past 25 years, most computer designs have included a cache in their storage system. Caches also have many variations in their designs, but have in common that they keep recently referenced portions of storage in a local array with a fast access time. Most fetches from the processor can be handled quickly with a reference to the local cache, but in the small percentage of cases when it does not have the data, a longer time is required as the data is fetched from main storage (or another level of cache) and loaded into the local cache.
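The cache behavior described above is commonly summarized as an average access time: the fast hit time plus the miss fraction multiplied by the extra time needed to reach main storage. The following is a minimal sketch of that standard relation. The function name and the cycle counts in the usage note are illustrative assumptions, not figures from any particular design.

```python
def average_access_time(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average cycles per fetch for a single-level cache.

    hit_time     -- cycles for a fetch satisfied by the local cache
    miss_rate    -- fraction of fetches not found in the cache (0.0 to 1.0)
    miss_penalty -- extra cycles to fetch the line from main storage
                    (or another level of cache) and load it
    All three values are hypothetical inputs supplied by the caller.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty
```

With an assumed 1-cycle hit time, a 5% miss rate, and a 20-cycle miss penalty, `average_access_time(1, 0.05, 20)` comes to about 2 cycles per fetch, so even a small percentage of misses can double the average.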
The interaction of these mechanisms is not necessarily a simple one in which the benefits are additive. In particular, consider the interaction between fetching instructions and/or operands along a path which may not be taken (commonly called hedge fetches) and the loading of lines into the cache which may result from those fetches.
Hedge fetches pose a design problem. They can have a significant performance benefit if the processor actually proceeds along that path, because the prefetched data is in the processor ready for immediate use, and there is no delay while it is fetched. The disadvantage is that a hedge fetch can cause some delay in processing, and if the fetched data is not used, that time is wasted. The amount of time used to do a hedge fetch can vary widely. In the best situation, it occurs during a cycle when no other fetch is wanted and the data is in the cache; then there is no penalty. A modest penalty occurs if the data is in the cache, but some other fetch is delayed because of the hedge fetch. In that case the penalty is whatever time is associated with delaying the other fetch, probably a cycle or two. A large penalty occurs if the data is not in the cache, since a cache line fetch makes main storage (or the next level cache) busy for some time, and if data needs to be loaded into the cache, this makes the cache busy while the data is being loaded.
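The three penalty cases just described can be sketched as a small classification. This is only an illustration of the text above: the names `Penalty` and `classify_hedge_fetch` are hypothetical, and the two boolean inputs stand in for conditions that real hardware would determine at fetch time.

```python
from enum import Enum


class Penalty(Enum):
    NONE = "none"      # data in cache, no other fetch wanted that cycle
    MODEST = "modest"  # data in cache, but another fetch is delayed
    LARGE = "large"    # data not in cache: a line fetch is required


def classify_hedge_fetch(in_cache: bool, delays_other_fetch: bool) -> Penalty:
    """Classify the cost of one hedge fetch (hypothetical helper)."""
    if not in_cache:
        # A miss ties up main storage (or the next cache level) and then
        # the cache itself while the line is loaded, whether or not any
        # other fetch was waiting.
        return Penalty.LARGE
    if delays_other_fetch:
        # The cost is whatever delaying the other fetch costs,
        # probably a cycle or two.
        return Penalty.MODEST
    return Penalty.NONE
```

For example, `classify_hedge_fetch(in_cache=True, delays_other_fetch=False)` gives `Penalty.NONE`, the best situation. In every case the cost is wasted only if the processor does not proceed along the hedged path.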
Hedge fetches pose a further problem from the cache point of view. Caches work because programs make repeated fetches in the same area of storage; by loading data into the cache at the first reference to an area, the cache makes this data available for subsequent fetches to the same area. If data that is not needed is brought into the cache, performance is degraded both because of the time required to bring it in and because it replaces other data which may still be needed. So how should hedge fetches be handled? Some hedge fetches to an area will probably be followed soon by fetches to the same area that are really needed, and others will not. At the time a hedge fetch is made it is unclear which type it will be and, in general, it is difficult to say what percentage of hedge fetches fall into each category.
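The replacement effect described above can be illustrated with a toy simulation. The sketch below assumes a tiny fully associative cache with least-recently-used replacement; the text does not specify a replacement policy, so that choice, the class name `LRUCache`, the helper `count_misses`, and the line labels are all assumptions made for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache with LRU replacement (an assumption)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.lines = OrderedDict()  # oldest entry first
        self.misses = 0

    def fetch(self, line: str) -> bool:
        """Reference a cache line; return True on a hit."""
        if line in self.lines:
            self.lines.move_to_end(line)  # mark as most recently used
            return True
        self.misses += 1
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict least recently used
        self.lines[line] = True
        return False


def count_misses(trace, capacity: int = 2) -> int:
    """Run a sequence of line references and return the miss count."""
    cache = LRUCache(capacity)
    for line in trace:
        cache.fetch(line)
    return cache.misses


# Needed fetches only: lines A and B are loaded once and then reused.
without_hedge = count_misses(["A", "B", "A", "B"])
# Same program, with one unused hedge fetch to line X in the middle.
with_hedge = count_misses(["A", "B", "X", "A", "B"])
```

In this two-line cache, the run without the hedge fetch misses twice, while the run with it misses five times: once for the unused line X itself, and twice more because X displaced lines that were still needed. Had the later fetches gone to X instead, the same hedge fetch would have saved a miss, which is why the outcome is hard to predict at the time the fetch is made.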
The dynamics of these mechanisms are sufficiently complicated that the net effect on performance is not easy to analyze. The effect will differ from design to design, depending on the particular details of each. It is not possible to generalize that one particular design, or combination of design features, is always the right one, but following the teaching of the detailed description will improve handling of selected hedge fetches.