A program (e.g., a computer application) can be partitioned into a plurality of program segments. For example, a program can be partitioned into program segments P1, P2, . . . , PN, where N is the number of program segments. Conventional computing systems can execute the program segments one after another in an enumerated order. For example, a single-processor computing system can execute P1 before executing P2, P2 before executing P3, and PN-1 before executing PN. Executing program segments in this order respects sequential semantics (e.g., a program segment with a lower enumeration order x reads from a memory location before a program segment with a higher enumeration order y writes to the memory location, where x<y).
For example, a first program segment can have an enumeration order i, and a second program segment can have an enumeration order j, where i<j. The first program segment and the second program segment can be executed in parallel without violating sequential semantics if the program segments do not access the same memory locations. Furthermore, sequential semantics is not violated if the first program segment does not write to a memory location after the second program segment reads from the memory location.
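The two conditions above can be illustrated with a minimal sketch. It assumes each program segment's accesses are available as sets of addresses or as a time-ordered trace of (segment, operation, address) tuples. The function names and the trace format are hypothetical and chosen only for illustration.

```python
def accesses_disjoint(reads_i, writes_i, reads_j, writes_j):
    """Return True if segments i and j never access the same memory location."""
    touched_i = set(reads_i) | set(writes_i)
    touched_j = set(reads_j) | set(writes_j)
    return not (touched_i & touched_j)


def write_after_read_violation(trace, i, j):
    """Return True if segment i (i < j) writes a location after segment j read it.

    trace is a list of (segment, op, addr) tuples in observed time order,
    where op is "read" or "write" (an assumed format, not from the source).
    """
    read_by_j = set()
    for segment, op, addr in trace:
        if segment == j and op == "read":
            read_by_j.add(addr)
        elif segment == i and op == "write" and addr in read_by_j:
            return True  # segment j consumed a value that segment i later overwrote
    return False
```

In this sketch, the first check is the stricter of the two: segments whose accesses are disjoint can never trigger the second condition.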
Multiprocessor, multicore, or multithreading computing systems can execute program segments in parallel (e.g., executing program segments at substantially the same time) on a plurality of processors, processor cores, or threads. Executing program segments in parallel that were not originally designed to execute in parallel can be referred to as “speculative execution.”
Conventional compilers can partition a program into program segments by determining which program segments access the same memory locations. Due to limitations of conventional analysis methods, or because accessed memory locations are unknown at a time of compiling, many programs cannot be partitioned by conventional compilers to allow for parallel execution of the program segments.
For example, some conventional analysis methods execute write instructions of program segments at temporary memory locations. These conventional analysis methods create execution overhead associated with using the temporary memory locations (e.g., storing and moving data from the temporary memory locations). Other conventional analysis methods use a centralized data structure (a write log) to store the original data of the memory locations to which write instructions of program segments write, so that the original data may be restored. Updating the write log can cause excessive overhead, especially if the write log is implemented in software using a dedicated data structure. Furthermore, if a misspeculation occurs (e.g., when a program segment with an enumeration order i has written to a location that a program segment with an enumeration order j has already read, where i<j), program segments with an enumeration order higher (i.e., greater) than j halt and redo their executions. Halting and redoing executions causes execution overhead that can make speculative execution inefficient.
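The centralized write log described above can be sketched roughly as follows. This is an illustrative simplification, not any particular conventional implementation: memory is modeled as a Python dict, the class and method names are hypothetical, and the rollback restores every logged write rather than only those of selected program segments.

```python
_MISSING = object()  # marks a location that held no value before it was written


class WriteLog:
    """Centralized log of original values, kept so that writes can be undone."""

    def __init__(self, memory):
        self.memory = memory   # dict mapping address -> value (assumed model)
        self.entries = []      # (segment, addr, original_value), oldest first

    def write(self, segment, addr, value):
        # Every write pays for an extra log update; this is the overhead
        # that the text attributes to software write logs.
        self.entries.append((segment, addr, self.memory.get(addr, _MISSING)))
        self.memory[addr] = value

    def rollback(self):
        # Restore original values newest-first so that the oldest saved
        # value of each location is the one that remains.
        while self.entries:
            _segment, addr, original = self.entries.pop()
            if original is _MISSING:
                self.memory.pop(addr, None)
            else:
                self.memory[addr] = original
```

A short usage example: after `log = WriteLog({0: 1})`, calling `log.write(2, 0, 5)` and then `log.rollback()` leaves the memory dict equal to `{0: 1}` again.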
Furthermore, typical hardware and software implementations of conventional analysis methods use complex mechanisms and are inefficient. For example, typical software implementations create execution overhead because they execute extra instructions, and they degrade memory system performance by reducing memory locality, which can result in additional cache misses.
Some implementations monitor fixed regions of memory (e.g., a fixed range of one or more consecutive memory locations) to track whether a region has been modified. A region size that is too large can result in false determinations of violations of sequential semantics. For example, a program segment may be forced to halt and redo its execution if it accesses the same region as another program segment with a lower enumeration order, even if the two program segments never access the same location. Conversely, a region size that is too small increases the overhead of monitoring read and write instructions.
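The effect of region size on false determinations can be shown with a small sketch. It assumes integer addresses mapped to regions by integer division; the function name and parameters are hypothetical.

```python
def region_conflict(writes_low, reads_high, region_size):
    """Return True if a lower-order segment wrote into any region that a
    higher-order segment read, tracking accesses per region of region_size
    consecutive locations rather than per location."""
    written_regions = {addr // region_size for addr in writes_low}
    return any(addr // region_size in written_regions for addr in reads_high)


# The lower-order segment writes address 0; the higher-order segment reads address 3.
# The two segments never touch the same location.
coarse = region_conflict({0}, {3}, region_size=4)  # both addresses fall in region 0
fine = region_conflict({0}, {3}, region_size=1)    # each address is its own region
```

Under these assumptions, `coarse` reports a conflict that does not actually exist, while `fine` does not, at the cost of tracking four times as many regions over the same address range.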
In addition, conventional profiling methods (e.g., test executions of programs to determine properties of a program to predict gains from speculative execution of the programs) assume a single method for speculative execution. Methods for speculative execution are also referred to as “speculative methods” or “processes for speculative throughput computing.” Furthermore, conventional dependence analyzers often are not able to determine whether the program segments may be executed in parallel.