This work makes several contributions in the context of shared-memory multiprocessors. The first is a characterization of the data access patterns of parallel programs, together with a new performance model and a hardware prefetch mechanism for bus-based multiprocessor systems. The second is the design and evaluation of architectural support for parallel reductions. The proposed technique uses the processors' caches as temporary storage in which each processor accumulates its partial results. As cache lines are displaced, their values are combined with the value in the corresponding shared memory location. The required architectural changes are mostly confined to the directory controllers. The third contribution concerns the buffering of speculative state in thread-level speculation. This work presents a novel taxonomy of the approaches to handling speculative state, classified by their support for multiple tasks and versions and by their main-memory update policy. Finally, for one class of approaches, this work proposes an effective software scheme for buffering multi-version speculative state.
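
The cache-based reduction described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design evaluated in this work: the names `ReductionCache` and `simulate` are hypothetical, the reduction operator is assumed to be addition, displacement is assumed to follow an LRU order, and a final flush is assumed at the end of the reduction. The `_displace` method plays the role that the abstract attributes to the directory controllers.

```python
from collections import OrderedDict


class ReductionCache:
    """Toy model of one processor's cache holding partial reduction results."""

    def __init__(self, capacity, shared_memory):
        self.capacity = capacity      # number of lines the cache can hold
        self.shared = shared_memory   # dict: address -> reduced value
        self.lines = OrderedDict()    # address -> partial sum, oldest first

    def accumulate(self, addr, value):
        """Add `value` to the locally cached partial result for `addr`."""
        if addr in self.lines:
            self.lines.move_to_end(addr)   # mark as most recently used
        else:
            if len(self.lines) >= self.capacity:
                self._displace()           # make room for the new line
            self.lines[addr] = 0           # neutral element for addition
        self.lines[addr] += value

    def _displace(self):
        """Evict the least recently used line and combine it into memory."""
        addr, partial = self.lines.popitem(last=False)
        self.shared[addr] = self.shared.get(addr, 0) + partial

    def flush(self):
        """Displace every remaining line (assumed end-of-reduction step)."""
        while self.lines:
            self._displace()


def simulate(updates_per_processor, capacity=2):
    """Run each processor's (address, value) updates, then flush all caches."""
    shared = {}
    caches = [ReductionCache(capacity, shared) for _ in updates_per_processor]
    for cache, updates in zip(caches, updates_per_processor):
        for addr, value in updates:
            cache.accumulate(addr, value)
    for cache in caches:
        cache.flush()
    return shared
```

In this model the final contents of shared memory do not depend on when lines happen to be displaced, because addition is associative and commutative; the same property is what allows partial results to be combined lazily.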