The growing volume of available information increases the need for effective information retrieval (IR) techniques. Specifically, this thesis is concerned with the retrieval of textual information from large and heterogeneous collections. One of the most critical problems impeding the performance of retrieval systems is the gap between the way people think about information and the natural-language form of textual documents. Bridging this gap requires translating text documents into semantic representations. For large text collections, the extraction of semantic representations has to be automated, since manual effort and domain-specific resources are impractical at that scale. There are four fundamental types of artificial (i.e., automatically extracted) semantic units, which are the building blocks of IR representations: Tokens, Composite Concepts, Synonym Concepts, and Topics. This PhD thesis explores the relationships between these representations and the performance of retrieval systems.