Revision with unchanged content. "Present words as spoken text rather than printed text" is a common recommendation for designing multimedia instructions. The book takes a closer look at this recommendation by raising two questions: (1) How do learners distribute their visual attention during multimedia learning? (2) Which design features moderate the effects of text presentation on perception and comprehension? The results of five empirical studies suggest that spoken text is preferable only under particular constraints. The learners' viewing behavior, observed via eye tracking, revealed a general preference for printed text, which particular design features disrupt. Once learners are relieved of time-constrained presentation or of having to follow motion in a dynamic visualization, the need to split visual attention between printed text and visualizations loses much of its impact. Understanding the demands that learning material places on the learner's perception, and accounting for individual reading behavior by implementing user interaction, appears promising for advancing the design of multimedia instructions in a way that supports learners.