Most applications of off-line text recognition aim to convert data from conventional media into electronic media. Such applications include the processing of cheques, security documents and forms. A document analysis system is presented that converts grey-level composite documents with complex backgrounds, watermarks and poor illumination into an electronic format suitable for efficient storage, retrieval and interpretation. The preprocessing phase of the system requires converting a paper-based document into a digital bit-map representation by optical scanning, followed by thresholding, skew detection, page segmentation and Optical Character Recognition (OCR). The system operates as a pipeline, in which each phase passes its output to the next. Because the phases are chained, the system as a whole performs well only if every phase succeeds; a failure in any phase may reduce the character recognition rate. Document Image Analysis provides an essential guide to the implementation of document analysis techniques, explaining techniques and fundamentals in a clear and concise manner.
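As a rough illustration of the pipeline idea described above, the following minimal Python sketch chains stages so that each one receives the output of the previous one, and implements only the thresholding stage. The text does not say which thresholding method the system uses; Otsu's global threshold is swapped in here purely as a familiar example. The names `otsu_threshold`, `binarize` and `run_pipeline` are hypothetical, and the sketch assumes dark text on a lighter background with grey levels in 0..255.

```python
from typing import Any, Callable, List, Sequence

Image = List[List[int]]  # grey-level image: rows of pixel values in 0..255


def otsu_threshold(image: Image) -> int:
    """Return the grey level that maximises between-class variance (Otsu)."""
    hist = [0] * 256
    for row in image:
        for px in row:
            hist[px] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0  # pixel count and grey-level sum of the dark class
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no pixels at or below t yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is at or below t; no second class left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(image: Image) -> Image:
    """Thresholding stage: 1 marks ink (dark pixels), 0 marks background."""
    t = otsu_threshold(image)
    return [[1 if px <= t else 0 for px in row] for row in image]


def run_pipeline(data: Any, stages: Sequence[Callable[[Any], Any]]) -> Any:
    """Pass the output of each stage to the next, in order."""
    for stage in stages:
        data = stage(data)
    return data


if __name__ == "__main__":
    page = [
        [200, 30, 200],
        [200, 30, 200],
    ]
    # Later stages (skew detection, page segmentation, OCR) would be
    # appended to this list; they are not sketched here.
    for row in run_pipeline(page, [binarize]):
        print(row)
```

Because every stage consumes whatever the previous stage produced, an error introduced early, such as a poor threshold on a watermarked background, is carried into every later stage. This is why the recognition rate depends on each phase succeeding.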