The efficient, high-throughput VLSI implementation of near-optimal multiple-input multiple-output (MIMO) detectors for systems with a large number of antennas and high-order quadrature amplitude modulation (QAM) schemes has been a major challenge in the literature. To address this challenge, this book introduces a novel scalable pipelined VLSI architecture for a 4 × 4 64-QAM MIMO receiver based on K-Best lattice decoders. The key contribution is a means of expanding/visiting the intermediate nodes of the search tree on demand rather than exhaustively, combined with three types of distributed sorters operating in a pipelined structure. The combined expansion and sorting cores are able to find the K best candidates in K clock cycles. The proposed architecture has a fixed critical path that is independent of the constellation order, uses an on-demand expansion scheme and efficient distributed sorters, and is scalable to a higher number of antennas and higher constellation orders. Fabricated in 0.13 µm CMOS, it operates at a significantly higher throughput than currently reported schemes.
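The idea of on-demand expansion can be illustrated with a short software sketch. It is not the hardware architecture described here: a binary heap stands in for the distributed sorters, and the sketch assumes a real-valued system model y = R s + n with an upper-triangular R (for example, after QR decomposition), so that a 4 × 4 64-QAM system would become eight levels of 8-PAM symbols. The names `nearest_first`, `expand_level`, and `k_best_detect` are hypothetical and chosen for illustration only. At each tree level every surviving parent offers only its best child; each time a child is selected, its parent supplies its next-best child, so K selections yield the K best candidates without enumerating all children.

```python
import bisect
import heapq

# Assumption: 8-PAM alphabet, i.e. one real dimension of 64-QAM.
PAM8 = [-7, -5, -3, -1, 1, 3, 5, 7]


def nearest_first(center, alphabet):
    """Lazily yield the points of a sorted alphabet, closest to `center` first."""
    right = bisect.bisect_left(alphabet, center)
    left = right - 1
    while left >= 0 or right < len(alphabet):
        take_left = right >= len(alphabet) or (
            left >= 0 and center - alphabet[left] <= alphabet[right] - center
        )
        if take_left:
            yield alphabet[left]
            left -= 1
        else:
            yield alphabet[right]
            right += 1


def expand_level(parents, R, y, level, K, alphabet):
    """Select the K best children of `parents` at one tree level.

    `parents` is a list of (metric, symbols) pairs, where `symbols` holds the
    decisions already made for indices level+1 .. n-1.
    """
    heap, state = [], []
    for idx, (metric, partial) in enumerate(parents):
        interference = sum(R[level][level + 1 + j] * s for j, s in enumerate(partial))
        resid = y[level] - interference
        gen = nearest_first(resid / R[level][level], alphabet)
        state.append((gen, resid))
        s = next(gen)  # only the best child of each parent is expanded up front
        heapq.heappush(heap, (metric + (resid - R[level][level] * s) ** 2, idx, s))

    survivors = []
    while heap and len(survivors) < K:
        metric, idx, s = heapq.heappop(heap)
        survivors.append((metric, (s,) + parents[idx][1]))
        gen, resid = state[idx]
        nxt = next(gen, None)  # the winning parent supplies its next-best child
        if nxt is not None:
            child_metric = parents[idx][0] + (resid - R[level][level] * nxt) ** 2
            heapq.heappush(heap, (child_metric, idx, nxt))
    return survivors  # ordered by increasing metric


def k_best_detect(R, y, K, alphabet=PAM8):
    """Return the best symbol vector found by a K-best search, last row first."""
    survivors = [(0.0, ())]
    for level in range(len(y) - 1, -1, -1):
        survivors = expand_level(survivors, R, y, level, K, alphabet)
    return survivors[0][1]
```

A call such as `k_best_detect(R, y, K=4)` returns a tuple of detected symbols in antenna order. Each level performs at most K selections plus one initial expansion per parent, which mirrors, in software terms only, the claim that the K best candidates are found in K steps.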