Mobile terminals tend to become lighter and smaller, and the resulting loss of accessibility hampers the development of new services over wireless networks. This trend makes the user's interaction with a service more difficult, or even frustrates it. The development of new user interfaces that provide ubiquitous, pervasive and multimodal interaction is therefore a necessary step for the next generation of mobile services. In this context, automatic speech recognition is a promising way to give users easy and natural access to network services. However, mobile devices are characterized by restricted computing power, small and slow memories, and short battery life. In this work, we show how speech recognition based on VoIP technologies circumvents these hardware constraints by moving the most computationally complex tasks of speech recognition to a remote server. Under this approach, the user device has to send coded speech or speech parameters over IP networks, which were not designed for real-time communications. For this reason, special emphasis is placed on proposing efficient techniques to mitigate the negative impact of network impairments on speech recognition performance.