Modelling real-life systems and phenomena using mathematically based formalisms is ubiquitous in science and engineering. The reason is that mathematics offers a suitable framework for carrying out formal and rigorous analysis of these systems. For instance, in software engineering, formal methods are among the most efficient tools for identifying flaws in software. The behavior of many real-life systems is inherently stochastic, which requires stochastic models such as labelled Markov processes (LMPs), Markov decision processes (MDPs), predictive state representations (PSRs), etc. This thesis is about quantifying the difference between stochastic systems. The main point of the thesis is that reinforcement learning (RL), a branch of artificial intelligence that is particularly efficient in the presence of uncertainty, can be used to efficiently quantify the divergence between stochastic systems. The key idea is to define an MDP out of the systems to be compared and then to interpret the optimal value of the MDP as the divergence between them. The most appealing feature of the proposed approach is that it does not rely on knowledge of the internal structure of the systems.