In a search algorithm, the starting point is treated as the root of a tree whose branches are the possible paths toward the target. From these paths we choose one point to be the starting point of a new stage, which becomes the root of a new tree, and so on until we reach the goal. In blind search, the way to the goal is not known in advance: at each step the search generates new stages and tests whether any of them achieves the goal. If so, the search ends; otherwise it generates further stages and repeats the test. Bidirectional search runs two searches at the same time, one starting from the initial point and the other starting from the goal and heading back toward the initial point, and joins them where they meet in the middle. Heuristic (informed) search relies on information about the stages that follow, although that information may not always lead to the goal. Q-learning is one of these types of search. We chose it for our subject because it has the basic properties we need: speed, accuracy, and ease of reaching the desired goal in the shortest time and over the minimum possible distance.
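The blind and bidirectional strategies described above can be illustrated with a minimal Python sketch. It assumes the search space is a small unweighted graph and uses breadth-first expansion for both strategies; the graph `GRAPH` and the function names `blind_search` and `bidirectional_search` are hypothetical and chosen only for illustration, not taken from the text.

```python
from collections import deque

# Hypothetical example graph (an assumption, not from the text):
# each node maps to the list of its neighbours.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D", "F"],
    "F": ["E"],
}


def blind_search(graph, start, goal):
    """Breadth-first generate-and-test: expand nodes, test each against the goal."""
    parents = {start: None}          # also serves as the visited set
    frontier = deque([start])
    while frontier:
        node = frontier.popleft()
        if node == goal:             # test step: does this stage achieve the goal?
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph[node]:      # generate step: produce the next stages
            if nxt not in parents:
                parents[nxt] = node
                frontier.append(nxt)
    return None                      # goal is unreachable


def bidirectional_search(graph, start, goal):
    """Two breadth-first searches, one from each end, joined where they meet."""
    if start == goal:
        return [start]
    parents_f, parents_b = {start: None}, {goal: None}
    frontier_f, frontier_b = deque([start]), deque([goal])

    def expand(frontier, parents, other_parents):
        # Expand one full level; return the meeting node if the two searches touch.
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for nxt in graph[node]:
                if nxt not in parents:
                    parents[nxt] = node
                    if nxt in other_parents:
                        return nxt
                    frontier.append(nxt)
        return None

    while frontier_f and frontier_b:
        meet = expand(frontier_f, parents_f, parents_b)
        if meet is None:
            meet = expand(frontier_b, parents_b, parents_f)
        if meet is not None:
            # Walk back to the start, then forward to the goal.
            path, node = [], meet
            while node is not None:
                path.append(node)
                node = parents_f[node]
            path.reverse()
            node = parents_b[meet]
            while node is not None:
                path.append(node)
                node = parents_b[node]
            return path
    return None


if __name__ == "__main__":
    print(blind_search(GRAPH, "A", "F"))
    print(bidirectional_search(GRAPH, "A", "F"))
```

On this example graph both functions return the path `['A', 'B', 'D', 'E', 'F']`. The bidirectional version reaches it after each side has explored only about half the depth, which is the motivation for meeting in the middle.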