Embedded systems are becoming increasingly important. Previously, because embedded systems were highly constrained in computational capability, memory size, and power consumption, system performance, such as execution time, was traded off against system resources, and those resources were carefully scheduled and utilized. With more computational capability now available in embedded devices, such as multi-core devices, and with more complicated requirements demanding more intensive computation, the most critical design concerns are changing in some important application domains. Execution time is especially critical in real-time systems, where it affects not only system performance but also system correctness and reliability. This thesis explores modeling and optimization techniques for hardware-software co-design of parallel embedded systems. We propose a dataflow-based framework that covers modeling, analysis, and optimization, and that bridges user-friendly design and efficient implementation. The framework is applied to different kinds of applications.