Caching is primarily a memory performance optimization technique. When multiple copies of a cached value exist, as in a multiprocessor system, issues of correctness and consistency arise, which a cache coherence mechanism resolves. Instead of a globally controlled, directory-based method, this thesis proposes an alternative in which cache coherence is directed locally by each individual processor. To this end, compiler support is provided in the form of program annotations, which help identify the coherence boundary at run time. Hardware support, in the form of a small 8-entry, 4-way associative buffer, is devised to carry out self-invalidation and memory updates. Performance evaluation of the proposed scheme using the SPLASH-2 benchmark suite on the RSIM simulator shows a significant speedup, with a maximum of 4.31, over the directory-based approach.