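A few years ago I gave my colleagues a knowledge-sharing session on the Java memory model. I'm taking this opportunity to write up part of it here: a brief review of the memory model, followed by a closer look at the semantics of volatile from the perspective of the HotSpot JVM implementation.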
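1. Reordering and memory barriers
Instructions get reordered for two reasons: compiler optimization and CPU execution optimization.
The CPU's out-of-order execution reorders instructions. In addition, each CPU core has its own registers and caches (L1 and L2, with L3 typically shared between cores), and the way cache updates propagate under the cache coherence protocol can also produce reordering effects. Looking at it from the cache side makes it clearer where the notion of visibility in the memory model comes from.
To prevent reordering where it would break correctness, we need memory barriers.
With two kinds of memory operation, read (Load) and write (Store), there are four basic kinds of reordering, and correspondingly four kinds of memory barrier: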
- LoadLoad
- LoadStore
- StoreLoad
- StoreStore
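As a rough illustration of two of these barrier types, here is a minimal sketch using the static fence methods on `java.lang.invoke.VarHandle` (Java 9+). The class `FenceSketch` and its fields are hypothetical names made up for this example, and the sketch only shows where the fences would sit; it is not how HotSpot itself places barriers.

```java
import java.lang.invoke.VarHandle;

// Hypothetical illustration class; the names are not from the original talk.
final class FenceSketch {
    static int data;       // plain (non-volatile) payload
    static boolean ready;  // plain (non-volatile) flag

    // Writer side: the StoreStore fence keeps the store to `data`
    // from being reordered after the store to `ready`.
    static void publish(int value) {
        data = value;
        VarHandle.storeStoreFence();
        ready = true;
    }

    // Reader side: the LoadLoad fence keeps the load of `data`
    // from being reordered before the load of `ready`.
    static int consume() {
        boolean seen = ready;
        VarHandle.loadLoadFence();
        return seen ? data : -1;   // -1 means "not published yet"
    }
}
```

In real code a volatile field or an acquire/release access is usually the simpler choice; explicit fences are shown here only because they map one-to-one onto the barrier names in the list.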
You will also come across acquire/release. Why introduce these two concepts? They are easiest to understand through the scenarios where they are used.
Critical section:

    EnterCriticalSection
      acquire semantics
     /--------------------------------------------\
    /   all memory operations stay below the line  \
                   do critical job
    \   all memory operations stay above the line  /
     \--------------------------------------------/
    LeaveCriticalSection

Synchronization between two threads:

    thread1:
        result = 100
        flag = true
    \                 /
     \---------------/      (release semantics)

    thread2:
     /--------------------\ (acquire semantics)
    /      if (flag)       \
               print (result)
From this we can see:
- acquire == LoadLoad | LoadStore
- release == StoreStore | LoadStore
Seen this way, acquire/release is no longer an obscure concept. By the way, C++11 supports low-level acquire/release semantics.
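The two-thread scenario can be sketched in Java with `VarHandle.setRelease` and `VarHandle.getAcquire` (Java 9+). This is only one possible rendering, under the assumption that `result` is an ordinary field published through `flag`; the class name `ReleaseAcquireDemo` and its methods are hypothetical.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical illustration class for the thread1/thread2 scenario.
final class ReleaseAcquireDemo {
    private int result;      // plain field, published through `flag`
    private boolean flag;    // only accessed through the FLAG VarHandle

    private static final VarHandle FLAG;
    static {
        try {
            FLAG = MethodHandles.lookup()
                    .findVarHandle(ReleaseAcquireDemo.class, "flag", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // thread1: the release store keeps `result = 100` above the write to flag.
    void writer() {
        result = 100;
        FLAG.setRelease(this, true);
    }

    // thread2: the acquire load keeps the read of `result` below the read of flag.
    int reader() {
        while (!(boolean) FLAG.getAcquire(this)) {
            Thread.onSpinWait();
        }
        return result;
    }

    // Runs one writer and one reader and returns what the reader saw.
    static int run() {
        ReleaseAcquireDemo demo = new ReleaseAcquireDemo();
        int[] seen = new int[1];
        Thread t2 = new Thread(() -> seen[0] = demo.reader());
        Thread t1 = new Thread(demo::writer);
        t2.start();
        t1.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Once the reader observes `flag == true` through the acquire load, it is guaranteed to see `result == 100`, which is exactly the pairing drawn in the diagram.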
2. The x86/64 CPU memory model
The Intel manual states:
- 1. Reads are not reordered with other reads.
  No special fence instruction is needed to guarantee LoadLoad.
- 2. Writes are not reordered with older reads.
  No special fence instruction is needed to guarantee LoadStore.
- 3. Writes to memory are not reordered with other writes.
  No special fence instruction is needed to guarantee StoreStore.
- 4. Reads may be reordered with older writes to different locations but not with older writes to the same location.
  A special fence instruction is needed to guarantee StoreLoad. The well-known Peterson algorithm is a typical case that requires StoreLoad.
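The StoreLoad case can be illustrated with the classic "store buffering" pattern: each thread writes one variable and then reads the other. The sketch below assumes both variables are declared volatile in Java, in which case the outcome where both reads return 0 is forbidden; with plain fields that outcome would be allowed, because rule 4 lets the read move ahead of the older write. The class `StoreLoadLitmus` and its members are hypothetical names.

```java
// Hypothetical litmus-test class; not taken from the original material.
final class StoreLoadLitmus {
    static volatile int x, y;   // volatile: a StoreLoad barrier follows each write
    static int r1, r2;          // results, read only after both threads are joined

    // Runs the two racing threads once and reports whether both saw 0.
    static boolean bothZeroOnce() {
        x = 0;
        y = 0;
        Thread a = new Thread(() -> { x = 1; r1 = y; });
        Thread b = new Thread(() -> { y = 1; r2 = x; });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r1 == 0 && r2 == 0;
    }

    // Counts how often the forbidden (0, 0) outcome shows up.
    static int countBothZero(int iterations) {
        int count = 0;
        for (int i = 0; i < iterations; i++) {
            if (bothZeroOnce()) {
                count++;
            }
        }
        return count;
    }
}
```

Peterson's algorithm has the same shape: each thread stores its own flag and then loads the other thread's flag, so it breaks if the load is allowed to pass the store.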
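Now look at how HotSpot implements memory barriers: apart from StoreLoad, none of the barriers needs a CPU barrier instruction on x86. In the C++ code they are just a compiler_barrier that tells gcc not to optimize across that point.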
3. Java volatile
The specification says:
- reads and writes act as acquire and release
- non-atomic 64-bit operations on long and double become atomic
- a write to the variable becomes visible to other threads
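The 64-bit atomicity point can be exercised with a small sketch: one thread alternates a volatile long between two bit patterns while another thread checks that it never observes a mixture of the two halves. The class `VolatileLongDemo` is a hypothetical name, and on a 64-bit JVM torn reads are unlikely even without volatile, so this demonstrates the guarantee rather than proving it.

```java
// Hypothetical illustration class for the long/double atomicity guarantee.
final class VolatileLongDemo {
    static volatile long value;

    // Returns the number of reads that saw neither 0L nor -1L (a torn value).
    static int countTornReads(int writes) {
        value = 0L;
        Thread writer = new Thread(() -> {
            for (int i = 0; i < writes; i++) {
                // Alternate between all-ones and all-zeros bit patterns.
                value = (i & 1) == 0 ? -1L : 0L;
            }
        });
        int torn = 0;
        writer.start();
        while (writer.isAlive()) {
            long v = value;
            if (v != 0L && v != -1L) {
                torn++;
            }
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return torn;
    }
}
```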
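With the concepts of reordering and memory barriers and an understanding of the x86 CPU in hand, let's look at how HotSpot implements volatile on x86. We'll look only at the part that handles a write to a volatile double.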
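The code and its comments show how atomicity is guaranteed and how the memory-barrier semantics are implemented.
After the write, a StoreLoad | StoreStore barrier is added (note that this is a barrier at the assembly level; the compiler is not involved here). From section 2 we know that StoreLoad needs a `lock` (a lock-prefixed instruction), which can serve as a memory barrier instruction on x86.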
4. Java volatile experiment (show me the code…)
Print the CPU instructions generated while the Java program runs:
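A minimal sketch of this kind of experiment, which is not the original test program: a tiny class that writes a volatile double in a hot loop so that the JIT compiles the store, run with HotSpot's diagnostic `-XX:+PrintAssembly` flag. The class name `VolatileWriteExperiment` is hypothetical, and the flag only produces disassembly if the hsdis disassembler plugin is available to the JVM.

```java
// Hypothetical experiment class. After compiling, run for example with:
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly VolatileWriteExperiment
// (assumes the hsdis disassembler plugin is installed for this JVM)
final class VolatileWriteExperiment {
    static volatile double shared;

    // The volatile store whose generated machine code we want to inspect.
    static void store(double v) {
        shared = v;
    }

    // Calls store() often enough for the JIT compiler to kick in.
    static double warmUp(int iterations) {
        for (int i = 0; i < iterations; i++) {
            store(i);
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(warmUp(1_000_000));
    }
}
```

In the disassembly of `store`, the thing to look for on x86 is a lock-prefixed instruction right after the move that performs the write.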
5. C++ volatile != Java volatile
In fact, Java volatile and C++ std::atomic