Java object header

Overview

  • Object header
    Stores: basic information about a heap object's layout, type, GC state, synchronization state, and identity hash code. Java objects and VM-internal objects share a common object header format.
    (Described in detail later.)

  • Instance data
    Stores: the values of the object's fields, including fields inherited from parent classes.
    If the object has fields, their data is stored here; if it has none, this part is empty.
    Each field occupies a number of bytes that depends on its type: for example, a boolean takes 1 byte and an int takes 4 bytes.

  • Alignment padding
    Stores: nothing meaningful. It is not always present; it exists only to pad the object for byte alignment.
    By default, the starting address of every object in the Java virtual machine heap must be a multiple of 8, so each object's size is rounded up to a multiple of 8 bytes.
    For example, if the object header is 12 bytes and the instance data is 5 bytes, the smallest multiple of 8 that is not less than 12 + 5 = 17 is 24, so the padding is 24 - 12 - 5 = 7 bytes.
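The padding arithmetic above can be reproduced with a small rounding helper. This is only an illustrative sketch: the class and method names are invented for this example and are not part of the JVM.

```java
final class ObjectPadding {
    /** Rounds size up to the next multiple of alignment (alignment must be a power of two). */
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    /** Padding bytes needed after the header and the instance data. */
    static long padding(long headerBytes, long instanceBytes, long alignment) {
        long used = headerBytes + instanceBytes;
        return alignUp(used, alignment) - used;
    }

    public static void main(String[] args) {
        System.out.println(alignUp(12 + 5, 8)); // 24
        System.out.println(padding(12, 5, 8));  // 7
    }
}
```

When header plus instance data is already a multiple of 8 (for example 12 + 4), the padding is 0, which is why this part is not always present.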

Why is object padding needed?

One reason for aligning fields in memory is to keep each field within a single CPU cache line. Without alignment, a field may straddle two cache lines: reading it may require loading two cache lines (and evicting two others), and writing it dirties both cache lines at once. Both cases hurt the execution efficiency of the program. Ultimately, the purpose of padding is to let the machine address memory efficiently.

Object header

mark word

OpenJDK (JDK 8) source: https://github.com/openjdk/jdk
The comments in markOop.hpp in the official OpenJDK source give a rough picture of how the mark word is laid out.

Comment 1 in markOop.hpp:

32 bits:
--------
           hash:25 ------------>| age:4    biased_lock:1 lock:2 (normal object)
           JavaThread*:23 epoch:2 age:4    biased_lock:1 lock:2 (biased object)
           size:32 ------------------------------------------>| (CMS free block)
           PromotedObject*:29 ---------->| promo_bits:3 ----->| (CMS promoted object)

64 bits:
--------
           unused:25 hash:31 -->| unused:1   age:4    biased_lock:1 lock:2 (normal object)
           JavaThread*:54 epoch:2 unused:1   age:4    biased_lock:1 lock:2 (biased object)
           PromotedObject*:61 --------------------->| promo_bits:3 ----->| (CMS promoted object)
           size:64 ----------------------------------------------------->| (CMS free block)
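The 64-bit "normal object" layout can be decoded with a few shifts and masks. The following is a minimal Java sketch, not HotSpot code: the class and method names are invented for illustration, and it assumes that the rightmost field in the comment (lock:2) occupies the least significant bits.

```java
final class MarkWord64 {
    // Field widths taken from the 64-bit "normal object" line of the comment.
    static final int LOCK_BITS = 2;
    static final int BIASED_LOCK_BITS = 1;
    static final int AGE_BITS = 4;
    static final int UNUSED_BITS = 1;
    static final int HASH_BITS = 31;

    // Shifts follow from the widths, counting from the least significant bit (assumption).
    static final int LOCK_SHIFT = 0;
    static final int BIASED_LOCK_SHIFT = LOCK_SHIFT + LOCK_BITS;       // 2
    static final int AGE_SHIFT = BIASED_LOCK_SHIFT + BIASED_LOCK_BITS; // 3
    static final int HASH_SHIFT = AGE_SHIFT + AGE_BITS + UNUSED_BITS;  // 8

    /** Extracts a field of the given width starting at the given bit position. */
    static long field(long mark, int shift, int bits) {
        return (mark >>> shift) & ((1L << bits) - 1);
    }

    static long lock(long mark)       { return field(mark, LOCK_SHIFT, LOCK_BITS); }
    static long biasedLock(long mark) { return field(mark, BIASED_LOCK_SHIFT, BIASED_LOCK_BITS); }
    static long age(long mark)        { return field(mark, AGE_SHIFT, AGE_BITS); }
    static long hash(long mark)       { return field(mark, HASH_SHIFT, HASH_BITS); }

    /** Builds a mark word value from its fields (the inverse of the accessors above). */
    static long pack(long hash, long age, long biasedLock, long lock) {
        return (hash << HASH_SHIFT) | (age << AGE_SHIFT)
             | (biasedLock << BIASED_LOCK_SHIFT) | (lock << LOCK_SHIFT);
    }

    public static void main(String[] args) {
        long mark = pack(0x12345678L, 3, 0, 1);     // unlocked object, age 3
        System.out.println(Long.toHexString(mark)); // 1234567819
        System.out.println(age(mark));              // 3
    }
}
```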

Comment 2 in markOop.hpp:

    [JavaThread* | epoch | age | 1 | 01]       lock is biased toward given thread
    [0           | epoch | age | 1 | 01]       lock is anonymously biased

  - the two lock bits are used to describe three states: locked/unlocked and monitor.

    [ptr             | 00]  locked             ptr points to real header on stack
    [header      | 0 | 01]  unlocked           regular object header
    [ptr             | 10]  monitor            inflated lock (header is wapped out)
    [ptr             | 11]  marked             used by markSweep to mark an object
                                               not valid at any other time

Source code 1 in markOop.hpp:

  enum { age_bits                 = 4,
         lock_bits                = 2,
         biased_lock_bits         = 1,
         max_hash_bits            = BitsPerWord - age_bits - lock_bits - biased_lock_bits,
         hash_bits                = max_hash_bits > 31 ? 31 : max_hash_bits,
         cms_bits                 = LP64_ONLY(1) NOT_LP64(0),
         epoch_bits               = 2
  };
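To see how these constants produce the hash widths in the layout comment (hash:25 on 32-bit, hash:31 on 64-bit), the same arithmetic can be replayed in Java. This is an illustrative sketch; the class and method names are invented, and BitsPerWord is passed in as a parameter.

```java
final class MarkWordBits {
    static final int AGE_BITS = 4;
    static final int LOCK_BITS = 2;
    static final int BIASED_LOCK_BITS = 1;

    /** Mirrors: max_hash_bits = BitsPerWord - age_bits - lock_bits - biased_lock_bits */
    static int maxHashBits(int bitsPerWord) {
        return bitsPerWord - AGE_BITS - LOCK_BITS - BIASED_LOCK_BITS;
    }

    /** Mirrors: hash_bits = max_hash_bits > 31 ? 31 : max_hash_bits */
    static int hashBits(int bitsPerWord) {
        int max = maxHashBits(bitsPerWord);
        return max > 31 ? 31 : max;
    }

    public static void main(String[] args) {
        System.out.println(maxHashBits(32) + " " + hashBits(32)); // 25 25
        System.out.println(maxHashBits(64) + " " + hashBits(64)); // 57 31
    }
}
```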

Source code 2 in markOop.hpp:

  enum { locked_value             = 0,
         unlocked_value           = 1,
         monitor_value            = 2,
         marked_value             = 3,
         biased_lock_pattern      = 5
  };
  • locked_value
    Lightweight-lock state value: the last two bits of the mark word are 00, i.e. the value 0.
  • unlocked_value
    Unlocked state value: the last three bits of the mark word are 001 (biased bit 0, lock bits 01), i.e. the value 1.
  • monitor_value
    Heavyweight-lock state value: the last two bits of the mark word are 10, i.e. the value 2.
  • marked_value
    The last two bits of the mark word are 11, i.e. the value 3.
    It serves two purposes:
    1: When a lightweight lock is inflated to a heavyweight lock, this value is stored in the Lock Record to indicate that the lock is using a heavyweight monitor.
    2: GC uses this value to mark an object.

The relevant source in markOop.hpp is as follows:

  // Only stored in the Lock Record, to indicate that the lock is using a heavyweight monitor (this is done before the lightweight lock is inflated to a heavyweight lock)
  static markOop unused_mark() {
    return (markOop) marked_value;
  }

  // age operations
  markOop set_marked()   { return markOop((value() & ~lock_mask_in_place) | marked_value); }
  markOop set_unmarked() { return markOop((value() & ~lock_mask_in_place) | unlocked_value); }
  • biased_lock_pattern
    Biased-lock state value: the last three bits of the mark word are 101, i.e. the value 5.

markOop.hpp also has the following code for determining which lock state the current mark word is in:

  // Locked in some way: the lock bits are not 01 (lightweight lock, heavyweight monitor, or GC mark)
  bool is_locked()   const {
    return (mask_bits(value(), lock_mask_in_place) != unlocked_value);
  }
  // Unlocked and not biased: the low three bits are 001
  bool is_unlocked() const {
    return (mask_bits(value(), biased_lock_mask_in_place) == unlocked_value);
  }
  // Marked: the lock bits are 11
  bool is_marked()   const {
    return (mask_bits(value(), lock_mask_in_place) == marked_value);
  }
  // Neutral (no lock, no bias): the same test as is_unlocked()
  bool is_neutral()  const { return (mask_bits(value(), biased_lock_mask_in_place) == unlocked_value); }
  // Special temporary state of the markOop while the lock is being inflated. Code that looks at the mark outside a lock needs to take this into account.
  bool is_being_inflated() const { return (value() == 0); }
  // The value stored while the lock is in the process of being inflated to a heavyweight lock
  static markOop INFLATING() { return (markOop) 0; }
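Putting the state values and the predicates together, a mark word can be classified by looking at its low bits. The following Java sketch is a simplified illustration, not HotSpot code: the class name, the describe method, and the returned strings are invented for this example, and the masks are assumed to cover the low two bits (lock) and low three bits (biased_lock plus lock).

```java
final class LockState {
    static final long LOCKED_VALUE = 0;
    static final long UNLOCKED_VALUE = 1;
    static final long MONITOR_VALUE = 2;
    static final long MARKED_VALUE = 3;
    static final long BIASED_LOCK_PATTERN = 5;

    static final long LOCK_MASK = 0b11;         // low two bits (assumed mask)
    static final long BIASED_LOCK_MASK = 0b111; // low three bits (assumed mask)

    /** Returns a human-readable name for the state encoded in the low bits of a mark word. */
    static String describe(long mark) {
        if (mark == 0) {
            return "inflating"; // same test as is_being_inflated()
        }
        if ((mark & BIASED_LOCK_MASK) == BIASED_LOCK_PATTERN) {
            return "biased";
        }
        switch ((int) (mark & LOCK_MASK)) {
            case (int) LOCKED_VALUE:   return "lightweight-locked";
            case (int) UNLOCKED_VALUE: return "unlocked";
            case (int) MONITOR_VALUE:  return "monitor";
            default:                   return "marked";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(0x1234567819L)); // unlocked (low three bits are 001)
        System.out.println(describe(0x5L));          // biased (anonymously, since the thread bits are 0)
    }
}
```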

Why does the age field in the object header take 4 bits?
The age counts the GCs an object has survived. An object is promoted to the old generation after surviving at most 15 GCs, and 15 in binary is 1111, which takes exactly 4 bits.

The role of epoch
Copied from: http://www.itqiankun.com/article/bias-lock-epoch-effect

In essence, the epoch is a timestamp that indicates whether a biased lock is still valid. It is stored in the mark word of a biased object.

① : In addition to the epoch in each object, an epoch value is also stored in the class information of the class the object belongs to.

② : Whenever a global safepoint is reached (batch rebiasing does not remove the need for global safepoints; they always exist) and, say, class C is to be rebiased in bulk, the epoch stored in class C is first incremented to obtain a new value, epoch_new.

③ : Then all thread stacks holding instances of class C are scanned. Based on the stack information, the VM determines which objects are currently locked and assigns epoch_new only to those locked objects, that is, to the objects whose biased locks are still in use.

④ : After the safepoint ends, when a thread tries to acquire the biased lock, it simply checks whether the epoch stored in class C equals the epoch stored in the target object. If they differ, the object's biased lock is invalid: as step ③ says, only objects whose biased locks are still in use received epoch_new, so the epoch in class C is epoch_new while the epoch in this object is still the old value. The competing thread can then try to bias the object toward itself.
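The four steps can be modelled with a toy Java simulation. This is a deliberately simplified sketch and not how HotSpot is implemented: the Klass and Obj classes, the lockedNow flag (standing in for the thread-stack scan), and the wrap-around to 2 bits (taken from epoch:2 in the layout comment) are all assumptions made for illustration.

```java
import java.util.List;

final class EpochModel {
    static final int EPOCH_MASK = 0b11; // epoch:2 in the mark word layout (assumed wrap-around)

    /** Step 1: the class keeps its own epoch. */
    static final class Klass {
        int epoch;
    }

    /** A biased object remembers the class epoch it was biased under. */
    static final class Obj {
        final Klass klass;
        int epoch;
        boolean lockedNow; // stands in for "found locked while scanning thread stacks"

        Obj(Klass klass, boolean lockedNow) {
            this.klass = klass;
            this.epoch = klass.epoch;
            this.lockedNow = lockedNow;
        }

        /** Step 4: the bias is valid only if the object epoch matches the class epoch. */
        boolean biasStillValid() {
            return epoch == klass.epoch;
        }
    }

    /** Steps 2 and 3: bump the class epoch, then refresh only the objects that are locked right now. */
    static void bulkRebias(Klass klass, List<Obj> instances) {
        klass.epoch = (klass.epoch + 1) & EPOCH_MASK;
        for (Obj o : instances) {
            if (o.klass == klass && o.lockedNow) {
                o.epoch = klass.epoch;
            }
        }
    }
}
```

After one call to bulkRebias, an object that was locked during the scan still passes biasStillValid(), while an idle biased object fails it and can be rebiased by the next thread that asks for it.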

klass pointer

The klass pointer is a metadata pointer to the object's instanceKlass in the method area; the virtual machine uses it to determine which class the object is an instance of.

Source code in oop.hpp:

class oopDesc {
  friend class VMStructs;
 private:
  volatile markOop  _mark;
  union _metadata {
    Klass*      _klass;
    narrowKlass _compressed_klass;
  } _metadata;
  // ... (remaining members omitted)
};

length field

This field exists only in array objects and holds the length of the array.

There is a comment in arrayOop.hpp:

// The layout of array Oops is:
//
//  markOop
//  Klass*    // 32 bits if compressed but declared 64 in LP64.
//  length    // shares klass memory or allocated after declared fields.
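The remark that length "shares klass memory or [is] allocated after declared fields" can be made concrete by computing where the length field starts. The following Java sketch rests on assumptions that are not stated in the comment itself: a 64-bit VM with an 8-byte mark word, a 4-byte klass pointer when compressed and an 8-byte one otherwise. The class and method names are invented.

```java
final class ArrayHeaderSketch {
    static final int MARK_WORD_BYTES = 8; // assumption: 64-bit VM

    /** Bytes actually used by the klass pointer. */
    static int klassBytes(boolean compressedKlass) {
        return compressedKlass ? 4 : 8;
    }

    /** Byte offset at which the 4-byte length field starts. */
    static int lengthOffset(boolean compressedKlass) {
        return MARK_WORD_BYTES + klassBytes(compressedKlass);
    }

    public static void main(String[] args) {
        // Compressed: length sits in the unused half of the 64-bit klass slot.
        System.out.println(lengthOffset(true));  // 12
        // Uncompressed: length comes after the full 64-bit klass pointer.
        System.out.println(lengthOffset(false)); // 16
    }
}
```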

Summary

You may be asked this question in an interview: why can any object be used as a lock?
The answer lies in the object header described above: every object carries a mark word, and the mark word records the object's lock state.
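As a small illustration of the point, any plain Object can guard a critical section with synchronized, with no dedicated lock class needed. The class and method names below are invented for this sketch.

```java
final class AnyObjectAsLock {
    private static final Object LOCK = new Object(); // a plain object; its header carries the lock state
    private static int counter = 0;

    /** Starts the given number of threads, each incrementing the shared counter under LOCK. */
    static int run(int threads, int incrementsPerThread) {
        counter = 0;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    synchronized (LOCK) {
                        counter++;
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // join also makes the workers' writes visible to this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // 40000
    }
}
```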

Posted by rhecker on Wed, 11 May 2022 14:51:52 +0300