8294366: RISC-V: Partially mark out incompressible regions #28

Merged
merged 1 commit into from Mar 30, 2023
98 changes: 44 additions & 54 deletions src/hotspot/cpu/riscv/assembler_riscv.hpp
@@ -664,8 +664,8 @@ class Assembler : public AbstractAssembler {
emit(insn); \
}

INSN(_beq, 0b1100011, 0b000);
INSN(_bne, 0b1100011, 0b001);
INSN(beq, 0b1100011, 0b000);
INSN(bne, 0b1100011, 0b001);
INSN(bge, 0b1100011, 0b101);
INSN(bgeu, 0b1100011, 0b111);
INSN(blt, 0b1100011, 0b100);
@@ -882,7 +882,7 @@ class Assembler : public AbstractAssembler {
emit(insn); \
}

INSN(_jal, 0b1101111);
INSN(jal, 0b1101111);

#undef INSN

@@ -2104,20 +2104,30 @@ enum Nf {
// RISC-V Compressed Instructions Extension
// ========================================
// Note:
// 1. When UseRVC is enabled, 32-bit instructions under 'CompressibleRegion's will be
// transformed to 16-bit instructions if compressible.
// 2. RVC instructions in Assembler always begin with 'c_' prefix, as 'c_li',
// but most of time we have no need to explicitly use these instructions.
// 3. 'CompressibleRegion' is introduced to hint instructions in this Region's RTTI range
// are qualified to be compressed with their 2-byte versions.
// An example:
// 1. Assembler functions encoding 16-bit compressed instructions always begin with a 'c_'
// prefix, such as 'c_add'. Correspondingly, assembler functions encoding normal 32-bit
// instructions begin with a '_' prefix, such as '_add'. Most of the time users have no
// need to explicitly emit these compressed instructions. Instead, they still use unified
// wrappers such as 'add', which do the compressing work through 'c_add' depending on
// the operands of the instruction and the availability of the RVC hardware extension.
//
// CompressibleRegion cr(_masm);
// __ andr(...); // this instruction could change to c.and if able to
// 2. 'CompressibleRegion' and 'IncompressibleRegion' are introduced to mark assembler scopes
// within which instructions are qualified or unqualified to be compressed into their 16-bit
// versions. An example:
//
// 4. Using -XX:PrintAssemblyOptions=no-aliases could distinguish RVC instructions from
// normal ones.
// CompressibleRegion cr(_masm);
// __ add(...); // this instruction will be compressed into 'c.and' when possible
Member

The comment says 'c.and', but the example assembly uses add. Is this a typo?

Collaborator Author

Indeed, these are typos; thanks for the catch. The openjdk/jdk repo has the same two typos. I'm not sure about the backporting rules: should I fix them here and then file another patch against openjdk/jdk, or do something else?

Contributor

There is a better way:

  1. Fix the typos in openjdk/jdk.
  2. Combine the two backports into one commit with both issue messages in it.

Alternatively:

  1. Do a proper commit/backport (with the typos fixed) in 17u.
  2. Fix the typos in openjdk/jdk, but add the 17-na label to the bug in JBS.

Collaborator Author

Thank you for the detailed approaches. I'll go with the first one, since it is cleaner. :-)

// {
// IncompressibleRegion ir(_masm);
// __ add(...); // this instruction will not be compressed
// {
// CompressibleRegion cr(_masm);
// __ add(...); // this instruction will be compressed into 'c.and' when possible
Member

Same issue here

// }
// }
//
// 3. When printing JIT assembly code, using -XX:PrintAssemblyOptions=no-aliases could help
// distinguish compressed 16-bit instructions from normal 32-bit ones.

private:
bool _in_compressible_region;
@@ -2126,21 +2136,36 @@ enum Nf {
void set_in_compressible_region(bool b) { _in_compressible_region = b; }
public:

// a compressible region
class CompressibleRegion : public StackObj {
// an abstract compressible region
class AbstractCompressibleRegion : public StackObj {
protected:
Assembler *_masm;
bool _saved_in_compressible_region;
public:
CompressibleRegion(Assembler *_masm)
protected:
AbstractCompressibleRegion(Assembler *_masm)
: _masm(_masm)
, _saved_in_compressible_region(_masm->in_compressible_region()) {
, _saved_in_compressible_region(_masm->in_compressible_region()) {}
};
// a compressible region
class CompressibleRegion : public AbstractCompressibleRegion {
public:
CompressibleRegion(Assembler *_masm) : AbstractCompressibleRegion(_masm) {
_masm->set_in_compressible_region(true);
}
~CompressibleRegion() {
_masm->set_in_compressible_region(_saved_in_compressible_region);
}
};
// an incompressible region
class IncompressibleRegion : public AbstractCompressibleRegion {
public:
IncompressibleRegion(Assembler *_masm) : AbstractCompressibleRegion(_masm) {
_masm->set_in_compressible_region(false);
}
~IncompressibleRegion() {
_masm->set_in_compressible_region(_saved_in_compressible_region);
}
};
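
Note on the region classes above: both are scope guards that save the assembler's flag on construction, force it to a fixed value, and restore the saved value on destruction, so regions nest cleanly. The following is a minimal standalone sketch of that idea, not the HotSpot code. `ToyAssembler`, `ToyRegion`, and `nesting_restores_flag` are hypothetical names, and a template parameter stands in for the `AbstractCompressibleRegion` base class purely for brevity.

```cpp
#include <cassert>

// Hypothetical stand-in for the assembler; only the flag matters here.
class ToyAssembler {
  bool _in_compressible_region = false;
 public:
  bool in_compressible_region() const { return _in_compressible_region; }
  void set_in_compressible_region(bool b) { _in_compressible_region = b; }
};

// Saves the flag on entry, forces it to `value`, restores it on scope exit.
template <bool value>
class ToyRegion {
  ToyAssembler* _masm;
  bool _saved;
 public:
  explicit ToyRegion(ToyAssembler* masm)
      : _masm(masm), _saved(masm->in_compressible_region()) {
    _masm->set_in_compressible_region(value);
  }
  ~ToyRegion() { _masm->set_in_compressible_region(_saved); }
  ToyRegion(const ToyRegion&) = delete;
  ToyRegion& operator=(const ToyRegion&) = delete;
};

using ToyCompressibleRegion = ToyRegion<true>;
using ToyIncompressibleRegion = ToyRegion<false>;

// Returns true if the flag follows the expected nesting discipline.
bool nesting_restores_flag() {
  ToyAssembler masm;
  bool ok = !masm.in_compressible_region();
  {
    ToyCompressibleRegion cr(&masm);
    ok = ok && masm.in_compressible_region();
    {
      ToyIncompressibleRegion ir(&masm);
      ok = ok && !masm.in_compressible_region();  // inner region wins
    }
    ok = ok && masm.in_compressible_region();     // outer state restored
  }
  return ok && !masm.in_compressible_region();    // initial state restored
}
```

Because each guard restores the value it saw at construction rather than a hard-coded default, an incompressible region inside a compressible one (or the reverse) leaves the outer setting intact.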

// patch a 16-bit instruction.
static void c_patch(address a, unsigned msb, unsigned lsb, uint16_t val) {
@@ -2841,43 +2866,8 @@ enum Nf {

#undef INSN

// --------------------------
// Conditional branch instructions
// --------------------------
#define INSN(NAME, C_NAME, NORMAL_NAME) \
void NAME(Register Rs1, Register Rs2, const int64_t offset) { \
/* beq/bne -> c.beqz/c.bnez */ \
if (do_compress() && \
(offset != 0 && Rs2 == x0 && Rs1->is_compressed_valid() && \
is_imm_in_range(offset, 8, 1))) { \
C_NAME(Rs1, offset); \
return; \
} \
NORMAL_NAME(Rs1, Rs2, offset); \
}

INSN(beq, c_beqz, _beq);
INSN(bne, c_bnez, _bne);

#undef INSN

// --------------------------
// Unconditional branch instructions
// --------------------------
#define INSN(NAME) \
void NAME(Register Rd, const int32_t offset) { \
/* jal -> c.j */ \
if (do_compress() && offset != 0 && Rd == x0 && is_imm_in_range(offset, 11, 1)) { \
c_j(offset); \
return; \
} \
_jal(Rd, offset); \
}

INSN(jal);

#undef INSN
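
Note on the offset checks in the `beq`/`bne` and `jal` wrappers in this hunk: the sketch below illustrates which offsets would qualify, assuming `is_imm_in_range(offset, bits, align_bits)` means "a multiple of 2^align_bits that fits in a signed field of bits + align_bits bits". That reading is an assumption based on the call sites, and `fits_signed_aligned` and the two predicates are hypothetical names. Under it, `c.beqz`/`c.bnez` reach even offsets in [-256, 254] and `c.j` reaches even offsets in [-2048, 2046].

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if `value` is a multiple of 2^align_bits and
// fits in a signed field of (bits + align_bits) bits.
bool fits_signed_aligned(int64_t value, unsigned bits, unsigned align_bits) {
  const int64_t align_mask = (int64_t{1} << align_bits) - 1;
  const int64_t limit = int64_t{1} << (bits + align_bits - 1);
  return (value & align_mask) == 0 && value >= -limit && value < limit;
}

// Offset part of the beq/bne -> c.beqz/c.bnez condition.
bool beqz_offset_is_compressible(int64_t offset) {
  return offset != 0 && fits_signed_aligned(offset, 8, 1);
}

// Offset part of the jal -> c.j condition.
bool j_offset_is_compressible(int64_t offset) {
  return offset != 0 && fits_signed_aligned(offset, 11, 1);
}
```

The register constraints (`Rs2 == x0`, a compressible `Rs1`, `Rd == x0`) are separate conditions and are not modelled here.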

// --------------------------
#define INSN(NAME) \
void NAME(Register Rd, Register Rs, const int32_t offset) { \
3 changes: 2 additions & 1 deletion src/hotspot/cpu/riscv/c1_MacroAssembler_riscv.cpp
@@ -342,8 +342,9 @@ void C1_MacroAssembler::verified_entry(bool breakAtEntry) {
// first instruction with a jump. For this action to be legal we
// must ensure that this first instruction is a J, JAL or NOP.
// Make it a NOP.
IncompressibleRegion ir(this); // keep the nop as 4 bytes for patching.
assert_alignment(pc());
nop();
nop(); // 4 bytes
}
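
Note on why the entry nop has to stay 4 bytes: the patcher later overwrites it with a 4-byte jump, so a 2-byte `c.nop` in that slot would let the patch spill into the following instruction. The sketch below demonstrates this on a plain byte buffer. It is an illustration only: `encode_jal`, `patch_entry_with_jump`, and `next_insn_survives` are hypothetical helpers, and the real patching code in HotSpot is more involved.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr uint32_t kNop  = 0x00000013;  // addi x0, x0, 0 (4 bytes)
constexpr uint16_t kCNop = 0x0001;      // c.nop (2 bytes)
constexpr uint32_t kRet  = 0x00008067;  // jalr x0, 0(x1)

// Encodes 'jal rd, offset' (J-type); offset must be even and within +-1 MiB.
uint32_t encode_jal(unsigned rd, int32_t offset) {
  const uint32_t imm = static_cast<uint32_t>(offset);
  return 0x6fu | (rd << 7)
       | (((imm >> 12) & 0xffu) << 12)   // imm[19:12]
       | (((imm >> 11) & 0x1u) << 20)    // imm[11]
       | (((imm >> 1) & 0x3ffu) << 21)   // imm[10:1]
       | (((imm >> 20) & 0x1u) << 31);   // imm[20]
}

// Overwrites the 4 bytes at `entry` with 'jal x0, offset'.
void patch_entry_with_jump(uint8_t* entry, int32_t offset) {
  const uint32_t insn = encode_jal(0, offset);
  std::memcpy(entry, &insn, sizeof insn);
}

// Returns true if the instruction after the entry slot survives patching.
bool next_insn_survives(bool compressed_nop) {
  uint8_t code[8] = {};
  unsigned nop_size;
  if (compressed_nop) {
    std::memcpy(code, &kCNop, sizeof kCNop);
    nop_size = sizeof kCNop;
  } else {
    std::memcpy(code, &kNop, sizeof kNop);
    nop_size = sizeof kNop;
  }
  std::memcpy(code + nop_size, &kRet, sizeof kRet);  // instruction after nop
  patch_entry_with_jump(code, 8);
  uint32_t after;
  std::memcpy(&after, code + nop_size, sizeof after);
  return after == kRet;
}
```

With the 4-byte nop the following instruction is untouched; with `c.nop` its first two bytes are overwritten by the upper half of the jump.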

void C1_MacroAssembler::load_parameter(int offset_in_words, Register reg) {
2 changes: 2 additions & 0 deletions src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp
@@ -238,6 +238,8 @@ void BarrierSetAssembler::nmethod_entry_barrier(MacroAssembler* masm) {
return;
}

Assembler::IncompressibleRegion ir(masm); // Fixed length: see entry_barrier_offset()

// RISCV atomic operations require that the memory address be naturally aligned.
__ align(4);

5 changes: 5 additions & 0 deletions src/hotspot/cpu/riscv/macroAssembler_riscv.cpp
@@ -245,6 +245,7 @@ void MacroAssembler::set_last_Java_frame(Register last_java_sp,
set_last_Java_frame(last_java_sp, last_java_fp, target(L), tmp);
} else {
L.add_patch_at(code(), locator());
IncompressibleRegion ir(this); // the label address will be patched back.
set_last_Java_frame(last_java_sp, last_java_fp, pc() /* Patched later */, tmp);
}
}
@@ -553,6 +554,7 @@ void MacroAssembler::unimplemented(const char* what) {
}

void MacroAssembler::emit_static_call_stub() {
IncompressibleRegion ir(this); // Fixed length: see CompiledStaticCall::to_interp_stub_size().
// CompiledDirectStaticCall::set_to_interpreted knows the
// exact layout of this stub.

@@ -757,6 +759,7 @@ void MacroAssembler::la(Register Rd, const Address &adr) {
}

void MacroAssembler::la(Register Rd, Label &label) {
IncompressibleRegion ir(this); // the label address may be patched back.
la(Rd, target(label));
}

@@ -2459,6 +2462,7 @@ void MacroAssembler::far_jump(Address entry, CodeBuffer *cbuf, Register tmp) {
assert(ReservedCodeCacheSize < 4*G, "branch out of range");
assert(CodeCache::find_blob(entry.target()) != NULL,
"destination of far call not found in code cache");
IncompressibleRegion ir(this); // Fixed length: see MacroAssembler::far_branch_size()
int32_t offset = 0;
if (far_branches()) {
// We can use auipc + jalr here because we know that the total size of
@@ -2476,6 +2480,7 @@ void MacroAssembler::far_call(Address entry, CodeBuffer *cbuf, Register tmp) {
assert(ReservedCodeCacheSize < 4*G, "branch out of range");
assert(CodeCache::find_blob(entry.target()) != NULL,
"destination of far call not found in code cache");
IncompressibleRegion ir(this); // Fixed length: see MacroAssembler::far_branch_size()
int32_t offset = 0;
if (far_branches()) {
// We can use auipc + jalr here because we know that the total size of
1 change: 1 addition & 0 deletions src/hotspot/cpu/riscv/nativeInst_riscv.cpp
@@ -391,6 +391,7 @@ void NativeJump::patch_verified_entry(address entry, address verified_entry, add
void NativeGeneralJump::insert_unconditional(address code_pos, address entry) {
CodeBuffer cb(code_pos, instruction_size);
MacroAssembler a(&cb);
Assembler::IncompressibleRegion ir(&a); // Fixed length: see NativeGeneralJump::get_instruction_size()

int32_t offset = 0;
a.movptr_with_offset(t0, entry, offset); // lui, addi, slli, addi, slli
11 changes: 9 additions & 2 deletions src/hotspot/cpu/riscv/riscv.ad
@@ -1325,8 +1325,11 @@ void MachPrologNode::emit(CodeBuffer &cbuf, PhaseRegAlloc *ra_) const {

// insert a nop at the start of the prolog so we can patch in a
// branch if we need to invalidate the method later
MacroAssembler::assert_alignment(__ pc());
__ nop();
{
Assembler::IncompressibleRegion ir(&_masm); // keep the nop as 4 bytes for patching.
MacroAssembler::assert_alignment(__ pc());
__ nop(); // 4 bytes
}

assert_cond(C != NULL);

@@ -1670,6 +1673,7 @@ void BoxLockNode::format(PhaseRegAlloc *ra_, outputStream *st) const {

void BoxLockNode::emit(CodeBuffer &cbuf, PhaseRegAlloc *ra_) const {
C2_MacroAssembler _masm(&cbuf);
Assembler::IncompressibleRegion ir(&_masm); // Fixed length: see BoxLockNode::size()

assert_cond(ra_ != NULL);
int offset = ra_->reg2offset(in_RegMask(0).find_first_elem());
@@ -2223,6 +2227,7 @@ encode %{

enc_class riscv_enc_java_static_call(method meth) %{
C2_MacroAssembler _masm(&cbuf);
Assembler::IncompressibleRegion ir(&_masm); // Fixed length: see ret_addr_offset

address addr = (address)$meth$$method;
address call = NULL;
@@ -2255,6 +2260,7 @@ encode %{

enc_class riscv_enc_java_dynamic_call(method meth) %{
C2_MacroAssembler _masm(&cbuf);
Assembler::IncompressibleRegion ir(&_masm); // Fixed length: see ret_addr_offset
int method_index = resolved_method_index(cbuf);
address call = __ ic_call((address)$meth$$method, method_index);
if (call == NULL) {
@@ -2273,6 +2279,7 @@ encode %{

enc_class riscv_enc_java_to_runtime(method meth) %{
C2_MacroAssembler _masm(&cbuf);
Assembler::IncompressibleRegion ir(&_masm); // Fixed length: see ret_addr_offset

// some calls to generated routines (arraycopy code) are scheduled
// by C2 as runtime calls. if so we can call them using a jr (they
14 changes: 10 additions & 4 deletions src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp
@@ -1207,8 +1207,11 @@ nmethod* SharedRuntime::generate_native_wrapper(MacroAssembler* masm,
int vep_offset = ((intptr_t)__ pc()) - start;

// First instruction must be a nop as it may need to be patched on deoptimisation
MacroAssembler::assert_alignment(__ pc());
__ nop();
{
Assembler::IncompressibleRegion ir(masm); // keep the nop as 4 bytes for patching.
MacroAssembler::assert_alignment(__ pc());
__ nop(); // 4 bytes
}
gen_special_dispatch(masm,
method,
in_sig_bt,
@@ -1427,8 +1430,11 @@ nmethod* SharedRuntime::generate_native_wrapper(MacroAssembler* masm,

// If we have to make this method not-entrant we'll overwrite its
// first instruction with a jump.
MacroAssembler::assert_alignment(__ pc());
__ nop();
{
Assembler::IncompressibleRegion ir(masm); // keep the nop as 4 bytes for patching.
MacroAssembler::assert_alignment(__ pc());
__ nop(); // 4 bytes
}

if (VM_Version::supports_fast_class_init_checks() && method->needs_clinit_barrier()) {
Label L_skip_barrier;