Slide 1: Code Generation
Dr Nguyen Hua Phung
Faculty of CSE
HCMUT
Slide 3: Java Bytecode Assembler
Java Byte code (*.class)
Slide 4: Source code - Java mapping
• A source program ⇒ Java class(es)
• A global variable ⇒ a static field
• A local variable ⇒ a local variable
• An expression ⇒ an expression
(except an expression statement)
• An invocation ⇒ a static invocation
Slide 5: Some issues
• An array declaration ⇒ a declaration + code
– global ⇒ code in the class init
– local ⇒ code in the enclosing method
• A main method ⇒ a main method with different signature
int main() ⇒ void main(String [] args)
return <integer expr> ⇒ return
Slide 6
public class Cout {
    static int a,b;
    static float c[] = new float[3];
    static int foo(int a,int b[]) {
        boolean c;
Slide 7
• Java source code to Java byte code
javac -g <java source code>
-g: generate debug information
• Java byte code to Jasmin code
java JasminVisitor <java byte code>
Read JavaToJasmin script for details
• Jasmin code to Java byte code
java -jar jasmin.jar <Jasmin code>
Read RunJasmin script for details
Slide 8: Example – Jasmin
public class Cout {
static int a,b;
static float c[] = new float[3];
static int foo(int a,int b[]) {
public static void main(String[] arg) {
float d[] = new float[3];
.field static b I
.field static c [F
.method static <clinit>()V
.limit stack 1
.limit locals 0
.line 3
iconst_3
newarray float
putstatic Cout.c [F
return
.end method
Slide 9: Example – Jasmin (cont’d)
Label0:
.line 1
aload_0
invokespecial java/lang/Object/<init>()V
Label1:
return
.end method
Slide 10: Example – Jasmin (cont’d)
.line 6
aload_1
iconst_1
iaload
iload_0
if_icmple Label0
iconst_1
goto Label1
Label0:
iconst_0
Label1:
istore_2
Label7:
.line 7
iload_2
ifeq Label2
iload_0
ireturn
Label2:
.line 8
aload_1
iconst_0
iaload
Label4:
ireturn
.limit stack 2
.limit locals 3
Slide 11: Example – Jasmin (cont’d)
.var 1 is d [F from Label3 to Label2
Label1:
.line 11
iconst_3
newarray float
astore_1
Label3:
.line 12
getstatic Cout.c [F
iconst_0
aload_1
iconst_0
Label0:
.line 14
getstatic Cout.a I
iconst_1
iadd
putstatic Cout.a I
.line 15
getstatic Cout.a I
bipush 10
if_icmple Label0
Label2:
.line 16
return
Slide 13: Code Generation Design
Independent-machine Code Generation
Dependent-machine Code Generation
Slide 15: Intermediate Code Generation
• Depend on both language and machine
• Select instructions
• Select data objects
• Simulate the execution of the machine
– emitICONST → push()
– emitISTORE → pop()
– emitREADVAR(a) → emitILOAD(index), emitFLOAD(index), …, getstatic a …
Slide 16
• Tools that manage the information used to generate code for a method
– isMain: generating code for main differs from generating code for a normal method
– Labels: valid within the body of a method
• getNewLabel(): return a new label
• getStartLabel(): return the beginning label of a scope
• getEndLabel(): return the end label of a scope
• getContinueLabel(): return the label that a continue should jump to
• getBreakLabel(): return the label that a break should jump to
– Local variable array
• getNewIndex(): return a new index for a variable
• getMaxIndex(): return the size of the local variable array
– Operand stack
• push(): simulate a push
• pop(): simulate a pop
• getMaxOpStackSize(): return the max size of the operand stack
• enterScope()
• exitScope()
• enterLoop()
• exitLoop()
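The bookkeeping listed above can be sketched as follows. This is a minimal illustration, not the course's Frame.java: the class name FrameSketch, the field names, and the way enterScope()/exitScope() save and restore the variable index are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical per-method bookkeeping; NOT the course's Frame.java. */
final class FrameSketch {
    private int nextLabel = 0;
    private int nextIndex = 0;       // next free slot in the local variable array
    private int maxIndex = 0;        // becomes ".limit locals"
    private int stackSize = 0;       // simulated operand stack depth
    private int maxStackSize = 0;    // becomes ".limit stack"
    private final Deque<Integer> savedIndexes = new ArrayDeque<>();

    int getNewLabel() { return nextLabel++; }

    int getNewIndex() {
        int index = nextIndex++;
        maxIndex = Math.max(maxIndex, nextIndex);
        return index;
    }

    int getMaxIndex() { return maxIndex; }

    /** Called whenever an emitted instruction pushes one value. */
    void push() {
        stackSize++;
        maxStackSize = Math.max(maxStackSize, stackSize);
    }

    /** Called whenever an emitted instruction pops one value. */
    void pop() {
        if (stackSize == 0) throw new IllegalStateException("operand stack underflow");
        stackSize--;
    }

    int getMaxOpStackSize() { return maxStackSize; }

    /** Assumption: slots allocated inside a scope are reused after it ends. */
    void enterScope() { savedIndexes.push(nextIndex); }

    void exitScope() { nextIndex = savedIndexes.pop(); }
}

final class FrameSketchDemo {
    public static void main(String[] args) {
        FrameSketch frame = new FrameSketch();
        // Simulate: bipush 8, bipush 12, imul (pops 2, pushes 1), istore_1
        frame.push(); frame.push();
        frame.pop(); frame.pop(); frame.push();
        frame.pop();
        System.out.println(".limit stack " + frame.getMaxOpStackSize());
    }
}
```

Tracking the maximum depth while emitting is what lets the generator fill in the .limit stack and .limit locals directives at the end of a method.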
Slide 17: Independent-Machine Code Generation
• Based on the source language
• Use the facilities of Frame (Frame.java) and Intermediate Code Generation (Emitter.java)
Slide 19: Symbol Entries
• Token id;
• Type type;
• int scope; //GLOBAL or LOCAL
- the scope where the id is declared
• Object object;
- the index if id is a local variable
- the class name if id is a method name
Slide 20: Variable Declarations
• Global variables
.field static name type-desc
- for an array, generating code to initialize the array in the class init
• Local variables
.var var-index is name type-desc from scopeStart-label to scopeEnd-label
- for an array, generating code to initialize the array in the method
• How?
– Emitter.emitVARDECL(SymEntry sym)
– Put each array declaration in a list and pass the list to Emitter.emitCLINIT(<list>) for global declarations /*TODO*/, or to Emitter.emitLISTARRAY(<list>) for locals
Slide 21: Global Procedure Declarations
Slide 23: visitCallExprAST
Slide 25
imul idiv fmul fdiv {left for you}
Slide 26
Type t1=(Type) …
Type t2=(Type) …
else … return t1;
}
Tool: Emitter.java
emitADDOP(String,Type) => iadd, isub, fadd, fsub
emitMULOP(String,Type) => imul, idiv, fmul, fdiv
Slide 27
• 8 * 10 - 12 / 4
bipush 8
bipush 10
imul
bipush 12
iconst_4
idiv
isub
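The instruction order above is what a post-order walk of the expression tree produces: operands first, then the operator. The following sketch shows that walk for integer expressions; the node classes and ArithGenSketch are hypothetical stand-ins for illustration, not the course's AST or Emitter classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical AST nodes for integer expressions (not the course's AST). */
abstract class ExprNode {}

final class IntLiteral extends ExprNode {
    final int value;
    IntLiteral(int value) { this.value = value; }
}

final class BinaryExpr extends ExprNode {
    final char op;
    final ExprNode left, right;
    BinaryExpr(char op, ExprNode left, ExprNode right) {
        this.op = op; this.left = left; this.right = right;
    }
}

final class ArithGenSketch {
    /** Post-order walk: code for both operands, then the operator. */
    static List<String> generate(ExprNode e) {
        List<String> out = new ArrayList<>();
        emit(e, out);
        return out;
    }

    private static void emit(ExprNode e, List<String> out) {
        if (e instanceof IntLiteral) {
            out.add(pushInt(((IntLiteral) e).value));
            return;
        }
        BinaryExpr b = (BinaryExpr) e;
        emit(b.left, out);
        emit(b.right, out);
        switch (b.op) {
            case '+': out.add("iadd"); break;
            case '-': out.add("isub"); break;
            case '*': out.add("imul"); break;
            case '/': out.add("idiv"); break;
            default: throw new IllegalArgumentException("unknown operator: " + b.op);
        }
    }

    /** Pick the shortest JVM instruction that pushes the constant. */
    static String pushInt(int v) {
        if (v == -1) return "iconst_m1";
        if (v >= 0 && v <= 5) return "iconst_" + v;
        if (v >= -128 && v <= 127) return "bipush " + v;
        if (v >= -32768 && v <= 32767) return "sipush " + v;
        return "ldc " + v;
    }

    public static void main(String[] args) {
        ExprNode e = new BinaryExpr('-',
            new BinaryExpr('*', new IntLiteral(8), new IntLiteral(10)),
            new BinaryExpr('/', new IntLiteral(12), new IntLiteral(4)));
        for (String line : generate(e)) System.out.println(line);
    }
}
```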
Slide 28: Boolean Expressions
Object visitBinExprAST(ast,…) {
  …
  else if (op.kind == Token.ANDOP) em.printout(em.emitANDOP());
  else if (op.kind == Token.OROP) em.printout(em.emitOROP());
• (false && true)
iconst_0
iconst_1
iand
• boolean foo() {putString("true"); return true;}
(false && foo());
⇒ in short-circuit evaluation, no string is printed out
iconst_0
invokestatic Cout/foo()Z
⇒ with this code, "true" is printed out
Use the tools Emitter.emitANDOP() and Emitter.emitOROP()
Slide 29: Short-circuit Evaluation
• evaluation order is from left to right
• for andop, if the left operand is false, the expr is false
• for orop, if the left operand is true, the expr is true
• boolean foo() {putString("true"); return true;}
ifeq LabelF
invokestatic Cout/foo()Z
ifeq LabelF
iconst_1
goto LabelO
LabelF:
Write your own Emitter.emitANDOP(
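One possible shape for a short-circuit emitter is sketched below. It is only an illustration of the jump pattern shown on the slide: the class ShortCircuitSketch, its method signatures, and the choice to pass operand code as string lists are assumptions, and the real Emitter.emitANDOP may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; the signatures here are assumptions, not Emitter's API. */
final class ShortCircuitSketch {
    private int labelCount = 0;

    String newLabel() { return "Label" + labelCount++; }

    /** Code for (left && right); each operand leaves 0 or 1 on the stack. */
    List<String> and(List<String> left, List<String> right) {
        String labelF = newLabel();
        String labelO = newLabel();
        List<String> code = new ArrayList<>(left);
        code.add("ifeq " + labelF);      // left is false: skip right entirely
        code.addAll(right);
        code.add("ifeq " + labelF);      // right is false
        code.add("iconst_1");            // both true
        code.add("goto " + labelO);
        code.add(labelF + ":");
        code.add("iconst_0");
        code.add(labelO + ":");
        return code;
    }

    /** Code for (left || right): the mirror image, jumping out on true. */
    List<String> or(List<String> left, List<String> right) {
        String labelT = newLabel();
        String labelO = newLabel();
        List<String> code = new ArrayList<>(left);
        code.add("ifne " + labelT);      // left is true: skip right entirely
        code.addAll(right);
        code.add("ifne " + labelT);      // right is true
        code.add("iconst_0");            // both false
        code.add("goto " + labelO);
        code.add(labelT + ":");
        code.add("iconst_1");
        code.add(labelO + ":");
        return code;
    }
}
```

With left = iconst_0 and right = invokestatic Cout/foo()Z, the first ifeq jumps past the call, so foo() never runs and nothing is printed.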
Slide 30: Relational Expressions
a > b
iload_1 ; a
iload_2 ; b
if_icmple LabelF
iconst_1
goto LabelO
LabelF:
iconst_0
LabelO:

fload_1 ; a
fload_2 ; b
fcmpl
ifle LabelF
iconst_1
goto LabelO
LabelF:
iconst_0
LabelO:
Write your own Emitter.emitRELOP(String,Type,LabelF,LabelO)
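A hedged sketch of the pattern above: branch to LabelF on the negated relation, otherwise push 1. The class RelOpSketch and the boolean isFloat parameter (standing in for the slide's Type argument) are assumptions for illustration. For float operands the sketch picks fcmpg for < and <= and fcmpl otherwise, so that a NaN operand makes every ordered comparison false.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; isFloat stands in for the slide's Type parameter. */
final class RelOpSketch {
    /** Suffix of the branch taken when the relation is FALSE. */
    private static String negated(String op) {
        switch (op) {
            case ">":  return "le";
            case ">=": return "lt";
            case "<":  return "ge";
            case "<=": return "gt";
            case "==": return "ne";
            case "!=": return "eq";
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    /** Assumes both operands are already on the operand stack. */
    static List<String> relop(String op, boolean isFloat, String labelF, String labelO) {
        List<String> code = new ArrayList<>();
        if (isFloat) {
            // Compare first, then branch on the int result (-1, 0 or 1).
            code.add(op.startsWith("<") ? "fcmpg" : "fcmpl");
            code.add("if" + negated(op) + " " + labelF);
        } else {
            code.add("if_icmp" + negated(op) + " " + labelF);
        }
        code.add("iconst_1");
        code.add("goto " + labelO);
        code.add(labelF + ":");
        code.add("iconst_0");
        code.add(labelO + ":");
        return code;
    }

    public static void main(String[] args) {
        for (String line : relop(">", false, "LabelF", "LabelO")) System.out.println(line);
    }
}
```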
Slide 31
istore_1 ; a
bipush 8
bipush 12
imul
dup
istore_1 ; a
dup
istore_2 ; b
bipush 8
bipush 12
dup
istore_1 ; a
imul ; b
{E.code = E1.code+Emitter.emitDUP()+T.code;}
bipush 8
bipush 12
imul
dup
istore_1 ; a
istore_2 ; b
{left for you}
Slide 32
• a + 8
iload_1 ; a
bipush 8
iadd
• visitVarExprAST(ast,…) {
    SymEntry s=sym.lookup(ast.name.Lexeme);
    em.printout(em.emitREADVAR(s));
    Type t = s.getType();
    return t;
  }
class VarExprAST{
    Token name; }
Tool:
Emitter.emitREADVAR(SymEntry)
Slide 33
• method-name = class-name + '/' + name.Lexeme
• class name of
iload_1
iload_2
iconst_4
imul
invokestatic Cout/foo …
class CallExprAST {
    ExprListAST e;
    Token name;
Slide 35
– Repeat statement (left for you)
– For statement (left for you)
• Continue statement
• Break statement
visitAssiStmtAST
visitCompStmtAST
visitIfThenStmtAST, visitIfThenElseStmtAST
visitWhileStmtAST
visitRepeatStmtAST
visitForStmtAST
visitContStmtAST
visitBreakStmtAST
Slide 36: Assignment Statement
• a = 8 * 12;
bipush 8
bipush 12
imul
istore_1 ; a
• visitAssiStmtAST(ast,…){
    ast.l.visit(this,true);
    ast.e.visit(this,o);
    ast.l.visit(this,false);
• b[1] = 8 * 12 ; ???
aload_2
iconst_1
bipush 8
bipush 12
imul
iastore
class AssiStmtAST {
    LvalueAST l;
    ExprAST e;
abstract class LvalueAST
class VarExprAST extends LvalueAST
Slide 37: Lvalue vs. Rvalue
a = a + 1
visitVarExprAST(ast, o)
• o is null → rvalue, use slide 32
• o is a Boolean → lvalue
– o is true → do nothing
– o is false → use Emitter.emitWRITEVAR
Slide 39
.var 2 is b Z from Label5 to Label6
Label5:
iconst_1
istore_2
The index of local variables is reset when the parser leaves a scope
Slide 40: Compound Statements (cont’d)
.var 2 is a I from Label2 to Label3
On entering a scope:
– create 2 new labels, startLabel and endLabel
– push these labels onto the stacks startStack and endStack
– keep the new index of local variables
On leaving a scope:
– pop the stacks startStack and endStack
– restore the new index of local variables
class CompStmtAST {
    VarDeclPartAST v;
Slide 41
goto Label2
Label1:
iconst_0
Label2:
ifeq Label3
iconst_1
istore_3
goto Label4
Label3:
Slide 43
[Figure: control-flow diagram of a loop with nodes E1, CE2, S and inc/dec; the continue label is marked twice]
Slide 44: While Statements
while (a > b) do a = a - 1;
visitWhileStmtAST(ast …
ast.e.visit …
ast.s.visit …
Label1:
iload_1
iload_2
if_icmple Label3
iconst_1
goto Label4
Label3:
iconst_0
Label4:
ifeq Label2
iload_1
iconst_1
isub
istore_1
goto Label1
Label2:
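The loop layout above, together with the label lookup that continue and break need, can be sketched as follows. LoopGenSketch and its label stacks are hypothetical; in the course code this role is played by Frame (enterLoop(), exitLoop(), getContinueLabel(), getBreakLabel()), whose actual implementation is not shown in the slides.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical loop generator; label stacks play the role of Frame's loop state. */
final class LoopGenSketch {
    private int labelCount = 0;
    private final Deque<String> continueLabels = new ArrayDeque<>();
    private final Deque<String> breakLabels = new ArrayDeque<>();

    String newLabel() { return "Label" + labelCount++; }

    String continueJump() {
        if (continueLabels.isEmpty()) throw new IllegalStateException("continue outside a loop");
        return "goto " + continueLabels.peek();
    }

    String breakJump() {
        if (breakLabels.isEmpty()) throw new IllegalStateException("break outside a loop");
        return "goto " + breakLabels.peek();
    }

    /** cond leaves 0 or 1 on the stack; body is generated while the loop is open. */
    List<String> whileLoop(List<String> cond, Supplier<List<String>> body) {
        String start = newLabel();
        String end = newLabel();
        continueLabels.push(start);          // enterLoop
        breakLabels.push(end);
        List<String> code = new ArrayList<>();
        code.add(start + ":");
        code.addAll(cond);
        code.add("ifeq " + end);             // condition false: leave the loop
        code.addAll(body.get());
        code.add("goto " + start);
        code.add(end + ":");
        continueLabels.pop();                // exitLoop
        breakLabels.pop();
        return code;
    }

    public static void main(String[] args) {
        LoopGenSketch g = new LoopGenSketch();
        // Assumption for the demo: local 1 holds a boolean loop condition.
        List<String> code = g.whileLoop(
            Arrays.asList("iload_1"),
            () -> Arrays.asList("iinc 2 1", g.continueJump()));
        for (String line : code) System.out.println(line);
    }
}
```

Generating the body through a callback keeps the labels on the stacks for exactly as long as the loop is open, so nested loops resolve break and continue to the innermost loop.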
Slide 45: Continue and Break
S → continue;
{em.printout(em.emitGOTO(Frame.getContinueLabel()));}
S → break;
{em.printout(em.emitGOTO(Frame.getBreakLabel()));}
Slide 46: Return Statements
return 0;
Slide 51: Intermediate Languages (cont’d)
[Table with columns: Production, Semantic Rule]
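Slide 52: Intermediate Languages (cont’d)
a = b * -c + b * -c
Three-address code for the syntax tree:
t1 = -c
t2 = b * t1
t3 = -c
t4 = b * t3
t5 = t2 + t4
a = t5
Three-address code for the DAG:
t1 = -c
t2 = b * t1
t3 = t2 + t2
a = t3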
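A small sketch of how three-address code with fresh temporaries can be produced from a syntax tree. The class TacSketch and its node types are made up for this example; the walk creates one temporary per operator node, which yields the longer (syntax-tree) form of the code for a = b * -c + b * -c.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical three-address code generator over a tiny syntax tree. */
final class TacSketch {
    static abstract class Node {}

    static final class Var extends Node {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class Neg extends Node {
        final Node operand;
        Neg(Node operand) { this.operand = operand; }
    }

    static final class Bin extends Node {
        final char op;
        final Node left, right;
        Bin(char op, Node left, Node right) { this.op = op; this.left = left; this.right = right; }
    }

    final List<String> code = new ArrayList<>();
    private int tempCount = 0;

    private String newTemp() { return "t" + (++tempCount); }

    /** Emits code for n and returns the name that holds its value. */
    String gen(Node n) {
        if (n instanceof Var) return ((Var) n).name;
        if (n instanceof Neg) {
            String v = gen(((Neg) n).operand);
            String t = newTemp();
            code.add(t + " = -" + v);
            return t;
        }
        Bin b = (Bin) n;
        String l = gen(b.left);
        String r = gen(b.right);
        String t = newTemp();
        code.add(t + " = " + l + " " + b.op + " " + r);
        return t;
    }

    void assign(String target, Node n) { code.add(target + " = " + gen(n)); }

    public static void main(String[] args) {
        TacSketch g = new TacSketch();
        Node left = new Bin('*', new Var("b"), new Neg(new Var("c")));
        Node right = new Bin('*', new Var("b"), new Neg(new Var("c")));
        g.assign("a", new Bin('+', left, right));
        for (String line : g.code) System.out.println(line);
    }
}
```

Sharing the common subexpression b * -c, as a DAG does, is what shortens the sequence to four instructions.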
Slide 53: Intermediate Languages (cont’d)
Slide 54
– jump backward ⇒ easy
– jump forward ⇒ difficult ⇒ backpatching
• makelist(i)
• merge(l1,l2)
• backpatch(l,i)
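The three operations can be sketched over a list of instructions whose jump targets are filled in later. BackpatchSketch, the textual instruction format, and the convention that an unfinished jump ends in "goto " are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical backpatching helper; an unfinished jump ends with "goto ". */
final class BackpatchSketch {
    final List<String> code = new ArrayList<>();

    /** Appends an instruction and returns its index. */
    int emit(String instr) {
        code.add(instr);
        return code.size() - 1;
    }

    /** makelist(i): a new list containing only instruction index i. */
    static List<Integer> makelist(int i) {
        List<Integer> list = new ArrayList<>();
        list.add(i);
        return list;
    }

    /** merge(l1,l2): concatenation of the two lists. */
    static List<Integer> merge(List<Integer> l1, List<Integer> l2) {
        List<Integer> list = new ArrayList<>(l1);
        list.addAll(l2);
        return list;
    }

    /** backpatch(l,i): make i the target of every jump on list l. */
    void backpatch(List<Integer> list, int target) {
        for (int index : list) {
            code.set(index, code.get(index) + target);
        }
    }

    public static void main(String[] args) {
        // if (a < b) x = 1; else x = 2;
        BackpatchSketch bp = new BackpatchSketch();
        List<Integer> falseList = makelist(bp.emit("if a >= b goto "));
        bp.emit("x = 1");
        List<Integer> nextList = makelist(bp.emit("goto "));
        bp.backpatch(falseList, bp.code.size());   // else branch starts here
        bp.emit("x = 2");
        bp.backpatch(nextList, bp.code.size());    // first instruction after the if
        for (int i = 0; i < bp.code.size(); i++) System.out.println(i + ": " + bp.code.get(i));
    }
}
```

Both jumps in the example go forward, so their targets are unknown when they are emitted; the lists remember which instructions still need a target.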
Slide 55
MODE               FORM   ADDRESS          ADDED COST
absolute           M      M                1
register           R      R                0
indexed            c(R)   c+contents(R)    1
indirect register  *R
Slide 57
(R1, R2 contain the values of b and c, respectively)
(R0, R1 and R2 contain the addresses of a, b and c, respectively)
cost = 6
cost = 6
cost = 2
Slide 58: Register Allocation
• Register allocation
– which variables will reside in registers at a point in the program
• Register assignment
– which register a variable will reside in
Slide 59
variable i should reside in a register, r0
variable c should reside in a register, r1
c resides in r0
where is c ?
Slide 60: Basic Blocks
-Algorithm 9.1, page 529
Slide 61: Flow Graphs
• A flow graph is a directed graph whose nodes are basic blocks and whose edges represent the flow of control.
Slide 62: Next-Use and Liveness
i: x =
…
j: … = x
• there is a path from i to j with no assignment to x
⇒ j uses the value of x computed at i
• a variable x is live at point p if the value of x could be used along some path in the flow graph starting at p. Otherwise, it is dead.
Read page 534 for details
Read page 633 for details
(1) a = 1
(2) b = a + 1
(3) print(b);
(2) uses the value of a computed at (1)
a is live at (2) but dead at (3)
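Within a single basic block, liveness can be computed with one backward scan: remove the variable a statement defines, then add the variables it uses. The sketch below applies that scan to the three-statement example; LivenessSketch and its Stmt class are hypothetical, and "live at statement k" is read here as live just before k executes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical backward liveness scan over one basic block. */
final class LivenessSketch {
    static final class Stmt {
        final String def;          // variable assigned, or null if none
        final Set<String> uses;    // variables read

        Stmt(String def, String... uses) {
            this.def = def;
            this.uses = new HashSet<>(Arrays.asList(uses));
        }
    }

    /** Returns, for each statement, the set of variables live just before it. */
    static List<Set<String>> liveBefore(List<Stmt> block, Set<String> liveOnExit) {
        Set<String> live = new HashSet<>(liveOnExit);
        List<Set<String>> result = new ArrayList<>();
        for (int i = 0; i < block.size(); i++) result.add(null);
        for (int i = block.size() - 1; i >= 0; i--) {
            Stmt s = block.get(i);
            if (s.def != null) live.remove(s.def);   // the old value dies at its definition
            live.addAll(s.uses);                     // a use makes the variable live
            result.set(i, new TreeSet<>(live));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Stmt> block = Arrays.asList(
            new Stmt("a"),            // (1) a = 1
            new Stmt("b", "a"),       // (2) b = a + 1
            new Stmt(null, "b"));     // (3) print(b)
        List<Set<String>> live = liveBefore(block, new HashSet<String>());
        for (int i = 0; i < live.size(); i++) {
            System.out.println("live before (" + (i + 1) + "): " + live.get(i));
        }
    }
}
```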
Slide 63: Simple Code Generator
• For a basic block
• Assume computed results can be left in registers as long as possible, storing them only when
– their register is needed for another computation
– the end of a basic block is about to be reached
Slide 64: Simple Code-Generation Algorithm
For each statement x := y op z:
• Call getreg() to determine the location L where the result should be stored.
• Consult the address descriptor for y to find y’, a location where the current value of y is held
• If y’ is not L, generate MOV y’,L
• Generate OP z’,L, where z’ is the current location of z.
• Update address descriptor(x) = {L}
• If L is a register, update descriptor(L) = {x}
• Update the register descriptors if the current values of y and/or z
– have no next uses
– are not live on exit from the block
Slide 65: Simple Function Getreg
Returns the location L to hold the value of x for the statement x = y op z
1. if y
• is in a register (that holds the value of no other names)
• is not live
• has no next use
⇒ return that register as L and update the address descriptor of y so that it no longer includes L
2. else return an empty register if there is one
3. else if x has a next use, or op is an operator that requires a register
⇒ find an occupied register R
store the value of R into a memory location (MOV R,M)
update descriptor(M)
return R
Slide 66
d is live at the end of the block while the others are dead
all registers (r0,r1) are empty
Code ???
Slide 67
Statements   Code generated        Register descriptor             Address descriptor
t = a - b    MOV a,r0; SUB b,r0    r0 contains t                   t in r0
u = a - c    MOV a,r1; SUB c,r1
…                                  r0 contains v; r1 contains u    v in r0; u in r1
d = v + u