The Java® Language
Specification
Java SE 8 Edition
James Gosling Bill Joy Guy Steele Gilad Bracha Alex Buckley
2014-03-03
Specification: JSR-337 Java SE 8 Release Contents ("Specification")
Version: 8
Status: Final Release
Release: March 2014
Copyright © 1997, 2014, Oracle America, Inc. and/or its affiliates. All rights reserved.
500 Oracle Parkway, Redwood City, California 94065, U.S.A.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may
be trademarks of their respective owners.
The Specification provided herein is provided to you only under the Limited License Grant
included herein as Appendix A. Please see Appendix A, Limited License Grant.
To Maurizio, with deepest thanks.
2.2 The Lexical Grammar
2.3 The Syntactic Grammar
3.10.6 Escape Sequences for Character and String Literals
3.10.7 The Null Literal
3.11 Separators
3.12 Operators
4 Types, Values, and Variables
4.1 The Kinds of Types and Values
4.2 Primitive Types and Values
4.2.1 Integral Types and Values
4.2.2 Integer Operations
4.2.3 Floating-Point Types, Formats, and Values
4.2.4 Floating-Point Operations
4.2.5 The boolean Type and boolean Values
4.3 Reference Types and Values
4.3.1 Objects
4.3.2 The Class Object
4.3.3 The Class String
4.3.4 When Reference Types Are the Same
4.4 Type Variables
4.5 Parameterized Types
4.5.1 Type Arguments of Parameterized Types
4.5.2 Members and Constructors of Parameterized Types
4.6 Type Erasure
4.10.4 Least Upper Bound
4.11 Where Types Are Used
4.12 Variables
4.12.1 Variables of Primitive Type
4.12.2 Variables of Reference Type
4.12.3 Kinds of Variables
4.12.4 final Variables
4.12.5 Initial Values of Variables
4.12.6 Types, Classes, and Interfaces
5 Conversions and Contexts
5.1 Kinds of Conversion
5.1.1 Identity Conversion
5.1.2 Widening Primitive Conversion
5.1.3 Narrowing Primitive Conversion
5.1.4 Widening and Narrowing Primitive Conversion
5.1.5 Widening Reference Conversion
5.1.6 Narrowing Reference Conversion
5.1.7 Boxing Conversion
5.1.8 Unboxing Conversion
5.1.9 Unchecked Conversion
5.1.10 Capture Conversion
5.5.1 Reference Type Casting
5.5.2 Checked Casts and Unchecked Casts
5.5.3 Checked Casts at Run Time
5.6 Numeric Contexts
5.6.1 Unary Numeric Promotion
5.6.2 Binary Numeric Promotion
6.5 Determining the Meaning of a Name
6.5.1 Syntactic Classification of a Name According to Context
6.5.2 Reclassification of Contextually Ambiguous Names
6.5.3 Meaning of Package Names
6.5.3.1 Simple Package Names
6.5.3.2 Qualified Package Names
6.5.4 Meaning of PackageOrTypeNames
6.5.4.1 Simple PackageOrTypeNames
6.5.4.2 Qualified PackageOrTypeNames
6.5.5 Meaning of Type Names
6.5.5.1 Simple Type Names
6.5.5.2 Qualified Type Names
6.5.6 Meaning of Expression Names
6.5.6.1 Simple Expression Names
6.5.6.2 Qualified Expression Names
6.5.7 Meaning of Method Names
6.5.7.1 Simple Method Names
6.6 Access Control
6.6.1 Determining Accessibility
6.6.2 Details on protected Access
6.6.2.1 Access to a protected Member
6.6.2.2 Qualified Access to a protected Constructor
6.7 Fully Qualified Names and Canonical Names
7 Packages
7.1 Package Members
7.2 Host Support for Packages
7.3 Compilation Units
7.4 Package Declarations
7.4.1 Named Packages
7.4.2 Unnamed Packages
7.4.3 Observability of a Package
7.5 Import Declarations
7.5.1 Single-Type-Import Declarations
7.5.2 Type-Import-on-Demand Declarations
7.5.3 Single-Static-Import Declarations
7.5.4 Static-Import-on-Demand Declarations
7.6 Top Level Type Declarations
8 Classes
8.1 Class Declarations
8.1.1 Class Modifiers
8.1.1.1 abstract Classes
8.1.1.2 final Classes
8.1.1.3 strictfp Classes
8.1.2 Generic Classes and Type Parameters
8.1.3 Inner Classes and Enclosing Instances
8.1.4 Superclasses and Subclasses
8.1.5 Superinterfaces
8.1.6 Class Body and Member Declarations
8.2 Class Members
8.3 Field Declarations
8.3.1 Field Modifiers
8.3.1.1 static Fields
8.3.1.2 final Fields
8.3.1.3 transient Fields
8.3.1.4 volatile Fields
8.3.2 Field Initialization
8.3.3 Forward References During Field Initialization
8.4 Method Declarations
8.4.1 Formal Parameters
8.4.2 Method Signature
8.4.3 Method Modifiers
8.4.3.1 abstract Methods
8.4.3.2 static Methods
8.4.3.3 final Methods
8.4.3.4 native Methods
8.4.3.5 strictfp Methods
8.4.3.6 synchronized Methods
8.4.4 Generic Methods
8.4.5 Method Result
8.4.6 Method Throws
8.4.7 Method Body
8.4.8 Inheritance, Overriding, and Hiding
8.4.8.1 Overriding (by Instance Methods)
8.4.8.2 Hiding (by Class Methods)
8.4.8.3 Requirements in Overriding and Hiding
8.4.8.4 Inheriting Methods with Override-Equivalent Signatures
8.4.9 Overloading
8.5 Member Type Declarations
8.5.1 Static Member Type Declarations
9.1.3 Superinterfaces and Subinterfaces
9.1.4 Interface Body and Member Declarations
9.2 Interface Members
9.3 Field (Constant) Declarations
9.3.1 Initialization of Fields in Interfaces
9.4 Method Declarations
9.4.1 Inheritance and Overriding
9.4.1.1 Overriding (by Instance Methods)
9.4.1.2 Requirements in Overriding
9.4.1.3 Inheriting Methods with Override-Equivalent Signatures
9.4.2 Overloading
9.4.3 Interface Method Body
9.5 Member Type Declarations
9.6 Annotation Types
9.6.1 Annotation Type Elements
9.6.2 Defaults for Annotation Type Elements
9.6.3 Repeatable Annotation Types
9.6.4 Predefined Annotation Types
9.6.4.1 @Target
9.6.4.2 @Retention
9.6.4.3 @Inherited
9.6.4.4 @Override
9.6.4.5 @SuppressWarnings
9.6.4.6 @Deprecated
9.6.4.7 @SafeVarargs
9.6.4.8 @Repeatable
9.6.4.9 @FunctionalInterface
9.7 Annotations
9.7.1 Normal Annotations
9.7.2 Marker Annotations
9.7.3 Single-Element Annotations
9.7.4 Where Annotations May Appear
9.7.5 Multiple Annotations Of The Same Type
9.8 Functional Interfaces
10.8 Class Objects for Arrays
10.9 An Array of Characters is Not a String
11 Exceptions
11.1 The Kinds and Causes of Exceptions
11.1.1 The Kinds of Exceptions
11.1.2 The Causes of Exceptions
11.1.3 Asynchronous Exceptions
11.2 Compile-Time Checking of Exceptions
11.2.1 Exception Analysis of Expressions
11.2.2 Exception Analysis of Statements
11.2.3 Exception Checking
11.3 Run-Time Handling of an Exception
12 Execution
12.1 Java Virtual Machine Startup
12.1.1 Load the Class Test
12.1.2 Link Test: Verify, Prepare, (Optionally) Resolve
12.1.3 Initialize Test: Execute Initializers
12.1.4 Invoke Test.main
12.2 Loading of Classes and Interfaces
12.2.1 The Loading Process
12.3 Linking of Classes and Interfaces
12.3.1 Verification of the Binary Representation
12.3.2 Preparation of a Class or Interface Type
12.3.3 Resolution of Symbolic References
12.4 Initialization of Classes and Interfaces
12.4.1 When Initialization Occurs
12.4.2 Detailed Initialization Procedure
12.5 Creation of New Class Instances
12.6 Finalization of Class Instances
12.6.1 Implementing Finalization
12.6.2 Interaction with the Memory Model
12.7 Unloading of Classes and Interfaces
12.8 Program Exit
13 Binary Compatibility
13.1 The Form of a Binary
13.2 What Binary Compatibility Is and Is Not
13.4.4 Superclasses and Superinterfaces
13.4.5 Class Type Parameters
13.4.6 Class Body and Member Declarations
13.4.7 Access to Members and Constructors
13.4.8 Field Declarations
13.4.9 final Fields and static Constant Variables
13.4.10 static Fields
13.4.11 transient Fields
13.4.12 Method and Constructor Declarations
13.4.13 Method and Constructor Type Parameters
13.4.14 Method and Constructor Formal Parameters
13.4.15 Method Result Type
13.4.21 Method and Constructor Throws
13.4.22 Method and Constructor Body
13.4.23 Method and Constructor Overloading
13.4.24 Method Overriding
13.4.25 Static Initializers
13.4.26 Evolution of Enums
13.5 Evolution of Interfaces
13.5.1 public Interfaces
13.5.2 Superinterfaces
13.5.3 Interface Members
13.5.4 Interface Type Parameters
13.5.5 Field Declarations
13.5.6 Interface Method Declarations
13.5.7 Evolution of Annotation Types
14 Blocks and Statements
14.1 Normal and Abrupt Completion of Statements
14.2 Blocks
14.3 Local Class Declarations
14.4 Local Variable Declaration Statements
14.4.1 Local Variable Declarators and Types
14.4.2 Execution of Local Variable Declarations
14.5 Statements
14.6 The Empty Statement
14.11 The switch Statement
14.12 The while Statement
14.12.1 Abrupt Completion of while Statement
14.13 The do Statement
14.13.1 Abrupt Completion of do Statement
14.14 The for Statement
14.14.1 The basic for Statement
14.14.1.1 Initialization of for Statement
14.14.1.2 Iteration of for Statement
14.14.1.3 Abrupt Completion of for Statement
14.14.2 The enhanced for statement
14.15 The break Statement
14.16 The continue Statement
14.17 The return Statement
14.18 The throw Statement
14.19 The synchronized Statement
14.20 The try statement
14.20.1 Execution of try-catch
14.20.2 Execution of try-finally and try-catch-finally
14.20.3 try-with-resources
14.20.3.1 Basic try-with-resources
14.20.3.2 Extended try-with-resources
14.21 Unreachable Statements
15.5 Expressions and Run-Time Checks
15.6 Normal and Abrupt Completion of Evaluation
15.7 Evaluation Order
15.7.1 Evaluate Left-Hand Operand First
15.7.2 Evaluate Operands before Operation
15.7.3 Evaluation Respects Parentheses and Precedence
15.7.4 Argument Lists are Evaluated Left-to-Right
15.7.5 Evaluation Order for Other Expressions
15.9 Class Instance Creation Expressions
15.9.1 Determining the Class being Instantiated
15.9.2 Determining Enclosing Instances
15.9.3 Choosing the Constructor and its Arguments
15.9.4 Run-Time Evaluation of Class Instance Creation Expressions
15.9.5 Anonymous Class Declarations
15.9.5.1 Anonymous Constructors
15.10 Array Creation and Access Expressions
15.10.1 Array Creation Expressions
15.10.2 Run-Time Evaluation of Array Creation Expressions
15.10.3 Array Access Expressions
15.10.4 Run-Time Evaluation of Array Access Expressions
15.11 Field Access Expressions
15.11.1 Field Access Using a Primary
15.11.2 Accessing Superclass Members using super
15.12 Method Invocation Expressions
15.12.1 Compile-Time Step 1: Determine Class or Interface to Search
15.12.2 Compile-Time Step 2: Determine Method Signature
15.12.2.1 Identify Potentially Applicable Methods
15.12.2.2 Phase 1: Identify Matching Arity Methods Applicable by Strict Invocation
15.12.2.3 Phase 2: Identify Matching Arity Methods Applicable by Loose Invocation
15.12.2.4 Phase 3: Identify Methods Applicable by Variable Arity Invocation
15.12.2.5 Choosing the Most Specific Method
15.12.2.6 Method Invocation Type
15.12.3 Compile-Time Step 3: Is the Chosen Method Appropriate?
15.12.4 Run-Time Evaluation of Method Invocation
15.12.4.1 Compute Target Reference (If Necessary)
15.12.4.2 Evaluate Arguments
15.12.4.3 Check Accessibility of Type and Method
15.12.4.4 Locate Method to Invoke
15.12.4.5 Create Frame, Synchronize, Transfer Control
15.13 Method Reference Expressions
15.13.1 Compile-Time Declaration of a Method Reference
15.13.2 Type of a Method Reference
15.13.3 Run-time Evaluation of Method References
15.14 Postfix Expressions
15.14.1 Expression Names
15.14.2 Postfix Increment Operator ++
15.14.3 Postfix Decrement Operator --
15.15 Unary Operators
15.15.1 Prefix Increment Operator ++
15.15.2 Prefix Decrement Operator --
15.15.3 Unary Plus Operator +
15.15.4 Unary Minus Operator -
15.15.5 Bitwise Complement Operator ~
15.15.6 Logical Complement Operator !
15.16 Cast Expressions
15.17 Multiplicative Operators
15.17.1 Multiplication Operator *
15.17.2 Division Operator /
15.17.3 Remainder Operator %
15.18 Additive Operators
15.18.1 String Concatenation Operator +
15.18.2 Additive Operators (+ and -) for Numeric Types
15.19 Shift Operators
15.22.1 Integer Bitwise Operators &, ^, and |
15.22.2 Boolean Logical Operators &, ^, and |
15.23 Conditional-And Operator &&
15.24 Conditional-Or Operator ||
15.25 Conditional Operator ? :
15.25.1 Boolean Conditional Expressions
15.25.2 Numeric Conditional Expressions
15.25.3 Reference Conditional Expressions
15.26 Assignment Operators
15.26.1 Simple Assignment Operator =
15.26.2 Compound Assignment Operators
15.27 Lambda Expressions
15.27.1 Lambda Parameters
15.27.2 Lambda Body
15.27.3 Type of a Lambda Expression
15.27.4 Run-time Evaluation of Lambda Expressions
15.28 Constant Expressions
16 Definite Assignment
16.1 Definite Assignment and Expressions
16.1.1 Boolean Constant Expressions
16.1.2 Conditional-And Operator &&
16.2.3 Local Class Declaration Statements
16.2.4 Local Variable Declaration Statements
16.2.14 synchronized Statements
16.2.15 try Statements
16.3 Definite Assignment and Parameters
16.4 Definite Assignment and Array Initializers
16.5 Definite Assignment and Enum Constants
16.6 Definite Assignment and Anonymous Classes
16.7 Definite Assignment and Member Types
16.8 Definite Assignment and Static Initializers
16.9 Definite Assignment, Constructors, and Instance Initializers
17 Threads and Locks
17.1 Synchronization
17.2 Wait Sets and Notification
17.2.1 Wait
17.2.2 Notification
17.2.3 Interruptions
17.2.4 Interactions of Waits, Notification, and Interruption
17.3 Sleep and Yield
17.4 Memory Model
17.4.1 Shared Variables
17.4.2 Actions
17.4.3 Programs and Program Order
17.4.4 Synchronization Order
17.4.5 Happens-before Order
17.4.6 Executions
17.4.7 Well-Formed Executions
17.4.8 Executions and Causality Requirements
17.4.9 Observable Behavior and Nonterminating Executions
17.5 final Field Semantics
17.5.1 Semantics of final Fields
17.5.2 Reading final Fields During Construction
17.5.3 Subsequent Modification of final Fields
17.5.4 Write-protected Fields
18.2 Reduction
18.2.1 Expression Compatibility Constraints
18.2.2 Type Compatibility Constraints
18.2.3 Subtyping Constraints
18.2.4 Type Equality Constraints
18.2.5 Checked Exception Constraints
18.3 Incorporation
18.3.1 Complementary Pairs of Bounds
18.3.2 Bounds Involving Capture Conversion
18.4 Resolution
18.5 Uses of Inference
18.5.1 Invocation Applicability Inference
18.5.2 Invocation Type Inference
18.5.3 Functional Interface Parameterization Inference
18.5.4 More Specific Method Inference
19 Syntax
Index
A Limited License Grant
Preface to the Java SE 8 Edition
In 1996, James Gosling, Bill Joy, and Guy Steele wrote for the First Edition of
The Java Language Specification:
"We believe that the Java programming language is a mature language, ready for
widespread use. Nevertheless, we expect some evolution of the language in the
years to come. We intend to manage this evolution in a way that is completely
compatible with existing applications."
Java SE 8 represents the single largest evolution of the Java language in its history.
A relatively small number of features - lambda expressions, method references, and
functional interfaces - combine to offer a programming model that fuses the
object-oriented and functional styles. Under the leadership of Brian Goetz, this fusion
has been accomplished in a way that encourages best practices - immutability,
statelessness, compositionality - while preserving "the feel of Java" - readability,
simplicity, universality.
Crucially, the libraries of the Java SE platform have co-evolved with the Java
language. This means that using lambda expressions and method references to
represent behavior - for example, an operation to be applied to each element in
a list - is productive and performant "out of the box". In a similar fashion, the
Java Virtual Machine has co-evolved with the Java language to ensure that default
methods support library evolution as consistently as possible across compile time
and run time, given the constraints of separate compilation.
Initiatives to add first-class functions to the Java language have been around since
the 1990s. The BGGA and CICE proposals circa 2007 brought new energy to
the topic, while the creation of Project Lambda in OpenJDK circa 2009 attracted
unprecedented levels of interest. The addition of method handles to the JVM in
Java SE 7 opened the door to new implementation techniques while retaining
"write once, run anywhere". In time, language changes were overseen by JSR 335,
Lambda Expressions for the Java Programming Language, whose Expert Group
consisted of Joshua Bloch, Kevin Bourrillion, Andrey Breslav, Rémi Forax, Dan
Heidinga, Doug Lea, Bob Lee, David Lloyd, Sam Pullara, Srikanth Sankaran, and
Vladimir Zakharov.
Programming language design typically involves grappling with degrees of
complexity utterly hidden from the language's users. (For this reason, it is often
compared to an iceberg: 90% of it is invisible, "below the water line".) In JSR
335, the greatest complexity lurked in the interaction of implicitly typed lambda
expressions with overload resolution. In this and many other areas, Dan Smith
at Oracle did an outstanding job of thoroughly specifying the desired behavior.
His words are to be found throughout this specification, including an entirely new
chapter on type inference.
Another initiative in Java SE 8 has been to enhance the utility of annotations, one
of the most popular features of the Java language. First, the Java grammar has been
extended to allow annotations on types in many linguistic constructs, forming the
basis for novel static analysis tools such as the Checker Framework. This feature
was specified by JSR 308, Annotations on Java Types, led by Michael Ernst with an
Expert Group of myself, Doug Lea, and Srikanth Sankaran. The changes involved
in this specification were wide-ranging, and the unstinting efforts of Michael Ernst
and Werner Dietl over many years are warmly recognized. Second, annotations
may be "repeated" on a linguistic construct, to the great benefit of APIs that
model domain-specific configuration with annotation types. Michael Keith and Bill
Shannon in Java EE initiated and guided this feature.
Many colleagues in the Java Platform Group at Oracle have provided valuable
support to this specification: Leonid Arbouzov, Mandy Chung, Joe Darcy, Robert
Field, Joel Franck, Sonali Goel, Jon Gibbons, Jeannette Hung, Stuart Marks, Eric
McCorkle, Matherey Nunez, Mark Reinhold, Vicente Romero, John Rose, Georges
Saab, Steve Sides, Bernard Traversat, and Michel Trudeau.
Perhaps the greatest acknowledgement must go to the compiler engineers who
make this specification "real". Maurizio Cimadamore at Oracle worked heroically
from the earliest days on the design of lambda expressions and their implementation
in javac. Support for Java SE 8 features in Eclipse was contributed by Jayaprakash
Arthanareeswaran, Shankha Banerjee, Anirban Chakraborty, Andrew Clement,
Stephan Herrmann, Markus Keller, Jesper Møller, Manoj Palat, Srikanth Sankaran,
and Olivier Thomann; and in IntelliJ by Anna Kozlova, Alexey Kudravtsev, and
Roman Shevchenko. They deserve the thanks of the entire Java community.
Java SE 8 is a renaissance for the Java language. While some search for the
"next great language", we believe that programming in Java is more exciting and
productive than ever. We hope that it continues to wear well for you.
Alex Buckley
Santa Clara, California
February, 2014
CHAPTER 1
Introduction
The Java® programming language is a general-purpose, concurrent,
class-based, object-oriented language. It is designed to be simple enough that many
programmers can achieve fluency in the language. The Java programming language
is related to C and C++ but is organized rather differently, with a number of aspects
of C and C++ omitted and a few ideas from other languages included. It is intended
to be a production language, not a research language, and so, as C. A. R. Hoare
suggested in his classic paper on language design, the design has avoided including
new and untested features.
The Java programming language is strongly and statically typed. This specification
clearly distinguishes between the compile-time errors that can and must be detected
at compile time, and those that occur at run time. Compile time normally consists
of translating programs into a machine-independent byte code representation.
Run-time activities include loading and linking of the classes needed to execute
a program, optional machine code generation and dynamic optimization of the
program, and actual program execution.
The Java programming language is a relatively high-level language, in that details
of the machine representation are not available through the language. It includes
automatic storage management, typically using a garbage collector, to avoid
the safety problems of explicit deallocation (as in C's free or C++'s delete).
High-performance garbage-collected implementations can have bounded pauses to
support systems programming and real-time applications. The language does not
include any unsafe constructs, such as array accesses without index checking, since
such unsafe constructs would cause a program to behave in an unspecified way.
The Java programming language is normally compiled to the bytecoded instruction
set and binary format defined in The Java Virtual Machine Specification, Java SE
8 Edition.
1.1 Organization of the Specification
Chapter 2 describes grammars and the notation used to present the lexical and
syntactic grammars for the language.
Chapter 3 describes the lexical structure of the Java programming language, which
is based on C and C++. The language is written in the Unicode character set. It
supports the writing of Unicode characters on systems that support only ASCII.
Chapter 4 describes types, values, and variables. Types are subdivided into
primitive types and reference types.
The primitive types are defined to be the same on all machines and in all
implementations, and are various sizes of two's-complement integers, single- and
double-precision IEEE 754 standard floating-point numbers, a boolean type, and
a Unicode character char type. Values of the primitive types do not share state.
Reference types are the class types, the interface types, and the array types. The
reference types are implemented by dynamically created objects that are either
instances of classes or arrays. Many references to each object can exist. All objects
(including arrays) support the methods of the class Object, which is the (single)
root of the class hierarchy. A predefined String class supports Unicode character
strings. Classes exist for wrapping primitive values inside of objects. In many
cases, wrapping and unwrapping is performed automatically by the compiler (in
which case, wrapping is called boxing, and unwrapping is called unboxing). Class
and interface declarations may be generic, that is, they may be parameterized by
other reference types. Such declarations may then be invoked with specific type
arguments.
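As an informal illustration of boxing, unboxing, and the invocation of a generic declaration with a type argument, consider the following sketch. The class name BoxingSketch and the method sum are hypothetical names chosen for this example; they do not appear in the specification.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; not part of the specification.
class BoxingSketch {
    // Sums boxed Integer elements; each element is unboxed to int by the compiler.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {      // unboxing conversion: Integer -> int
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // List is a generic interface, invoked here with the type argument Integer.
        List<Integer> values = new ArrayList<>();
        values.add(1);              // boxing conversion: int -> Integer
        values.add(2);
        values.add(3);
        System.out.println(sum(values));                   // prints 6

        // Arrays are objects too, so they support the methods of class Object.
        int[] array = new int[2];
        System.out.println(array.getClass().isArray());    // prints true
    }
}
```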
Variables are typed storage locations. A variable of a primitive type holds a value
of that exact primitive type. A variable of a class type can hold a null reference or
a reference to an object whose type is that class type or any subclass of that class
type. A variable of an interface type can hold a null reference or a reference to an
instance of any class that implements the interface. A variable of an array type can
hold a null reference or a reference to an array. A variable of class type Object can
hold a null reference or a reference to any object, whether class instance or array.
Chapter 5 describes conversions and numeric promotions. Conversions change the
compile-time type and, sometimes, the value of an expression. These conversions
include the boxing and unboxing conversions between primitive types and
reference types. Numeric promotions are used to convert the operands of a numeric
operator to a common type where an operation can be performed. There are no
loopholes in the language; casts on reference types are checked at run time to ensure
type safety.
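The three ideas in the description of Chapter 5 (numeric promotion, a conversion that changes a value, and a run-time checked cast) can be sketched as follows. ConversionSketch and its method names are assumptions made for illustration only.

```java
// Hypothetical example class; not part of the specification.
class ConversionSketch {
    // Binary numeric promotion: the int operand n is converted to double
    // so that the division is performed on two double operands.
    static double half(int n) {
        return n / 2.0;
    }

    // A narrowing primitive conversion can change the value:
    // only the low-order 8 bits of the int are kept.
    static byte narrow(int n) {
        return (byte) n;
    }

    // A cast on a reference type is checked at run time.
    static boolean castToStringFails(Object o) {
        try {
            String s = (String) o;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(half(3));                 // prints 1.5
        System.out.println(narrow(300));             // prints 44
        System.out.println(castToStringFails(42));   // prints true
    }
}
```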
Chapter 6 describes declarations and names, and how to determine what names
mean (denote). The language does not require types or their members to be declared
before they are used. Declaration order is significant only for local variables, local
classes, and the order of initializers of fields in a class or interface.
The Java programming language provides control over the scope of names
and supports limitations on external access to members of packages, classes,
and interfaces. This helps in writing large programs by distinguishing the
implementation of a type from its users and those who extend it. Recommended
naming conventions that make for more readable programs are described here.
Chapter 7 describes the structure of a program, which is organized into packages
similar to the modules of Modula. The members of a package are classes, interfaces,
and subpackages. Packages are divided into compilation units. Compilation units
contain type declarations and can import types from other packages to give them
short names. Packages have names in a hierarchical name space, and the Internet
domain name system can usually be used to form unique package names.
Chapter 8 describes classes. The members of classes are classes, interfaces, fields
(variables) and methods. Class variables exist once per class. Class methods operate
without reference to a specific object. Instance variables are dynamically created
in objects that are instances of classes. Instance methods are invoked on instances
of classes; such instances become the current object this during their execution,
supporting the object-oriented programming style.
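A minimal sketch of the distinction between class variables and methods on the one hand and instance variables and methods on the other might look like this. The class Account and its members are hypothetical and serve only as an illustration.

```java
// Hypothetical example class; not part of the specification.
class Account {
    static int opened = 0;     // class variable: exists once per class
    int balance = 0;           // instance variable: created in each object

    Account() {
        opened++;
    }

    // Class method: operates without reference to a specific object.
    static int totalOpened() {
        return opened;
    }

    // Instance method: "this" is the object on which the method was invoked.
    void deposit(int amount) {
        this.balance += amount;
    }

    public static void main(String[] args) {
        Account a = new Account();
        Account b = new Account();
        a.deposit(10);
        // Each object has its own balance; the count is shared by the class.
        System.out.println(a.balance + " " + b.balance + " " + Account.totalOpened());
    }
}
```

Run on its own, main prints 10 0 2: the deposit affects only the object a, while both constructor calls update the single class variable.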
Classes support single implementation inheritance, in which the implementation
of each class is derived from that of a single superclass, and ultimately from the
class Object. Variables of a class type can reference an instance of that class or of
any subclass of that class, allowing new types to be used with existing methods,
polymorphically.
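The following sketch shows one way a new subclass can be used polymorphically with an existing method. The classes Shape, Circle, and PolymorphismSketch are invented for this illustration.

```java
// Hypothetical example classes; not part of the specification.
class Shape {
    String describe() {
        return "shape";
    }
}

// Circle has exactly one direct superclass, Shape (and ultimately Object).
class Circle extends Shape {
    @Override
    String describe() {
        return "circle";
    }
}

class PolymorphismSketch {
    // Written against Shape, yet it works unchanged for any subclass of Shape.
    static String label(Shape s) {
        return "a " + s.describe();
    }

    public static void main(String[] args) {
        Shape s = new Circle();              // a Shape variable may reference a Circle
        System.out.println(label(s));        // prints a circle
        System.out.println(label(new Shape()));   // prints a shape
    }
}
```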
Classes support concurrent programming with synchronized methods. Methods
declare the checked exceptions that can arise from their execution, which allows
compile-time checking to ensure that exceptional conditions are handled. Objects
can declare a finalize method that will be invoked before the objects are discarded
by the garbage collector, allowing the objects to clean up their state.
For simplicity, the language has neither declaration "headers" separate from the
implementation of a class nor separate type and class hierarchies.
A special form of classes, enums, support the definition of small sets of values and
their manipulation in a type safe manner. Unlike enumerations in other languages,
enums are objects and may have their own methods.
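As a small illustration of an enum whose constants carry a method of their own, consider this sketch. Direction, opposite, and EnumSketch are hypothetical names.

```java
// Hypothetical example enum; not part of the specification.
enum Direction {
    NORTH, EAST, SOUTH, WEST;

    // Enum constants are objects, so the enum may declare its own methods.
    Direction opposite() {
        return values()[(ordinal() + 2) % values().length];
    }
}

class EnumSketch {
    public static void main(String[] args) {
        Direction d = Direction.NORTH;       // only the four declared values exist
        System.out.println(d.opposite());    // prints SOUTH
    }
}
```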
Chapter 9 describes interface types, which declare a set of abstract methods,
member types, and constants. Classes that are otherwise unrelated can implement
the same interface type. A variable of an interface type can contain a reference
to any object that implements the interface. Multiple interface inheritance is
supported.
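The sketch below shows two otherwise unrelated classes implementing the same interface, one of which also implements a second interface. All of the names (Named, Aged, Dog, Ship, InterfaceSketch) are assumptions made for the example.

```java
// Hypothetical example types; not part of the specification.
interface Named {
    String name();
}

interface Aged {
    int age();
}

// One class may implement several interfaces.
class Dog implements Named, Aged {
    public String name() { return "Rex"; }
    public int age() { return 3; }
}

// Ship is unrelated to Dog, yet implements the same interface.
class Ship implements Named {
    public String name() { return "Beagle"; }
}

class InterfaceSketch {
    // A variable of type Named can refer to any object that implements Named.
    static String greet(Named n) {
        return "Hello, " + n.name();
    }

    public static void main(String[] args) {
        System.out.println(greet(new Dog()));    // prints Hello, Rex
        System.out.println(greet(new Ship()));   // prints Hello, Beagle
    }
}
```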
Annotation types are specialized interfaces used to annotate declarations. Such
annotations are not permitted to affect the semantics of programs in the Java
programming language in any way. However, they provide useful input to various
tools.
Chapter 10 describes arrays. Array accesses include bounds checking. Arrays are
dynamically created objects and may be assigned to variables of type Object. The
language supports arrays of arrays, rather than multidimensional arrays.
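A brief sketch of these properties of arrays follows. The class ArraySketch and its methods are hypothetical; the example relies on rows of different lengths to show that an array of arrays is not a rectangular multidimensional array.

```java
// Hypothetical example class; not part of the specification.
class ArraySketch {
    // Builds an array of arrays whose rows have different lengths.
    static int[][] triangle(int rows) {
        int[][] t = new int[rows][];
        for (int i = 0; i < rows; i++) {
            t[i] = new int[i + 1];
        }
        return t;
    }

    // Every array access is bounds-checked at run time.
    static boolean isOutOfBounds(int[] a, int index) {
        try {
            int ignored = a[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object o = triangle(3);     // an array may be assigned to a variable of type Object
        int[][] t = (int[][]) o;
        System.out.println(t[2].length);                    // prints 3
        System.out.println(isOutOfBounds(new int[2], 2));   // prints true
    }
}
```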
Chapter 11 describes exceptions, which are nonresuming and fully integrated with
the language semantics and concurrency mechanisms. There are three kinds of
exceptions: checked exceptions, run-time exceptions, and errors. The compiler
ensures that checked exceptions are properly handled by requiring that a method
or constructor can result in a checked exception only if the method or constructor
declares it. This provides compile-time checking that exception handlers exist, and
aids programming in the large. Most user-defined exceptions should be checked
exceptions. Invalid operations in the program detected by the Java Virtual Machine
result in run-time exceptions, such as NullPointerException. Errors result from
failures detected by the Java Virtual Machine, such as OutOfMemoryError. Most
simple programs do not try to handle errors.
Chapter 12 describes activities that occur during execution of a program. A
program is normally stored as binary files representing compiled classes and
interfaces. These binary files can be loaded into a Java Virtual Machine, linked to
other classes and interfaces, and initialized.
After initialization, class methods and class variables may be used. Some classes
may be instantiated to create new objects of the class type. Objects that are class
instances also contain an instance of each superclass of the class, and object
creation involves recursive creation of these superclass instances.
When an object is no longer referenced, it may be reclaimed by the garbage
collector. If an object declares a finalizer, the finalizer is executed before the object
is reclaimed to give the object a last chance to clean up resources that would not
otherwise be released. When a class is no longer needed, it may be unloaded.
Chapter 13 describes binary compatibility, specifying the impact of changes to
types on other types that use the changed types but have not been recompiled. These
considerations are of interest to developers of types that are to be widely distributed,
in a continuing series of versions, often through the Internet. Good program
development environments automatically recompile dependent code whenever a
type is changed, so most programmers need not be concerned about these details.
Chapter 14 describes blocks and statements, which are based on C and C++.
The language has no goto statement, but includes labeled break and continue
statements. Unlike C, the Java programming language requires boolean (or
Boolean) expressions in control-flow statements, and does not convert types to
boolean implicitly (except through unboxing), in the hope of catching more errors
at compile time. A synchronized statement provides basic object-level monitor
locking. A try statement can include catch and finally clauses to protect against
non-local control transfers.
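As an illustration of labeled break and continue statements, the following sketch uses a hypothetical class LabeledLoops with two small helper methods. It is only one way such statements might be used:

```java
final class LabeledLoops {
    /** Returns the index of the first row containing a negative value, or -1. */
    static int firstRowWithNegative(int[][] grid) {
        int found = -1;
        search:
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                if (grid[row][col] < 0) {   // the condition must be boolean
                    found = row;
                    break search;           // leaves both loops at once
                }
            }
        }
        return found;
    }

    /** Counts rows whose entries are all even. */
    static int countAllEvenRows(int[][] grid) {
        int count = 0;
        rows:
        for (int[] row : grid) {
            for (int value : row) {
                if (value % 2 != 0) {
                    continue rows;          // abandon this row, go to the next one
                }
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int[][] grid = { {2, 4}, {6, -1}, {8, 10} };
        System.out.println(firstRowWithNegative(grid));
        System.out.println(countAllEvenRows(grid));
    }
}
```

Writing a condition such as `if (value % 2)` would not compile, because an int is not implicitly converted to boolean.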
Chapter 15 describes expressions. This document fully specifies the (apparent)
order of evaluation of expressions, for increased determinism and portability.
Overloaded methods and constructors are resolved at compile time by picking the
most specific method or constructor from those which are applicable.
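Both points, the fixed order of operand evaluation and compile-time overload resolution, can be observed in a short sketch. The class EvalOrderAndOverloads and its methods are hypothetical names introduced for this illustration:

```java
final class EvalOrderAndOverloads {
    private static final StringBuilder TRACE = new StringBuilder();

    /** Records that an operand was evaluated, then yields its value. */
    static int operand(String label, int value) {
        TRACE.append(label);
        return value;
    }

    /** Operands are evaluated left to right, even though * binds tighter than +. */
    static String evaluationTrace() {
        TRACE.setLength(0);
        int result = operand("a", 1) + operand("b", 2) * operand("c", 3);
        return TRACE + "=" + result;
    }

    // Three overloads; the compiler picks the most specific applicable one,
    // based on the compile-time type of the argument expression.
    static String describe(Object o)       { return "Object"; }
    static String describe(CharSequence s) { return "CharSequence"; }
    static String describe(String s)       { return "String"; }

    public static void main(String[] args) {
        System.out.println(evaluationTrace());
        System.out.println(describe("text"));
        Object asObject = "text";
        System.out.println(describe(asObject));
        System.out.println(describe(new StringBuilder("x")));
    }
}
```

The call describe(asObject) selects the Object overload even though the value is a String at run time, because the choice is made from the declared type of the argument.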
Chapter 16 describes the precise way in which the language ensures that
local variables are definitely set before use. While all other variables are
automatically initialized to a default value, the Java programming language does
not automatically initialize local variables in order to avoid masking programming
errors.
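The contrast between default-initialized variables and local variables can be sketched as follows. The class name DefiniteAssignmentDemo and its members are hypothetical:

```java
final class DefiniteAssignmentDemo {
    static int counter;            // class variable: starts at the default value 0

    static String classify(int n) {
        String label;              // local variable: has no default value
        if (n < 0) {
            label = "negative";
        } else if (n == 0) {
            label = "zero";
        } else {
            label = "positive";
        }
        // Every path through the if/else chain assigns label, so it is
        // definitely assigned here. Dropping the final else branch would
        // turn this return statement into a compile-time error.
        return label;
    }

    public static void main(String[] args) {
        int[] cells = new int[3];  // array components also get default values
        System.out.println(counter);
        System.out.println(cells[1]);
        System.out.println(classify(-4));
    }
}
```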
Chapter 17 describes the semantics of threads and locks, which are based on
the monitor-based concurrency originally introduced with the Mesa programming
language. The Java programming language specifies a memory model for
shared-memory multiprocessors that supports high-performance implementations.
Chapter 18 describes a variety of type inference algorithms used to test applicability
of generic methods and to infer types in a generic method invocation.
Chapter 19 presents a syntactic grammar for the language.
1.2 Example Programs
Most of the example programs given in the text are ready to be executed and are similar in form to:
class Test {
    public static void main(String[] args) {
        for (int i = 0; i < args.length; i++)
            System.out.print(i == 0 ? args[i] : " " + args[i]);
        System.out.println();
    }
}
Once compiled, this class can be executed with a command such as:
java Test Hello, world.
producing the output:
Hello, world.
Non-normative information, designed to clarify the specification, is given in smaller, indented text.
This is non-normative information. It provides intuition, rationale, advice, examples, etc.
The type system of the Java programming language occasionally relies on the
notion of a substitution. The notation [F_1:=T_1, ..., F_n:=T_n] denotes substitution
of F_i by T_i for 1 ≤ i ≤ n.
1.4 Relationship to Predefined Classes and Interfaces
As noted above, this specification often refers to classes of the Java SE
platform API. In particular, some classes have a special relationship with
the Java programming language. Examples include classes such as Object,
Class, ClassLoader, String, Thread, and the classes and interfaces in package
java.lang.reflect, among others. This specification constrains the behavior of
such classes and interfaces, but does not provide a complete specification for them.
The reader is referred to the Java SE platform API documentation.
Consequently, this specification does not describe reflection in any detail.
Many linguistic constructs have analogs in the Core Reflection API
(java.lang.reflect) and the Language Model API (javax.lang.model), but
these are generally not discussed here. For example, when we list the ways in which
an object can be created, we generally do not include the ways in which the Core
Reflection API can accomplish this. Readers should be aware of these additional
mechanisms even though they are not mentioned in the text.
1.5 Feedback
Readers may send feedback about errors, omissions, and ambiguities in The Java
Language Specification to jls-comments_ww@oracle.com.
Questions concerning the behavior of javac (the reference compiler for the Java
programming language), and in particular its conformance to this specification,
may be sent to compiler-dev@openjdk.java.net.
1.6 References
Apple Computer. Dylan Reference Manual. Apple Computer Inc., Cupertino, California.
September 29, 1995.
Bobrow, Daniel G., Linda G. DeMichiel, Richard P. Gabriel, Sonya E. Keene, Gregor Kiczales,
and David A. Moon. Common Lisp Object System Specification, X3J13 Document
88-002R, June 1988; appears as Chapter 28 of Steele, Guy. Common Lisp: The Language,
2nd ed. Digital Press, 1990, ISBN 1-55558-041-6, 770-864.
Ellis, Margaret A., and Bjarne Stroustrup. The Annotated C++ Reference Manual.
Addison-Wesley, Reading, Massachusetts, 1990, reprinted with corrections October 1992, ISBN
0-201-51459-1.
Hoare, C. A. R. Hints on Programming Language Design. Stanford University Computer
Science Department Technical Report No. CS-73-403, December 1973. Reprinted in
SIGACT/SIGPLAN Symposium on Principles of Programming Languages. Association
for Computing Machinery, New York, October 1973.
IEEE Standard for Binary Floating-Point Arithmetic. ANSI/IEEE Std. 754-1985. Available
from Global Engineering Documents, 15 Inverness Way East, Englewood, Colorado
80112-5704 USA; 800-854-7179.
Kernighan, Brian W., and Dennis M. Ritchie. The C Programming Language, 2nd ed. Prentice
Hall, Englewood Cliffs, New Jersey, 1988, ISBN 0-13-110362-8.
Madsen, Ole Lehrmann, Birger Møller-Pedersen, and Kristen Nygaard. Object-Oriented
Programming in the Beta Programming Language. Addison-Wesley, Reading,
Massachusetts, 1993, ISBN 0-201-62430-3.
Mitchell, James G., William Maybury, and Richard Sweet. The Mesa Programming Language,
Version 5.0. Xerox PARC, Palo Alto, California, CSL 79-3, April 1979.
Stroustrup, Bjarne. The C++ Programming Language, 2nd ed. Addison-Wesley, Reading,
Massachusetts, 1991, reprinted with corrections January 1994, ISBN 0-201-53992-6.
Unicode Consortium, The. The Unicode Standard, Version 6.0.0. Mountain View, CA, 2011,
ISBN 978-1-936213-01-6.
C H A P T E R 2
Grammars
THIS chapter describes the context-free grammars used in this specification to
define the lexical and syntactic structure of a program.
2.1 Context-Free Grammars
A context-free grammar consists of a number of productions. Each production has
an abstract symbol called a nonterminal as its left-hand side, and a sequence of
one or more nonterminal and terminal symbols as its right-hand side. For each
grammar, the terminal symbols are drawn from a specified alphabet.
Starting from a sentence consisting of a single distinguished nonterminal, called the
goal symbol, a given context-free grammar specifies a language, namely, the set of
possible sequences of terminal symbols that can result from repeatedly replacing
any nonterminal in the sequence with a right-hand side of a production for which
the nonterminal is the left-hand side.
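The way a grammar specifies a language by repeated replacement can be sketched with a toy grammar. Everything in this example is an assumption made for illustration: the class TinyGrammar, the nonterminals Sum and Num, and the bound on sentence length are not part of this specification. The sketch always replaces the leftmost nonterminal, which is sufficient to enumerate the sentences of a context-free grammar:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TinyGrammar {
    // Toy productions:  Sum: Num | Num + Sum      Num: 0 | 1
    static final Map<String, List<List<String>>> PRODUCTIONS = new LinkedHashMap<>();
    static {
        PRODUCTIONS.put("Sum", Arrays.asList(
                Arrays.asList("Num"),
                Arrays.asList("Num", "+", "Sum")));
        PRODUCTIONS.put("Num", Arrays.asList(
                Arrays.asList("0"),
                Arrays.asList("1")));
    }

    /** Enumerates all sentences of at most maxSymbols terminals derivable from goal. */
    static Set<String> sentences(String goal, int maxSymbols) {
        Set<String> result = new TreeSet<>();
        Set<List<String>> seen = new HashSet<>();
        Deque<List<String>> pending = new ArrayDeque<>();
        pending.add(Collections.singletonList(goal));
        while (!pending.isEmpty()) {
            List<String> form = pending.remove();
            int index = -1;                       // position of the leftmost nonterminal
            for (int i = 0; i < form.size(); i++) {
                if (PRODUCTIONS.containsKey(form.get(i))) {
                    index = i;
                    break;
                }
            }
            if (index < 0) {                      // only terminals remain: a sentence
                result.add(String.join(" ", form));
                continue;
            }
            for (List<String> rhs : PRODUCTIONS.get(form.get(index))) {
                List<String> next = new ArrayList<>(form.subList(0, index));
                next.addAll(rhs);                 // replace the nonterminal by a right-hand side
                next.addAll(form.subList(index + 1, form.size()));
                if (next.size() <= maxSymbols && seen.add(next)) {
                    pending.add(next);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sentences("Sum", 3));
    }
}
```

The length bound exists only to make the enumeration finite; the language of this toy grammar itself is infinite.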
2.2 The Lexical Grammar
A lexical grammar for the Java programming language is given in §3 (Lexical
Structure). This grammar has as its terminal symbols the characters of the Unicode
character set. It defines a set of productions, starting from the goal symbol Input
(§3.5), that describe how sequences of Unicode characters (§3.1) are translated into
a sequence of input elements (§3.5).
These input elements, with white space (§3.6) and comments (§3.7) discarded,
form the terminal symbols for the syntactic grammar for the Java programming
language and are called tokens (§3.5). These tokens are the identifiers (§3.8),
keywords (§3.9), literals (§3.10), separators (§3.11), and operators (§3.12) of the
Java programming language.
2.3 The Syntactic Grammar
The syntactic grammar for the Java programming language is given in Chapters
4, 6-10, 14, and 15. This grammar has tokens defined by the lexical grammar
as its terminal symbols. It defines a set of productions, starting from the goal
symbol CompilationUnit (§7.3), that describe how sequences of tokens can form
syntactically correct programs.
For convenience, the syntactic grammar is presented all together in Chapter 19.
2.4 Grammar Notation
Terminal symbols are shown in fixed width font in the productions of the lexical and syntactic grammars, and throughout this specification whenever the text is directly referring to such a terminal symbol. These are to appear in a program exactly as written.
Nonterminal symbols are shown in italic type. The definition of a nonterminal is
introduced by the name of the nonterminal being defined, followed by a colon. One
or more alternative definitions for the nonterminal then follow on succeeding lines.
For example, the syntactic production:
IfThenStatement:
    if ( Expression ) Statement
states that the nonterminal IfThenStatement represents the token if, followed by a left
parenthesis token, followed by an Expression, followed by a right parenthesis token, followed by a Statement.
The syntax {x} on the right-hand side of a production denotes zero or more occurrences of x.
For example, the syntactic production:
ArgumentList:
    Argument {, Argument}
states that an ArgumentList consists of an Argument, followed by zero or more occurrences
of a comma and an Argument. The result is that an ArgumentList may contain any positive
number of arguments.
The syntax [x] on the right-hand side of a production denotes zero or one
occurrences of x. That is, x is an optional symbol. The alternative which contains
the optional symbol actually defines two alternatives: one that omits the optional
symbol and one that includes it.
This means that:
BasicForStatement:
    for ( [ForInit] ; [Expression] ; [ForUpdate] ) Statement
is a convenient abbreviation for:
BasicForStatement:
    for ( ; [Expression] ; [ForUpdate] ) Statement
    for ( ForInit ; [Expression] ; [ForUpdate] ) Statement
which in turn is an abbreviation for:
BasicForStatement:
    for ( ; ; [ForUpdate] ) Statement
    for ( ; Expression ; [ForUpdate] ) Statement
    for ( ForInit ; ; [ForUpdate] ) Statement
    for ( ForInit ; Expression ; [ForUpdate] ) Statement
which in turn is an abbreviation for:
BasicForStatement:
    for ( ; ; ) Statement
    for ( ; ; ForUpdate ) Statement
    for ( ; Expression ; ) Statement
    for ( ; Expression ; ForUpdate ) Statement
    for ( ForInit ; ; ) Statement
    for ( ForInit ; ; ForUpdate ) Statement
    for ( ForInit ; Expression ; ) Statement
    for ( ForInit ; Expression ; ForUpdate ) Statement
so the nonterminal BasicForStatement actually has eight alternative right-hand sides.
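The expansion of optional symbols into explicit alternatives is mechanical, and can be sketched in a few lines. The class OptionalExpansion and its expand method are hypothetical. The sketch assumes a right-hand side is given as a list of symbols in which an optional symbol is written as a single string of the form "[X]", so it would misread a two-character "[]" symbol:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OptionalExpansion {
    /** Expands each "[X]" symbol into two alternatives: one omitting X, one including it. */
    static List<String> expand(List<String> symbols) {
        List<List<String>> alternatives = new ArrayList<>();
        alternatives.add(new ArrayList<>());
        for (String symbol : symbols) {
            boolean optional = symbol.length() > 2
                    && symbol.startsWith("[") && symbol.endsWith("]");
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : alternatives) {
                if (optional) {
                    next.add(new ArrayList<>(prefix));                 // alternative that omits X
                    List<String> included = new ArrayList<>(prefix);
                    included.add(symbol.substring(1, symbol.length() - 1));
                    next.add(included);                                // alternative that includes X
                } else {
                    List<String> extended = new ArrayList<>(prefix);
                    extended.add(symbol);
                    next.add(extended);
                }
            }
            alternatives = next;
        }
        List<String> result = new ArrayList<>();
        for (List<String> alternative : alternatives) {
            result.add(String.join(" ", alternative));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rhs = Arrays.asList(
                "for", "(", "[ForInit]", ";", "[Expression]", ";", "[ForUpdate]", ")", "Statement");
        for (String alternative : expand(rhs)) {
            System.out.println(alternative);
        }
    }
}
```

With three optional symbols the number of alternatives doubles three times, which accounts for the eight right-hand sides.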
A very long right-hand side may be continued on a second line by clearly indenting the second line.
For example, the syntactic grammar contains this production:
NormalClassDeclaration:
    {ClassModifier} class Identifier [TypeParameters]
        [Superclass] [Superinterfaces] ClassBody
which defines one right-hand side for the nonterminal NormalClassDeclaration.
When the words "one of" follow the colon in a production, they signify that each
of the terminal symbols on the following line or lines is an alternative definition.
For example, the lexical grammar contains the production:
of characters that would make up such a token.
Thus, the production:
BooleanLiteral: one of
    true false
is shorthand for:
BooleanLiteral:
    t r u e
    f a l s e
The right-hand side of a production may specify that certain expansions are not
permitted by using the phrase "but not" and then indicating the expansions to be
excluded.
For example:
Identifier:
    IdentifierChars but not a Keyword or BooleanLiteral or NullLiteral
Finally, a few nonterminals are defined by a narrative phrase in roman type where
it would be impractical to list all the alternatives.
For example:
RawInputCharacter:
    any Unicode character
C H A P T E R 3
Lexical Structure
THIS chapter specifies the lexical structure of the Java programming language.
Programs are written in Unicode (§3.1), but lexical translations are provided (§3.2)
so that Unicode escapes (§3.3) can be used to include any Unicode character using
only ASCII characters. Line terminators are defined (§3.4) to support the different
conventions of existing host systems while maintaining consistent line numbers.
The Unicode characters resulting from the lexical translations are reduced to a
sequence of input elements (§3.5), which are white space (§3.6), comments (§3.7),
and tokens. The tokens are the identifiers (§3.8), keywords (§3.9), literals (§3.10),
separators (§3.11), and operators (§3.12) of the syntactic grammar.
3.1 Unicode
Programs are written using the Unicode character set. Information about this
character set and its associated character encodings may be found at
http://www.unicode.org/.
The Java SE platform tracks the Unicode specification as it evolves. The precise
version of Unicode used by a given release is specified in the documentation of
the class Character.
Versions of the Java programming language prior to 1.1 used Unicode version 1.1.5.
Upgrades to newer versions of the Unicode Standard occurred in JDK 1.1 (to Unicode 2.0),
JDK 1.1.7 (to Unicode 2.1), Java SE 1.4 (to Unicode 3.0), and Java SE 5.0 (to Unicode 4.0).
The Unicode standard was originally designed as a fixed-width 16-bit character
encoding. It has since been changed to allow for characters whose representation
requires more than 16 bits. The range of legal code points is now U+0000
to U+10FFFF, using the hexadecimal U+n notation. Characters whose code
points are greater than U+FFFF are called supplementary characters. To represent the complete range of characters using only 16-bit units, the Unicode standard defines an encoding called UTF-16. In this encoding, supplementary characters are represented as pairs of 16-bit code units, the first from the high-surrogates range (U+D800 to U+DBFF), the second from the low-surrogates range (U+DC00 to U+DFFF). For characters in the range U+0000 to U+FFFF, the values of code points and UTF-16 code units are the same.
The Java programming language represents text in sequences of 16-bit code units, using the UTF-16 encoding.
Some APIs of the Java SE platform, primarily in the Character class, use 32-bit integers
to represent code points as individual entities. The Java SE platform provides methods to convert between 16-bit and 32-bit representations.
This specification uses the terms code point and UTF-16 code unit where the representation is relevant, and the generic term character where the representation
is irrelevant to the discussion.
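The difference between code points and UTF-16 code units can be observed with methods of the String and Character classes. In this sketch the class name CodePointDemo and the helper toCodePoints are hypothetical; the library methods it calls (Character.toChars, Character.charCount, String.codePointAt, String.codePointCount) are part of the Java SE platform API:

```java
final class CodePointDemo {
    /** Decodes a string of UTF-16 code units into an array of code points. */
    static int[] toCodePoints(String s) {
        int[] result = new int[s.codePointCount(0, s.length())];
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int codePoint = s.codePointAt(i);
            result[n++] = codePoint;
            i += Character.charCount(codePoint);   // 1 or 2 code units
        }
        return result;
    }

    public static void main(String[] args) {
        // U+1D11E lies above U+FFFF, so it is a supplementary character.
        String text = "A" + new String(Character.toChars(0x1D11E));
        System.out.println(text.length());                          // code units
        System.out.println(text.codePointCount(0, text.length()));  // code points
        System.out.println(Integer.toHexString(text.charAt(1)));    // high surrogate
        System.out.println(Integer.toHexString(text.charAt(2)));    // low surrogate
    }
}
```

Here the string holds two code points but three code units, because the supplementary character is stored as a surrogate pair.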
Except for comments (§3.7), identifiers, and the contents of character and string literals (§3.10.4, §3.10.5), all input elements (§3.5) in a program are formed only from ASCII characters (or Unicode escapes (§3.3) which result in ASCII characters).
ASCII (ANSI X3.4) is the American Standard Code for Information Interchange. The first
128 characters of the Unicode UTF-16 encoding are the ASCII characters.
3.2 Lexical Translations
A raw Unicode character stream is translated into a sequence of tokens, using the following three lexical translation steps, which are applied in turn:
1. A translation of Unicode escapes (§3.3) in the raw stream of Unicode characters
to the corresponding Unicode character. A Unicode escape of the form \uxxxx, where xxxx is a hexadecimal value, represents the UTF-16 code unit whose encoding is xxxx. This translation step allows any program to be expressed using only ASCII characters.
2. A translation of the Unicode stream resulting from step 1 into a stream of input characters and line terminators (§3.4).
3. A translation of the stream of input characters and line terminators resulting from step 2 into a sequence of input elements (§3.5) which, after white space
(§3.6) and comments (§3.7) are discarded, comprise the tokens (§3.5) that are
the terminal symbols of the syntactic grammar (§2.3).
The longest possible translation is used at each step, even if the result does not
ultimately make a correct program while another lexical translation would. There
is one exception: if lexical translation occurs in a type context (§4.11) and the
input stream has two or more consecutive > characters that are followed by a non->
character, then each > character must be translated to the token for the numerical
comparison operator >.
The input characters a--b are tokenized (§3.5) as a, --, b, which is not part of any
grammatically correct program, even though the tokenization a, -, -, b could be part of a
grammatically correct program.
Without the rule for > characters, two consecutive > brackets in a type such as
List<List<String>> would be tokenized as the signed right shift operator >>, while
three consecutive > brackets in a type such as List<List<List<String>>> would be
tokenized as the unsigned right shift operator >>>. Worse, the tokenization of four or more
consecutive > brackets in a type such as List<List<List<List<String>>>> would be
ambiguous, as various combinations of >, >>, and >>> tokens could represent the >>>>
characters.
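The longest-match rule and the exception for > characters can be sketched with a toy tokenizer. This is not the algorithm of any real compiler: the class LongestMatch, the small operator table, and the boolean typeContext flag (which simply forces every > to be a separate token) are simplifying assumptions made for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LongestMatch {
    // A small illustrative subset of operators, not the full list of the language.
    private static final List<String> OPERATORS =
            Arrays.asList(">>>", ">>", "--", "-=", ">", "<", "-", "=");

    static List<String> tokenize(String input, boolean typeContext) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {        // white space separates tokens
                i++;
                continue;
            }
            if (Character.isJavaIdentifierStart(c)) {
                int start = i;
                while (i < input.length()
                        && Character.isJavaIdentifierPart(input.charAt(i))) {
                    i++;
                }
                tokens.add(input.substring(start, i));
                continue;
            }
            if (typeContext && c == '>') {          // simplified exception for type contexts
                tokens.add(">");
                i++;
                continue;
            }
            String best = null;                     // longest operator starting at position i
            for (String op : OPERATORS) {
                if (input.startsWith(op, i)
                        && (best == null || op.length() > best.length())) {
                    best = op;
                }
            }
            if (best == null) {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
            tokens.add(best);
            i += best.length();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a--b", false));
        System.out.println(tokenize("a - - b", false));
        System.out.println(tokenize("List<List<String>>", false));
        System.out.println(tokenize("List<List<String>>", true));
    }
}
```

With the flag off, the two closing brackets of the nested type merge into a single >> token; with it on, they remain two > tokens, which is the effect the exception is meant to achieve.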
3.3 Unicode Escapes
A compiler for the Java programming language ("Java compiler") first recognizes
Unicode escapes in its input, translating the ASCII characters \u followed by four
hexadecimal digits to the UTF-16 code unit (§3.1) for the indicated hexadecimal
value, and passing all other characters unchanged. Representing supplementary
characters requires two consecutive Unicode escapes. This translation step results
in a sequence of Unicode input characters.
RawInputCharacter:
    any Unicode character
The \, u, and hexadecimal digits here are all ASCII characters.
In addition to the processing implied by the grammar, for each raw input character that is a backslash \, input processing must consider how many other \ characters contiguously precede it, separating it from a non-\ character or the start of the input stream. If this number is even, then the \ is eligible to begin a Unicode escape; if the number is odd, then the \ is not eligible to begin a Unicode escape.
For example, the raw input "\\u2122=\u2122" results in the eleven characters " \ \ u
2 1 2 2 = ™ " (\u2122 is the Unicode encoding of the character ™).
If an eligible \ is not followed by u, then it is treated as a RawInputCharacter and
remains part of the escaped Unicode stream.
If an eligible \ is followed by u, or more than one u, and the last u is not followed
by four hexadecimal digits, then a compile-time error occurs.
The character produced by a Unicode escape does not participate in further Unicode escapes.
For example, the raw input \u005cu005a results in the six characters \ u 0 0 5 a, because 005c is the Unicode value for \. It does not result in the character Z, which is Unicode character 005a, because the \ that resulted from the \u005c is not interpreted as the start of a further Unicode escape.
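The eligibility rule and the two examples above can be reproduced with a small translator. This is a sketch under stated assumptions, not the implementation used by any compiler: the class UnicodeEscapes and its methods are hypothetical, and an IllegalArgumentException stands in for the compile-time error:

```java
final class UnicodeEscapes {
    /** Value of an ASCII hexadecimal digit, or -1 if c is not one. */
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    /** Translates Unicode escapes in raw input, leaving all other characters unchanged. */
    static String translate(String raw) {
        StringBuilder out = new StringBuilder();
        int precedingBackslashes = 0;   // contiguous raw backslashes just before position i
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c != '\\') {
                out.append(c);
                precedingBackslashes = 0;
                i++;
                continue;
            }
            boolean eligible = precedingBackslashes % 2 == 0;
            if (eligible && i + 1 < raw.length() && raw.charAt(i + 1) == 'u') {
                int j = i + 1;
                while (j < raw.length() && raw.charAt(j) == 'u') {
                    j++;                // one or more u characters are allowed
                }
                if (j + 4 > raw.length()) {
                    throw new IllegalArgumentException("truncated escape at index " + i);
                }
                int value = 0;
                for (int k = j; k < j + 4; k++) {
                    int digit = hexValue(raw.charAt(k));
                    if (digit < 0) {
                        throw new IllegalArgumentException("bad hex digit at index " + k);
                    }
                    value = value * 16 + digit;
                }
                // The produced character is appended as is; scanning resumes
                // after the escape, so it never starts a further escape.
                out.append((char) value);
                precedingBackslashes = 0;
                i = j + 4;
            } else {
                out.append(c);          // an ordinary backslash stays in the stream
                precedingBackslashes++;
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Raw input of 16 characters: quote, two backslashes, u2122, =, an escape, quote.
        String first = translate("\"\\\\u2122=\\u2122\"");
        System.out.println(first.length());
        // Raw input: an escape for a backslash, followed by the five characters u005a.
        String second = translate("\\u005cu005a");
        System.out.println(second.length());
    }
}
```

Because the count of preceding backslashes is taken over the raw input, a backslash that was itself produced by an escape never affects the eligibility of later backslashes.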
The Java programming language specifies a standard way of transforming a program written in Unicode into ASCII that changes a program into a form that can be processed by ASCII-based tools. The transformation involves converting any Unicode escapes in the source text of the program to ASCII by adding an extra
u - for example, \uxxxx becomes \uuxxxx - while simultaneously converting non-ASCII characters in the source text to Unicode escapes containing a single u each.
This transformed version is equally acceptable to a Java compiler and represents the exact same program. The exact Unicode source can later be restored from this ASCII form by converting each escape sequence where multiple u's are present to a sequence of Unicode characters with one fewer u, while simultaneously converting each escape sequence with a single u to the corresponding single Unicode character.
A Java compiler should use the \uxxxx notation as an output format to display Unicode characters when a suitable font is not available.
3.4 Line Terminators
A Java compiler next divides the sequence of Unicode input characters into lines
by recognizing line terminators.
LineTerminator:
    the ASCII LF character, also known as "newline"
    the ASCII CR character, also known as "return"
    the ASCII CR character followed by the ASCII LF character
InputCharacter:
    UnicodeInputCharacter but not CR or LF
Lines are terminated by the ASCII characters CR, or LF, or CR LF. The two
characters CR immediately followed by LF are counted as one line terminator, not
two.
A line terminator specifies the termination of the // form of a comment (§3.7).
The lines defined by line terminators may determine the line numbers produced by a Java
compiler.
The result is a sequence of line terminators and input characters, which are the
terminal symbols for the third step in the tokenization process.
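The three forms of line terminator, and the rule that CR immediately followed by LF counts once, can be sketched as a simple line splitter. The class LineSplitter and its splitLines method are hypothetical names used only for this illustration:

```java
import java.util.ArrayList;
import java.util.List;

final class LineSplitter {
    /** Splits input into lines at CR, LF, or CR LF; a trailing unterminated line is kept. */
    static List<String> splitLines(String input) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\r') {
                if (i + 1 < input.length() && input.charAt(i + 1) == '\n') {
                    i++;                        // CR LF is one line terminator, not two
                }
                lines.add(current.toString());
                current.setLength(0);
            } else if (c == '\n') {
                lines.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(splitLines("a\rb\nc\r\nd").size());
        System.out.println(splitLines("x\n\ry").size());
    }
}
```

Note that LF followed by CR is two terminators, so the second input contains an empty line between x and y.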
3.5 Input Elements and Tokens
The input characters and line terminators that result from escape processing (§3.3)
and then input line recognition (§3.4) are reduced to a sequence of input elements.
the ASCII SUB character, also known as "control-Z"
Those input elements that are not white space or comments are tokens. The tokens
are the terminal symbols of the syntactic grammar (§2.3).
White space (§3.6) and comments (§3.7) can serve to separate tokens that, if adjacent, might be tokenized in another manner. For example, the ASCII characters
- and = in the input can form the operator token -= (§3.12) only if there is no intervening white space or comment.
As a special concession for compatibility with certain operating systems, the ASCII SUB character (\u001a, or control-Z) is ignored if it is the last character in the escaped input stream.
Consider two tokens x and y in the resulting input stream. If x precedes y, then we say that x is to the left of y and that y is to the right of x.
For example, in this simple piece of code:
class Empty {
}
we say that the } token is to the right of the { token, even though it appears, in this two-dimensional representation, downward and to the left of the { token. This convention about the use of the words left and right allows us to speak, for example, of the right-hand operand
of a binary operator or of the left-hand side of an assignment.
3.6 White Space
White space is defined as the ASCII space character, horizontal tab character, form feed character, and line terminator characters (§3.4).