Chapter 5
Names, Binding,
Type Checking and
Scopes
• Fundamental semantic issues of variables
– Imperative languages are abstractions of von
Neumann architecture
– Variables characterized by attributes, the most
important of which is data type
• The design of the data types of a language
requires that a variety of issues be considered
– The scope and lifetime of variables
– Type checking and initialization
– Type compatibility, …
– Formal parameters, and other program constructs
• The term identifier is often used interchangeably with name
Design Issues for Names
• Maximum length?
• Are connector characters allowed?
• Are names case sensitive?
• Are special words reserved words or keywords?
Name Forms
• Length:
– FORTRAN 90 and ANSI C: maximum 31
– Ada and Java: no limit
– C++: no limit in the language, but implementors often impose one
• Connectors:
– Modula-2 and FORTRAN 77 don't allow them
– Others do
Name Forms (cont.)
Variables
• A variable is an abstraction of a memory cell
– Abstract memory cell - the physical cell or
collection of cells associated with a variable
– Names replace absolute numeric memory addresses
– This escapes the problem of absolute addressing
• Variables can be characterized as a sextuple of
attributes:
(Name, Address, Value, Type, Lifetime, Scope)
– Name : considerations of length, case, character, …
Variables (cont.)
– Address (also called L-value): the memory address with which it is associated
The same variable name may have different addresses
at different places in the program
A variable may have different addresses at different times during program execution
If two or more variable names can be used to access the same memory location, they are called aliases
• Aliasing is harmful to readability
• Aliasing makes program verification more difficult
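The effect of aliasing can be sketched in Python, assuming a list object stands in for a memory cell. The names `scores`, `backup`, and `independent` are illustrative only and do not come from the slides.

```python
# Two names bound to the same list object are aliases of each other.
scores = [90, 85, 70]
backup = scores             # no copy is made: both names refer to one object
backup[0] = 0               # a write through one alias ...
print(scores[0])            # ... is visible through the other
print(scores is backup)     # identity test: True means "same object"

independent = list(scores)  # an explicit copy breaks the alias
independent[1] = -1
print(scores[1])            # unchanged by the write to the copy
```

A reader who sees only `backup[0] = 0` cannot tell that `scores` changes too, which is the readability and verification problem noted above.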
How aliases can be created
• In FORTRAN, aliases can be explicitly created
with the EQUIVALENCE statement
CHARACTER A*4, B*4, C(2)*3
EQUIVALENCE (A,C(1)), (B,C(2))
Byte:       | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
Variable A: |       A       |
Variable B:             |       B       |
Array C:    |   C(1)    |   C(2)    |
How aliases can be created (cont.)
• Pointers, reference variables, C/C++ unions
Variables (cont.)
– Type : determines the range of values of variables and the set of operations that are defined for
values of that type
In the case of floating point, type also determines the precision
– Value (also called R-value): the contents of the
location with which the variable is associated
– Lifetime
– Scope
The Concept of Binding
• A binding is an association, such as between an attribute and an entity, or between an operation and a symbol
• Binding time is the time at which a binding takes place
The Concept of Binding (cont.)
• Possible binding times
– Language design time - e.g., bind operator
symbols to operations
– Language implementation time - e.g., bind
floating point type to a representation
– Compile time - e.g., bind a variable to a type
– Link time – e.g., a call to a library subprogram is bound to the subprogram code
– Load time - e.g., bind a FORTRAN 77 variable to a
memory cell (or a C static variable)
– Runtime - e.g., bind a nonstatic local variable to a memory cell
• Example: consider the assignment statement count = count + 5;
• Type of count: bound at compile time
• Set of possible values of count: bound at compiler design time
• Value of count: bound at execution time with this statement
• Set of possible meanings for the operator symbol +: bound
at language definition time
• Meaning of the operator symbol + in this statement: bound
at compile time
• Internal representation of the literal 5: bound at compiler
design time
The Concept of Binding (cont.)
• A binding is static if it first occurs before run
time and remains unchanged throughout
program execution
• A binding is dynamic if it first occurs during
execution or can change during execution of the program
Type Bindings
• Before a variable can be referenced in a
program, it must be bound to a data type
• The two important aspects of this binding are:
– How the type is specified
– When the binding takes place
• Types can be specified statically through some form of explicit or implicit declaration
– Both explicit and implicit declarations create
static bindings to types
Type Bindings: Variable Declaration
• An explicit declaration is a program statement used for declaring the types of variables
• An implicit declaration is a default mechanism for specifying types of variables (the first
appearance of the variable in the program)
• FORTRAN, PL/I, BASIC, and Perl provide implicit declarations
– Advantage: writability
– Disadvantage: reliability (typographical errors,
undeclared variables)
Type Bindings: Dynamic Type Binding
• Specified through an assignment statement, e.g., in JavaScript
list = [2, 4.33, 6, 8];
list = 17.3;
– Advantage: flexibility
– Disadvantages:
High cost (dynamic type checking and interpretation)
Type error detection by the compiler is difficult
• Languages with dynamic type binding are usually implemented using pure
interpreters rather than compilers
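The same behaviour can be sketched in Python, which also binds types at assignment. The name `data` is an assumption, chosen to avoid shadowing Python's built-in `list`.

```python
# The type is bound to the name at each assignment, not at a declaration.
data = [2, 4.33, 6, 8]
first_type = type(data).__name__
doubled_list = data * 2      # for a list, * means repetition

data = 17.3                  # the same name is now bound to a float
second_type = type(data).__name__
doubled_float = data * 2     # for a float, * means multiplication

print(first_type, len(doubled_list))
print(second_type, doubled_float)
```

Because the meaning of `data * 2` depends on the type currently bound to `data`, the operation has to be selected at run time. This is the cost mentioned in the disadvantages above.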
Type Bindings: Type Inference
• Type inference is used in the programming
languages ML, Miranda and Haskell
– Rather than by assignment statement, types are determined from the context of the reference
Example: In ML
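fun square(x) = x * x;   (* * defaults to int, so square : int -> int *)
square(3);               (* accepted *)
square(3.3);             (* rejected: real argument where int is expected *)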
Storage Bindings & Lifetime
• Allocation - getting a cell from some pool of
available cells
• Deallocation - putting a cell back into the pool
• The lifetime of a variable is the time during
which it is bound to a particular memory cell
– It’s convenient to separate variables into four
categories, according to their lifetimes
Categories of variables by lifetimes
• Static variable - bound to memory cells before execution begins and remains bound to the
same memory cell throughout execution
– Advantages
Efficiency (direct addressing)
History-sensitive subprogram support
– Disadvantages
Lack of flexibility (no recursion)
Storage cannot be shared among variables
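History sensitivity can be sketched in Python, which has no static local variables. The sketch assumes a module-level variable can play that role because it stays bound for the whole run. The function names are hypothetical.

```python
# A module-level variable lives for the whole run, so it keeps its value
# between calls, much as a static variable would.
_call_count = 0

def next_ticket():
    """Return 1, 2, 3, ... on successive calls (history sensitive)."""
    global _call_count
    _call_count += 1
    return _call_count

def fresh_ticket():
    """A local counter is re-created on every call, so history is lost."""
    count = 0
    count += 1
    return count

tickets = [next_ticket(), next_ticket(), next_ticket()]
fresh = [fresh_ticket(), fresh_ticket(), fresh_ticket()]
print(tickets, fresh)
```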
Categories of variables …
• Stack-dynamic variable - storage bindings are
created for variables when their declaration
statements are elaborated
• All attributes except address are statically bound
– Disadvantages:
Overhead of allocation and deallocation
Subprograms cannot be history sensitive
Inefficient references (indirect addressing)
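Python locals behave like stack-dynamic variables in one respect that matters here: every activation of a function gets its own bindings. A minimal sketch, with the hypothetical function `countdown`:

```python
# Each activation of countdown gets its own binding for the local n,
# which is what makes recursion possible.
def countdown(n, trace):
    """Record n on the way down and again on the way back up."""
    trace.append(("enter", n))
    if n > 0:
        countdown(n - 1, trace)
    # After the recursive call returns, this activation's n is intact.
    trace.append(("exit", n))
    return trace

trace = countdown(2, [])
print(trace)
```

If `n` were a single static cell, the inner calls would overwrite it and the "exit" entries could not report the values 0, 1, 2 in turn.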
Categories of variables …
• Explicit heap-dynamic - Allocated and
deallocated by explicit directives, specified by the programmer, which take effect during execution
• Referenced only through pointers or references
– Advantage: provides for dynamic storage
management
– Disadvantage: inefficient and unreliable
Categories of variables …
• Implicit heap-dynamic - Allocation and
deallocation caused by assignment statements
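Implicit heap-dynamic behaviour can be observed in Python with a weak reference. The sketch assumes CPython, where an object is reclaimed as soon as its last reference disappears. The class `Cell` is hypothetical.

```python
import gc
import weakref

class Cell:
    """A hypothetical heap object used only for this illustration."""
    def __init__(self, value):
        self.value = value

cell = Cell(42)               # storage is allocated when the assignment runs
watcher = weakref.ref(cell)   # observes the object without keeping it alive
alive_before = watcher() is not None

cell = "now a string"         # rebinding drops the last reference to the Cell
gc.collect()                  # request a collection (CPython frees it even without this)
alive_after = watcher() is not None

print(alive_before, alive_after)
```

No allocation or deallocation directive appears in the program. Both are side effects of the two assignments to `cell`.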
Type Checking
– A type error is the application of an operator to
an operand of an inappropriate type
– If all type bindings are static, nearly all type
checking can be static. If type bindings are
dynamic, type checking must be dynamic
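One consequence of dynamic type checking is that a type error surfaces only when the offending operation actually executes. A small Python sketch, with the hypothetical function `describe`:

```python
def describe(value, verbose=False):
    """Build a label for value; the verbose branch contains a type error."""
    if verbose:
        return "value: " + value       # fails when value is not a str
    return "value: " + str(value)

ok = describe(17.3)                    # faulty branch not executed: no error

try:
    describe(17.3, verbose=True)       # faulty branch executed
    detected = False
except TypeError:
    detected = True

print(ok, detected)
```

A static checker could reject the `+` in the verbose branch before the program runs. The dynamic check reports it only on the call that reaches that line.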
Type Checking (cont.)
• Type checking will touch on the topics of:
– Type equivalence
– Type compatibility
– Type inference
Strong Typing
• A programming language is strongly typed if
type errors are always detected
• In a strongly typed language:
– Each name in a program has a
single type associated with it, and that type is
known at compile time
– Misuses of variables that
result in type errors are detected
– Uses of incorrectly typed values in variables that can store
values of more than one type are detected at run time
Example - Pascal
type Value = record
case numType : (float, int) of
float: (val : real);
int: (parts : array [1..4] of char);
end;
Example - Ada
generic
type Source (<>) is limited private;
type Target (<>) is limited private;
function Ada.Unchecked_Conversion(Source_Object : Source) return Target;
Example - ML
• ML is strongly typed, but in a slightly different
sense than that of the imperative languages
– Types are all statically known, either from
declarations or from its type inference rules
fun square(x) = x * 10;
fun square(x) : int = x * x;
Other Languages
• FORTRAN 77 is not a strongly typed language
(EQUIVALENCE)
• C/C++ are not strongly typed languages
(actual/formal parameters, union types are not type checked)
• Modula-3 has a predefined procedure named
LOOPHOLE that serves the same purpose as Ada's Unchecked_Conversion
– Name equivalence is based on the lexical
occurrence of type definitions (each definition
introduces a new type)
Structural Equivalence
• The exact definition of structural equivalence
varies from one language to another
• The format of a declaration should not matter
• In a Pascal-like language with structural equivalence:
type str = array [1..10] of char;
type str = array [1..2 * 5] of char;
type str = array [0..9] of char;
Structural Equivalence (cont.)
• To determine if two types are structurally equivalent, a compiler can expand their definitions by replacing any embedded type names with their
respective definitions, recursively, until nothing is left but a string of type constructors, field names,
and built-in types
• If these expanded strings are the same, then the types are equivalent, and conversely
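The expansion procedure described above can be sketched in Python. The tuple encoding of types, the `BUILTINS` set, and the sample table are all assumptions made for illustration. The sketch treats array bounds as part of the structure, which is only one of the choices a language might make, and it does not handle recursive types.

```python
BUILTINS = {"integer", "real", "char", "string"}

def expand(t, table):
    """Replace embedded type names by their definitions, recursively."""
    if isinstance(t, str):
        if t in BUILTINS:
            return t
        return expand(table[t], table)    # user-defined name: look it up
    kind = t[0]
    if kind == "array":                   # ("array", low, high, element_type)
        _, low, high, elem = t
        return ("array", low, high, expand(elem, table))
    if kind == "record":                  # ("record", ((field, type), ...))
        return ("record", tuple((f, expand(ft, table)) for f, ft in t[1]))
    if kind == "pointer":                 # ("pointer", target_type)
        return ("pointer", expand(t[1], table))
    raise ValueError(f"unknown type constructor: {kind}")

def structurally_equivalent(a, b, table):
    """Two types are equivalent if their full expansions are identical."""
    return expand(a, table) == expand(b, table)

table = {
    "celsius": "real",
    "fahrenheit": "real",
    "student": ("record", (("name", "string"), ("age", "integer"))),
    "school": ("record", (("name", "string"), ("age", "integer"))),
    "str_a": ("array", 1, 10, "char"),
    "str_b": ("array", 0, 9, "char"),
}

print(structurally_equivalent("celsius", "fahrenheit", table))
print(structurally_equivalent("student", "school", table))
print(structurally_equivalent("str_a", "str_b", table))
```

Under this sketch the two temperature types and the two record types come out equivalent, while the two arrays differ because their index ranges differ.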
Structural Equivalence (cont.)
• Structural equivalence disallows differentiating between types that the programmer may think of
as distinct, but which happen (by coincidence) to have the same internal structure
type
celsius = real;
fahrenheit = real;
Variables of these two types are considered
compatible under structural equivalence, allowing
them to be mixed in expressions
type student = record
name, address : string;
age : integer
end
type school = record
name, address : string;
age : integer
end
Name Equivalence
• Name equivalence is based on the assumption that if the
programmer takes the effort to write two type
definitions, then those definitions are probably meant to represent different types
• In the above example, the types will be
considered different under name equivalence
• An alias type, whose definition simply specifies the name of some other type, is a variant of
name equivalence
TYPE new_type = old_type
Name Equivalence (cont.)
• A language in which aliased types are considered distinct is said to have strict name equivalence
• A language in which aliased types are considered equivalent is said to have loose name equivalence
Name Equivalence (cont.)
• Ada achieves the best of both worlds by
allowing the programmer to indicate whether an alias represents a derived type or a subtype
– A subtype is compatible with its base type
(Subtypes of the same base type are also
compatible with each other)
– A derived type is incompatible
subtype int is integer;          -- compatible with integer
type celsius is new integer;     -- incompatible with integer
type fahrenheit is new integer;  -- incompatible with integer and with celsius
type cell = … –– whatever
type alink = pointer to cell
type blink = alink –– alias type
• Under strict name equivalence
alink, blink, and pointer to cell are three distinct types
• Under loose name equivalence
alink and blink are the same type, distinct from pointer to cell
• Under structural equivalence
alink, blink, and pointer to cell are all the same type
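The three rules can be compared side by side in a Python sketch. It assumes a small type table in which a bare name marks an alias and a tuple marks a new type definition. The table layout and function names are hypothetical, and `cell` is given an empty record body only as a stand-in for the elided definition.

```python
# A definition is either a constructor tuple (a new type) or a bare
# name (an alias for an existing type).
TABLE = {
    "cell": ("record", ()),       # stand-in for the elided definition
    "alink": ("pointer", "cell"),
    "blink": "alink",             # alias type
}

def resolve_alias(t):
    """Follow alias definitions until a non-alias is reached."""
    while isinstance(t, str) and isinstance(TABLE.get(t), str):
        t = TABLE[t]
    return t

def strict_name_equivalent(a, b):
    return a == b                 # every declared name is its own type

def loose_name_equivalent(a, b):
    return resolve_alias(a) == resolve_alias(b)

def structure_of(t):
    """Expand names to definitions; only pointers are expanded further here."""
    if isinstance(t, str):
        return structure_of(TABLE[t])
    if t[0] == "pointer":
        return ("pointer", structure_of(t[1]))
    return t

def structurally_equivalent(a, b):
    return structure_of(a) == structure_of(b)

anonymous = ("pointer", "cell")   # an unnamed "pointer to cell"
for rule in (strict_name_equivalent, loose_name_equivalent, structurally_equivalent):
    print(rule.__name__, rule("alink", "blink"), rule("alink", anonymous))
```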
Type Compatibility
• Most languages do not require equivalence of
types in every context. Instead, a value’s type
must be compatible with that of the context in which it appears
– In an assignment statement, the type of the RHS must be compatible with that of the LHS
– In a subroutine call, the types of any arguments
passed into the subroutine must be compatible
with the types of the corresponding formal
parameters and vice versa
Type Compatibility (cont.)
• The definition of type compatibility varies
greatly from language to language
• Ada takes a relatively restrictive approach: an
Ada type S is compatible with an expected type
T if and only if
1. S and T are equivalent, or
2. One is a subtype of the other (or both are
subtypes of the same base type), or
3. Both are arrays, with the same numbers (in each
dimension) and types of elements
• A type coercion is an automatic, implicit
conversion of a value of one type to the expected type
• Because coercions allow types to be mixed
without an explicit indication of intent on the part
of the programmer, they represent a significant weakening of type security
– Languages with a great deal of coercion (FORTRAN, C/C++) are significantly less reliable than those
with little coercion, such as Ada
– The value of strong typing is weakened by coercion
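Python sits between the extremes named above, which makes it a convenient place to see both behaviours. The variable names in this sketch are illustrative.

```python
# Python coerces int to float in mixed arithmetic ...
mixed = 1 + 2.5

# ... and treats bool as an integer, which can hide mistakes.
flags_total = True + True

# It does not coerce between str and int, so this mix is rejected.
try:
    "3" + 4
    rejected = False
except TypeError:
    rejected = True

# An explicit conversion states the programmer's intent.
explicit = int("3") + 4

print(mixed, flags_total, rejected, explicit)
```

The `True + True` case shows the reliability cost: the expression is almost certainly a mistake, but coercion lets it pass without any diagnostic.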
Example: In Ada
type weekday is (sun, mon, tue, wed, thu, fri, sat);
subtype workday is weekday range mon .. fri;
type calendar_column is new weekday;
…
d : weekday; c : calendar_column; k : workday;
…
k := d;                   -- coercion, run-time check required
c := d;                   -- static semantic error (they are not compatible)
c := calendar_column(d);  -- cast
• The scope of a variable is the range of statements in which the variable is visible
• A variable is visible in a statement if it can be
referenced in that statement
• The nonlocal variables of a program unit are those that are visible but not declared in that program unit
• A variable is local in a program unit or block if it is declared there
Static scope
• The scope of a variable can be statically
determined, that is, prior to execution
• To connect a reference to a variable, the compiler must find the declaration
• Search process: search declarations, first locally,
then in increasingly larger enclosing scopes, until
one is found for the given name
• Enclosing static scopes (to a specific scope) are called its static ancestors; the nearest static
ancestor is called a static parent
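Python is statically scoped, so the search process can be observed directly with nested functions. The names `big`, `sub1`, and `sub2` follow the slides' examples. The string values are assumptions added so the result is visible.

```python
x = "global x"

def big():
    x = "big's x"

    def sub1():
        # No local x here: the search moves outward to the enclosing
        # function big, the static parent of sub1.
        return x

    def sub2():
        x = "sub2's x"    # hides big's x inside sub2 only
        return sub1()     # the caller's x plays no part in sub1's lookup

    return sub1(), sub2()

direct, via_sub2 = big()
print(direct, via_sub2)
```

Both calls yield big's `x`, because the reference in `sub1` is resolved from the program text, whichever subprogram happens to call it.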
procedure sub2;
  var x: integer;
  begin
    … x …
  end;
begin
  … x …
end;
Static scope (cont.)
• Variables can be hidden from a unit by having a
"closer" variable with the same name
• C++ and Ada allow access to these "hidden"
variables
– In Ada: unit.name
– In C++: ::name
Static scope - Blocks
• Many languages allow a section of code to have its own local variables whose scope is minimized. Such a section of code is called a block
• Variables are typically stack dynamic, so they
have their storage allocated when the section is entered and deallocated when the section is
exited
• The scopes created by blocks are treated exactly like those created by parameterless subprograms
Evaluation of Static Scoping
Assume MAIN calls A and B, A calls C and D, and B calls A and E.
[Figure: program structure diagram]
Dynamic Scope
• Dynamic scoping is based on the calling
sequence of subprograms, not on their spatial relationship to each other
• The scope can be determined only at run-time
• References to variables are connected to
declarations by searching back through the
chain of subprogram calls that forced execution
to this point
procedure sub2;
  var x: integer;
  begin
    …
  end;
begin
  …
end;
Call sequence:
big calls sub2, which calls sub1
The search proceeds from the local
procedure, sub1, to its caller,
sub2, where a declaration for x is
found. So the reference to x in
sub1 is to the x declared in sub2
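Python itself is statically scoped, so dynamic scoping has to be simulated. The sketch below assumes an explicit stack of dictionaries, one per active subprogram. The helpers `call` and `lookup` are hypothetical, and the second call sequence (big calling sub1 directly) is added only for contrast.

```python
call_stack = []   # one dict of local variables per active subprogram

def lookup(name):
    """Search the most recent activation first, then its callers."""
    for frame in reversed(call_stack):
        if name in frame:
            return frame[name]
    raise NameError(name)

def call(subprogram, local_vars):
    """Activate subprogram with the given locals, then pop its frame."""
    call_stack.append(dict(local_vars))
    try:
        return subprogram()
    finally:
        call_stack.pop()

def sub1():
    return lookup("x")                        # sub1 declares no x of its own

def sub2():
    return call(sub1, {})                     # sub2's own x is supplied by its caller below

def big_via_sub2():
    return call(sub2, {"x": "sub2's x"})      # big calls sub2, which declares x

def big_direct():
    return call(sub1, {})                     # big calls sub1 directly

result_via_sub2 = call(big_via_sub2, {"x": "big's x"})
result_direct = call(big_direct, {"x": "big's x"})
print(result_via_sub2, result_direct)
```

The same reference in `sub1` resolves to a different `x` depending on the call chain, which is why the scope cannot be determined before run time.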
Evaluation of Dynamic Scoping
• The local variables of the active subprogram are visible to any other executing subprogram,
regardless of its textual proximity
• The inability to statically type check references to nonlocals
• Poor readability and reliability
• Accesses to nonlocal variables in dynamic scoped languages take far longer than accesses to
nonlocals when static scoping is used
• Advantage: in some cases parameters need not be passed,
because the caller's variables are implicitly visible in the called subprogram