Chapter 5: Names, Binding, Type Checking and Scopes


• Fundamental semantic issues of variables
– Imperative languages are abstractions of the von Neumann architecture
– Variables are characterized by attributes, the most important of which is data type
• The design of the data types of a language requires that a variety of issues be considered
– The scope and lifetime of variables
– Type checking and initialization
– Type compatibility, …


– Formal parameters, and other program constructs

• The term identifier is often used interchangeably with name


Design Issues for Names

• Maximum length?

• Are connector characters allowed?

• Are names case sensitive?

• Are special words reserved words or keywords?

Name Forms

• Maximum length:
– FORTRAN 90 and ANSI C: maximum 31
– Ada and Java: no limit
– C++: no limit set by the language; implementations may impose one
• Connectors:
– Modula-2 and FORTRAN 77 don't allow them
– Others do


Name Forms (cont.)

Variables

• A variable is an abstraction of a memory cell
– Abstract memory cell: the physical cell or collection of cells associated with a variable
– Names replace absolute numeric memory addresses
– This escapes the problem of absolute addressing
• Variables can be characterized as a sextuple of attributes:
(Name, Address, Value, Type, Lifetime, Scope)
– Name: considerations of length, case, character, …

Variables (cont.)

– Address (also called L-value): the memory address with which the variable is associated
▪ The same variable name may have different addresses at different places in the program
▪ A variable may have different addresses at different times during program execution
▪ If two or more variable names can be used to access the same memory location, they are called aliases
• Aliasing is harmful to readability
• Aliasing makes program verification more difficult

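How aliases can be created

• In FORTRAN, aliases can be explicitly created with the EQUIVALENCE statement

      CHARACTER A*4, B*4, C(2)*3
      EQUIVALENCE (A,C(1)), (B,C(2))

[Figure: storage layout of seven character cells. Variable A occupies cells 1 to 4, variable B occupies cells 4 to 7, and array C occupies cells 1 to 6 (C(1) in cells 1 to 3, C(2) in cells 4 to 6), so A and B share cell 4.]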


How aliases can be created (cont.)

• Pointers, reference variables, C/C++ unions
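
A minimal sketch of aliasing through references, written in Python rather than FORTRAN or C (a choice made only for illustration; the names scores, alias and independent are hypothetical). Two names are bound to the same list object, so a change made through one name is visible through the other:

```python
# Two names bound to the same object behave as aliases.
scores = [90, 85, 70]
alias = scores               # no copy is made; both names refer to one list

alias[0] = 0                 # a write through one name...
same_object = alias is scores  # ...is visible through the other

# An explicit copy breaks the aliasing.
independent = list(scores)
independent[1] = 100         # does not affect scores
```

After this runs, scores[0] is 0 even though scores was never assigned to directly, which is the readability problem the slides describe.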

Variables (cont.)

– Type: determines the range of values of variables and the set of operations that are defined for values of that type
▪ In the case of floating point, type also determines the precision
– Value (also called R-value): the contents of the location with which the variable is associated
– Lifetime
– Scope


The Concept of Binding

• A binding is an association, such as between an attribute and an entity, or between an operation and a symbol

• Binding time is the time at which a binding takes place

The Concept of Binding (cont.)

• Possible binding times
– Language design time, e.g., bind operator symbols to operations
– Language implementation time, e.g., bind floating point type to a representation
– Compile time, e.g., bind a variable to a type
– Link time, e.g., bind a call to a library subprogram to the subprogram code
– Load time, e.g., bind a FORTRAN 77 variable (or a C static variable) to a memory cell
– Run time, e.g., bind a nonstatic local variable to a memory cell

Example: bindings involved in an assignment statement such as count = count + 5;

• Type of count: bound at compile time
• Set of possible values of count: bound at compiler design time
• Value of count: bound at execution time with this statement
• Set of possible meanings for the operator symbol +: bound at language definition time
• Meaning of the operator symbol + in this statement: bound at compile time
• Internal representation of the literal 5: bound at compiler design time

The Concept of Binding (cont.)

• A binding is static if it first occurs before run time and remains unchanged throughout program execution
• A binding is dynamic if it first occurs during execution or can change during execution of the program

Type Bindings

• Before a variable can be referenced in a program, it must be bound to a data type
• The two important aspects of this binding are:
– How the type is specified
– When the binding takes place
• Types can be specified statically through some form of explicit or implicit declaration
– Both explicit and implicit declarations create static bindings to types

Type Bindings: Variable Declaration

• An explicit declaration is a program statement used for declaring the types of variables
• An implicit declaration is a default mechanism for specifying types of variables (the first appearance of the variable in the program)
• FORTRAN, PL/I, BASIC, and Perl provide implicit declarations
– Advantage: writability
– Disadvantage: reliability (typographical errors, undeclared variables)

Type Bindings: Dynamic Type Binding

• Specified through an assignment statement, e.g., in JavaScript

list = [2, 4.33, 6, 8];
list = 17.3;

– Advantage: flexibility
– Disadvantages:
▪ High cost (dynamic type checking and interpretation)
▪ Type error detection by the compiler is difficult
• Languages with dynamic type binding are usually implemented using pure interpreters rather than compilers
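
The JavaScript fragment above can be mirrored in Python, which also binds types dynamically. This is only a sketch; the names values and double are hypothetical. It shows the same name bound first to a list and then to a float, and an operator whose meaning is chosen at run time from the operand's type:

```python
# The name `values` is bound to a type only when it is assigned.
values = [2, 4.33, 6, 8]
first_type = type(values).__name__    # name of the type bound at this point

values = 17.3                         # same name, now bound to a float
second_type = type(values).__name__

def double(x):
    # Whether `*` means repetition or multiplication is decided at run time.
    return x * 2

doubled_list = double([1, 2])     # list repetition
doubled_number = double(17.3)     # floating-point multiplication
```

Because the meaning of x * 2 depends on what x happens to hold, a compiler cannot reject a misuse in advance, which is the reliability cost listed above.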

Type Bindings: Type Inference

• Type inference is used in the programming languages ML, Miranda and Haskell
– Rather than by assignment statement, types are determined from the context of the reference

Example: In ML

fun square(x) = x * x;   (* the operands of * default to int, so square : int -> int *)
square(3);               (* accepted *)
square(3.3);             (* rejected: a real argument does not match int *)

Storage Bindings & Lifetime

• Allocation: getting a cell from some pool of available cells
• Deallocation: putting a cell back into the pool
• The lifetime of a variable is the time during which it is bound to a particular memory cell
– It’s convenient to separate variables into four categories, according to their lifetimes

Categories of variables by lifetimes

• Static variable: bound to memory cells before execution begins and remains bound to the same memory cell throughout execution
– Advantages
▪ Efficiency (direct addressing)
▪ History-sensitive subprogram support
– Disadvantages
▪ Lack of flexibility (no recursion)
▪ Storage cannot be shared among variables

Categories of variables by lifetimes (cont.)

• Stack-dynamic variable: storage bindings are created for variables when their declaration statements are elaborated
• All attributes except address are statically bound
– Disadvantages
▪ Overhead of allocation and deallocation
▪ Subprograms cannot be history sensitive
▪ Inefficient references (indirect addressing)
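
The trade-off between static and stack-dynamic variables can be imitated in Python. This is an analogy, not a description of how Python stores variables: a module-level variable stands in for a static variable, and an ordinary local stands in for a stack-dynamic one. All names are hypothetical.

```python
# Stand-in for a static variable: one binding that lasts for the whole run.
_call_count = 0

def count_calls_static():
    global _call_count
    _call_count += 1          # the value survives between calls
    return _call_count

def count_calls_local():
    call_count = 0            # a fresh binding on every activation
    call_count += 1
    return call_count         # history is lost when the call returns

def factorial(n):
    # Each activation has its own `n`, which is what makes recursion possible.
    return 1 if n <= 1 else n * factorial(n - 1)

static_results = [count_calls_static() for _ in range(3)]
local_results = [count_calls_local() for _ in range(3)]
```

The first function is history sensitive and the second is not; the recursive function relies on per-activation storage, which static allocation alone cannot provide.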

• Explicit heap-dynamic variable: allocated and deallocated by explicit directives, specified by the programmer, which take effect during execution
• Referenced only through pointers or references
– Advantage: provides for dynamic storage management
– Disadvantage: inefficient and unreliable

• Implicit heap-dynamic variable: allocation and deallocation caused by assignment statements

Type Checking

– A type error is the application of an operator to an operand of an inappropriate type
– If all type bindings are static, nearly all type checking can be static. If type bindings are dynamic, type checking must be dynamic
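
As a small illustration of dynamic type checking, the Python sketch below (the function name add is hypothetical) contains a type error that is reported only when the offending call actually executes:

```python
def add(a, b):
    return a + b

ok = add(1, 2)               # both operands are ints: no error

try:
    add(1, "two")            # + applied to an operand of an inappropriate type
    detected = False
except TypeError:
    detected = True          # the error is found only at run time
```

In a language with static type bindings, the second call could be rejected before the program runs.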


Type Checking (cont.)

• Type checking will touch on the topics of:

– Type equivalence

– Type compatibility

– Type inference

Strong Typing

• A programming language is strongly typed if type errors are always detected
• A strongly typed language provides:
– A single type associated with each name in a program, known at compile time
– Detection of the misuses of variables that result in type errors
– Detection, at run time, of uses of incorrectly typed values in variables that can store values of more than one type

Example - Pascal

type Value = record
  case numType : (float, int) of
    float: (val : real);
    int: (parts : array [1..4] of char);
end;
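
The effect of writing a variant record through one variant and reading it through another can be imitated with Python's standard struct module. This is a sketch of the idea, not of Pascal's actual storage layout; the 32-bit little-endian format is an assumption made for the example.

```python
import struct

# Store a value through the "float" view of the storage...
raw = struct.pack("<f", 1.0)            # 4 bytes holding a 32-bit float

# ...and read the same bytes back as four separate characters.
parts = struct.unpack("<4c", raw)

# The same four bytes reinterpreted as an unsigned 32-bit integer.
(as_int,) = struct.unpack("<I", raw)
```

No conversion takes place: the bit pattern of 1.0 is simply given a different type, which is why unchecked variants defeat strong typing.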

Example - Ada

generic
   type Source (<>) is limited private;
   type Target (<>) is limited private;
function Ada.Unchecked_Conversion(Source_Object : Source) return Target;

Example - ML

• ML is strongly typed, but in a slightly different sense than the imperative languages
– Types are all statically known, either from declarations or from its type inference rules

fun square(x) = x * 10;
fun square(x) : int = x * x;

Other Languages

• FORTRAN 77 is not a strongly typed language (EQUIVALENCE)
• C/C++ are not strongly typed languages (parameter type checking can be avoided; union types are not type checked)
• Modula-3 has a predefined procedure named LOOPHOLE that serves the same purpose as Ada's Unchecked_Conversion

– Name equivalence is based on the lexical occurrence of type definitions (each definition introduces a new type)

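Structural Equivalence

• The exact definition of structural equivalence varies from one language to another
• The format of a declaration should not matter
• In a Pascal-like language with structural equivalence:

type str = array [1..10] of char;
type str = array [1..2 * 5] of char;   { same type as the first: only the way the bounds are written differs }
type str = array [0..9] of char;       { same length but different index values: whether this is equivalent depends on the language }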

Structural Equivalence (cont.)

• To determine if two types are structurally equivalent, a compiler can expand their definitions by replacing any embedded type names with their respective definitions, recursively, until nothing is left but a string of type constructors, field names, and built-in types
• If these expanded strings are the same, then the types are equivalent, and conversely
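
The expansion procedure described above can be sketched in a few lines of Python. The type table, its tuple encoding and the names TYPES, BUILT_IN, expand and structurally_equivalent are all hypothetical, and the sketch handles only non-recursive record types:

```python
# Hypothetical type table: each name maps to a type expression, which is
# either another type name or a tuple ("record", (field, type), ...).
TYPES = {
    "celsius": "real",
    "fahrenheit": "real",
    "student": ("record", ("name", "string"), ("age", "integer")),
    "school": ("record", ("name", "string"), ("age", "integer")),
    "point": ("record", ("x", "celsius"), ("y", "fahrenheit")),
    "pair": ("record", ("x", "real"), ("y", "real")),
    "labelled": ("record", ("label", "string"), ("age", "integer")),
}
BUILT_IN = {"real", "integer", "string", "char"}

def expand(expr):
    """Replace embedded type names with their definitions, recursively."""
    if isinstance(expr, str):
        if expr in BUILT_IN:
            return expr
        return expand(TYPES[expr])        # a defined name: expand its definition
    constructor, *fields = expr
    return (constructor,) + tuple((name, expand(t)) for name, t in fields)

def structurally_equivalent(a, b):
    # Equivalent exactly when the fully expanded forms are identical.
    return expand(a) == expand(b)
```

With this table, student and school expand to the same form and are equivalent, while student and labelled differ because field names are part of the expanded form. A real compiler would also need a strategy for recursive types, where naive expansion does not terminate.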

Structural Equivalence (cont.)

• Structural equivalence cannot differentiate between types that the programmer may think of as distinct, but which happen (by coincidence) to have the same internal structure

type
  celsius = real;
  fahrenheit = real;

Variables of these two types are considered compatible under structural equivalence, allowing them to be mixed in expressions

type student = record
  name, address : string;
  age : integer
end

type school = record
  name, address : string;
  age : integer
end

Name Equivalence

• It is based on the assumption that if the programmer takes the effort to write two type definitions, then those definitions are probably meant to represent different types
• In the above example, the types will be considered different under name equivalence
• Alias types, whose definitions simply specify the name of some other type, give rise to variants of name equivalence

TYPE new_type = old_type

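Name Equivalence (cont.)

• A language in which aliased types are considered distinct is said to have strict name equivalence; a language in which they are considered equivalent is said to have loose name equivalence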

Name Equivalence (cont.)

• Ada achieves the best of both worlds by allowing the programmer to indicate whether an alias represents a derived type or a subtype
– A subtype is compatible with its base type (subtypes of the same base type are also compatible with each other)
– A derived type is incompatible with its parent type and with other types derived from it

subtype int is integer;           -- compatible with integer
type celsius is new integer;      -- incompatible with integer
type fahrenheit is new integer;   -- incompatible with integer and with celsius

type cell = …   -- whatever
type alink = pointer to cell
type blink = alink   -- alias type

• Under strict name equivalence
alink ≠ blink ≠ pointer to cell
• Under loose name equivalence
alink = blink ≠ pointer to cell
• Under structural equivalence
alink = blink = pointer to cell
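
The three rules can be compared with a small Python sketch built on the alink and blink declarations. The table DECLS and the function names are hypothetical; an anonymous type expression such as "pointer to cell" is represented here by the tuple ("pointer", "cell"):

```python
# Hypothetical declaration table mirroring the slide:
#   type alink = pointer to cell
#   type blink = alink            (alias type)
DECLS = {
    "alink": ("pointer", "cell"),   # a new type built by a constructor
    "blink": "alink",               # an alias: the definition is just a name
}

def resolve_alias(t):
    """Follow alias definitions until a non-alias type is reached."""
    while isinstance(DECLS.get(t), str):
        t = DECLS[t]
    return t

def strict_name_equiv(a, b):
    return a == b                   # every declaration is a distinct type

def loose_name_equiv(a, b):
    return resolve_alias(a) == resolve_alias(b)

def structural_equiv(a, b):
    def structure(t):
        if isinstance(t, tuple):
            return t                # anonymous type expression
        return DECLS.get(resolve_alias(t), t)
    return structure(a) == structure(b)

ANONYMOUS = ("pointer", "cell")     # stands for a bare "pointer to cell"
```

Here strict_name_equiv("alink", "blink") is false, loose_name_equiv("alink", "blink") is true, and only structural_equiv also accepts ANONYMOUS as the same type as alink.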

Type Compatibility

• Most languages do not require equivalence of types in every context. Instead, a value’s type must be compatible with that of the context in which it appears
– In an assignment statement, the type of the RHS must be compatible with that of the LHS
– In a subroutine call, the types of any arguments passed into the subroutine must be compatible with the types of the corresponding formal parameters, and vice versa

Type Compatibility (cont.)

• The definition of type compatibility varies greatly from language to language
• Ada takes a relatively restrictive approach: an Ada type S is compatible with an expected type T if and only if
1. S and T are equivalent, or
2. One is a subtype of the other (or both are subtypes of the same base type), or
3. Both are arrays, with the same numbers (in each dimension) and types of elements

• A type coercion is an automatic, implicit conversion of a value of one type to the expected type
• Because coercions allow types to be mixed without an explicit indication of intent on the part of the programmer, they represent a significant weakening of type security
– Languages with a great deal of coercion (FORTRAN, C/C++) are significantly less reliable than those with little coercion, such as Ada
– The value of strong typing is weakened by coercion
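
A short Python sketch of the difference between implicit and explicit conversion (Python is used only for illustration and coerces less than FORTRAN or C; the variable names are hypothetical):

```python
mixed = 3 + 2.5            # the int operand is converted to float implicitly
flag_sum = True + 1        # a bool takes part in arithmetic as 1, which may hide a mistake

try:
    "3" + 4                # no implicit conversion between str and int
    rejected = False
except TypeError:
    rejected = True        # the mixed-type expression is reported as an error

explicit = int("3") + 4    # the programmer states the conversion explicitly
```

The first two expressions are accepted silently, which is the loss of error detection the slides attribute to coercion; the last line shows the intent written out.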

Example: In Ada

type weekday is (sun, mon, tue, wed, thu, fri, sat);
subtype workday is weekday range mon .. fri;
type calendar_column is new weekday;
…
d : weekday; c : calendar_column; k : workday;
…
k := d;                    -- coercion; run-time check required
c := d;                    -- static semantic error (the types are not compatible)
c := calendar_column(d);   -- cast

• The scope of a variable is the range of statements in which the variable is visible
• A variable is visible in a statement if it can be referenced in that statement
• The nonlocal variables of a program unit are those that are visible but not declared in that program unit
• A variable is local in a program unit or block if it is declared there

Static scope

• The scope of a variable can be statically determined, that is, prior to execution
• To connect a reference to a variable, the compiler must find the declaration
• Search process: search declarations, first locally, then in increasingly larger enclosing scopes, until one is found for the given name
• Enclosing static scopes (to a specific scope) are called its static ancestors; the nearest static ancestor is called a static parent
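
The search process can be sketched as a walk up a chain of static parents. The Scope class, the resolve_static function and the nesting of sub1 and sub2 inside big (with x declared in big and in sub2) are assumptions made for this Python example:

```python
class Scope:
    """A program unit with its declarations and its static parent."""
    def __init__(self, name, declared, parent=None):
        self.name = name
        self.declared = set(declared)
        self.parent = parent          # nearest enclosing scope, or None

def resolve_static(scope, variable):
    """Search locally, then in successively larger enclosing scopes."""
    while scope is not None:
        if variable in scope.declared:
            return scope.name         # the unit whose declaration is used
        scope = scope.parent
    raise NameError(variable)

# Hypothetical nesting: sub1 and sub2 are both declared inside big.
big = Scope("big", ["x"])
sub1 = Scope("sub1", [], parent=big)
sub2 = Scope("sub2", ["x"], parent=big)
```

A reference to x in sub1 resolves to the declaration in big, while a reference in sub2 resolves locally and hides the x of big. The result depends only on the program text, never on which unit called which.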

procedure sub2;
  var x: integer;
  begin
    … x …
  end;
begin
  … x …
end;

Static scope (cont.)

• Variables can be hidden from a unit by having a "closer" variable with the same name
• C++ and Ada allow access to these "hidden" variables
– In Ada: unit.name
– In C++: ::name

Static scope - Blocks

• Many languages allow a section of code to have its own local variables whose scope is minimized. Such a section of code is called a block
• Variables are typically stack dynamic, so they have their storage allocated when the section is entered and deallocated when the section is exited
• The scopes created by blocks are treated exactly like those created by parameterless subprograms

Evaluation of Static Scoping

Assume MAIN calls A and B, A calls C and D, and B calls A and E.

Dynamic Scope

• Dynamic scoping is based on the calling sequence of subprograms, not on their spatial relationship to each other
• The scope can be determined only at run time
• References to variables are connected to declarations by searching back through the chain of subprogram calls that forced execution to this point

procedure sub2;
  var x: integer;
  begin
    …
  end;
begin
  …
end;

Call sequence: big → sub2 → sub1

The search proceeds from the local procedure, sub1, to its caller, sub2, where a declaration for x is found. So the reference to x in sub1 is to the x declared in sub2
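
The search back through the chain of calls can be sketched in Python as follows. The function resolve_dynamic and the declarations table are hypothetical; the table assumes that big and sub2 each declare x and that sub1 declares nothing:

```python
def resolve_dynamic(call_chain, declarations, variable):
    """Search the active subprograms from the most recent call back to the first."""
    for unit in reversed(call_chain):
        if variable in declarations[unit]:
            return unit               # the unit whose declaration is used
    raise NameError(variable)

# Hypothetical declarations for the three units.
declarations = {"big": {"x"}, "sub1": set(), "sub2": {"x"}}

# big calls sub2, which calls sub1: the x of sub2 is found first.
via_sub2 = resolve_dynamic(["big", "sub2", "sub1"], declarations, "x")

# big calls sub1 directly: the same reference now means the x of big.
direct = resolve_dynamic(["big", "sub1"], declarations, "x")
```

The same reference in sub1 resolves to different declarations depending on the call sequence, which is why the scope can be determined only at run time.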


Evaluation of Dynamic Scoping

• The local variables of the active subprogram are visible to any other executing subprogram, regardless of its textual proximity
• References to nonlocals cannot be statically type checked
• Poor readability and reliability
• Accesses to nonlocal variables in dynamically scoped languages take far longer than accesses to nonlocals when static scoping is used
• Advantage: in some cases, parameters do not need to be passed

Title	Names, Binding, Type Checking and Scopes
Institution	Addison-Wesley
Type	chapter
Year of publication	2006
