CHAPTER 4

Understanding Reference Types and Value Types

IN THIS CHAPTER

A Quick Introduction to Reference Types and Value Types

The Unified Type System

Reference Type and Value Type Memory Allocation

Reference Type and Value Type Assignment

More Differences Between Reference Types and Value Types

C# and .NET Framework Types

Nullable Types

Deep in the course of coding, you’re often immersed in logic, solving the problem at hand. Simple actions, such as assignment and instantiation, are tasks you perform regularly, without much thought. However, when writing C# programs, or using any language that targets the Common Language Runtime (CLR), you might want to take a second look. What appears to be simple can sometimes result in hard-to-find bugs. This chapter goes into greater depth on CLR types and shows you a few things about coding in C# that often catch developers off guard. More specifically, you learn about the differences between reference types and value types.

The .NET type system, which C# is built upon, is divided into reference types and value types. You’ll work with each of these types all the time, and it’s important to know the differences between them. This chapter shows you the differences via memory allocation and assignment behaviors. This understanding should help you make smart design decisions that improve application performance and reduce errors.

A Quick Introduction to Reference Types and Value Types

There is much to be said about reference types and value types, but this section gives a quick introduction to the essentials. You learn a little about their behaviors and what they look like in code.

As its name suggests, a reference type has a value that is a reference to an object in memory. However, a value type has a value that contains the object itself.

Up until now, you’ve been creating custom reference types, which are defined with the class keyword, such as a Customer class with a Name field. The struct keyword classifies a type as a value type, as with a Money type that has an Amount field.

In both of these examples, I used the public modifier on the Name and Amount fields. This allows code using the Customer and Money types to access the Name and Amount fields, respectively. You can learn more about the different access modifiers in Chapter 9, “Implementing Object-Oriented Principles.”
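The two declarations described above might look like the following minimal sketch. The field types (string for Name, decimal for Amount) and the TypeKindsDemo class are assumptions made for illustration.

```csharp
using System;

// Reference type: declared with the class keyword.
class Customer
{
    public string Name;    // field type is an assumption
}

// Value type: declared with the struct keyword.
struct Money
{
    public decimal Amount; // field type is an assumption
}

class TypeKindsDemo
{
    static void Main()
    {
        Customer cust = new Customer();
        cust.Name = "John Smith";

        Money cash = new Money();
        cash.Amount = 3.50m;

        // IsValueType reports which side of the type system a type is on.
        Console.WriteLine(typeof(Customer).IsValueType); // False
        Console.WriteLine(typeof(Money).IsValueType);    // True
    }
}
```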

Later sections of this chapter go into even greater depth on the differences between these types, but at least you now know the bare minimum to move forward. The next section starts your journey into understanding the differences between reference types and value types and how these differences affect you.

The Unified Type System

Before looking at the specific behaviors of reference types and value types, you should understand the relationship between them and how the C# type system, the Unified Type System, works. The details described here help you understand the coding practices and performance issues that you learn later in the chapter.

How the Unified Type System Works

Essentially, the Unified Type System ensures that all C# types derive from a common ancestor, System.Object. The C# object type is an alias for System.Object, and further discussion will use a C# perspective and refer to System.Object as object. Figure 4.1 illustrates this relationship.

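[Figure 4.1 diagram: System.Object is at the root. Reference types and System.ValueType derive from System.Object; further reference types derive from other reference types, and value types derive from System.ValueType.]

FIGURE 4.1 In the Unified Type System, all objects derive from the object type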

WHAT IS INHERITANCE?

Inheritance is an object-oriented principle that promotes reuse and helps build hierarchical frameworks of objects. In the context of this chapter, you learn that all types derive from object. This gives you the ability to assign a derived object to a variable of type object. Also, whatever belongs to object is also a member of a derived class. In Chapter 8, “Designing Objects,” and Chapter 9, “Implementing Object-Oriented Principles,” you can learn a lot more about C# syntax supporting inheritance and how to use it. Throughout the rest of the book, too, you’ll see many examples of how to use inheritance.

In Figure 4.1, the arrows are Unified Modeling Language (UML) generalization symbols, showing how one type, a box, derives from the type being pointed to. The direction of inheritance shows that all types derive either directly or indirectly from object.

Reference types can either derive directly from System.Object or from another reference type. However, the relationship between value type objects and object is indirect. All value types implicitly derive from the System.ValueType class, itself a reference type, which inherits from object. For simplicity, further discussion omits the fact of either explicit or implicit inheritance relationships.
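You can check these relationships yourself with reflection. The following is a small sketch, not part of the chapter’s own listings; the TypeHierarchyDemo class name is made up for illustration.

```csharp
using System;

class TypeHierarchyDemo
{
    static void Main()
    {
        // decimal (System.Decimal) is a value type, so its base type is System.ValueType.
        Console.WriteLine(typeof(decimal).BaseType);      // System.ValueType

        // System.ValueType is itself a class that derives from System.Object.
        Console.WriteLine(typeof(ValueType).BaseType);    // System.Object
        Console.WriteLine(typeof(ValueType).IsValueType); // False

        // string is a reference type that derives directly from System.Object.
        Console.WriteLine(typeof(string).BaseType);       // System.Object
    }
}
```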

At this point, you might be scratching your head and wondering why you should care (a natural reaction). The big deal is that your coding experience with treating types in a generic manner is simplified (the good news), but you must also be aware of performance penalties that are possible when treating types in a generic manner. In Chapter 17, “Parameterizing Type with Generics and Writing Iterators,” you can learn about the best way to manage generic code, but the next two sections explain the implications of the Unified Type System and how it affects you.

Using object for Generic Programming

Because both reference types and value types inherit object, you can assign any type to a variable of type object, as shown here:

decimal amount = 3.50m;

object obj1 = amount;

Customer cust = new Customer();

object obj2 = cust;

The amount variable is a decimal, a value type, and the cust variable is a Customer class, a reference type.

Any assignment to object is an implicit conversion, which is always safe. However, doing an assignment from type object to a derived type may or may not be safe. C# forces you to state your intention with a cast operator, as shown here:

Customer cust2 = (Customer)obj2;

The cast operator is necessary because the C# compiler can’t tell whether obj2 is actually a Customer type. Chapter 10, “Coding Methods and Custom Operators,” goes into greater depth on conversions, but the basic idea is that C# is type-safe and has features that ensure safe assignments of one object to another.
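To see why the cast “may or may not be safe,” consider this sketch. It assumes a minimal Customer class with a Name field; the CastDemo class and the use of the as operator are additions for illustration and are not taken from the chapter.

```csharp
using System;

class Customer { public string Name; }

class CastDemo
{
    static void Main()
    {
        object obj1 = 3.50m;            // holds a boxed decimal
        object obj2 = new Customer();   // holds a Customer reference

        Customer ok = (Customer)obj2;   // succeeds: obj2 really is a Customer
        Console.WriteLine(ok != null);  // True

        try
        {
            Customer bad = (Customer)obj1; // compiles, but fails at run time
            Console.WriteLine(bad.Name);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("obj1 is not a Customer");
        }

        // The as operator yields null instead of throwing.
        Customer maybe = obj1 as Customer;
        Console.WriteLine(maybe == null); // True
    }
}
```

The compiler accepts both casts because either object could be a Customer; only the runtime knows the true type.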

A more concrete example of when you might see a situation where a variable can be assigned to another variable of type object is with standard collection classes. The first version of the .NET Framework Class Library (FCL) included a library of collection classes, one of them being ArrayList. These collections offered many conveniences that you don’t have in C# arrays or would have to create yourself.

One of the features of these collections, including ArrayList, was that they could work generically with any type. The Unified Type System makes this possible because the collections operate on the object type, meaning that you can use them with any .NET type. Here’s an example that uses an ArrayList collection:

ArrayList customers = new ArrayList();
Customer cust1 = new Customer();
cust1.Name = "John Smith";
Customer cust2 = new Customer();
cust2.Name = "Jane Doe";

... customers ArrayList. Notice that the foreach loop works as seamlessly with collections as it does with arrays.
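A complete, runnable version of that idea might look like the following sketch. The Add calls and the foreach body are assumptions based on the surrounding description, as is the minimal Customer class.

```csharp
using System;
using System.Collections;

class Customer { public string Name; }

class ArrayListDemo
{
    static void Main()
    {
        ArrayList customers = new ArrayList();

        Customer cust1 = new Customer();
        cust1.Name = "John Smith";
        Customer cust2 = new Customer();
        cust2.Name = "Jane Doe";

        // Add takes an object parameter, so any type is accepted.
        customers.Add(cust1);
        customers.Add(cust2);

        // foreach casts each element from object back to Customer.
        foreach (Customer cust in customers)
        {
            Console.WriteLine(cust.Name);
        }
    }
}
```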

Again, because the ArrayList operates on type object, it is convenient to use with any type, whether it is a reference type or value type. The preceding example showed you how to assign a reference type, the Customer class, to an ArrayList, which is convenient. However, there is a hidden cost when assigning value types to object type variables, such as the elements of an ArrayList. The next section explains this phenomenon, which is known as boxing and unboxing.

Performance Implications of Boxing and Unboxing

Boxing occurs when you assign a value type variable to a variable of type object. Unboxing occurs when you assign a variable of type object to a variable with the same type as the true type of the object. The following code is a minimal example that causes boxing and unboxing to occur:

decimal amountIn = 3.50m;

object obj = amountIn; // box

decimal amountOut = (decimal)obj; // unbox

Figures 4.2 to 4.4 illustrate what is happening in the preceding code. Figure 4.2 shows the first line.

Before boxing, as in the declaration of amountIn, the variable is just a value type that contains the data directly. However, as soon as you assign that value type variable to an object, as in Figure 4.3, the value is boxed.

FIGURE 4.2 A value type variable before boxing

FIGURE 4.4 Unboxing a value

As shown in Figure 4.3, boxing causes a new object to be allocated on the heap and a copy of the original value to be put into the boxed object. Now, you have two copies of the original variable: one in amountIn and another in the boxed decimal, obj, on the heap. You can pull that value out of the boxed decimal, as shown in Figure 4.4.

In Figure 4.4, the boxed value in obj is copied into the decimal variable, amountOut. Now, you have three copies of the original value that was assigned to amountIn.
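One way to convince yourself that these really are separate copies is to change the original variable after boxing it. This is a sketch built on the three-line example above; the BoxingDemo class, the 9.99m value, and the deliberately wrong unboxing to int are additions for illustration.

```csharp
using System;

class BoxingDemo
{
    static void Main()
    {
        decimal amountIn = 3.50m;
        object obj = amountIn;            // box: copies 3.50 into a new heap object
        amountIn = 9.99m;                 // changes only the original variable

        decimal amountOut = (decimal)obj; // unbox: copies the boxed value out
        Console.WriteLine(amountIn == 9.99m);   // True
        Console.WriteLine(amountOut == 3.50m);  // True

        try
        {
            int wrong = (int)obj;         // unboxing requires the exact boxed type
            Console.WriteLine(wrong);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("a boxed decimal cannot be unboxed as int");
        }
    }
}
```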

Writing code like the minimal boxing example above is pointless because it doesn’t do anything useful. However, the point of this boxing and unboxing exercise is so that you can see the mechanics of what is happening and understand the overhead associated with it. Without understanding the information in this section, you could easily write a lot of code similar to the ArrayList example in the previous section. Consider, for example, an ArrayList named prices that holds decimal values.
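A minimal sketch of such code follows. The prices name comes from the discussion; the values, the loop, and the PricesDemo class are illustrative assumptions.

```csharp
using System;
using System.Collections;

class PricesDemo
{
    static void Main()
    {
        ArrayList prices = new ArrayList();

        // Each Add boxes the decimal, allocating a new object on the heap.
        prices.Add(3.50m);
        prices.Add(4.25m);
        prices.Add(10.00m);

        decimal total = 0m;
        // Each iteration unboxes an element back into a decimal.
        foreach (decimal price in prices)
        {
            total += price;
        }

        Console.WriteLine(total == 17.75m); // True
    }
}
```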

Because of the Unified Type System, code that stores decimal values in an ArrayList is as convenient as the code written for the Customer class, but beware. If the prices ArrayList held 10, 20, or 100 decimal type variables, you probably wouldn’t care. However, what if it contains 10,000 or 100,000? In that case, you should be concerned because this could have a serious impact on the performance of your application.

Generally, any time you assign a value type to any object variable, whether a collection or a method parameter, take a second look to see whether there is potential for performance problems. In development, you might not notice any performance problem; after deployment to production, however, you could get slammed by a slow application with a hard-to-find bug.

From the perspective of collections, you have two choices: arrays or generics. You can learn more about arrays in Chapter 6, “Using Arrays and Enums.” If you are programming in C# 1.0, your only choices will be arrays or collections, and you’ll have to design with tradeoffs between convenience and performance, or type safety and no type safety. If you’re using C# 2.0 or later, you can have the best of both worlds, performance and type safety, by using generics, which you can learn more about in Chapter 17.
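As a brief preview of the generic alternative, here is a sketch using List<decimal> from System.Collections.Generic. The GenericPricesDemo class and the values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

class GenericPricesDemo
{
    static void Main()
    {
        // List<decimal> stores decimal values directly, so Add does not box.
        List<decimal> prices = new List<decimal>();
        prices.Add(3.50m);
        prices.Add(4.25m);

        // The element type is known at compile time; no cast or unboxing is needed.
        decimal first = prices[0];
        Console.WriteLine(first == 3.50m);  // True

        // prices.Add("free");  // would not compile: a string is not a decimal
    }
}
```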

Now that you know the performance characteristics of boxing and unboxing, let’s dig a little deeper. The next sections tell you more about what reference types and value types are, their differences, and what you need to know.

Reference Type and Value Type Memory Allocation

Reference type and value type objects are allocated differently in memory. This can affect your code in the area of method call parameters and is the basis for understanding assignment behavior in the next section. This section takes a quick look at memory allocation and the differences between reference types and value types.

FIGURE 4.5 Reference type declaration

Reference Type Memory Allocation

Reference type objects are always allocated on the heap. The following code is a typical reference type object declaration and instantiation:

Customer cust = new Customer();

In earlier chapters, I explained that this was how you declare and instantiate a reference type, but there is much more to the preceding line. By declaring cust as type Customer, the variable cust is strongly typed, meaning that only compatible objects can be assigned to it. Figure 4.5 shows the declaration of cust, from the left side of the statement.

In Figure 4.5, the cust box is in your code, representing the declaration of cust as Customer. On the right side of the preceding code, the new Customer() is what creates the new instance of a Customer object. The assignment puts a reference into cust that refers to the new Customer object on the heap, as shown in Figure 4.6.

Figure 4.6 shows how the cust variable holds a reference to the new instance of a Customer object on the heap. The heap is a portion of memory that the CLR uses to allocate objects. This is what you should remember: A reference type variable will either hold a reference to an object on the heap or it will be set to the C# value null.

Next, you learn about value type memory allocation and how it is different from reference type memory allocation.

Value Type Memory Allocation

The answer to where value type variables are allocated is “It depends.” The two places that a value type variable can be allocated are the stack or along with a reference type on the heap. See the sidebar “What Is the Stack?” if you’re curious about what the stack is.

FIGURE 4.6 Reference type object allocated on the heap

WHAT IS THE STACK?

The CLR has a stack for keeping track of the path from the entry point to the currently executing method in an application. Just like any other stack, the CLR stack works in a last-in, first-out fashion. When your program runs, Main (the entry point) is pushed onto the stack. Any method that Main calls is then pushed onto the top of the stack. Method parameter arguments and local variables are pushed onto the stack, too. When a method completes, it is popped off the top of the stack, and control returns to the next method in the stack, which was the caller of the method just popped.

Value type variables passed as arguments to methods, as well as local variables defined inside a method, are pushed onto the stack with the method. However, if the value type variable is a field of a reference type object, it will be stored on the heap along with the reference type object.

Regardless of memory allocation, a value type variable will always hold the object that is assigned to it. An uninitialized value type field will have a value that defaults to some form of zero (bool defaults to false), as described in Chapter 2.
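The default values are easy to observe. In this sketch, the Account class and its fields are hypothetical, invented only to show a value type field living inside a reference type object; Money is assumed to be a struct with a decimal Amount field.

```csharp
using System;

struct Money { public decimal Amount; }

class Account
{
    // Fields of a reference type object live on the heap with the object.
    public int Id;
    public bool IsActive;
    public Money Balance;   // value type field: stored inside the Account object
    public string Owner;    // reference type field: holds a reference or null
}

class DefaultsDemo
{
    static void Main()
    {
        Account acct = new Account();
        Console.WriteLine(acct.Id == 0);              // True
        Console.WriteLine(acct.IsActive == false);    // True
        Console.WriteLine(acct.Balance.Amount == 0m); // True
        Console.WriteLine(acct.Owner == null);        // True
    }
}
```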

C# 2.0 and later versions have a feature known as nullable types, which also allow value types to contain the value null. A later section of this chapter explains how to use nullable types.

Now you know where reference type and value type variables are allocated in memory, but more important, you understand the type of data they can hold and why. This opens the door to understanding the assignment differences between reference types and value types, which is discussed next.

Reference Type and Value Type Assignment

Based on what you know so far about reference types and value types (their relationship through the Unified Type System and memory allocation), the step to understanding assignment behavior is easier. This section examines assignment among reference types and assignment among value types. You’ll see how reference type and value type assignment differs and what happens when assigned values are subsequently modified. We look at reference type assignment first.

Reference Type Assignment

To understand reference type assignment, it’s helpful to look at previous sections of this chapter, focusing on reference type features. Because the value of a reference type resides on the heap, the reference type variable holds a reference to the object on the heap. Keeping this in mind, here’s an example of reference type assignment:

cust6.Name = "Jane Doe";
Console.WriteLine("Before Reference Assignment:");

In the preceding example, you can see there are two variables, cust5 and cust6, of type Customer that are declared and initialized. Between sets of Console.WriteLine statements, there is an assignment of cust6 to cust5. The Console.WriteLine statements show the effect of the assignment, and here’s what they show when the program runs:

Before Reference Assignment:

cust5: John Smith

cust6: Jane Doe

After Reference Assignment:

cust5: Jane Doe

cust6: Jane Doe

You can see from the first part of the preceding output that the values of the Name field in cust5 and cust6 are different. There are no surprises here because that is what the code explicitly did when declaring and instantiating the variables. What could be misleading is the result, after assignment, where both cust5.Name and cust6.Name produce the same results. The following statements and results show why the preceding results could be misleading:

cust6.Name = "John Smith";
Console.WriteLine("After modifying the contents of a Reference type object:");
Console.WriteLine("cust5: {0}", cust5.Name);
Console.WriteLine("cust6: {0}", cust6.Name);

And here’s the output:

After modifying the contents of a Reference type object:

cust5: John Smith

cust6: John Smith

In the preceding code, the only assignment was to change the Name field of cust6 to "John Smith", but look at the results. The Name fields of both cust5 and cust6 are set to the same value. What’s tricky is that the code in this last example didn’t use the cust5 variable at all.

Now, let’s see what happened. To start off, look at Figure 4.7, which shows what the memory layout is right after cust5 and cust6 were declared and initialized.

FIGURE 4.7 Two reference type variables declared and initialized separately

Because cust5 and cust6 are reference type variables, they hold a reference (address) of an object on the heap. Figure 4.7 represents the reference as an arrow coming from the variables cust5 and cust6 to Customer objects. These Customer objects were allocated during runtime when the new Customer expression ran. Each object contains a different value in its Name field. Next, you see the effects of assigning the cust6 variable to cust5, shown in Figure 4.8.

FIGURE 4.8 Assigning one reference type variable to another

The assignment of cust6 to cust5 didn’t copy the object; it actually copied the contents of the cust6 variable, which was the reference to the object. So, as shown in Figure 4.8, both cust5 and cust6 now refer to the same object. Figure 4.9 shows what happens after modifying the Name field of cust6.

Figure 4.9 shows that changing the Name field of cust6 actually modified the object referred to by cust6. This is important because both cust5 and cust6 refer to the same object, and any modification to that object will be seen through both the cust5 and cust6 reference. That’s why the output, after we modified cust6, showed that the Name field of cust5 and cust6 were the same.

FIGURE 4.9 Effects of modifying the contents of an object that has multiple references to it

Looking at reference type assignment in a more general perspective, you can assume that modifications to the contents of an object are visible through any reference to the same object.
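The whole sequence walked through in this section can be condensed into one small program. This is a sketch that assumes a minimal Customer class with a Name field; the ReferenceEquals check is an addition that confirms both variables refer to one object.

```csharp
using System;

class Customer { public string Name; }

class ReferenceAssignmentDemo
{
    static void Main()
    {
        Customer cust5 = new Customer();
        cust5.Name = "John Smith";
        Customer cust6 = new Customer();
        cust6.Name = "Jane Doe";

        cust5 = cust6;              // copies the reference, not the object
        cust6.Name = "John Smith";  // modifies the one shared object

        Console.WriteLine("cust5: {0}", cust5.Name);  // cust5: John Smith
        Console.WriteLine("cust6: {0}", cust6.Name);  // cust6: John Smith
        Console.WriteLine(object.ReferenceEquals(cust5, cust6)); // True
    }
}
```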

Assignment behavior isn’t the same for reference types and value types. The next section shows how value type assignment works and how reference type and value type assignment differs.

Value Type Assignment

Value type assignment is a whole lot simpler than reference type assignment. Because a value type variable holds the entire object, rather than only a reference to the object, no special actions can be occurring behind the scenes to affect that value. Here’s an example of a value type assignment, using the Money struct that was created earlier in the chapter:

... field of both cash1 and cash2 is the same, as shown in the following output:

Before Value Assignment:

... same as reference type assignment. The next example demonstrates how value types are separate entities that hold their own values:

After modifying contents of Value type object:
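A minimal sketch of value type assignment follows, assuming a Money struct with a decimal Amount field. The amounts, the direction of the assignment, and the ValueAssignmentDemo class are assumptions made for illustration.

```csharp
using System;

struct Money { public decimal Amount; }

class ValueAssignmentDemo
{
    static void Main()
    {
        Money cash1;
        cash1.Amount = 50.00m;
        Money cash2;
        cash2.Amount = 75.00m;

        cash1 = cash2;           // copies the whole Money value
        cash2.Amount = 100.00m;  // changes cash2 only

        Console.WriteLine(cash1.Amount == 75.00m);   // True
        Console.WriteLine(cash2.Amount == 100.00m);  // True
    }
}
```

Unlike the reference type case, modifying cash2 after the assignment leaves cash1 untouched, because each variable holds its own copy.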

In the next section, you learn a few more of the differences between reference types and value types.

More Differences Between Reference Types and Value Types

In addition to memory allocation, variable contents, and assignment behavior, there are other differences between reference types and value types. These differences can be categorized as inheritance, construction, finalization, and size recommendations. These issues are covered thoroughly in later chapters, so I just give you a quick overview here of what they mean and let you know where in this book you can get more in-depth information.

Inheritance Differences Between Reference Types and Value Types

Reference types support implementation and interface inheritance. They can derive from another reference type or have a reference type derive from them. However, value types can’t derive from other value types. Chapter 9 goes into detail about how implementation inheritance works in C#, but here’s a quick example:

class Customer

In the preceding example, Customer is the base class. PotentialCustomer and RegularCustomer derive from Customer, that is, they are types of Customer, as indicated by the : (colon) on the right side of the class identifier.

SINGLE-IMPLEMENTATION INHERITANCE

Reference types support single-implementation inheritance, meaning that they can derive only from a single class. However, both reference types and value types support multiple-interface inheritance where they can implement many interfaces.
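The inheritance relationships described above can be sketched as follows. The class bodies are assumptions (only the class names come from the discussion), and having Money implement IComparable<Money> is just one illustrative choice of interface.

```csharp
using System;

class Customer
{
    public string Name;
}

// The colon means "derives from": each class below is a kind of Customer.
class PotentialCustomer : Customer { }
class RegularCustomer : Customer { }

// A struct cannot name a base struct or class, but it can implement interfaces.
struct Money : IComparable<Money>
{
    public decimal Amount;

    public int CompareTo(Money other)
    {
        return Amount.CompareTo(other.Amount);
    }
}

class InheritanceDemo
{
    static void Main()
    {
        Customer cust = new RegularCustomer();  // a derived object fits a base variable
        cust.Name = "Jane Doe";                 // Name is inherited from Customer
        Console.WriteLine(cust is RegularCustomer);   // True
        Console.WriteLine(cust is PotentialCustomer); // False

        Money small = new Money();
        small.Amount = 1m;
        Money large = new Money();
        large.Amount = 2m;
        Console.WriteLine(small.CompareTo(large) < 0); // True
    }
}
```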

Construction and Finalization Differences Between Reference Types and Value Types

Construction is the process that occurs during instantiation to help ensure an object has the information you want it to have when it starts up. Chapter 15, “Managing Object Lifetime,” discusses this in detail, but for reference, you should know that you can’t create a default constructor for a value type. Chapter 3, “Writing C# Expressions and Statements,” contains the default values of built-in types, which are some form of zero. Value types are automatically initialized when declared, which is why the cash1 and cash2 variables in the previous section didn’t need to be instantiated with new Money().

Finalization is a process that occurs when the CLR is performing garbage collection, cleaning up objects from memory. Value type objects don’t have a finalizer, but reference types do. A finalizer is a special class member that could be called by the CLR garbage collector during cleanup. Value types are either garbage collected with their containing type or released when the method they are associated with ends. Therefore, value types don’t need a finalizer. Because garbage collection is a process that operates on heap objects, it is possible for a reference type to have a finalizer. Chapter 15 goes into the garbage collection process in detail, giving you the information you need to make effective design decisions on implementing finalizers, and other techniques for managing object lifetime effectively.

Object Size Considerations for Reference Types and Value Types

Because of the way an object is allocated differently for reference types and value types, you might need to consider the impact of the size of the object on resources and performance. Reference type objects can generally be whatever size you need because the variable just holds a reference, which is 32 bits on a 32-bit CLR and 64 bits on a 64-bit CLR. However, value type size might need more thought.

If a value type is a field inside of a class or at some level of containment that puts it into a class, its size shouldn’t be much concern. However, think about scenarios where you might need to pass a value type to a method. In the case of a reference type argument, it is simply the reference being passed, but for a value type argument, the entire object is passed. With local variables and parameters that are value types, the CLR pushes the entire object onto the stack when calling the associated method. Now, instead of the 4 or 8 bytes that would have been pushed with a reference type, you have potentially much more information to push, which represents overhead. A recommended rule of thumb is to keep value types to 16 bytes or less. I’ve benchmarked this by calling methods that have value type parameters of differing sizes and verified that performance does tend to deteriorate faster as the size of the value type increases above 16 bytes. That said, you should also look at how many times you’ll call the method before the performance implications matter to you; that is, consider whether the method is called frequently or in a loop.
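To get a feel for the sizes involved, you can ask the compiler and the runtime directly. This is a small sketch (the SizeDemo class is made up); it only reports sizes and is not a benchmark.

```csharp
using System;

class SizeDemo
{
    static void Main()
    {
        // sizeof works on built-in value types without unsafe code.
        Console.WriteLine(sizeof(int));      // 4
        Console.WriteLine(sizeof(double));   // 8
        Console.WriteLine(sizeof(decimal));  // 16

        // A reference is pointer-sized: 4 bytes on a 32-bit CLR, 8 on a 64-bit CLR.
        Console.WriteLine(IntPtr.Size);
    }
}
```

Note that a single decimal already sits at the 16-byte guideline, so a struct holding several decimal fields exceeds it.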

REFERENCE TYPE OR VALUE TYPE: WHICH TO CHOOSE?

As a rule of thumb, I typically create new types as classes, reference types. The exception is when I have a type that should behave more like a value type. For example, a ComplexNumber would probably be better as a struct value type, because of its memory allocation, assignment behavior, and other capabilities such as math operations that are similar to built-in value types such as int and double.

Among the many tips you get from this chapter for working with both reference types and value types, you also have a lot of facts related to tradeoffs. Look at the differences to see what matters the most in your situation and choose the tradeoffs that are best for you. The next section looks at specific .NET types, building on what you learned so far in this chapter.

C# and .NET Framework Types

In Chapter 1, “Introducing the .NET Platform,” you learned about the CTS and in Chapter 3, you learned how to use the C# built-in types. This section melds these two features together and builds upon them so that you can see how C# types support the CTS. We also look at a couple of .NET Framework types (specifically, DateTime and Guid) that are important but don’t have C# keyword aliases.

C# Aliases and the CTS

C# types are specified with keywords that alias .NET CLR types. Table 4.1 shows all the .NET types and which C# keywords alias them.

TABLE 4.1 .NET Types with Matching C# Aliases

Some of the types in Table 4.1 are marked as “No alias” because C# doesn’t have a keyword that aliases that type, but the type is still important. For example, DBNull is a value that comes from a database field that is set to NULL but is not equal to the C# null value. The following sections show you how to work with the System.Guid and System.DateTime types, which don’t have C# aliases either.

Using System.Guid

A globally unique identifier (GUID) is a 128-bit value, usually written as a string of hexadecimal characters, used whenever there is a need for a unique way to identify something. You can see GUIDs used throughout the Microsoft operating system. Just look at the registry; all of those long strings of characters are GUIDs. Another place GUIDs are used is as unique columns in SQL Server for when you need unique IDs across separate databases. Generally, any time you need a unique value, you can reliably use a GUID.

GUIDs are Microsoft’s implementation of universally unique identifiers (UUID), an Open Software Foundation (OSF) standard. You can find more information about UUIDs at Wikipedia, http://en.wikipedia.org/wiki/Universally_Unique_Identifier.

.NET implements the GUID as the System.Guid (Guid) struct. You can use the Guid type to generate new GUIDs or work with an existing Guid value. Here’s an example of how you don’t want to create a new GUID:

Guid uniqueVal1 = new Guid();
Console.WriteLine("uniqueVal1: {0}", uniqueVal1.ToString());

The problem here is that Guid is a value type and it is immutable (can’t be modified). If you recall from a previous section, value types have a default (no parameter) constructor that you can’t override, and the default value is some form of zero. Therefore, the following output from the preceding code makes sense:

uniqueVal1: 00000000-0000-0000-0000-000000000000

Because Guid is immutable, you can’t change this value. Fortunately, if you have a Guid value already defined, you are still able to work with it because Guid has several constructor overloads for specifying an existing GUID. Here’s the Guid constructor overload that takes a string:

uniqueVal1 = new Guid("89e9f11b-00ee-47dc-be15-01f70eeac3f9");
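For comparison, here is a sketch that puts the three ways of obtaining a Guid side by side. Guid.NewGuid and Guid.Empty are standard members of System.Guid; the GuidDemo class and variable names are made up for illustration.

```csharp
using System;

class GuidDemo
{
    static void Main()
    {
        Guid empty = new Guid();                 // all zeros
        Console.WriteLine(empty == Guid.Empty);  // True

        Guid fresh = Guid.NewGuid();             // generates a new GUID value
        Console.WriteLine(fresh != Guid.Empty);  // True

        Guid parsed = new Guid("89e9f11b-00ee-47dc-be15-01f70eeac3f9");
        Console.WriteLine(parsed.ToString());    // 89e9f11b-00ee-47dc-be15-01f70eeac3f9
    }
}
```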

Working with System.DateTime

Many programs need to work with dates and times. Fortunately, .NET has the System.DateTime (DateTime) type to help out. You can use DateTime to hold DateTime values, extract portions such as the day of a DateTime, and perform arithmetic calculations. You can also parse strings into DateTime instances and emit DateTime instances as a string in the format of your choice.

Creating New DateTime Objects

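The default value of DateTime is Jan 1, 0001 at 12:00 midnight, which is what the parameterless constructor, new DateTime(), produces.

You can initialize the DateTime through the constructor, which has several overloads. Here’s an example of how to use one of the more detailed overloads: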

Here’s an example of how to use one of the more detailed overloads:

date = new DateTime(2008, 7, 4, 21, 35, 15, 777);
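Both forms of construction can be tried in a short program. This is a sketch (the DateTimeCreationDemo class is made up) that prints individual properties rather than the whole value, so the results don’t depend on the machine’s regional settings.

```csharp
using System;

class DateTimeCreationDemo
{
    static void Main()
    {
        DateTime date = new DateTime();                 // default value
        Console.WriteLine(date == DateTime.MinValue);   // True
        Console.WriteLine(date.Year);                   // 1

        // Arguments: year, month, day, hour, minute, second, millisecond.
        date = new DateTime(2008, 7, 4, 21, 35, 15, 777);
        Console.WriteLine(date.Hour);                   // 21
        Console.WriteLine(date.Millisecond);            // 777
        Console.WriteLine(date.DayOfWeek);              // Friday
    }
}
```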

This section worked with the entire DateTime, but sometimes you only want to have access to parts of a DateTime. The next section shows you how to extract different parts of a DateTime.

Extracting Parts of a DateTime

You can access any part of a DateTime instance, including parts of the date, day of the week, or day of the year. Here’s an example:

Console.WriteLine(
    "{0} day {1} of the month is day {2} of the year",
    date.DayOfWeek, date.Day, date.DayOfYear);

And here’s the output:

Friday day 4 of the month is day 186 of the year

You can also extract other parts of the date (for example, month and hour). If you’re using VS2008, you can see all of them in IntelliSense.

Next, you learn how to manipulate DateTime objects.

DateTime Math and TimeSpan

You often need to manipulate DateTime objects. However, because they are immutable, you need to create a new instance with a modified value. Here’s an example:

Console.WriteLine("date before AddDays(1): {0}", date);
date.AddDays(1);
Console.WriteLine("date after AddDays(1): {0}", date);

The preceding code calls the AddDays method, trying to add a day, but the original value, date, doesn’t change, as shown by the following output:

date before AddDays(1): 11/4/2007 8:48:26 PM

date after AddDays(1): 11/4/2007 8:48:26 PM

This just proves that DateTime is immutable and hopefully saves you from making this common mistake yourself. Here’s how you can change the date variable:

Console.WriteLine("date before AddDays(1): {0}", date);
date = date.AddDays(1);
Console.WriteLine("date after AddDays(1): {0}", date);

If you look at the documentation for AddDays and other methods of DateTime that manipulate dates, you see that they return a DateTime. Just reassign the return value to the original variable, as in the preceding example, and it will work fine. Here’s the output:

date before AddDays(1): 11/4/2007 8:52:08 PM
date after AddDays(1): 11/5/2007 8:52:08 PM

The preceding output shows that the date variable now holds a new value, with the day incremented by one as intended.

You can also use the DateTime type for quick-and-dirty performance benchmarks. Here’s some code that does DateTime math and produces a TimeSpan object to tell how long an algorithm took:

int testIterations = int.MaxValue/4;
DateTime start = DateTime.Now;
for (int i = 0; i < testIterations; i++)
{
    // replace this body with the algorithm to benchmark
}
DateTime finish = DateTime.Now;
TimeSpan elapsedTime = finish - start;
Console.WriteLine("Elapsed Time: {0}", elapsedTime);

The for loop is code that you might change to hold whatever algorithm you need to benchmark. The example gets the current time before and after the for loop. Notice how the mathematical operation, subtracting start from finish, produced a TimeSpan. A TimeSpan is used to represent an amount of time, as opposed to an exact time as held by DateTime. Any time you subtract one DateTime from another, the return value is a TimeSpan. Here’s the output:

Elapsed Time: 00:00:16.2834144

Here's an exercise that you might find fun. Create a few methods that take value type parameters of varying sizes. For example, you could create multiple versions of Money and add more decimal fields to make them bigger. Then use the preceding benchmark code to call each method a specified number of times and compare the TimeSpan results of each. Do the same with a reference type. This exercise will let you know at what point the size of the value type affects performance.
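One way to start the exercise is sketched below. It is only a starting point under stated assumptions: BigMoney, BigMoneyRef, TouchValue, TouchReference, and Time are hypothetical names invented for this illustration, and the four decimal fields are an arbitrary size you would vary.

```csharp
using System;

// Hypothetical "bigger Money" value type; add or remove fields to vary its size.
public struct BigMoney
{
    public decimal A, B, C, D;
}

// Reference-type counterpart with the same fields.
public class BigMoneyRef
{
    public decimal A, B, C, D;
}

static class ParameterBenchmark
{
    // Passing a struct by value copies all of its fields into the parameter.
    public static decimal TouchValue(BigMoney money) { return money.A; }

    // Passing a class instance copies only the reference.
    public static decimal TouchReference(BigMoneyRef money) { return money.A; }

    // Runs 'body' the requested number of times and returns the elapsed time.
    public static TimeSpan Time(int iterations, Action body)
    {
        DateTime start = DateTime.Now;
        for (int i = 0; i < iterations; i++)
        {
            body();
        }
        DateTime finish = DateTime.Now;
        return finish - start;
    }

    static void Main()
    {
        BigMoney value = new BigMoney();
        BigMoneyRef reference = new BigMoneyRef();
        int iterations = 10000000;

        Console.WriteLine("Value type:     {0}",
            Time(iterations, delegate { TouchValue(value); }));
        Console.WriteLine("Reference type: {0}",
            Time(iterations, delegate { TouchReference(reference); }));
    }
}
```

The delegate call adds the same overhead to both measurements, so only the difference between the two results is meaningful. For more precise timing than DateTime.Now offers, the System.Diagnostics.Stopwatch class is generally a better fit.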

Converting Between DateTime and string Types

If the user inputs a date and/or time, it will often reach the code in the form of a string. Alternatively, sometimes a DateTime needs to be formatted and presented in the form of a string. This section shows you how to read string types into a DateTime and how to format the output of a DateTime.

The DateTime type has a Parse method you can use to get the value of a string. Here's how you can use it:

Console.Write("Please enter a date (mm/dd/yyyy): ");
string dateStr = Console.ReadLine();
date = DateTime.Parse(dateStr);
Console.WriteLine("You entered '{0}'", date);

The user's input, retrieved by the call to Console.ReadLine, came back in the form of a string, dateStr. The call to DateTime.Parse converted the string to a DateTime, which can now be manipulated with DateTime members as described in previous sections.

You could see an exception message after typing in the date on the command line. This would be because the date was not typed in the correct format. In Chapter 11, "Error and Exception Handling," you'll learn how to handle errors like this, and Chapter 10 shows you how to use the TryParse method, which is effective for handling user input. Here's the output:

Please enter a date (mm/dd/yyyy): 11/04/2007

You entered ‘11/4/2007 12:00:00 AM’

Notice from the output that the response to the Please enter a date (mm/dd/yyyy) prompt was 11/04/2007. However, the value echoed back included the time, which you may or may not want. In case you don't want the time to show or you want the output to appear differently, you have the option to specify the output format. Here's what you could do to remove the time from the preceding output:

Console.WriteLine("Date Only: {0:d}", date);

The preceding example used the format specifier in the placeholder of Console.WriteLine's format string parameter. You could have also used the DateTime ToString method like this:

Console.WriteLine("Date Only: {0}", date.ToString("d"));

A lowercase d means to print a short date. Here's what it looks like:

Date Only: 11/4/2007
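The parsing and formatting steps can be pulled together in a form that does not throw on bad input. The following is a minimal sketch, assuming you want invariant-culture parsing; TryReadDate is a hypothetical helper name, while DateTime.TryParse is the framework method mentioned earlier.

```csharp
using System;
using System.Globalization;

static class DateInputSketch
{
    // Returns true and sets 'result' when 'text' is a valid date; never throws.
    // The invariant culture is an assumption made to keep results predictable.
    public static bool TryReadDate(string text, out DateTime result)
    {
        return DateTime.TryParse(text, CultureInfo.InvariantCulture,
                                 DateTimeStyles.None, out result);
    }

    static void Main()
    {
        Console.Write("Please enter a date (mm/dd/yyyy): ");
        string dateStr = Console.ReadLine();

        DateTime date;
        if (TryReadDate(dateStr, out date))
        {
            Console.WriteLine("Date Only: {0:d}", date);
        }
        else
        {
            Console.WriteLine("'{0}' is not a valid date.", dateStr);
        }
    }
}
```

Because TryParse reports failure through its return value, the program can prompt again or fall back to a default instead of handling an exception.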

In addition to d, there are several other predefined format specifiers, shown in Table 4.2.

TABLE 4.2 Standard DateTime Format Strings

Format String    Output for date = new DateTime(2008, 7, 4, 21, 35, 15, 777);

Table 4.2 shows a predefined set of strings for formatting dates, but you aren't limited by this list. You can also customize DateTime strings. Here's an example that ensures two characters for each part of the date:

Console.WriteLine("MM/dd/yy: {0:MM/dd/yy}", date);

Based on the input used for Table 4.2, the output would be this:

MM/dd/yy: 07/04/08

Table 4.3 shows custom format strings that might be useful; it also includes the ones used in the preceding example.

As with DateTime, most of the other built-in types are value types whose value is always defined. The next section discusses nullable types and helps you deal with those situations where the value you have to work with is not defined, but is null.

TABLE 4.3 Common Custom DateTime Format Strings

Format String   Purpose
ddd             Abbreviated name of day (for example, Fri)
dddd            Full name of day (for example, Friday)
MMM             Abbreviated name of month (for example, Jul)
MMMM            Full month name (for example, July)
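To see these format strings in action, here is a small sketch that applies several of them to the date used for Table 4.2. It assumes the invariant culture so that results do not depend on the machine's regional settings; DateFormatSketch and its Format method are hypothetical names for this illustration.

```csharp
using System;
using System.Globalization;

static class DateFormatSketch
{
    // Formats with the invariant culture so output does not vary by locale.
    public static string Format(DateTime date, string format)
    {
        return date.ToString(format, CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        DateTime date = new DateTime(2008, 7, 4, 21, 35, 15, 777);
        string[] formats = { "d", "MM/dd/yy", "ddd", "dddd", "MMM", "MMMM" };

        foreach (string format in formats)
        {
            Console.WriteLine("{0,-8} -> {1}", format, Format(date, format));
        }
    }
}
```

Note that a single-character string such as d is treated as a standard format specifier, whereas ddd and the other multi-character strings are custom ones.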

Nullable Types

As you've learned previously, the default value for reference types is null, and the default value for value types is some form of zero. Sometimes, you receive values from external sources, such as XML files or databases, that don't have a value; they could be nil or null, respectively. For reference types, this is no problem. However, for value types, you have to find your own solution for working with null values.

This problem, not being able to assign null to value types, was alleviated in C# 2.0 with the introduction of a feature called nullable types. It essentially allows you to declare nullable value types to which, as the name suggests, you can assign the value null.

Think about how useful this is. SQL Server has column types that map to the C# built-in types. For example, SQL Server money and datetime column types map to C# decimal and DateTime types. In SQL Server, these values can be null. That is particularly problematic when dealing with an application that interfaces with multiple databases. You could be working with FoxPro, SQL Server, and another database, and they all return a different default DateTime value, which makes a mapping between DBNull and the default DateTime value impractical. This is just one element of complexity you have to deal with for null database values, and there are many more. By having nullable types, we can more quickly write easier-to-maintain code.

An entire part of this book, Chapters 19 to 23, provides extensive coverage of working with data in .NET, and this material is applicable in the context of those chapters. However, the examples here assume that there is code that has extracted data from a data source that contains null values. The following example assumes there is a value from a database for the creation date of a record:

DateTime? createDate = null;

The most noticeable part of the preceding statement is the question mark suffix, ?, on the DateTime type. The proper terminology for this is that the type of createDate is a nullable DateTime. It is explicitly set to the value null, which is not possible with non-nullable value types.

There are a couple of ways to check a nullable type to see whether it has the value null. Here's an example:

bool isNull;
isNull = createDate == null;
isNull = !createDate.HasValue;

Using the C# equality operator, ==, you can learn whether createDate is set to null. Calling HasValue will return true if createDate is not null. Comparing with the C# inequality operator, as in createDate != null, is equivalent in behavior to HasValue.

createDate = createDate ?? DateTime.Now;

As you can see, the ?? operator is quicker to code for such a simple task. Here's another example:

DateTime? defaultDate = null;

createDate = createDate ?? defaultDate ?? DateTime.Now;

In the preceding code, if createDate is null, defaultDate is evaluated. If defaultDate is not null, defaultDate is assigned to createDate. Otherwise, the next expression, DateTime.Now, is evaluated. If none of the earlier expressions are non-null, the last expression in the chain of ?? operators is returned, even if it's null, too.
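The null checks and the ?? chain described above can be exercised together in a short sketch. Resolve is a hypothetical helper introduced only for this illustration, and the sample dates are arbitrary.

```csharp
using System;

static class NullableSketch
{
    // Returns the first non-null candidate, falling back to 'fallback'.
    public static DateTime Resolve(DateTime? createDate, DateTime? defaultDate,
                                   DateTime fallback)
    {
        return createDate ?? defaultDate ?? fallback;
    }

    static void Main()
    {
        DateTime? createDate = null;
        Console.WriteLine("== null:  {0}", createDate == null);
        Console.WriteLine("HasValue: {0}", createDate.HasValue);

        DateTime? defaultDate = new DateTime(2007, 11, 4);
        createDate = Resolve(createDate, defaultDate, DateTime.Now);

        // createDate now holds defaultDate's value, so Value is safe to read.
        Console.WriteLine("HasValue: {0}", createDate.HasValue);
        Console.WriteLine("Value:    {0}", createDate.Value);
    }
}
```

Reading Value on a nullable that has no value throws an InvalidOperationException, which is why checking HasValue (or using ??) first is the safer habit.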

…back to this chapter for clarification.

A related subject is the C# built-in types and how they relate to .NET types. Some .NET types don't have a C# keyword equivalent, such as Guid and DateTime. Whereas you might use Guid just occasionally, you will probably use DateTime a lot, and this chapter showed you much of the common usage you'll need.

This chapter discussed nullable types, which are very applicable for working with value type data. In later chapters, 19 through 23 to be specific, you'll see extensive discussion of C# and .NET data capabilities, which demonstrates effective implementation of nullable types.

Up until now, we've mostly discussed the built-in value types. However, there is one built-in reference type, string, that is pervasive in most programs. The next chapter goes into depth about the string type and how to use it.

CHAPTER 5

Manipulating Strings

IN THIS CHAPTER

The C# String Type

The StringBuilder Class

Regular Expressions

Strings are ubiquitous in programming, so much so that the .NET Framework Class Library (FCL) has extensive support for strings. Besides the string type with numerous methods, there is a special class called StringBuilder for manipulating strings efficiently. In this chapter, you'll read about the string and StringBuilder types.

A related feature, regular expressions, has FCL APIs that offer even greater flexibility for working with strings. In this chapter, you'll learn how to build a regular expression and use it for pattern matching on blocks of text. Let's look at the C# string type first.

The C# String Type

Among the C# built-in types, string is one of only two reference types (the other is object). This surprises people sometimes because string is a built-in type and has behavior similar to value types. If you are a little fuzzy on the differences between reference types and value types, you might want to refer to Chapter 4, "Understanding Reference Types and Value Types," for a quick refresher. The features of the string type that make it behave like a value type are immutability and being sealed.

The string type is immutable, meaning that a string can't be modified once created. All methods that appear to modify a string really don't; they create a new string object on the heap and return a reference to the new string object.

The string type is also sealed, meaning that it can't be derived from.

Being immutable and sealed makes the string type more efficient and secure. The efficiency comes from the way the Common Language Runtime (CLR) manages strings in memory with an intern pool and limits the overhead of changing string content. From a security perspective, sealing a string keeps derived classes from manipulating string content. Sealing also supports CLR memory efficiencies and eliminates the overhead of virtual type member management.

Now, let's check out what you can do with string types. The following sections describe members of the string class. Remember that members called on the string type (for example, string.Format) are static methods, and those called on a string instance are instance methods.

YOU CAN FIND OVERLOADS WITH INTELLISENSE

Many string methods have overloads, allowing you to use the method with different types or numbers of parameters. A quick way to see the overloads is to take advantage of IntelliSense in the editor.

For example, if you type str, type . (dot) to fill in the string, type C, and type ( (left parenthesis) to fill in the Compare, you'll see IntelliSense pop up. On the left side of the IntelliSense pop-up, you'll find up and down arrows labeled 1 to 8. You can press the up and down arrows on the keyboard to traverse the available overloads.

to implement the Format method:

string.Format({0,15}, string 2) = [       string 2]

You might want to take a closer look at how this output occurred by noticing that the formatString variable itself was used as input to the first index, {0}, which is part of Console.WriteLine's format string parameter. Used with Format, the formatString variable becomes a format item; otherwise, it is a normal string.

As you can see, the result is a 15-character string, between brackets, with the text right-aligned and padded to the left with spaces. The comma between the 0 and 15 in {0,15} separates the index from the alignment, which specifies both alignment and character width.

If you don't want the result to be right-aligned, make the alignment negative so that it reads as {0,-15}, which will look like this:

string.Format({0,-15}, string 2) = [string 2       ]

In addition to alignment, you can control the output format of the parameter matching an index with a format string. The following example applies a numeric parameter, 10, to two different format items, currency and hex:

Currency: $10.00, Hex: ' A'

You can see that currency used a U.S. dollar sign and a period to separate dollars from cents. If your machine were set to another locale, the output would have matched your currency symbol and other punctuation. Table 5.1 shows several other standard numeric format strings.

R or r    Round trip (guarantees conversion from floating point to string and back again)
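The alignment and format-string pieces of a format item can be tried out with the following sketch. The helper names (RightAlign, LeftAlign, Hex) and the width of 5 for the hex value are assumptions made for illustration, not a reconstruction of the original listing.

```csharp
using System;
using System.Globalization;

static class AlignmentSketch
{
    // A positive alignment right-aligns the text in a 15-character field.
    public static string RightAlign(string text)
    {
        return string.Format("[{0,15}]", text);
    }

    // A negative alignment left-aligns the text in the same width.
    public static string LeftAlign(string text)
    {
        return string.Format("[{0,-15}]", text);
    }

    // Combines an alignment (5) with a format string (X for hexadecimal).
    public static string Hex(int value)
    {
        return string.Format(CultureInfo.InvariantCulture, "'{0,5:X}'", value);
    }

    static void Main()
    {
        Console.WriteLine(RightAlign("string 2"));
        Console.WriteLine(LeftAlign("string 2"));

        // The currency symbol and punctuation follow the machine's culture.
        Console.WriteLine("Currency: {0:C}, Hex: {1}", 10, Hex(10));
    }
}
```

The general shape of a format item is {index,alignment:formatString}, where the alignment and format string are both optional.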

TABLE 5.2 Custom Numeric Format Strings

Custom Numeric Format Character Meaning

The standard numeric format strings are useful because they are quick to use for common scenarios. Sometimes, however, you need more control over the output, building your own custom format strings. Table 5.2 has a list of custom numeric format strings.

Using Table 5.2, we can re-create the currency format like this:

The first format string matches a positive number, the second format string matches a negative number, and the third format string matches zero.
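A custom numeric format with three semicolon-separated sections might look like the following sketch. The exact pattern is an assumption chosen for illustration, and Money is a hypothetical helper; the invariant culture is used so the separators are predictable.

```csharp
using System;
using System.Globalization;

static class CustomNumericSketch
{
    // Sections: positive ; negative ; zero. The pattern itself is illustrative.
    private const string Pattern = "$#,##0.00;($#,##0.00);Zero";

    public static string Money(decimal amount)
    {
        return amount.ToString(Pattern, CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        Console.WriteLine(Money(1234.5m));
        Console.WriteLine(Money(-1234.5m));
        Console.WriteLine(Money(0m));
    }
}
```

In the negative section, the number is formatted without its sign, so the parentheses in the pattern are what mark the value as negative.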

Comparing Strings

When comparing strings for equality, it's often easier to use the == and != operators (C# does not define < or > for strings). However, the Compare and CompareOrdinal methods are available to retrieve a single int value for the results of the comparison. Some types in the FCL actually require an int value specifying less than, equal, or greater than, so having this available to call is convenient. The following paragraphs discuss the Compare and CompareOrdinal methods. To keep from repeating code, you can assume the values being used are as follows:

The variable, intResult, is -1.

The CompareOrdinal method compares two strings, independent of localization. It produces the following integer results:

Notice how I switched the order of strings from Compare to CompareOrdinal. The intResult from the call to CompareOrdinal is 1.

The CompareTo method compares the value of the this instance with a parameter string. It produces the following integer results:

this < string = negative
this == string = zero
this > string = positive

The result, intResult, is -1.
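The three comparison methods can be lined up side by side as in the sketch below. It assumes str1 is "string 1" and str2 is "string 2", the values used elsewhere in this chapter; OrdinalOrder is a hypothetical helper that reduces a result to -1, 0, or 1, because only the sign of the return value is meaningful.

```csharp
using System;

static class CompareSketch
{
    // Reduces an ordinal comparison to -1, 0, or 1 for easy reading.
    public static int OrdinalOrder(string left, string right)
    {
        return Math.Sign(string.CompareOrdinal(left, right));
    }

    static void Main()
    {
        string str1 = "string 1"; // assumed sample values
        string str2 = "string 2";

        Console.WriteLine("Compare:        {0}",
            Math.Sign(string.Compare(str1, str2)));
        Console.WriteLine("CompareOrdinal: {0}", OrdinalOrder(str2, str1));
        Console.WriteLine("CompareTo:      {0}",
            Math.Sign(str1.CompareTo(str2)));
    }
}
```

Testing the sign (less than, equal to, or greater than zero) rather than an exact value such as -1 keeps code correct even when a comparison returns a larger magnitude.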

Checking for String Equality

Compare methods, as you learned about earlier, are good for sorting algorithms because they help figure out which value comes before another. However, sometimes you just need to know whether two strings are equal (for example, in a search algorithm).

A quick and common way to check for string equality is to use the == operator.

The instance Equals method also determines whether two strings are equal, returning a bool value of true when they are equal and a bool value of false when they're not. Here's an example:

boolResult = str1.Equals(str2);
Console.WriteLine("{0}.Equals({1}) = {2}\n",
    str1, str2, boolResult);

In this example, the Equals method accepts one string parameter. Because str1 has a different value than str2, the return value is false.
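The equality checks above can be compared in one place with this small sketch. SameIgnoringCase is a hypothetical helper; it relies on the static string.Equals overload that accepts a StringComparison value, which is handy when case should not matter.

```csharp
using System;

static class EqualitySketch
{
    // Case-insensitive ordinal equality using the static Equals overload.
    public static bool SameIgnoringCase(string left, string right)
    {
        return string.Equals(left, right, StringComparison.OrdinalIgnoreCase);
    }

    static void Main()
    {
        string str1 = "string 1"; // assumed sample values
        string str2 = "string 2";

        Console.WriteLine("str1 == str2:      {0}", str1 == str2);
        Console.WriteLine("str1.Equals(str2): {0}", str1.Equals(str2));
        Console.WriteLine("ignoring case:     {0}",
            SameIgnoringCase(str1, "STRING 1"));
    }
}
```

Both == and the one-parameter Equals compare string contents, not references, so two separately created strings with the same characters are equal.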

Concatenating Strings

C# has a concatenation operator, +, that makes it easy to concatenate strings. Here's how you can use it:

strResult = str1 + ", " + str2;

The strResult variable will equal "string 1, string 2" after the preceding statement executes. This is equivalent to calling the Concat method, but with shorter syntax.

As you've seen previously, most of the Console.WriteLine statements have used placeholders to define where a parameter should go. You can also use the concatenation operator instead, like this:

The first string uses the results of the previous statement, the second uses the format string technique you've seen in all earlier examples, and the third uses concatenation to produce a single parameter for the Console.WriteLine call. You have several choices, and all are valid.

Yet another concatenation method is the Concat method, which creates a new string from one or more input strings or objects. Here's an example of how to implement the Concat method using two strings:
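As a rough sketch of the choices discussed in this section, the following program builds the same result with the + operator, with Concat, and with a format string. The helper names are hypothetical, and the sample values are the ones assumed throughout this chapter.

```csharp
using System;

static class ConcatSketch
{
    public static string WithOperator(string a, string b)
    {
        return a + ", " + b;
    }

    public static string WithConcat(string a, string b)
    {
        return string.Concat(a, ", ", b);
    }

    public static string WithFormat(string a, string b)
    {
        return string.Format("{0}, {1}", a, b);
    }

    static void Main()
    {
        string str1 = "string 1"; // assumed sample values
        string str2 = "string 2";

        Console.WriteLine(WithOperator(str1, str2));
        Console.WriteLine(WithConcat(str1, str2));
        Console.WriteLine(WithFormat(str1, str2));
    }
}
```

All three calls produce the same text, so the choice mostly comes down to readability at the call site.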

The Copy method makes a copy of str1. The result is a copy of str1 placed in stringResult. This is not the same as assignment, shown here:

strResult = str1;

After executing the preceding line, both strResult and str1 hold identical references to the same string in memory. However, Copy created a new instance of the string. Remember that string is a reference type. (Chapter 4 has more info if you need a refresher.)

If you don't want to copy an entire string, perhaps just a subset, you can use the CopyTo method, which copies a specified number of characters from one string to an array of characters. Here's an example of how to implement the CopyTo method:

char[] charArr = new char[str1.Length];
str1.CopyTo(0, charArr, 0, str1.Length);
Console.WriteLine(
    "{0}.CopyTo(0, charArr, 0, str1.Length) = ",
    str1);

And here’s the output:

string 1.CopyTo(0, charArr, 0, str1.Length) =

s t r i n g 1

This example shows the CopyTo method filling a character array. It copies each character from str1 into charArr, beginning at position 0 and continuing for the length of str1. The foreach loop iterates through each element of charArr, printing the results.
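A complete version of this idea, including a foreach loop over the array, could look like the sketch below. CopyChars is a hypothetical helper added so that copying a subset is also easy to try.

```csharp
using System;

static class CopyToSketch
{
    // Copies 'count' characters of 'source', starting at 'start', into a new array.
    public static char[] CopyChars(string source, int start, int count)
    {
        char[] buffer = new char[count];
        source.CopyTo(start, buffer, 0, count);
        return buffer;
    }

    static void Main()
    {
        string str1 = "string 1"; // assumed sample value
        char[] charArr = CopyChars(str1, 0, str1.Length);

        // Print each copied character followed by a space.
        foreach (char c in charArr)
        {
            Console.Write("{0} ", c);
        }
        Console.WriteLine();
    }
}
```

Calling CopyChars(str1, 2, 4) instead would copy only the characters of "ring" out of the middle of the string.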

The Clone method returns a reference to the same string instance rather than a new copy. Because Clone is declared to return object, the return value must be cast to a string before assignment to stringResult.

Inspecting String Content

Sometimes you need to inspect a string to see whether it begins with, ends with, or contains a substring anywhere in between. For these tasks, you can use the StartsWith, EndsWith, and Contains string methods.

The StartsWith method determines whether a string prefix matches a specified string. Here's an example of how to implement the StartsWith method:

The EndsWith method determines whether a string suffix matches a specified string. Here's an example of how to implement the EndsWith method:

The results of the call to Contains are true because the value of str1 does contain "ring".
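The three inspection methods can be tried together with a sketch like this one. Describe is a hypothetical helper, and the ordinal comparison passed to StartsWith and EndsWith is an assumption made so results do not depend on the current culture.

```csharp
using System;

static class InspectSketch
{
    // Reports how 'part' relates to 'text' using the three inspection methods.
    public static string Describe(string text, string part)
    {
        bool starts = text.StartsWith(part, StringComparison.Ordinal);
        bool ends = text.EndsWith(part, StringComparison.Ordinal);
        bool contains = text.Contains(part);

        return string.Format("starts={0}, ends={1}, contains={2}",
                             starts, ends, contains);
    }

    static void Main()
    {
        string str1 = "string 1"; // assumed sample value

        Console.WriteLine("\"str\":  {0}", Describe(str1, "str"));
        Console.WriteLine("\"ring\": {0}", Describe(str1, "ring"));
        Console.WriteLine("\"g 1\":  {0}", Describe(str1, "g 1"));
    }
}
```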

Extracting String Information

Beyond just checking to see whether a string contains a value, you can find out information about where the string is located by using the IndexOf and LastIndexOf methods. You can use these results in the CopyTo method, shown earlier, or for explicitly extracting the contents of a substring with the Substring method.

The IndexOf method returns the position of the first occurrence of a string or character. IndexOf returns -1 if the string isn't found. Here's an example of how to implement the IndexOf method:

intResult = str1.IndexOf('1');
Console.WriteLine("str1.IndexOf('1'): {0}", intResult);

The return value of this operation is 7 because that's the zero-based position within str1 where the character '1' occurs.

The LastIndexOf method returns the position of the last occurrence of a string or characters within a string. Here's an example of how to implement the LastIndexOf method:

string filePath = @"c:\Windows\Microsoft.NET\Framework\v3.5.x.x\csc.exe";

You can use the IndexOf and LastIndexOf methods to extract substrings from a string using the Substring method. The Substring method retrieves a substring at a specified location of a string. Here's an example of how to implement the Substring method:

strResult = str1.Substring(str1.IndexOf("ring"), 4);
Console.WriteLine("str1.Substring(str1.IndexOf(\"ring\"), 4) : {0}", strResult);

Here's the output:

str1.Substring(str1.IndexOf("ring"), 4) : ring

The first parameter was the position in str1 to begin, which was returned by the call to IndexOf. The second parameter was the length of the substring.
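Combining LastIndexOf with Substring is a common way to pull the last segment out of a path. The sketch below uses the filePath value shown earlier; FileNameOf is a hypothetical helper written for this illustration.

```csharp
using System;

static class ExtractSketch
{
    // Returns the text after the last backslash, or the whole path if there is none.
    public static string FileNameOf(string path)
    {
        int lastSlash = path.LastIndexOf('\\'); // -1 when no backslash exists
        return path.Substring(lastSlash + 1);
    }

    static void Main()
    {
        string filePath = @"c:\Windows\Microsoft.NET\Framework\v3.5.x.x\csc.exe";
        string str1 = "string 1"; // assumed sample value

        Console.WriteLine("File name: {0}", FileNameOf(filePath));
        Console.WriteLine("str1.IndexOf('1'): {0}", str1.IndexOf('1'));
        Console.WriteLine("str1.IndexOf('z'): {0}", str1.IndexOf('z'));
    }
}
```

In production code, System.IO.Path.GetFileName is usually the better choice for paths; the helper here exists only to show the two string methods working together.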

Padding and Trimming String Output

When displaying strings, you'll often want to control the spacing or characters surrounding each side of the string. For example, you might want to apply spacing or zero padding on one side or the other of a string to get it to line up properly, perhaps in a column, in the output. Other times, you'll receive strings with spaces or some other character on the beginning, end, or both sides of a string (things you would rather not see). This section introduces you to padding methods for adding characters and trimming methods for removing characters.

The PadLeft method right-aligns the characters of a string and pads the left with spaces (by default) or a specified character. Here's an example of how to implement the PadLeft method:

Opposite to the PadLeft method, the PadRight method left-aligns the characters of a string and pads on the right with spaces (by default) or a specified character. Here's an example of how to implement the PadRight method:

strResult = str1.PadRight(15, '*');
Console.WriteLine("str1.PadRight(15, '*'): {0}", strResult);

The example shows the PadRight method creating a 15-character string with the original string left-aligned and filled to the right with * characters, as shown here:

The TrimEnd method removes a specified set of characters from the end of a string. Here's an example of how to implement the TrimEnd method:

strResult = trimString.TrimEnd(new char[] {' '});
Console.WriteLine("trimString.TrimEnd(): {0}",
    strResult);

In this example, the TrimEnd method removes all the spaces from the end of trimString. The result is "nonwhitespace", with no spaces on the right side.

The TrimStart method removes whitespace or a specified set of characters from the beginning of a string. Here's an example of how to implement the TrimStart method:

strResult = trimString.TrimStart(new char[] {' '});
Console.WriteLine("trimString.TrimStart(): {0}",
    strResult);

Here, the TrimStart() method removes all the spaces from the beginning of trimString. The result is "nonwhitespace", with no spaces on the left side.
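Padding and trimming are easiest to see when the result is wrapped in visible delimiters. The following sketch does that; Bracket is a hypothetical helper, and the value of trimString is an assumed sample because its declaration is not shown here.

```csharp
using System;

static class PadTrimSketch
{
    // Wraps text in brackets so leading and trailing padding is visible.
    public static string Bracket(string text)
    {
        return "[" + text + "]";
    }

    static void Main()
    {
        string str1 = "string 1"; // assumed sample value
        Console.WriteLine(Bracket(str1.PadLeft(15)));
        Console.WriteLine(Bracket(str1.PadRight(15, '*')));

        string trimString = "   nonwhitespace   "; // assumed sample value
        Console.WriteLine(Bracket(trimString.TrimEnd(' ')));
        Console.WriteLine(Bracket(trimString.TrimStart(' ')));
        Console.WriteLine(Bracket(trimString.Trim()));
    }
}
```

Calling Trim with no arguments removes whitespace from both ends in one step, which covers the common case of cleaning up user input.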

Modifying String Content

A few string methods return a modified version of a string. You can insert, remove, or replace the content of a string by using the Insert, Remove, and Replace methods, respectively. Other modification methods include ToLower and ToUpper, which convert all string characters to lowercase and uppercase, respectively.

The Insert method returns a string where a specified string is placed in a specified position of an original string. All characters at and to the right of the insertion point are pushed right to make room for the inserted string. Here's an example of how to implement the Insert method:

Strictly speaking, you never really modify a string. A string is immutable, meaning that it can't change. What really happens when calling a method such as Insert, Remove, or Replace is that the CLR creates a new string object and returns a reference to that new string object. The original string never changed.

This is a common mistake by people just getting started with C# programming, so remember this any time you look at a string after one of these operations, thinking that it should be changed. Instead, assign the results of the operation to a new string variable. Assigning the result of the string manipulation to the same variable will work, too; it just assigns the new string object reference to the same variable.

The Remove method deletes a specified number of characters from a position in a string. Here's an example of how to implement the Remove method:

strResult = str2.Remove(3, 3);
Console.WriteLine("str2.Remove(3, 3): {0}",
    strResult);

This example shows the Remove method deleting the fourth, fifth, and sixth characters from str2. The first parameter is the zero-based starting position to begin deleting, and the second parameter is the number of characters to delete. The result is "str 2", where the "ing" was removed from the original string.

The Replace method replaces all occurrences of a character or string with a new character or string, respectively. Here's an example of how to implement the Replace method:
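The modification methods in this section can be tried side by side with the sketch below. The specific arguments to Insert and Replace are assumptions chosen for illustration, and Variations is a hypothetical helper; the final line confirms that the original string is untouched.

```csharp
using System;

static class ModifySketch
{
    // Returns new strings produced from 'text'; 'text' itself is never changed.
    // Assumes 'text' is at least six characters long.
    public static string[] Variations(string text)
    {
        return new string[]
        {
            text.Insert(6, "y"),      // push characters right to make room
            text.Remove(3, 3),        // delete three characters at index 3
            text.Replace('i', 'o'),   // replace every 'i' with 'o'
            text.ToUpperInvariant()   // uppercase, independent of culture
        };
    }

    static void Main()
    {
        string str2 = "string 2"; // assumed sample value

        foreach (string result in Variations(str2))
        {
            Console.WriteLine(result);
        }

        // The original is unchanged because each method returned a new string.
        Console.WriteLine(str2);
    }
}
```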
