C# Language Reference
This appendix gives a summary of the definition of the C# language. It
is not intended to be read from beginning to end in one go. Rather, it is
there to provide a reference for those times when you need to look up some
particular bit of C# syntax. Accordingly, although the topics are presented
roughly in order, with the most basic topics first and the more advanced
topics towards the end of the appendix, you will find that there is no strict
progression in the examples. Any example might assume knowledge of any part
of the C# language other than the point being illustrated.
Note that all the examples here assume that you are an experienced programmer,
with a knowledge of the principles of object-oriented programming. For example,
we assume that you are familiar with the broad concepts of inheritance and
method overrides. If you need to revise any of these concepts, or the details
of the .NET framework, then you will be better off reading the chapters in
this book that cover those concepts. The appendix covers the C# language
itself. It does not deal with the .NET base classes except where certain
aspects of language syntax depend on certain base classes. Nor does it cover
assemblies, the C# compiler, or the .NET environment in which compiled Intermediate
Language code will run.
Note also that this appendix, while intended as a reference, is also designed
to be easy to read rather than exhaustive and formal in its coverage of C#.
If you require detailed syntax of the most esoteric features of the language
or a rigorous, mathematical-style definition of the syntax, then the appropriate
place to look is the C# language specification in the MSDN documentation.
Conventions
Throughout much of the appendix, we try to present real-world samples.
To this end, many examples involve classes or methods that we presume have
been defined elsewhere. In general, if any class or method is introduced
without comment, you can assume that it is intended as an example of a class
or method that you might have written. .NET base classes are not generally
used to provide samples unless explicitly indicated, with the exception of
Console.WriteLine(), which is used to write the contents of its parameters in various formats to the console window.
Also, note that C# classes and structs contain a number of items that
can contain code: methods, properties, indexers, constructors, destructors,
operator overloads or custom casts. The general term for such members is
function. We use the term member to refer to any item, function or data, that is defined in a class or struct.
Where the meaning is clear, we will use "class etc." or simply "class",
when our discussion applies equally to classes or structs.
C# Program Syntax
In this section we will cover the main principles of overall C# syntax.
Basic Syntax
Statements in C# are separated by semicolons:
int X = 3;
Console.WriteLine("X is " + X);
All keywords in C# are completely lowercase.
The compiler also ignores all excess white space - including carriage
returns, space characters and tab characters, so that for example the following
code snippet is completely equivalent to the above.
int X =
3 ; Console.WriteLine("X is "
+ X);
In particular, note that it is quite legitimate for statements to be spread
across multiple lines - it is always the semicolon that marks the end of
the statement.
The first snippet would, however, be regarded as better programming style
because it is easier to read. On the other hand, the following code would not compile:
int X = 3 // error - no semicolon
Console.WriteLine("X is " + X)
C# is also case sensitive, so that for example X and x are considered different variables.
int X = 3;
int x = 6;
Microsoft has provided fairly detailed guidelines for naming conventions
(detailed in the MSDN documentation for the .NET framework). Here we will
simply note that you are advised to use names that differ only in case solely
in code that is not visible to outside classes etc., since your code may be
called by VB.NET code - and VB.NET is case insensitive. We will also note that
the examples in this appendix have wherever possible been designed according
to the Microsoft naming guidelines.
Block Statements
C# allows several statements to be grouped together into a single block
statement (sometimes also referred to as a compound statement), by enclosing
them in curly braces.
{
    int X = 3;
    Console.WriteLine("X is " + X);
}
A block statement is treated as if it was a single statement, and may
be used anywhere that a single statement would be acceptable. For example,
an if statement takes one other statement, which will be conditionally executed
depending on whether the if expression evaluates to true. If you wish more than
one statement to be conditionally executed, you can simply put all of the single
statements into one block statement.
if (Profit > 500000) // single statement in if block
    Console.WriteLine("Wow!!!!!!!!!!");
Or:
if (Profit > 500000) // block statement in if block
{
    string Message = "Wow!!!!!!!!!!";
    Console.WriteLine(Message);
}
Braces also serve to define the scope of variables. Any variable declared
anywhere in a block statement is valid until the closing brace that ends
that block statement.
if (Profit > 500000) // block statement in if block
{
    string Message = "Wow!!!!!!!!!!";
    Console.WriteLine(Message);
}
Message = "if block is over"; // WRONG. The variable Message no
                              // longer exists
Besides their use in marking block statements, curly braces are used to
mark the extent of definitions of classes, structs, methods etc.
class MyClass
{
    // definition of MyClass
}
Note that there is no need to add a semicolon after a closing curly brace
that is used to make a block statement or scope a definition. However, a
trailing semicolon is required if curly braces are used to delimit a list
of values for an array:
double [] d = {2.0, 4.0, 6.0};
Comments
C# has three syntaxes for marking comments:
If the character sequence // occurs in any line, then the rest of that line is assumed to be a comment.
int X = 5; // this is a comment to say that X is a groovy int
int Y = 6;
If the character sequence /* occurs anywhere, then the compiler assumes that everything that follows is a comment until it discovers a */ sequence. The comment may last for part of a line or for several lines.
/* The following statement ought to tell us how much money we will make
from UnitsSold units, assuming costs remain as specified earlier.
Not sure if this is working so we temporarily substitute 100 to see
what happens */
decimal Profit = EstimateProfitFromUnitsSold( /* UnitsSold */ 100);
Note that it is not possible to nest /* */ comments. The compiler will assume the comment ends the moment it encounters a */ sequence, no matter how many initial /* it encountered.
If the character sequence /// occurs
anywhere, then the rest of that line is assumed to be a comment. However,
this is also regarded as a special comment from which XML documentation will
be automatically generated, if the option to generate documentation is specified
with the compiler options.
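As a brief sketch of this third syntax, the following class carries /// comments. The names ProfitCalculator and EstimateProfit are invented purely for illustration; the summary, param and returns tags are among the standard documentation tags, and the XML file is only produced if you ask the compiler for it (for example with the /doc option of the command-line compiler).

```csharp
/// <summary>
/// Hypothetical helper class, used only to illustrate documentation comments.
/// </summary>
public static class ProfitCalculator
{
    /// <summary>
    /// Estimates the profit from selling a number of units.
    /// </summary>
    /// <param name="unitsSold">Number of units sold.</param>
    /// <param name="profitPerUnit">Profit made on each unit.</param>
    /// <returns>The total estimated profit.</returns>
    public static decimal EstimateProfit(int unitsSold, decimal profitPerUnit)
    {
        // An ordinary comment: this one does not appear in the generated XML.
        return unitsSold * profitPerUnit;
    }
}
```

The /// lines are still comments as far as the compiled code is concerned, so the class behaves exactly as it would without them.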
Note that if any of the symbols /*, //, and /// occur inside a string constant (in other words, after a "), then they will be interpreted as part of the string, not as a comment marker.
string Message = "/* is one of the symbols used for comments";
Console.WriteLine(Message); // this line will execute normally
Program Structure
A C# program can be considered informally to consist of the following items:
- Namespaces - labels that can be used to provide extended names to classes
and so avoid naming conflicts.
- Types - definitions of data types, including classes, structs, enumerations,
and delegates.
- Variables - declarations of instances of the data types.
- Executable statements - instructions that will be executed when the program
is run (as opposed to generic statements, which might contain instructions or
might simply declare variables etc.).
- Functions - declarations of sequences of statements, grouped into a callable
entity, including methods, properties, indexers, constructors, destructors,
operator overloads and casts.
C# programs also contain interfaces and preprocessor directives. Interfaces
are definitions of contracts that define the functions a class guarantees to
implement. Preprocessor directives are commands that are formed using a special
syntax and which generally inform the compiler how the code should be compiled.
However, since these are compiler directives and do not actually exist in the
compiled code, we won't consider them in this section - they are covered later
in the appendix. Most preprocessor directives can be placed anywhere in your code.
C# has strict rules for which items in your code can be contained in which
other items. The situation is shown in the diagram, which indicates what items
can be contained in what other items. The arrows in this diagram lead from the
contained item to the containing item.
All items that exist in the compiled code should be contained in a namespace.
This is the only rule that is not rigidly enforced. It is legal C# code to
define types that are not in a namespace, but this is not recommended. In almost
all cases, all your types and other items will be in at least one namespace.
class MyClass { /*etc.*/} // OK but not advised - ought to be in a
// namespace
namespace Wrox.ProfessionalCSharp.AppendixA
{
    // all other items must go in here
}
Because declaring types outside namespaces is not recommended,
we have not shown this situation in the above diagram. Within a namespace,
the only top level items that can be defined are definitions of classes,
structs, enumerations, delegates, interfaces, or other namespaces. In
particular, it is not permitted to instantiate a variable of any type outside
the definition of a class.
namespace Wrox.ProfessionalCSharp.AppendixA
{
    int Y; // WRONG - must be inside a data type definition
    class MyClass
    {
        public int X; // OK
    }
}
This rule exists because C# is designed to encourage an object-oriented
programming style, as opposed to the procedural style of languages such as
Visual Basic. The idea is that everything in C# is an object or part of an object.
Definitions of data types (classes etc) may also occur inside definitions
of other classes or structs. However, data types may not be defined inside
methods.
// inside a namespace definition
class MyClass
{
    public enum Shape {Square, Circle, Pentagon}; // OK
    public void DoSomething()
    {
        enum HouseType {Terrace, EndTerrace, Detached}; // WRONG! Declared
                                                        // inside a method
    }
}
Variables may be instantiated either inside class definitions (member
fields) or inside method definitions (local variables). They may not be instantiated
anywhere else.
// inside a namespace definition
class MyClass
{
    private int X; // OK - member field
    public void DoSomething()
    {
        int X; // OK - local variable
    }
}
Note that it is perfectly acceptable - as the above example shows -
to declare a variable in a method that has the same name as a field in the
containing class. In this case the local variable hides the field, and if
you need to access the field inside that method, you must refer to it as
this.X. The same applies to method and constructor parameters:
class Customer
{
    private string name;
    public Customer(string name)
    {
        this.name = name; // this.name is the field, name is the parameter
    }
}
On the other hand, it is a compilation error to declare a variable
inside a block statement that would hide another variable of that name in
the same method:
void DoSomething()
{
    int X; // OK - local variable
    {
        int X; // WRONG - hides the other X variable
    }
}
Obviously, the only place in which statements, other than definitions
of data types and declarations of variables, can occur is inside method definitions.
Namespaces are normally the topmost level in your code. Everything else
is defined inside a namespace. The only thing that can contain a namespace
is another namespace.
namespace Wrox
{
    namespace ProfessionalCSharp.AppendixA
    {
        // class definitions etc.
    }
}
The only statements that should normally occur outside a namespace, if you are following good programming principles, are:
- Preprocessor directives - since these do not actually
compile to any items in the compiled assembly, but simply affect the way
the compiler does its work
- The using statement. This statement
has the syntax of a normal statement, but has a similar effect to a preprocessor
directive in that all it does is modify how the compiler recognizes symbols
in your code as it is compiling. It doesn't cause any items in the compiled
code to be created.
#define DEBUG // OK
using System; // OK
namespace Wrox
{
    // class definitions etc.
}
Identifiers
Identifiers (that is to say, names of types, variables, and namespaces)
must, with a couple of exceptions, conform to the Unicode 3.0 standard. For
practical purposes this means that they must begin with a capital or lower
case letter (A-Z, a-z, or a valid letter in another language, for example the
letters æ, ø, or å used in some Scandinavian languages) or the underscore
character (_), and may then continue with any of these characters or any of
the digits (0-9). If you wish to use a Unicode character from outside the Latin
alphabet in an identifier, you can do so using the syntax \uxxxx where xxxx
is the hexadecimal representation of the Unicode character in question. For
simplicity we'll demonstrate this with an example that does come from the
Latin alphabet: The Unicode number for the letter 'e' is 0x0065. Hence the
following two lines of code are completely equivalent:
int Result = 20;
int R\u0065sult = 20;
You cannot use any C# keyword as the name of any item - for obvious reasons -
unless you prefix it with the character @ to indicate that it is to be
interpreted as an identifier rather than a keyword. If you do this, the @
character does not form part of the identifier as it appears in the compiled
assembly. A full list of C# keywords to avoid is given at the end of this
appendix. Although it does not count as an error, you are also strongly advised
not to use any Visual Basic.NET keywords as names anywhere where the name
may be visible to other assemblies - as this could cause problems for any
developer who wants to access your assembly from Visual Basic.NET code. The
Visual Basic .NET keywords are listed in Chapter 8.
class foreach // WRONG - foreach is a keyword
{
    public int Inherits; // Correct but might cause confusion as Inherits is
                         // a keyword in VB.NET
    public int X0_B; // OK
    public int @if; // also OK
}
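The two identifier rules above can be checked with a small sketch. The class KeywordNames and its methods are hypothetical names chosen for this illustration; the only library call used is the reflection method Type.GetField, which looks a field up by its compiled name.

```csharp
using System.Reflection;

public class KeywordNames
{
    public int @if;      // field whose name is the keyword "if"
    public int Result;

    // Looks the field up by the name it has in the compiled assembly.
    // The @ prefix is not part of that name, so the lookup uses "if".
    public static bool FieldIsNamedIf()
    {
        FieldInfo Field = typeof(KeywordNames).GetField("if");
        return Field != null && Field.Name == "if";
    }

    // Writes to Result using the \u0065 spelling and reads it back
    // using the ordinary spelling: both denote the same identifier.
    public static int SetThroughEscape()
    {
        KeywordNames Names = new KeywordNames();
        Names.R\u0065sult = 20;
        return Names.Result;
    }
}
```

Here FieldIsNamedIf() returns true and SetThroughEscape() returns 20, which is what you would expect if the escape sequence and the @ prefix are purely matters of source-code spelling.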
Statements
All statements (pieces of code that can be executed) in C# consist of one of the following:
- A call to a method, property or constructor, possibly involving an assignment
NewAccount = Account.Open(NewCustomerName, AccountType, Branch);
- An expression:
Balance *= (1.0 + MonthlyInterestRate);
++MonthsAccountOpen;
Note that an expression is only permitted as a statement if it has some effect, such as assigning, incrementing or decrementing a value. Do-nothing statements are not valid C#.
Y+1; // WRONG - doesn't actually do anything that can have any
// effect in the code
- A variable declaration:
int X;
- One of the C# commands that controls execution flow
if (X > 5)
DoSomething();
- A block statement consisting of several simple statements
enclosed by braces. Note that a block statement may itself contain further
block statements (nested block statements). There is no limit to the level
of nesting (other than that beyond 3 or 4 levels, your code can start to
get hard to read).
// this is all one single block statement
{
    foreach (Customer NextCustomer in Customers)
    {
        if (NextCustomer.Balance < 0)
        {   // this starts another block statement - this one is
            // conditionally executed depending on the if condition.
            NextCustomer.SendReminderLetter();
            ++BalancesOwing;
        }
        if (NextCustomer.Balance < -1000)
            DebtorsList.Add(NextCustomer);
    }
}
Namespaces
Namespaces provide an additional means of identifying a type (class, struct,
enumeration or delegate), since the full name of any class etc. includes
the names of any containing namespaces.
A namespace is defined using the keyword namespace, and its scope is indicated by braces.
namespace Wrox
{
    // all class definitions etc. here
}
It is possible to nest namespaces inside each other:
namespace Wrox
{
    namespace ProfessionalCSharp
    {
        // all class definitions etc. here
    }
}
Alternatively, you can declare nested namespaces with one single namespace command. The following code has exactly the same effect as the previous version.
namespace Wrox.ProfessionalCSharp
{
    // all class definitions etc. here
}
If we now declare the class inside the above nested namespace:
namespace Wrox.ProfessionalCSharp
{
    class ChapterOutline
    {
        //implementation
    }
}
Then the full name of this class is actually not just ChapterOutline, but
Wrox.ProfessionalCSharp.ChapterOutline. If this class is referred to from
outside the Wrox.ProfessionalCSharp namespace, then it must be referred to by
its full name. If you wish to refer to a sub-namespace from inside a parent
namespace, then you only need give the name of the sub-namespace. For example,
if some code in the Wrox namespace wanted to refer to the ChapterOutline class,
it would refer to it as ProfessionalCSharp.ChapterOutline.
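The following sketch shows both ways of naming the class. ChapterOutline and the Wrox namespaces come from the example above; the Editor and Reviewer classes and the SomeOtherCompany namespace are hypothetical additions made only so that there is some code to do the referring.

```csharp
namespace Wrox.ProfessionalCSharp
{
    public class ChapterOutline
    {
        public string Title = "Untitled";
    }
}

namespace Wrox
{
    public class Editor
    {
        // Code inside Wrox only needs the name of the sub-namespace.
        public static string FullNameFromParent()
        {
            return typeof(ProfessionalCSharp.ChapterOutline).FullName;
        }
    }
}

namespace SomeOtherCompany
{
    public class Reviewer
    {
        // Code outside Wrox.ProfessionalCSharp gives the full name.
        public static string FullNameFromOutside()
        {
            return typeof(Wrox.ProfessionalCSharp.ChapterOutline).FullName;
        }
    }
}
```

Both methods return the string "Wrox.ProfessionalCSharp.ChapterOutline", which confirms that the two spellings identify one and the same class.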
Note that the above example illustrates the recommended convention for
namespace names: You should start with the name of your company, then nest
namespaces that give the name of the technology or product that the software
is part of.
The Using Statement
Clearly, continually referring to classes with their full names is unwieldy, so C# provides the using statement to allow you to use just the type name itself to identify a class etc. The using statement should come at the start of your source code, and it always names one namespace.
using System.IO;
The using statement indicates that any item in the System.IO namespace will
be referred to in your code by the name of the data type only. You can supply
as many using statements as you want, and the compiler will then examine the
data types in all the specified namespaces in order to attempt to resolve any
references to data types that are not defined in the current namespace and do
not have a namespace indicated explicitly. For example, System.IO is one of
the .NET base class namespaces, and contains a class, FileInfo. If the above
statement is placed in your code, then you can write:
FileInfo ReadMeFile = new FileInfo(@"C:\My Documents\ReadMe.txt");
whereas otherwise you would have had to write:
System.IO.FileInfo ReadMeFile =
new System.IO.FileInfo(@"C:\My Documents\ReadMe.txt");
Note that when used in this way, the using statement only applies to the namespace explicitly named - not to any nested namespaces. For example, the statement:
using System;
allows the identifiers in the System namespace to be referred to by the type name only. However, if you want to refer to System.IO.File as File, you will still need to separately supply the statement:
using System.IO;
You can also use the using statement to provide an abbreviated name for just
one type name. This can be useful to distinguish two data types that have the
same name but are in different namespaces. For example, suppose you needed to
use the System.IO namespace, but you also needed to use a different,
third-party-supplied class, FileInfo, that is in another namespace - say
ThirdPartyCompany.FileSoftware.FileInfo. Simply adding using statements to
access both namespaces would cause confusion because then occurrences of
FileInfo could refer to either class. Instead, you could indicate you wish to
use just the System.IO namespace then provide a different name for the
ThirdPartyCompany.FileSoftware.FileInfo class:
using System.IO;
using TPCFileInfo = ThirdPartyCompany.FileSoftware.FileInfo;
With the above code, FileInfo will now be taken to mean System.IO.FileInfo (and the same for any other classes in System.IO), while any references to TPCFileInfo will be interpreted by the compiler as indicating ThirdPartyCompany.FileSoftware.FileInfo.
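Since the third-party namespace above is imaginary, here is a sketch of the same idea that uses only base classes. The alias name Builder and the class AliasDemo are invented for the illustration; note that an alias is written with an equals sign between the new name and the full type name.

```csharp
using System.IO;                              // whole namespace
using Builder = System.Text.StringBuilder;    // alias for one type only

public static class AliasDemo
{
    public static string Describe()
    {
        // Builder is resolved through the alias to System.Text.StringBuilder.
        Builder Text = new Builder();

        // Path is resolved through "using System.IO" to System.IO.Path.
        Text.Append(Path.GetExtension("ReadMe.txt"));
        return Text.ToString();
    }
}
```

Describe() returns ".txt". No other class in System.Text becomes available under its short name, because the alias names a single type rather than a namespace.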
Data types and Variables
Besides data types that you define yourself as classes, structs, enumerations,
or delegates, C# recognizes the following data types, and reserves a keyword
to represent each of these types. Note that all data types in the .NET
environment are derived from the class System.Object, and are therefore
formally objects. Hence the C# keywords in this table are, strictly speaking,
simply syntactical conveniences provided by the C# language to make it easier
for you to use the relevant base class.
Keyword that defines this type | Value or Reference Type | Base class this actually represents | Description |
bool | value | System.Boolean | value of true or false |
byte | value | System.Byte | 8-bit unsigned integer |
sbyte | value | System.SByte | 8-bit signed integer |
short | value | System.Int16 | 16-bit signed integer |
ushort | value | System.UInt16 | 16-bit unsigned integer |
int | value | System.Int32 | 32-bit signed integer |
uint | value | System.UInt32 | 32-bit unsigned integer |
long | value | System.Int64 | 64-bit signed integer |
ulong | value | System.UInt64 | 64-bit unsigned integer |
decimal | value | System.Decimal | floating point value stored to approx 28 significant figures. Has greater accuracy but smaller range than float and double. |
float | value | System.Single | 32-bit floating point value |
double | value | System.Double | 64-bit floating point value |
char | value | System.Char | 16-bit value intended to represent a Unicode character |
string | reference | System.String | text (set of Unicode characters) |
object | reference | System.Object | generic type that can refer to any variable of any type |
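The claim that each keyword is simply another name for a base class can be checked directly, as in the following sketch. The class AliasCheck and its method names are hypothetical; typeof, GetType(), FullName and IsValueType are standard.

```csharp
public static class AliasCheck
{
    // True if the keyword and the base class name denote the same type.
    public static bool IntIsInt32()
    {
        return typeof(int) == typeof(System.Int32);
    }

    // The runtime only ever sees the base class name.
    public static string NameOfStringType()
    {
        string Message = "Hello";
        return Message.GetType().FullName;
    }

    // int is listed as a value type and string as a reference type.
    public static bool TableIsRight()
    {
        return typeof(int).IsValueType && !typeof(string).IsValueType;
    }
}
```

IntIsInt32() and TableIsRight() both return true, and NameOfStringType() returns "System.String".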
In addition, C# makes available four other keywords that allow you to
define other objects that are derived from base class objects.
Keyword that defines this type | Value or Reference | Base class inherited from | Description |
class | reference | System.Object (unless another class specified) | Any reference type |
struct | value | System.ValueType | Any value type |
delegate | reference | System.Delegate or System.MulticastDelegate | Any type that holds details of a method reference |
enum | value | System.Enum | Any type that holds a set of enumerated values |
C# also defines the keyword event, which is used to declare a class member whose type is a delegate.
Value and Reference Types
C# rigorously enforces rules about which data types are value types (stored
either in an area of memory known as the stack, or inline if embedded in
another type) and which are reference types. All instances of reference types
are stored in another area of memory known as the heap, with the address of
the object usually being stored in the stack. It is not possible to specify
that a particular instance of a class is to be stored in a particular way, as
it is in some programming languages, since how a particular variable is to be
stored is determined entirely by its data type.
The complete list of value and reference types is given in the above tables.
If a data type is a value type, then each variable of that type contains an instance of that type.
int X = 10; //X contains the value 10
On the other hand, if a data type is a reference type, then each variable
of that type simply stores a reference to where in memory the instance is
stored. We use the base class System.Drawing.Graphics to illustrate this.
Graphics dc = Graphics.FromImage(MyBitmap); // dc only contains a reference
                                            // (MyBitmap is an image defined elsewhere)
Graphics dc2 = dc; // Graphics instance has not been copied.
                   // Only the address to it has been copied
The instance that a reference type refers to is known as the referent.
As noted elsewhere, the referent can be any instance of the class the
reference was declared as, or it can be an instance of any class derived
from that class.
It is not possible to access the referent directly in C# - it can only be accessed through the reference.
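The practical difference between the two kinds of type shows up when a variable is copied. The following sketch uses two hypothetical types, PointValue (a struct) and PointReference (a class), that are identical apart from the keyword used to define them.

```csharp
public struct PointValue        // hypothetical value type
{
    public int X;
}

public class PointReference     // hypothetical reference type
{
    public int X;
}

public static class CopyDemo
{
    public static int CopyStruct()
    {
        PointValue A = new PointValue();
        A.X = 10;
        PointValue B = A;    // the whole instance is copied
        B.X = 99;            // changes the copy only
        return A.X;
    }

    public static int CopyReference()
    {
        PointReference A = new PointReference();
        A.X = 10;
        PointReference B = A;   // only the reference is copied
        B.X = 99;               // changes the single shared referent
        return A.X;
    }
}
```

CopyStruct() returns 10, because B is an independent copy, whereas CopyReference() returns 99, because A and B refer to the same referent.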
All variables, whether value or reference, obey the usual scope rules
(fields are scoped to all methods in the class or class instance, local variables
to the closing brace that ends the method or block statement in which they
are defined). However, in the case of reference variables, the referent remains
on the heap until the garbage collector removes it. The garbage collector
will periodically remove any objects from the heap that are not referred
to by any reference variables. Hence, you can extend the lifetime of an object
as long as you want by ensuring that the program holds a reference to it
somewhere.
It is also possible for a reference to refer to the so-called null reference.
This is represented by the keyword null, and indicates that the reference does
not refer to anything. If you attempt to call any methods against a null
reference, this will cause an exception to be thrown at runtime.
dc = null; // dc now holds the null reference
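As a minimal sketch of what happens at runtime, the hypothetical method below calls a method against a null string reference. The exception type that the runtime raises in this situation is System.NullReferenceException.

```csharp
using System;

public static class NullDemo
{
    // Returns true if calling a method against the null reference throws.
    public static bool CallOnNullThrows()
    {
        string Message = null;   // Message holds the null reference
        try
        {
            Message.ToUpper();   // method call against the null reference
            return false;        // only reached if no exception was thrown
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

CallOnNullThrows() returns true. Note that the code compiles without error: the problem is only detected when the statement is executed.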
Declaring Variables
Variables are declared using the following syntax:
int X;
MyClass Mine;
double D1, D2;
The above code does not, however, initialize the values of the variables.
C# has two different approaches to initialization according to the type of
data:
- Variables declared as class or struct member fields are
automatically initialized by being zeroed out. This means that numeric types
will contain 0, bools will contain false, and references will contain the null
reference. You can modify this behavior by specifying an initial value in
the definition of the containing class or struct, or by writing a value in
the constructor to this class or struct.
- Variables declared locally to a method are completely uninitialized,
and it is a compilation error to access any of their values before they are
explicitly set.
struct SomeOtherStruct
{
    int X; // will contain 0 initially
    MyClass Mine; // will be a null reference initially
    void DoSomething()
    {
        int Y; // uninitialized
        int Z = Y + 2; // ERROR - Y referenced before being set
    }
}
All variables of whatever data type are initialized by calling a constructor.
For classes and structs that you have defined, the syntax for doing so is
to use the new keyword followed by any parameters in parentheses.
MyClass Object = new MyClass();
For predefined data types, this syntax is available to some extent:
int X = new int(); // X will contain zero
string TwentyDs = new string('D', 20); // creates the string
// "DDDDDDDDDDDDDDDDDDDD";
However in most cases you will use an alternative simpler syntax for predefined
types, in which the initial value is specified directly.
int Y = 50;
string Message = "Hello World";
This simpler syntax is not available for structs and classes that are defined in your code.
You can initialize the variables when they are declared, or use an assignment statement to specify an initial value later.
int Y;
Y = 50;
MyClass MyObject;
MyObject = new MyClass();
C# recognizes both syntaxes for arrays. Arrays are covered later in this appendix.
For structs only (not classes) you can alternatively initialize a struct
by separately initializing all its fields (assuming you have public access
to them).
For example, if this struct is defined:
struct ComplexNumber
{
    public double X;
    public double Y;
}
Then after executing the following code, WaveVector will be considered to be initialized.
ComplexNumber WaveVector;
WaveVector.X = 20.0;
WaveVector.Y = 34.5;
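To see that the compiler really does treat the struct as initialized, the following sketch repeats the ComplexNumber definition and then uses WaveVector as a whole. The class StructInitDemo and its method are hypothetical names introduced for the illustration.

```csharp
public struct ComplexNumber
{
    public double X;
    public double Y;
}

public static class StructInitDemo
{
    public static double SumOfParts()
    {
        ComplexNumber WaveVector;   // declared, but no constructor called
        WaveVector.X = 20.0;
        WaveVector.Y = 34.5;

        // Every field has now been set, so the compiler accepts the use
        // of WaveVector as a complete value. Removing either assignment
        // above would turn the next line into a compilation error.
        ComplexNumber Copy = WaveVector;
        return Copy.X + Copy.Y;
    }
}
```

SumOfParts() returns 54.5.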
Note that calling the constructor for a reference type involves allocating
memory in the heap, while calling a constructor on any value type will simply
cause the appropriate values to be placed in the variable - this memory will
already have been allocated.
Assigning Values to Variables
The syntax for this is very simple, as it follows the shorthand syntax for constructors of predefined types.
X = -10; // X declared as an int
D1 = 4.0; // D1 declared as a double
Message = "Hello"; // Message declared as a string
For user-defined classes and structs, you must either assign values to
the individual fields and properties, or use a constructor.
MyObject = new MyClass();
Alternatively, for both predefined data types and for classes etc. you
can syntactically set a variable equal to the return value of a method that
returns an appropriate type:
X = GetAnInt(); // GetAnInt defined elsewhere
Mine = GetAMyClassRef(); // Mine of type MyClass, GetAMyClassRef()
// defined elsewhere
Syntax for Different Types of Data
If assigning constant numbers to variables, you may use the following
suffixes to indicate that the number should be interpreted as a particular data
type: L: long, M: decimal, F: float, UL: unsigned long. These suffixes are all
case-insensitive, except that a compiler warning will be generated if you use
lower-case l for long, because of the risk of confusing it with the numeral 1 (one).
float GravityAcceleration = 9.81f;
decimal Overdraft = 140.0M;
C# defines a detailed set of rules for precisely when these suffixes are
required, but in general you should use one if there is any doubt.
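One way to see the effect of each suffix is to ask a literal for its type, as in this sketch. SuffixDemo and TypeNames are hypothetical names; the last two literals have no suffix and are included for comparison.

```csharp
public static class SuffixDemo
{
    // Reports the type that the compiler gave to each literal.
    public static string TypeNames()
    {
        return (10L).GetType().Name + " "
             + (10M).GetType().Name + " "
             + (10F).GetType().Name + " "
             + (10UL).GetType().Name + " "
             + (10).GetType().Name + " "     // no suffix, whole number
             + (10.0).GetType().Name;        // no suffix, decimal point
    }
}
```

TypeNames() returns "Int64 Decimal Single UInt64 Int32 Double". In other words, an unsuffixed whole number is an int and an unsuffixed number containing a decimal point is a double, which is why the float and decimal assignments above need their suffixes.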
String expressions are always indicated by enclosing the string in double
quotes. Single quotes should not be used for strings, as they indicate individual
characters.
string Message = "Hello";
char C = 'H';
Certain special characters are however represented by escape sequences:
Escape sequence | Character name | Unicode encoding |
\' | Single quote | 0x0027 |
\" | Double quote | 0x0022 |
\\ | Backslash | 0x005C |
\0 | Null | 0x0000 |
\a | Alert | 0x0007 |
\b | Backspace | 0x0008 |
\f | Form feed | 0x000C |
\n | New line | 0x000A |
\r | Carriage return | 0x000D |
\t | Horizontal tab | 0x0009 |
\v | Vertical tab | 0x000B |
For example, to represent the string C:\My Documents\Chapter 6.doc , you would write:
string Path = "C:\\My Documents\\Chapter 6.doc";
While to represent this text:
First line
Second line
You could write
string Text = "First line\nSecond line";
If you prefer not to use escape sequences in a particular string, you
can instruct the compiler to interpret the complete string literally by prefixing
it with an @ sign:
string Path = @"C:\My Documents\Chapter 6.doc";
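The other text, containing a line break, cannot be written in this way using \n, because the sequence \n has no special meaning inside a literal string prefixed with @. Instead, the line break must be typed into the string itself, which may then span more than one line of source code.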
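The following sketch compares the two kinds of string. VerbatimDemo and its methods are hypothetical names; the doubled double quote shown in the last method is how a quotation mark is written inside a string prefixed with @.

```csharp
public static class VerbatimDemo
{
    // The escaped and the @-prefixed spellings give the same string.
    public static bool SamePath()
    {
        string Escaped = "C:\\My Documents\\Chapter 6.doc";
        string Literal = @"C:\My Documents\Chapter 6.doc";
        return Escaped == Literal;
    }

    // With the @ prefix, \n is a backslash followed by the letter n.
    public static int LengthOfLiteralBackslashN()
    {
        return @"\n".Length;
    }

    // Without the prefix, \n is a single new line character.
    public static int LengthOfEscapedNewLine()
    {
        return "\n".Length;
    }

    // A double quote inside an @-prefixed string is written twice.
    public static string QuoteInside()
    {
        return @"Say ""Hello""";
    }
}
```

SamePath() returns true, LengthOfLiteralBackslashN() returns 2, LengthOfEscapedNewLine() returns 1, and QuoteInside() returns the text Say "Hello".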
As remarked above, characters are represented in the same way as strings, except that single quotes are used instead of double quotes.
char Ecks = 'X', BackSlash = '\\';
The @ format is valid only for strings. Individual characters must be escaped.
Alternatively, for characters, you can supply the numerical value of the
character. Hence, the following is exactly equivalent to the above code.
char Ecks = (char)88, BackSlash = (char)92;
Strictly, what is happening here is that an integer value is being converted (cast) to a char; this will be explained very shortly.
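As a quick check of the equivalence claimed above, the following sketch builds the two characters from their numerical values and compares them with the quoted forms. CharCodeDemo and its methods are hypothetical names.

```csharp
public static class CharCodeDemo
{
    // Builds the characters from numbers and compares with the literals.
    public static bool SameCharacters()
    {
        char Ecks = (char)88, BackSlash = (char)92;
        return Ecks == 'X' && BackSlash == '\\';
    }

    // Goes the other way: from a character to its numerical value.
    public static int CodeOf(char Character)
    {
        return (int)Character;
    }
}
```

SameCharacters() returns true, and CodeOf('X') returns 88.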
Constants
A variable can be declared as const, which effectively means that it is a constant named value, and cannot be assigned to. If a variable is marked as const, then its initial value must be specified when it is declared.
const int MaxLength = 10;
Note that, unlike for some other languages, constants in C# are generally
not written in uppercase. Microsoft decided to recommend Pascal casing (in
which only the first letter of each word making up the name is uppercase)
because Pascal-cased names are easier to read than uppercase names.
There are additional requirements for const variables that are declared as member fields. These are detailed in the 'Fields' section of this appendix.
Casts
It is possible to convert values from one data type to another using casts.
int X = 10;
long XX = X; // implicit cast from int to long
int XXX = (int)XX; // explicit cast from long to int
C# formally defines two types of cast, both of which are illustrated in
the above code. An implicit cast simply involves syntactically assigning
the source variable to the destination variable, whereas an explicit cast
involves prefixing the expression to be cast with the name of the destination
data type, in parentheses. The idea is that an implicit cast is used when
nothing can go wrong - there is no risk of data loss or of an exception being
thrown. The conversion from int to long is therefore defined as being implicit,
because any value that can be stored in an int can also be stored in a long.
On the other hand, an explicit cast is used if there is a possibility of the
cast failing. For example, conversion from long to int must be explicit because
of the risk of an overflow if the long holds a value that is too large to be
stored in an int. For the particular case of long-to-int conversion, as well
as most explicit conversions between other numeric predefined data types, the
results that occur if a number is out of range depend on whether the conversion
occurs inside a checked block (see below). By specifying a cast explicitly in
your code, you are in effect telling the compiler that you understand there is
a risk of a problem arising from the conversion, and you want to do it anyway.
If a cast is defined as implicit, then it is syntactically correct to
perform the cast either implicitly or explicitly. On the other hand, if a
cast is defined as explicit, an attempt to perform it implicitly will raise
a compilation error.
int FirstInteger = 10;
long FirstLong = (long)FirstInteger; // OK, although int-to-long is an
// implicit cast
int SecondInteger = FirstLong; // WRONG - explicit cast required
// long-to-int
For the predefined data types, the C# language specification provides a full list of which casts are explicit and which are implicit. There are too many different possible combinations of source and destination data type to list all the casts here, but the situation can be summarized by saying that the only implicit casts allowed are those for which nothing can go wrong - for example, short to int, ushort to long, etc. If converting from any floating-point type to any integer type then you need an explicit cast (because of the risk of a fractional part of the number getting lost). Similarly, you need an explicit cast if converting from signed to unsigned (in case the value is negative), or if converting to any data type whose maximum value is lower than that of the source data type. Note, however, that implicit casting from int to float and from long to float or double is permitted, although there may be a loss of precision associated with these casts.
With these provisos, you can convert from any predefined numeric type to any other predefined numeric type. Note, however, that char is really viewed in this context as holding a character rather than a number. Hence all conversions from numeric types to char must be explicit, even if there is no risk of data loss. (Conversions in the other direction are less restricted: a char can be implicitly converted to ushort, int, uint, long, ulong, float, double or decimal.)
If an implicit cast is allowed, you may still choose to cast explicitly
if it makes it clearer for other developers to follow the logic of your code.
This is particularly true when dealing with method overloads and user-defined
casts (covered later in the appendix).
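As a minimal sketch of the rules above, the following program shows an implicit widening conversion, an explicit conversion that discards a fractional part, and an implicit long-to-float conversion that loses precision. The class and method names (CastDemo, Truncate, SurvivesFloatRoundTrip) are invented purely for this illustration.

```csharp
using System;

public static class CastDemo
{
    // Explicit cast: the fractional part is discarded (truncation toward zero).
    public static int Truncate(double value)
    {
        return (int)value;
    }

    // long -> float is an implicit conversion, but a float holds only about
    // seven significant decimal digits, so large values may be rounded.
    public static bool SurvivesFloatRoundTrip(long value)
    {
        float asFloat = value;          // implicit cast, possible precision loss
        return (long)asFloat == value;  // explicit cast back again
    }

    public static void Main()
    {
        short Small = 1000;
        int Widened = Small;            // implicit: every short fits in an int
        Console.WriteLine(Widened);

        Console.WriteLine(Truncate(3.9));     // 3
        Console.WriteLine(Truncate(-3.9));    // -3

        // 16777216 (2 to the power 24) is exactly representable as a float,
        // whereas 123456789 is not and comes back as a different value.
        Console.WriteLine(SurvivesFloatRoundTrip(16777216L));
        Console.WriteLine(SurvivesFloatRoundTrip(123456789L));
    }
}
```

The second round trip illustrates why an implicit cast being allowed does not guarantee that the exact value is preserved.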
Converting To or From string
It is quite common that you need to convert between a numeric data type
and a string. Most frequently, this is either so that you can display the
value of a numeric type or so that you can read a value into one.
C# does not provide any predefined casts from the numeric types to string. However, all data types implement a virtual method, ToString(), which returns a string representation of the data type. This method is inherited by all classes from System.Object, and is overridden in the predefined numeric data types to return a string representation of the value (as opposed to the data type). Hence you can
write:
int X = 10;
string ResultText = X.ToString();
For performing the reverse operation, all the predefined numeric data types implement a static method, Parse(), which has a number of overloads, one of which takes a string as a parameter.
string ResultText = "10";
int X = int.Parse(ResultText);
Parse also has a number of other overloads for most predefined data types
that allow you to specify for example globalization information concerning
the format of the string to be converted.
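The following sketch shows one way the Parse() overload that accepts globalization information might be used, together with handling for input that cannot be converted. The helper ParseOrDefault and the class ParseDemo are hypothetical names introduced for this example; int.Parse(), double.Parse() and CultureInfo.InvariantCulture are the real base class members.

```csharp
using System;
using System.Globalization;

public static class ParseDemo
{
    // Returns the parsed value, or the supplied fallback if the text is not
    // a valid integer (FormatException) or is out of range (OverflowException).
    public static int ParseOrDefault(string text, int fallback)
    {
        try
        {
            return int.Parse(text, CultureInfo.InvariantCulture);
        }
        catch (FormatException)
        {
            return fallback;
        }
        catch (OverflowException)
        {
            return fallback;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ParseOrDefault("10", -1));           // 10
        Console.WriteLine(ParseOrDefault("ten", -1));          // -1
        Console.WriteLine(ParseOrDefault("99999999999", -1));  // -1 (too large for int)

        // The invariant culture always uses '.' as the decimal separator,
        // whatever the regional settings of the machine.
        double Price = double.Parse("1234.5", CultureInfo.InvariantCulture);
        Console.WriteLine(Price.ToString(CultureInfo.InvariantCulture));
    }
}
```

Specifying the culture explicitly is a design choice that makes the result independent of the user's regional settings; for text typed in by a user you would more likely want the current culture instead.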
Checked Code
Usually, if an overflow occurs as the result of an arithmetic operation
or an explicit cast, no action is taken: The excess bits of the result are
lost and execution continues with a variable having a (presumably) incorrect
value:
byte Byte1 = (byte)128;
byte Byte2 = (byte)130;
byte Byte3 = (byte)(Byte1 + Byte2); // result is 2
In this case, the result of the addition is 258, but since byte can only
store values between 0 and 255, an overflow occurs and we end up with 2 instead.
If you prefer, you can mark a block of code as checked. Any arithmetic operations or explicit casts involving the predefined data types that occur inside a checked block are examined to see if an overflow occurred, and if it did, an OverflowException is thrown.
checked
{
byte Byte1 = (byte)128;
byte Byte2 = (byte)130;
byte Byte3 = (byte)(Byte1 + Byte2); // this will throw an exception
}
Note that checked blocks are scoped at a source code level. In other words, the checked block only applies to the text that is physically inside the block. If you call a method from within a checked block, then the method executes as unchecked unless the code in it is explicitly marked as checked too.
checked
{
// anything here will execute checked
DoSomething();
DoSomethingElse();
}
void DoSomething()
{
// code here will execute unchecked even though it is
// called from inside a checked block
}
void DoSomethingElse()
{
checked
{
// code here will execute checked
}
}
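To see the difference at runtime, here is a small sketch that performs the same byte addition in both modes. The names CheckedDemo, AddWrapping and TryAdd are invented for this example. It also uses the unchecked keyword, the counterpart of checked, so that the first method behaves the same way regardless of how the compiler's default overflow checking has been configured.

```csharp
using System;

public static class CheckedDemo
{
    // Unchecked: the excess bits are silently discarded.
    public static byte AddWrapping(byte a, byte b)
    {
        unchecked
        {
            return (byte)(a + b);
        }
    }

    // Checked: reports failure instead of returning a wrapped value.
    public static bool TryAdd(byte a, byte b, out byte sum)
    {
        try
        {
            checked
            {
                sum = (byte)(a + b);   // throws if the result exceeds 255
            }
            return true;
        }
        catch (OverflowException)
        {
            sum = 0;
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(AddWrapping(128, 130));   // 2, since 258 - 256 = 2

        byte Sum;
        if (TryAdd(128, 130, out Sum))
            Console.WriteLine(Sum);
        else
            Console.WriteLine("Overflow detected");

        if (TryAdd(100, 100, out Sum))
            Console.WriteLine(Sum);                 // 200
    }
}
```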
Boxing
Boxing is the technical term for the process of casting any value type to object (in other words, a System.Object instance). The cast is always valid because every type, even a value type, is derived from Object. However, it's important to be aware that the cast involves taking a copy
of the value type, since value types are stored inline or on the stack, whereas
Object is a reference type and so must be
stored on the heap. The process of taking this copy will not be that significant
if we are dealing with a primitive data type, but may give a significant
performance hit if the value type is a large struct. All the normal rules
of casts still apply for boxing processes.
When carrying out the boxing cast, you get an object reference, but this
reference points to an instance on the heap that retains the information
about its original data type, and so can be cast back. Syntactically, there's
no difference between this cast and any other cast from base type to derived
type - it's just that the cast involves copying data back from the heap.
(Casting between base and derived types is covered in the section on inheritance.)
int X = 20;
object Value;
Value = X; // boxing - implicit cast
int Y = (int) Value; // unboxing - explicit cast
Boxing occurs implicitly when value types are passed as parameters to
any method that expects a reference type. In this situation you may prefer
to indicate the boxing explicitly in order to make the code more self documenting.
You may also prefer to box explicitly for performance reasons. For example,
in the following code, the Console.WriteLine() overload that is called actually takes an object for each parameter after the format string. Boxing the value RunNumber explicitly means it is only boxed once, instead of being boxed separately every time Console.WriteLine() is called.
int RunNumber = GetRunNumber(); // assume this returns an int
object oRunNumber = (object)RunNumber;
for (int i=0 ; i< 10 ; i++)
{
// assume Result is an array of some numeric type
Console.WriteLine("Run {0} Iteration {1} Gives {2}", oRunNumber, i,
Result[i]);
}
Note that in the above code there is no performance advantage to be gained by explicitly boxing i or Result[i] since each of these is only passed to Console.WriteLine() once before being modified.
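Two consequences of boxing are easy to overlook: the boxed object is a copy, so later changes to the original variable do not affect it, and an unboxing cast must name the exact type that was boxed. The sketch below demonstrates both; BoxingDemo and TryUnboxLong are hypothetical names used only for this example.

```csharp
using System;

public static class BoxingDemo
{
    // Attempts to unbox directly to long. This succeeds only if the object
    // really is a boxed long; a boxed int causes an InvalidCastException.
    public static bool TryUnboxLong(object boxed, out long value)
    {
        try
        {
            value = (long)boxed;
            return true;
        }
        catch (InvalidCastException)
        {
            value = 0;
            return false;
        }
    }

    public static void Main()
    {
        int X = 20;
        object Boxed = X;       // boxing takes a copy of the value
        X = 99;                 // changing X does not affect the boxed copy
        Console.WriteLine((int)Boxed);                      // 20

        long Result;
        Console.WriteLine(TryUnboxLong(Boxed, out Result)); // False: it is a boxed int
        Console.WriteLine(TryUnboxLong(20L, out Result));   // True

        // To get a long from a boxed int, unbox to int first and then widen.
        long Widened = (int)Boxed;
        Console.WriteLine(Widened);                         // 20
    }
}
```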
typeof and GetType()
C# provides the typeof keyword, which
allows you to obtain information concerning different data types. It cannot
be used to obtain information about a particular variable, as it takes as
its parameter the name of a data type rather than the name of a variable.
It returns an instance of the base class System.Type which holds information about a given type.
Type IntTypeInfo = typeof (int);
string TypeName = IntTypeInfo.Name; // TypeName will contain "Int32"
In this example, we have used the Name property of the Type class to extract the name of the type. This by itself may not seem too useful, but the Type class implements a number of other properties and methods that supply other extended information about a type. For example, you can find out what methods a type implements:
Type ti = typeof(int);
foreach(System.Reflection.MethodInfo mi in ti.GetMethods())
{
Console.WriteLine(mi.Name);
}
We won't go into full details of the Type and related classes here - details are in MSDN. Type itself is in the System namespace, and most of the related classes are in System.Reflection.
If you wish to obtain information about a type, given a particular variable whose type you are not certain of, you can use the GetType() method, which is implemented by all objects (it is inherited from System.Object). This method returns a System.Type object that describes the variable (or in the case of a reference variable, the object to which the variable refers, which may be a derived type of the type that the variable was declared as).
object SomeObject;
// initialize SomeObject.
// We now don't know what type
// it is actually pointing to
Type TypeInfo = SomeObject.GetType();
Console.WriteLine(TypeInfo.Name);
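The distinction between the declared type of a variable and the type of the object it refers to can be seen in the following short sketch. TypeDemo and RuntimeTypeName are names invented for this illustration.

```csharp
using System;

public static class TypeDemo
{
    // Returns the name of the run-time type of the object referred to,
    // which may differ from the declared type of the parameter.
    public static string RuntimeTypeName(object value)
    {
        return value.GetType().Name;
    }

    public static void Main()
    {
        object SomeObject = "hello";   // declared as object, refers to a string

        Console.WriteLine(typeof(object).Name);           // Object
        Console.WriteLine(RuntimeTypeName(SomeObject));   // String
        Console.WriteLine(RuntimeTypeName(42));           // Int32 (a boxed int)

        // The two mechanisms can be combined to test for an exact type.
        Console.WriteLine(SomeObject.GetType() == typeof(string));   // True
    }
}
```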
as and is
C# provides two more operators which assist with casting when you are unsure if the source instance is of the correct data type:
- as allows you to attempt
the cast. If the cast fails because no cast exists between the source and
destination data types, it returns the null reference instead of throwing
an exception.
- is allows you to test whether an object is of a particular data type (or of a type derived from that type).
The following code illustrates both operators:
int X = 5;
object obj = X;
Console.WriteLine(obj is int); // displays true
Console.WriteLine(obj is short); // displays false
Console.WriteLine(X is object); // displays true
MessageBox Y = obj as MessageBox; // will fail since obj isn't a MessageBox
Console.WriteLine(Y==null); // displays true
Console.ReadLine();
In this code we use the base class, System.Windows.Forms.MessageBox, as an example. The is operator takes two parameters: the name of a variable to be tested, and the name of a class or struct to test it against. It returns the bool value true if the variable is of the given type or of a type derived from it (which means it will definitely be possible to cast it to that type), and false otherwise. Hence, in the above code (obj is int) returns true because obj was created by boxing an int. X is object also returns true because System.Int32 is derived from System.Object, as are all data types. (In fact <anything> is object will always return true for a non-null value, since all types are ultimately derived from System.Object.)
Next in the code we attempt to cast obj to a MessageBox reference. This cast will fail because obj actually refers to an instance of int , not MessageBox . If we'd written MessageBox Y = (MessageBox) obj ; then the code would have compiled but thrown an exception at runtime. By using as we simply return the null reference instead.
Note that, whereas you can use the is operator to test against any data type, it is only possible to use the as operator to attempt to cast to reference types. It is a syntax error to use as to cast to value types. The result has to be a reference because otherwise it would not be possible to use the null reference as a failure indicator.
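A typical pattern is to use as followed by a test for null when the target is a reference type, and is followed by an ordinary cast when the target is a value type. The sketch below shows both; AsIsDemo and Describe are hypothetical names chosen for this example.

```csharp
using System;

public static class AsIsDemo
{
    // Describes an object without risking an InvalidCastException.
    public static string Describe(object value)
    {
        string Text = value as string;   // null if value is not a string
        if (Text != null)
            return "string of length " + Text.Length;

        if (value is int)                // as cannot be used here: int is a value type
            return "int with value " + (int)value;   // safe, the type was just tested

        return "something else";
    }

    public static void Main()
    {
        Console.WriteLine(Describe("abc"));   // string of length 3
        Console.WriteLine(Describe(5));       // int with value 5
        Console.WriteLine(Describe(2.5));     // something else
    }
}
```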
Execution Flow
C# uses the following statements or sets of statements to control the flow of execution.
- if ... else if ... else
- while
- do ... while
- for
- foreach
- goto
- switch ... case ... goto ... default
- break
- continue
- return
- try ... catch ... finally
We'll examine each of these statements individually in this section, with the exception of try ... catch ... finally , which we'll cover in the section on exceptions.
First a couple of general points:
Most of these statements involve an expression that appears in round brackets
being evaluated one or more times in order to determine where the flow of
execution should go. The evaluation of this expression may or may not have
some side effects. For example:
while ( (X = GetAnInt() )> 3)
DoSomething();
On each iteration round the loop, GetAnInt()
will be called - this may have some side effects of its own, as well as the
fact that evaluating the expression in the brackets will cause a value to
be assigned to X .
The syntax and function of these statements are as follows.
if
The if statement in its most basic form looks like this.
if (Temperature < 10)
Heater.SwitchOn();
The statement inside the if block may be a simple statement as above, or a block statement. The same applies for all the execution-flow commands:
if (Temperature > 20)
{
Fan.SwitchOn();
Console.WriteLine("Fan switched on");
}
The if statement causes the evaluation of the expression that appears in the parentheses after the if keyword. This expression must evaluate to a bool - it is a syntax error for it to evaluate to anything else. If this expression returns true then the statement or block statement inside the if block - that is, the statement immediately following the closing round bracket - is executed. Otherwise, this statement is not executed. Either way, control subsequently transfers to the next statement following the if. Notice that there is no semicolon between the closing round bracket and the statement inside the block.
Any number of else if clauses may optionally be appended to an if statement. If the if() expression returns false, then each expression associated with an else if clause is evaluated in turn until one of the expressions evaluates to true. At this point, the statement immediately following that expression is executed. After this none of the remaining test expressions are evaluated, but instead control transfers to the statements following the if statement.
if (Temperature < 10)
Heater.SwitchOn();
else if (Temperature > 20)
{
Fan.SwitchOn();
Console.WriteLine("Fan switched on");
}
An if statement may also be optionally terminated with an else clause. The statement inside the else clause will be executed only if none of the conditions associated with the if or else if clauses are evaluated to true.
if (Temperature < 10)
Heater.SwitchOn();
else if (Temperature > 20)
{
Fan.SwitchOn();
Console.WriteLine("Fan switched on");
}
else if (Temperature < 15)
{
Heater.Initialize();
}
else
Console.WriteLine("Nothing to do. Temperature between 15 and 20");
while
Like the if statement, the while
statement consists of the keyword followed by an expression in parentheses.
Immediately following the closing round brackets is a statement or block
statement.
The following uses while to read a file. The StreamReader class is, like the Console class, one of the .NET base classes.
StreamReader sr = new StreamReader(@"C:\ReadMe.txt");
string NextLineOfFile;
while ( (NextLineOfFile = sr.ReadLine()) != null)
Console.WriteLine(NextLineOfFile);
sr.Close();
When execution reaches a while statement, the test expression is evaluated. If it evaluates to true then the statement inside the while is executed and the test expression is evaluated again. The cycle repeats until the test expression evaluates to false, at which point execution continues from the statement following the while statement.
do
The syntax for do looks like this:
StreamReader sr = new StreamReader(@"C:\ReadMe.txt");
string NextLineOfFile;
do
{
NextLineOfFile = sr.ReadLine();
if (NextLineOfFile != null)
Console.WriteLine(NextLineOfFile);
}
while ( NextLineOfFile != null);
sr.Close();
The procedure when executing a do statement is almost identical to that for a while statement. The only difference is that with a do statement the contained loop code is executed first, and then the condition is evaluated. This means that the code inside a do loop is always executed at least once, while that inside a while loop might not be executed at all. The above code achieves exactly the same effect as the previous example, for the while loop, again using the .NET base class, StreamReader .
for
The for loop is the most complex of all loops in C#. Its syntax looks like this.
for (int I=0 ; I<Customers.Length ; I++)
Console.WriteLine(Customers[I].Name);
We see that, unlike the other loops, the for loop has three expressions/statements inside the round brackets that define the loop, separated by semicolons. Of these, the second item must be an expression that evaluates to a bool. The first and the third items are complete statements. The rule is this:
When the execution flow hits a for loop,
the first statement in the parentheses is immediately executed. If, as is
typically the case, this statement is simply a declaration and initialization
of a variable, then that variable is scoped to the statement in the for loop. Now the loop actually begins. Each iteration of the loop consists of the following:
- The expression that forms the second item inside the parentheses is evaluated. If it evaluates to false then the loop terminates instantly, and execution is immediately transferred to the statement following the for loop. Otherwise, the loop continues with the next step.
- The statement or block statement inside the for loop is executed.
- The statement that forms the third item in the parentheses that define the for loop is executed.
The usual use of the for loop is to provide
a counter: The loop is executed as a variable is incremented or otherwise
modified, and the loop terminates when this variable hits a certain value
or exceeds a range. The above sample will display the string held in the
Name property of each array element, for all indices from 0 up to one less than the length of the array.
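The sequence of steps described above can be made explicit by rewriting a for loop as a while loop. The following is only a sketch of that equivalence; ForDemo, SumWithFor and SumWithWhile are names invented for the illustration.

```csharp
using System;

public static class ForDemo
{
    // Sums an array with an ordinary for loop.
    public static int SumWithFor(int[] values)
    {
        int total = 0;
        for (int i = 0; i < values.Length; i++)
            total += values[i];
        return total;
    }

    // The same loop spelled out step by step.
    public static int SumWithWhile(int[] values)
    {
        int total = 0;
        {
            int i = 0;                     // first item: executed once, on entry
            while (i < values.Length)      // second item: tested before each iteration
            {
                total += values[i];        // the loop body
                i++;                       // third item: executed after the body
            }
        }                                  // i goes out of scope here
        return total;
    }

    public static void Main()
    {
        int[] Values = { 1, 2, 3, 4 };
        Console.WriteLine(SumWithFor(Values));     // 10
        Console.WriteLine(SumWithWhile(Values));   // 10
    }
}
```

The two forms are not interchangeable in every respect: a continue statement inside a for loop still executes the third item before the next test, whereas in the hand-written while version it would skip the increment.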
foreach
The syntax of the foreach loop looks like this.
foreach(SomeClass Instance in SomeCollection)
DoSomething();
Unlike most execution flow statements, which use a C#
expression to control any loop or condition, foreach features a special syntax
inside the parentheses, which only occurs in this statement. The special
syntax begins with a declaration (but with no initialization) of a variable
(the loop control variable). This is followed by the in keyword, followed
by the name of an existing variable. The existing variable must be of a type
that the compiler recognizes as a collection. A collection can be thought
of, roughly speaking, as a set of objects, and is in this sense similar to
an array. However, whereas elements of an array are retrieved via their index,
elements of a collection are retrieved by moving through the collection with
a special type of object called an enumerator. One other notable feature
of collections is that they provide read-only access to the elements retrieved.
More precisely, the compiler will recognize any class or struct as a collection if it implements the IEnumerable interface. Of the predefined data types, only arrays and string satisfy this requirement, but a number of .NET base classes do so, and you are free to implement your own collections as well. (For an example of this, check out the VectorAsCollection example in Chapter 7.) It is a compile-time error to use a foreach loop where the variable following the in keyword is not of a type that implements IEnumerable. The action taken when the code hits a foreach loop is as follows.
On entering the loop, the IEnumerable.GetEnumerator() method is called against the collection variable to return an enumerator, which is expected to implement the IEnumerator interface. For each iteration round the loop, the IEnumerator.MoveNext() method is called against the enumerator to shift to the next item in it. If MoveNext() returns false, there are no more items, so the loop terminates and execution continues at the next statement. Otherwise, the IEnumerator.Current property is read and the result is assigned to the loop control variable. The statement inside the loop is then executed with the loop control variable having its new value, after which the loop iterates again. Note that the IEnumerator.Current property returns an object reference. This must be cast to the data type of the loop control variable. Obviously, if this cast fails, an exception will be thrown. If a foreach loop is applied to an array, the effect will be to step through all the values in the array.
goto
The goto statement simply transfers control to a named statement, which is identified by a label. Labels can be recognized because they end with a colon:
if (X == true)
goto NextRound;
// later in code
NextRound:
// code to carry on
There are some restrictions on the relative locations of the goto statement and the labeled statement. In particular, they must both be in the same method, and it is not permitted to jump into a block statement or into a loop. (Though it is possible to jump out of a block statement.)
if (X == true)
goto NextRound; // WRONG. Jumping into a block statement.
{
// later in code
NextRound:
// code to carry on
}
Use of goto statements is strongly discouraged, except for the particular case of jumping between case clauses of a switch statement, since it can lead to code that is hard to understand.
switch ... case ... break
The switch statement provides an alternative syntax to if
for executing different conditional branches. It is suited to the situation
in which there are a number of different actions to be taken depending on
the value of a particular variable. The syntax looks like this.
string UserInput = "W";
switch (UserInput)
{
case "X":
case "Q":
RaiseQuitEvent();
return;
case "D":
DisplayImage();
goto case "T";
case "T":
DisplayTitle();
break;
default:
MessageBox.Show("Please enter X, Q, D or T");
break;
}
As the above code shows, the syntax of the switch statement consists of the switch
keyword, followed by the name of a variable in parentheses. This is the variable
whose value is to be tested, and it may be of any of the predefined data
types. There then follows a block statement that consists of a number of
case clauses. Each case
clause gives a possible value of the test variable, followed by a mandatory
colon - failure to supply this colon constitutes a syntax error, and in addition
the colon allows a case clause to be the destination of a goto statement.
When execution enters a switch statement, the value of the test variable is compared with each of the alternatives presented by the various case clauses until a match is found. Then the instructions following that case clause are executed. If none of the case clauses match and a default clause is present, then the instructions following the default : are executed instead. If no default clause is present and no match is found then execution simply continues at the next statement following the switch statement.
As the above example shows, it is possible for case
clauses to immediately follow each other - this is how we provide for the
same set of instructions to be executed for more than one value of the test
variable.
One syntactical requirement of switch
statements is that each conditional sequence of instructions must end with
some statement that explicitly indicates where the flow of execution should
continue. It is a syntax error for a path of execution to simply hit the
following case clause.
case "D":
DisplayImage(); // WRONG - execution simply
case "T": // hits following case clause
There are three possible such terminating statements:
- break exits the switch statement and causes execution to continue at the first statement following the switch statement.
- return is more drastic. It causes the entire containing method to be exited.
- goto <label> transfers execution to the named label. It is common for this to be used to transfer control to another case label (in fact this is the only way to do this). In the above example, if the user hits D then both the image and the title should be displayed.
Note that it is not possible to use the switch statement to test whether a variable lies within a range, because each case clause must supply a single constant value. You will need to use the if statement for those kinds of situations.
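One way of combining the two statements is to use if to reduce a range test to a small set of discrete values, and then switch on the result. The sketch below does this for the temperature example used earlier; SwitchDemo, Classify and Advice are hypothetical names introduced for the illustration.

```csharp
using System;

public static class SwitchDemo
{
    // Range tests need if statements.
    public static string Classify(int temperature)
    {
        if (temperature < 10)
            return "cold";
        if (temperature > 20)
            return "hot";
        return "mild";
    }

    // A switch then selects an action for each discrete category.
    public static string Advice(int temperature)
    {
        string Category = Classify(temperature);
        switch (Category)
        {
            case "cold":
                return "switch heater on";
            case "hot":
                return "switch fan on";
            default:
                return "nothing to do";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Advice(5));    // switch heater on
        Console.WriteLine(Advice(15));   // nothing to do
        Console.WriteLine(Advice(25));   // switch fan on
    }
}
```

Here every case clause ends with return, which satisfies the requirement that execution must never simply run into the following case clause.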
break
This statement causes early termination of an immediately containing for , foreach , while , do or switch statement. On encountering a break , control immediately transfers to the next statement following the enclosing loop. We have already seen this behavior in the switch statement.
The most common uses for break are to terminate case clauses in a switch
statement, and to provide an alternative means of exiting a loop where more
than one condition can require termination. For example, the following uses
the .NET base class, StreamReader, to read a file until either the end of the file is reached or a line containing "Exit" is reached.
StreamReader sr = new StreamReader(@"C:\ReadMe.txt");
string NextLineOfFile;
while ( (NextLineOfFile = sr.ReadLine()) != null)
{
if (NextLineOfFile == "Exit")
break;
// do whatever other processing is needed;
}
sr.Close();
continue
This is similar to break except that
instead of exiting the entire loop, we simply exit that iteration early,
and immediately start the next iteration of the loop. The continue statement is valid only
inside for, foreach, while or do loops. It is not valid inside switch statements.
The following extends the previous example by allowing for a line that contains the text " Ignore " to not be processed (in other words, only this line is skipped - processing continues on the next line of the file).
StreamReader sr = new StreamReader(@"C:\ReadMe.txt");
string NextLineOfFile;
while ( (NextLineOfFile = sr.ReadLine()) != null)
{
if (NextLineOfFile == "Exit")
break;
if (NextLineOfFile == "Ignore")
continue;
// do whatever other processing is needed;
}
sr.Close();
return
This statement causes immediate termination of execution of the method
that is currently being executed. Control transfers back to the calling method.
All variables that are scoped to the method being executed will, of course,
go out of scope as control is returned.
The return statement may take one parameter, which must be of the return
type of the method currently being exited (or be implicitly castable to that data type).
This will be the return value of the method. If the method is defined to
return void, then the return statement takes no parameters.
No brackets are needed for the parameter of a return statement.
public bool IDIsValid(string ID) // ID is valid if first character is 'd'
{
if (ID == null) // or in this case, might alternatively
return false; // throw ArgumentNullException
if (ID[0] == 'd')
return true;
return false;
}
Arrays
An array in C# is actually an instance of the .NET base class, System.Array.
However, C# wraps some special syntax around this class in order to make
using its methods much more intuitive. Hence, for the most part, we can simply
treat arrays as if they are a part of the C# language syntax itself, and
not worry about the fact that behind the scenes we are actually dealing with
this base class.
Declaring and Initializing an Array
The simplest way to declare an array is like this
decimal [] Salaries;
Here the square brackets after the type name tell the compiler that we are actually declaring not a decimal, but an instance of System.Array, which will hold decimals. (We've given this example for decimal but the same syntax applies for any other data type, value or reference.) What the above code will actually do is declare a reference that can refer to a System.Array instance that holds a one-dimensional array of decimals. This reference is called Salaries. At this stage, we have not actually instantiated an array - to do this, we need to call the System.Array constructor - which is done using the new keyword, as for most other data types. The only difference is that we need to supply the number of elements in the array, which is done using square brackets.
decimal [] Salaries = new decimal[5];
The dimension (or rank) of the array as well as the data type it holds
are formally considered part of the type definition, and so are part of the
definition of Salaries . However, the length
of the array (the number of elements in it) is considered to be an initial
value, not part of the type, and so is specified only on the right hand side
of the above statement.
It's important to be aware that the elements of an array are formally considered as fields inside a System.Array class instance, and so will always be initialized by being zeroed out. Also, because Array is itself a reference type, the elements of it will be stored inline on the heap, even if they are value types.
If we wish to specify alternative initial values for the elements we can do so like this.
decimal [] Salaries = new decimal[5] {30000.00M, 35000.00M, 67500.00M,
100000.00M, 50000.00M};
Since the size of the array is obvious from the initialization list, we
don't need to explicitly indicate it if we are using the above syntax.
decimal [] Salaries = new decimal[] {30000.00M, 35000.00M, 67500.00M,
100000.00M, 50000.00M};
As with the other simple data types, C# recognizes an alternative syntax that does not use the new keyword.
decimal [] Salaries = {30000.00M, 35000.00M, 67500.00M, 100000.00M,
50000.00M};
In this case, once again, the compiler can work out from the contents of the curly braces how big the array is supposed to be.
Array Size
The size of an array can be specified at runtime rather than compile time.
However, it cannot be adjusted once the array has been constructed.
int X;
// do some processing to figure out value of X
decimal [] Salaries = new decimal[X];
The size of a 1-dimensional array is subsequently available as the read-only Length property:
int ArraySize = Salaries.Length;
Using Arrays
Once an array has been constructed, its elements are then accessed by
specifying the index in square brackets. It is treated just like an ordinary
variable of the appropriate type. Arrays are zero-indexed, so that, for example,
an array of size 5 has elements numbered from 0 to 4.
Salaries[0] = 40000.0M;
decimal MrBrownsSalary = Salaries[3];
Salaries[4] *= 2;
Bounds checking is automatically performed for arrays. If you attempt to access an element outside the range of the array, an IndexOutOfRangeException will occur at runtime.
// assuming Salaries.Length = 5;
Salaries[5] = 10000.0M; // will cause an exception to be thrown.
//Max. value for index is 4.
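In practice you can either test the index against Length before using it, or catch the exception afterwards. The sketch below shows both approaches; BoundsDemo and GetOrDefault are hypothetical names used only for this example.

```csharp
using System;

public static class BoundsDemo
{
    // Returns the element at the given index, or the fallback value if the
    // index lies outside the array. Testing first avoids raising an exception.
    public static decimal GetOrDefault(decimal[] values, int index, decimal fallback)
    {
        if (index < 0 || index >= values.Length)
            return fallback;
        return values[index];
    }

    public static void Main()
    {
        decimal[] Salaries = new decimal[5];   // elements are initially zero
        Salaries[0] = 40000.0M;

        Console.WriteLine(GetOrDefault(Salaries, 0, -1M));
        Console.WriteLine(GetOrDefault(Salaries, 5, -1M));   // falls back to -1

        try
        {
            Salaries[5] = 10000.0M;            // index 5 is one past the end
        }
        catch (IndexOutOfRangeException)
        {
            Console.WriteLine("Index out of range");
        }
    }
}
```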
Rectangular Multidimensional Arrays
It is also possible to declare rectangular arrays of more than one dimension,
by using commas inside the square brackets to indicate the dimensions. The
number of dimensions (which is technically known as the rank of
the array) will always be one greater than the number of commas in the
definition. The following code uses an array of rank 2.
double [,] TransformMatrix = new double[3,3];
for (int I = 0 ; I < 3 ; I++)
{
for (int J=0 ; J<3 ; J++)
TransformMatrix[I,J] = 4;
}
Arrays of this type are known as rectangular arrays because each 'row'
of the array has the same number of elements. In the above example, the array
has 9 elements - 3 rows x 3 columns.
It is possible to instantiate multidimensional arrays with as large a
rank as you want. For example, to declare a rank 5 array of string:
string [ , , , ,] MassiveArray;
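For a rectangular array, the Length property gives the total number of elements across all dimensions, so the size of an individual dimension has to be obtained from the GetLength() method of System.Array instead. The following sketch shows these members in use; RectangularDemo and Identity are names invented for the example.

```csharp
using System;

public static class RectangularDemo
{
    // Builds a square matrix with ones on the diagonal and zeros elsewhere.
    public static double[,] Identity(int size)
    {
        double[,] Matrix = new double[size, size];   // all elements start at zero
        for (int I = 0; I < size; I++)
            Matrix[I, I] = 1.0;
        return Matrix;
    }

    public static void Main()
    {
        double[,] TransformMatrix = Identity(3);

        Console.WriteLine(TransformMatrix.Rank);          // 2: number of dimensions
        Console.WriteLine(TransformMatrix.Length);        // 9: total number of elements
        Console.WriteLine(TransformMatrix.GetLength(0));  // 3: size of the first dimension
        Console.WriteLine(TransformMatrix.GetLength(1));  // 3: size of the second dimension
    }
}
```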
Jagged Arrays
In C#, if you wish to define a multidimensional array in which not all
'rows' have the same numbers of elements, it is possible to do so. Such an
array is technically known as a jagged array, and is constructed by effectively
forming an array of arrays. The syntax for jagged arrays involves specifying
each dimension in a separate pair of square brackets. To give a 2-dimensional
example:
// construct array
string[][] Text = new string[3][];
Text[0] = new string[2];
Text[1] = new string[4];
Text[2] = new string[2];
// initialize some of the elements
Text[0][0] = "1st word of 1st line";
Text[0][1] = "2nd word of 1st line. The 1st row of this array has just " +
"these two elements";
Text[1][0] = "And the second row has 4 elements";
// etc.
As with rectangular arrays, it is possible to define jagged arrays of
higher rank as well - though for higher rank jagged arrays initializing them
takes quite a bit of work! The above code shows that each individual 'row'
of a jagged array needs to be instantiated separately, as a separate array in
its own right. Jagged arrays are useful for any situation in which groups
of data might have different lengths. For example you might represent an
array of details of purchases made by customers in a 2-dimensional jagged
array:
Purchase [][] Purchases = new Purchase[NumberOfCustomers][]; // Purchase is a custom defined
// class; NumberOfCustomers is assumed to be an int set elsewhere
// later and given appropriate variables
Purchase SomePurchase = Purchases[CustomerIndex][PurchaseIndex];
A rectangular array would not be suitable here because each customer will most likely have made a different number of purchases.
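The purchases idea can be sketched as a complete program. Here Purchase is a hypothetical stand-in class reduced to a single Amount field, and JaggedArrayDemo, BuildSample and TotalForCustomer are names invented for the illustration. Note that the size of the outer array must be given when it is created, while each inner array is created separately with its own length.

```csharp
using System;

// Hypothetical stand-in for the custom Purchase class mentioned in the text.
public class Purchase
{
    public decimal Amount;

    public Purchase(decimal amount)
    {
        Amount = amount;
    }
}

public static class JaggedArrayDemo
{
    // Builds sample data: customer 0 made two purchases, customer 1 made one.
    public static Purchase[][] BuildSample()
    {
        Purchase[][] purchases = new Purchase[2][];
        purchases[0] = new Purchase[] { new Purchase(10M), new Purchase(2M) };
        purchases[1] = new Purchase[] { new Purchase(7M) };
        return purchases;
    }

    // Adds up one customer's purchases; each row may have a different length.
    public static decimal TotalForCustomer(Purchase[][] purchases, int customerIndex)
    {
        decimal total = 0M;
        foreach (Purchase purchase in purchases[customerIndex])
        {
            total += purchase.Amount;
        }
        return total;
    }

    public static void Main()
    {
        Purchase[][] purchases = BuildSample();
        Console.WriteLine(purchases[0].Length);             // 2
        Console.WriteLine(purchases[1].Length);             // 1
        Console.WriteLine(TotalForCustomer(purchases, 0));  // 12
    }
}
```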
Operators
C# recognizes a large number of operators. An operator is basically a
symbol, which, when encountered, causes some processing to be done that returns
a value.
NewBalance = OldBalance + Interest;
This code actually formally contains two operators. The +
operator adds the values of the two variables to its immediate left and
immediate right and returns their sum. (These two variables can be considered
to be the parameters passed to the + operator). Then the = operator assigns the variable on the left the value of the expression or variable on the right - hence storing the sum in NewBalance . Operators always return a value, and = returns the value assigned. This means that we can, if we want, set more than one variable to the sum, by writing:
CopyOfNewBalance = NewBalance = OldBalance + Interest;
Note that the fact that this statement works as described depends on the
compiler arranging for the operators to be called in the appropriate order.
What we expect to happen is this:
CopyOfNewBalance = ( NewBalance = ( OldBalance + Interest) );
This is what actually happens because C# has strict rules of operator
precedence - which operator in an expression gets evaluated first. According
to these rules, + has a higher precedence than = , which means that any occurrences of + in an expression will be evaluated before any occurrences of = . If more than one =
occurs, these will be evaluated right-to-left. The rules for operator precedence
are detailed in the MSDN documentation, but have been carefully designed
so that the result of any expression is what would intuitively be expected.
For example, in the expression I + J * K , the multiplication is evaluated first, then I is added to the result.
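The following sketch checks both rules: chained assignment groups right-to-left, and multiplication is performed before addition unless parentheses say otherwise. The class and method names are invented for the illustration, and the balance figures are arbitrary.

```csharp
using System;

public static class PrecedenceDemo
{
    // No parentheses: * has higher precedence than +, so j * k is evaluated first.
    public static int WithoutParentheses(int i, int j, int k)
    {
        return i + j * k;
    }

    // Parentheses override the default precedence.
    public static int WithParentheses(int i, int j, int k)
    {
        return (i + j) * k;
    }

    public static void Main()
    {
        decimal oldBalance = 100M;
        decimal interest = 5M;
        decimal newBalance;
        decimal copyOfNewBalance;

        // + is evaluated before =, and = groups right-to-left, so this reads as
        // copyOfNewBalance = (newBalance = (oldBalance + interest))
        copyOfNewBalance = newBalance = oldBalance + interest;

        Console.WriteLine(newBalance);                   // 105
        Console.WriteLine(copyOfNewBalance);             // 105
        Console.WriteLine(WithoutParentheses(2, 3, 4));  // 14
        Console.WriteLine(WithParentheses(2, 3, 4));     // 20
    }
}
```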
An important point to understand with C# operators is that they always
return a value, but they sometimes have some side effect of setting some
variable. We've just seen an example of this, in which = returns a value but also has the effect of setting the value of the variable on its left. For the case of = , this side effect is usually the main reason why we've chosen to use the operator.
As another example, assuming X is some numeric type, in the statement:
if (X == 0)
DoSomething();
The == operator returns a value, which is of type bool and will have the value true if both parameters ( X and 0 ) happen to have the same value, and false otherwise. This particular operator does not have any side-effects.
Note the difference between = (assignment operator) which sets a value,
and == (comparison operator) which compares values. It is important not to
confuse these operators.
An operator can in many ways be regarded as a more intuitive syntax for
a method call. For example, our last example can be regarded as a simpler
way of writing the following (which for numeric types will have the same
effect).
if (X.Equals(0))
DoSomething();
Syntax of Operators
Operators can formally be regarded as taking one, two or three parameters,
depending on the operator. An operator that takes two parameters is known
as a binary operator , and is written between the two parameters.
This is known as infix notation and is the notation we are used to in everyday
life - hence the following syntactically correct expressions have their intuitive
meanings.
A + B // returns sum of A and B
Y = 0 // assigns 0 to Y
Y == 0 // returns true if Y contains 0
Note that the above snippets do not include terminating semicolons
because they represent expressions rather than complete statements.
Usually, if an operator takes one parameter, this parameter is written
to the operator's immediate right (prefix notation - the 'pre' refers to
the position of the operator, not the operand).
-X // reverses the sign of the expression
However, two operators, increment ( ++ ) and decrement ( -- ),
may be written with the parameter either to the right (prefix notation)
or the left (postfix notation), and they have different meanings according
to the position:
++X // adds 1 to the value of X and returns new value of X
X++ // adds 1 to the value of X and returns old value of X
--X // subtracts 1 from the value of X and returns new value of X
X-- // subtracts 1 from the value of X and returns old value of X
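The difference only shows up when the returned value is used. As a small sketch (the names IncrementDemo and Run are made up), the code below captures the value returned by each of the four forms in turn:

```csharp
using System;

public static class IncrementDemo
{
    // Returns the value produced by each form, followed by the final value of x.
    public static int[] Run()
    {
        int x = 5;
        int a = ++x;   // x becomes 6; a receives the new value, 6
        int b = x++;   // x becomes 7; b receives the old value, 6
        int c = --x;   // x becomes 6; c receives the new value, 6
        int d = x--;   // x becomes 5; d receives the old value, 6
        return new int[] { a, b, c, d, x };
    }

    public static void Main()
    {
        foreach (int value in Run())
        {
            Console.WriteLine(value);   // prints 6, 6, 6, 6 and then 5
        }
    }
}
```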
Operators that take one parameter are known as unary operators .
C# has one operator that takes three parameters, the ternary operator. This operator has two symbols, ? and : , and is written with these symbols placed between the parameters (infix notation, again).
X ? Y : Z
The first parameter of the ternary operator must be a bool (or an expression
that returns a bool). The ternary operator evaluates the first parameter, then
returns the second parameter if the first parameter returns true , and the third parameter otherwise. For example:
// Result is a numeric variable
string SignOfResult = (Result >= 0) ? "Positive" : "Negative";
Note that in general, the parameters passed to the operators can be either
variables or expressions. However, if the operator assigns new values to
any of its parameters, then those parameters must be variables.
There are a number of different operators that take different numbers
of parameters but are represented by the same symbol. For example, unary
- returns the result from reversing the sign of the parameter, while binary - works out the value obtained by subtracting the second parameter from the first:
A-B // binary -
-B // unary -
There is in practice never any confusion - it is always clear how many parameters there are and hence which meaning is intended.
Meanings of the Operators
For most of the operators, it is easiest to understand what they do from
observing the context rather than looking at their strict definitions. Most
of the operators have numerous overloads, depending on the data types of
their operands. For example, in the expression X=A+B ; the action of the + operator depends on the types of A and B . If A and B are int s, it will return their sum as an int . On the other hand, if A and B are double s, it will return their sum as a double . Intuitively this seems like the same process, but strictly speaking adding two int s and adding two double
s are different operations that require different assembly language instructions.
It is also possible to overload the addition operator for your own classes
- and, of course, strictly you can overload it to do whatever you want, but
your code will be hard to understand if your overload doesn't correspond
to what most people would understand as addition in the context of your class.
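To illustrate, the sketch below uses + with int, double and string operands, and then with a user-defined overload. Vector2D is a hypothetical cut-down class written for this example; it is not the Vector class defined later in the appendix.

```csharp
using System;

// Hypothetical two-component vector, used only to show an overloaded +.
public class Vector2D
{
    public double X;
    public double Y;

    public Vector2D(double x, double y)
    {
        X = x;
        Y = y;
    }

    // Operator overloads must be declared public and static.
    public static Vector2D operator +(Vector2D left, Vector2D right)
    {
        return new Vector2D(left.X + right.X, left.Y + right.Y);
    }
}

public static class OverloadDemo
{
    public static void Main()
    {
        int intSum = 2 + 3;            // int addition
        double doubleSum = 2.5 + 3.5;  // double addition
        string text = "ab" + "cd";     // string concatenation
        Vector2D sum = new Vector2D(1, 2) + new Vector2D(3, 4);  // user-defined +

        Console.WriteLine(intSum);     // 5
        Console.WriteLine(doubleSum);  // 6
        Console.WriteLine(text);       // abcd
        Console.WriteLine(sum.X);      // 4
        Console.WriteLine(sum.Y);      // 6
    }
}
```

Because the overload here does component-wise addition, it matches what most readers would expect + to mean for a vector.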
In the following tables, we will list all the operators, providing examples
of their syntax and an intuitive description of their return value and side
effect when they are applied to the predefined data types. Unless otherwise
indicated, the operators can be applied to all numeric types (but not string ); only + , += , = , == and != can be applied to string . However, the bitwise and shift operators can only be applied to integer types, not to float or double .
Although you can overload operators to have meanings for your own classes
and structs, you should note that, in general, the operators have no meaning
when applied to classes etc. unless you define overloads. The only exceptions
are as follows:
- Assignment ( = ) is defined for all reference types to copy the reference
(not the instance of the class) and for all value types to copy the memory
that the variable occupies.
- Comparison ( == and != ) are defined for reference types to compare the
reference (but they are overloaded for strings to compare the text of the
string). For value types that you define, that is structs, == and != are
undefined unless you overload them.
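The two exceptions above can be seen in a short sketch. Account is a hypothetical class that defines no operator overloads of its own, so = and == fall back on their default reference behaviour, while string compares text.

```csharp
using System;

// Hypothetical class with no operator overloads.
public class Account
{
    public string Owner;

    public Account(string owner)
    {
        Owner = owner;
    }
}

public static class EqualityDemo
{
    public static void Main()
    {
        Account first = new Account("Jones");
        Account second = new Account("Jones");
        Account alias = first;   // = copies the reference, not the object

        Console.WriteLine(first == second);  // False: two different objects
        Console.WriteLine(first == alias);   // True: both refer to the same object

        alias.Owner = "Smith";
        Console.WriteLine(first.Owner);      // Smith: only one object exists

        string s1 = "hello";
        string s2 = new string(new char[] { 'h', 'e', 'l', 'l', 'o' });
        Console.WriteLine(s1 == s2);                  // True: == compares the text
        Console.WriteLine((object)s1 == (object)s2);  // False: reference comparison
    }
}
```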
Unary operators
These operators all act on one variable. Their return type is the same
as the type of variable that they operate on. In the following table, the
syntax column uses the symbol X to represent the parameter.
Symbol | Nature of action | Syntax | Return value | Side-effect |
- | sign reversal | -X | -X | |
++ | increment | ++X | X+1 | Adds 1 to X |
++ | increment | X++ | X | Adds 1 to X |
-- | decrement | --X | X-1 | Subtracts 1 from X |
-- | decrement | X-- | X | Subtracts 1 from X |
* | dereference | *X | Contents of memory location with address X | |
& | address-of | &X | Address in memory of variable X. (X must be a variable, not an expression) | |
! | logical NOT | !X | true if X is false, false otherwise (X must evaluate to bool) | |
~ | bitwise complement | ~X | Result from reversing each bit in X (X must be an integer type) | |
Binary operators
The binary operators all return the same data type as their operands (which
must both be castable to the same data type), except for the comparison operators
which always return bool.
Symbol | Nature of action | Syntax | Return value | Side-effect |
= | assignment | X=Y | Y | Sets X to have same value as Y. (X must be a variable, not an expression) |
== | comparison | X==Y | true if X and Y are equal, false otherwise | |
!= | comparison | X!=Y | true if X and Y are not equal, false otherwise | |
> | comparison | X>Y | true if X is greater than Y, false otherwise | |
<= | comparison | X<=Y | true if X is less than or equal to Y, false otherwise | |
< | comparison | X<Y | true if X is less than Y, false otherwise | |
>= | comparison | X>=Y | true if X is greater than or equal to Y, false otherwise | |
+ | addition | X+Y | Sum of X and Y | |
- | subtraction | X-Y | Value of X-Y | |
/ | division | X/Y | Value of X divided by Y | |
* | multiplication | X*Y | Value of X times Y | |
% | remainder | X%Y | Remainder on dividing X by Y | |
+= | addition-assignment | X += Y | Value of X+Y | Assigns result to X (X must be a variable, not an expression) |
-= | subtraction-assignment | X -= Y | Value of X-Y | Assigns result to X (X must be a variable, not an expression) |
/= | division-assignment | X /= Y | Value of X divided by Y | Assigns result to X (X must be a variable, not an expression) |
*= | multiplication-assignment | X *= Y | Value of X times Y | Assigns result to X (X must be a variable, not an expression) |
%= | remainder-assignment | X %= Y | Remainder on dividing X by Y | Assigns result to X (X must be a variable, not an expression) |
&& | logical AND | X && Y | true if X and Y are both true, false otherwise. (X and Y must both evaluate to bool) | |
|| | logical OR | X || Y | true if either X or Y or both are true, false otherwise. (X and Y must both evaluate to bool) | |
& | bitwise AND | X & Y | Result of performing a bitwise AND operation on X and Y | |
| | bitwise OR | X | Y | Result of performing a bitwise OR operation on X and Y | |
^ | bitwise XOR | X ^ Y | Result of performing a bitwise exclusive-OR operation on X and Y | |
&= | bitwise AND-assignment | X &= Y | Result of performing a bitwise AND operation on X and Y | Assigns result to X (X must be a variable, not an expression) |
|= | bitwise OR-assignment | X |= Y | Result of performing a bitwise OR operation on X and Y | Assigns result to X (X must be a variable, not an expression) |
^= | bitwise XOR-assignment | X ^= Y | Result of performing a bitwise exclusive-OR operation on X and Y | Assigns result to X (X must be a variable, not an expression) |
>> | right-shift | X >> Y | Result of bit-shifting X right by Y bits | |
<< | left-shift | X << Y | Result of bit-shifting X left by Y bits | |
>>= | right-shift-assignment | X >>= Y | Result of bit-shifting X right by Y bits | Assigns result to X (X must be a variable, not an expression) |
<<= | left-shift-assignment | X <<= Y | Result of bit-shifting X left by Y bits | Assigns result to X (X must be a variable, not an expression) |
Note that the symbol <> is not recognised as an operator. To test for inequality you must use !=.
Ternary Operator
Symbol | Nature of action | Syntax | Return value | Side-effect |
? : | ternary operator | X ? Y : Z | Y if X is true, Z otherwise (X must evaluate to bool) | |
Classes
Classes are defined using the class keyword, with the following syntax:
class Vector
{
public double X;
public double Y;
public double Norm()
{
return X*X + Y*Y;
}
}
If no base class is specified then the base class is assumed to be System.Object . A class is always derived from one other class and may additionally be derived from any number of interfaces.
class BetterVector : Vector, IEnumerable, ICollection, IFormattable
{
// members
}
The interfaces used in the above example are all .NET base class interfaces.
If a class is derived from another class, then all members of the base
class are automatically also part of the derived class (although any method
that has private access will not be visible to code in the derived class).
The derived class may, however, implement its own versions of any members
- either overriding or hiding the corresponding base class method.
If a class is derived from an interface, then the class must provide implementations
of all members of the interface; it is a syntax error for it not to do so.
Each member of a class should be given one of the access modifiers public , private , protected , internal , and protected internal . The meanings of these are:
Keyword | Meaning |
public | The member is visible to all other code outside the class |
private | The member is only visible within the class |
protected | The member is only visible within the class and any derived classes |
internal | The member is only visible within the assembly in which it has been defined |
protected internal | The member is visible to any code within the assembly in which it has been defined, and also to derived classes in other assemblies. It is not visible to other code outside the assembly |
If no access modifier is indicated explicitly for a member, then that member will be private.
Types of Class Members
Members of a class may include any of the following
Type of member | Description |
Field | A member variable. |
Method | A function call - a set of code that performs some task. |
Property | A method call or a pair of method calls syntactically dressed to look to callers like a field. |
Indexer | A method call or a pair of method calls syntactically dressed to look to callers as if the containing object is an array and we are accessing an element of this array. |
Constructor | A method that is automatically called when a class is loaded or a class instance is instantiated in order to initialise the class or class instance. |
Destructor | A method that is automatically called when a class instance is garbage collected. |
Custom cast | A method that is automatically called as needed to carry out conversions between instances of this class and instances of some other class. |
Event | A variable that contains references to methods and which is used in the Windows event handling architecture. |
Definition of another data type | Definition of a class, struct, enumeration or delegate. |
Static and Instance Class Members
By default, members of a class are instance members. For the case of fields
and events, this means that one copy of this field is independently stored
with each instance of the class. The data in an instance field is associated
with a class instance rather than a class. For all other members (the function
members), only one copy of the code is stored, no matter how many class instances
are declared. However, the code for a function is able to access instance
fields because, when called, it is passed an additional hidden parameter,
which is not seen by the developer, which gives a reference to the class
instance against which the method has been called. In this way, instance
methods are still associated with a class instance. The hidden parameter
is accessible via the this keyword.
Members of a class may also be declared as static. If a member is declared
as static, then it is associated with the class rather than with any instance.
Members are declared as static by marking them with the static keyword.
class CustomerName
{
public static ushort MaxLength;
For fields, only one copy of a static field is stored, and all instances
of the class have access to the same copy of this field. If one class instance
modifies the copy of a static field, then the changed value applies to all
other instances of the same class (or to any other classes that have access
to this field).
If a function is declared as static, then it is not passed any hidden
parameter that gives any instance of a class. Hence static functions cannot
be associated with a class instance. It is a syntax error for any function
that has been declared as static to access any instance data: Static methods
may only access static fields.
Any field may be declared as static, with the exception that const fields are always implicitly static, and so should not be explicitly declared as such:
public const ushort MaxLength = 10; // automatically static
Methods and properties may be declared as static, but indexers may not. Constructors
and destructors may not be declared as static, except for the special case
of the static constructor, which has a precise syntax as described later
in the appendix. Casts and operator overloads may only be declared as static.
Definitions of other data types should not be declared as static.
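As a sketch of the difference between static and instance members, the class below extends the CustomerName fragment shown above. The MinLength constant and the IsValid() and IsValidLength() methods are hypothetical additions made for this illustration.

```csharp
using System;

public class CustomerName
{
    public static ushort MaxLength = 10;  // one copy, shared by the whole class
    public const ushort MinLength = 1;    // const, so implicitly static
    public string Name;                   // one copy per instance

    public CustomerName(string name)
    {
        Name = name;
    }

    // Instance method: may use both instance and static members.
    public bool IsValid()
    {
        return Name.Length >= MinLength && Name.Length <= MaxLength;
    }

    // Static method: has no this reference, so it may use only static members.
    public static bool IsValidLength(int length)
    {
        return length >= MinLength && length <= MaxLength;
    }
}

public static class StaticDemo
{
    public static void Main()
    {
        CustomerName customer = new CustomerName("Arabel Jones");  // 12 characters
        Console.WriteLine(customer.IsValid());                // False: longer than 10

        CustomerName.MaxLength = 20;   // changes the single shared copy
        Console.WriteLine(customer.IsValid());                // True
        Console.WriteLine(CustomerName.IsValidLength(25));    // False
    }
}
```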
Accessing Members of a Class
Members of the class that are not static are accessed from code outside the class by supplying the name of the class instance followed by the scope resolution operator, the dot.
CustomerName NextCustomer = new CustomerName(); //CustomerName defined as a class
NextCustomer.Name = "Arabel Jones";
Static members of the class are accessed from code outside the class by supplying the name of the class followed by the scope resolution operator.
ushort Length = CustomerName.MaxLength;
For code within a class, members of that class may be accessed either
by simply giving their name, or by prefixing the name of the member by the
this keyword.
// in CustomerName definition
Length = MaxLength;
If, within a method, another local variable is declared that has the same
name as a class member, then the local variable hides the class member. In
that case, supplying the name by itself will be interpreted as referring
to the local variable, and the class member can only be referred to using
the this syntax.
class CustomerName
{
public CustomerName(string name)
{
this.name = name;
}
Recall that within an instance method, the this keyword accesses the hidden parameter passed to the method that represents the object against which the method has been called.
If the code within a method needs to access a member of the same instance
that has been defined in the immediate base class, then it may do so using
the base keyword. This is only necessary
if the method has been hidden or overridden in the derived class; this is the
usual reason for using base . (The base keyword is the equivalent of Java's super keyword.)
class OurControlWithGreenText : OurCompanysGenericControl
{
public override void DisplayText(string text)
{
TextColor = Color.Green;
base.DisplayText(text);
}
// etc.
(Note that despite the suggestive names, the classes and methods in this
example are made up - they are not, with the exception of Color.Green - part of the .NET Windows Forms base classes).
The this Reference
The this keyword is used when you need a reference within a method to the containing class instance. As noted above, the this reference provides an alternative syntax for accessing other members of the class instance.
However, this is also treated as a variable
in its own right. For example you may use it in order to pass a reference
to the containing class as a parameter to another method. This would commonly
be done in order for one class to provide another class instance with a reference
to itself. (Note, the example below only makes sense for classes, not structs):
class MyTreeElement
{
MyTreeElement(MyTreeElement parent)
{
// etc.
}
// somewhere in the code
MyTreeElement Child = new MyTreeElement(this);
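A complete version of this idea might look like the sketch below. TreeNode, its fields and the AddChild() method are hypothetical names modelled on the MyTreeElement fragment; they show this being used both to qualify member names and as a value passed to another constructor.

```csharp
using System;

// Hypothetical tree element, modelled on the MyTreeElement fragment.
public class TreeNode
{
    public string Name;
    public TreeNode Parent;

    public TreeNode(string name, TreeNode parent)
    {
        // this. makes it explicit that the fields of the current instance are set.
        this.Name = name;
        this.Parent = parent;
    }

    // Creates a child and hands it a reference to the current instance.
    public TreeNode AddChild(string childName)
    {
        return new TreeNode(childName, this);
    }
}

public static class ThisDemo
{
    public static void Main()
    {
        TreeNode root = new TreeNode("root", null);
        TreeNode child = root.AddChild("child");

        Console.WriteLine(child.Parent == root);  // True
        Console.WriteLine(child.Parent.Name);     // root
    }
}
```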
Inheritance
As mentioned in the section on classes, all classes derive directly from just one other class, except for System.Object
which does not derive from anything else. Note that indirectly a class may
derive from many classes, as in the following example: OurNewGroovyZippyHardDrive derives directly from HardDrive , but indirectly (via HardDrive ) from Device , and in turn from System.Object .
Consider this situation:
class Device
{
// methods etc.
}
class HardDrive : Device
{
// methods etc.
}
class OurNewGroovyZippyHardDrive : HardDrive
{
// methods etc.
}
Inherited classes gain all members of the base class. So, for example,
all fields, methods, indexers and other members of Device are automatically members of HardDrive . HardDrive can also define any new members of its own. These, along with the ones defined in Device , are automatically also present in the OurNewGroovyZippyHardDrive
class. Note, however, that any member that is declared as private, although
present, is not visible to any code written in derived classes.
For the case of methods etc., a derived class takes the implementation
of each method from the base class. However, you may choose to supply alternative
implementations of any method in any derived class. What the compiler does
in this situation is discussed later in this appendix, in the section on
method overloads and method hiding.
The structure of which class is derived from which other class is known
as the class hierarchy. A derived class is sometimes known as a subclass,
and a base class as a superclass. Because a class may have many subclasses
but only one superclass, the class hierarchy has the appearance of a tree
structure:

It is a syntax error to form any circular dependencies in the class hierarchy.
References to Derived Classes
A characteristic of C# references is that a variable that is declared
as a reference to some class can alternatively point to any derived class.
So, for example, continuing the above example, suppose we write:
HardDrive SomeHardDrive;
As you would expect, we could write
SomeHardDrive = new HardDrive();
However, it is also perfectly legitimate to then code up this:
SomeHardDrive = new OurNewGroovyZippyHardDrive();
On the other hand, we cannot set references equal to base classes of the referent type.
SomeHardDrive = new Device(); // WRONG
Since all classes are derived from System.Object , a reference to Object can be set equal to any instance of any class whatsoever.
Object Anything; // can set this equal to any object
Note that you would generally write the above code as follows, as C# uses the object keyword to represent instances of System.Object :
object Anything; // a better way of saying the same thing
The reason that it is fine for references to refer to derived classes
is that a derived class automatically implements all the fields and methods
etc. of a base class. In a sense, a derived class is always the base class
plus a bit more. So, one way of looking at it, is that in the above examples,
SomeHardDrive is roughly speaking pointing to a HardDrive instance - it's just that it's a HardDrive instance that has some stuff added to turn it into an OurNewGroovyZippyHardDrive
. Note, however, that any methods etc. called through a base class reference
must be defined in the base class, otherwise a compilation error will result.
HardDrive SomeHardDrive;
SomeHardDrive = new OurNewGroovyZippyHardDrive();
SomeHardDrive.DoSomething(); // OK if DoSomething() is a member of
// HardDrive. Compilation error if
// DoSomething() is only defined
// in OurNewGroovyZippyHardDrive
Casts Between Base and Derived Classes
Since a derived class is automatically also effectively an instance of
any base classes, then casting from a derived class to a base class is guaranteed
to work, and so may be done implicitly. In particular this means that any
class may be implicitly cast to object . Continuing the above example,
// SomeHardDrive declared as type HardDrive
Device OtherRefToHardDrive = SomeHardDrive;
object YetAnotherRef = SomeHardDrive; // Implicit casting to a base class OK
On the other hand, casting from a base class to a derived class is risky.
If the object pointed to by the reference is not of the correct type, then
an exception could be thrown. Hence casting this way must be done explicitly.
Device SomeDevice2 = new Device();
Device SomeDevice3 = new HardDrive();
HardDrive SomeHD2 = (HardDrive) SomeDevice2; // will compile but raise
// exception on execution as
// SomeDevice2 doesn't refer to
// a HardDrive instance
HardDrive SomeHD3 = (HardDrive) SomeDevice3; // will compile and run OK
If a reference variable currently stores the null reference then the cast
will succeed, but will return the null reference. If you want to cast between
objects and guarantee that an exception won't be thrown, then you should
use the as keyword as described in the section on data types. This will return null if the cast fails.
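The three outcomes described above (a successful explicit cast, a failing explicit cast, and a failing as conversion) can be compared side by side in a short sketch. Device and HardDrive are given empty bodies here, and CastDemo and IsHardDrive are names invented for the illustration.

```csharp
using System;

public class Device { }
public class HardDrive : Device { }

public static class CastDemo
{
    // Uses as, which yields null instead of throwing when the cast is not possible.
    public static bool IsHardDrive(Device device)
    {
        HardDrive converted = device as HardDrive;
        return converted != null;
    }

    public static void Main()
    {
        Device plainDevice = new Device();
        Device reallyAHardDrive = new HardDrive();

        HardDrive ok = (HardDrive)reallyAHardDrive;   // succeeds at run time
        Console.WriteLine(ok != null);                // True
        Console.WriteLine(IsHardDrive(plainDevice));  // False, and no exception

        try
        {
            HardDrive bad = (HardDrive)plainDevice;   // compiles, but throws
            Console.WriteLine(bad);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException caught");
        }
    }
}
```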
Method Overloads and Method Hiding
It is possible to define any method, indexer, or property as virtual
(operator overloads are always static and so cannot be virtual).
This has the usual meaning for object-oriented programming: It indicates
that a runtime check should be made of which class an object is an instance
of before selecting the appropriate method. For methods that are not explicitly
declared as virtual, the compiler uses the declared type of the reference to determine
the method to be called. The difference is significant because of the rule
in C# about references being able to refer to derived classes as well as
their referent data type.
For example, suppose we have the following classes, with one virtual and one non-virtual method defined in each:
class DatabaseUserAccount
{
public bool Login(string password)
{
// implementation
}
public virtual long GetAdminRights()
{
// implementation
}
public void DoSomethingElse()
{
// implementation
}
}
class DatabaseAdminAccount : DatabaseUserAccount
{
public override long GetAdminRights()
{
// implementation
}
public new void DoSomethingElse()
{
// implementation
}
}
We note that:
- Login() is declared only in the base class;
- GetAdminRights() is virtual in the base class and overridden in the derived class
- DoSomethingElse() is not virtual, but defined in both classes.
Now consider the following code:
DatabaseUserAccount User1 = new DatabaseUserAccount();
DatabaseUserAccount Admin1 = new DatabaseAdminAccount();
DatabaseAdminAccount Admin2 = new DatabaseAdminAccount();
// assume there is code here that correctly initializes these variables.
// note that Admin1 is an admin account but referenced through a
// user account reference
// comment indicates which version of this method is called
User1.Login("Password"); // DatabaseUserAccount
Admin1.Login("Password"); // DatabaseUserAccount
Admin2.Login("Password"); // DatabaseUserAccount
User1.GetAdminRights(); // DatabaseUserAccount
Admin1.GetAdminRights(); // DatabaseAdminAccount
Admin2.GetAdminRights(); // DatabaseAdminAccount
User1.DoSomethingElse(); // DatabaseUserAccount
Admin1.DoSomethingElse(); // DatabaseUserAccount
Admin2.DoSomethingElse(); // DatabaseAdminAccount
The reasoning behind this is as follows:
- User1 is a DatabaseUserAccount instance referenced through a DatabaseUserAccount reference. Both types (reference and referent) are the same, so the code will look in the DatabaseUserAccount class to identify all methods to be called.
- Admin1 is a DatabaseAdminAccount instance referenced through a DatabaseUserAccount reference. Therefore, the code will look in DatabaseUserAccount to identify all non-virtual methods, and DatabaseAdminAccount to locate virtual or overridden methods. In this case, the only such method is GetAdminRights() , so the derived class DatabaseAdminAccount version of this method is called.
- Admin2 is a DatabaseAdminAccount instance referenced through a DatabaseAdminAccount reference. Both types (reference and referent) are the same, so the code will look in the DatabaseAdminAccount class to identify all methods to be called. However, in the case of Login() ,
no such method is available, so we search back up the class hierarchy
to identify a method. The first one we find is the version in DatabaseUserAccount .
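To make this behaviour observable, the sketch below is a cut-down, runnable variant of the two account classes. As an assumption made purely for the demonstration, the methods return a string naming the class whose version ran (instead of long and void), and Login() is left out.

```csharp
using System;

public class DatabaseUserAccount
{
    public virtual string GetAdminRights() { return "DatabaseUserAccount"; }
    public string DoSomethingElse() { return "DatabaseUserAccount"; }
}

public class DatabaseAdminAccount : DatabaseUserAccount
{
    // Overrides the virtual method: chosen at run time from the object's actual type.
    public override string GetAdminRights() { return "DatabaseAdminAccount"; }

    // Hides the non-virtual method: chosen at compile time from the reference type.
    public new string DoSomethingElse() { return "DatabaseAdminAccount"; }
}

public static class VirtualDemo
{
    public static void Main()
    {
        DatabaseUserAccount admin1 = new DatabaseAdminAccount();
        DatabaseAdminAccount admin2 = new DatabaseAdminAccount();

        Console.WriteLine(admin1.GetAdminRights());   // DatabaseAdminAccount
        Console.WriteLine(admin1.DoSomethingElse());  // DatabaseUserAccount
        Console.WriteLine(admin2.GetAdminRights());   // DatabaseAdminAccount
        Console.WriteLine(admin2.DoSomethingElse());  // DatabaseAdminAccount
    }
}
```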
In general, virtual methods are slightly slower to execute, because the
computer does not know until runtime what instance a reference is referring
to - hence a runtime check needs to be made. If the method is not virtual,
then the compiler can work out at compile-time which method should be called.
Functions (methods, properties, indexers) should be declared as virtual
if it is likely that derived classes will need to implement their own versions
of them. If a method is virtual then it is guaranteed that the correct version
of it for any given class instance will always be called, no matter what
the reference type used to obtain that instance is. If you do not intend
for derived classes to supply their own implementations of a method, then
for performance reasons it should not be declared as virtual. This situation
is illustrated in the above example: The process of logging in is likely
to be identical for all accounts so the Login()
method is not virtual. But a user's administrative rights depend on the type
of account, so this method is likely to be overridden. Hence, it is declared
as virtual. You cannot define constructors or casts as virtual, and destructors
are always automatically virtual.
Define methods, indexers and properties as virtual if derived classes might need to implement different versions of them.
The DoSomethingElse() method is not virtual,
but has had a new version of it defined in the derived class. Usually, you
would not deliberately code up this situation. It may occur, however, if
a new version of a base class is released which has new methods, one of which
happens to have the same name as a method in the derived class. The new method
is said to hide the base class version.
If a method is virtual, then all overrides of it must be declared as override.
If a method is not virtual, then any implementations in derived classes that
hide it should normally be declared as new
. Not doing so is not a syntax error, but it will raise a compiler warning.
This warning will ensure that developers will know if their method is 'accidentally'
hiding a method in the base class, and they can decide what action to take.
Abstract Methods
A method etc. can be declared as abstract . This means that it
is virtual, and also that no implementation is being provided in the base
class. The usual situation for doing this is if your base class is generic,
and intended only to be derived from rather than to be instantiated.
abstract class Device // generic device that can represent any hardware
// device attached to a computer
{
public abstract string GetDeviceName(); // abstract members cannot be private
If a class contains any abstract methods then the class is also considered
to be abstract, and must be marked as such explicitly in the class definition,
as above. It is a syntax error to attempt to instantiate an abstract class.
Device SomeDevice; // OK - just declaring a reference
SomeDevice = new Device(); // WRONG - an attempt was made to instantiate
// a class that has abstract methods
It is fine to derive classes from an abstract class, but it is a syntax
error to instantiate any instances of such classes unless they provide overrides
to all abstract methods that they have inherited.
class HardDrive : Device
{
public override string GetDeviceName()
{
// implementation
}
// later in code
SomeDevice = new HardDrive() // OK
The rules for abstract methods are designed to prevent you from ever instantiating
a class that does not have implementations for all its methods.
In general you will use an abstract class in any situation in which a
number of classes have common functionality. In our example here, Device is
useful as an abstract class because it can be used to encapsulate functionality
that is common to many particular devices: hard drives, CD-ROM drives, etc.
However there's no such thing as a generic device in real life so you probably
wouldn't want to actually instantiate a Device. By using an abstract class
in this situation you can provide more structure to the class hierarchy,
and provide common implementations of those methods that are common across
devices, while leaving abstract implementation-less methods in Device for
those methods whose implementations will be specific to each type of device.
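The following sketch shows one way this division of labour might look. The Describe() method and the CDRomDrive class are hypothetical additions, not part of the earlier example. Describe() is implemented once in Device and shared by every derived class, while GetDeviceName() is left abstract for each device to supply.

```csharp
using System;

abstract class Device
{
    // Implementation-less: each kind of device must supply its own version
    public abstract string GetDeviceName();

    // Common implementation shared by every device (hypothetical method)
    public string Describe()
    {
        return "Device: " + GetDeviceName();
    }
}

class HardDrive : Device
{
    public override string GetDeviceName() { return "Hard drive"; }
}

class CDRomDrive : Device
{
    public override string GetDeviceName() { return "CD-ROM drive"; }
}

class DeviceDemo
{
    static void Main()
    {
        // References are of the abstract type, the objects are of concrete types
        Device[] devices = { new HardDrive(), new CDRomDrive() };
        foreach (Device device in devices)
        {
            Console.WriteLine(device.Describe());
        }
        // Device: Hard drive
        // Device: CD-ROM drive
    }
}
```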
The opposite to abstract classes are sealed classes, from which you are
not allowed to derive other classes. They are marked by the sealed keyword in the class definition:
public sealed class MyUnderivableClass
{
// Implementation
}
A class may be sealed for reasons of security or design. The System.String class of the .NET base class library is an example of a sealed class. Because abstract and sealed
do opposite tasks (one requiring class derivation, the other prohibiting
it), the compiler will not allow both keywords to be applied to the same class.
Structs
Structs are intended for the situation in which you need small, lightweight,
objects that do not have the full functionality of classes. They are particularly
useful when an object simply consists of a couple of pieces of information
grouped together.
You can declare a struct using a similar syntax to that for a class, except that the keyword struct is used instead of class:
struct Booking
{
public string FlightID;
public string CustomerID;
}
The comments in previous sections of this appendix about syntax for classes
also apply to structs, apart from the use of the keyword struct in the initial
declaration and the following differences, each of which is described in the
appropriate part of the appendix.
- It is not possible to define destructors for structs. As noted below,
structs are value types and hence are simply removed from the stack whenever
they go out of scope.
- Member methods of structs may not be declared as virtual.
- It is not possible to define a no-parameter constructor for a struct.
Instead, the compiler always supplies a default constructor (even if
you have defined other constructors).
- Structs are value types whereas classes are reference types.
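The practical effect of the last point is easiest to see on assignment. In the sketch below, BookingStruct and BookingClass are hypothetical types that differ only in the keyword used to declare them. Assigning a struct copies its contents, while assigning a class instance copies only the reference.

```csharp
using System;

struct BookingStruct
{
    public string FlightID;
}

class BookingClass
{
    public string FlightID;
}

class ValueTypeDemo
{
    static void Main()
    {
        BookingStruct s1 = new BookingStruct();
        s1.FlightID = "BA123";
        BookingStruct s2 = s1;          // copies the whole struct
        s2.FlightID = "LH456";
        Console.WriteLine(s1.FlightID); // BA123 - the original is unaffected

        BookingClass c1 = new BookingClass();
        c1.FlightID = "BA123";
        BookingClass c2 = c1;           // copies only the reference
        c2.FlightID = "LH456";
        Console.WriteLine(c1.FlightID); // LH456 - both variables refer to one object
    }
}
```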
Fields
A field is C# terminology for a variable that has been defined as a member
of a class or struct. The rules for declaring and initializing fields are
the same as those for declaring variables that are local to methods. Note
that it is quite acceptable to initialize a field from within the class
definition, by indicating the constructor to be called or the initial value of the field.
class Automobile
{
ushort NWheels = 4;
The preferred way to initialise a field for which the initial value is
known at compile time is in the class definition as above. However, fields
may also be initialised in constructors.
class Truck
{
ushort NWheels;
Truck(string Make, string Model)
{
// code to check against the database how many wheels
// this sort of vehicle has. Assume result is returned as X
NWheels = X; // whatever value we find out
}
By initializing a field in a constructor you gain the flexibility of being
able to initialise it with a value computed at runtime. However you need
to take care to ensure that the value is initialised correctly in all constructors
that you define.
const Fields
A field may be declared as const, in
which case it is treated as a fixed value rather than a variable. Its value
must be assigned in the source code as part of its declaration and must be known
at compile time.
class Password
{
public const uint MaxLength; // WRONG - no initial value
public const uint MaxLength = 20; // OK
public int X = 20;
public const uint MaxLength = X; // WRONG - const set to a non-const
// value. Hence compiler can't be sure of the value
public const uint TwiceMaxLength = 2*MaxLength;
// OK. Compiler can work this out
In this regard, the rules for const fields are identical to those for const local variables.
One important point about const fields is that they are always implicitly static (although it is a compilation error to explicitly declare them as such):
class Password
{
public static const uint MaxLength = 20;
// WRONG - const is already static. No need to declare it as such
Because const fields are implicitly static,
they must be accessed in the same way as static variables: using the name of the class,
not the name of an instance.
Password ApplicationPassword = new Password();
uint Length = ApplicationPassword.MaxLength; // WRONG -
// static variable mustn't be accessed using variable
// name
uint Length2 = Password.MaxLength; // OK -
// correct syntax for static field
readonly Fields
Read-only fields are intended for the situation in which you want a field
to be treated as a constant, but you will not know until runtime what the
value will actually be. The syntax for declaring a read-only variable is
as shown in this example, in which we assume we are writing a class to represent
Internet sessions.
class Session
{
protected readonly string SessionID;
Note that in this example, for the sake of variety, we've illustrated a protected
field, but read-only fields can have any access level, as can const fields.
Unlike const fields, read-only fields
do not have to be initialized in the class definition, though you
may do so if you wish. The normal practice is to initialize read-only fields
in the constructor(s).
public Session(string IPAddress)
{
SessionID = GetIDFromDatabase(IPAddress);
}
Read-only fields may be set in their declaration or in constructors, but may not
be assigned to anywhere else. Hence their values remain constant from the moment that the
containing class has been constructed.
In contrast to const fields, read-only
fields are not necessarily static. As the above example shows, it is perfectly
possible for each instance of a class to have its own value of a read-only
field. However, you may declare read-only fields as static if you wish.
class Session
{
public static readonly string ServerIPAddress;
protected readonly string SessionID;
The only place where static read-only fields may be assigned to is inside the static constructor.
static Session()
{
ServerIPAddress = GetMyIPAddress();
}
Note that it is not permitted to assign to static read-only fields inside
any instance constructor (though they may be initialized inside the class
definition).
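The rules for read-only fields can be pulled together in one compilable sketch. The string values below are stand-ins for the run-time lookups GetMyIPAddress() and GetIDFromDatabase() used earlier, and the GetID() method is a hypothetical addition so that the protected field can be displayed. The commented-out lines show assignments that the compiler would reject.

```csharp
using System;

class Session
{
    public static readonly string ServerIPAddress;  // one value shared by all sessions
    protected readonly string SessionID;            // one value per session

    static Session()
    {
        // Stand-in for a run-time lookup such as GetMyIPAddress()
        ServerIPAddress = "192.0.2.1";
    }

    public Session(string ipAddress)
    {
        // Stand-in for GetIDFromDatabase(ipAddress)
        SessionID = "ID-" + ipAddress;
        // ServerIPAddress = ipAddress; // would not compile: static read-only
        //                              // field assigned in an instance constructor
    }

    public string GetID()
    {
        // SessionID = "changed";       // would not compile: assignment
        //                              // outside a constructor
        return SessionID;
    }
}

class SessionDemo
{
    static void Main()
    {
        Session first = new Session("198.51.100.7");
        Session second = new Session("198.51.100.8");
        Console.WriteLine(first.GetID());            // ID-198.51.100.7
        Console.WriteLine(second.GetID());           // ID-198.51.100.8
        Console.WriteLine(Session.ServerIPAddress);  // 192.0.2.1
    }
}
```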
Methods
A method is a function that is defined as a member of a class or struct, and that
may be called at any time using the usual syntax for a function call (as opposed to
properties and indexers, which are wrapped in a syntax that makes them look like fields
and array elements to the calling code).
Calling Methods
The syntax for calling a method is simply to write the method name, followed
by any parameters. As with all other class members, it may be necessary to
prefix the method name with the details of the class or class instance that
contains the method. The following example uses two of the .NET base classes,
System.Windows.Forms.RichTextBox and System.Windows.Forms.MessageBox.
RichTextBox TextBoxFile = new RichTextBox();
TextBoxFile.LoadFile(@"C:\My Documents\ReadMe.txt");
MessageBox.Show("File has been loaded"); // Show is a static method so
                                         // the class name is used
If the method returns a value, then the method can be syntactically treated
as a read-only variable from the point of view of obtaining the value.
if (MessageBox.Show("Proceed?") == DialogResult.OK) // Show() needs at least the message text
DoSomething();
Defining Methods
The syntax is as follows.
public void DoSomething( /* any parameters here */)
{
// implementation
}
Note that this means the entire code for the method must be specified
within the definition of the class of which the method is a member.
The return type of the method must be indicated. This may be any data
type known to the compiler (a predefined data type such as long or ushort, or the name of a user-defined data type). Alternatively, a return type of void may be indicated, which means that the method does not return anything.
Within the method, every execution path must end in an explicit return statement (or throw an exception), except that in the case of a void
method only, it is acceptable for the return to be implied when execution
hits the closing curly brace that marks the end of the method definition:
static public void Main(string[] args)
{
Console.WriteLine("Main method");
// return implied because method returns void
}
ref and out Parameters
By default parameters are passed by value to methods. This means that
a copy of the value is made for the benefit of the method called. For value
types this means that the method cannot alter the value of the variable passed
in, as seen by the calling method. It also means that expressions as well
as variables can be passed in.
int AddNumbers (int A, int B)
{
return A+B;
}
// later
int Result = AddNumbers(10, 2*X);
For reference types, the copy that is made is a copy of the reference variable,
which simply contains the address of the object. The object referred to is not
copied, so the method works on the same object as the caller and can alter its
contents. What the method cannot do is make the caller's variable refer to a different object.
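A short sketch may help to separate two things that are easily confused here: altering the object that a parameter refers to, and reassigning the parameter itself. The Customer class and the Rename() and Replace() methods are hypothetical names chosen for illustration.

```csharp
using System;

class Customer
{
    public string Name;
}

class ParameterDemo
{
    // Alters the object that both the caller and the method refer to
    public static void Rename(Customer customer)
    {
        customer.Name = "Renamed";
    }

    // Alters only the method's own copy of the reference
    public static void Replace(Customer customer)
    {
        customer = new Customer();
        customer.Name = "Replacement";
    }

    static void Main()
    {
        Customer c = new Customer();
        c.Name = "Original";

        Rename(c);
        Console.WriteLine(c.Name);  // Renamed

        Replace(c);
        Console.WriteLine(c.Name);  // Renamed - the caller's reference is unchanged
    }
}
```

If Replace() had declared its parameter as ref Customer, the second call would have changed which object the caller's variable refers to.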
If you do want a method to be able to alter the value of a parameter, you may declare the parameter as a ref parameter or as an out parameter. Either way will allow the parameter to be modified by the method.
If a value type is passed as a ref or out
parameter to a method then the method actually receives the address of the
variable - and hence can use this address to access or modify the actual
variable from the calling routine. This also carries performance advantages
for structs, since only the address is copied rather than the entire contents
of the struct.
The ref and out parameters behave in the same way except that the rules for initializing them are different.
- You should use a ref parameter when the intention is that the called method should modify the value of the variable. The C# compiler requires that ref parameters are initialized before being passed to a method.
- You should use an out parameter when
the intention is that the called method should initialize the value of the
variable. out parameters do not need to be initialized before being passed to a method,
but within the called method the compiler regards them as uninitialized.
Hence they must be assigned to before the method returns and before their
values are read.
Use of ref and out parameters is illustrated by this example.
// converts value to its square and fills a bool to indicate sign of value
void Square(ref double value, out bool isNegative)
{
isNegative = value < 0;
value *= value;
}
The ref or out keywords must be specified explicitly when calling a method that takes ref or out parameters:
double Quantity = 4.0;
bool IsNegative;
Square (ref Quantity, out IsNegative);
Method Overloads
It is permitted in C# to supply more than one method in a class with the
same name, but different signatures (types and numbers of parameters). A
good example of this is the WriteLine() method of the .NET base class System.Console, which writes output to the console window. For example, the available overloads include:
public static void WriteLine(string value)
{
    // writes out the string
}
public static void WriteLine(bool value)
{
    // writes out string representation of the bool
}
Defining overloads is no different to defining any other method. However,
calling them involves slightly different principles, as the compiler will
need to work out which overload of a method is the one you intend to call.
When the compiler encounters a call to an overloaded method, it examines
the number and types of the parameters being passed in. If an overload exists
that takes exactly those types of parameters, then it calls that overload.
If no exact match exists, then it will examine those overloads that require
the same number of parameters, and check whether the parameters actually
being passed in can be implicitly converted to match one of those overloads. If
exactly one overload can be matched in this way, it will pick that overload. If
more than one overload could be matched, then it will pick the 'best' one using
certain criteria, and report a compilation error if no single overload is best.
Full details are in the MSDN documentation, but in general you are advised not
to code in a manner that leaves uncertainty over which overload of a method will
be called, but instead to explicitly cast the parameters to match the required
overload.
Note that the return type does not distinguish overloads. It is not
permitted to write overloads of a method that differ only in the return type.
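As an illustration of how overloads are selected, consider the following sketch. The Describe() method is hypothetical, and each overload simply reports which parameter type was chosen. The calls show an exact match, two matches found through implicit conversion, and an explicit cast used to force a particular overload.

```csharp
using System;

class OverloadDemo
{
    public static string Describe(int value)    { return "int"; }
    public static string Describe(long value)   { return "long"; }
    public static string Describe(double value) { return "double"; }
    public static string Describe(string value) { return "string"; }

    static void Main()
    {
        short small = 5;
        float single = 1.5f;

        Console.WriteLine(Describe(10));          // int    - exact match
        Console.WriteLine(Describe(10L));         // long   - exact match
        Console.WriteLine(Describe(small));       // int    - closest implicit conversion from short
        Console.WriteLine(Describe(single));      // double - only implicit conversion available
        Console.WriteLine(Describe((long)small)); // long   - cast makes the intent explicit
    }
}
```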
Variable Argument Lists
It is possible to define a method that takes an unspecified number of arguments, by applying the params keyword to an array parameter, which must be the last parameter in the list:
void MyMethod(params object[] args)
{
// etc.
What then happens when the method is called is that the compiler takes
the arguments passed in and packages them up into an array to pass to the
method.
int X, Y;
string Z;
// initialize these variables
MyMethod(X, Y, Z); // MyMethod will receive an array of size 3
For the most general case, shown above, you will declare the array as
of type object, which means that calling methods can pass in any variables
whatsoever. If you know however that the variables to be passed in will all
be derived from a certain class, you can declare the array to be of that
class:
void MyMethod(params System.EventArgs[] args)
{
// etc.
Packaging up the parameters into an array does hit performance, so you
are better off providing explicit overloads of the method for the particular
sets of parameters you know you are likely to require. The compiler will
only call the variable-argument-list overload of a method if it cannot find
any other overloads that are suitable.
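The way the compiler chooses between an explicit overload and a variable-argument-list overload can be sketched as follows. The Sum() method is a hypothetical example, and each overload returns a string that reports which version ran.

```csharp
using System;

class ParamsDemo
{
    // Explicit overload for the common two-argument case
    public static string Sum(int a, int b)
    {
        return "two-argument overload: " + (a + b);
    }

    // Variable-argument-list overload for everything else
    public static string Sum(params int[] values)
    {
        int total = 0;
        foreach (int v in values)
        {
            total += v;
        }
        return "params overload (" + values.Length + " values): " + total;
    }

    static void Main()
    {
        Console.WriteLine(Sum(1, 2));     // two-argument overload: 3
        Console.WriteLine(Sum(1, 2, 3));  // params overload (3 values): 6
        Console.WriteLine(Sum());         // params overload (0 values): 0

        // An existing array may also be passed directly, with no repackaging
        Console.WriteLine(Sum(new int[] { 4, 5 }));  // params overload (2 values): 9
    }
}
```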
Properties
A property consists of a pair of methods (or in some cases, as detailed below,
a single method) which are syntactically dressed up so that they look like
a field.
Using Properties
The idea of a property is that code that calls the property effectively sees a member field.
MyControl.Width = 500;
int HalfWidth = MyControl.Width/2;
However, the property in fact contains two methods known as the get accessor and the set accessor.
Attempts to set the property actually result in a call to the set accessor
method, while attempts to read the value of the property actually result
in a call to the get accessor method. The above example shows this clearly
since, for any .NET control, setting the width actually changes the width
of the control on the screen - hence some code must be
executed in response to the first line of the above code.
In order for this concept to work, the set accessor must return void
and take one parameter, which is of the same data type as defined for the
property. The get accessor must take no parameters and return the data type
as defined for the property.
Defining Properties
The syntax for defining a property is as follows
public int Width
{
set
{
// code to set Width
}
get
{
// code to get Width
}
}
The syntax is as for defining a field, except that in place of the closing
semicolon is a pair of braces containing the two accessor methods, which
are marked by the keywords get and set . The two accessors may appear in either order.
As an example of this, we will assume that the property Width is implemented to simply access a member variable, width . The code will then look as follows.
private int width;
public int Width
{
set // signature is implicitly void set (int value)
{
width = value;
}
get // signature is implicitly int get();
{
return width;
}
}
Notice the different case of the field width and the property Width.
This kind of situation is a good example of how the case-sensitivity of C#
can be exploited.
Note that the parameter supplied to the set accessor is not explicitly
mentioned in the definition of the accessor. However, within the body of
the accessor, the value keyword is used to indicate this parameter.
Either of the two accessors may be omitted from the declaration. If the
set accessor is omitted then the property is read-only, and code that refers
to this property will only compile if it attempts to read the property, not
if it attempts to write to it:
class SomeControl
{
public int Width
{
get // signature is implicitly int get();
{
return width;
}
}
// etc.
}
// later on
// MyControl is a reference to a SomeControl instance
int HalfWidth = MyControl.Width/2; // OK
MyControl.Width = 800; // WRONG. Width is read-only
Supplying a property with only a get accessor (a read-only property)
is the usual way that you can provide read-only access to a member field.
Conversely, if the get accessor is omitted the property is a write-only
property. Although this is legal syntax it is not regarded as good programming
style. Write-only properties should normally be implemented as methods. Just
as with read-only properties, if you attempt to access an accessor that doesn't
exist, this will be picked up as a compilation error.
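Because accessors are ordinary code, they can do more than copy a value to or from a field. The sketch below is a hypothetical SomeControl class. Its Width property rejects negative values in the set accessor (an invented rule, purely for illustration), and its read-only HalfWidth property computes a value on demand rather than storing it.

```csharp
using System;

class SomeControl
{
    private int width;

    public int Width
    {
        get
        {
            return width;
        }
        set
        {
            // Hypothetical rule for this sketch: negative widths are treated as zero
            width = (value < 0) ? 0 : value;
        }
    }

    // Read-only property: no set accessor, value computed when requested
    public int HalfWidth
    {
        get
        {
            return width / 2;
        }
    }
}

class PropertyDemo
{
    static void Main()
    {
        SomeControl control = new SomeControl();

        control.Width = 500;                   // calls the set accessor
        Console.WriteLine(control.HalfWidth);  // 250

        control.Width = -10;                   // set accessor substitutes zero
        Console.WriteLine(control.Width);      // 0

        // control.HalfWidth = 100;            // would not compile: read-only
    }
}
```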
Indexers
An indexer is very similar to a property, except that instead of being
dressed syntactically to look like a field, an indexer has the effect of
allowing its containing class to be treated as if it was an array, with the
indexer being the method that gets called when you attempt to access an element
of the array.
Calling Indexers
As an example, if a class Vector has an indexer that takes a long as a parameter and allows access to its (double) components, then we could write:
Vector SomeVector = new Vector(); // assume constructor adequately
// initializes Vector
SomeVector[0] = 40.0;
double Y = SomeVector[1];
Although we are used to thinking of arrays as being accessed using an
integer data type, indexers may be defined to take any data type as the parameter.
For example, components of vectors are, customarily, referred to as X, Y
and Z. If we wished to be able to access components of the vector using the
names of the components as indexes, then we can define an indexer which takes
a string as its parameter. Then we would be able to write this.
SomeVector["X"] = 40.0;
double Y = SomeVector["Y"];
It is also possible to define indexers that take more than one parameter.
This is analogous to a rectangular multidimensional array. For example if
you define a Matrix class you would probably want to be able to access its elements (which we will assume are of type double) by representing the matrix as a 2D array. Then you could write:
Matrix SomeMatrix;
// initialize SomeMatrix
double Element00 = SomeMatrix[0,0];
SomeMatrix[1,2] = 10.0;
In both of the above examples the return type from the indexer is a double . In fact, the return type can be any data type you wish.
Defining Indexers
An indexer is defined using the same syntax as for a property, except
that we have to specify the parameters. For example, an indexer that takes
one long parameter and returns a string would be defined like this.
public string this [long index]
{
get // implicitly returns a string and takes one parameter, index
{
// code to return value
}
set // implicitly returns void and takes two parameters, value (a
// string) and index (a long)
{
// code to set value
}
}
Note that in this code sample, the name of the parameter is up to you. We picked index
in this case, but that is just a variable name - it's not a keyword, unlike
the implicit parameter to the setter, which is always named value.
We can use our Matrix example to demonstrate a full implementation of an indexer that takes two longs as parameters and returns a double. We'll assume here that the Matrix indexer simply wraps around an array field of type double[,].
class Matrix
{
private double [,] Elements; // assume this gets initialized
// somewhere
public double this [long row, long column]
{
get // implicitly returns a double and takes two long params
{
return Elements[row,column];
}
set // implicitly returns void and takes two long and one
// double parameter
{
Elements[row,column] = value;
}
}
As with properties, the get accessor method must return the same data
type as the return type of the indexer itself, while the set accessor must
return void and also takes an implicit parameter, value,
which will be the value passed to the indexer as the right hand side of
an assignment. Accessors also implicitly take the same parameters as explicitly
defined for the indexer (row and column in the above example).
Also, as with properties, it is permissible to omit one of the accessors
from the definition of the indexer, in which case the indexer will be either
read-only or write-only.
As with methods, it is possible to provide as many overloads of an indexer
as you wish, with the proviso that the overloads must be identifiably different
in their signatures. The compiler will choose the closest matching overload
in each given context.
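A minimal sketch of overloaded indexers, assuming a Vector class that stores its three components in an array. The private IndexOf() helper and the choice to throw an ArgumentException for an unknown component name are assumptions of this sketch.

```csharp
using System;

class Vector
{
    private double[] components = new double[3];

    // Indexer taking a numeric position
    public double this[long index]
    {
        get { return components[index]; }
        set { components[index] = value; }
    }

    // Overloaded indexer taking the name of a component
    public double this[string name]
    {
        get { return components[IndexOf(name)]; }
        set { components[IndexOf(name)] = value; }
    }

    // Hypothetical helper that maps a component name to its position
    private static long IndexOf(string name)
    {
        switch (name)
        {
            case "X": return 0;
            case "Y": return 1;
            case "Z": return 2;
            default: throw new ArgumentException("Unknown component: " + name);
        }
    }
}

class IndexerDemo
{
    static void Main()
    {
        Vector someVector = new Vector();
        someVector[0] = 40.0;     // long overload (the int literal is converted)
        someVector["Y"] = 3.0;    // string overload

        Console.WriteLine(someVector["X"] == 40.0);  // True - same element as [0]
        Console.WriteLine(someVector[1] == 3.0);     // True - same element as ["Y"]
    }
}
```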
Constructors
Constructors are similar to methods, except they do not have a return
type and are really intended to initialize class instances when they are
created.
Calling Constructors
Constructors are always called with the new operator. The syntax for
calling constructors was actually given earlier
in the appendix, in the section about initializing variables:
int NCustomers = new int();
The above code shows the explicit syntax for calling a constructor for the predefined data type, int - this constructor takes no parameters.
Since int is a predefined data type,
the constructor is already defined. However, if you define your own classes
or structs, then you have the option to define your own constructors, with
or without parameters.
Although we remarked that constructors are really intended for initialization,
there is nothing wrong with calling the constructor against a variable after
it has been initialized. For value types, this will have the effect of resetting
all fields to the values specified in the constructor. For reference types,
it will have the effect of creating a new instance of the class on the heap,
and reassigning the reference the constructor is called against to the new
instance.
int NCustomers = 50;
// do stuff with NCustomers
NCustomers = new int(); // example with value type.
                        // Resets NCustomers to zero
Customer NextCustomer = new Customer();
// do stuff with NextCustomer
NextCustomer = new Customer(); // example with reference type
//Creates new Customer instance
For the predefined data types as shown above, there is little point in
using the constructor syntax since it is easier to write, for example, NCustomers = 0 in the above case. However, you may find this technique useful for your own classes and structs.
Defining Constructors
You do not have to define constructors for your classes and structs. If
you do not do so then the compiler will automatically supply a default constructor
for each class or struct that you define. The default constructor will work
by simply zeroing out those member fields which have not been explicitly
initialized in the class definition, and will take no parameters.
The rules for what constructors you can add and the effect of this on
the default constructor are slightly different for classes and for structs:
- For classes you can replace the default constructor by supplying your own constructor that takes no parameters.
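- For structs, you cannot replace the default constructor. It is considered a syntax error to do so. (This applies to the version of C# described here. C# 10 and later do permit a struct to declare its own parameterless constructor.)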
- For both classes and structs you can supply as many overloads that take
parameters as you wish to the constructor - provided these overloads have
different signatures (in other words the number and type of the parameters
are different for each constructor).
- For classes, if you supply any constructors of your own, then no default
constructor will be generated. This means that if you supply parameterized
constructors but no parameterless constructor, then it will not be possible
to instantiate the class without specifying parameters.
- For structs, the default constructor is always present, even if you add your own constructors.
The syntax for defining constructors is very similar to that for defining
methods, except that the constructor does not have a return type - the thing
implicitly returned from a constructor is the instance that has just been
created. The constructors are distinguished from methods in the class definition
because they have the same name as the name of the class.
As an example of a user-defined constructor, we'll assume we want to write an Employee class, where employees must have a name that must be supplied when the Employee is instantiated. Hence we might write this.
class Employee
{
private string name;
public Employee(string name)
{
this.name = name;
}
This example illustrates the most common use for parameterized constructors:
to store the values of the parameters in member fields - but in fact you
can place any legal code in the constructor.
With the above example, you can then instantiate an employee like this:
Employee TheBoss = new Employee("Mr Gates");
However, the following code would fail, because by supplying a user-defined
constructor, we have prevented the compiler from supplying a default one.
Employee NewRecruit = new Employee(); // compilation error -
// no no-parameter constructor
If you want to be able to construct employees like that, then you would
need either to supply a no-parameter constructor explicitly, or not to supply
any constructor of your own at all.
class Employee
{
private string Name;
public Employee(string name)
{
Name = name;
}
public Employee() // now the above code will work
{
}
By contrast, if Employee had been declared as a struct, there would always be a default constructor, so it would always be possible to write:
Employee NewRecruit = new Employee(); // OK if Employee declared as struct
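The struct case can be tried out with a sketch along the following lines. The Money struct is a hypothetical example. It defines only a two-parameter constructor, yet the no-parameter form is still available and leaves every field zeroed (null in the case of a string).

```csharp
using System;

struct Money
{
    public decimal Amount;
    public string Currency;

    // The only constructor defined explicitly
    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }
}

class StructConstructorDemo
{
    static void Main()
    {
        Money price = new Money(10M, "USD");
        Money empty = new Money();   // default constructor is still present

        Console.WriteLine(price.Currency);          // USD
        Console.WriteLine(empty.Amount == 0M);      // True
        Console.WriteLine(empty.Currency == null);  // True
    }
}
```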
Calling Constructors from Other Constructors
It is acceptable for one constructor to call another constructor, but you must do so using a precise syntax known as a constructor initializer.
The usual reason for calling other constructors is to supply default
values for a constructor's parameters without duplicating the code in each constructor.
For example, suppose it is possible to construct our Employee class in one of two ways:
- Supplying the name of the employee
- Supplying the name and salary of the employee
If the salary is not supplied, it is assumed to be 20,000 dollars. This could be implemented using the following code.
class Employee
{
private string name;
private decimal salary;
public Employee(string name)
: this(name, 20000M)
{
// if you want to do anything else here you can do so
}
public Employee(string name, decimal salary)
{
this.name = name;
this.salary = salary;
}
From this we see that the syntax for the constructor initializer consists of a colon, followed by the keyword this,
followed by any parameters to be passed to the other constructor. There is
no semicolon after the constructor initializer. The compiler will determine
from the number and type of these parameters which other constructor is to
be called, according to the usual rules for method overloads.
It is important to understand that the constructor initializer is executed before the constructor body. This has obvious implications for which variables you can rely on having been initialized at any given point.
It is alternatively possible to call a base class constructor in the constructor
initializer. To do this you replace the keyword this with the keyword base in the above code.
For example, suppose the class Manager is derived from Employee, and you want the constructor for Manager to be roughly the same but with an additional parameter, bonus. You might implement the Manager constructor as follows.
class Manager : Employee
{
decimal bonus;
public Manager(string name, decimal salary, decimal bonus)
: base(name, salary)
{
this.bonus = bonus;
}
It is not possible to place more than one other constructor in the constructor
initializer. Nor is it possible to place anything other than a call to
another constructor of this class or a constructor of the base class in it.
In other words, the constructor initializer, if specified, always
has one of the following two formats:
: this(/*parameters*/)
: base(/*parameters*/)
If you do not specify the constructor initializer, then the compiler will
supply one for you, which calls the base no-parameter constructor. Hence,
for any constructor the following two definitions are equivalent.
// with the constructor initializer written out
class MyClass : MyBaseClass
{
    public MyClass()
        : base()
    {
        // etc.
    }
}
// with the constructor initializer left to the compiler
class MyClass : MyBaseClass
{
    public MyClass()
    {
        // etc.
    }
}
A consequence of this is that, since the constructor initializer is always
executed before the constructor body, the constructors of any class will
always be executed in the order of going down the hierarchy. In other words,
the System.Object constructor will always
be executed first, followed by the constructor(s) of the first derived class,
followed by the next class, etc. until the constructor(s) for the class we
are actually instantiating is executed last.
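The order of execution can be observed with a sketch like the following. It reuses the Employee and Manager names from above, but the static Log field is a hypothetical addition that records each constructor body as it runs. Constructing a Manager passes through a base initializer and then a this initializer before any body executes.

```csharp
using System;

class Employee
{
    // Records the order in which constructor bodies run (illustration only)
    public static string Log = "";

    public Employee(string name)
        : this(name, 20000M)
    {
        Log += "Employee(name);";
    }

    public Employee(string name, decimal salary)
    {
        Log += "Employee(name, salary);";
    }
}

class Manager : Employee
{
    public Manager(string name)
        : base(name)
    {
        Log += "Manager(name);";
    }
}

class ConstructorOrderDemo
{
    static void Main()
    {
        new Manager("Ms Example");
        Console.WriteLine(Employee.Log);
        // Employee(name, salary);Employee(name);Manager(name);
    }
}
```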
class MyBaseClass
{
private MyBaseClass()
{
}
// etc.
}
class MyClass : MyBaseClass
{
public MyClass() // WRONG. This implicitly calls base(),
// but that constructor is private hence not visible
// in the derived class. Will give compilation error
{
// etc.
Static Constructors
Besides the instance constructors, it is also possible to define one static
constructor in each class. The syntax looks like this.
class MyClass
{
static MyClass()
{
// do stuff here
}
The static constructor is used to initialize any static or static read-only
fields. A static constructor is guaranteed to be executed only
once. You cannot determine precisely when it will be executed, but it will
be before any other code instantiates objects of the class or accesses
any static members. In other words, the static constructor will be called
before any other code attempts to use the class.
A typical example of the use of static constructors was presented earlier, in the section on read-only fields:
class Session
{
public static readonly string ServerIPAddress;
protected readonly string SessionID;
static Session()
{
ServerIPAddress = GetMyIPAddress();
}
As with static methods, it is not possible to access any instance data
or fields from the static constructor. Syntactically, the static constructor
never takes any parameters, does not have a return type, and does not even
have an access modifier. (It would be meaningless to describe the static constructor,
for example, as public or protected, since the static constructor is called
only by the .NET run-time, never by any other C# code.)
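The once-only guarantee can be checked with a small sketch. The Counter class and its StaticConstructorCalls field are hypothetical names. The static constructor increments the counter, and the counter still reads 1 however many instances are created.

```csharp
using System;

class Counter
{
    public static int StaticConstructorCalls;

    static Counter()
    {
        // Runs once, before the class is first used
        StaticConstructorCalls++;
    }

    public Counter()
    {
        // Instance constructor: runs for every new object
    }
}

class StaticConstructorDemo
{
    static void Main()
    {
        new Counter();
        new Counter();
        new Counter();
        Console.WriteLine(Counter.StaticConstructorCalls);  // 1
    }
}
```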
Destructors
Where constructors are called automatically to initialize an object, destructors
are designed to provide the ability for an object to clean up any resources
it has been using when it is destroyed.
In general, most classes will not need destructors, since any managed
memory being used by a reference object will automatically be removed when
the object is garbage collected, while memory on the stack that is occupied
by a value object will be freed when the object goes out of scope. The only
objects for which destructors are needed are those that maintain external
resources that are not managed by the .NET runtime - for example connections
to files or databases or old-style Windows handles.
Defining a destructor is more complex than defining a constructor, since
a well-designed destructor will be implemented in two stages. On the other
hand, as just remarked, it will not be required at all for most classes.
Destruction of structs works differently from destruction of classes.
Much of the content of this section applies only to classes. You may have
structs implement IDisposable and supply a Dispose() method on them, but C#
does not permit destructors themselves to be defined for structs.
In order to have a class clean up its resources, you should:
- Derive the class from the IDisposable interface, and provide an implementation of this interface's Dispose() method.
- Implement the destructor itself.
The idea is this:
- Any client code that uses the class should call Dispose() explicitly when it has finished with a given object. Dispose() should be implemented, for example, to free any external resources that are being used by the object. Calling Dispose() means that these resources will be freed promptly.
- The destructor itself will be called automatically when the garbage collector
destroys the object. This serves as a backup mechanism just in case the client
code, for whatever reason, does not call Dispose() .
The reason for treating the destructor as purely a backup rather than
as the primary mechanism for cleaning up resources is twofold: Firstly, there
is no guarantee about precisely when it will be executed, because it is called
by the garbage collector, and there is no guarantee about when the garbage
collector will run. This means that if we relied solely on the garbage collector
to free up external resources then those resources would most likely not
be freed until considerably later than the time at which they become no longer
needed. Secondly, executing a destructor does affect the performance of the
garbage collector. Also, because of the way the garbage collector has been
implemented, objects that have destructors will not actually be destroyed
until the second time that the garbage collector is called - so their lifetimes
will be longer than needed. Because of this, it's desirable to avoid calling
destructors where possible, and in fact a method has been provided by one
of the .NET base classes, System.GC.SuppressFinalize() , which should be called from the Dispose() method and which notifies the .NET runtime that there is now no need to run the destructor for a given object.
Disposing of objects successfully requires cooperation between the object
and the client code that instantiates the object. In the next section we'll
see what the client code has to do, and in doing so we'll see how the destructor
is actually defined.
Disposing Objects
Client code should be implemented to call Dispose()
on an object that implements this method when it has finished with the object.
Client code cannot call the destructor itself - that is called by the garbage
collector.
Suppose we have a class, FileConnection , which contains a link to an external resource. Then we would use it like this.
FileConnection MyObject = new FileConnection();
// use MyObject
MyObject.Dispose();
// should not use MyObject again
However, C# provides an alternative syntax which will result in Dispose() being called automatically:
using (FileConnection MyObject = new FileConnection())
{
// code that uses MyObject
}
If the using statement is applied in
this way, the variable defined in the parentheses will be scoped to the following
block statement. When execution flow leaves the block, by whatever means
(return, break, an exception, or execution simply reaching the closing brace), MyObject.Dispose() will be called implicitly.
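As a rough sketch of this behavior, the code below throws an exception from inside a using block and then checks that Dispose() still ran. The FileConnection class here is a stand-in written for this example, and its IsDisposed field is a hypothetical addition that lets us observe the call.

```csharp
using System;

// Stand-in for a class that wraps an external resource.
class FileConnection : IDisposable
{
    public bool IsDisposed = false;

    public void Dispose()
    {
        IsDisposed = true;
        Console.WriteLine("Dispose() called");
    }
}

class Program
{
    // Leaves the using block by throwing, and returns the connection
    // so that the caller can inspect it afterwards.
    public static FileConnection UseAndThrow()
    {
        FileConnection seen = null;
        try
        {
            using (FileConnection conn = new FileConnection())
            {
                seen = conn;
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Dispose() has already run by the time control arrives here.
        }
        return seen;
    }

    static void Main()
    {
        FileConnection conn = UseAndThrow();
        Console.WriteLine(conn.IsDisposed);   // prints True
    }
}
```

In effect the using statement behaves like a try block whose finally block calls Dispose() on the variable.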
Defining Destructors
The destructor itself is defined using the following syntax:
class MyClass
{
~MyClass()
{
// cleanup code here
}
The compiler recognizes a destructor because it has the same name as the
containing class, but is preceded by a tilde (~). The destructor must not
take any parameters, or have any return type or access modifiers.
As we have said, you should also derive the class from IDisposable and implement the Dispose() method, which takes no parameters and returns void. In general, the Dispose() method will contain the same cleanup code as the destructor, as well as calling GC.SuppressFinalize() to prevent the destructor from being called subsequently.
class MyClass : IDisposable
{
~MyClass()
{
// cleanup code here
}
public void Dispose()
{
// cleanup code here
GC.SuppressFinalize(this);
}
From this code we see that SuppressFinalize() takes one parameter - an object reference that indicates the object whose destructor is no longer required.
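One way to avoid writing the cleanup code twice is to place it in a private helper that both the destructor and Dispose() call. The sketch below does this; the ResourceHolder class, its Release() helper and the ReleaseCount counter are our own illustrative names, not part of any base class.

```csharp
using System;

class ResourceHolder : IDisposable
{
    private bool released = false;
    // Counts real cleanups so the example can show that cleanup happens once.
    public static int ReleaseCount = 0;

    private void Release()
    {
        if (released) return;   // makes repeated calls harmless
        released = true;
        ReleaseCount++;
        // free the external resource here
    }

    ~ResourceHolder()
    {
        Release();              // backup path, used only if Dispose() was never called
    }

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);   // the destructor is no longer needed
    }
}

class Program
{
    static void Main()
    {
        using (ResourceHolder holder = new ResourceHolder())
        {
            // use holder
        }
        Console.WriteLine(ResourceHolder.ReleaseCount);   // prints 1
    }
}
```

Guarding the helper with a flag also means that a client that calls Dispose() twice by mistake does no harm.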
Other Variations
The above sequence is not the only possibility, but it represents the
usual recommended way of implementing a disposal sequence. In particular
any of the stages just described can be omitted. However if you omit to implement
Dispose() performance will suffer since
disposal will always happen via the destructor. If you omit to implement
the destructor itself, then there is no safeguard against a badly behaved
client forgetting to call Dispose() . (This
is however the only real option for structs for which you cannot define a
destructor, since structs are not garbage collected).
You could also theoretically implement Dispose() without deriving from IDisposable
, but you don't save any coding effort that way, and it means that clients
cannot use the interface to dynamically find out if your class implements
Dispose() . A side effect of this is that attempting to use the using syntax to automatically have Dispose() called for such an object will cause a compilation error since this syntax relies on the object implementing IDisposable .
In some situations you may want to be able to free external resources,
while keeping the option of reopening them later. This might be the case
for example for a class that implements a database connection, where you
might wish to close the connection for periods when you are not accessing
the database. For this purpose, Microsoft recommend that you implement a
second method, Close(), which should return void and be implemented in the same way as Dispose(), except that it will not call GC.SuppressFinalize().
Operator Overloads
C# allows you to define overloads of operators. The lists below show
which operators may be overloaded. You should note that there are some operators
that cannot be overloaded because their meanings are fixed (for example =),
while other operators cannot be overloaded but are constructed from other
operators that can. For example, the compiler always breaks down the addition-assignment
operator, +=, into its components (+ and =). Hence if you overload the addition operator (+)
for a particular class or struct, then the corresponding addition assignment
operator will automatically have its meaning defined by your + operator overload.
The following operators may be overloaded:
- The binary arithmetic operators: +, -, *, /, %
- The unary arithmetic operators: +, -
- The prefix versions of the unary operators ++ and -- (not the postfix versions)
- The comparison operators: !=, ==, <, <=, >, >=
- The binary bitwise operators: &, |, ^, >>, <<
- The unary operators: !, ~
- The operators true, false
The following operators cannot be overloaded.
- The arithmetic assignment operators *=, /=, +=, -=, %= (these are worked out implicitly by combining the corresponding arithmetic operator overload with the assignment operator).
- The bitwise assignment operators &=, |=, ^=, >>=, <<= (these are worked out implicitly by combining the corresponding bitwise operator overload with the assignment operator).
- The Boolean operators &&, || (these are worked out by the compiler from the corresponding bitwise operator overloads, & and |, together with the true and false operators).
- The postfix increment and decrement operators, ++ and --. These are worked out implicitly from the overloads to the prefix ++ and -- operators.
- The assignment operator, =. The meaning of this operator in C# is fixed.
- The pointer-related operators, unary *, unary & and ->; the meanings of these operators are fixed.
- The ternary operator: ? :
It is also not possible to overload the member access operator (.), or any of the forms of brackets (), [] and {}.
Note that there are a couple of restrictions on overloading the comparison operators. Overloads of them should normally return bool, and they must be overloaded in pairs: if you overload ==, you must also overload !=. Similarly for < and >, and for <= and >=. Also, if you overload == then you are strongly recommended to override the Equals() method for your class (all classes inherit this method from System.Object). If you do not, the compiler will give a warning.
Using Operator Overloads
Once you have overloaded an operator then you simply use it in the normal
way for that operator. For example, suppose you have a Vector class. Mathematically, it is possible to add vectors, the result being another vector. Assuming you define an overload for + that takes two Vector instances as its parameters and returns a Vector, you would be able to write:
Vector V1, V2, V3;
// initialize V1 and V2 here
V3 = V1 + V2;
Defining Operator Overloads
Operator overloads are treated syntactically as static member methods
from the point of view of defining them. The name of the method is the keyword
operator , followed by the symbol used for that operator. So, for example, to provide the overload for addition for Vectors that will allow the above code to compile, we would write
class Vector
{
public static Vector operator + (Vector lhs, Vector rhs)
{
// code to add lhs and rhs and return a Vector here
}
Note that this operator overload must take two parameters, since addition
is defined as a binary operator. In general, when overloading an operator
you must supply the same number of parameters that the operator normally
takes. At least one of the parameters must be of the type in which the overload
is defined; beyond that, the parameters can be of any type you wish, as can the
return type (although overloads of the comparison operators, ==, !=, <, >=, <=, and >, are normally expected to return bool).
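To make this concrete, here is a small sketch of a two-dimensional Vector class. The X and Y fields and the scalar multiplication overload are assumptions added for illustration; the text above only specifies the + overload.

```csharp
using System;

class Vector
{
    public double X, Y;

    public Vector(double x, double y)
    {
        X = x;
        Y = y;
    }

    // Binary + takes two parameters and returns a new Vector.
    public static Vector operator +(Vector lhs, Vector rhs)
    {
        return new Vector(lhs.X + rhs.X, lhs.Y + rhs.Y);
    }

    // The parameter types may differ, as long as one of them is Vector.
    public static Vector operator *(double scale, Vector rhs)
    {
        return new Vector(scale * rhs.X, scale * rhs.Y);
    }
}

class Program
{
    static void Main()
    {
        Vector v1 = new Vector(1, 2);
        Vector v2 = new Vector(3, 4);
        Vector v3 = v1 + v2;      // (4, 6)
        v3 += v1;                 // += is built from our + overload: (5, 8)
        Vector v4 = 2.0 * v3;     // (10, 16)
        Console.WriteLine(v4.X + ", " + v4.Y);   // prints 10, 16
    }
}
```

Note that we never defined +=; the compiler constructs it from the + overload.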
The compiler treats operator overloads in the same way as methods when
it comes to figuring out which overload is the one required in a given context.
It can choose between any of the overloads that you supply, and all the other
predefined overloads for each operator. For example, suppose the compiler
encounters this code.
int X1, X2, X3;
double D1, D2;
Vector V1, V2, V3;
// initialize all these variables
V3 = V2 + V1; // OK
D1 = X1 + X2; // OK
D2 = V1 + X3; // WRONG - no suitable overload
Our first attempt to use the addition operator in this code is when we add V2 and V1, storing the result in V3. Here, the compiler will see that the two parameters passed to the operator (V2 and V1) are both Vector instances, as is the result. Hence, the compiler will look for an overload of this operator that takes two Vectors as its parameters and returns a Vector. Since we have provided one, there is no problem.
Next, we attempt to add two ints and store the result in a double. Again, this is no problem, since the compiler knows how to add two ints. In this case, the predefined overload of + returns an int, but that is no problem either, because the compiler can implicitly convert an int to a double.
The final line of this code will cause a compilation error. In this case, we're trying to add a Vector to an int and store the result in a double. The compiler will therefore look for an overload of + that takes a Vector as its first parameter and an int as its second parameter. Since we have not provided one, the compiler will next look to see if any casts on Vector or int
are available that it can somehow use to convert the
parameters to data types for which a + overload is available. However, since we haven't supplied any casts to convert Vector
to or from anything else, the compiler will not find any way of doing this
either. At this point it will give up and generate a compilation error.
User-Defined Casts
Earlier in this appendix, when we covered variables, we described how
C# allows data types to be converted to other data types using casts. A cast
can be explicit or implicit.
int X1 = 10, X2 = 20;
double D2 = X1; // implicit
X2 = (int)D2; // explicit
For the predefined data types, implicit casts are available where it is
not possible for anything to go wrong with the operation. If, however, it
is possible in principle that the conversion might raise an exception or
cause an overflow or loss of data (as is the case for a double -to- int conversion), then only an explicit cast is available.
The same principles apply for casts between your own data types.
You may define a cast to convert any data type to any other data type, but the following restrictions apply.
- The cast must be marked as either implicit or explicit, to indicate how you intend it to be used.
- It is not possible to define casts between classes where one class is
directly or indirectly derived from the other. This is because casts already
exist in that case.
- The cast must be defined as a static member of either the source data type or the destination data type.
The final condition means that you can only define the cast if you
have access to the source code of at least one of the data types. This effectively
means that you can only define casts for those classes and structs that you
have written.
The syntax for defining a cast is as follows. In this case we assume that we have two classes: Gif represents any Gif file, while StillGif represents any Gif file that does not contain any animated images. We also assume that Gif and StillGif are not in the same inheritance tree. From these definitions, a cast from Gif to StillGif may fail (if the Gif
contains animations) and hence, if good programming practice is being followed,
should be explicit, while a cast the other way round will always succeed,
and so should be implicit.
We define the explicit cast as follows.
public static explicit operator StillGif (Gif source)
{
// code to do conversion. Must return a StillGif
}
While the implicit cast would look like this
public static implicit operator Gif (StillGif source)
{
// code to do conversion. Must return a Gif
}
From this we see that a cast is defined in a similar way to a method,
except that the 'name' of the method consists of the keyword operator
followed by the name of the destination type. The cast must take one parameter,
which is the source data type instance to be converted.
Note that although not a syntax error, it would be considered extremely
bad programming practice for the implementation of the cast to modify the
source value passed in as a parameter in any way. The developers who write
client code would not be expecting this behavior.
For the above examples, either cast may be defined in either the Gif or the StillGif class definition (though not both).
If you define a cast between a custom class or struct, and a predefined
data type, then the cast must be placed in the definition of the custom class
or struct.
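The sketch below puts both casts into the StillGif class and shows how client code uses them. The FrameCount field, and the rule that a still image has exactly one frame, are assumptions made up for this example.

```csharp
using System;

class Gif
{
    public int FrameCount;   // hypothetical: number of images in the file

    public Gif(int frameCount)
    {
        FrameCount = frameCount;
    }
}

class StillGif
{
    // Always succeeds, so it is safe to mark as implicit.
    public static implicit operator Gif(StillGif source)
    {
        return new Gif(1);
    }

    // May fail, so client code must request it explicitly.
    public static explicit operator StillGif(Gif source)
    {
        if (source.FrameCount != 1)
            throw new InvalidCastException("Gif contains animation");
        return new StillGif();
    }
}

class Program
{
    static void Main()
    {
        StillGif still = new StillGif();
        Gif gif = still;                     // implicit cast, no cast syntax needed
        StillGif back = (StillGif)gif;       // explicit cast, succeeds for one frame
        try
        {
            StillGif bad = (StillGif)new Gif(12);   // explicit cast, fails
        }
        catch (InvalidCastException e)
        {
            Console.WriteLine(e.Message);    // prints Gif contains animation
        }
    }
}
```

Neither cast modifies the source object; each builds and returns a new instance of the destination type.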
Interfaces
An interface is similar to a class or struct, but it does not contain
any implementations for any of its methods etc. The real intention of an
interface is that it provides a way for a class or struct to declare that
it supports certain methods, properties, events or indexers. For this reason
an interface is often regarded as a contract. An interface itself is defined
using the following syntax.
public interface IEnumerable
{
IEnumerator GetEnumerator();
}
For this example we have taken an interface from one of the .NET base class namespaces, System.Collections.
This code shows that the syntax for defining the members of an interface
is the same as that for defining the members of a class or struct, except
that, because the methods do not have any implementations, the name of each
method, etc. is simply followed by a semicolon. Also notice that they do
not have any access modifiers and are not declared as virtual etc. It is
regarded as the job of the class that implements an interface to add any
such modifiers to the methods. The interface should simply indicate the return
types and parameters and, as indicated in this sample, it is perfectly acceptable
for an interface reference to be used as a return type or as a parameter. This particular
interface only defines one method, but most interfaces will define more than
one.
IEnumerable is required to be implemented by collections, and in particular,
the foreach loop requires this interface to be implemented on the class it
enumerates over.
An interface may inherit from one or more other interfaces, in which case
it is considered as containing all the methods of the base interfaces as
well as those defined explicitly in the derived interface.
public interface ICollection : IEnumerable
{
// Properties
int Count { get; }
bool IsSynchronized { get; }
object SyncRoot { get; }
// Methods
void CopyTo(Array array, int index);
}
The ICollection interface defines GetEnumerator() (inherited from IEnumerable
), as well as its own methods. Classes may implement this interface, which
means that they are declaring they support more sophisticated collection
facilities than those offered by IEnumerable .
A class implements an interface by deriving from it. Details of this are given in the section about classes.
class MyCollection : ICollection
{
// implementations of members. Must include implementations of
// all ICollection (and IEnumerable) members.
}
A class that implements an interface can be cast to that interface, and
methods called against the interface reference. In general, you use interface
references in exactly the same way that you use class references, except
that you cannot instantiate an interface - you get an interface reference
either by casting a class reference or as the return value from a method.
Both techniques are illustrated here.
MyCollection SomeCollection = new MyCollection();
// initialize SomeCollection
IEnumerable Enumerable = (IEnumerable) SomeCollection;
IEnumerator TheEnumerator = Enumerable.GetEnumerator();
// this would work too.
IEnumerator OtherEnumerator = SomeCollection.GetEnumerator();
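For a complete, if deliberately tiny, illustration, the following sketch implements IEnumerable on a class of our own. The Countdown class is hypothetical; it simply hands out the enumerator of an internal array.

```csharp
using System;
using System.Collections;

class Countdown : IEnumerable
{
    private int[] values = { 3, 2, 1 };

    // The only member that IEnumerable requires.
    public IEnumerator GetEnumerator()
    {
        return values.GetEnumerator();   // reuse the array's own enumerator
    }
}

class Program
{
    static void Main()
    {
        Countdown countdown = new Countdown();

        // Obtain an interface reference by casting the class reference.
        IEnumerable enumerable = (IEnumerable)countdown;
        IEnumerator enumerator = enumerable.GetEnumerator();
        while (enumerator.MoveNext())
            Console.WriteLine(enumerator.Current);   // prints 3, 2, 1 on separate lines

        // foreach works too, because Countdown implements IEnumerable.
        foreach (int value in countdown)
            Console.WriteLine(value);                // prints 3, 2, 1 again
    }
}
```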
Enumerations
An enumeration is, strictly speaking, a value type that is derived from the base class System.Enum.
This base class is intended for types that simply contain closed lists
of values. To make using such lists of values easier, the C# language uses
the enum keyword, which wraps a special syntax around the System.Enum class, in much the same way that string wraps a special syntax around System.String.
The syntax for defining an enumeration looks like this:
public enum Platform {Win30, Win31, Win95, Win98, WinMe, WinNT351, WinNT4,
Win2000, WinXP}
By defining an enumeration, you are in effect defining a new data type.
Hence you may only define a new enumeration either as a top-level object,
in a namespace or as a member of a class or struct. Once an enumeration has
been defined, you may declare and use instances of it like this:
Platform OperatingSystem;
OperatingSystem = Platform.WinMe;
Since you can never write code inside an enumeration, you will always
be referring to the name of an enumerated value with the syntax <EnumName>.<Value> .
By default, enumerated values are stored as ints.
The first enumeration value in the list has the value 0 and other values
are numbered consecutively upwards 1, 2, 3 etc. However, you can change this
numbering by specifying a particular value for any enumerated value.
public enum Platform {Win30, Win31, Win95=95, Win98, WinMe, WinNT351,
WinNT4, Win2000, WinXP}
In this case, we have assigned the value 95 to Win95. Numbering will then
carry on from this point, so, for example, with the above code, Win98 will
have the value 96, WinMe 97 etc.
It is also permitted to change the underlying data type that is used to
store any enumeration. You may choose any of the integral data types ( sbyte , byte , short , ushort , int , uint , long , ulong , though not char ) and do so by indicating the data type after the enum keyword. For example, to store this enumeration as ulong :
public enum Platform : ulong {Win30, Win31, Win95=95, Win98, WinMe,
WinNT351, WinNT4, Win2000, WinXP};
The ToString() method has been overridden in System.Enum, so that it returns a string representation of the value. For example the code
string OsName = Platform.Win31.ToString();
will place the string " Win31 " in OsName .
A static Parse() method, which takes the enumeration type and a string as parameters, has also been implemented in System.Enum, which enables you to do this:
Platform Os = (Platform)Platform.Parse(typeof(Platform), "WinNT4");
This will result in Os containing the value, WinNT4 .
Explicit casts are available to convert enumerated values to the corresponding numeric value.
int OsAsInt = (int)Os;
If Os has the value WinNT4 , this will place the value 99 in OsAsInt .
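The conversions described in this section can be tried out together in one short program. It uses the version of Platform in which Win95 is assigned the value 95; the variable names are our own.

```csharp
using System;

public enum Platform { Win30, Win31, Win95 = 95, Win98, WinMe, WinNT351,
                       WinNT4, Win2000, WinXP }

class Program
{
    static void Main()
    {
        Console.WriteLine((int)Platform.Win31);          // prints 1
        Console.WriteLine((int)Platform.Win98);          // prints 96
        Console.WriteLine(Platform.WinMe.ToString());    // prints WinMe

        // Parse() returns object, so the result must be cast back.
        Platform os = (Platform)Enum.Parse(typeof(Platform), "WinNT4");
        Console.WriteLine((int)os);                      // prints 99

        // An explicit cast also converts a number to the enumeration.
        Platform fromNumber = (Platform)100;
        Console.WriteLine(fromNumber);                   // prints Win2000
    }
}
```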
Delegates
A delegate is an instance of a class that is derived from the .NET base class, System.Delegate . System.Delegate
is designed as a base class for any classes that hold references to methods.
Just as with enumerations, C# provides a special syntax to wrap around the
derived classes, which makes it more convenient to manipulate method references.
Defining a Delegate
A delegate is defined using the following syntax.
// declares a delegate that can wrap any method that returns double and takes
// an int parameter
public delegate double DoubleOp(int value);
// similarly declare a delegate for a method that returns a string and
// takes no parameters
public delegate string StringOp();
The definition consists of the delegate
keyword followed by something that looks like the signature of a method.
The name of this method will be the name of this class of delegate. The names
given to the parameters in the signature are dummy names and do not have
any significance; they are not actually used anywhere.
Note that the above code should be viewed as defining a data type - it does not actually declare any variables.
Declaring a Delegate
Instances of delegates are declared in the same way as instances of other data types.
StringOp GetAString;
When actually instantiating the delegate, you need to pass one parameter
to the constructor, which is the details of the method that the delegate
is going to refer to. For static methods, this means supplying the name of
the class and the name of the method, while for instance methods, this involves
supplying the name of the instance and the name of the method. The syntax
is the same as the syntax for calling the relevant method, except that the
parameter list is omitted. For example, suppose we have a class, Employee , which represents details of a company employee and contains the following methods.
class Employee
{
public string GetName()
{
// etc.
}
public static string GetCompanyName()
{
// etc.
}
We can declare a variable that is a delegate instance able to refer to each of these methods as follows.
// somewhere in a class or method
// declare variable
StringOp GetAString;
// initialize it
GetAString = new StringOp(Employee.GetCompanyName);
Notice the parentheses after the method name GetCompanyName
are omitted. Adding them would result in the code actually calling the method
- we just want to indicate which method we are referring to.
The type of the delegate is entirely defined by the signature of the method
- that is to say the return type and the number and types of its explicit
parameters. Whether the method is static, and what type of class the method
is defined against, is irrelevant in this context. This means that a particular
delegate instance can be initialized to any method in any class provided
it has the appropriate number and types of parameter, and the appropriate
return type. The above code initializes the GetAString variable to wrap round the static GetCompanyName() method, but we could equally well later set it to refer to the GetName() method instead, except that because GetName() is an instance method, we need an instance of Employee too.
Employee TheManager;
// assume we initialize TheManager to point to an employee instance
GetAString = new StringOp(TheManager.GetName);
Similarly, we could, if we wished, set GetAString to refer to a method of an instance of a completely different class - provided that the method has the same signature.
Using a Delegate
Once a delegate has been initialized to refer to a method, you can actually
call the method by simply writing the name of the delegate followed by
the parameters to the method, in parentheses, just as if the delegate were the
method. This is known as invoking the delegate.
Console.WriteLine(GetAString());
// results depend on what method GetAString was referring to.
However, what makes delegates really powerful is that, because a delegate
is also regarded as a variable, its value can be passed around. This is the
way that C# allows details of methods to be passed between
methods.
// method that takes a delegate
// this method at some point needs to obtain a string and it doesn't care what
// method it uses to get it - you tell it what method to use
// by passing it as a parameter in a delegate
void ProcessAString(StringOp op)
{
// at some point in the method .
string TheString = op();
// process the string
}
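Putting the pieces of this section together gives the following sketch. The Employee implementation, the company name and the Process() method (which just wraps the string in square brackets) are placeholders invented for the example.

```csharp
using System;

public delegate string StringOp();

class Employee
{
    private string name;

    public Employee(string name)
    {
        this.name = name;
    }

    public string GetName()
    {
        return name;
    }

    public static string GetCompanyName()
    {
        return "Example Corp";   // placeholder value
    }
}

class Program
{
    // Does not know or care which method it has been given.
    public static string Process(StringOp op)
    {
        return "[" + op() + "]";
    }

    static void Main()
    {
        Employee manager = new Employee("Alice");

        // Static method: class name plus method name.
        Console.WriteLine(Process(new StringOp(Employee.GetCompanyName)));   // prints [Example Corp]

        // Instance method: instance name plus method name.
        Console.WriteLine(Process(new StringOp(manager.GetName)));           // prints [Alice]
    }
}
```

The same Process() method produces different results purely because of the delegate it is handed.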
Multicast Delegates
Multicast delegates are normally defined with a return type of void, because when a
delegate that wraps several methods is invoked, only the return value of the last method called is passed back to the caller. The significance of multicast delegates is that they can simultaneously
refer to more than one method. Methods may be added to multicast delegates
using the addition and addition assignment operators.
For example, suppose we have a delegate defined as follows. This delegate
is intended to display a message to the user, but we don't specify how it
is to be done (message box, writing to console, etc.) - the method passed
in to the delegate will determine that.
public delegate void ShowMessageOp(string value);
This delegate returns void, so it is suitable for use as a multicast delegate.
We can set up a delegate instance that, when invoked, will display the message on the console:
ShowMessageOp DisplayIt = new ShowMessageOp(Console.WriteLine);
We can now add a second method to the delegate, which displays a message
with a message box. This would normally be done by calling the Show() method of the .NET base class, System.Windows.Forms.MessageBox. However, this method doesn't return void and so doesn't match the signature required by ShowMessageOp. So here we'll assume we've defined our own method, ShowDialog, as a static method of a class OurUtilities, which has the correct signature and returns void:
DisplayIt += new ShowMessageOp(OurUtilities.ShowDialog);
Now if DisplayIt is invoked, it will call both methods in turn, so displaying the same message in two different ways:
DisplayIt("This message will get displayed twice");
It is also possible to remove methods from multicast delegates by
using the subtraction and subtraction assignment operators.
DisplayIt -= new ShowMessageOp(OurUtilities.ShowDialog);
Or alternatively
ShowMessageOp NewDisplayOp = DisplayIt -
new ShowMessageOp(OurUtilities.ShowDialog);
When a multicast delegate is invoked, each method that it refers to is
invoked in turn, each one being passed the same set of parameters.
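The following sketch shows methods being added to and removed from a multicast delegate. So that the effect is easy to observe, both methods here record their calls in a list instead of displaying anything; the Log list and the WriteLog() method are hypothetical additions, and ShowDialog() is only a stand-in for a real message box.

```csharp
using System;
using System.Collections;

public delegate void ShowMessageOp(string value);

class OurUtilities
{
    // Records each call in order, so the invocation sequence can be inspected.
    public static ArrayList Log = new ArrayList();

    public static void ShowDialog(string message)
    {
        Log.Add("dialog: " + message);
    }

    public static void WriteLog(string message)
    {
        Log.Add("log: " + message);
    }
}

class Program
{
    static void Main()
    {
        ShowMessageOp displayIt = new ShowMessageOp(OurUtilities.WriteLog);
        displayIt += new ShowMessageOp(OurUtilities.ShowDialog);
        displayIt("hello");    // calls WriteLog, then ShowDialog

        displayIt -= new ShowMessageOp(OurUtilities.ShowDialog);
        displayIt("again");    // calls WriteLog only

        foreach (string entry in OurUtilities.Log)
            Console.WriteLine(entry);
        // prints:
        // log: hello
        // dialog: hello
        // log: again
    }
}
```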
Events
An event is a multicast delegate that, by convention, has the following signature.
void EventHandler(object sender, System.EventArgs e)
The second parameter can be a reference to a class derived from System.EventArgs .
The significance of events is that they are used in the standard Windows
event callback mechanism by which applications receive notifications of interesting
events (such as actions by the user concerning the mouse or keyboard). Because
of the importance of this mechanism, C# uses the event
keyword to make using events syntactically slightly easier. In the above
signature, sender is intended to provide a reference to the object that raised
the event. The e parameter, on the other
hand, contains information relating to the event (for example, which key
was pressed, where the mouse was when it was clicked), although in order
to provide this information, the EventArgs class must be derived from by a specialized class that can supply the information relating to a particular type of event.
In order to define an event, the code that raises the event must define
the relevant delegate type. For example, suppose an application implemented
a class, NetworkAnalyser, which monitored
the network for FTP requests, and so was able to notify other code when FTP
requests came in. It might define the following delegate.
class NetworkAnalyser
{
public delegate void FTPRequest(object Sender, FTPEventArgs e);
Where FTPEventArgs is a class we have
defined that contains information related to an FTP request (source IP address
etc.), and which is derived from System.EventArgs . We might for example define FTPEventArgs like this:
class FTPEventArgs : EventArgs
{
private string sourceIPAddress;
private string ftpCommand;
private StringCollection ftpCommandParameters;
public FTPEventArgs(string sourceIPAddress, string ftpCommand,
StringCollection ftpcommandParameters)
: base()
{
// etc.
// also define public readonly properties for the fields.
}
This definition stores the IP address of the computer issuing the request,
and the command and parameters sent, and so could make this information available
to event handlers. It uses StringCollection , a .NET base class defined in System.Collections.Specialized , which implements a collection of strings.
We would then declare the event as an instance of the appropriate delegate.
public event FTPRequest OnFTPRequest;
A client that wishes to be notified of FTPRequests would then implement a handler:
protected void FTPRequestHandler(object sender, FTPEventArgs e)
{
// code to handle this request
}
It would also register this handler with the event, to indicate that it wishes to be notified of
FTP requests, using the normal syntax for adding methods to multicast delegates:
NetworkAnalyser Analyser;
// initialize Analyser so it refers to the Network Analyser instance
// now add our handler to the event
Analyser.OnFTPRequest += new NetworkAnalyser.FTPRequest(this.FTPRequestHandler);
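The fragments above do not show how the event is actually raised. The sketch below is one possible end-to-end arrangement, with several simplifications: FTPEventArgs carries only the command, SimulateRequest() is a hypothetical method standing in for real network monitoring, and the Client class is invented to hold the handler.

```csharp
using System;

class FTPEventArgs : EventArgs
{
    private string ftpCommand;

    public FTPEventArgs(string ftpCommand)
    {
        this.ftpCommand = ftpCommand;
    }

    public string FtpCommand
    {
        get { return ftpCommand; }
    }
}

class NetworkAnalyser
{
    public delegate void FTPRequest(object sender, FTPEventArgs e);
    public event FTPRequest OnFTPRequest;

    // Hypothetical: pretends that an FTP request has just arrived.
    public void SimulateRequest(string command)
    {
        if (OnFTPRequest != null)   // the event is null while nobody has subscribed
            OnFTPRequest(this, new FTPEventArgs(command));
    }
}

class Client
{
    public string LastCommand;

    public void FTPRequestHandler(object sender, FTPEventArgs e)
    {
        LastCommand = e.FtpCommand;
    }
}

class Program
{
    static void Main()
    {
        NetworkAnalyser analyser = new NetworkAnalyser();
        Client client = new Client();
        analyser.OnFTPRequest += new NetworkAnalyser.FTPRequest(client.FTPRequestHandler);
        analyser.SimulateRequest("LIST");
        Console.WriteLine(client.LastCommand);   // prints LIST
    }
}
```

The class that declares the event raises it by invoking it like any other delegate, after checking that at least one handler has been attached.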
Exceptions
Exceptions are the normal means by which C# code should handle exceptional error conditions.
In order to handle exceptions correctly you divide code into three types of blocks known respectively as try , catch , and finally blocks.
try
{
// normal code
}
catch (SomeException e) //SomeException is an exception class defined
{ // in your code
// error handling code
}
catch (SomeOtherException e)
{
// error handling code
}
finally
{
// cleanup code
}
This code illustrates the principles, but there are some variations. It is permitted to omit the finally block, and there may be as many catch blocks as you wish, but each one should take a parameter of a different type. However, the parameter to each catch block must be a reference to a class derived directly or indirectly from the .NET base class, System.Exception . Classes so derived are known as exceptions or exception classes. Note that it is also permitted to omit all catch
blocks - in this case the construction is not used to trap any errors, but
simply to provide a way of guaranteeing that code in the finally block will be executed when execution flow leaves the try block.
- The try block contains the normal code of your program, in which an error condition may arise.
- Each catch block contains code that handles a certain type of error condition.
- The finally block, if present, contains code to perform any clean up that is required after leaving the try block.
The idea is that if some unexpected error condition arises, code executing within the try block should respond by throwing an exception. Throwing an exception is accomplished by the throw statement.
throw new SomeException("An error has occurred");
The throw statement takes one parameter, which syntactically appears straight after the throw
keyword. This parameter is the exception instance that is being thrown. There
is no need for any enclosing brackets surrounding it.
As soon as execution hits a throw statement, it immediately leaves the try block and enters the first available catch block that can handle that exception. In the process of leaving the try block, any variables that were scoped to the try
block or to any methods called from within it will automatically pass out
of scope. There is no limit to how deeply the code might have passed into
further method calls after entering the try block. Execution will always smoothly exit from all of these immediately. The first available catch block is identified using the particular class of exception that was thrown. As noted above, each catch block takes a parameter which is an instance of a class derived from System.Exception. A catch
block is able to accept a particular exception if its parameter is either
of the same class as the exception or is of a base class of that exception.
This is equivalent to saying that the reference in the catch block must be able to take the exception that was thrown as a referent.
When code leaves a catch block it automatically enters the finally block. If no exception is thrown in a try block, then eventually execution will simply pass out of the try block, in which case control passes straight to the finally block. It does not matter how code exits from a try block: the finally block will always be executed. Afterwards, control usually proceeds to the next executable statement following the finally block.
As an example, consider this code. Assume a method is defined like this.
// this method expects Value to refer to something and Value2 to be > 10
void DoSomething(MyClass Value, int Value2)
{
if (Value == null)
// for ArgumentNullException the first argument is the parameter name
throw new ArgumentNullException("Value", "Value was null in DoSomething()");
else if (Value2 <= 10)
throw new ArgumentException("Value2 was <= 10 in DoSomething()");
// code for method
}
// later on
try
{
DoSomething(Value, Value2); // assuming these variables are defined and
// initialized
}
catch (ArgumentNullException)
{
// error handling code
}
catch (ArgumentException)
{
// error handling code
}
finally
{
// cleanup
}
In order to understand this code we need to be aware that ArgumentException and ArgumentNullException are two exception classes in the .NET base classes, and that ArgumentNullException is derived from ArgumentException. ArgumentException is used to indicate a general problem with a parameter passed to a method, while ArgumentNullException is used specifically to indicate that an argument passed to the method was null but should have been referring to some object.
If DoSomething() is called with null passed as the first parameter, then an ArgumentNullException is thrown. Since the first catch block takes precisely this class of exception, it is able to handle that exception, and therefore this catch block will be executed, followed by the finally block. If on the other hand DoSomething() is called with 0 passed as the second parameter, then an ArgumentException will be thrown. Execution flow will miss the first catch handler because that requires an ArgumentNullException, or a class derived from it. However, the second catch handler takes an ArgumentException, and therefore can handle this exception.
Note that the order of the catch statements is important. For example, if we had coded this:
catch (ArgumentException)
{
// error handling code
}
catch (ArgumentNullException)
{
// error handling code
}
Then the second catch block will never be reachable because an ArgumentNullException would be picked up by the first catch block. The compiler would detect this and flag a compilation error.
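The matching rule can be sketched as a small program that lists the more derived exception class first. This is only an illustration: the names CatchOrderDemo, Fail, and Classify are hypothetical, and each handler simply reports which one ran.

```csharp
using System;

class CatchOrderDemo
{
    // Hypothetical helper: throws the kind of exception requested by the caller.
    static void Fail(bool nullArgument)
    {
        if (nullArgument)
            throw new ArgumentNullException("value");
        throw new ArgumentException("Bad argument");
    }

    // Returns the name of the handler that caught the exception.
    public static string Classify(bool nullArgument)
    {
        try
        {
            Fail(nullArgument);
            return "none";
        }
        catch (ArgumentNullException)   // most derived class first
        {
            return "ArgumentNullException handler";
        }
        catch (ArgumentException)       // base class afterwards
        {
            return "ArgumentException handler";
        }
    }

    static void Main()
    {
        Console.WriteLine(Classify(true));   // ArgumentNullException handler
        Console.WriteLine(Classify(false));  // ArgumentException handler
    }
}
```

Swapping the two catch blocks in Classify() would reproduce the compilation error described above.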
If an exception is generated and no catch
blocks within the program are able to handle it, then program execution terminates
and the .NET runtime will handle the exception, which normally means displaying
a message box complaining that the exception was not handled in your code.
If you wish, you can ensure all CLS-compliant .NET-generated exceptions are
caught by providing a catch handler with this signature.
catch (Exception)
{
// error handling code
}
This is because according to the rules of the .NET Common Language Specification, all exceptions must be derived from System.Exception.
You can extend this to cover any exceptions that are not CLS-compliant
or are even generated outside the .NET environment altogether (and which
will therefore probably not be derived from System.Exception) by adding a catch block that takes no parameters to the end of your list of catch statements.
catch
{
// error handling code
}
Although this will trap any object that has been thrown, it has the disadvantage
that you have no access to the exception object, and hence cannot find out
any information about the cause of the error from it.
Nested Try Blocks
It is possible to nest try blocks inside each other.
try
{
// normal code
try
{
// normal code
}
catch (SomeException e)
{
// error handling code
}
finally
{
// cleanup code
}
}
catch (SomeOtherException e)
{
// error handling code
}
finally
{
// cleanup code
}
Nested try blocks listed in this way usually act independently. However, they interact if an exception generated inside the inner try block is not handled by any of the catch handlers of the inner try block. In this case, the search for a suitable exception handler moves up to the catch blocks associated with the outer try block (in the process, the inner finally block will be executed but no other code inside the outer try block will be). So, assuming a suitable handler is found with the outer try block, the order of execution will be: throw statement -> inner finally block -> outer catch handler -> outer finally block, after which execution continues at the first statement after the outer finally block.
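The order of execution can be checked with a short sketch that appends a marker to a string as each block runs. The class name NestedTryDemo and the method Trace are invented for the example, and InvalidOperationException is used only because the inner catch block does not accept it.

```csharp
using System;

class NestedTryDemo
{
    // Records the order in which the blocks run.
    public static string Trace()
    {
        string trace = "";
        try
        {
            try
            {
                trace += "throw;";
                throw new InvalidOperationException("Not handled by the inner try");
            }
            catch (ArgumentException)
            {
                trace += "inner catch;";    // skipped: wrong exception class
            }
            finally
            {
                trace += "inner finally;";
            }
            trace += "rest of outer try;";  // skipped: the exception is still propagating
        }
        catch (InvalidOperationException)
        {
            trace += "outer catch;";
        }
        finally
        {
            trace += "outer finally;";
        }
        trace += "after;";
        return trace;
    }

    static void Main()
    {
        Console.WriteLine(Trace());
        // throw;inner finally;outer catch;outer finally;after;
    }
}
```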
Throwing Exceptions from Inside Catch or Finally Blocks
In the scenario above, if an exception is thrown from within the inner catch handler or inner finally block, then the search for a suitable handler starts with the outer catch blocks. The implication of this is that if an exception is thrown from inside one of the outer catch blocks or the outer finally block, it will be regarded as unhandled unless the whole construction is itself enclosed in a further try block, possibly in a calling method.
Usually, C# syntax requires that a throw
statement specify an exception object. However, in the particular case
in which an exception is being thrown from inside a catch block, it is acceptable to omit the exception object.
// inside catch block
throw;
In this case, the compiler will simply arrange for the exception currently being handled to be thrown again.
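A minimal sketch of rethrowing follows. The inner handler does some work of its own and then passes the same exception on to a handler further up the call stack. The names RethrowDemo, Inner, Run, and Log are hypothetical.

```csharp
using System;

class RethrowDemo
{
    // Hypothetical log recording which handlers ran.
    public static string Log = "";

    static void Inner()
    {
        try
        {
            throw new InvalidOperationException("Original failure");
        }
        catch (InvalidOperationException)
        {
            Log += "inner handler;";
            throw;   // rethrows the exception currently being handled
        }
    }

    // Returns the message of the exception that reaches the outer handler.
    public static string Run()
    {
        try
        {
            Inner();
            return "no exception";
        }
        catch (InvalidOperationException e)
        {
            Log += "outer handler;";
            return e.Message;
        }
    }

    static void Main()
    {
        Console.WriteLine(Run());  // Original failure
        Console.WriteLine(Log);    // inner handler;outer handler;
    }
}
```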
Exception Objects
The .NET base classes include an extensive class hierarchy of exception
objects and it is permitted to define new ones appropriate to your code.
The most usual constructor for an exception object is the one demonstrated
in all the code samples above - it takes one string as a parameter. This
string should describe the error condition and is available as the Message property of the exception object. Hence one very basic catch handler might read:
catch (SomeException e)
{
MessageBox.Show(e.Message); // MessageBox is in System.Windows.Forms
}
If you are throwing a second exception from within a catch block, it is
common practice to string the two exception objects together. This is usually
done by passing in the original exception as a new parameter, innerException,
to the constructor of the new exception instance. The .NET base class exception
classes feature two-parameter constructors that allow you to do this, and
you are strongly advised to provide the same feature in any exception classes
that you define.
catch (SomeException e)
{
// code
throw new SomeOtherException("What happened was etc.", e);
}
The usual reasons for throwing second exceptions are that something
else went wrong while you were trying to handle the first exception, or that
you want to provide additional information about the first exception.
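As a sketch of this practice, the following program defines an exception class with the recommended two-parameter constructor and uses it to wrap a FormatException. ConfigurationLoadException, InnerExceptionDemo, and LoadSettings are hypothetical names chosen for the illustration; the original exception is read back through the InnerException property of System.Exception.

```csharp
using System;

// Hypothetical exception class, defined only for this example.
class ConfigurationLoadException : Exception
{
    public ConfigurationLoadException(string message)
        : base(message) { }

    // Two-parameter constructor that passes the original exception on
    // to System.Exception, where it becomes the InnerException property.
    public ConfigurationLoadException(string message, Exception innerException)
        : base(message, innerException) { }
}

class InnerExceptionDemo
{
    public static void LoadSettings(string text)
    {
        try
        {
            int.Parse(text);   // throws FormatException for non-numeric text
        }
        catch (FormatException e)
        {
            // Add context, but keep the original exception attached.
            throw new ConfigurationLoadException("Could not load settings", e);
        }
    }

    static void Main()
    {
        try
        {
            LoadSettings("not a number");
        }
        catch (ConfigurationLoadException e)
        {
            Console.WriteLine(e.Message);                        // Could not load settings
            Console.WriteLine(e.InnerException.GetType().Name);  // FormatException
        }
    }
}
```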
Attributes
Attributes are objects that can be attached to items in C# source code
to provide extra information to the compiler concerning those items. Items
to which they can be attached include methods, classes, structs, enums, method
parameters, etc.
The syntax of an attribute consists of a name, optionally followed by
one or more parameters in parentheses, the entire attribute being enclosed
in square brackets and placed immediately before the item to which it applies.
For example:
[STAThread] // indicates threading model of main program thread
public static void Main() // STAThread takes no parameters
{
Or:
// indicates that calls to the following method should be ignored
// by the compiler unless the named preprocessor symbol is defined
[Conditional("Debug")]
public void WriteDebugInfo()
{
Some attributes are recognized by the compiler and thus affect the compilation
process, as those in the above examples do. Other attributes simply cause
metadata to be left in the compiled assembly, which can later be retrieved
using the System.Reflection classes, and may be used, for example, for extended
documentation purposes. Since attributes are actually implemented as instances
of .NET base classes that are derived from System.Attribute, it is quite possible
to define your own custom attributes, though obviously such attributes will
not be recognized by the compiler, and so will not have any effect other
than leaving metadata in the assembly.
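A custom attribute and its retrieval by reflection might look like the following sketch. AuthorAttribute, Invoice, and AttributeDemo are invented for the example; the only library calls used are System.Attribute, AttributeUsage, and Type.GetCustomAttributes().

```csharp
using System;

// Hypothetical custom attribute; the "Attribute" suffix may be omitted when it is applied.
[AttributeUsage(AttributeTargets.Class)]
class AuthorAttribute : Attribute
{
    private string name;

    public AuthorAttribute(string name)
    {
        this.name = name;
    }

    public string Name
    {
        get { return name; }
    }
}

// The attribute has no effect on compilation; it only leaves metadata behind.
[Author("A. Developer")]
class Invoice
{
}

class AttributeDemo
{
    // Reads the attribute back from the metadata by reflection.
    public static string AuthorOf(Type type)
    {
        object[] attributes = type.GetCustomAttributes(typeof(AuthorAttribute), false);
        if (attributes.Length == 0)
            return null;
        return ((AuthorAttribute)attributes[0]).Name;
    }

    static void Main()
    {
        Console.WriteLine(AuthorOf(typeof(Invoice)));  // A. Developer
    }
}
```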
Because it is possible to define further attributes, it is not possible
to give a comprehensive list of the available attributes. However, some of
the common ones include:
Attribute | Purpose |
Conditional | Causes calls to a method to be omitted by the compiler unless a named preprocessor symbol is defined |
DllImport | Indicates that a method is defined in a named external DLL instead of in source code or an assembly |
Flags | Indicates that an enumeration actually consists of bitwise flags that may be combined |
Obsolete | Marks a method or class as obsolete, causing either a compiler warning or error (depending on parameters passed to the attribute) to be raised if any code attempts to use the obsolete item |
STAThread | Indicates that a thread should run in a single-threaded apartment (STA) environment |
StructLayout | Allows you to specify exactly how fields in a struct should be laid out in memory |
Preprocessor Directives
In addition to the C# statements, C# supports a number of special commands
known as preprocessor directives. These commands do not translate into any
executable code directly, but they provide extra instructions to the compiler
concerning precisely how it should compile the source code.
Preprocessor directives have a different syntax from normal C# statements: They always begin with a #,
and use carriage returns rather than semicolons to indicate the end of
the directive. Because of this, a preprocessor directive should be the only
command on a line (though you may add comments after it).
#define/#undef
The #define directive defines a symbol, which will be present for the duration of compilation. #undef removes the symbol. The symbol so defined may be used by the #if, #elif, and #else preprocessor directives, as well as by the Conditional attribute.
#define Debug
#undef EnterpriseVersion
The #define and #undef
directives must appear before any actual C# code. Note that preprocessor
symbols may also be defined by supplying the appropriate flags to the compiler
at the command line.
#if/#elif/#else/#endif
These directives can be used to construct an #if block, which acts in a similar way to the if ... else if ... else statement of the C# language. However, whereas if indicates that statements should be conditionally executed, #if
indicates to the compiler that certain statements should be conditionally
compiled, depending on whether a certain preprocessor symbol has been defined.
#if Enterprise
// put code here that you want compiled for the enterprise version
// of your software
#elif Professional
// code here for the Professional version
#else
// code for the home version
#endif
You can use the &&, ||, and ! operators to perform simple logical operations on the preprocessor symbols:
#if Enterprise && Debug
// compile something if both these symbols are defined
#endif
You can also use the symbols true and false. A symbol is considered to be true if it is defined. Using this, you can test if a symbol is undefined:
#if Enterprise == false
// compile something if Enterprise symbol not present
#endif
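Putting #define and #if together, a complete source file might look like the sketch below. The symbols Enterprise and Debug are the ones used in the examples above; the class EditionDemo and its Edition() method are hypothetical. Compiled as it stands, with no Debug symbol supplied on the command line, Edition() returns "Enterprise".

```csharp
#define Enterprise   // must come before any other code in the file

using System;

class EditionDemo
{
    // Only one of the three return statements is ever compiled.
    public static string Edition()
    {
#if Enterprise && Debug
        return "Enterprise (debug build)";
#elif Enterprise
        return "Enterprise";
#else
        return "Home";
#endif
    }

    static void Main()
    {
        Console.WriteLine(Edition());
    }
}
```

Symbol names are case-sensitive, so a symbol named DEBUG supplied by a build tool would not satisfy the test for Debug.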
#line
The #line directive modifies the line number and file name used to report compiler warnings and errors.
#line 200 "Main.cs" // line numbering from here reset to line 200,
// file reported as Main.cs
The main use of #line is when you
have software that processes your source files before handing them to
the compiler. In that case, compiler error messages would otherwise report a file
name and line number that match the files generated by the intermediate
software, and not necessarily the files you are editing. You
can use #line to restore the match so that you can see which code is causing compilation errors or warnings.
#warning/#error
These directives respectively instruct the compiler to issue a warning, or to raise a compilation error.
#warning "Don't forget to finish the implementation of the CalculateInterest method!"
#if (Debug && Release)
#error "You have the Debug and the Release symbols defined"
#endif
#region/#endregion
These directives mark a region of code and give it a descriptive label.
#region Member Fields
int X;
string Message;
#endregion
The main significance of #region and #endregion
is that smart text editors may be aware of them, and may be able to use them
to format the appearance of your source code appropriately. For example the
Visual Studio.NET code editor uses them to define areas of code that it can
collapse.
Pointers and Unsafe Code
C# allows the use of pointers and pointer arithmetic, but only in blocks of code that have been specifically marked as unsafe.
Pointers are very similar to references in that they store the address
in memory of where some data is stored. However, references are very safe
to use because the syntax around them restricts you from doing anything other
than accessing the data referenced as a class instance. Pointers are far
more flexible because their syntax gives you access to the address itself.
This means that you can perform arithmetic operations on the address, and
explore or modify areas of memory selected by address rather than by reference
to a named variable. As a result, operations using pointers can be done
with very low overhead and very high performance, because you can write
your code at a very low level, with direct memory operations. On the other
hand, all the type safety of C# is lost, and it becomes very easy to write
buggy code that corrupts data or areas of memory. It is for this reason that
such code is restricted to unsafe areas.
You can mark either a single member of a class or struct as unsafe, or
an entire class or struct as unsafe, by including the keyword unsafe in the definition of the member or of the class or struct definition.
public unsafe void DisplayResult(long *pData)
{
Or
unsafe class LinkedList // entire class is unsafe
{
ulong *Start; // only allowed in an unsafe class
ulong *End;
Declaring a method as unsafe means that pointers may be used anywhere
in the implementation of that method, and its parameters may be of pointer
data types. Declaring a class or struct as unsafe means that all members
of that class are automatically unsafe.
Variables may be declared as unsafe if they are member fields, but not if they are local variables.
public unsafe int *pX; // OK if declared as a field
It is also possible to mark block statements within methods as unsafe:
unsafe
{
// do something with pointers
}
Pointer Syntax
A pointer to any value data type is declared by placing an asterisk after the name of the data type.
long *pL1, pL2; // both pL1 and pL2 are pointers to long (unlike in C or C++)
double *pD;
MyStruct *pMyStruct;
int *pWidth;
Pointers can be declared to any value data type, but it is not possible
to declare a pointer to a reference data type. This is because the garbage
collector is not aware of the existence of pointers, and so may at any time
delete class instances - which would clearly cause problems if a pointer
is pointing to an instance that is deleted.
Just as for a reference, declaring a pointer does not create any instance
of the data pointed to; it simply declares a variable that will hold the
address. You'll normally declare the variable pointed to separately and then
assign the address of this variable to the pointer. This is done using the
address-of operator, which is denoted by the ampersand symbol, &.
int Width = 20;
int *pWidth;
pWidth = &Width; // pWidth now points to Width
This code could alternatively have been written
int Width = 20;
int *pWidth = &Width; // pWidth now points to Width
Pointers can be dereferenced using the pointer dereference operator, for which the symbol is the asterisk.
int CopyOfWidth = *pWidth; // initializes CopyOfWidth to 20.
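The address-of and dereference operators can be seen working together in the following sketch, which must be compiled with the /unsafe compiler option. PointerDemo and DoubleViaPointer are hypothetical names; the method changes a variable purely by writing through a pointer to it.

```csharp
using System;

class PointerDemo
{
    // Must be compiled with the /unsafe compiler option.
    public static unsafe int DoubleViaPointer(int width)
    {
        int* pWidth = &width;    // address of the local (stack) variable
        *pWidth = *pWidth * 2;   // read and write through the pointer
        return width;            // the variable itself has changed
    }

    static void Main()
    {
        Console.WriteLine(DoubleViaPointer(20));  // 40
    }
}
```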
Note that there is never any confusion as to the meanings of * and &.
When used in pointer operations they are unary operators, whereas, when
used for the alternative meanings of arithmetic multiplication and bitwise
AND, they are binary operators, and hence always appear in your source code between two operands.
The syntax above is fine when declaring pointers to variables that are
stored on the stack. However, value types may also be stored on the heap
in some cases - for example, if they are member fields of a reference type,
and in those cases the above syntax is not permitted. This is because data
on the heap is subject to the garbage collector, and, even if not removed,
may be moved to a new location on the heap if the garbage collector chooses
to tidy it up. Instead, such pointers must be declared inside a fixed
statement, which instructs the garbage collector not to move the class containing
the member field(s) in question while the pointers remain in scope. This
prevents bugs from occurring due to pointers containing out-of-date and incorrect
addresses. Such bugs cannot arise with references, because the garbage collector
is able to update references automatically when it moves class instances.
For example, consider this class:
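class Vector
{
// public so that code outside the class can take the addresses of the fields
public double X;
public double Y;
public double Z;
}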
Although the class itself is a reference type, so that it is not possible to declare a pointer to a Vector, its members, X, Y, and Z,
are all value types. Hence, you can legitimately declare a pointer to any
of these fields. However, because these fields will be stored on the heap,
C# requires that any pointers to them must be declared inside a fixed statement.
Vector MyVector;
// initialize MyVector
fixed (double *pX = &MyVector.X, pY=&MyVector.Y)
{
// statements
}
All variables declared in the brackets following the fixed keyword will be scoped to the statement or block statements associated with the fixed statement.
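A complete version of this example might look like the sketch below, which must be compiled with the /unsafe compiler option. The Vector class is repeated here with public fields so that the block is self-contained; FixedDemo and Sum are hypothetical names.

```csharp
using System;

class Vector
{
    public double X;
    public double Y;
    public double Z;
}

class FixedDemo
{
    // Must be compiled with the /unsafe compiler option.
    public static unsafe double Sum(Vector v)
    {
        // v is pinned for the duration of the fixed statement, so the
        // garbage collector cannot move it while the pointers are in use.
        fixed (double* pX = &v.X, pY = &v.Y, pZ = &v.Z)
        {
            return *pX + *pY + *pZ;
        }
    }

    static void Main()
    {
        Vector myVector = new Vector();
        myVector.X = 1.5;
        myVector.Y = 2.5;
        myVector.Z = 3.0;
        Console.WriteLine(Sum(myVector));  // 7
    }
}
```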
Just as for references, you can explicitly set a pointer to null to indicate that it doesn't point to anything.
double *pD;
pD = null;
Note however, that if you wish a pointer to be set to null, you must do
so explicitly. For performance reasons, C# never initializes pointers implicitly
on your behalf. Also, when dealing with pointers, the usual rules about not
being able to access a variable before it is initialized are relaxed.
The sizeof Operator
When dealing directly with memory addresses, it can be useful to know
how much memory each instance of a data type occupies. This information is
provided by the sizeof operator, which takes
one parameter, the name of a value data type (it does not handle
reference types), and returns the number of bytes in memory occupied by
that data type.
int X = sizeof(long); // returns 8
int Y = sizeof(float); // returns 4
Like pointers, the sizeof operator is only available inside unsafe methods or classes.
Pointer Arithmetic
It is possible to increment or decrement pointers, or to add integers
to them. However, when performing pointer arithmetic, the compiler assumes
that the basic unit of memory is the size of whatever data type a given pointer
points to. Hence, for example, syntactically adding 1 to a long* pointer actually results in sizeof(long) being added to the pointer. Similarly, if you syntactically subtract 3 from a double* , then what will actually happen is that 3*sizeof(double) will be subtracted from the pointer.
double SomeDouble = 20.0;
double *pD = &SomeDouble;
pD -= 3; // subtracts 3*sizeof(double) = 24 from pD contents
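The scaling by sizeof can be confirmed with a short sketch, which must be compiled with the /unsafe compiler option. It measures how far a double* moves, in bytes, when 3 is added to it. PointerArithmeticDemo and BytesMoved are hypothetical names, and the moved pointer is only compared, never dereferenced.

```csharp
using System;

class PointerArithmeticDemo
{
    // Must be compiled with the /unsafe compiler option.
    // Returns the distance in bytes between p and p + 3 for a double* p.
    public static unsafe long BytesMoved()
    {
        double someDouble = 20.0;
        double* pD = &someDouble;
        double* pMoved = pD + 3;             // address only; never dereferenced

        // Casting to byte* makes the subtraction count single bytes.
        return (byte*)pMoved - (byte*)pD;
    }

    static void Main()
    {
        Console.WriteLine(BytesMoved());     // 24
    }
}
```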
If you are using this technique to navigate to memory locations of variables,
you should be aware that variables are not always stored contiguously.
Modern 32-bit machines are designed to access memory in 32-bit (4-byte) blocks.
Hence, for performance reasons, the .NET runtime always tries to store each
variable beginning on a 4-byte boundary. This means that
gaps will exist in memory between any variables whose size is not a multiple
of 4.
Stack-Based Arrays
It is possible in an unsafe code block to create high-performance stack-based arrays, using the stackalloc
keyword. These stack-based arrays each consist simply of a block of memory
on the stack, big enough to hold a specified number of instances of a given
data type. These instances are accessed using pointer arithmetic, and there
is therefore none of the overhead of having a full System.Array object hanging around. The stackalloc
operator takes two parameters, the name of the data type and the number
of elements needed. It has an unusual syntax that looks like this.
double *pArray = stackalloc double [20];
The return value of stackalloc is a pointer to the start of the memory allocated. The above code will allocate 20 times sizeof(double) bytes of memory.
Accessing the elements of an array allocated with stackalloc
may be done using pointer arithmetic. But C# provides an easier syntax using
square brackets, which allows elements of stack-based arrays to be treated
syntactically in an identical manner to elements of full arrays.
double *pArray = stackalloc double [20];
pArray[0] = 30.0; // initializes first element of array
pArray[1] = 16.0; // initializes second element of array
The rule is as follows: When the C# compiler encounters an expression of the form p[X], where p is a pointer and X is an integer type, it always expands this to the expression *(p+X), which correctly accesses the appropriate element of an array allocated with stackalloc. This special syntax applies to all pointers; they don't have to have been allocated with stackalloc.
Note that stackalloc does not initialize the contents of its memory. Also, no bounds checking is performed when you attempt to access its elements.
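The following sketch, which must be compiled with the /unsafe compiler option, uses both forms of element access on the same stack-based array. StackArrayDemo and SumOfSquares are hypothetical names. Because the memory is not initialized, every element is written before it is read.

```csharp
using System;

class StackArrayDemo
{
    // Must be compiled with the /unsafe compiler option.
    public static unsafe double SumOfSquares(int count)
    {
        double* pArray = stackalloc double[count];

        // stackalloc does not clear the memory, so set every element first.
        for (int i = 0; i < count; i++)
            pArray[i] = i * i;          // same as *(pArray + i) = i * i

        double total = 0.0;
        for (int i = 0; i < count; i++)
            total += *(pArray + i);     // same as pArray[i]
        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumOfSquares(4));  // 0 + 1 + 4 + 9 = 14
    }
}
```

Since there is no bounds checking, it is up to the loops themselves to stay within count elements.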
Thread Safety
For the most part, support for threads and thread safety is provided by the base classes in the System.Threading namespace, and is therefore beyond the scope of this appendix. However, C# does provide one statement, lock,
which assists in thread safety by acquiring a mutual exclusion lock (also
known as a mutex) on a given reference object for the duration of a statement
or block statement. While one thread holds the lock, any other thread that
attempts to lock the same object is blocked until the lock is released. Note that the lock does not in itself stop code that never takes the lock from accessing the object.
// X is a reference type
lock(X)
{
// code that executes with mutual exclusion lock on X
}
Note that in this code, X must be a reference type, and it is the object referred to by X, not the variable X, which is subject to the lock. Internally, lock() works by providing a C# syntax wrapper around certain methods on the System.Threading.Monitor class. Further details of thread support are provided in Chapter 7, and in the MSDN documentation for the System.Threading namespace.
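As a minimal sketch of lock in use, the following program lets two threads increment a shared counter. The classes Counter and LockDemo are invented for the example, and a private object is used as the lock purely as a design choice for this sketch. Thread, ThreadStart, Start(), and Join() come from the System.Threading namespace.

```csharp
using System;
using System.Threading;

class Counter
{
    private object sync = new object();  // object used only as the lock
    private int count = 0;

    public void Increment()
    {
        lock (sync)
        {
            count++;    // only one thread at a time can run this block
        }
    }

    public int Count
    {
        get { lock (sync) { return count; } }
    }
}

class LockDemo
{
    static Counter counter = new Counter();

    static void Work()
    {
        for (int i = 0; i < 10000; i++)
            counter.Increment();
    }

    // Runs two threads to completion and returns the final count.
    public static int Run()
    {
        counter = new Counter();
        Thread first = new Thread(new ThreadStart(Work));
        Thread second = new Thread(new ThreadStart(Work));
        first.Start();
        second.Start();
        first.Join();
        second.Join();
        return counter.Count;
    }

    static void Main()
    {
        Console.WriteLine(Run());  // 20000
    }
}
```

Without the lock statement in Increment(), the two threads could interleave their updates and the final count could fall short of 20000.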
Keywords
In this section we present a list of all the recognized C# keywords, with their approximate meanings.
Keyword | Meaning |
abstract | Marks that a class cannot be instantiated but simply serves as a base class for other classes. |
base | Used to reference the base class of a class. |
bool | The struct, System.Boolean. |
break | Breaks out of the controlling for, foreach, do, while or switch statement. |
byte | The struct, System.Byte |
case | Indicates the start of commands for a particular value of the control variable in a switch statement |
catch | Marks the start of a block that throw transfers execution to |
char | The struct, System.Char. |
checked | Indicates the following block of code should run in a checked context: Overflow errors etc. should cause exceptions to be raised. |
class | Defines a new reference type |
const | Indicates that a variable or field cannot be assigned to |
continue | Transfers control to the next iteration of the controlling for, foreach, do or while statement. |
decimal | The struct, System.Decimal |
default | Marks code that should be executed if none of the tests in a switch statement were matched |
delegate | Defines a new type derived from System.Delegate |
do | Start of a loop that executes at least once and then continually until a specified condition is false. |
double | The struct, System.Double |
else | Marks the code block that should be executed if none of the conditions in an if .. else if block were true |
enum | Defines a new type derived from System.Enum |
event | Special delegate designed to work with Windows event architecture |
explicit | Indicates that a cast may only be used explicitly |
extern | Indicates that a method is implemented as a function by an external DLL. |
false | A Boolean value |
finally | Marks block of code that should always be executed on leaving a try block |
fixed | Prevents the garbage collector from moving a class |
float | The struct, System.Single |
for | A sophisticated loop that allows the developer to determine the action to be taken at the beginning of each iteration and the test of whether to exit the loop |
foreach | A loop that enumerates over items in a collection |
goto | Transfers control to the labeled statement |
if | Indicates the following statement should only be executed if some condition is true |
implicit | Indicates that a cast may be used either implicitly or explicitly |
in | Helper keyword for the foreach loop |
int | The struct, System.Int32 |
interface | Defines an interface |
internal | Marks an item as only being visible to code within the same assembly |
is | Tests whether an item is of a given data type |
lock | Wraps a mutual exclusion lock (mutex) around an object on the heap for the duration of a statement or block statement |
long | The struct, System.Int64 |
namespace | Defines a namespace |
new | Calls the constructor of an object. Also indicates that a method hides a base class method. |
null | The null reference - indicates that a reference doesn't refer to anything. |
object | A reference to the class System.Object |
operator | Defines an operator overload |
out | Marks a parameter as being one that the method being called will set the value of |
override | Indicates that a method overrides a similarly named method in the base class |
params | Indicates that a method takes a variable argument list |
private | Prevents any code outside the class from being able to see a class member |
protected | Prevents any code outside the class and derived classes from being able to see a class member |
public | Makes a member visible to all other code |
readonly | Prevents a field from being assigned to other than in the constructor of the containing data type |
ref | Indicates that the value of a parameter may be modified by the method being called |
return | Returns control to the next method up the call stack |
sbyte | The struct, System.SByte |
sealed | Prevents any classes from being derived from a class |
short | The struct, System.Int16 |
sizeof | Returns the number of bytes occupied by a particular data type |
stackalloc | For creating stack based arrays of value types in unsafe code. |
static | Indicates that a member of a class or struct is not associated with any particular instance of that class or struct |
string | The class, System.String |
struct | Defines a value data type |
switch | Presents a number of sets of statements, with control to be transferred to one of them depending on the value of some variable |
this | The class or struct instance with which this method is associated |
throw | Transfers control to the next catch statement that is able to handle a particular type of exception |
true | The Boolean value indicating true |
try | Indicates that exceptions may be generated by the following block of code and are handled by the subsequent catch block |
typeof | Returns a System.Type object that gives detailed information about a named type |
uint | The struct, System.UInt32 |
ulong | The struct, System.UInt64 |
unchecked | Cancels a checked statement. Returns to the default behaviour for overflow errors. |
unsafe | Marks a method or class as being one in which pointers and pointer operations may be used |
ushort | The struct, System.UInt16 |
using | Indicates a namespace that will not be named explicitly, or alternatively supplies an alternative name for a particular type within some namespace. This can also be used to indicate that IDisposable.Dispose() should be called on a variable when it goes out of scope. |
virtual | Indicates that a method may be overridden |
void | Specifies that a method does not return anything, or that we are not indicating the type that a pointer points to |
while | Loop that tests a condition before each iteration and keeps executing for as long as that condition is true. |