Revision as of 17:44, 16 December 2005
C vs Pascal: A language comparison
C and Pascal are both arguably descendants of the ALGOL programming language series. ALGOL introduced so-called "structured programming", where programs were constructed of single entry-single exit "components" such as "if", "while", "for", "case", etc. Also, whereas before ALGOL only the expression syntax for languages was described systematically, ALGOL defined the entire language in terms of a syntax or grammar. This tended to make the language more general and regular.
C and Pascal are often compared to each other, sometimes heatedly, probably because the languages have similar times of origin, influences, and purposes, and so represent two philosophical approaches to a similar need. Both languages have roughly the same data types, program structures and general layout. Both were used to design their own compilers early in their lifetimes, and both vied for superiority during the critical formative years of the early microcomputer age. It is less interesting to compare, say, C with Lisp or Pascal with Perl because these languages are so clearly different, with different aims.
Identifiers
C and Pascal are almost identical with regard to identifiers. All identifiers start with an alphabetical character, and continue with further alphabetical characters or digits. In C the character "_" (underscore) can also appear anywhere in the identifier, and identifiers that begin with "_" are considered system reserved, and are used to differentiate special system symbols from program symbols. Certain Pascal compilers use this naming scheme as well.
C and Pascal differ sharply in how they treat upper and lower case. In Pascal, case does not matter; in C, it does. Thus:
MyLabel
and
mylabel
are two different identifiers in C, but the same identifier in Pascal. This can cause problems if Pascal is used in a linker system primarily designed for C.
Keywords
Both C and Pascal use keywords, or words reserved for use by the language itself. Examples are "if", "while", "const", "for" and "goto", keywords that happen to be common to both languages.
Pascal is often said to be "wordy" compared to C. In Pascal, blocks begin and end with "begin" and "end". C uses "{" and "}", respectively. In Pascal, a function must begin with the keyword "function", a type with "type". In C, both of these are determined by context alone.
Syntax
C uses an abbreviated syntax compared to Pascal. C is also context-sensitive. For example, an identifier used as a "struct" can be reused for other purposes in the program.
C was deliberately designed to take such shortcuts. The syntax is more compact as a result, but is more difficult to parse, and has an irregular grammar.
Pascal was designed to have a simple and regular syntax with only one context sensitivity, which is the assignment ambiguity:
a := b;
where b could refer to a variable or a function.
Because of the simplicity of parsing Pascal, it is often used in compiler training courses, and the parsers for Pascal-based languages are often very fast.
Simple types
In C, the default type is "int", which corresponds to "integer" in Pascal. C can declare many objects to be int by default; for example, the return type of a function is int if not otherwise specified. However, this is considered "old C" practice, is rarely used today, and was removed from the language in C99. Pascal requires declaration of all types.
C accommodates different sizes and signed/unsigned modes for integers by using modifiers such as long, signed, unsigned, etc. The exact meaning of the resulting int type is machine-dependent. In Pascal, the same end is performed by declaring a "subrange" of integer:
type a = 1..100; b = -20..20; c = 0..100000;
This subrange feature (simply called a "range") is not supported by C.
A major, if subtle, difference between C and Pascal is how they promote integer operations. In Pascal, all operations on integers or integer subranges must have the same effect as if all of the operands were promoted to a full integer. In C, there are defined rules as to how to promote different types of integers, typically with the effect that the result of an operation on two integers will not have a precision greater than that of the wider operand. This can make C more efficient in its use of registers or other resources on a machine with mixed word lengths. A highly optimizing Pascal compiler can reduce, but not eliminate, the cost of this under standard Pascal rules. A side effect of this Pascal feature is that many Pascal compilers have special integer types and promotion rules.
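The C side of this can be sketched with a small example. The function names are hypothetical and chosen only for illustration; the sketch shows one of C's promotion rules, namely that operands narrower than int are widened to int before arithmetic, and that narrowing happens only when the result is converted back.

```c
#include <limits.h>

/* Operands narrower than int are promoted to int before the addition,
   so the result is not limited to the range of unsigned char. */
int promoted_sum(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Converting the int result back to unsigned char reduces it
   modulo UCHAR_MAX + 1 (256 on machines where char has 8 bits). */
unsigned char narrowed_sum(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}
```

With 8-bit chars, promoted_sum(200, 100) yields 300, while narrowed_sum(200, 100) yields 44.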
In C, characters are interchangeable with ints. For example the following is valid:
int a; a = 'x'+1;
In Pascal, char is a separate type from integer, and must be specifically converted to integer with "ord". Likewise, a character is formed from an integer by the "chr" function.
C formerly considered int and pointer types to be the same, but this concept proved problematic, especially in machines that had different sizes for the two. In ANSI C, they are separate.
Character types
C has no specific character type. The type char is actually an integer type that is no larger than "short int". Characters can be treated as interchangeable with ints at any time:
x = x+'g';
This integer form of characters in C is most clearly shown by the types:
signed char
unsigned char
which are needed because char types are often used as "very short ints", with no character meaning at all.
In Pascal, character types are distinct types, and are rarely used as general integers. In order to convert character to integer, the function "ord" must be used. In order to convert integer to character, the function "chr" must be used.
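As a rough sketch of the contrast, the following C functions do what Pascal would express with "ord" and "chr". The names ord_of, chr_of and next_char are hypothetical, introduced here only for illustration.

```c
/* char takes part in integer arithmetic directly, with no explicit
   conversion step. */
char next_char(char c)
{
    return (char)(c + 1);
}

/* Rough counterparts of Pascal's ord and chr (hypothetical names).
   In C these are plain conversions between integer types. */
int ord_of(char c)
{
    return (unsigned char)c;
}

char chr_of(int code)
{
    return (char)code;
}
```

In the common character sets, next_char('x') gives 'y', and chr_of(ord_of(c)) returns c unchanged.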
Boolean types
C originally had no specific Boolean type (C99 later added "_Bool"), but it does have Boolean operators, and operators that give Boolean results. A Boolean, like a character value, is an integer, and Booleans are often stored in their most compact form in "char" types (since that means a single byte on many machines). The relational operators "<", ">", "==" and others give Boolean results: "1" when true, "0" when false. However, tests for Boolean truth check for zero (false) versus nonzero (true). This means that any nonzero int value, not only 1, can effectively be used as true.
C can use standard bitwise operators for Booleans, or special Boolean operators. For example, "|" is a bitwise OR, and "||" is a Boolean OR. In many cases the effect is the same, but this is not guaranteed. For example, ~0 (bitwise NOT 0) is not equal to 1, but is -1 on most computers, whereas !0 (Boolean NOT 0) is equal to 1. The programmer must therefore pay attention to whether the operands are intended as Boolean values or bit values.
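A minimal sketch of the difference, using hypothetical wrapper functions around the operators. The -1 result for ~0 assumes a two's complement machine, which covers practically all current hardware.

```c
/* Boolean (logical) operators always yield 0 or 1. */
int logical_and(int a, int b) { return a && b; }
int logical_or(int a, int b)  { return a || b; }
int logical_not(int a)        { return !a; }

/* Bitwise operators combine the operands bit by bit. */
int bitwise_and(int a, int b) { return a & b; }
int bitwise_or(int a, int b)  { return a | b; }
int bitwise_not(int a)        { return ~a; }
```

Here logical_and(1, 2) is 1 because both operands are nonzero, but bitwise_and(1, 2) is 0 because the two values share no set bits.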
In Pascal, Boolean is an enumerated type whose values are "false" and "true". Just as in C they are stored as an integer value, but values greater than 1 are not valid. If a Boolean is to be converted to integer, "ord" is used:
i := ord(b);
There is no special language function for converting integer to Boolean, but the conversion is simple in practice:
b := i <> 0;
Real/floating point types
There is little difference between C and Pascal in real/floating point types. In C, ints can be converted to floats, and vice versa, at will. In Pascal, integers can be converted to reals at will, but conversion of real to integer must be done via the functions "trunc" and "round", which truncate or round off the fraction, respectively.
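For comparison, the two Pascal conversions can be written out explicitly in C. This is a sketch with hypothetical names; it assumes the value fits in a long, and the rounding function follows the half-away-from-zero rule that standard Pascal defines for "round".

```c
/* Counterpart of Pascal's trunc: a C cast from a floating type to an
   integer type discards the fraction (rounds toward zero). */
long pascal_trunc(double x)
{
    return (long)x;
}

/* Counterpart of Pascal's round: add or subtract one half, then
   truncate, so that halves round away from zero. */
long pascal_round(double x)
{
    return (long)(x >= 0.0 ? x + 0.5 : x - 0.5);
}
```

So pascal_trunc(2.7) is 2 and pascal_round(2.7) is 3. In C the first conversion also happens implicitly on assignment, which is exactly what Pascal disallows.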
Array types
Both C and Pascal allow arrays to be "structured", or consist of other complex types, including other arrays. However, there the similarity between the languages ends. In C, arrays are declared by their number of elements:
int x[10];
and the indexes are numbered from 0 to n, where n is the number of elements - 1. In Pascal, the starting and ending indexes are specified by a subrange (as introduced in "simple types" above):
type x = array [1..10] of integer;
Contrary to popular myth, this does not make Pascal arrays substantially different to index than C arrays. For example:
type x = array [0..9] of integer;
would be indexed as 0 to 9, just as in C.
In Pascal, array indices can be any range type, even enumerations:
type enum = (red, green, blue); myarray = array [enum] of integer;
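C has no direct equivalent of subrange or enumeration index types, but both can be approximated by hand. The sketch below uses hypothetical names throughout: a trailing enumerator (COLOR_COUNT) sizes the array, and a lower bound is subtracted to emulate an index range that starts at 1.

```c
enum color { RED, GREEN, BLUE, COLOR_COUNT };

/* One total per color, indexed directly by the enumeration. */
static int color_totals[COLOR_COUNT];

void add_to_color(enum color c, int amount) { color_totals[c] += amount; }
int color_total(enum color c)               { return color_totals[c]; }

/* Emulating a Pascal index range of 1..10: the programmer subtracts
   the lower bound on every access. */
#define LOW_INDEX  1
#define HIGH_INDEX 10
static int one_based[HIGH_INDEX - LOW_INDEX + 1];

void set_one_based(int i, int value) { one_based[i - LOW_INDEX] = value; }
int get_one_based(int i)             { return one_based[i - LOW_INDEX]; }
```

Unlike the Pascal declarations, nothing here stops a caller from passing an index outside the intended range.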
In C, arrays are equivalent to a pointer with the same element type. Thus:
int *a; int b[10];
can be used as:
a = b;
which causes a and b to point to the same array. Similarly:
int *a;
and
int b[];
are both pointers to int when they appear as function parameters.
The equivalence of C arrays and pointers suggests to programmers that arrays are "dynamic" or changing in length, for C. Actually, arrays in C are fixed, but pointers to them are interchangeable. So for example:
int *a, b[10]; a = b;
declares a fixed length array of 10 ints, and defines b to be a (constant) pointer to that. Then, a is equated to point to that same array.
This flexibility allows C to manipulate any length array using the same code. However it also leaves the problem of how to find the exact length of an array up to the programmer, which can cause memory access faults.
In Pascal, arrays are a distinct type from pointers. This makes bounds checking for arrays possible from a compiler perspective. Practically all Pascal compilers support range checking as a compile option. The ability both to have arrays that change length at runtime and to check them under language control is often termed "dynamic arrays".
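In C, the usual way to handle arrays of any length is to pass the length alongside the pointer, and any bounds check has to be written by the programmer. The following is a minimal sketch of that convention; checked_get and sum_array are hypothetical names, not library functions.

```c
#include <stddef.h>

/* Store a[i] in *out and return 1 if i is within bounds; return 0
   and leave *out untouched otherwise. */
int checked_get(const int *a, size_t len, size_t i, int *out)
{
    if (i >= len)
        return 0;
    *out = a[i];
    return 1;
}

/* The same code works for an array of any length, because the length
   travels with the pointer as a separate argument. */
long sum_array(const int *a, size_t len)
{
    long total = 0;
    size_t i;
    for (i = 0; i < len; i++)
        total += a[i];
    return total;
}
```

A Pascal compiler with range checking enabled generates the equivalent of the test in checked_get automatically.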
Strings
Both C and Pascal consider character strings to be a special case of an array:
char a[10];
type a = packed array [1..10] of char;
In both languages, it is up to the programmer to determine the exact length of a character array. However, in C, the method of using a null character (zero) as a "sentinel" for the end of string is supported by the language in the special case of a constant string:
char *a; a = "the rain in spain";
In this case, the C language processor automatically adds the null to the end of the constant string, so that the end can be found.
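The sentinel convention can be sketched as follows. The function sentinel_length is a hypothetical stand-in for what the standard library's strlen does.

```c
#include <stddef.h>

/* Count characters up to, but not including, the terminating null. */
size_t sentinel_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

For the constant "the rain in spain" this returns 17, while sizeof applied to the same constant gives 18, because the array the compiler creates includes the added null.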
Pascal also supports the use of string constants for the special case of a packed array whose starting index is 1:
type string = packed array [1..20] of char; var a: string; a := 'the rain in spain   ';
However, Pascal leaves it entirely up to the programmer how to detect the end of a string. The most common method is to pad strings on the right side, but this is purely a programmer convention (many Pascal implementations have dynamic strings as a language extension).
C does not have built in string assignment. In the code:
char *a; a = "the rain in spain";
the string is not actually being transferred to a, but rather a is being made to point to the constant string in memory.
Pascal does have string (and in fact, all structured type) assignment, however, the constant must match the type exactly:
type string = packed array [1..20] of char; var a: string; a := 'the rain in spain   ';
Thus constant strings must be padded out to force type equality. This, along with the general lack of dynamic string support, is considered to be a problem with the language.
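The right-padding convention can be illustrated in C, where it has to be done by hand. This is only a sketch: pad_right is a hypothetical helper, and the field it fills is a fixed-width character array with no terminating null, like a Pascal packed array of char.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into the fixed-width field dst[0..width-1], padding on the
   right with spaces. No terminating null is written. Returns 1 on
   success, or 0 if src is too long for the field. */
int pad_right(char *dst, size_t width, const char *src)
{
    size_t len = strlen(src);
    if (len > width)
        return 0;
    memcpy(dst, src, len);
    memset(dst + len, ' ', width - len);
    return 1;
}
```

Filling a 20-character field from "the rain in spain" leaves 17 characters of text followed by 3 spaces.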
Record types
Both C and Pascal can declare record types. In C, they are termed "structures".
struct a { int b; char c; };
type a = record b: integer; c: char end;
In C, the exact bit length of a field can be specified:
struct a { unsigned int b:3; unsigned int c:1; };
Contrary to popular opinion, this can also be done in Pascal by using the subrange construct:
type a = record b: 0..7; c: 0..1 end;
However, the compiler must support packing of records to the bit level.
Both C and Pascal can have records which have different fields overlapping each other:
union a { int a; float b; };
type a = record case boolean of false: (a: integer); true: (b: real) end;
Both language processors are free to allocate only as much space for these records as needed to contain the largest type in the union/record.
The biggest difference between C and Pascal is that Pascal allows the use of a "tagfield" for the language processor to determine if the valid component of the variant record is being accessed:
type a = record case q: boolean of false: (a: integer); true: (b: real) end;
In this case, the tagfield q must be set to the right state to access the proper parts of the record.
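In C, the same effect is usually obtained by convention: the union is wrapped in a struct that carries its own tag, and the programmer checks the tag before each access. The sketch below uses hypothetical names and is not enforced by the language in any way.

```c
struct tagged {
    int is_real;                /* tag: 0 selects u.a, nonzero selects u.b */
    union { int a; float b; } u;
};

struct tagged make_int(int v)
{
    struct tagged t;
    t.is_real = 0;
    t.u.a = v;
    return t;
}

struct tagged make_real(float v)
{
    struct tagged t;
    t.is_real = 1;
    t.u.b = v;
    return t;
}

/* Each accessor returns 1 and stores the value only when the tag
   matches the requested member; otherwise it returns 0. */
int get_int(const struct tagged *t, int *out)
{
    if (t->is_real)
        return 0;
    *out = t->u.a;
    return 1;
}

int get_real(const struct tagged *t, float *out)
{
    if (!t->is_real)
        return 0;
    *out = t->u.b;
    return 1;
}
```

Code that bypasses the accessors can still read the wrong member, which is the check a Pascal language processor is permitted to make on the programmer's behalf.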
C, in its original version, could not assign structs or pass them to functions, but Pascal can. This capability was standardized in ANSI C (C89).
Pointers
In C, pointers can be made to point at most program objects, even constant objects:
int a; int *b; b = &a;
In C, since arrays and pointers are equivalent, the following are the same:
a = b[5]; a = *(b+5);
Thus, pointers are often used in C as just another method to access arrays.
To create dynamic data, the library functions "malloc" and "free" are used to obtain and release dynamic blocks of data. Thus, dynamic memory allocation is not built into the language processor.
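A minimal sketch of the library approach follows. The wrappers new_int and dispose_int are hypothetical names, chosen to echo Pascal's "new" and "dispose"; only malloc and free are real library functions.

```c
#include <stdlib.h>

/* Allocate one int on the heap and initialize it. Returns NULL if
   the allocation fails. */
int *new_int(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;
}

/* Release a block obtained from new_int. */
void dispose_int(int *p)
{
    free(p);
}
```

After b = new_int(5), an assignment c = b makes both pointers refer to the same block, and the block must be released exactly once.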
In Pascal, pointers are much more restricted. Each pointer is bound to a single dynamic data item, and can only be moved by assignment:
type a = ^integer; var b, c: a; new(b); c := b;
In Pascal, pointers can never point to program objects such as variables. This tends to make Pascal more type safe than C, but not completely type safe. Pascal can still have invalid pointer references in several ways. For example, a pointer can be referenced while uninitialized, or after it has been disposed.
Statements
The statements used between C and Pascal are roughly analogous.
if (x) ... else ...
while (x) ...
do ... while (x);
switch (x) { case a: ...; break; case b: ...; break; default: ...; }
if x then ... else ...
while x do ...
repeat ... until x
case x of a: ...; b: ... end
Pascal, in its original form, did not have an equivalent version of "default" case (this is a common extension).
C has so called "early out" statements "break" and "continue". Pascal does not. There is controversy about whether the inclusion of these statements is in keeping with structured programming methodology. The best that can be said about this is that the use of break and continue may make programming easier, but there is no case where they cannot be replaced by "orthodox" structured programming constructs.
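The claim that "break" can always be replaced can be illustrated with a linear search written both ways. Both functions are hypothetical examples and return the index of the first match, or -1 if there is none.

```c
#include <stddef.h>

/* Early-out version: leave the loop as soon as the key is found. */
long find_with_break(const int *a, size_t len, int key)
{
    long found = -1;
    size_t i;
    for (i = 0; i < len; i++) {
        if (a[i] == key) {
            found = (long)i;
            break;
        }
    }
    return found;
}

/* Single entry, single exit version: the loop condition itself
   carries the "not yet found" test, so no break is needed. */
long find_structured(const int *a, size_t len, int key)
{
    long found = -1;
    size_t i = 0;
    while (i < len && found < 0) {
        if (a[i] == key)
            found = (long)i;
        i++;
    }
    return found;
}
```

The second form is what a standard Pascal programmer would write; it costs one extra condition in the loop test.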
Both C and Pascal have a "goto" statement. However, Pascal allows jumps between different procedures or functions, which is commonly used to implement error recovery. C has this capability via the ANSI C "setjmp" and "longjmp". This is equivalent, but a little more difficult to use, because you must arrange for the setjmp to be executed both before the jump is executed, and when the jump is executed.
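A sketch of the C technique, with hypothetical function names. The setjmp call runs twice in effect: once when the recovery point is established (returning 0), and once more when longjmp transfers control back to it (returning the nonzero value passed to longjmp).

```c
#include <setjmp.h>

static jmp_buf recovery_point;

/* A routine somewhere down the call chain that abandons its work by
   jumping straight back to the recovery point. */
static void deep_worker(int should_fail)
{
    if (should_fail)
        longjmp(recovery_point, 1);
}

/* Returns 0 if the work completed normally, or -1 if it was
   abandoned through longjmp. */
int run_with_recovery(int should_fail)
{
    if (setjmp(recovery_point) != 0)
        return -1;              /* reached only via longjmp */
    deep_worker(should_fail);
    return 0;
}
```

In Pascal the same recovery would be a goto from the nested procedure to a label in the enclosing one.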
Both C and Pascal use ";" between statements. However, C always terminates statements with ";", but Pascal considers it a separator. In practice, there is little difference between these styles.
In C, comments are written /* comment */ (C99 also accepts // to the end of the line). In Pascal, they are written (* comment *) or { comment }.
Functions/Procedures
In Pascal, routines that return a value are called functions, and routines that don't return a value are called procedures. In C all routines are called functions, but routines that do not return a value are declared to return "void", meaning that they do not return anything. Note that "void" is not a valid variable type: no variable can be declared with it. It appears as a function return type, as the marker for an empty parameter list, and in the generic pointer type "void *".
In practice Pascal procedures are equivalent to C functions that return "void", and Pascal functions are equivalent to C functions that return a non-void type.
The following two declarations in C:
int f(int x, int y); void k(int q);
are equivalent to the following declarations in Pascal:
function f(x, y: integer): integer; procedure k(q: integer);
In Pascal, there are two kinds of parameters: value parameters and pass by reference, or VAR, parameters. In C, there are only value parameters, but C's ability to point to any variable allows programmers to construct their own pass by reference scheme:
int f(int *k); x = f(&t);
function f(var k: integer): integer; x := f(t);
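The classic illustration of the C scheme is a swap routine. The sketch below uses a hypothetical swap_ints; the caller passes addresses with "&", and the function reaches the caller's variables through the pointers.

```c
/* Exchange the two ints that x and y point to. The pointers
   themselves are passed by value, as all C parameters are. */
void swap_ints(int *x, int *y)
{
    int tmp = *x;
    *x = *y;
    *y = tmp;
}
```

A call looks like swap_ints(&p, &q). With Pascal VAR parameters the call would simply be swap(p, q), with the language processor taking the addresses.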
In C, it is possible to create a function with any number of parameters:
int f(int a, ...); f(1, 2, 3, 4, 5);
The function f uses a special set of functions that allow it to access each of the parameters in turn. This set of functions was undefined in original C, but was defined in ANSI C. In practice, such a function is easy to call, but writing one is fairly complex and was formerly machine dependent. Perhaps because of this, the feature is mainly used to form the language support library for C, specifically I/O.
Pascal has no equivalent to C's n-parameter feature. However, Pascal has I/O statements built into the language, so there is less need for it.
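The ANSI C access functions live in the header stdarg.h. The following sketch shows a hypothetical sum_of, whose first parameter tells it how many int arguments follow; the function has no other way to discover the count or the types.

```c
#include <stdarg.h>

/* Add up `count` int arguments that follow the first parameter. */
int sum_of(int count, ...)
{
    va_list args;
    int total = 0;
    int i;

    va_start(args, count);          /* begin after the last named parameter */
    for (i = 0; i < count; i++)
        total += va_arg(args, int); /* fetch the next argument as an int */
    va_end(args);
    return total;
}
```

Passing a count that does not match the actual arguments, or an argument of the wrong type, is an error the compiler cannot detect. The printf function faces the same issue with its format string.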
Pascal allows procedures and functions to be nested:
procedure a; procedure b; begin end; begin end;
This is convenient to allow variables that are local to a group of procedures, but not global.
Standard C has never had this feature, although some compilers (GCC, for example) offer nested functions as an extension.
Preprocessor
The C language was originally defined as needing a "preprocessor", which was a separate pass that handled constant, type, include and macro definitions. This was required since the first C had neither constant declarations nor type declarations. However, C obtained those features later with ANSI C.
Pascal wasn't defined with a preprocessor, but several programmers did use it with a preprocessor, sometimes the same one that was used with C. It certainly was not as common as preprocessor use with C, but Pascal had constant and type defines, so the remaining use for the preprocessor was include files and macros.
Although this is often pointed at as a "lack" in Pascal, technically C didn't have program modularity or macros built in either, and both languages could just as well be equipped with a preprocessor.
As a practical matter, most of the succeeding versions of Pascal used a type controlled modular method instead of include files, so a preprocessor is largely unnecessary.
Type escapes
C, in keeping with its loose typing requirements, features the ability to "cast" a type to become another:
int a; float b; a = (int) b;
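A cast between arithmetic types, as here, converts the value (the fraction is discarded). Casts between pointer types instead reinterpret the underlying storage, and the meaning of that is machine dependent. This feature often helps with low level handling of data. For example, a floating point value can be output to a file as a series of bytes.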
Again contrary to popular opinion, Pascal can also do this, with a considerably more complicated method:
var a: integer; b: real; a2c: record case boolean of false: (a: integer); true: (b: real); end; begin a2c.b := b; a := a2c.a; end;
In Pascal, such type conversions are discouraged, hence the difficulty.
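The two kinds of type escape can be sketched side by side in C. The names are hypothetical, and the second function assumes that float is 32 bits wide (a compile-time check enforces this); the particular bit pattern it returns depends on the machine's floating point format.

```c
#include <stdint.h>
#include <string.h>

/* Compile-time check of the size assumption: the array size becomes
   negative, and compilation fails, if float is not 32 bits wide. */
typedef char float_is_32_bits[sizeof(float) == sizeof(uint32_t) ? 1 : -1];

/* Value conversion: the cast discards the fraction. */
int truncate_to_int(float f)
{
    return (int)f;
}

/* Reinterpretation: copy the bytes of the float unchanged into an
   integer of the same size, as the Pascal variant record does. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On a machine using IEEE 754 single precision, float_bits(1.0f) is 0x3F800000, whereas truncate_to_int(1.0f) is simply 1.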
Files
In the C language itself, there are actually no files at all. This capability is left to libraries. This is in keeping with ALGOL, which left the I/O capabilities of the language up to the specific installation.
In practice, a large library of standard file access features is available with C, so this is not an issue.
Pascal has files built into the language.
The typical statements used to perform I/O in each language are:
printf("The sum is: %d\n", x);
writeln('the sum is: ', x);
The main difference is that C uses a "format string" that is interpreted to find the arguments to the printf function and convert them, whereas Pascal performs that under the control of the language processor. The Pascal method is arguably faster, because no interpretation takes place, but the C method is highly extensible.
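The format string mechanism is an ordinary library call, so the same conversion can target a memory buffer instead of the output file. The sketch below wraps the standard snprintf function (C99) in a hypothetical format_sum.

```c
#include <stdio.h>

/* Format the message into buf, writing at most size bytes including
   the terminating null. Returns the length the full message needs,
   as snprintf does. */
int format_sum(char *buf, size_t size, int x)
{
    /* "%d" is interpreted at run time to convert x to decimal text. */
    return snprintf(buf, size, "The sum is: %d\n", x);
}
```

Because the format is just a string, a program can choose or build it at run time, which is the extensibility referred to above. The cost is that a mismatch between the format and the arguments is not a language-level error.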
"Blue Sky" Pascal
In commenting on Pascal vs. C, it is certainly germane to mention that some popular Pascal implementations have removed virtually all differences with C by incorporating C methods and constructs into Pascal. Examples include type casts, being able to "coin" a pointer to any variable, local or global, and different types of integers with special promotion properties.
These weren't discussed here because those implementations basically reduce the question of "what is the difference between C and Pascal?" to the answer "none". This is certainly one important technique towards achieving a "universal" Pascal. In fact, many popular Pascal implementations are an amalgamation of the languages Pascal, C and Basic, the last because strings are extensively supported, even though neither original Pascal nor C has such support.
However, in languages it is not true that you can have your cake and eat it too. The incorporation of C's cavalier attitude towards types and type conversions can result in a Pascal that loses the type security of original Pascal. For example, Java and C# were created to address some of the type security issues of C, and have "managed" pointers that cannot be used to create invalid references. In its original form (as described by Niklaus Wirth), Pascal qualifies as a managed pointer language, some 30 years before either Java or C#. However, a Pascal amalgamated with C would lose that protection.
Epilogue
It is difficult to produce a truly impartial comparison of C and Pascal, and even more difficult to avoid offending one or another language aficionado.
However, C and Pascal are extraordinarily similar languages, if you look at the basic program structures, data and aims of the two languages. Each time a proponent of C claims that program X cannot be done in Pascal, someone else shows that it can be done. Each time a proponent of Pascal claims that program Y cannot be made machine independent in C, someone else shows that this, too, can be done.
About the only fault with Pascal that everyone agrees on is the lack of dynamic arrays, which even the creator of Pascal later agreed was not a good idea. Many, or even most later Pascal compilers added an extension for that problem, and the ISO 7185 standard addressed it as well.
The remaining major difference between the languages that everyone agrees on is type security. This is neither the runaway emergency for C that many claim (you can program array checks in C, you just have to do it yourself), nor the massive lack for Pascal (you can escape types in Pascal, it is just made painful on purpose).
Although Pascal is sometimes described as the "bondage" language, many people can and do use it to construct systems and low level device drivers.
Although C was originally described as a "systems" or "low level" language, it is clearly used for all applications, including high level ones.
Further reading
- Kathleen Jensen and Niklaus Wirth: PASCAL - User Manual and Report. Springer-Verlag, 1974, 1985, 1991, ISBN 0-387-97649-3 and ISBN 3-540-97649-3
- Brian Kernighan, Dennis Ritchie: The C Programming Language. Also known as K&R — The original book on C.
- 1st, Prentice Hall 1978; ISBN 0-131-10163-3. Pre-ANSI C.
- 2nd, Prentice Hall 1988; ISBN 0-131-10362-8. ANSI C.
- ISO/IEC 9899. The official C:1999 standard, along with defect reports and a rationale.