Misplaced Pages

Programming language

Article snapshot taken from Wikipedia with Creative Commons Attribution-ShareAlike license.

Revision as of 14:33, 15 December 2001

A programming language is a standardized method for expressing instructions to a computer. The language allows a programmer to precisely specify what kinds of data a computer will act upon, and precisely what actions to take under various circumstances.


This serves two primary purposes. First, the internal representation a computer uses for its data and operations (at the lowest level, just on and off switches) is not easily understood by humans, so translating a human-readable language into those internal representations makes programming easier. Second, programs can be transported between different computers: those internal representations also differ from one computer to the next, but if each computer is capable of translating the human-readable language into its own internal structures, then the same program will operate on both.


If the translation mechanism used translates the program text as a whole and then runs the internal format, this mechanism is spoken of as compilation. The compiler is therefore a program which takes the human-readable program text (called source code) as data input and supplies object code as output. This object code may be machine code directly usable by the processor, or it may be code matching the specification of a virtual machine, and run under that environment.


If the program text is translated step by step at runtime, with each translated step being executed immediately, the translation mechanism is spoken of as an interpreter. Interpreted programs usually run more slowly than compiled programs, but have more flexibility because they are able to interact with the execution environment, instead of all interactions being planned beforehand by the programmer.
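As a rough illustration of the two translation strategies, the following Python sketch runs the same program text both ways. It relies on Python's built-in compile and exec functions; the sample program and the names run_compiled and run_stepwise are invented for this example, and the stepwise version assumes that every line is a complete statement.

```python
SOURCE = "x = 2\ny = x * 21\n"  # hypothetical sample program


def run_compiled(source):
    """Translate the whole program text first, then run the translated form."""
    code_object = compile(source, "<example>", "exec")
    namespace = {}
    exec(code_object, namespace)
    return namespace


def run_stepwise(source):
    """Translate and execute one line at a time, as an interpreter would."""
    namespace = {}
    for line in source.splitlines():
        # Assumption: each line is a complete statement on its own.
        exec(compile(line, "<example>", "exec"), namespace)
    return namespace


print(run_compiled(SOURCE)["y"], run_stepwise(SOURCE)["y"])
```

Both functions reach the same result; they differ only in when the translation happens.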


Many languages can be either compiled or interpreted, but most are better suited for one than the other.


Features of a Programming Language

Each programming language can be thought of as a set of formal agreements concerning syntax, vocabulary, and meaning, between the programmers who use the language and the implementers or vendors who create the programming system. Most languages that are widely used, or have been used for a considerable period of time, have standardization bodies that meet regularly to create and publish formal definitions of the language, and discuss extending or supplementing the already extant definitions.


These agreements usually include:

  • Data and Data Structures
  • Instruction and Control Flow
  • Reference Mechanisms and Re-use
  • Design Philosophy


Data and Data Structures

Internally, all data in a computer is simply on-off states, but humans use these to represent information in the real world like names, bank accounts, measurements, and so on. So programming languages allow users to specify data in several ways that better suit our uses.


Languages can be classified as statically typed (e.g. C++ or Java) or dynamically typed (e.g. Lisp, JavaScript, Tcl or Prolog). Statically-typed languages can be further subdivided into languages with manifest types, where each variable and function declaration has its type explicitly declared, and type-inferred languages (e.g. ML).


With statically-typed languages, there usually are pre-defined types for individual pieces of data (such as numbers within a certain range, strings of letters, etc.), and programmatically named values (variables) can have only one fixed type and allow only certain operations: numbers cannot change into names and vice versa. Dynamically-typed languages treat all data locations interchangeably, so inappropriate operations (like adding names, or sorting numbers alphabetically) will not cause errors until run-time. Type-inferred languages superficially treat all data as not having a type, but actually do sophisticated analysis of the way the program uses the data to determine which elementary operations are performed on the data, and therefore deduce what type the variables have at compile-time. Type-inferred languages can be more flexible to use, while creating more efficient programs; however, this capability is difficult to include in a programming language implementation, so it is relatively rare.
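The run-time behaviour of a dynamically-typed language can be sketched in Python, which is itself dynamically typed. The function names add and try_add are invented for this example. Nothing rejects the inappropriate call ahead of time; the error appears only when the offending operation is executed.

```python
def add(a, b):
    # No types are declared; any two values may be passed in.
    return a + b


def try_add(a, b):
    """Return the sum, or a message if the operation fails at run-time."""
    try:
        return add(a, b)
    except TypeError as error:
        return "run-time type error: " + str(error)


print(try_add(2, 3))      # two numbers can be added
print(try_add("Ada", 3))  # a name plus a number fails only when executed
```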


It is possible to perform type inference on programs written in a dynamically-typed language, but it is legal to write programs in these languages that make type inference infeasible.


Sometimes statically-typed languages are called "type-safe" or "strongly typed", and dynamically-typed languages are called "untyped" or "weakly typed"; confusingly, these same terms are also used to refer to the distinction between languages like Eiffel, Oberon, Lisp, Scheme, or OCaml, in which it is impossible to use a value as a value of another type and possibly corrupt data from an unrelated part of the program or cause the program to crash, and languages like FORTH, C, assembly language, C++, and most implementations of Pascal, in which it is possible to do this.
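What it means to use a value as a value of another type can be imitated in Python with the standard struct module. This is only a simulation, with an invented function name float_bits: Python itself belongs to the first group of languages and refuses such reinterpretation of its own values, whereas in C a pointer cast would perform it directly on memory.

```python
import struct


def float_bits(value):
    """Reinterpret the 4 bytes of a 32-bit float as an unsigned integer."""
    raw = struct.pack("<f", value)      # store the value as an IEEE 754 single
    return struct.unpack("<I", raw)[0]  # read the same bytes back as an integer


print(hex(float_bits(1.0)))
```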


Sometimes type-inferred and dynamically-typed languages are called "latently typed."


Most languages also provide ways to assemble complex data structures (for example arrays, lists, stacks, and files) from built-in types and to associate names with these new combined types. Object-oriented languages allow the programmer to assemble complex structures along with behaviors specific to those data structures.
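A minimal Python sketch of both ideas follows: a named type combined from built-in types, with a behavior attached to it. The Account type and its fields are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """A named type combined from a string, a number, and a list."""
    owner: str
    balance: float = 0.0
    history: list = field(default_factory=list)

    def deposit(self, amount):
        """Behavior specific to this data structure."""
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount
        self.history.append(amount)


account = Account("Ada")
account.deposit(25.0)
print(account)
```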


Aside from when and how the correspondence between expressions and types is determined, there's also the crucial question of what types the language defines at all, and what types it allows as the values of expressions (expressed values) and as named values (denoted values). Low-level languages like C typically allow programs to name memory locations, regions of memory, and compile-time constants, while allowing expressions to return values that fit into machine registers; ANSI C extended this by allowing expressions to return struct values as well. Functional languages often allow variables to name run-time computed values directly instead of naming memory locations where values may be stored. Languages that use garbage collection are free to allow arbitrarily complex data structures as both expressed and denoted values. Finally, in some languages, procedures are allowed only as denoted values (they cannot be returned by expressions or bound to new names); in others, they can be passed as parameters to routines, but cannot otherwise be bound to new names; in others, they are as freely usable as any expressed value, but new ones cannot be created at run-time; and in still others, they are first-class values that can be created at run-time.
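The last case, procedures as first-class values created at run-time, can be sketched in Python. The names make_scaler, apply_twice, triple, and alias are invented for this example.

```python
def make_scaler(factor):
    """Create a new procedure at run-time and return it as a value."""
    def scale(value):
        return value * factor
    return scale


def apply_twice(procedure, value):
    """Accept a procedure as a parameter, like any other value."""
    return procedure(procedure(value))


triple = make_scaler(3)  # an expression returned a procedure
alias = triple           # the procedure is bound to a second name
print(apply_twice(triple, 2), alias(5))
```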


Instruction and Control Flow

Once data is specified, the machine must be instructed how to perform operations on the data. Elementary statements may be specified using keywords or may be indicated using some well-defined grammatical structure. Each language takes units of these well-behaved statements and combines them using some ordering system. Depending on the language, differing methods of grouping these elementary statements exist. This allows one to write programs that are able to cover a variety of input, instead of being limited to a small number of cases. Furthermore, beyond the data manipulation instructions, other typical instructions in a language are those used to control processing (branches, definitions by cases, loops, backtracking, functional composition).
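Several of these control instructions can be seen together in a short Python sketch. The functions classify and compose are hypothetical; the example shows a loop, a definition by cases, and functional composition, and it handles inputs of any length rather than a fixed number of cases.

```python
def classify(numbers):
    """Count negative, zero, and positive inputs using a loop and branches."""
    counts = {"negative": 0, "zero": 0, "positive": 0}
    for number in numbers:   # loop: repeat for every input
        if number < 0:       # branch: definition by cases
            counts["negative"] += 1
        elif number == 0:
            counts["zero"] += 1
        else:
            counts["positive"] += 1
    return counts


def compose(outer, inner):
    """Functional composition: feed the result of inner into outer."""
    return lambda value: outer(inner(value))


count_positive = compose(lambda counts: counts["positive"], classify)
print(classify([-2, 0, 5, 7]), count_positive([1, 2, -3]))
```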


Reference Mechanisms and Re-use

The core of the idea of reference is that there must be a method of indirectly designating storage space. The most common method is through named variables. Depending on the language, further indirection may include references that are pointers to other storage space stored in such variables or groups of variables. Similar to this method of naming storage is the method of naming groups of instructions. Most programming languages use macro calls, procedure calls or function calls as the statements that use these names. Using symbolic names in this way allows a program to achieve significant flexibility, as well as a high measure of reusability. Indirect references to available programs or predefined data divisions allow many application-oriented languages to integrate typical operations as if the programming language included them as higher level instructions.
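In Python, for instance, a variable name designates storage indirectly, two names can refer to the same storage, and a named group of instructions can be re-used wherever it is needed. The names prices, basket, and total below are invented for this sketch.

```python
def total(values):
    """A named group of instructions that can be re-used by any caller."""
    return sum(values)


prices = [3, 4]   # a name that designates storage indirectly
basket = prices   # a second reference to the same storage
basket.append(5)  # a change made through one name is visible through the other

print(prices, total(prices), total([10, 20]))
```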


Design Philosophies

For the above-mentioned purposes, each language has been developed using a special design or philosophy. Some aspect or another is particularly stressed by the way the language uses data structures, or by the way its special notation encourages certain ways of solving problems or expressing their structure.


Since programming languages are artificial languages, they require a high degree of discipline to accurately specify which operations are desired. Programming languages are not error tolerant; however, the burden of recognising and using the special vocabulary is reduced by help messages generated by the programming language implementation.

There are a few languages which offer a high degree of freedom in allowing self-modification, in which a program re-writes parts of itself to handle new cases. Typically, only machine language, members of the LISP family (Common Lisp, Scheme), and MUMPS provide this capability. Languages that support dynamic linking and loading, such as C, C++, and the Java programming language, can fake self-modification by either embedding a small compiler or calling a full compiler and linking in the resulting object code. Interpreting code by recompiling it in real time is called dynamic recompilation; emulators and other virtual machines exploit this technique for greater performance.
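A program that writes new code for itself at run-time can be sketched in Python using the built-in compile and exec functions. This is run-time code generation rather than self-modification of machine code, and the function name build_power_function is invented for this example.

```python
def build_power_function(exponent):
    """Write source text for a new function, translate it, and link it in."""
    source = f"def power(base):\n    return base ** {exponent}\n"
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["power"]


cube = build_power_function(3)  # this function did not exist in the program text
print(cube(2))
```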


There are a variety of ways to classify programming languages. The distinctions are not clear-cut; a particular language standard may have implementations that fall under more than one classification. For example, a language may have both compiled and interpreted implementations.



To Do: this is just an outline to get started; add some descriptive text (or put in '/' links) and add a few representative languages to the descriptions


  • sequence of execution
    • procedural, sequential, linear
    • event-driven
    • pseudo-random
  • method of execution
    • interpreted
    • compiled
    • hybrid
  • main programming paradigm
  • primary method of use
    • script, shell, command
    • application
    • systems programming
  • abstraction level
    • visual programming and integrated development environments
    • 1GL (First Generation Languages)
    • 2GL (Second Generation Languages)
    • 3GL (Third Generation Languages)
    • 4GL (Fourth Generation Languages)
    • database UI development kits
    • high-level (declarative, objective, procedural)
    • machine
  • other
    • pathological
    • specialty



Links to specific languages



  • Algol family
    • C family


APL -- awk


BeFunge -- BLISS -- Blue -- Brainfuck


COBOL -- CORAL66 -- CPL


Dylan


ECMAScript -- Erlang -- Euphoria


Forth -- FORTRAN


GENIE -- Godiva


Haskell


Icon -- INTERCAL


Kvikkalkul


Limbo -- LOGO -- Lua


m4 -- Miranda -- Mercury -- Mesa -- ML -- Modula -- MOO -- MUMPS -- Mary


Nial


Oberon -- Occam


Perl -- PHP -- PL/I -- Poplog -- PostScript -- Prolog -- Python


REBOL -- REXX -- RPG -- Ruby


sed -- SETL -- Simula -- Smalltalk -- SNOBOL -- SPITBOL -- SQL


Tcl -- teco -- tpu -- Trac -- Turing


Unicon -- UnLambda


VarAq



Timeline of the history of programming languages.




Someone has written a very long, contentful article on programming languages for the German Misplaced Pages: Programmiersprache.


Part of that document has been processed through Automatic Translation Software, and incorporated into this document. Other parts of it are below, partially processed.




Current developments in new programming languages


Newer integrated, visual development environments have brought clear progress, reducing the time, money, and frustration that programming costs. Regions of the screen that control the program can often be arranged interactively. Code fragments can be invoked just by clicking on a control. The work is also eased by prefabricated components and software libraries with re-usable code. Object-oriented methodology can substantially reduce the complexity of programs. These techniques mark the transition from a craft to an industrial process.


Specialized classes of programming languages


Machine language: The code is directly executable on a processor. Its scope is architecture-dependent. It is typically formulated as numbers expressed in octal or hexadecimal. Each group of numbers is associated with particular fundamental operations of the hardware. The activation of specific wires and logic controls the computation of the computer.


Assemblers: Assemblers are almost always directly tied to a machine language. An assembly language allows machine instructions to be written in a form readable by humans. It also allows a program to use symbolic addresses, which the assembler converts to absolute addresses. Most assemblers also allow for macros and symbolic constants.
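The conversion of symbolic addresses to absolute ones can be sketched as a two-pass assembler in Python. The instruction set, the opcode numbers, and the sample program are all invented for this example, and each instruction is assumed to occupy exactly one address.

```python
# Hypothetical opcodes; they do not correspond to any real processor.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "JMP": 0x03, "HALT": 0xFF}


def assemble(lines):
    """Translate mnemonics and labels into (opcode, operand) number pairs."""
    labels = {}
    instructions = []
    # Pass 1: record the address at which each label is defined.
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if text.endswith(":"):
            labels[text[:-1]] = len(instructions)
        else:
            instructions.append(text.split())
    # Pass 2: replace mnemonics by opcodes and labels by absolute addresses.
    program = []
    for mnemonic, *operands in instructions:
        operand = operands[0] if operands else "0"
        value = labels[operand] if operand in labels else int(operand, 0)
        program.append((OPCODES[mnemonic], value))
    return program


PROGRAM = ["start:", "LOAD 7", "ADD 0x10", "JMP start", "HALT"]
print(assemble(PROGRAM))
```

The label start never appears in the output; only the address it stands for does.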


Data-structured languages: LISP uses lists as its organizing principle. Even programs themselves are formulated as a list of instructions, which change other lists. FORTH and Poplog are conceptually based on an open stack model and use stack operations as fundamental building blocks.
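The open stack model can be illustrated with a very small evaluator written in Python. It is only loosely modelled on FORTH: the supported words (+, -, *, dup) and the function name evaluate are choices made for this sketch, not a description of any real FORTH system.

```python
def evaluate(program):
    """Run a whitespace-separated program against an explicit stack."""
    stack = []
    operations = {
        "+": lambda left, right: left + right,
        "-": lambda left, right: left - right,
        "*": lambda left, right: left * right,
    }
    for word in program.split():
        if word in operations:
            right = stack.pop()
            left = stack.pop()
            stack.append(operations[word](left, right))
        elif word == "dup":
            stack.append(stack[-1])  # duplicate the top of the stack
        else:
            stack.append(int(word))  # anything else is a number to push
    return stack


print(evaluate("2 3 + dup *"))
```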


Logical languages: Prolog formulates data and the program evaluation mechanism as a special form of mathematical logic known as Horn logic and a general proving mechanism called logical resolution.


Procedural languages: Ada, BASIC, C, COBOL, FORTRAN, Pascal, and PL/I represent the procedural family, in which the computer performs imperative statements consecutively.


Object-oriented languages: Smalltalk, Eiffel, Modula-3, C++, Java, Sather, and Oberon(?) are object-oriented languages. The data structures are defined in object classes, which also include code (methods). Thus the effects of a change to the code remain very localized. Object classes can be extended by inheritance.
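Extension by inheritance might look as follows in Python. The classes Shape and Rectangle are hypothetical; the point of the sketch is that the subclass supplies only what differs and re-uses the rest.

```python
class Shape:
    """Base class: data plus the methods that operate on it."""

    def __init__(self, name):
        self.name = name

    def area(self):
        raise NotImplementedError("subclasses supply their own area")

    def describe(self):
        return f"{self.name} with area {self.area()}"


class Rectangle(Shape):
    """Extends Shape by inheritance; describe is inherited unchanged."""

    def __init__(self, width, height):
        super().__init__("rectangle")
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


print(Rectangle(2, 3).describe())
```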


Functional languages: APL, LISP, ML, OCaml, Scheme, and Haskell define programs and subroutines as mathematical functions. Many so-called functional languages are "impure" and also contain imperative features.


Rule-based languages: Rule-based languages such as OPS-5, Prolog, CLIPS, and Jess instantiate rules when activated by conditions in a set of data. Of all possible activations, some set will be selected and the statements belonging to those rules will be executed.
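A toy version of this activation cycle can be written in Python. The rules and facts are invented, and the sketch simplifies the selection step: it fires every activated rule instead of choosing among the activations as a real rule engine would.

```python
# Each hypothetical rule is (set of conditions, conclusion to add).
RULES = [
    ({"has feathers"}, "is a bird"),
    ({"is a bird", "cannot fly"}, "is a penguin"),
]


def forward_chain(facts, rules):
    """Fire rules whose conditions hold until no new fact is produced."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule is activated when all of its conditions are in the data.
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


print(sorted(forward_chain({"has feathers", "cannot fly"}, RULES)))
```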


Application-oriented languages and systems: Database systems such as dBase and SQL provide powerful ways of searching and manipulating mathematical relations that have been described as tables, mapping one set of things into other sets.
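Searching a relation described as a table can be tried out with Python's standard sqlite3 module, which embeds an SQL database. The table accounts, its columns, its rows, and the function owners_above are invented for this sketch.

```python
import sqlite3

# An in-memory database holding one hypothetical table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE accounts (owner TEXT, balance REAL)")
connection.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("Ada", 120.0), ("Alan", 40.0), ("Grace", 310.0)],
)


def owners_above(limit):
    """Return the owners whose balance exceeds the limit, in name order."""
    rows = connection.execute(
        "SELECT owner FROM accounts WHERE balance > ? ORDER BY owner",
        (limit,),
    ).fetchall()
    return [owner for (owner,) in rows]


print(owners_above(100))
```

The query states which rows are wanted, and the database system decides how to find them.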