Strong and weak typing: Difference between revisions

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Revision as of 15:21, 17 June 2015 by Qwertyus (edit summary: Implicit type conversions and "type punning": I don't see what syntax has to do with this at all)
Latest revision as of 19:54, 24 December 2024 by Beland (minor edit; edit summary: convert special characters found by Misplaced Pages:Typo Team/moss (via WP:JWB)
(189 intermediate revisions by more than 100 users not shown)
{{Short description|Programming language type systems}}
{{original research|date=June 2015}}
{{Type systems}}
{{refimprove|date=June 2015}}
In computer programming, one of the many ways that programming languages are colloquially classified is whether the language's type system makes it '''strongly typed''' or '''weakly typed''' ('''loosely typed'''). However, there is no precise technical definition of what the terms mean and different authors disagree about the implied meaning of the terms and the relative rankings of the "strength" of the type systems of mainstream programming languages.<ref>{{Cite web |title=What to know before debating type systems {{!}} Ovid |url=https://blogs.perl.org/users/ovid/2010/08/what-to-know-before-debating-type-systems.html |access-date=2023-06-27 |website=blogs.perl.org}}</ref> For this reason, writers who wish to write unambiguously about type systems often eschew the terms "strong typing" and "weak typing" in favor of specific expressions such as "type safety".
In ], programming languages are often colloquially referred to as '''strongly typed''' or '''weakly typed'''. These terms do not have a precise definition, but in general a strongly typed language is more likely to generate an error or refuse to compile if the argument passed to a function does not closely match the expected type. On the other hand, a very weakly typed language may produce unpredictable results or may perform implicit type conversion.<ref>http://www.cs.cornell.edu/courses/CS1130/2012sp/1130selfpaced/module1/module1part4/strongtyping.html</ref>

Generally, a strongly typed language has stricter typing rules at compile time, which implies that errors and exceptions are more likely to happen during compilation. Most of these rules affect variable assignment, function return values, procedure arguments and function calling. Dynamically typed languages (where type checking happens at run time) can also be strongly typed. In dynamically typed languages, values, rather than variables, have types.

A weakly typed language has looser typing rules and may produce unpredictable or even erroneous results or may perform implicit type conversion at runtime.<ref>{{cite web | url = http://www.cs.cornell.edu/ | title = CS1130. Transition to OO programming. – Spring 2012 --self-paced version | date = 2005 | publisher = Cornell University, Department of Computer Science | archive-url = https://web.archive.org/web/20151123211922/http://www.cs.cornell.edu/ | archive-date = 2015-11-23 | access-date = 2015-11-23 | url-status = bot: unknown }}</ref> A different but related concept is latent typing.


== History ==
In 1974, ] and Zilles described a strongly-typed language as one in which "whenever an object is passed from a calling function to a called function, its type must be compatible with the type declared in the called function."<ref>{{cite paper | id = {{citeseerx|10.1.1.136.3043}} | title = Programming with abstract data types | first1 = B | last1 = Liskov | first2 = S | last2 = Zilles | journal = ACM Sigplan Notices | year = 1974 }}</ref> In 1974, ] and Stephen Zilles defined a strongly-typed language as one in which "whenever an ] is passed from a calling ] to a called function, its type must be compatible with the type declared in the called function."<ref>{{cite journal | citeseerx = 10.1.1.136.3043 | title = Programming with abstract data types | first1 = B | last1 = Liskov | first2 = S | last2 = Zilles | journal = ACM SIGPLAN Notices | year = 1974 | doi = 10.1145/942572.807045 | volume=9 | issue = 4 | pages=50–59}}</ref>
In 1977, Jackson wrote, "In a strongly typed language each data area will have a distinct type and each process will state its communication requirements in terms of these types."<ref>{{cite journal | title = Parallel processing and modular software construction | first1 = K. | last1 = Jackson | journal = Lecture Notes in Computer Science | year = 1977 | volume = 54 | pages = 436–443 | doi = 10.1007/BFb0021435 | url = http://www.springerlink.com/content/wq02703237400667/ | series = Lecture Notes in Computer Science | isbn = 3-540-08360-X }}</ref> In 1977, K. Jackson wrote, "In a strongly typed language each data area will have a distinct type and each process will state its communication requirements in terms of these types."<ref>{{cite book |last1=Jackson |first1=K. |title=Design and Implementation of Programming Languages |chapter=Parallel processing and modular software construction |year=1977 |url=https://archive.org/details/designimplementa0054unse_u4m7 |series=Lecture Notes in Computer Science |volume=54 |pages=436–443 |doi=10.1007/BFb0021435 |isbn=3-540-08360-X |url-access=registration}}</ref>


== Definitions of "strong" or "weak" ==
A number of different language design decisions have been referred to as evidence of "strong" or "weak" typing. Many of these are more accurately understood as the presence or absence of type safety, memory safety, static type-checking, or dynamic type-checking.

"Strong typing" generally refers to use of programming language types in order to both capture invariants of the code, and ensure its correctness, and definitely exclude certain classes of programming errors. Thus there are many "strong typing" disciplines used to achieve these goals.


=== Implicit type conversions and "type punning" ===
Some programming languages make it easy to use a value of one type as if it were a value of another type. This is sometimes described as "weak typing".


For example, Aahz Maruch opines that "''] occurs when you have a ] language and you use the syntactic features of the language to force the usage of one type as if it were a different type (consider the common use of void* in C). Coercion is usually a symptom of weak typing. Conversion, on the other hand, creates a brand-new object of the appropriate type.''"<ref name="artima"></ref> For example, Aahz Maruch observes that "] occurs when you have a ] language and you use the syntactic features of the language to force the usage of one type as if it were a different type (consider the common use of void* in ]). Coercion is usually a symptom of weak typing. Conversion, on the other hand, creates a brand-new object of the appropriate type."<ref name="artima">{{cite web|url=http://www.artima.com/weblogs/viewpost.jsp?thread=7590|title=Typing: Strong vs. Weak, Static vs. Dynamic|author=Aahz|access-date=16 August 2015}}</ref>


As another example, GCC describes this as '']'' and warns that it will ''break strict ]''. ] discusses several problems that can arise when type-punning causes the ] to make inappropriate ]s.<ref></ref> As another example, ] describes this as ] and warns that it will break strict ]. Thiago Macieira discusses several problems that can arise when type-punning causes the ] to make inappropriate ]s.<ref>{{cite web|url=https://www.qt.io/blog/2011/06/10/type-punning-and-strict-aliasing/|title=Type-punning and strict-aliasing - Qt Blog|work=Qt Blog|access-date=18 February 2020}}</ref>


There are many examples of languages which allow ], but in a type-safe manner. For example, both C++ and C# allow programs to define operators to convert a value from one type to another in a semantically meaningful way. When a C++ compiler encounters such a conversion, it treats the operation just like a function call. In contrast, converting a value to the C type {{mono|void*}} is an unsafe operation which is invisible to the compiler. There are many examples of languages that allow ]s, but in a type-safe manner. For example, both ] and ] allow programs to define ] to convert a value from one type to another with well-defined semantics. When a C++ compiler encounters such a conversion, it treats the operation just like a function call. In contrast, converting a value to the C type {{mono|void*}} is an unsafe operation that is invisible to the compiler.


=== Pointers ===
Some programming languages expose pointers as if they were numeric values, and allow users to perform arithmetic on them. These languages are sometimes referred to as "weakly typed", since pointer arithmetic can be used to bypass the language's type system.


=== Untagged unions ===
Some programming languages support untagged unions, which allow a value of one type to be viewed as if it were a value of another type.

=== Dynamic type-checking ===
Some programming languages do not have static type-checking. In many such languages, it is easy to write programs which would be rejected by most static type-checkers. For example, a variable might store either a number or the Boolean value "false". Some programmers{{who|date=August 2014}} refer to these languages as "weakly typed", since they do not ''seem ''to enforce the "strong" type discipline found in a language with a static type-checker.


=== Static type-checking ===
In ]'s article ],<ref>ftp://gatekeeper.research.compaq.com/pub/DEC/SRC/research-reports/SRC-045.pdf page 3</ref> a "strong type system" is described as one in which there is no possibility of an unchecked runtime type error. In other writing, the absence of unchecked run-time errors is referred to as ''safety'' or ''type safety''; ]'s early papers call this property ''security''. In ]'s article ''Typeful Programming'',<ref></ref><!--ref>ftp://gatekeeper.research.compaq.com/pub/DEC/SRC/research-reports/SRC-045.pdf page 3</ref--> a "strong type system" is described as one in which there is no possibility of an unchecked runtime type error. In other writing, the absence of unchecked run-time errors is referred to as ''safety'' or ''type safety''; ]'s early papers call this property ''security''.<ref>Hoare, C. A. R. 1974. Hints on Programming Language Design. In ''Computer Systems Reliability'', ed. C. Bunyan. Vol. 20 pp. 505–534.</ref>

=== Predictability ===
If simple operations do not behave in a way that one would expect,{{vague|date=June 2015}} a programming language can be said to be "weakly typed". For example, consider the following program:

<source lang="csharp">
x = "5" + 6
</source>

Different languages will assign a different value to 'x':
* One language might convert 6 to a string, and concatenate the two arguments to produce the string "56" (e.g. ], ])
* Another language might convert "5" to a number, and add the two arguments to produce the number 11 (e.g. ], ])
* Yet another language might convert the string "5" to a pointer representing where the string is stored within memory, and add 6 to that value to produce an address in memory (e.g. ])
* In yet another language, the + operation might fail during execution, saying that the two operands have incompatible type (e.g. ], ])
* And in many compiled languages, the compiler would reject this program because the addition is ill-typed, without ever running the program (e.g. ], ])

Languages that work like the first three examples have all been called "weakly typed" at various times, even though only one of them (the third) represents a possible safety violation.

=== Type inference ===
Languages with static type systems differ in the extent to which users are required to manually state the types used in their program. Some languages, such as C, require that every variable be declared with a type. Other languages, such as ], use the ] method to infer all types based on a global analysis. Other languages, such as C# and C++, lie somewhere in between; some types can be inferred based on local information, while others must be specified. Some programmers use the term weakly typed to refer to languages with type inference, often without realizing that the type information is present but implicit.


== Variation across programming languages ==
{{Original research|section|date=May 2018}}
Note that some of these definitions are contradictory, others are merely orthogonal, and still others are special cases (with additional constraints) of other, more "liberal" (less strong) definitions. Because of the wide divergence among these definitions, it is possible to defend claims about most programming languages that they are either strongly or weakly typed. For instance:
{{More citations needed section|date=May 2020}}
* ], ], ] and ] require all ] to have a declared type, and support the use of explicit casts of arithmetic values to other arithmetic types. Java, C#, Ada and Pascal are sometimes said to be more strongly typed than C, a claim that is probably based on the fact that C supports more kinds of implicit conversions, and C also allows ] values to be explicitly cast while Java and Pascal do not. Java itself may be considered more strongly typed than Pascal as manners of evading the static type system in Java are controlled by the Java ] type system. C# is similar to Java in that respect, though it allows disabling dynamic type checking by explicitly putting code segments in an "unsafe context". Pascal's type system has been described as "too strong", because the size of an array or string is part of its type, making some programming tasks very difficult.<ref></ref><ref>]: ''Why Pascal is not my favourite language'']</ref>
Some of these definitions are contradictory, others are merely conceptually independent, and still others are special cases (with additional constraints) of other, more "liberal" (less strong) definitions. Because of the wide divergence among these definitions, it is possible to defend claims about most programming languages that they are either strongly or weakly typed. For instance:
* The object-oriented programming languages ], ], ], ], ], and ] are all "strongly typed" in the sense that typing errors are prevented at runtime and they do little implicit ], but these languages make no use of static type checking: the compiler does not check or enforce type constraint rules. The term ] is now used to describe the ] paradigm used by the languages in this group.
* The ] family of languages are all "strongly typed" in the sense that typing errors are prevented at runtime. Some Lisp dialects like ] or ] do support various forms of type declarations<ref></ref> and some compilers (]<ref></ref> and related) use these declarations together with ] to enable various optimizations and also limited forms of compile time type checks.
* ], ], ], ] and ] are statically type checked but the compiler automatically infers a precise type for all values. These languages (along with most ] languages) are considered to have stronger type systems than Java, as they permit no implicit type conversions. While OCaml's libraries allow one form of evasion (''Object magic''), this feature remains unused in most applications.
* ] is a hybrid language. In addition to variables with declared types, it is also possible to declare a variable of "Variant" data type that can store data of any type. Its implicit casts are fairly liberal where, for example, one can sum string variants and pass the result into an integer variable.
* ] and ] have been said to be ''untyped''. There is no type checking; it is up to the programmer to ensure that data given to functions is of the appropriate type. Any type conversion required is explicit.
<!-- Please do not grow this list without bound. These examples are intended to illustrate the ambiguity of the term -- NOT to advocate a particular language. Unless a language shows a DIFFERENT sort of ambiguity, please don't add it. For instance, Python is much the same as Lisp. -->


* ], ], ], and ] require ] to have a declared type, and support the use of explicit casts of arithmetic values to other arithmetic types. Java, C#, Ada, and Pascal are sometimes said to be more strongly typed than C, because C supports more kinds of implicit conversions, and allows ] values to be explicitly cast while Java and Pascal do not. Java may be considered more strongly typed than Pascal as methods of evading the static type system in Java are controlled by the ]'s type system. C# and VB.NET are similar to Java in that respect, though they allow disabling of dynamic type checking by explicitly putting code segments in an "unsafe context". Pascal's type system has been described as "too strong", because the size of an array or string is part of its type, making some programming tasks very difficult. However, ] fixes this issue.<ref>{{cite book|url=https://books.google.com/books?id=7i8EAAAAMBAJ&q=pascal+type+system+%22too+strong%22&pg=PA66|title=InfoWorld|access-date=16 August 2015|date=1983-04-25}}</ref><ref>{{Cite web |url=http://www.cs.virginia.edu/~cs655/readings/bwk-on-pascal.html |first=Brian |last=Kernighan |author-link=Brian Kernighan |title=Why Pascal is not my favorite programming language |year=1981 |access-date=2011-10-22 |archive-url=https://web.archive.org/web/20120406094058/http://www.cs.virginia.edu/~cs655/readings/bwk-on-pascal.html |archive-date=2012-04-06 |url-status=dead}}</ref>
For this reason, writers who wish to write unambiguously about type systems often eschew the term "strong typing" in favor of specific expressions such as "]".
* ], ], ], and ] are all "strongly typed" in the sense that typing errors are prevented at runtime and they do little implicit ], but these languages make no use of static type checking: the compiler does not check or enforce type constraint rules. The term ] is now used to describe the ] paradigm used by the languages in this group.
* The ] family of languages are all "strongly typed" in the sense that typing errors are prevented at runtime. Some Lisp dialects like ] or ] do support various forms of type declarations<ref>{{cite web|url=http://www.lispworks.com/documentation/HyperSpec/Body/04_.htm|title=CLHS: Chapter 4|access-date=16 August 2015}}</ref> and some compilers (] (CMUCL)<ref>{{cite web|url=http://common-lisp.net/project/cmucl/doc/cmu-user/compiler.html#toc123|title=CMUCL User's Manual: The Compiler|access-date=16 August 2015|archive-url=https://web.archive.org/web/20160308055914/https://common-lisp.net/project/cmucl/doc/cmu-user/compiler.html#toc123|archive-date=8 March 2016|url-status=dead}}</ref> and related) use these declarations together with ] to enable various optimizations and limited forms of compile time type checks.
* ], ], ], ], ] and ] are statically type-checked, but the compiler automatically infers a precise type for most values.
* ] and ] can be characterized as ''untyped''. There is no type checking; it is up to the programmer to ensure that data given to functions is of the appropriate type.
<!-- Please do not grow this list without bound. These examples are intended to illustrate the ambiguity of the term, NOT to advocate a given language. Unless a language shows a DIFFERENT sort of ambiguity, please don't add it. For instance, Python is much the same as Lisp. -->


== See also ==
* ]
* ] includes a more thorough discussion of typing issues * ] includes a more thorough discussion of typing issues
* ] (strong typing as implicit contract form)
* ]
* ]
* ]
* ]
* ] * ]
* ] * ]
* ]


== References ==
{{reflist|30em}}


{{DEFAULTSORT:Strong and Weak Typing}}

Latest revision as of 19:54, 24 December 2024

Programming language type systems

In computer programming, one of the many ways that programming languages are colloquially classified is whether the language's type system makes it strongly typed or weakly typed (loosely typed). However, there is no precise technical definition of what the terms mean and different authors disagree about the implied meaning of the terms and the relative rankings of the "strength" of the type systems of mainstream programming languages. For this reason, writers who wish to write unambiguously about type systems often eschew the terms "strong typing" and "weak typing" in favor of specific expressions such as "type safety".

Generally, a strongly typed language has stricter typing rules at compile time, so type errors are more likely to be caught during compilation. Most of these rules govern variable assignment, function return values, procedure arguments, and function calls. Dynamically typed languages (where type checking happens at run time) can also be strongly typed; in such languages, values, rather than variables, have types.
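
As a small illustration of the last point, the following sketch uses Python, chosen here only as a familiar dynamically typed language. The names `item` and `mixed` are hypothetical. It suggests how the type travels with each value, while the variable itself carries no type:

```python
# A name can be rebound to values of different types;
# each value keeps its own type.
item = 5
first_type = type(item)     # the value 5 is an int

item = "five"
second_type = type(item)    # the same name now refers to a str value

# A single container can hold values of several types for the same reason.
mixed = [1, "one", 1.0]
kinds = [type(value).__name__ for value in mixed]

print(first_type.__name__, second_type.__name__, kinds)
```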

A weakly typed language has looser typing rules and may produce unpredictable or even erroneous results, or may perform implicit type conversion at runtime. A different but related concept is latent typing.

History

In 1974, Barbara Liskov and Stephen Zilles defined a strongly-typed language as one in which "whenever an object is passed from a calling function to a called function, its type must be compatible with the type declared in the called function." In 1977, K. Jackson wrote, "In a strongly typed language each data area will have a distinct type and each process will state its communication requirements in terms of these types."

Definitions of "strong" or "weak"

A number of different language design decisions have been referred to as evidence of "strong" or "weak" typing. Many of these are more accurately understood as the presence or absence of type safety, memory safety, static type-checking, or dynamic type-checking.

"Strong typing" generally refers to the use of programming language types to capture invariants of the code, ensure its correctness, and definitively exclude certain classes of programming errors. Accordingly, there are many "strong typing" disciplines used to achieve these goals.

Implicit type conversions and "type punning"

Some programming languages make it easy to use a value of one type as if it were a value of another type. This is sometimes described as "weak typing".

For example, Aahz Maruch observes that "Coercion occurs when you have a statically typed language and you use the syntactic features of the language to force the usage of one type as if it were a different type (consider the common use of void* in C). Coercion is usually a symptom of weak typing. Conversion, on the other hand, creates a brand-new object of the appropriate type."

As another example, GCC describes the practice of treating a value as if it had a different type as type-punning and warns that it will break strict aliasing. Thiago Macieira discusses several problems that can arise when type-punning causes the compiler to make inappropriate optimizations.
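
The difference between conversion and reinterpretation can be sketched in Python. The standard `struct` module is used here only to emulate what a C cast through a pointer would do. Unlike in C, nothing in this sketch is undefined behaviour, so it shows the idea rather than the hazard. The variable names are hypothetical:

```python
import struct

value = 1.0

# Conversion: int() builds a brand-new integer object from the float's
# numeric value.
converted = int(value)

# Reinterpretation (type punning): pack the value as a 32-bit IEEE 754 float,
# then read the very same four bytes back as an unsigned 32-bit integer.
punned = struct.unpack("<I", struct.pack("<f", value))[0]

print(converted)      # 1
print(hex(punned))    # 0x3f800000
```

The converted result preserves the numeric meaning of the value, whereas the punned result merely exposes its bit pattern.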

There are many examples of languages that allow implicit type conversions, but in a type-safe manner. For example, both C++ and C# allow programs to define operators to convert a value from one type to another with well-defined semantics. When a C++ compiler encounters such a conversion, it treats the operation just like a function call. In contrast, converting a value to the C type void* is an unsafe operation that is invisible to the compiler.
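
Python offers only a few hooks of this kind, but its `__index__` method gives a loosely analogous sketch of an implicit conversion that is really an ordinary method call. The `SlotNumber` class and the `calls` list are hypothetical names introduced for illustration:

```python
calls = []

class SlotNumber:
    """Hypothetical wrapper usable wherever an integer index is expected."""

    def __init__(self, number):
        self.number = number

    def __index__(self):
        # Invoked by the interpreter whenever an integer index is required;
        # recording the call makes the hidden method invocation visible.
        calls.append(self.number)
        return self.number

names = ["ada", "grace", "barbara"]
picked = names[SlotNumber(2)]   # implicit conversion through __index__

print(picked, calls)
```

Because the conversion runs user-written code with well-defined semantics, the interpreter never treats the wrapper's raw memory as an integer.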

Pointers

Some programming languages expose pointers as if they were numeric values, and allow users to perform arithmetic on them. These languages are sometimes referred to as "weakly typed", since pointer arithmetic can be used to bypass the language's type system.
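
Python does not expose pointers in the language itself, but its standard `ctypes` foreign-function library does, which allows a minimal sketch of the idea. The variable names are hypothetical, and the printed byte order depends on the machine:

```python
import ctypes
import sys

word = ctypes.c_uint32(0x01020304)

# Take a typed pointer to the 32-bit value, then cast it to a pointer to bytes.
word_ptr = ctypes.pointer(word)
byte_ptr = ctypes.cast(word_ptr, ctypes.POINTER(ctypes.c_uint8))

# Indexing the byte pointer performs pointer arithmetic: byte_ptr[i] reads
# the byte stored i positions past the start of the integer.
raw = bytes(byte_ptr[i] for i in range(4))

# The order of the bytes depends on the machine's endianness.
print(raw.hex(), sys.byteorder)
```

Nothing stops the index from running past the fourth byte, which is the sense in which pointer arithmetic sidesteps the type system.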

Untagged unions

Some programming languages support untagged unions, which allow a value of one type to be viewed as if it were a value of another type.
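
A C-style untagged union can be sketched with `ctypes.Union` from the Python standard library. The `FloatOrInt` class and its field names are hypothetical, and the sketch assumes that `c_float` is a 32-bit IEEE 754 value, as it is on common platforms:

```python
import ctypes

class FloatOrInt(ctypes.Union):
    """Hypothetical untagged union: both fields share the same four bytes."""
    _fields_ = [
        ("as_float", ctypes.c_float),
        ("as_int", ctypes.c_uint32),
    ]

cell = FloatOrInt()
cell.as_float = -2.0    # store a float ...
bits = cell.as_int      # ... and read the same storage back as an integer

print(hex(bits))        # 0xc0000000 under the assumption stated above
```

The union records nothing about which field was written last, so the program alone decides how the stored bytes are interpreted.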

Static type-checking

In Luca Cardelli's article Typeful Programming, a "strong type system" is described as one in which there is no possibility of an unchecked runtime type error. In other writing, the absence of unchecked run-time errors is referred to as safety or type safety; Tony Hoare's early papers call this property security.

Variation across programming languages

Some of these definitions are contradictory, others are merely conceptually independent, and still others are special cases (with additional constraints) of other, more "liberal" (less strong) definitions. Because of the wide divergence among these definitions, it is possible to defend claims about most programming languages that they are either strongly or weakly typed. For instance:

  • Java, Pascal, Ada, and C require variables to have a declared type, and support the use of explicit casts of arithmetic values to other arithmetic types. Java, C#, Ada, and Pascal are sometimes said to be more strongly typed than C, because C supports more kinds of implicit conversions, and allows pointer values to be explicitly cast while Java and Pascal do not. Java may be considered more strongly typed than Pascal as methods of evading the static type system in Java are controlled by the Java virtual machine's type system. C# and VB.NET are similar to Java in that respect, though they allow disabling of dynamic type checking by explicitly putting code segments in an "unsafe context". Pascal's type system has been described as "too strong", because the size of an array or string is part of its type, making some programming tasks very difficult. However, Delphi fixes this issue.
  • Smalltalk, Ruby, Python, and Self are all "strongly typed" in the sense that typing errors are prevented at runtime and they do little implicit type conversion, but these languages make no use of static type checking: the compiler does not check or enforce type constraint rules. The term duck typing is now used to describe the dynamic typing paradigm used by the languages in this group.
  • The languages of the Lisp family are all "strongly typed" in the sense that typing errors are prevented at runtime. Some Lisp dialects like Common Lisp or Clojure do support various forms of type declarations, and some compilers (CMU Common Lisp (CMUCL) and related) use these declarations together with type inference to enable various optimizations and limited forms of compile-time type checks.
  • Standard ML, F#, OCaml, Haskell, Go and Rust are statically type-checked, but the compiler automatically infers a precise type for most values.
  • Assembly language and Forth can be characterized as untyped. There is no type checking; it is up to the programmer to ensure that data given to functions is of the appropriate type.
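
The sense in which the dynamically typed languages in the second bullet are "strongly typed" can be sketched in Python. Mixing a string and a number is rejected at run time, and the programmer must state the intended conversion. The variable names are hypothetical:

```python
text, number = "5", 6

# Python refuses to guess: adding a str and an int raises TypeError at run time.
try:
    result = text + number
except TypeError:
    result = None

# The programmer must choose the intended meaning with an explicit conversion.
as_string = text + str(number)   # concatenation
as_number = int(text) + number   # arithmetic

print(result, as_string, as_number)   # None 56 11
```

A language that silently chose either of the two conversions would more often be described as weakly typed.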

References

  1. "What to know before debating type systems | Ovid". blogs.perl.org. Retrieved 2023-06-27.
  2. "CS1130. Transition to OO programming. – Spring 2012 --self-paced version". Cornell University, Department of Computer Science. 2005. Archived from the original on 2015-11-23. Retrieved 2015-11-23.
  3. Liskov, B; Zilles, S (1974). "Programming with abstract data types". ACM SIGPLAN Notices. 9 (4): 50–59. CiteSeerX 10.1.1.136.3043. doi:10.1145/942572.807045.
  4. Jackson, K. (1977). "Parallel processing and modular software construction". Design and Implementation of Programming Languages. Lecture Notes in Computer Science. Vol. 54. pp. 436–443. doi:10.1007/BFb0021435. ISBN 3-540-08360-X.
  5. Aahz. "Typing: Strong vs. Weak, Static vs. Dynamic". Retrieved 16 August 2015.
  6. "Type-punning and strict-aliasing - Qt Blog". Qt Blog. Retrieved 18 February 2020.
  7. Luca Cardelli, "Typeful programming"
  8. Hoare, C. A. R. 1974. Hints on Programming Language Design. In Computer Systems Reliability, ed. C. Bunyan. Vol. 20 pp. 505–534.
  9. InfoWorld. 1983-04-25. Retrieved 16 August 2015.
  10. Kernighan, Brian (1981). "Why Pascal is not my favorite programming language". Archived from the original on 2012-04-06. Retrieved 2011-10-22.
  11. "CLHS: Chapter 4". Retrieved 16 August 2015.
  12. "CMUCL User's Manual: The Compiler". Archived from the original on 8 March 2016. Retrieved 16 August 2015.