Revision as of 08:52, 11 March 2004 editDcoetzee (talk | contribs)37,529 edits Added tradeoffs and part about saving memory← Previous edit | Revision as of 20:43, 11 March 2004 edit undoDcoetzee (talk | contribs)37,529 editsm CDR coding also applies to doubly linked listsNext edit → | ||
Line 15: | Line 15: | ||
* Storing the data itself inside the nodes, rather than a reference to it | * Storing the data itself inside the nodes, rather than a reference to it | ||
* Using a ] for the nodes | * Using a ] for the nodes | ||
* |
* ] | ||
== Linked List Operations == | == Linked List Operations == |
Revision as of 20:43, 11 March 2004
In computer science, a linked list is one of the simplest data structures used in computer programming. It consists of a number of nodes, each containing one element and one (or more) links to other nodes. Linked lists permit insertion and removal of nodes at any point in the list in O(1) time. Linked lists do not allow random access.
The simplest kind of linked list is a singly linked list, which has one link per node. This link points to the next node in the list, or to a null value if it is the last node. This kind of list allows sequential access to elements only in one direction (from front to back).
A singly linked list containing three integer values
A more sophisticated kind of linked list is a doubly linked list. Each node has two links, one to the previous node and one to the next. This allows sequential access to the list in both directions.
An example of a doubly linked list.
There are two common variations to the types mentioned above. First is a circularly linked list, where the first and last nodes are linked together. This can be done for both singly and doubly linked lists. To traverse a circular linked list, you begin at any node and follow the list in either direction until you return to the original node. The other variation is the addition of a sentinel node at the beginning of the list. This node represents before the first node and/or after the last node of the list. It also makes it possible to have an empty list.
One strong disadvantage of linked lists over compact data structures like arrays is that they must reserve storage for each data node. This problem is exacerbated by naive approaches to allocation of list nodes, which may store additional metadata with each node. The problem can be reduced by any combination of the following methods:
- Using singly-linked lists or xor linked lists wherever possible
- Storing the data itself inside the nodes, rather than a reference to it
- Using a memory pool for the nodes
- CDR coding
Linked List Operations
When manipulating linked lists in-place, care must be taken to not use values that you have invalidated in previous assignments. This makes algorithms for inserting or deleting linked list nodes somewhat subtle. This section gives pseudocode for adding or removing nodes from singly, doubly, and circularly linked lists in-place. Throughout we will use null to refer to an end-of-list marker or sentinel, which may be implemented in a number of ways.
Singly-linked lists
Our node data structure will have two fields:
- data -- The data being stored in the node
- next -- A reference to the next node
We also keep a variable head which always points to the first node in the list. Traversal of a singly-linked list is easy:
node := head while node not null <do something with node.value> node := node.next
The following code inserts a node after an existing node in a singly linked list. Inserting a node before an existing one cannot be done; instead, you have to locate it while keeping track of the previous node.
insertAfter(node, newNode) newNode.next := node.next node.next := newNode
Inserting at the beginning of the list requires a separate function. This requires updating head.
insertBeginning(newNode) newNode.next := head head := newNode
Similarly, we have functions for removing the node after a given node, and for removing a node from the beginning of the list. To find and remove a particular node, one must again keep track of the previous element.
removeAfter(node) nodeBeingRemoved := node.next node.next := node.next.next destroy nodeBeingRemoved removeBeginning(node) nodeBeingRemoved := head head := head.next destroy nodeBeingRemoved
Doubly-linked lists
With doubly-linked lists there are even more pointers to update, but also less information is needed, since we can use backwards pointers to observe preceding elements in the list. This enables new operations, and eliminates special-case functions. We will add a prev field to our nodes, pointing to the previous element, and a tail variable which always points to the last node in the list. No null values are strictly needed for doubly linked lists with head and tail, since comparison with head and tail can be used instead, but by placing them before the first node and after the last we help maintain locality of reference.
Iterating through a doubly linked list can be done in either direction. In fact, direction can change many times, if desired.
Forwards
node := head while node not null <do something with node.value> node := node.next
Backwards
node := tail while node not null <do something with node.value> node := node.prev
These symmetric functions add a node either after or before a given node:
insertAfter(node,newNode) newNode.prev := node newNode.next := node.next if node.next is null tail := newNode else node.next.prev := newNode node.next := newNode insertBefore(node,newNode) newNode.prev := node.prev newNode.next := node if node.prev is null head := newNode else node.prev.next := newNode node.prev := newNode
Removing a node is easier, only requiring care with head and tail:
remove(node) if node.prev is null head = node.next else node.prev.next = node.next if node.next is null tail := node.prev else node.next.prev = node.prev destroy node
One subtle consequence of this procedure is that deleting the last element of a list sets both head and tail to null.
A curious way of implementing doubly linked lists that uses half as much space for pointers is the xor linked list.
Circularly-linked lists
Circularly-linked lists can be either singly or doubly linked. For both, their circular versions, which link all the nodes into a continuous circle without using null, benefit from the ability to traverse the full list beginning at any given node. This often allows us to avoid storing head and tail, although if the list may be empty we need a special representation for the empty list, such as a head variable which points to some node in the list or is null if it's empty; we use such a head here.
This representation significantly simplifies adding and removing nodes into a non-empty list, but empty lists are a special case. Assuming that someNode is some node in the list, this code iterates through that list starting with someNode:
Forwards
node := someNode do <do something with node.value> node := node.next while node not someNode
Notice the postponing of the test to the end of the loop. This is important for the case where the list contains only the single node someNode.
This simple function inserts a node into a doubly-linked circularly linked list after a given element:
insertAfter(node, newNode) newNode.next := node.next newNode.prev := node node.next.prev := newNode node.next := newNode
Inserting an element in a possibly empty list requires a special function:
insertBeginning(node) if head is null node.prev := node node.next := node else insertAfter(head.prev, node) head := node
Finally, removing a node must deal with the case where the list empties:
remove(node) if node.next is node head := null else node.next.prev := node.prev node.prev.next := node.next destroy node
Tradeoffs
Different types of linked lists have different trade-offs, as is evident from the implementation of their operations above. Singly linked lists are the simplest, especially for simple iteration, are persistent data structuress as explained below, and use less space for pointers than doubly linked lists (xor linked lists aside). Doubly linked lists require more space and more time to manipulate, but offer more freedom of traversal and make removal easier than for singly linked lists. Circular linked lists are most useful for describing naturally circular structures, and have the advantage of regular structure and being able to traverse the list starting at any point; their main disadvantage is the complexity of iteration, which has subtle special cases.
Singly linked lists have one strong advantage over doubly-linked lists: each tail of the list, meaning all the nodes following a chosen one, themselves form a singly linked list. Thus this type of list, sometimes also called a cons-based list, can be seen as a recursive datatype. This enables, for example, two different lists to share a common tail, which in turn enables one to add to the front of the list while keeping a reference to both the new and the old versions, making this type of list a persistent data structure.
Linked lists also have trade-offs with other data structures. The array allows random access, is more compact (stores no pointers), and has better locality of reference, but both insertion and removal are expensive operations, particularly in the middle of the array. Arrays are also typically limited to a fixed size or grow in large jumps, while linked lists grow incrementally as nodes are added.
Sometimes, linked lists are used to implement associative arrays, and are in this context called association lists. There is very little good to be said about this use of linked lists; they are easily outperformed by other data structures such as self-balancing binary search trees even on small data sets (see the discussion in associative array). However, sometimes a linked list is dynamically created out of a subset of nodes in such a tree, and used to more efficiently traverse that set.
Language Support
Many programming languages, such as LISP and Scheme have singly linked lists built in. In other languages it is simple to create the data structure using references. For example, one might create a doubly linked list in C like this:
typedef struct Person { struct Person *prev; // previous person in list struct Person *next; // next person in list char name; int age; };
An example of code that uses this structure:
Person first; // first item in the linked list Person *p = &first;
p->prev = NULL; p->next = NULL;
while(!feof(stdin)) // there's still input { p->next = (Person *)malloc(sizeof(Person)); // first link p->next->prev = p; p = p->next; p->next = NULL; printf("Enter name and age:\n"); // Ask for input scanf("%s %d", &(p->name), &(p->age)); // Use it }
Linked Lists On The Web
Some linked list materials are available from the Stanford Computer Science department: