Linked list: Difference between revisions

Browse history interactively ← Previous edit Next edit →Content deleted Content addedVisual WikitextInline

Revision as of 08:52, 11 March 2004 editDcoetzee (talk \| contribs)37,529 edits Added tradeoffs and part about saving memory← Previous edit		Revision as of 20:43, 11 March 2004 edit undoDcoetzee (talk \| contribs)37,529 editsm CDR coding also applies to doubly linked listsNext edit →
Line 15:		Line 15:
	* Storing the data itself inside the nodes, rather than a reference to it		* Storing the data itself inside the nodes, rather than a reference to it
	* Using a ] for the nodes		* Using a ] for the nodes
	* ~~For singly-linked lists,~~ ]		* ]

	== Linked List Operations ==		== Linked List Operations ==

Revision as of 20:43, 11 March 2004

In computer science, a linked list is one of the simplest data structures used in computer programming. It consists of a number of nodes, each containing one element and one (or more) links to other nodes. Linked lists permit insertion and removal of nodes at any point in the list in O(1) time. Linked lists do not allow random access.

The simplest kind of linked list is a singly linked list, which has one link per node. This link points to the next node in the list, or to a null value if it is the last node. This kind of list allows sequential access to elements only in one direction (from front to back).

A singly linked list containing three integer values

A more sophisticated kind of linked list is a doubly linked list. Each node has two links, one to the previous node and one to the next. This allows sequential access to the list in both directions.

An example of a doubly linked list.

There are two common variations to the types mentioned above. First is a circularly linked list, where the first and last nodes are linked together. This can be done for both singly and doubly linked lists. To traverse a circular linked list, you begin at any node and follow the list in either direction until you return to the original node. The other variation is the addition of a sentinel node at the beginning of the list. This node represents before the first node and/or after the last node of the list. It also makes it possible to have an empty list.

One strong disadvantage of linked lists over compact data structures like arrays is that they must reserve storage for each data node. This problem is exacerbated by naive approaches to allocation of list nodes, which may store additional metadata with each node. The problem can be reduced by any combination of the following methods:

Using singly-linked lists or xor linked lists wherever possible
Storing the data itself inside the nodes, rather than a reference to it
Using a memory pool for the nodes
CDR coding

Linked List Operations

When manipulating linked lists in-place, care must be taken to not use values that you have invalidated in previous assignments. This makes algorithms for inserting or deleting linked list nodes somewhat subtle. This section gives pseudocode for adding or removing nodes from singly, doubly, and circularly linked lists in-place. Throughout we will use null to refer to an end-of-list marker or sentinel, which may be implemented in a number of ways.

Singly-linked lists

Our node data structure will have two fields:

data -- The data being stored in the node
next -- A reference to the next node

We also keep a variable head which always points to the first node in the list. Traversal of a singly-linked list is easy:

node := head
while node not null
    <do something with node.value>
    node := node.next

The following code inserts a node after an existing node in a singly linked list. Inserting a node before an existing one cannot be done; instead, you have to locate it while keeping track of the previous node.

insertAfter(node, newNode)
    newNode.next := node.next
    node.next    := newNode

Inserting at the beginning of the list requires a separate function. This requires updating head.

insertBeginning(newNode)
    newNode.next := head
    head         := newNode

Similarly, we have functions for removing the node after a given node, and for removing a node from the beginning of the list. To find and remove a particular node, one must again keep track of the previous element.

removeAfter(node)
    nodeBeingRemoved := node.next
    node.next := node.next.next
    destroy nodeBeingRemoved
removeBeginning(node)
    nodeBeingRemoved := head
    head := head.next
    destroy nodeBeingRemoved

Doubly-linked lists

With doubly-linked lists there are even more pointers to update, but also less information is needed, since we can use backwards pointers to observe preceding elements in the list. This enables new operations, and eliminates special-case functions. We will add a prev field to our nodes, pointing to the previous element, and a tail variable which always points to the last node in the list. No null values are strictly needed for doubly linked lists with head and tail, since comparison with head and tail can be used instead, but by placing them before the first node and after the last we help maintain locality of reference.

Iterating through a doubly linked list can be done in either direction. In fact, direction can change many times, if desired.

Forwards

node := head
while node not null
    <do something with node.value>
    node := node.next

Backwards

node := tail
while node not null
    <do something with node.value>
    node := node.prev

These symmetric functions add a node either after or before a given node:

insertAfter(node,newNode)
    newNode.prev := node
    newNode.next := node.next
    if node.next is null
        tail := newNode
    else
        node.next.prev := newNode
    node.next    := newNode
insertBefore(node,newNode)
    newNode.prev := node.prev
    newNode.next := node
    if node.prev is null
        head := newNode
    else
        node.prev.next := newNode
    node.prev    := newNode

Removing a node is easier, only requiring care with head and tail:

remove(node)
    if node.prev is null
        head = node.next
    else
        node.prev.next = node.next
    if node.next is null
        tail := node.prev
    else
        node.next.prev = node.prev
    destroy node

One subtle consequence of this procedure is that deleting the last element of a list sets both head and tail to null.

A curious way of implementing doubly linked lists that uses half as much space for pointers is the xor linked list.

Circularly-linked lists

Circularly-linked lists can be either singly or doubly linked. For both, their circular versions, which link all the nodes into a continuous circle without using null, benefit from the ability to traverse the full list beginning at any given node. This often allows us to avoid storing head and tail, although if the list may be empty we need a special representation for the empty list, such as a head variable which points to some node in the list or is null if it's empty; we use such a head here.

This representation significantly simplifies adding and removing nodes into a non-empty list, but empty lists are a special case. Assuming that someNode is some node in the list, this code iterates through that list starting with someNode:

Forwards

node := someNode
do
    <do something with node.value>
    node := node.next
while node not someNode

Notice the postponing of the test to the end of the loop. This is important for the case where the list contains only the single node someNode.

This simple function inserts a node into a doubly-linked circularly linked list after a given element:

insertAfter(node, newNode)
    newNode.next := node.next
    newNode.prev := node
    node.next.prev := newNode
    node.next      := newNode

Inserting an element in a possibly empty list requires a special function:

insertBeginning(node)
    if head is null
        node.prev := node
        node.next := node
    else
        insertAfter(head.prev, node)
    head := node

Finally, removing a node must deal with the case where the list empties:

remove(node)
    if node.next is node
        head := null
    else
        node.next.prev := node.prev
        node.prev.next := node.next
    destroy node

Tradeoffs

Different types of linked lists have different trade-offs, as is evident from the implementation of their operations above. Singly linked lists are the simplest, especially for simple iteration, are persistent data structuress as explained below, and use less space for pointers than doubly linked lists (xor linked lists aside). Doubly linked lists require more space and more time to manipulate, but offer more freedom of traversal and make removal easier than for singly linked lists. Circular linked lists are most useful for describing naturally circular structures, and have the advantage of regular structure and being able to traverse the list starting at any point; their main disadvantage is the complexity of iteration, which has subtle special cases.

Singly linked lists have one strong advantage over doubly-linked lists: each tail of the list, meaning all the nodes following a chosen one, themselves form a singly linked list. Thus this type of list, sometimes also called a cons-based list, can be seen as a recursive datatype. This enables, for example, two different lists to share a common tail, which in turn enables one to add to the front of the list while keeping a reference to both the new and the old versions, making this type of list a persistent data structure.

Linked lists also have trade-offs with other data structures. The array allows random access, is more compact (stores no pointers), and has better locality of reference, but both insertion and removal are expensive operations, particularly in the middle of the array. Arrays are also typically limited to a fixed size or grow in large jumps, while linked lists grow incrementally as nodes are added.

Sometimes, linked lists are used to implement associative arrays, and are in this context called association lists. There is very little good to be said about this use of linked lists; they are easily outperformed by other data structures such as self-balancing binary search trees even on small data sets (see the discussion in associative array). However, sometimes a linked list is dynamically created out of a subset of nodes in such a tree, and used to more efficiently traverse that set.

Language Support

Many programming languages, such as LISP and Scheme have singly linked lists built in. In other languages it is simple to create the data structure using references. For example, one might create a doubly linked list in C like this:

typedef struct Person
{
    struct Person *prev; // previous person in list
    struct Person *next; // next person in list
    char name;
    int age;
};

An example of code that uses this structure:

Person first; // first item in the linked list
Person *p = &first;

p->prev = NULL;
p->next = NULL;

while(!feof(stdin)) // there's still input
{
    p->next = (Person *)malloc(sizeof(Person));   // first link
    p->next->prev = p;
    p = p->next;
    p->next = NULL;
    printf("Enter name and age:\n");              // Ask for input
    scanf("%s %d", &(p->name), &(p->age));        // Use it
}

Linked Lists On The Web

Some linked list materials are available from the Stanford Computer Science department: