# Difference between revisions of "Infinity and efficiency"

(introduction) |
(dropWhileRev) |
||

Line 6: | Line 6: | ||

A very simple examples is the function <code>reverse . reverse</code> which seems to be an inefficient implementation of <code>id</code>. |
A very simple examples is the function <code>reverse . reverse</code> which seems to be an inefficient implementation of <code>id</code>. |
||

But if you look closer it is not exactly equivalent to <code>id</code>, because for infinite input list, the function <code>reverse . reverse</code> is undefined, whereas <code>id</code> is defined. |
But if you look closer it is not exactly equivalent to <code>id</code>, because for infinite input list, the function <code>reverse . reverse</code> is undefined, whereas <code>id</code> is defined. |
||

+ | |||

+ | Now lets consider a more complicated example. |
||

+ | Say we want to program a function, that removes elements from the end of a list, |
||

+ | just like <code>dropWhile</code> removes elements from the beginning of a list. |
||

+ | We want to call it <code>dropWhileRev</code>. |
||

+ | (As a more concrete example imagine a function which removes trailing spaces.) |
||

+ | A simple implementation is |
||

+ | <haskell> |
||

+ | dropWhileRev :: (a -> Bool) -> [a] -> [a] |
||

+ | dropWhileRev p = reverse . dropWhile p . reverse |
||

+ | </haskell> |
||

+ | You will already guess, that it also does not work for infinite input lists. |
||

+ | And actually it is also inefficient, because <code>reverse</code> requires to compute all list nodes |
||

+ | (but it does not require to compute the values stored in the nodes). |
||

+ | Thus the full list skeleton must be hold in memory. |
||

+ | However it is possible to implement <code>dropWhileRev</code> in a way that works for more kinds of input. |
||

+ | <haskell> |
||

+ | dropWhileRev :: (a -> Bool) -> [a] -> [a] |
||

+ | dropWhileRev p = |
||

+ | foldr (\x xs -> if p x && null xs then [] else x:xs) [] |
||

+ | </haskell> |
||

+ | Here, <code>foldr</code> formally inspects the list from right to left, |
||

+ | but actually it processes data from left to right. |
||

+ | Whenever a run of elements occurs, which matches the condition <code>p</code>, |
||

+ | then these elements are hold until the end of the list is encountered (then they are dropped), |
||

+ | or a non-matching list element is found (then they are emitted). |
||

+ | This works in many cases, but fails if the number of matching elements becomes too large. |
||

+ | The maximum memory consumption depends on the length of the runs of non-matching elements, |
||

+ | which is much more efficient than the naive implementation above. |

## Revision as of 21:13, 18 September 2008

In this article we want to demonstrate how to check efficiency of an implementation by checking for proper results for infinite input.
In general it is harder to reason about time and memory complexity of an implementation than on its correctness.
In fact in Haskell inefficient implementations sometimes turn out to be wrong implementations.
A very simple examples is the function `reverse . reverse`

which seems to be an inefficient implementation of `id`

.
But if you look closer it is not exactly equivalent to `id`

, because for infinite input list, the function `reverse . reverse`

is undefined, whereas `id`

is defined.

Now lets consider a more complicated example.
Say we want to program a function, that removes elements from the end of a list,
just like `dropWhile`

removes elements from the beginning of a list.
We want to call it `dropWhileRev`

.
(As a more concrete example imagine a function which removes trailing spaces.)
A simple implementation is

```
dropWhileRev :: (a -> Bool) -> [a] -> [a]
dropWhileRev p = reverse . dropWhile p . reverse
```

You will already guess, that it also does not work for infinite input lists.
And actually it is also inefficient, because `reverse`

requires to compute all list nodes
(but it does not require to compute the values stored in the nodes).
Thus the full list skeleton must be hold in memory.
However it is possible to implement `dropWhileRev`

in a way that works for more kinds of input.

```
dropWhileRev :: (a -> Bool) -> [a] -> [a]
dropWhileRev p =
foldr (\x xs -> if p x && null xs then [] else x:xs) []
```

Here, `foldr`

formally inspects the list from right to left,
but actually it processes data from left to right.
Whenever a run of elements occurs, which matches the condition `p`

,
then these elements are hold until the end of the list is encountered (then they are dropped),
or a non-matching list element is found (then they are emitted).
This works in many cases, but fails if the number of matching elements becomes too large.
The maximum memory consumption depends on the length of the runs of non-matching elements,
which is much more efficient than the naive implementation above.