Does this make my code look big?

July 27, 2009

Real World Haskell Chapter 5 – Haskell

Filed under: Chapter 5,Code,Exercise,Haskell — Barry Allison @ 10:02 pm
Tags:

There are only 2 exercises in Chapter 5 which both live in the Prettify.hs file.
I’ve uploaded all of the files I created developing the JSON library for completeness and to be able to actually compile and use it.
Chapter 5 JSON library exercise solutions:

The files I wrote to create the libraries cabal package:

All in all I found this chapter a bit confusing which I think is because the ideas behind it are obviously based on the built in Haskell pretty printing library which isn’t introduces until the end of the chapter. There was no real clear explanation of the structure and purpose of the different parts of the library which made reading and following some of the ideas very dry. Plus not much clarity about the placing of some of the definitions, stubs and functions.

Having said that the overall structure for writing the exercise solutions are given in the pretty and compact functions so both exercises were just a matter of adapting them to the job at hand.

Exercise 1

Keep a note of the current column (in the same way pretty does) and whenever a Line is encountered append spaces at the end of the current line to pad it out to the required width. The rest of the tree structure of the Document is maintained ‘as is’.

fill :: Int -> Doc -> Doc
fill width d = scanLines 0 [d]
    where scanLines col (d:ds) =
              case d of
                Empty        -> Empty  scanLines col ds
                Char c       -> d  scanLines (col + 1) ds
                Text s       -> d  scanLines (col + length s) ds
                Line         -> (padLine col  Line)  scanLines 0 ds
                a `Concat` b -> scanLines col (a:b:ds)
                a `Union` b  -> scanLines col (a:ds) `Union`  scanLines col (b:ds)
          scanLines _ _ = Empty
          padLine pos = text $ replicate (width - pos) ' '

Exercise 2

This works in the same way that fill does except the current line’s nesting position is maintained rather than the current item’s character position in the current line.
When an open brace or bracket is found the nesting is increased and when closing braces or brackets are found the nesting is decreased. After a new Line spaces are added to the beginning of the line according to the current nesting value. In the renderJValue function braces are implemented using only Char Doc values
- initially I also checked inside Text Doc values as well but realized that it isn’t necessary.

nest :: Int -> Doc -> Doc
nest indent d = scanLines 0 [d]
    where scanLines col (d:ds) =
              case d of
                Empty        -> d  scanLines col ds
                Char c       -> d  scanLines (col + offset c) ds
                Text s       -> d  scanLines col ds
                Line         -> d  indentLine col  scanLines col ds
                a `Concat` b -> scanLines col (a:b:ds)
                a `Union` b  -> scanLines col (a:ds) `Union`  scanLines col (b:ds)
          scanLines _ _        = Empty

          indentLine pos    = text $ replicate pos ' '
          offset c | c `elem` "{[" = indent
                   | c `elem` "}]" = negate indent
                   | otherwise     = 0

July 17, 2009

Real World Haskell Chapter 4 – Scheme

Filed under: Chapter 4,Code,Exercise,Scheme — Barry Allison @ 4:20 pm
Tags:

Before I could tackle the exercises in chapter 4 I think it would be useful to have some of the nice functions from Haskell’s prelude, so I wrote a prelude library to help. I’ll be updating it as I work my way through the book.
Scheme prelude and the lazy version
Interact with text file framework
First words program
Transpose text program
Chapter 4 exercise solutions

The start of chapter 4 gives a framework for transforming text files which I needed to convert into Scheme.
The only complex part is being able to invoke a scheme text file as a script, but the PLT documentation gives the answer. I’m not sure how this would work under windows, but there must be an equivalent. I’m posting the magic part here for posterity and so I can find it more easily in the future.
Their script uses exec mzscheme -cu "$0" ${1+"$@"}.
The -c switch disables loading compiled files (which I think causes some/a;; of the library functions used to be loaded and compiled when the script runs but I got tired of the extra delay caused so I’m using:

#! /bin/sh
#|
exec mzscheme -u "$0" ${1+"$@"}
|#
#lang scheme/base

I realise I have no idea how the Haskell code schedules and performs the reading and writing of the files.
I began to wonder about efficiency though or more precisely the lazy nature of Haskell and how reading in a text file line by line and processing it actually works. According to the documentation readFile the file is read in lazily line by line as required so I expect it depends on how function uses the list of strings it is passed.

interactWith function inputFile outputFile = do
  input <- readFile inputFile
  writeFile outputFile (function input)

I tried 2 different approaches in Scheme:

  1. Reading in the whole file into a list of strings, then performing the transformation on each line in the list.
  2. Reading and writing the file a line at a time, performing the transformation and writing it as it is read.

I went with the second option initially and so the framework code for transforming a file:

(define (main)
  (define arguments (current-command-line-arguments))
  (if (= (vector-length arguments) 2)
      (let ([in  (vector-ref arguments 0)]
            [out (vector-ref arguments 1)])
        (interact-with my-function in out))
      (error "interact-with expects exactly 2 arguments  : interact-with in-file-path out-file-path\n\tProvided : " arguments)))

The procedure that applies the line by line transformation

(define (interact-with f in out)
  (with-output-to-file out #:mode 'text #:exists 'replace
    (λ () 
      (with-input-from-file in 
        (λ ()
          (let next-line ([line (read-line)])
            (cond [(eof-object? line) (void)]
                  [else (write-string (f line))
                        (newline)
                        (next-line (read-line))])))
            #:mode 'text))))

I thought of using call-with-input-file and call-with-input-file but I couldn’t find a neat way of capturing the procedure to transform each line without writing a nested procedure, but it doesn’t seem any more elegant.

I didn’t convert the FixLines to Scheme as mzscheme seems to cope with both Unix and Windows line separators (at least in the Ubuntu version.)

Exercise 1

I can see I’m going to get in a lot of trouble later on in the book coping with Haskell’s type system under Scheme but here goes with Maybe. I think there must be a way of using macros to cope with types like Maybe, but I’ll leave that as an exercise for myself. I have enough new concepts in Haskell to take on-board currently.

(define-struct maybe (value))
(define Just make-maybe)
(define Nothing (make-maybe 'Nothing))
(define (nothing? o) (equal? o Nothing))

(define (safe-head o) (if (null? o) Nothing (Just (first o))))
(define (safe-tail o) (if (null? o) Nothing (Just (rest o))))
(define (safe-last o) (if (null? o) Nothing (Just (last o))))
(define (safe-init o) (if (null? o) Nothing (Just (init o))))

The init and last procedures are in the Scheme Prelude library I wrote.

Exercise 2

I’m using values and let-values to implement an analogue of tuples for example in defining and using span and break but I ran into a problem with that which I’ll explain later. The split-with that preserves all list members:

(define (split-with+ p xs)
  (cond [(null? xs) null] 
        [else (let-values ([(head) (first xs)]
                           [(pre suf) (span p (rest xs))])
                (cons (cons head pre)
                      (split-with+ p suf)))]))

The non-preserving version:

(define (split-with p xs)
  (define (split-suffix suf) (split-with p (drop-while (combine not p) suf)))
  (if (null? xs)
      null
      (let-values ([(pre suf) (span p xs)])
        (if (null? pre)
            (split-suffix suf)
            (cons pre (split-suffix suf))))))

Exercise 3

Once again it’s a little messier dealing with strings – using a higher order functional approach to deal with strings requires converting strings into lists of characters and back again. Still at least Scheme has procedures for that.

(define char-nonspace? (compose not char-whitespace?))
(define (first-word line)
  (list->string
   (take-while char-nonspace? 
               (drop-while char-whitespace? (string->list line)))))

Exercise 4

The transpose procedure needs all the lines of text before it can work and so I had to switch back to my first approach for dealing with text files, namely to read the whole file into a list of strings, then apply the transformation, then write out the result.

(define (interact-with f in out) (write-lines out (f (read-lines in))))
(define (read-lines in)
  (with-input-from-file in 
    (λ ()
      (let next-line ([line (read-line)])
        (if (eof-object? line) 
            null
            (cons line (next-line (read-line))))))
    #:mode 'text))
(define (write-lines out ls)
  (with-output-to-file out #:mode 'text #:exists 'replace
    (λ () (for-each (λ (line) (write-string line) (newline)) ls))))

Picking the first letter of each line and and the rest of the letters from each line is a little more fiddly in Scheme but for the Haskell version I also had to write extra functions to mimic head and tail to be able to use map safely on different length lines.
I’m still not entirely happy that it doesn’t pad spaces in front of transposed lines as you’d expect but it works for lines of the same length and so that’s goods enough for now.

(define (transpose-lines lines) 
  (let ([heads (apply string-append (map safe-string-first lines))]
        [tails (map safe-string-rest lines)])
    (if (string-empty? heads)
        null
        (cons heads (transpose-lines tails)))))

(define empty-string "")
(define (string-empty? s) (string=? s empty-string))
(define (safe-string-first s) (if (string-empty? s) empty-string (substring s 0 1)))
(define (safe-string-rest s)  (if (string-empty? s) empty-string (substring s 1)))

Folds Exercise 1

Basically the same as the Haskell version except for once again using string based functions instead of list based ones.

(define (as-int-fold ds)
  (cond [(string=? ds "") 0]
        [(char=? (string-ref ds 0) #\-) (- (as-int-fold (substring ds 1)))]
        [else (foldl shift-right-and-add 
                     0
                     (string->list ds))]))

(define (shift-right-and-add d acc)
  (+ (* 10 acc) 
     (char->digit d)))

Then for fun I decided to try using a right fold but I ran into a problem. Let’s say our string has 4 digits d1 d2 d3 d4 and the intermediate result after processing d4 and d3 is res so d1 * 1000 + (d2 * 100 + (d3 * 10 + (d4 * 1)))
At each iteration 3 values are needed – the current result, the current digit and the factor needs to be multiplied by 10 to give
So I wrote:

(define (shift-right-and-add+ d acc)
  (let-values ([(res factor) acc]) 
    (+ (* 10 acc) 
        (char->digit d)))

But calling it with (foldr shift-right-and-add+ (values 0 1) (string->list ds)) -> error: context expected 1 value, received 2 values: 0 1
Considering I’d finished the exercise with a left fold I checked out the comments for the exercise on the Real World Haskell web site and found the delightful:

asInt_fold:: String -> Int
asInt_fold ('-':rest) = -asInt_fold (filter isDigit rest)
asInt_fold str = foldr step 0 (reverse str)
where step i acc = acc * 10 + digitToInt i

I really wish I’d had the insight to think of that. Thank you Marcus Edwards.
This means the same step function can be used as with the left fold version and gives the much more elegant:

(define (as-int-foldr ds)
  (cond [(string=? ds "") 0]
        [(char=? (string-ref ds 0) #\-) (- (as-int-foldr (substring ds 1)))]
        [else (foldr shift-right-and-add 
                     0
                     (reverse (string->list ds)))]))

I won’t comment on the other solutions as they are pretty much just conversion from Haskell into the appropriate Scheme.
Except for the cycle exercise. The only way I could think of doing this in Scheme is to use delay, force and promise but that would mean re-writing all list handling functions but luckily PLT has a lazy scheme language which can be used almost interchangeably with regular scheme. I’m considering using this for future exercises, but for now this is a stand-alone exercise using the PLT lazy scheme language.

#lang lazy
(define (integers-from n) (cons n (integers-from (add1 n))))
(define (ba-cycle xs)
  (define (cycle-step x acc) (cons xs acc))
  (foldr cycle-step null (integers-from 0)))

July 15, 2009

Real World Haskell Chapter 4 – Haskell

Filed under: Chapter 4,Code,Exercise,Haskell — Barry Allison @ 9:19 pm
Tags:

Whew! after a long hiatus I’ve managed to return to studying Real World Haskell. Straight down to it then.
Chapter 4 exercise solutions.
FirstWords program.
Transpose words program.

Exercise 1

safeHead, safeTail, safeInit, safeLast each require a simple guard pattern to return Nothing if an empty list is passed as an argument.

safeHead :: [a] -> Maybe a
safeHead [] = Nothing
safeHead xs = Just (head xs)

safeTail :: [a] -> Maybe [a]
safeTail [] = Nothing
safeTail xs = Just (tail xs)

safeLast :: [a] -> Maybe a
safeLast [] = Nothing
safeLast xs = Just (last xs)

safeInit :: [a] -> Maybe [a]
safeInit [] = Nothing
safeInit xs = Just (init xs)

Exercise 2

The definition of splitWith isn’t very well defined so I decided to take 2 approaches

  • Preserve all list members and break the list at the point the predicate returns False,
  • Discard all list members where the predicate is False.

I spent much too long on this because I was convinced there would be a nice elegant way of using the pattern let (pre,suf) = span p xs in introduced earlier when writing splitLines that could be used generically in both versions but I couldn’t really find it.

Preserving version:

splitWith :: (a -> Bool) -> [a] -> [[a]]
splitWith _ []     = []
splitWith p (x:xs) = (x : pre) : splitWith p suf
    where (pre,suf) = span p xs

Discarding version:

splitWith' :: (a-> Bool) -> [a] -> [[a]]
splitWith' _ [] = []
splitWith' p xs = case pre of
                    [] -> splitSuffix
                    _  -> pre : splitSuffix
    where
      (pre,suf)    = span p xs
      splitSuffix  = splitWith' p (dropWhile (not . p) suf)

Exercise 3

I won’t post the whole program because the only relevant things are firstWord and firstWords

firstWords :: String -> String
firstWords input = unlines (map firstWord (splitLines input))

firstWord :: String -> String
firstWord ws = takeWhile nonSpace $ dropWhile isSpace ws

nonSpace :: Char -> Bool
nonSpace = not . isSpace

I could have written takeWhile (not . isSpace) $ dropWhile isSpace ws but I thought that takeWhile nonSpace is clearer to read.

Exercise 4

There is already a function in the Prelude to transpose lists but it only works on lists of equal length. I could have written
transposeFile input = unlines $ transpose $ splitLines input but it’s unlikely that the number of words per line in a text file will be constant so I decided to cope with lists of different length so that:

ABC       A137
1 2       B 48
3456  =>  C259
789       6

Although this doesn’t preserve position for columns with no entries above them – so the 6 appears as the first character on the last transformed line.

transposeLines :: [String] -> [String]
transposeLines xs = if null heads
                    then []
                    else heads : transposeLines tails
                         where heads = concat $ map safeFirst xs
                               tails = map safeRest xs

safeFirst :: [a] -> [a]
safeFirst []     = []
safeFirst (x:xs) = [x]

safeRest :: [a] -> [a]
safeRest []     = []
safeRest (x:[]) = []
safeRest (_:xs) = xs

Folds – Exercise 1

The trick of coping with negative numbers is simply to write a guard that recursively calls asInt_fold – for some reason this reminded me of writing overloaded constructors in OO languages.
To find the overall value of a string of digits d1 d2 d3 d4

a) d1 = d1
b) d1 d2 = 10 d1 + d2 = 10 a + d2
c) d1 d2 d3 = 100 d1 + 10 d2 + d3 = 10 b + d3
d) d1 d2 d3 d4 = 1000 d1 + 100 d2 + 10 d3 + d4 = 10 c + d4

Applying a function to digits from left to right implies a left fold.

asInt_fold :: String -> Int
asInt_fold ('-':xs) = - asInt_fold xs
asInt_fold xs       = foldl shiftRightAndAdd 0 xs
    where
      -- shiftRightAndAdd :: Int -> Char -> Int
      shiftRightAndAdd acc x = 10 * acc + digitToInt x

I didn’t have to extend the function to add error handling as charToInt already takes care of the cases mentioned.

Exercise 2

One thing I’m enjoying about the book is the attitude of throwing in extra parts of the language or libraries and encouraging us to go off and read about it ourselves for example introducing the Either data type and related Left and Right constructors. Having said that I couldn’t find a way of using Either without writing verbose (and ugly) code as I have to de-construct and reconstruct values throughout. It also seems like it will a maintenance headache as changing something minor in its use will require a lot of careful fiddly changes to the code. I hope late I’ll find a more elegant or robust way of using it but I’m already wary.

type ErrorMessage = String

asInt_either :: String -> Either ErrorMessage Int
asInt_either ('-':[]) = Left "No digits supplied"
asInt_either ('-':xs) = case asInt_either xs of
                          Left mess -> Left mess
                          Right val -> Right (negate val)
asInt_either xs       = foldl radixAdd (Right 0) xs
    where
      -- radixAdd :: Either ErrorMessage Int -> Char -> Either ErrorMessage Int
      radixAdd acc x =
          case acc of
            Left mess -> Left mess
            Right acc -> digitToInt' x
                where -- digitToInt' :: Char -> Either ErrorMessage Int
                      digitToInt' x = if isDigit x
                                      then Right (10 * acc + digitToInt x)
                                      else Left ("non digit '" ++ x:"'")

Exercise 3

Each step of concat is a matter of appending the current item to the previous result l1 ++ (l2 ++ (l3 ++ (l4 ++ [])))

a) l4 = l4
b) l3 ++ l4 = l3 ++ a
c) l2 ++ l3 ++ l4 = l2 ++ b
c) l1 ++ l2 ++ l3 ++ l4 = l1 ++ c

Building up the value using items from right to left means a right fold as a simple one liner.

baConcat :: [[a]] -> [a]
baConcat = foldr (++) []

Exercise 4

The recursive version of takeWhile:

baTakeWhile :: (a -> Bool) -> [a] -> [a]
baTakeWhile f [] = []
baTakeWhile f (x:xs)
 | f x       = x : baTakeWhile f xs
 | otherwise = []

Converting a recursive function that exits before the list argument is exhausted seems to be a wasteful because the fold calls its function argument for every value – but its a good exercise to think more creatively about using folds at a higher level. The key insight for me was that accumulating all items to the end of the result list (in original order) until the predicate is False is identical when traversing in reverse order to starting with an empty list and appending its value to the front of the result which lends itself a right fold.

baTakeWhile' :: (a -> Bool) -> [a] -> [a]
baTakeWhile' f xs = foldr takeStep [] xs
    where takeStep x acc = if f x then x:acc else []

Exercise 5

I had to rewrite my answer for this one because my original answer compared items pairwise along the length of the list so baGroupBy f [a,b,c,d] executes f a b, f b c, f c d until the predicate returns False.
groupBy doesn’t behave like that though – it executes : f a b, f a c, f a d until the predicate returns False.

baGroupBy :: (a -> a -> Bool) -> [a] -> [[a]]
baGroupBy f xs = foldl step [] xs
    where
    step [] x  = [[x]]
    step acc x =
        if f (head lastAdded) x
        then init acc ++ [lastAdded ++ [x]]
        else acc ++ [[x]]
            where lastAdded = last acc

My original answer though needed much more creative thinking – I transformed the list into a list of tuples using zip and ignored the last item – so [a,b,c,d] -> init [(a,b),(b,c),(c,d),(d,a)].
I’ll remember this technique in future when I need to hop pairwise through list items.

Exercise 6

any p xs can be written p xn || pxn-1 || ... || p1 || False and so we have a right fold

baAny :: (a -> Bool) -> [a] -> Bool
baAny p = foldr or False
          where or x acc = p x || acc

I would write cycle as a recursive function with no exit test and let Haskell’s laziness deal with
A right fold can be applied to an infinite list and in the past I’ve seen a right fold used to generate an infinite result list e.g. of even numbers using foldr (*2) [1..] and so

baCycle :: [a] -> [a]
baCycle xs = foldr selfAppend xs [1..]
    where selfAppend x ys = xs ++ ys

Another little trick I’ve discovered is using a tuple of (state, returnValue) as the return value the function being folded.
It feels a bit dirty though, like imperative programming, but the alternative when building up a list is to continually deconstruct and reconstruct it as in the gropBy function.
The list of words can be built up by working from the end of the given string and pre-pending each word built up form non spaces at the front of the result which means I can use a right fold again.

baWords :: String -> [String]
baWords ws = if null headWord
             then tailWords
             else headWord:tailWords
    where
      (headWord, tailWords) = foldr buildWord ("",[]) ws
      buildWord c (thisWord,allWords) | isSpace c = if null thisWord
                                                    then ("", allWords)
                                                    else ("", thisWord:allWords)
                                      | otherwise = (c:thisWord, allWords)

unlines can also be written using a right fold by prepending each word plus by a newline at the front of the result.

baUnlines :: [String] -> String
baUnlines ls = foldr appendCr [] ls
    where appendCr w ws = w ++ "\n" ++ ws

Lots of new things in this chapter but I do like the way the exercises stretch the material. The thing that caused me the most trouble was working out how the left fold was written using the right fold. Haskell keeps doing this, just when I start to feel comfortable with something I see there’s much more to it than meets the eye. I’m going to grab a copy of the cited paper on folds by Paul Hutton and have a look at that. Understanding how the left fold definition using foldr is one thing, but actually using a similar technique is going to take a lot of effort.

Theme: Rubric. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.