blog.vi-kan.net One thought after another

Project Euler, Problem 18 & 67

I have been struggling with a couple of Euler Problems the past days. Not so much finding the solution as finding a couple of bugs in my code…

By starting at the top of the triangle below and moving to adjacent numbers on the row below, the maximum total from top to bottom is 23.

3 7 4 2 4 6 8 5 9 3

Find the maximum total from top to bottom of the given triangle.The 18th problem involves finding the optimal route from top to bottom in a pyramid  of 15 rows. It is possible to solve this using a brute force algorithm, testing every possible solution. Problem no. 67 is the exact same problem, but this time with a pyramid of 100 rows. The problem text includes a little warning, stating that a brute force approach would take a couple of billion years to complete…

I remember back when I was a kid, I used to solve labyrinths like ‘help the little mouse find its cheese’, I always started with the cheese. It was always easier to find the way back to the mouse. I would guess that the same would go for this one. If we can solve the problem from the bottom instead of the top, it should be easier to choose the right path.

6 1 7 9 2 3 8 5 9 3

So, lets start at the bottom of the four-row pyramid. We have four numbers: 8 5 9 and 3. Even though 9 is the larger one, we have no idea if 9 is where we want to end up, since the numbers leading to the first number, 8, could be higher. Imagine the third row to be 9 2 and 3. The only way down to the 8 would be through the 9 in the third row, and down to the 9 we could have come through the 2 or the 3. 8 9 = 17 looks like a much better total then 2 9 = 11 or 3 9 = 12. Moving up one more row, to the second row, lets imagine the numbers 1 and 7. If we went for the  8 9 combination, we would only have one choice up to the second row, making a total of 8 9 1 = 18. If we went for the 9 3 combination, the 7  would be our only choice, giving a total of 19. From the 2 in the third row, we could reach either one, but would not get a higher total any way, so lets just drop that option all together.

Now, what seemed like a great path, starting with 8 9, turned out to be beaten by another option higher in the pyramid. How could we predict that? We couldn’t. So what would make going from bottom to top a better solution then from the top to bottom? Well, we managed to throw away a couple of options. We never payed either the 5 or the 3 in the bottom row any attention. From the 5, we could have gone up through 9 or 2 in the third row, but if we have come to the 9, why should we go down to the 5 when we know that the 8 would give a better total? Same goes for the 2. Why go down to 5 for a total of 7 when we can go down to 9 for a total of 11? And that is the clue to solve this problem. We need to know what would give the greater total, left or right.

OK. So lets start at the bottom again, this time, with some of the numbers from the given pyramid in problem 18:

If we start with the 63, we could either go down to 04 or 62. Obviously, knowing this is the last row, we would go for 62 and a total of 125. From 66 we can go to 62 or 98. Once again, we would take the right path for a total of 164. From 04, we would pick 98, from 68, 27 or 89, 23 and so on.

Now we can easily know where we want to go next from the 14th row. Lets add the 13th row:

Since we already know the best path from each number in the 14th row down to the 15th, we can treat the trip from the 13th row down to the 14th in the same way. From the first number, 91, we know we can choose between 63 and 66. We know that if we choose 63, we can reach a maximum of 91 125 by going right from 63. If we choose 66, we know that we can reach a total of 91 128 by going left. From the second number, 71, we can go left through 66, or right through 04. By choosing right, we can reach a maximum of 71 102, so clearly left would be a better choice, which gives us 71 164.

Do you see the pattern? I think we are ready for some code! I started out with a NumberPiramid-class. It has a method for adding a row of numbers, and a method returning the calculated maximum total.

@interface NumberPiramid : NSObject {
  NSMutableArray *rows;
}

-(void)addRow:(NSString *)stringWithNumbers;
-(int)numberOfRows;
-(int)maximumTotal;

@end

The method addRow:stringWithNumbers takes all numbers for one row as a string separated by spaces. The string is split into an array of Number, a class able to hold both the number itself, and the totals for the left and right side.

òbjective-c -(void)addRow:(NSString *)stringWithNumbers; { NSArray *numbers = [stringWithNumbers componentsSeparatedByString:@" "]; NSMutableArray *row = [[NSMutableArray alloc] initWithCapacity:[numbers count]]; for (NSString *number in numbers) { Number *n = [[Number alloc] initWithNumber:number]; [row addObject:n]; [n release]; } [rows addObject:row]; [row release]; } `

So now, we have an array of arrays that we can use to find our total. The method maximumTotal will first traverse the arrays bottom up, adding the numbers one by one, always keeping the larger option. I have given the Number class a convenience method maxSum, that returns the larger number of leftSum and rightSum. When we have reached the top, we have our total.

-(int)maximumTotal;
{
  NSArray *row = [rows lastObject];
  NSArray *prevRow;
  for (int r = [self numberOfRows] - 2; r >= 0; r--) {
    prevRow = row;
    row = [rows objectAtIndex:r];
    for (int n = 0; n < [row count]; n  ) {
      Number *number = [row objectAtIndex:n]>
      Number *left = [prevRow objectAtIndex:n];
      Number *right = [prevRow objectAtIndex:n   1];
      number.leftSum = number.number   [left maxSum];
      number.rightSum = number.number   [right maxSum];
    }
  }
  Number *top = [row objectAtIndex:0];
  return [top maxSum];
}

And thats all there is. Problem 67 is the exact same problem, but with a larger pyramid. The numbers are given in a textfile, so we need a new method that can read the textfile line by line. Actually, I found it easier to just read the whole file into a string, using NSString initWithContentsOfFile:filename. I then split this string into lines with the same method I use to split a single row.

As before, the code is available through svn. There is a delphi implementation too.

Project Euler, Problem 17

After my success on getting unittest to work on the iPhone, I thought I should try it out some more. It’s been a while since last time I solved an Euler Problem, so maybe this would be a good excuse to solve another one? The last problems involved a lot of math, so this time I found something completely different:

If the numbers 1 to 5 are written out in words: one, two, three, four, five, then there are 3 3 5 4 4 = 19 letters used in total. If all the numbers from 1 to 1000 (one thousand) inclusive were written out in words, how many letters would be used?

Well, lets see, then…

We would need a class that can convert an integer into written words. I will call it WrittenNumber, and give it a class method (NSString *)fromInt:(int) to do the work.

@interface WrittenNumber : NSObject {
}
 (NSString *) fromInt:(int)number;
@end

Now, that I got my basic interface in place, I tried it out with the most basic test:

- (void)testSingleDigitOne {
    GHAssertEqualStrings([WrittenNumber fromInt:1], @"one", nil);
}

As expected, it failed :

    euler17/testSingleDigitOne ✘ 0.00s

    Name: GHTestFailureException
    File: /../euler17.m
    Line: 40
    Reason: '' should be equal to 'one'.

I will not take you through every single step I took, including returning ‘one’, writing a new test and realize that returning ‘one’ didn’t work for other cases and so on… What we need, is to identify every word that we need to build up every number from one to 1000. Up to nineteen, there are all unique words. From there, most written numbers are combined by two or more words. So I put these words into a couple of static arrays:

static NSString * simpleNumbers[] = {
    @"", @"one", @"two", @"three", @"four", @"five", @"six", @"seven", @"eight", @"nine",
    @"ten", @"eleven", @"twelve", @"thirteen", @"fourteen", @"fifteen", @"sixteen", @"seventeen", @"eighteen", @"nineteen"
};

static NSString *tens[] = {
    @"", @"", @"twenty", @"thirty", @"forty", @"fifty", @"sixty", @"seventy", @"eighty", @"ninety"
};

I have padded the ‘tens’-array to make indexing a little easier. tens2 = “twenty”, tens[8] = “eighty” and so on. Now, the algorithm that I ended  up with, was to start with the larger part of the number, converting it into text, and then handle the rest in the same way. Peele away the larger part, convert it to text, and repeat.

(NSString *) processSimpleNumbers:(int)number
{
    return simpleNumbers[number];
}

(NSString *) processTens:(int)number
{
    if (number < 20) {
        return [self processSimpleNumbers:number];
    }

    int div = number / 10;
    int rest = number % 10;
    NSString *tmp = tens[div];
    if (rest > 0){
        return [NSString stringWithFormat:@"%@-%@", tmp, [self processSimpleNumbers:rest]];
    } else{
        return tmp;
    }
}

 (NSString *) processHundreds:(int)number;
{
    if (number < 100) {
        return [self processTens:number];
    }

    int div = number / 100;
    int rest = number % 100;
    NSString *tmp = [NSString stringWithFormat:@"%@ hundred", simpleNumbers[div]];
    if (rest > 0){
        return [NSString stringWithFormat:@"%@ and %@", tmp, [self processTens:rest] ];
    } else {
        return tmp;
    }
}

 (NSString *) processThousands:(int)number;
{
    if (number < 1000){
        return [self processHundreds:number];
    }

    int div = number / 1000;
    int rest = number % 1000;

    NSString *tmp = [NSString stringWithFormat:@"%@ thousand", [self processHundreds:div]];

    if (rest > 0) {
        if (rest < 100) {
            return [NSString stringWithFormat:@"%@ and %@", tmp, [self processTens:rest]];
        } else {
            return [NSString stringWithFormat:@"%@ %@", tmp, [self processHundreds:rest]];
        }
    } else{
        return tmp;
    }
}

 (NSString *) fromInt:(int)number
{
    return [self processThousands:number];
}

I’m sure there are prettier ways of solving this, but it works. The remaining task, is trivial. Convert each number and accumulate the length of the strings after removing hyphens and spaces. Of cause, we could get rid of these characters right away by not returning them in the first place, but then again, I did not…

I have uploaded the source to my svn repository if you want to take a look. You’ll find the WrittenNumber class in the iPhone/src folder, and the unittests in the iPhone/tests folder.

UnitTesting on the iPhone

I have to admit, that I find xcode and objective-c the most challenging developing environment I have ever tried. Specifically, I have struggled to find an easy way of including unittests for an application. Even though it comes with some unittest templates, I haven’t successfully put it to work. Can’t say I have tried that much either; it all seemed to complicated to be worth it…

This weekend, the following tweet got my attention:

![TDD for iPhone Development : http://www.vimeo.com/16722907][2]

The video shows a step-by-step guide on how to get started with unittesting for the iPhone. It looked great, and not that complicated, so I decided to give it a try. 10 minutes later, only delayed by a mysterious “Failed to launch simulated application: Unknown error.”-error*, I had my first unittest up and running in the simulator.

If you develop for the iPhone, and either have struggled with unittesting, or just never tried, I will recommend you watch the video. The steps followed in the video is thoroughly documented on the authors home page.

*I never found a reason for this error. As for so many other error messages I have encountered in xcode, starting a new project resolved the issue…

Basic CouchDB interaction

So, did you read my little introduction? Did you go nuts with the Futon application?   Well, then, I guess it's time to look a little closer on the CouchDB API.

As mentioned before, CouchDB provides a RESTful API. You will be sending requests over HTTP, and checking status codes for success. This makes it easy to test features, by using small utilities like HTTP Client and Curl. All GET operations should be easy enough to do through your browser, and some browsers have addons that lets you make other kind of requests too. So find a tool that suites you well, and start the expedition.

Say Hello to CouchDB

Before we start making databases and documents, you should check that the server is running. Make a simple GET request to the server to see if you get any response.

GET http://127.0.0.1:5984/

If the server is running, it will respond with a welcome message and a version number.

{"couchdb":"Welcome","version":"0.11.0"}

You can also check to see what databases allready exists on the server, by doing a GET request to /alldbs.

GET http://127.0.0.1:5984/_all_dbs

The resonse will be formatted as an JSON array with database names.

["business-cards","newdb","testdb"]

Databases

CouchDB exposes most everything as resources accessable by urls. This is one part of what makes it RESTful. So when you want to access one of the existing databases, you will make a GET request for the name of the database.

GET http://127.0.0.1:5984/business-cards

This request want return the complete database with all its documents. It will instead return some information about the database.

{"db_name":"business-cards", "doc_count":2, "doc_del_count":1, "update_seq":4, "purge_seq":0, "compact_running":false, "disk_size":16473, "instance_start_time":"1273129328746881", "disk_format_version":5}

As you can see, our business card database from our introduction contains two documents. It also contains one deleted document which is not purged yet.

Since the name of the database is used as an url, there has to be some constraints on what characters can be used. To make it easy to support different platforms, and avoid issues with casing, CouchDB will only allow lowercase letters in the range 'a' to 'z'. It will also allow digits and the following characters: _ $ ( ) – /

By allowing the /-character, it is possible to group different databases into namespaces, like /blog/posts and /blog/comments. It can be a little tricy to test, though. You will need to encode the / into /, but HTTP Client, which I use, doesn't like that. It will encode the % into % and send %2F instead of /. I guess there is some way of escapting this, but I don't know for sure.

So lets create our first database. I will make myself a database of geocaches. So what would be more natural then just to PUT it right there at the server?

PUT http://127.0.0.1:5984/geocaches

If the database was created as expected, we will get a positive response.

{"ok":true}

If the database name is taken, we will get an error response. The response will contain a standard HTTP status code, 412 Precondition Failed, and the body of the response will give us even more information.

{"error":"file_exists","reason":"The database could not be created, the file already exists."}

Similar, if we try to use one of the forbidden characters, we will get a 400 Bad Request with additional information on how to behave in the future.

{"error":"illegal_database_name", "reason":"Only lowercase characters (a-z), digits (0-9), and any of the characters _, $, (, ),  , -, and / are allowed"}

Nice.

If you need to remove the database again, you make a…..?  Yes, a DELETE request.

DELETE http://127.0.0.1:5984/geocaches

If everything goes well, the server will respond with a happy 'ok', and the database with all its documents is gone. No 'please, confirm' or 'are you sure', so now you're warned.

Documents

In the world of CouchDB, a document consists of key-value pairs. Every document needs an unique id, which is stored in the '_id'-key. When you create a new documet, CouchDB will provide one for you if you want it too. The id will then be in the form of a guid, a string of 32 hexadecimal values.

The id of a document will be used as part of the uri when you later want to retrieve or change your document. If the data you want to store already have a unique identification, you could use that and get more domain specific urls. Be careful though. Not every character is suitable to be part of an uri, and you would need to escape them appropriately .

Create

I want to store geocaches, and every geocache has an unique name consisting of letters and digits. I will therefore use this name as the id for my document. Heres my first geocache:

{
"url": "http://www.geocaching.com/seek/cache_details.aspx?guid=ab5cc0fb-4f0f-4809-98a1-60697daa5e08",
"sym": "Geocache Found",
"lon": 18.973967,
"lat": 69.696533,
"urlname": "Nesten hjemme",
"type": "Geocache|Traditional Cache",
"time": "2009-06-07T07:00:00Z",
"name": "GC1TCJR",
"desc": "Nesten hjemme by Geo_Raiders, Traditional Cache (2/1.5)"
}

I want to store this as an document with the key GC1TCJR in the database geocaches, so lets PUT it up there.

PUT http://127.0.0.1:5984/geocaches/GC1TCJR

{"url":"http://www.geocaching.com/seek/cache_details.aspx?guid=……."}

If everything goes by the plan, CouchDB will responde with a 201 Created status, together with something like this:

{"ok":true,"id":"GC1TCJR","rev":"1-f70033bafd9d72e174cecb67f58b2e4f"}[/js]

If a document with the same id already exists, you will get a 409 Conflict status and an error.

You may wonder what the "rev" value returned is for. CouchDB will never change a document. Instead, it will make a new document stored with the same id. The reason for this is to handle concurrency, and you should not rely on this for any revision control in your application. You can compact a database to get rid of wasted space.

If we want CouchDB to generate a id for us, we could use a POST instead. We would also omit the id from the url, since the id yet remains to be decided.

POST http://127.0.0.1:5984/geocaches

{"url":"http://www.geocaching.com/seek/cache_details.aspx?guid=……"}

Now that we have successfully stored our document, we can retreive it by doing a GET.

GET http://127.0.0.1:5984/geocaches/GC1TCJR

As you can see from the response, CouchDB has added the id and revision fields to our document.

"_id":"GC1TCJR","_rev":"1-f70033bafd9d72e174cecb67f58b2e4f","url":"http://www.geocaching.com/……"}

Update

Storing updates is nearly as simple as creating new documents. The only difference is that the document must contain both the id and the rev values, and the rev value needs to point to the latest revision. So if you get a document first, make your changes, and then put it up on the server again, things should go nicely. If someone else has stored an update between your get and put request, you will get an conflict error that you need to resolve. How you resolve the conflict is up to the you. You could try to merge your changes with the new revision in the database, or you could simply update your rev value with the latest from the databse and do a new put, effectively overwriting what changes that may have been done.

PUT http://127.0.0.1:5984/geocaches/GC1TCJR

{"_id":"GC1TCJR","_rev":"1-f70033bafd9d72e174cecb67f58b2e4f","url":"http://www.geoinfo.com/……

CouchDB will respond with a 200 OK status, and inform you about the new revision number.

{"ok":true,"id":"GC1TCJR","rev":"2-6aa44736b02f96844fdfbf4eece00b83"}

Delete

Can you guess how to delete a document? Close. You make a DELETE request, but you have to include the latest revision in the url as well.

DELETE http://127.0.0.1:5984/geocaches/GC1TCJR?rev=1-f70033….

As for updating, CouchDB keep your old document around for a while. It will add a new revision, and mark it as a stub for a deleted document.

The response will tell you the new revision number.

{"ok":true,"rev":"3-6b8ecfd8bd5d1cb8ae3fd49aea9dfe8d"}

If you try to do a get on the deleted document, you will get a 404 Object not found response with additional information stating that the document was deleted.

{"error":"not_found","reason":"deleted"}

Listing

Now that you know the basic CRUD operations, it would be nice to get a list of all our documents. CouchDB provides a special url we can use for this.

GET http://127.0.0.1:5984/geocaches/_all_docs

The response will contain information of total number of rows, and an array of ids and revision numbers.

{"total_rows":4,"offset":0,"rows":[

{"id":"GC13H9E","key":"GC13H9E", "value":{"rev":"1-8e3c239c5da38d0fbe91ffdd24582b02"}},

{"id":"GC144MQ","key":"GC144MQ", "value":{"rev":"1-865f40ed3047aefdf8c45520c84c4321"}},

{"id":"GC145KB","key":"GC145KB", "value":{"rev":"1-83ba674d10ce404be7fc90460b1a9a13"}},

{"id":"GCYXEG","key":"GCYXEG", "value":{"rev":"1-880c0d1767c8f6db0ce815fe0b35afdb"}}

]}

The alldocs uri can take some parameters that makes it possible to paginate the result. You can use a combination of 'startkey', 'endkey', 'limit' and 'skip'. Here is a couple of examples that you can try:

GET http://127.0.0.1:5984/geocaches/_all_docs?limit=2

=> First two documents

GET http://127.0.0.1:5984/geocaches/_all_docs?limit=2&skip=2

=> Third and fourth document

GET http://127.0.0.1:5984/geocaches/_all_docs?startkey=GC144MQ&limit=2&skip=1

=> Two next documents after GC144MQ

What's next?

Well, then. You should now have enough information to start making basic use of CouchDB. There are still a lot to cover, though. Did you know that you can upload binary attachments to documents? You can. What you can't, though, is to query your data using SQL. Instead you will be making views, but that has to be waiting for now.

Introducing CouchDB

image A while back, I became aware of something called CouchDB. Since then, I have spend a little time now and then trying to understand what it is, and if I should care. I still struggle a bit with the whole idea of ‘document datebases’ and when it is more appropriate then relational databases, but want let a lack of understanding stop me from getting my hands wet.

The basics

So here is the most basic ideas behind CouchDB and how it works.

Documents

While a relational db stores rows in tables, CouchDB stores jason-formatted documents. While a table in a relational db has a defined set of columns that can have values, there is no definition on what values a document in CouchDB should contain. So when a table enforce every row to be in the same format, each and every document in a CouchDB database could contain completly different kind of values.

The book CouchDB: The Definitive Guide uses business cards as an illustration on this. In a traditional database, you would have a table where each row represents one business card. You then have to choose what columns you would have. You would need a name, a telephone number, a url to a website and an email address. After a while, you realise that you would need a column for a company name and a new column for a second telephonenumber for the company. In the third iteration, you realise that you sill need  more phonenumbers, and maybe even a fax. Now, should we make more telephone columns, or should we have a one-to-many relationship from a business card to a list of phonenumbers? And what about all our trendy business contacts with only a single url on their card? There would be a whole lot of emtpy columns then.Table of business cards

In CouchDB, you would have a business card database where each card would be represented by one document. The content of the document would be defined by the data on that given businesscard.Business cards as documents

As you can see, this makes everything very flexible. No need for any migration of existing documents when you suddanly find your first business card with a twitter name on it.

HTTP and REST

CouchDB is basically a small webserver listening to a given port. When communicating with the server, you would make HTTP requests specifying what you want, and CouchDB would make an apropriate respons. Instead of inventing some new rpc format, it uses the basic http request methods for each and every method. So if you want to get a document, you send a GET request. Do create or update a document, you use PUT or POST, and to remove a document, you use DELETE.

A service designed in this way, is often refered to as RESTful.

JSON

Every response comming from CouchDB, and every document stored inside CouchDB, will be in the form of JSON. JSON, or JavaScript Object Notation,  is a standard made for representing objects in a way that should be easy both for human and machine to write and consume.

A JSON object consists of a set of name-value pairs, where the value can be either an array of values, a JSON object, a string, a number, true, fase or null. Even though the synax is simple, it is possible to represent list of complex, nested objects.

A JSON representation of one of our business cards could look something like this:

{
   "name": "Joseph A. Gutierrez",
   "website": "http://wallpaperdealer.com/",
   "phone": [
       "41-397-0567",
       "954-399-5693"
   ],
   "fax": "518-923-6525"
}

Installation

Like many other open source applications, CouchDB is distributed as source code, some dependencies to other open source technologies, and a description on how to build for you spesific platform. This makes it easy to port everything to different platforms and operating systems, but a little more complex for windows users like me that are used to get everything wrapped up nice in a install executable.

Luckly, there are people out there that provieds us with binaries, and a binary for CouchDB for windows is made available through the CouchDB Wiki.

The installer will suggest installing CouchDB as an service, which makes it easy enough for now.

Configuration

CouchDB stores configuration data in two ini files. You can find them where you installed CouchDB in the subfolder \etc