Thursday, August 30, 2007

Kernow now available via Java Web Start

I've been playing around with making Kernow available through Java Web Start. This should be the ideal way to run Kernow as it places a shortcut on your desktop (and in your start menu in Windows) and auto-updates whenever a new version is available.

Reading around it seems Java Web Start has had mixed reviews. Personally I really like it, perhaps because I'm using Java 1.6 and Netbeans 6 M10 which makes it all pretty straightforward (auto jar-signing is really helpful in M10).

Give it a go, let me know what you think: Kernow - Java Web Start

Friday, August 17, 2007

Using XQuery and the slash operator to quickly try out XPaths

This is the coolest thing I've seen in a while...

In XQuery you can constuct a node just by writing it, eg:

<node>I'm a node</node>

and then you can use slash operator to apply an XPath to that node:

<node>I'm a node</node>/data(.)

returns "I'm a node"

The XML doesn't have to be limited to a single node - you can do:

<root>
  <node>foo</node>
  <node>bar</node>
</root>/node/data(.)

...to get "foo bar".

Or:

<root>
  <node>foo</node>
  <node>bar</node>
</root>/node[1]

to get:

<node>foo</node>


Using this technique in combination with Kernow's XQuery Sandbox makes it straightforward to paste in some XML and start trying out some XPaths.

Thursday, August 16, 2007

When a = b and a != b both return true...

In XPath = and != are set operators. That is, they return true if any item on the left hand side returns true when compared with any item on the right hand side. Or in other words:

some x in $seqA, y in $seqB satisfies x op y

...where "op" is = or != (or > or < etc)

To demonstrate this take the two sets ('a', 'b') and ('b', 'c'):

$seqA = $seqB returns true because both sets contains 'b'

$seqA != $seqB returns true because setA contains 'a' which is not equal to 'c' in setB

This catches me out a lot, even though I've been caught out before several times. I really have to think hard about what it is exactly that I'm comparing, and still end up getting it wrong.

A simple rules to follow is "never use != where both sides are sequences of more than one item". 99.9% of the time you won't need to, as much as it feels like the right thing to do.

Below are some of the most common operations on sequences, put together for a reference.

The two sequences are ('a', 'b') and ('b', 'c'), which can be defined in XSLT as:

<xsl:variable name="seqA" select="('a', 'b')" as="xs:string+"/>
<xsl:variable name="seqB" select="('b', 'c')" as="xs:string+"/>

or in XQuery as:

let $seqA := ('a', 'b')
let $seqB := ('b', 'c')



Select all items in both sequences

($seqA, $seqB)

Result: a b b c



Select all items in both sequences, eliminating duplicates

distinct-values(($seqA, $seqB))

Result: a b c



Select all items that occur in $seq1 but not $seq2

$seqA[not(. = $seqB)]

Result: a



Select all items that occur in both sequences

$seqA[. = $seqB]

Result: b



Select all items that do not occur in both sequences

($seqA[not(. = $seqB)],$seqB[not(. = $seqA)])

or

($seqA, $seqB)[not(. = $seqA[. = $seqB])]


Result: a c



Determine if both sequences are identical

deep-equal($seqA, $seqB)

Result: false



Test if all items in the sequence are different

count(distinct-values($seqA)) eq count($seqA)

Result: true


Wednesday, August 15, 2007

The Worlds Fastest Sudoku Solution in XSLT 2.0 for the Worlds Hardest Sudoku - Al Escargot


I get a lot of traffic to this blog because of my Sudoku solver. Google analytics tells me most of it lands on the original version that I wrote, and not the optimised version that's now the worlds fastest XSLT 2.0 solution - an issue which this post should hopefully rectify.

The puzzle on the right is apparently the worlds hardest Sudoku puzzle. If I run my solver "Sudoku.xslt" using Kernow's performance testing feature I get this result:

Ran 5 times
Run 1: 328 ms
Run 2: 344 ms
Run 3: 328 ms
Run 4: 391 ms
Run 5: 328 ms
Ignoring first 2 times
Total Time (last 3 runs): 1 second 47 ms
Average Time (last 3 runs): 349 ms


So on my machine the average execution time is 349ms, which is pretty good considering the original version would take minutes for several puzzles. As far as I know this version will solve all puzzles in under a second on my machine (Core 2 duo E6600, 2gb).

How does it do it? This is taken from the web page where it's hosted:

It accepts the puzzle as 81 comma separated integers in the range 0 to 9, with zero representing empty. It works by continuously reducing the number of possible values for each cell, and only when the possible values can't be reduced any further it starts backtracking.

The first phase attempts to populate as many cells of the board based on the start values. For each empty cell it works out the possible values using the "Naked Single", "Hidden Single" and "Naked Tuples" techniques in that order (read here for more on the techniques). Cells where only one possible value exists are populated and then the second phase begins.

The second phase follows this process:

* Find all empty cells and get all the possible values for each cell (using Naked Single and Hidden Single techniques)
* Sort the cells by least possible values first
* Populate the cells with only one possible value
* If more there's more than one value, go through them one by one
* Repeat

This is how it solves the Al Esgargot: A slightly modified version of the solution gives this output with the $verbose parameter set to true. As you can see it's found that it can insert a 1 at position 66 using the static analysis of the puzzle (position 66 is middle-right of the bottom-left group). Next it's decided that there are two possible values 4 and 6 at index 12 (middle-right cell of top-left group), so it tries 4 and continues. With that 4 in place it's found that there's only one possible value at index 39, a 3, so it inserts that and continues. It will keep reducing the possible values based on the current state of the board, inserting the only possible values or trying each one when there are many, until either there are no possible values for an empty cell, or the puzzle is solved.

(the solution is shown below)

Populated single value cell at index 66 with 1
Trying 4 out of a possible 4 6 at index 12
Only one value 3 for index 39
Trying 5 out of a possible 5 7 at index 10
Trying 1 out of a possible 1 9 at index 13
Only one value 9 for index 15
Trying 6 out of a possible 6 7 at index 16
Only one value 7 for index 17
Trying 2 out of a possible 2 4 at index 7
Trying 6 out of a possible 6 8 at index 2
Only one value 8 for index 3
Only one value 2 for index 48
Only one value 6 for index 57
Only one value 5 for index 60
Only one value 9 for index 63
Only one value 5 for index 74
Only one value 1 for index 78
Only one value 3 for index 69
Only one value 5 for index 71
Only one value 6 for index 81
Only one value 4 for index 36
! Cannot go any further !
Trying 8 out of a possible 8 at index 2
Only one value 6 for index 3
Trying 4 out of a possible 4 5 at index 4
Only one value 3 for index 9
Only one value 3 for index 23
Only one value 4 for index 26
Only one value 3 for index 53
Only one value 5 for index 54
Only one value 4 for index 36
Only one value 6 for index 44
Only one value 8 for index 48
Only one value 2 for index 57
Only one value 8 for index 73
Only one value 9 for index 47
Only one value 7 for index 50
Only one value 6 for index 60
Only one value 9 for index 64
! Cannot go any further !
Trying 5 out of a possible 5 at index 4
Only one value 9 for index 47
Only one value 2 for index 67
Only one value 7 for index 49
Only one value 9 for index 64
Only one value 2 for index 80
Only one value 1 for index 32
Only one value 2 for index 48
Only one value 5 for index 50
Only one value 8 for index 58
Only one value 8 for index 73
! Cannot go any further !
Trying 4 out of a possible 4 at index 7
Only one value 3 for index 9
Only one value 4 for index 23
Only one value 3 for index 24
Only one value 2 for index 26
Only one value 3 for index 53
Only one value 5 for index 54
Only one value 8 for index 4
Trying 2 out of a possible 2 6 at index 2
Only one value 6 for index 3
Trying 7 out of a possible 7 8 at index 19
Only one value 8 for index 20
Only one value 7 for index 29
Only one value 9 for index 40
Only one value 7 for index 50
Only one value 6 for index 44
Only one value 2 for index 49
Only one value 2 for index 28
Only one value 8 for index 48
Only one value 2 for index 57
Only one value 5 for index 67
Only one value 7 for index 58
Only one value 8 for index 61
Only one value 3 for index 68
Only one value 2 for index 70
! Cannot go any further !
Trying 8 out of a possible 8 at index 19
Only one value 7 for index 20
Only one value 8 for index 29
Only one value 9 for index 47
Only one value 2 for index 48
Only one value 8 for index 57
Only one value 4 for index 37
Only one value 9 for index 40
Only one value 7 for index 49
! Cannot go any further !
Trying 6 out of a possible 6 at index 2
Only one value 2 for index 3
Only one value 6 for index 57
Trying 7 out of a possible 7 8 at index 19
Only one value 8 for index 20
Trying 2 out of a possible 2 4 at index 28
Only one value 7 for index 29
Only one value 9 for index 47
Only one value 2 for index 49
Only one value 9 for index 40
Only one value 6 for index 44
Only one value 7 for index 50
Only one value 9 for index 59
! Cannot go any further !
Trying 4 out of a possible 4 at index 28
Only one value 9 for index 37
Only one value 4 for index 44
Only one value 6 for index 42
Trying 2 out of a possible 2 7 at index 29
Only one value 7 for index 47
Only one value 2 for index 49
Only one value 7 for index 32
Only one value 9 for index 50
! Cannot go any further !
Trying 7 out of a possible 7 at index 29
Only one value 1 for index 32
Only one value 2 for index 33
Only one value 2 for index 47
Trying 7 out of a possible 7 9 at index 49
Only one value 9 for index 50
Only one value 6 for index 77
Only one value 1 for index 78
Only one value 5 for index 80
Only one value 5 for index 56
Only one value 8 for index 60
Only one value 2 for index 61
Only one value 8 for index 70
Only one value 6 for index 71
Only one value 9 for index 74
Only one value 2 for index 64
Only one value 4 for index 81
Only one value 4 for index 58
Only one value 9 for index 67
Only one value 8 for index 73
Only one value 2 for index 76
Done!

1, 6, 2,   8, 5, 7,   4, 9, 3,
5, 3, 4,   1, 2, 9,   6, 7, 8,
7, 8, 9,   6, 4, 3,   5, 2, 1,


4, 7, 5,   3, 1, 2,   9, 8, 6,
9, 1, 3,   5, 8, 6,   7, 4, 2,
6, 2, 8,   7, 9, 4,   1, 3, 5,


3, 5, 6,   4, 7, 8,   2, 1, 9,
2, 4, 1,   9, 3, 5,   8, 6, 7,
8, 9, 7,   2, 6, 1,   3, 5, 4,


If you have a solution that can statically detect more cells to fill using different techniques than I have, or has a better strategy than simply backtracking when there's more than one value, then I'd be interested to know it works.

I'm pretty sure the XSLT is as good as it can be, but if you think it can be improved in any way then let me know.