The other day I participated in CodeSprint Sudoku by Interviewstreet. It’s not the first CodeSprint I’ve participated in, but it is the first one I’ve won. Woo hoo! Thanks to Interviewstreet for hosting it and Gayle Laakmann McDowell for sponsoring it and providing the prizes :). Also, a big shout-out to everyone in #codesprint (webclient) :P.
I had intended to write something about my submission, but there’s a good chance I wouldn’t have gotten around to it. However, the folks at Interviewstreet asked if I would explain my approach for a blog post they’re going to be doing, so thanks for the extra incentive :).
Just to set it up, the challenge was to output a solution to a partially solved 9×9 Sudoku board which is given as input. The input is formatted as 9 lines of space-separated digits from 0 to 9, 0 indicating an empty cell. The output had to be any solution to the input board, and it has to be in the same format as the input; of course all the 0’s have to be replaced with a valid number.
My programming language was Python 2.7.
I recently finished taking CS188.1x Artificial Intelligence on edX, and one of the things covered in the class that I didn’t have much prior experience with was constraint satisfaction. Unfortunately, that was the one major topic covered in the class that we weren’t required to implement for homework, so I had been looking for a good opportunity to implement it myself and put what I had learned into practice. As it happens, Sudoku is one of the prototypical examples of a constraint satisfaction problem. So I began. I started coding a class that would represent the board, planning in my head how I was going to implement constraints, etc. It was going to be awesome.
I’m not sure what made me think about the scoring – maybe I finally wondered how this was going to be scored anyway, or it might have been something I saw on the #codesprint IRC channel – but whatever it was, it caused me to actually check how the score was calculated. Turns out, the score was being calculated as follows:
score = (1000 - characters in the source code) / 10
So this was a code golfing challenge. There I was, 596 characters in with not much functionality to show for it yet, and it was already longer than quite a few entries. Even after golfing the code I had, it only went down to 256 characters, and that was just for the SudokuBoard class and it’s 3 methods. Clearly this approach wasn’t going to work.
A New Plan
I didn’t see how someone could fit much of a CSP solver into a few hundred bytes, at least not using any of the allowed languages (prove me wrong, please :P), so I figured that the solutions were (a) mostly brute-force, (b) probably using backtracking search, and (c) therefore recursive. I also knew that storing the board as a two dimensional list would greatly slow down a brute force search. I made a simple Lights Off (original(er)) solver once, and was able to greatly increase performance by switching the board representation from a two dimensional list of booleans to an int, with each of the first 25 bits representing one of the cells of the 5×5 board. That meant that instead of indexing into the list, I had to use bit manipulation, division and modulo. After several optimizations, those lines of code were a mess of ^’s, **’s, /’s and %’s, but it was fast.
Of course, you can’t represent a cell in Sudoku with one bit. You’d need more like ~3.32 bits per cell, or more practically 4. It would work, but it would take some code, and I didn’t have many bytes to spare. So I compromised on using bit manipulation to index into a string that contained a one-dimensional representation of the board. The input parsing code was painfully long:
''.join(''.join(x.split())for x in sys.stdin.read().splitlines())
I ended up shortening this later. Here are some of those shorter versions and others I’ve though of since, with the one above on top for easy length comparison:
1 2 3 4 5
''.join(''.join(x.split())for x in sys.stdin.read().splitlines()) ''.join(x.replace(' ','')for x in sys.stdin.read().splitlines()) ''.join(x.replace(' ','')for x in(raw_input()for x in*9)) j=''.join;f(j(j(raw_input().split())for x in*9)) sys.stdin.read().replace('\n','').replace(' ','')
This takes the input as described in the problem description and leaves you with an 81 character string, each character representing a cell, and every nine characters representing a row of the board.
At that point I needed to figure out how to compactly use bit manipulation to access the cells so I could check for collisions; by collisions, I mean cells that are in the same row, column, or sub-square as the current cell being processed. Here’s what I came up with:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
s = sudoku_board_as_string x/9^y/9 # evaluates to 0 if s[x] and s[y] both refer to # cells in the same row of the Sudoku board (x-y)%9 # evaluates to 0 if s[x] and s[y] both refer to # cells in the same column of the Sudoku board x/27^y/27 # evaluates to 0 if s[x] and s[y] both refer # to cells in either the top, middle or bottom # three-row section of the Sudoku board x%9/3^y%9/3 # evaluates to 0 if s[x] and s[y] both refer # to cells in either the left, center or right # three-column section of the Sudoku board x/27^y/27|x%9/3^y%9/3 # evaluates to 0 if s[x] and s[y] # both refer to cells in the same # sub-square of the Sudoku board
The first two are fairly simple.
x/9^y/9 divides both x and y by 9 (the number of characters per row) and XORs the quotients together. If they’re equal (meaning they’re in the same row), the XOR evaluates to 0.
(x-y)%9 subtracts y from x to find their difference (the sign doesn’t matter), then mods that by 9. If the difference between x and y is a multiple of 9 (meaning they’re in the same column), it evaluates to 0.
I split up the third logical operation into two parts to make it easier to understand.
x/27^y/27 evaluates to 0 if x and y are both in the same three-row section (either the top three rows, the middle three rows or the bottom three rows) using the same technique used to check if two cells are in the same row.
x%9/3^y%9/3 evaluates to 0 if x and y are both in the same three-column section (either the left three columns, the center three columns or the right three columns). The mod by 9 gives the column number and the division of that by 3 gives the vertical section the cell is in. This doesn’t use the same method I used for detecting a column collision, though it might be more accurate to say that the the column collision check doesn’t use this method. It would work –
x%9^y%9 both evaluate to 0 for the same values – but using the special case method for one column is shorter than applying the general solution. The last step here is to OR these two expressions together (
x/27^y/27|x%9/3^y%9/3). OR keeps a bit on if it’s on in either operand. So, the only way
x/27^y/27|x%9/3^y%9/3 can evaluate to 0 is if both expressions being OR’ed evaluate to 0, meaning x and y are in the same horizontal section and vertical section; in other words, they’re in the same sub-square.
Now that I had a way to check if a cell collides with another cell based on row, column or sub-square, I just had to combine them. That’s the easy part though. If any of the expressions evaluate to 0, that means there’s a collision; therefore, all that’s needed is multiplication. If any multiplier is 0, the entire expression evaluates to 0, indicating a collision. I ended up with:
Notice that because (x-y)%9 comes first, it doesn’t need parentheses around it.
I still needed to write the code to plug this into, but I already had a good idea how it was going to work. For the purpose of readability, let’s turn the above expression into a function,
I already knew the main function was going to be recursive. Here’s some pseudocode:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
r=range # Since I'm using range more than once, this saves space def f(s): x=s.find('0') # This finds the index of the first cell that # isn't assigned for the backtracking search # If there are no more 0's in s, x is assigned -1 above. # If x==-1 (I could have just used that! Or better yet, used # -1==x and removed the space between if and the condition), # then not~x evaluates to True. Therefore this only runs when # the solution has been found. if not~x:print'\n'.join(' '.join(s[p:p+9])for p in r(0,89,9));exit() # These are numbers that are in cells that collide with the # cell currently being processed unusable_numbers = [s[y]for y in r(89)if check(x,y)] for c in'123456789': if c not in unusable_numbers: f(s[:x]+c+s[x+1:]) # If the assignment works, the function # calls itself while replacing the '0' # at the current cell with the value of c
The first part looks pretty good, but the second part can obviously be shortened. Here it is through a few cycles:
1 2 3 4
unusable_numbers = [s[y]for y in r(89)if check(x,y)] for c in'123456789': if c not in unusable_numbers: f(s[:x]+c+s[x+1:])
Get rid of the variable:
1 2 3
for c in'123456789': if c not in[s[y]for y in r(89)if not check(x,y)]: f(s[:x]+c+s[x+1:])
Turn it into a list comprehension (for brevity; it’s not going to be used anywhere):
[f(s[:x]+c+s[x+1:])for c in'123456789'if c not in[s[y]for y in r(89)if not check(x,y)]
After some pondering, I rearranged the list comprehension:
[c in[s[y]for y in r(89)if not check(x,y)]or f(s[:x]+c+s[x+1:])for c in'123456789']
And then rearranged it some more:
[c in[check(x,y)or s[y]for y in r(89)]or f(s[:x]+c+s[x+1:])for c in'123456789']
And that was it. After putting everything together and adding a final line to get input and call the function, I had:
1 2 3 4 5 6 7
# 266 bytes r=range def f(s): x=s.find('0') if not~x:print'\n'.join(' '.join(s[p:p+9])for p in r(0,89,9));exit() [c in[(x-y)%9*(x/9^y/9)*(x/27^y/27|x%9/3^y%9/3)or s[y]for y in r(89)]or f(s[:x]+c+s[x+1:])for c in'123456789'] j=''.join;f(j(j(raw_input().split())for x in r(9)))
Done! Right? No! There was still someone with a shorter program, and I knew there was more I could do. Look at all that I/O code. There must be something I could cut.
On The Path To Enlightenment
A string with newlines and spaces is still just a string. I was parsing the input to remove the white space and then putting it back before outputting. I was using bit manipulation to check the cells; why not just reconfigure it to work on the raw input?
It was pretty easy once I figured it out. I just had to double all the numbers I was dividing and mod’ing by, as well as modify the call to range to return the indexes that are actually cells and not white space. At that point I could get rid of the majority of the I/O code.
1 2 3 4 5 6 7
# 218 bytes r=range def f(s): x=s.find('0') if not~x:print s;exit() [c in[(x-y)%18*(x/18^y/18)*(x/54^y/54|x%18/6^y%18/6)or s[y]for y in r(0,162,2)]or f(s[:x]+c+s[x+1:])for c in'123456789'] f('\n'.join(raw_input()for x in r(9)))
And at 218 bytes that’s my final submission.
The function is recursive, but it doesn’t return anything. It just runs backtracking search until it finds a solution, outputs the solution, and exits using the
exit() does take an argument that it uses as the process’s error message, and if you run a program that uses
exit("error message") in a terminal, you might see it, and so it’ll look like the program printed it out. However, it didn’t actually go through stdout. If you don’t care about it going through stdout, then go ahead and save some bytes by replacing that entire print/exit line with
~x or exit(s) (keep reading for an explanation of this
In my final submission, I have this snippet of code:
[(x-y)%18*(x/18^y/18)*(x/54^y/54|x%18/6^y%18/6)or s[y]for y in r(0,162,2)]
This list comprehension creates the list of numbers that a cell can’t have assigned to it due to collisions; this is later checked before an assignment is even attempted. To understand how this works, you need to understand how Python handles the
or operator. Simply put, the
or statement will return the value of the first operand that is
True, or if no operand is
True, it’ll return the last operand. If it does reach an operand that evaluates to
True, it won’t even look at the rest. Here are some examples:
1 2 3 4 5 6 7 8 9 10 11 12 13
>>> 4 or undefined_variable 4 >>> False or undefined_variable Traceback (most recent call last): File "", line 1, in NameError: name 'undefined_variable' is not defined >>> 3 or 2 3 >>> 0 or 5 5 >>> ~x or exit(s) # from just above; if ~x is false, meaning # that x == -1, it'll try to evaluate exit(s), # which of course just exits the program
Let’s replace the bit manipulation stuff which a function again, for readability:
[check(x,y) or s[y] for y in r(0,162,2)]
The list comprehension loops through every cell and adds it’s value to the list if it collides with the current cell being processed. More precisely, for every cell y, whatever
check(x,y) or s[y] evaluates to is added to the list. So, if there is a collision,
check(x,y) will evaluate to 0, and therefore the or statement will evaluate to
s[y], and that’s what will be added to the list. On the other hand, if
check(x,y) evaluates to something non-zero, then that number will be added to the list. The reason that the values in the list coming from
check(x,y) aren’t interfering with values from
s[y] is that
check(x,y) is returning integers, whereas
s[y] is returning strings. That’s something that took me a bit to figure out.
The same trick with
or is used to decide whether or not to attempt an assignment. Assume excluded_values has been assigned the list comprehension above:
[c in excluded_values or f(s[:x]+c+s[x+1:]) for c in '123456789']
What this means is either c is in excluded_values, or we’re going to assign c to the current cell.
The contest was officially over, but we weren’t done. There were still people in #codesprint shrinking their code. Here are some things I’ve done to further shorten mine, in no particular order.
One obvious thing that I had missed was switching from
sys.stdin.read(). I had started with
sys.stdin.read(), then moved to
'\n'.join() when it became more byte-efficient. Now
sys.stdin.read() was the better choice once again. But actually, I stumbled across an even better, less well known choice;
os.read(). Compare the following:
1 2 3 4 5 6 7
f('\n'.join(raw_input()for x in r(9))) import sys;f(sys.stdin.read()) import os;f(os.read(0,162)) # os.read(filedescriptor, buffersize) # The first argument, 0, refers to # stdin, and the second argument, # 162, indicates how many bytes to # read.
Now that I’m not using range here anymore (and I could have used
*9 anyway), I can remove the
r=range line at the top of the program and just call
range() directly in the list comprehension. The final result after these modifications saves 16 bytes:
1 2 3 4 5 6
# 203 bytes def f(s): x=s.find('0') if not~x:print s;exit() [c in[(x-y)%18*(x/18^y/18)*(x/54^y/54|x%18/6^y%18/6)or s[y]for y in range(0,162,2)]or f(s[:x]+c+s[x+1:])for c in'123456789'] import os;f(os.read(0,162))
range(), I removed the step argument from the call to
range() in the list comprehension, which then allowed me to remove the start argument, saving 4 bytes. This does double the amount of work done, but code golf is about short code, not short run times, right?
I also changed
if not~x:print s;exit() to
if x<0:print s;exit(). The condition I’m really checking for is x being -1, and that’s the only time it’ll be less than 0, so it works. It also saves 2 bytes:
1 2 3
if not~x:print s;exit() # old state if-1==x:print s;exit() if x<0:print s;exit() # new state
And finally, I replaced
5**18 evaluates to
3814697265625, which has every digit from 1-9, the program will still work, but it’ll repeat work due to the repeated digits. This saves 1 precious byte.
Here’s the current version, 22 bytes slimmer than my final contest submission:
1 2 3 4 5 6
# 196 bytes def f(s): x=s.find('0') if x<0:print s;exit() [c in[(x-y)%18*(x/18^y/18)*(x/54^y/54|x%18/6^y%18/6)or s[y]for y in range(162)]or f(s[:x]+c+s[x+1:])for c in'%d'%5**18] import os;f(os.read(0,162))
Into The Sunset
I had a lot of fun doing the contest and, somewhat surprisingly, writing this post/article/essay/walkthrough :). I’m looking forward to a lot more fun from the people at Interviewstreet. I’m looking at you HackerRank..