
I just posted my first article and it's about sudoku, let me know what you think if you have time





It doesn't produce solvable sudokus, just solved ones.
Jm
www.menendezpoo.com





It produces a solveable grid. You can do what you want with it. It is a solvable sudoku, its just not a solvable puzzle. Take the output and turn it into one. However you want to do that is your choice.





Actually, what you've produced is not "solvable"  it's "solved". This is the easy part of producing Sudokus  the hard part is getting rid of just enough that you have a unique, solvable sudoku puzzle. There is a super efficient way to come up with solved sudokus as you've done with no backtracking at all. Look at section 8.3 of http://www.math.washington.edu/~morrow/mcm/team2280.pdf[^] which explains a direct and essentially immediate way of producing solved puzzles.





I've toyed with the idea of coding a Sudoku game for the Atari 2600. The hardest part would seem to be puzzle generation in a constrainedRAM environment (128 bytes, including stack). If the cartridge were limited to applying transformations to puzzles stored in ROM, I wonder how many puzzles would be required to make the game interesting (and ensure that players didn't routinely find themselves "recognizing" transformed puzzles)?
Also, I wonder whether it would be practical to have a number puzzles that could be assembled by combining a "solved" grid with a list of prerevealed revealed spaces. Obviously only some such combinations would work, but storing e.g. 256 9x9 solved grids of numbers along with 256 9x9 sets of prerevealed spaces and a list of pairs of solved grids plus revealsets might allow a large number of puzzles to be stored in a reasonable amount of code space.





I think there are better ways than the preset "holes". In fact, I kinda doubt that that would work at all.
One possibility is to generate them. I've got a generator (which is why I had to code to produce solved puzzles). It essentially takes a random solution and intelligently drops out squares and solves the resulting puzzle looking for all possible solutions. If it finds exactly one solution, it tries dropping out more cells. If it finds more than one, it starts revealing more cells to disambiguate. It randomly orders the cells to see which ones it's dropping out and which ones it's putting back. Eventually, you have a uniquely solvable puzzle.
Another possibility is to put lots of "base" puzzles in and, after picking one at random, "shuffling" it and or/rotate/reflect it You can also move the 3 cell blocks around in random order. By the time you've done all that, it will look completely different thatn the original. The chances that somebody would recognize that such a modified puzzle is "equivalent" to one they've already done is essentially zero. That would really give you a large number I think.
I'd do the latter if you wanted something quick and dirty and then go back in later and implement the former.





Hi Darrell,
Impressive link you had posted earlier. I am not a Math student and struggle a bit to implement the 8.3.3 from that paper, which is in fact what you explain here above as the 'former' solution.
What I dont understand is the following:
The paper claims taking 40 values from the solved solution into an empty grid. Then removing values one by one while checking that the grid is still valid. This is what I dont understand. The values will always be valid and occur only once in the same row and column since its taken from a solved solution in first place.
Next it claims remove them as long until the appropriate difficulty is generated. But how do I know the difficulty only through code? So that the user who wanted an easy game gets an easy difficulty etc.?
I wished there was more explanation about 8.3.3.
It seems you are very fit in math, do you have any idea how this was meant on the paper? and how to implement it?
Many Thanks,
Houman





If you look back at the flow chart in 8.3.3, it doesn't check that the grid is "valid", it checks that it has a "unique solution".
Oh  I see where you're talking about. The flowchart doesn't say "valid" but they use it in the wording above. I'd rely more on the flowchart if I were you. If you do, you'll see that "valid" in this sense means "uniquely solvable". So they're not using valid in the sense of "solvable"  you're right, they're all valid in that sense. They're using it in the sense of "uniquely solvable" in which case, if you remove enough givens, you can end up with invalid sudoku puzzles. If the board they're currently looking at has more than one solution, then it's not valid as a sudoku puzzle. Sudoku puzzles must have exactly one solution. If you remove enough givens (meaning "cells that have their values given in the initial puzzle"), you can end up with more than one solution. In fact, if you remove them all, you end up with a blank sudoku board where any possible solution whatever will fit in. You want a board with a single unique solution so they're checking for that here. Then they start removing the givens  i.e., clearing cells. Each time they remove one there's a possibility that the resulting puzzle will have more than one solution so they check for uniqueness each time they remove a given. If it's nonunique, then it's unacceptable and they put that given back in and try another. If it does have a unique solution, they check it's difficulty to see if it lies within their acceptable limits and if it does, then they're done. If it doesn't, they start from scratch.
Actually, it seems to me that they wouldn't need to start from scratch if the difficulty isn't acceptable. They could just keep trying to remove other givens, but that's not what their flowchart says they do. I probably wouldn't do it that way. At least not initially  maybe there's actually a good reason for that but they don't explain what it would be in the paper. In my program I don't check difficulty at all. I'm just trying to generate a random puzzle of any difficulty so that last step on checking for difficulty isn't part of my program.





Darrell,
Sorry for the late response. I had to study much further before being able to respond and to continue the discussion.
You are right about what you said. It makes now a lot more sense to me. I think the challenge here is to create valid SuDoKus by utilizing Latin squares first, then selecting randomly 40 cells out of the previous grid as Givens. As the third step one given shall be removed at a time and every time a given is removed there will be a good chance that the puzzle doesn't lead to the same unique solution as before any more and that's bad.
In order to prevent of falling into this trap, the paper suggests using a Brute Force algorithm to check if the solution is still unique after each removal. So far so good, however there is a huge problem with this approach. A brute force algorithm is nothing more than a guess algorithm that is able to crack any puzzle. A human brain doesnt work like that, in fact only a selection of patterns can be solved by humans, while there are far more patterns  including the brute force  that can only be calculated by machines.
I came across some other papers. They suggested against using a brute force for validation. Since the end result could end up having very different level of difficulty than anticipated. So the idea is using instead of the brute force pattern the Naked Single and Hidden Single (Easy level patterns) as long as possible. If the puzzle is solvable (valid) this way, it means we have an easy puzzle generated. If more difficult patterns were required to get a solvable (valid) puzzle, then the end result would have been a more difficult puzzle respectively.
Something that is now confusing to me is the following: Using XWing for instance is considered a more advanced pattern. However an XWing never gives an absolute result but two possibilities of what could be. Literally like forking a flow. What if one way doesnt lead to a unique solution, would I have to "undo" everything back to this step and go the other branch down and see how this could lead to a valid solution? But since you havent been playing with difficulty, you might have no answer to this.
In the last part of your post, you mentioned your code generated puzzles. but from what I remember it only created a valid fully sudoku, but it doesnt create a valid puzzle from it. Or do you have your code extended in the meanwhile and not posted here. In that case, could I have a copy? I am curious how you implemented the brute force algorithm.





To the best of my knowledge, a Sudoku puzzle does not need to be uniquely solvable to be a valid Sudoku puzzle. The puzzle can have multiple solutions and still be a completely valid puzzle. In fact, most Sudoku puzzles that I've tried (especially programs) have more than one valid solution. I highly doubt that with the thousands of Sudoku puzzle books out there, every puzzle has a unique solution. Check out Sudoku on wikipedia; it's stated in the first paragraph that a puzzle "typically" has a unique solution, not that it's required to.
If you scroll further down to the mathematics section, it states that out of the 6,670,903,752,021,072,936,960 possible solution grids, only approximately 48,000 have been proven to have unique solutions with only 17 givens at the start of the puzzle. It should also be noted that 17 givens has been proven to be the lowest number of givens possible with a unique solution. However, you could have 77 givens and not have a uniquely solvable puzzle. But it's still a Sudoku puzzle, and a valid (although incredibly easy to solve) one at that.





Using the method I talked about in the previous post, I get a solved sudoku in 0.000015 seconds using 120 lines of code, most of which is initializing the 12 basic 3x3 Latin Squares. Here's the code:
static int[] arRowSwap = new int[] { 0, 3, 6, 1, 4, 7, 2, 5, 8 };
static int[, ,] LatinSquares3X3 = new int[12, 3, 3]
#region 3x3 Latin Square Enumeration
{
{
{0,1,2},
{1,2,0},
{2,0,1}
},
{
{0,1,2},
{2,0,1},
{1,2,0}
},
{
{1,2,0},
{0,1,2},
{2,0,1}
},
{
{2,0,1},
{0,1,2},
{1,2,0}
},
{
{1,2,0},
{2,0,1},
{0,1,2}
},
{
{2,0,1},
{1,2,0},
{0,1,2}
},
{
{0,2,1},
{2,1,0},
{1,0,2}
},
{
{0,2,1},
{1,0,2},
{2,1,0}
},
{
{2,1,0},
{0,2,1},
{1,0,2}
},
{
{1,0,2},
{0,2,1},
{2,1,0}
},
{
{2,1,0},
{1,0,2},
{0,2,1}
},
{
{1,0,2},
{2,1,0},
{0,2,1}
}
};
#endregion
static int[] RandomPermutation(int n, Random rnd)
{
int[] intRet = new int[n];
for (int i = 0; i < n; i++)
{
intRet[i] = i;
}
for (int i = 0; i < n; i++)
{
int iSwap = rnd.Next(i, n);
int iSwapVal = intRet[iSwap];
intRet[iSwap] = intRet[i];
intRet[i] = iSwapVal;
}
return intRet;
}
static int[] RandomPermutation(int n)
{
return RandomPermutation(n, new Random());
}
public static int[,] SolutionGenerator()
{
Random rnd = new Random();
int[,] ls = new int[9, 9];
int i3x3Outer = rnd.Next(12);
int[] arPermute = RandomPermutation(9);
// Move a random latins square to each box
for (int iRowMajor = 0; iRowMajor < 3; iRowMajor++)
{
for (int iColMajor = 0; iColMajor < 3; iColMajor++)
{
int iRowBase = iRowMajor * 3;
int iColBase = iColMajor * 3;
int i3x3 = rnd.Next(12);
int upperDigitVal = LatinSquares3X3[i3x3Outer, iRowMajor, iColMajor] * 3;
for (int iRowMinor = 0; iRowMinor < 3; iRowMinor++)
{
for (int iColMinor = 0; iColMinor < 3; iColMinor++)
{
ls[arRowSwap[iRowBase + iRowMinor], iColBase + iColMinor] =
arPermute[LatinSquares3X3[i3x3, iRowMinor, iColMinor] + upperDigitVal] + 1;
}
}
}
}
return ls;
}





Impressive. Its a very nice algorithm.





I can't claim credit  that would go to the paper's author  I just did the implementation. The author isn't actually listed in the paper but it's under Jim Morrow's website at the University of Washington so I assume he had something to do with it. Yours is a great demo of backtracking also. I really am not trying to demean it (it's probably about what I would have come up with before I saw the paper)  I just thought you might like to see an alternative.
Regards,
Darrell





Its ok, i'm not offended. I'm still quite happy with my own attempt considering I'm a psychology student who does this on the side for fun. haha. I appreciate the link though. It's an impressive algorithm that i may implement in future projects.





Sorry if my previous response was vague. In regards to it being solvable. Producing the solvable grid from the solution grid is the easiest part. Selectingly removing squares will produce a puzzle grid.





Actually, I also think you don't need to iterate through the squares 3 times, just check each condition on your way through, so here is how a modified sub would look. I did not test it but I think it should be faster.
Private Function Conflicts(ByVal CurrentValues As Square(), ByVal test As Square) As Boolean
For Each s As Square In CurrentValues
If (s.Across <> 0 And s.Across = test.Across) OrElse _
(s.Down <> 0 And s.Down = test.Down) OrElse _
(s.Region <> 0 And s.Region = test.Region) Then
If s.Value = test.Value Then
Return True
End If
End If
Next
Return False
End Function





Thanks or the improvements, will be sure to test them when I get back into coding I will make the adjustments. Psychology degrees don't lend much time to coding these days.





The ANZAC wrote: Psychology degrees don't lend much time to coding these days.
And how does that make you feel?
"Old age is like a bank account. You withdraw later in life what you have deposited along the way."  Unknown
"Fireproof doesn't mean the fire will never come. It means when the fire comes that you will be able to withstand it."  Michael Simmons





I've made all the changes you suggested. A slight increase in performance is noted. Thankyou.





could you update the zip file of this article?





Done





Remove the three declarations in conflicts
Dim t As Integer = 0





In the conflicts function you can remove the commons list variable and the clear calls as it appears to not be used anywhere.





I believe one additional performance improvement would be to move the 'Dim i As Integer = GetRan(0, Available(c).Count  1)' line inside the if statement immediately following. It is only required if there are available numbers, so you do not need to generate it in the case the you are backtracking.
Dim i As Integer = GetRan(0, Available(c).Count  1)
If Not Available(c).Count = 0 Then 'if every number has been tried and failed then backtrack
Dim z As Integer = Available(c).Item(i)
If Conflicts(Squares, Item(c, z)) = False Then 'do a check with the proposed number
Squares(c) = Item(c, z) 'this number works so we add it to the list of numbers
Available(c).RemoveAt(i) 'we also remove it from its individual list
c += 1 'move to the next number
Else
Available(c).RemoveAt(i) ' this number conflicts so we remove it from its list
GoTo A 'get a new number
End If
Else
For y As Integer = 1 To 9 'forget anything about the current square
Available(c).Add(y) 'by resetting its available numbers
Next
Squares(c  1) = Nothing 'go back and retry a different number
c = 1 'in the previous square
GoTo A
End If





in the
Private Function GetRan(ByVal lower As Integer, ByVal upper As Integer) As Integer
Dim r As New Random
GetRan = r.Next(lower, upper + 1)
End Function
move the dim r as new random at the class level and remove it from the function
you should see a "small" boost




