Duncan’s blog

February 6, 2009

Project Euler problem 56

Filed under: Project Euler,Python — duncan @ 12:00 am
Tags: , , ,

Problem 56:

A googol (10100) is a massive number: one followed by one-hundred zeros; 100100 is almost unimaginably large: one followed by two-hundred zeros. Despite their size, the sum of the digits in each number is only 1.

Considering natural numbers of the form, ab, where a, b < 100, what is the maximum digital sum?

So, let’s do nested loops, both going 1 to 100. Inside the inner loop, calculate a^b (or i^j in this case). Then loop through the digits of that value, adding them up. Keep track of which is the largest value this produces.

Running time about 3 seconds in Python:

import time
tStart = time.time()

maxsum = 0

for i in range(100):
    for j in range(100):
        num = i ** j
        total = 0

        for k in str(num):
            total += int(k)

        if total > maxsum:
            maxsum = total

print(maxsum)
print("time:" + str(time.time() - tStart))        

February 5, 2009

SQL nested joins

Filed under: Uncategorized — duncan @ 12:00 am
Tags: , , ,

I’ve been working on a query recently that has been doing my nut in. It involved five tables, possibly six depending how I wanted to do it. I had several tables that were going to be inner joined to each other. Then there were two tables which would be inner joined together, but outer joined to the other tables. I only wanted to return any rows from the first of those two tables if there were also matching rows in the second of the two tables.

To try and explain it, here’s a rough ER diagram I put together in Gliffy.com:

6-way join

So, I’m inner joining Category to SubCategory and SubCategory_School and School. That’s the easy part.

I’m only interested in rows from KeyIndicator where there are corresponding entries in School_KeyIndicator joining them to the School table. And that’s where I was having the difficulty.

I wasn’t getting anywhere with this, and couldn’t find anything on the web other than very simple examples joining two tables together. I had a look to see if Pinal Dave’s excellent SQL Authority blog had any useful information. He didn’t but he did link to an article by Coldfusion über-blogger Ben Nadel, Grouping JOIN Clauses In SQL, which was exactly what I was looking for.

SELECT	C.ID AS CategoryID, C.Name AS CategoryName,
	SC.ID AS SubCategoryID, SC.Name AS SubCategoryName,
	KI.ID, KI.Name,
	S.ID AS SchoolKIID, S.Data,
	SC_S.ID AS ScoreID , SC_S.Score
FROM	Category C
		INNER JOIN SubCategory SC
			ON C.ID = SC.CategoryID
		LEFT OUTER JOIN
		(
			KeyIndicator KI
				INNER JOIN School_KeyIndicator S
					ON KI.ID = S.KeyIndicatorID
					AND S.School = #URL.ID#
		)
			ON SC.ID = KI.SubCategoryID
		LEFT OUTER JOIN SubCategory_School SC_S
			ON SC.ID = SC_S.SubCategoryID
			AND SC_S.School = #URL.ID#
ORDER BY C.ID, SC.ID, KI.OrderBy, KI.ID

In the end I eliminated the School table from my query, as I didn’t actually need it. However what really made this work was the nested join, i.e. the part inside the brackets. Basically I do my INNER JOIN between KeyIndicator and School_KeyIndicator. Because that’s in the bracket, that gets evaluated first. Then the result of that is LEFT OUTER JOINed to the SubCategory table.

February 4, 2009

Glenrothes by-election register missing, presumed dead

It’s been widely reported that the Sheriff Court in Kirkcaldy has managed to lose the official electoral register from the Glenrothes by-election in November last year. The register lists all the names of who voted, at which polling stations (but not how they voted), and is meant to be kept for a year, and made open for inspection. The SNP made repeated requests to see it, starting in November shortly after the election, and it’s taken until now for the Sheriff’s office to admit they don’t have it.

SNP MSP Tricia Marwick said:

It is almost beyond belief that a by-election which attracted media coverage throughout the UK, which delivered a surprise result and had a much higher turnout than anticipated now has no records to show who actually voted.

I have no evidence of foul play. I sincerely hope not. But the turnout at this by-election surprised everyone and the result was a surprise.

Without coming straight out and saying it, it sounds to me like she’s hinting that the election might have been rigged; without the register they’ll never know. Lindsay Roy say they’d be happy for a recount, because the ballot papers haven’t been lost. However there’s no way of telling who voted without the register, so the ballots could have been rigged by registering multiple false names on the electoral register (for instance).

The Courier are reporting Marwick’s statement slightly differently:

Asked if she suspected foul play, she said, “No I do not. Nor do I believe it was a fair election“.

Of course that implies some level of Machiavellian plotting from all involved; the Labour politicians not only managing to pinch the election, but also to collaborate with someone in the sheriff court to steal and destroy the evidence. Incredibly unlikely, it’s almost guaranteed to be pure incompetence on the parts of someone in the courts. Sounds like the register’s been chucked into their secure paper disposal bucket.

On the other hand, this is just yet another public sector loss of important data, and the Information Commissioner should issue an enforcement notice against the Scottish Courts Service. Not just for the loss, but also the lateness in admitting they’d lost it.

Some sources:

Also reported in The Spoof: Mugabe: Brown an ‘Excellent Student’

Mr Mugabe said, “Mr Brown has been an excellent student, for a colonialist. He quickly learned the rules of ‘Banana-Republic’ electioneering and, with my help, he ensured that Glenrothes, [population of 47,000] had 1.2 million registered postal voters, all of whom voted Labour“.

:-)

Project Euler problem 53

Filed under: Project Euler,Python — duncan @ 12:00 am
Tags: , ,

Problem 53:

There are exactly ten ways of selecting three from five, 12345:

123, 124, 125, 134, 135, 145, 234, 235, 245, and 345

In combinatorics, we use the notation, 5C3 = 10.

In general,
nCr = n! / (r!(n-r)!)
where r <= n, n! = n*(n-1)*…*3*2*1, and 0! = 1.


It is not until n = 23, that a value exceeds one-million: 23C10 = 1144066.

How many, not necessarily distinct, values of nCr, for 1 <= n <= 100, are greater than one-million?

At first I looked at this and thought it was all about iterating through the combinations of digits, just like how it lists 123, 124, etc. Which sounds tricky. But it’s much simpler than that. In fact they give you the hardest part of the algorithm, the equation needed:
n! / (r!(n-r)!)

In Python this solution takes about 0.36 seconds. I’m going to start using the time module to calculate running times.

import time
tStart = time.time()

values = 0

def factorial(x):
    y = 1
    for i in range(1,x+1):
        y = y * i
        
    return y

def calc(n, r):
    return  factorial(n) / (factorial(r) * factorial(n-r))

for i in range(1,101):
    for j in range(1, i):
        if calc(i, j) > 1000000:
            values += 1
           

print(values)
print("time:" + str(time.time() - tStart))

So, two functions, one calculates the factorial of x, the other just calculates our equation. The second function could instead have just been expressed as one line in our procedural code.

Then loop from 1 to 100. Within that, do an inner loop from 1 to whatever our current outer loop is on, passing the two loop counters into our function. We don’t really care which numbers produce values greater than a million; we only care how many numbers there are. So value is just a simple counter.

February 3, 2009

Project Euler problem 40

Filed under: Coldfusion,Project Euler,Python — duncan @ 12:00 am
Tags: , , , ,

Problem 40:

An irrational decimal fraction is created by concatenating the positive integers:

0.123456789101112131415161718192021…

It can be seen that the 12th digit of the fractional part is 1.

If dn represents the nth digit of the fractional part, find the value of the following expression.

d1 * d10 * d100 * d1000 * d10000 * d100000 * d1000000

Firstly I completed this in Coldfusion. I got the correct answer by brute-forcing it, but the solution was slow. This took about 11 minutes on a CF 5 server; it would undoubtedly be a lot quicker on any CFMX server. I then rewrote it in Python, which spits out the correct answer within a couple of seconds.

I thought it might be interesting to do a code comparison between the two languages.
CFML:

<cfset fraction = "">
<cfset total = 1>

<cfloop index="i" from="1" to="1000000">
<!--- we're going to break out of the loop sooner than that --->
	<cfset fraction = fraction & i>
	
	<cfif Len(fraction) GTE 1000000>
		<cfbreak>
	</cfif>
</cfloop>

<cfset total = total *	Mid(fraction, 1, 1) * 
	Mid(fraction, 10, 1) * 
	Mid(fraction, 100, 1) * 
	Mid(fraction, 1000, 1) * 
	Mid(fraction, 10000, 1) * 
	Mid(fraction, 100000, 1) * 
	Mid(fraction, 1000000, 1)>

<cfoutput>#total#</cfoutput>

Python:

fraction = ""
total = 1

for i in range(1,1000000):
    fraction = fraction + str(i)

    if len(fraction) >= 1000000:
        break

total = total * \
        int(fraction[0]) * \
        int(fraction[9]) * \
        int(fraction[99]) * \
        int(fraction[999]) * \
        int(fraction[9999]) * \
        int(fraction[99999]) * \
        int(fraction[999999])

print(total)

Almost identical code. I’m still not anywhere near writing pythonic code, but it’s early days.

Basically I loop from 1 to 1000000 (not really that far, because I break out of the loop once our string has length of 1000000). As I go I append each loop counter to the string.

Then I just multiply together the 1st, 10th, 100th etc values from the string. That part could also have been done in a loop to simplify the code slightly. The whole thing’s not particularly clever, but it works.

If you’re coming to this from a Coldfusion background, you’re maybe wondering why I’m doing a DIV between each of the values before multiplying them. Because whitespace is important in Python, the \ can be used to join two lines together where the line length gets too long.

February 2, 2009

Project Euler problem 34

Filed under: Coldfusion,Project Euler — duncan @ 12:00 am
Tags: , , , , ,

Problem 34:

145 is a curious number, as 1! + 4! + 5! = 1 + 24 + 120 = 145.

Find the sum of all numbers which are equal to the sum of the factorial of their digits.

Note: as 1! = 1 and 2! = 2 are not sums they are not included.

This one doesn’t seem too hard. However, the tricky part is determining what the upper bound should be. I wasn’t really sure how this one would progress, so decided to set a relatively low upper bound, 99999. I was then going to check what sort of values I was getting back, then increase the bound as necessary.

However running the following code, I saw there weren’t very many values at all. Just on the off-chance I’d got the final solution, I entered it into the Project Euler page, and I had indeed got it right, purely by luck.

I later discovered that the real upper bound that you would theoretically need to use is 9999999. Doing that with the following code would take much longer.

<cfscript>
function factorial(x)
{
	var i = 0; 
	var factorial = 1;
	
	for (i = x; i GT 1; i = i - 1)
	{
		factorial = factorial * i;
	}
	
	return factorial;
}
</cfscript>

<cfset numbers = ArrayNew(1)>
<cfset total = 0>

<!--- how far do we need to loop?  Let's just start and see how it progresses... --->
<cfloop index="i" from="3" to="99999">
	<cfset sum = 0>
	
	<!--- get the factorial of each digit of i --->
	<cfloop index="j" from="1" to="#Len(i)#">
		<cfset sum = sum + factorial(Mid(i, j, 1))>
	</cfloop>
	
	<!--- does i = the sum of the factorials? --->
	<cfif i EQ sum>
		<cfset ArrayAppend(numbers, i)>
	</cfif>
</cfloop>

<cfdump var="#numbers#">

<cfloop index="i" from="1" to="#ArrayLen(numbers)#">
	<cfset total = total + numbers[i]>
</cfloop>

<cfoutput>#total#</cfoutput>

Factorials can get very large quickly. Fortunately I’m only ever going to be putting values from 0 to 9 into my factorial function.

Note that 0! = 1, not zero. This is the empty product. Fortunately my factorial function takes care of that.

Then I loop, ignoring 1 and 2, to my lucky choice of 99999 for the upper limit (the comment in my code gives an idea of how I was thinking!). Sum up the factorial of each digit. Check if that sum is the same as our number; if so add it to an array.

At the end I dumped out the array, just to see what sort of results I was getting. Then later I added in the final loop, adding up the values of the array to give the correct solution.

Judging by the forum discussion, there wasn’t any clear idea of what the upper limit should be, with most people seeming to take an educated guess.

Spoiler alert:
Interestingly, there are only four numbers that this works for, and they’re known as factorions. Two of those are 1 and 2, so the solution to this puzzle only requires you to work out two other numbers. And one of them, 145, has already been given to you in the problem!

February 1, 2009

Project Euler problem 28

Filed under: Coldfusion,Project Euler — duncan @ 7:37 pm
Tags: , , ,

Problem 28:

Starting with the number 1 and moving to the right in a clockwise direction a 5 by 5 spiral is formed as follows:

21 22 23 24 25
20  7  8  9 10
19  6  1  2 11
18  5  4  3 12
17 16 15 14 13

It can be verified that the sum of both diagonals is 101.

What is the sum of both diagonals in a 1001 by 1001 spiral formed in the same way?

There’s various ways to look at this problem. One thing I noticed was that, in the example above, the numbers we’re adding go:
1,
3, 5, 7, 9,
13, 17, 21, 25

So, excluding the 1 in the centre, we have four numbers that increment by 2, then four numbers that increment by 4. If we made the spiral wider, the next set of four numbers would increment by 6, and so on. I used that observation as the basis for this CFML:

<cfset total = 1>
<cfset runningtotal = 1>

<cfset length = 1001>

<cfloop index="i" from="2" to="#(length-1)#" step="2">
	<cfloop index="j" from="1" to="4">
		<cfset runningtotal = runningtotal + i>

		<cfset total = total + runningtotal>
	</cfloop>
</cfloop>

<cfoutput>#total#</cfoutput>

I’m using two variables here, one is our grand total. The other is a running total. So on each iteration of the outer loop, our running total will have increased by (4 * 2), (4 * 4), (4 * 6), etc… The grand total will have all instances of running total added to it.

There’s neater ways this could have been expressed, but this was how I chose to do it; it’s simple and fast.

A word of warning: even though the problem states “What is the sum of both diagonals“, that’s not really the solution they’re looking for. If you were to add up the two diagonals in the example above, you’d have:
Diagonal 1: 21 + 7 + 1 + 3 + 13 = 45
Diagonal 2: 17 + 5 + 1 + 9 + 25 = 57
Adding those two values would give you 102, not 101. This is because you’d have added 1 twice. What the problem really wants here is the sum of all the unique values of both diagonals.

Theme: Rubric. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.