Duncan's blog

June 19, 2011

Project Euler problem 26

Filed under: Project Euler,Python — duncan @ 9:27 pm

A unit fraction contains 1 in the numerator. The decimal representation of the unit fractions with denominators 2 to 10 are given:

1/2 = 0.5
1/3 = 0.(3)
1/4 = 0.25
1/5 = 0.2
1/6 = 0.1(6)
1/7 = 0.(142857)
1/8 = 0.125
1/9 = 0.(1)
1/10 = 0.1

Where 0.1(6) means 0.166666…, and has a 1-digit recurring cycle. It can be seen that 1/7 has a 6-digit recurring cycle.

Find the value of d < 1000 for which 1/d contains the longest recurring cycle in its decimal fraction part.

Another one I initially attempted with ColdFusion, but it didn’t give decimals to enough precision (at least without using the underlying Java), so turned to Python.

To me this seemed like a good place to use a regular expression, where you can match a group and any repetitions of it, using back-references. Some problems around which parts should be greedy or not. For instance if you have 0.01010101 is the repeating pattern 0101 twice, or 01 four times?

Here’s the regular expression I used:

“^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$”

And here’s an explanation of the various parts of it:

^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$

The ^ matches the beginning of the string, and the \$ matches the end of the string. They’re not always strictly necessary, but it can be good practice for examples like this.

^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$

The value is 1/d. As d starts at 2 and increases to 1000, the value starts at 0.5 and gets closer and closer to zero. So the value is always going to start with a zero. Normally a . in a regular expression indicates ‘any character’. The \. just tells the regular expression engine to ‘escape’ it and treat it as a string literal, not a regex special character.

^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$

[0-9] just means match any digit between 0 and 9. The * means 0 or many. Notice this is before the part in the ( ) that I’m really interested in. So in other words, there may be some digits before the pattern. e.g. for 1/6, which is 0.166666… I don’t care about that first digit 1.

^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$

The ( ) is a way of grouping parts of a regex, in this case for the benefit of back-referencing.
The fact we’re given a 6 digit example means we can assume the answer has to be at least 7 digits. So I used the {7,} syntax to mean the group has to be a minimum of 7 digits long.
The ? means 0 or 1. In other words, only give me up to 1 matching group that is a minimum of 7 characters long, and don’t be greedy!

^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$

The \1 is a back reference to the earlier grouped part. If there were multiple parts each with their own ( ) you could back-reference them with \2, \3 etc.
And the + means 1 or more, so there has to be at least 1 repeating group.

^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$

The final [0-9] is for any trailing digits that don’t fit the pattern. Even though once the pattern starts in theory it continues infinitely, in reality you might get a value that is rounded, e.g. where you have 0.16666667 instead of 0.16666666…
The * means zero or many of these trailing digits, and the ? stops it from being greedy. Interestingly, you’d assume that just the ? on its own would be more efficient, as that extra digit is just going to be one optional digit and no more. But that reduces the performance somewhat.

And here’s the complete code.

```import re
from decimal import *

def displaymatch(match):
if match is None:
return 0

return len(match.group(1))

length = 0
maxlen = 0
d = 0
getcontext().prec = 2000

for i in range(1,1000):
fraction = 1/Decimal(i)
pattern = re.search(r"^[0-9]\.[0-9]*([0-9]{7,}?)(\1+)[0-9]*?\$", str(fraction))

length = displaymatch(pattern)

if length > maxlen:
maxlen = length
d = i

print(d)```

I’m using the Python regular expression and Decimal libraries. I also had to increase the precision of the decimal context to up to 2000, as lower values were giving me incorrect answers. i.e. it was truncating the fraction a bit, which meant my pattern matching wasn’t finding the correct answer. So by increasing the precision, I increased the length of the fraction being returned, and therefore improved the chances of finding the correct answer.

1. ```def period(d):
"Return the period of the recurring decimal 1/d, or 0 if it terminates."
while d % 2 == 0: d //= 2
while d % 5 == 0: d //= 5
for n in range(1, d):
if (10 ** n - 1) % d == 0:
return n
return 0```

Comment by Gareth Rees — June 19, 2011 @ 10:17 pm

2. I edited your comment to maintain the all-important whitespace. I’m going to have to think about your solution there, it’s not immediately obvious to me what it’s doing!

Comment by duncan — June 20, 2011 @ 6:17 am

3. I am not fluent in Python, but I have a description of what I think is Gareth’s solution approach. I migth be off though. However, in any case it is a valid solution approach.

The implementation is written in C# but should be fairly readable.

http://www.mathblog.dk/2011/project-euler-26-find-the-value-of-d-1000-for-which-1d-contains-the-longest-recurring-cycle/

/Kristian

Comment by Kristian Edlund — June 20, 2011 @ 6:55 am

4. Are you working through each problem? And, BTW, neat code. I am hacking a little and tidying a little, but only up to problem 7, but really enjoying it.

J

http://www.financialagile.com/reflections/8-software/95-problem-with-this-design

Comment by Jamie — June 20, 2011 @ 7:29 am

• I tried working through them a while back, then got about as far as I could after almost 40 problems! Occasionally I come back to them and try again, with varying degrees of success!

Comment by duncan — June 20, 2011 @ 8:15 am

5. There are two parts to the algorithm I gave.

First, cast out all factors of 2 and 5 from d. This gets rid of the non-repeating part of the decimal. For example, 1/168 = 0.0059523809523809521… repeats with period 6, but only after the non-repeating sequence “005”. But after casting out all the factors of 2, we are left with 1/21 = 0.047619047619047616… which has the same period, but no initial non-repeating sequence.

Second, find the smallest number of the form 999….9 that is divisible by d, if any. For suppose that 1/d is repeating with period p. (For example, 1/21 = 0.047619047619047616… which repeats with period 6.) Then 10**p/d is an integer k, plus 1/d. (For example, 10**6/21 = 47619.047619047618…) So we have 10**p/d = k + 1/d. Therefore, k = (10**p − 1)/d and so d divides 10**p − 1.

You can improve the efficiency a bit by not accumulating 10**p mod d as you go round the loop, but it makes the code a bit longer:

```def period(d):
"Return the period of the recurring decimal 1/d, or 0 if it terminates."
while d % 2 == 0: d //= 2
while d % 5 == 0: d //= 5
k = 1 # Loop invariant: k == 10**p % d
for p in range(1, d):
k = (k * 10) % d
if k == 1:
return p
return 0```

Comment by Gareth Rees — June 20, 2011 @ 2:35 pm

• Oops! For “not accumulating” read “accumulating”.

Comment by Gareth Rees — June 20, 2011 @ 2:36 pm

6. […] preso spunto da qui e da qui per il pattern della regex. Dalla prima analisi, prendiamo per buono il fatto di testare solo i […]

Pingback by Problem Euler #26 « bancaldo™ — November 16, 2012 @ 7:31 am

Blog at WordPress.com.