# Duncan’s blog

## June 19, 2011

### Project Euler problem 26

Filed under: Project Euler,Python — duncan @ 9:27 pm

A unit fraction contains 1 in the numerator. The decimal representation of the unit fractions with denominators 2 to 10 are given:

1/2 = 0.5
1/3 = 0.(3)
1/4 = 0.25
1/5 = 0.2
1/6 = 0.1(6)
1/7 = 0.(142857)
1/8 = 0.125
1/9 = 0.(1)
1/10 = 0.1

Where 0.1(6) means 0.166666…, and has a 1-digit recurring cycle. It can be seen that 1/7 has a 6-digit recurring cycle.

Find the value of d < 1000 for which 1/d contains the longest recurring cycle in its decimal fraction part.

Another one I initially attempted with ColdFusion, but it didn’t give decimals to enough precision (at least without using the underlying Java), so turned to Python.

To me this seemed like a good place to use a regular expression, where you can match a group and any repetitions of it, using back-references. Some problems around which parts should be greedy or not. For instance if you have 0.01010101 is the repeating pattern 0101 twice, or 01 four times?

Here’s the regular expression I used:

“^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$”

And here’s an explanation of the various parts of it:

^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$

The ^ matches the beginning of the string, and the \$ matches the end of the string. They’re not always strictly necessary, but it can be good practice for examples like this.

^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$

The value is 1/d. As d starts at 2 and increases to 1000, the value starts at 0.5 and gets closer and closer to zero. So the value is always going to start with a zero. Normally a . in a regular expression indicates ‘any character’. The \. just tells the regular expression engine to ‘escape’ it and treat it as a string literal, not a regex special character.

^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$

[0-9] just means match any digit between 0 and 9. The * means 0 or many. Notice this is before the part in the ( ) that I’m really interested in. So in other words, there may be some digits before the pattern. e.g. for 1/6, which is 0.166666… I don’t care about that first digit 1.

^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$

The ( ) is a way of grouping parts of a regex, in this case for the benefit of back-referencing.
The fact we’re given a 6 digit example means we can assume the answer has to be at least 7 digits. So I used the {7,} syntax to mean the group has to be a minimum of 7 digits long.
The ? means 0 or 1. In other words, only give me up to 1 matching group that is a minimum of 7 characters long, and don’t be greedy!

^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$

The \1 is a back reference to the earlier grouped part. If there were multiple parts each with their own ( ) you could back-reference them with \2, \3 etc.
And the + means 1 or more, so there has to be at least 1 repeating group.

^0\.[0-9]*([0-9]{7,}?)\1+[0-9]*?\$

The final [0-9] is for any trailing digits that don’t fit the pattern. Even though once the pattern starts in theory it continues infinitely, in reality you might get a value that is rounded, e.g. where you have 0.16666667 instead of 0.16666666…
The * means zero or many of these trailing digits, and the ? stops it from being greedy. Interestingly, you’d assume that just the ? on its own would be more efficient, as that extra digit is just going to be one optional digit and no more. But that reduces the performance somewhat.

And here’s the complete code.

```import re
from decimal import *

def displaymatch(match):
if match is None:
return 0

return len(match.group(1))

length = 0
maxlen = 0
d = 0
getcontext().prec = 2000

for i in range(1,1000):
fraction = 1/Decimal(i)
pattern = re.search(r"^[0-9]\.[0-9]*([0-9]{7,}?)(\1+)[0-9]*?\$", str(fraction))

length = displaymatch(pattern)

if length > maxlen:
maxlen = length
d = i

print(d)```

I’m using the Python regular expression and Decimal libraries. I also had to increase the precision of the decimal context to up to 2000, as lower values were giving me incorrect answers. i.e. it was truncating the fraction a bit, which meant my pattern matching wasn’t finding the correct answer. So by increasing the precision, I increased the length of the fraction being returned, and therefore improved the chances of finding the correct answer.

Theme: Rubric. Blog at WordPress.com.