Convert a string containing a Roman numeral to its integer equivalent

I have the following line:

str = "MMX Lions Television Inc"

      

And I need to convert it to:

conv_str = "2010 Lions Television Inc"

      

I have the following function to convert a Roman numeral to its integer equivalent:

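# Value/numeral pairs, largest first. Wrapping zip() in list() lets the map be
# reused across calls on Python 3, where zip() returns a one-shot iterator.
numeral_map = list(zip(
    (1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1),
    ('M', 'CM', 'D', 'CD', 'C', 'XC', 'L', 'XL', 'X', 'IX', 'V', 'IV', 'I')
))

def roman_to_int(n):
    n = str(n).upper()  # Python 2 code would use unicode(n) here

    i = result = 0
    for integer, numeral in numeral_map:
        # Consume each numeral as many times as it repeats at position i.
        while n[i:i + len(numeral)] == numeral:
            result += integer
            i += len(numeral)
    return result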

      

How can I use re.sub to get the correct string here?

(Note: I tried using the regex described here: How do you only match valid roman numerals to a regex? But it didn't work.)



2 answers


re.sub() can accept a function as the replacement. The function receives a single argument, a Match object, and must return the replacement string. You already have a function that converts a Roman numeral string to an int, so this shouldn't be difficult.

In your case, you need a function like this:

def roman_to_int_repl(match):
    return str(roman_to_int(match.group(0)))

      

You can now modify the regex from the linked question to find matches in the larger string:

import re

s = "MMX Lions Television Inc"
regex = re.compile(r'\b(?=[MDCLXVI]+\b)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b')
print(regex.sub(roman_to_int_repl, s))  # 2010 Lions Television Inc
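
The fragments above are spread across the question and this answer. Assembled into one Python 3 script, they might look like the sketch below. The table and pattern are the ones already shown; the names NUMERAL_MAP, ROMAN_WORD and convert_numerals are hypothetical and only there to make the script self-contained.

```python
import re

# Value/numeral pairs, largest first (same table as in the question).
NUMERAL_MAP = (
    (1000, 'M'), (900, 'CM'), (500, 'D'), (400, 'CD'),
    (100, 'C'), (90, 'XC'), (50, 'L'), (40, 'XL'),
    (10, 'X'), (9, 'IX'), (5, 'V'), (4, 'IV'), (1, 'I'),
)

# Pattern from this answer: a whole word made of numeral letters.
ROMAN_WORD = re.compile(
    r'\b(?=[MDCLXVI]+\b)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b'
)

def roman_to_int(numeral):
    i = result = 0
    for value, symbol in NUMERAL_MAP:
        # Consume each symbol as many times as it repeats at position i.
        while numeral[i:i + len(symbol)] == symbol:
            result += value
            i += len(symbol)
    return result

def roman_to_int_repl(match):
    # re.sub() calls this once per match and inserts the returned string.
    return str(roman_to_int(match.group(0)))

def convert_numerals(text):
    # Hypothetical wrapper, not part of the original answer.
    return ROMAN_WORD.sub(roman_to_int_repl, text)

print(convert_numerals("MMX Lions Television Inc"))  # 2010 Lions Television Inc
```

Note that the pattern is case-sensitive, so lowercase words such as "mix" are left alone.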

      



Here's a regex version that won't replace "LLC" in a string:

regex = re.compile(r'\b(?!LLC)(?=[MDCLXVI]+\b)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b')

      

You can also use the original regex with the modified replace function:

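def roman_to_int_repl(match):
    exclude = set(["LLC"])   # add any other strings you don't want to replace
    numeral = match.group(0)
    # The pattern can also match an empty string at the start of a word such
    # as "LLC"; return those matches unchanged instead of converting them to 0.
    if not numeral or numeral in exclude:
        return numeral
    return str(roman_to_int(numeral))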
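
As far as I can tell, the reason "LLC" causes trouble is that every part of the pattern is optional. On a word that consists only of numeral letters but is not a valid numeral, the pattern can still succeed by matching the empty string at the start of the word, and roman_to_int then turns that empty match into 0. The sketch below checks this and tries a hypothetical variant (STRICT, not taken from the linked question) that swaps the trailing \b for (?!\w), so a match has to run to the end of the word.

```python
import re

# Pattern from the answer above (groups made non-capturing for brevity).
ORIGINAL = re.compile(
    r'\b(?=[MDCLXVI]+\b)M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})\b'
)

# Hypothetical variant: the trailing \b becomes (?!\w), so a match must reach
# the end of the word and therefore cannot be empty.
STRICT = re.compile(
    r'\b(?=[MDCLXVI])M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})(?!\w)'
)

for word in ("MMX", "LLC", "CIVIL"):
    old = ORIGINAL.search(word)
    new = STRICT.search(word)
    print(word,
          repr(old.group(0)) if old else None,
          repr(new.group(0)) if new else None)
# MMX 'MMX' 'MMX'
# LLC '' None
# CIVIL '' None
```

With a pattern like STRICT, words such as "LLC" or "CIVIL" are skipped without listing them one by one. Valid numerals that happen to be words (for example "MIX") would still be converted, so an exclude list can remain useful for those.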

      



Always check the Python Package Index when looking for a commonly needed function or library.

Here is a list of modules associated with the "roman" keyword.



For example, "romanclass" has a class that implements the conversion. Quoting the documentation:

So a programmer can say:

>>> import romanclass as roman
>>> two = roman.Roman(2)
>>> five = roman.Roman('V')
>>> print (two+five)

and the computer will print:

VII

      
