Roman numeral to integer

Introduction

Roman numerals are a writing system for natural numbers (= positive integers) that combines Latin letters as symbols, each with a fixed integer value.

There are 7 symbols and their assigned integer values are:

  • M = 1000, D = 500, C = 100, L = 50, X = 10, V = 5, I = 1

While converting a Roman numeral is, in its simplest form, just summing the integer values of its symbols, e.g.

  • CXXIII = 100 + 10 + 10 + 1 + 1 + 1 = 123

a few additional rules must be observed:

  • The symbols are sorted from higher to lower values from left to right (e.g. IIIXXC is not a valid Roman numeral).

  • The same symbol can appear at most three times in a row. Numbers like 4 are therefore written in ‘subtractive notation’ as IV = -1 + 5 = 4 instead of IIII = 1 + 1 + 1 + 1 = 4. This is an exception to the rule above, because here the lower-valued symbol precedes the higher-valued one.

Note that the smallest valid Roman numeral is I = 1 and the largest is MMMCMXCIX = 3999; there are no negative values and there is no symbol for zero.

Exercise 1

Approach

Iterate over the Roman numeral from left to right and test whether the next symbol has a higher value than the current one. If so, subtract the value of the current symbol from the running sum; otherwise add it.

Complete the function that converts a Roman numeral to an integer.

Know how

Using a dict as lookup table

Although a dict is a mapping and not a sequence like str, tuple and list, it can be seen as a collection of key / value pairs and used as a lookup table. The most noticeable difference between a mapping and a sequence is that a dict’s values are accessed by key and not by positional index.

Using a dict as lookup table = look up a value by its key
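A minimal sketch of such a lookup, using the symbol values listed in the introduction. The names ROMAN_VALUES, value_of_c and value_of_a are chosen here for illustration only.

```python
# Lookup table: Roman symbol -> integer value (values from the introduction)
ROMAN_VALUES = {"M": 1000, "D": 500, "C": 100, "L": 50, "X": 10, "V": 5, "I": 1}

# Look up a value by its key with square brackets
value_of_c = ROMAN_VALUES["C"]

# .get() returns None instead of raising KeyError when the key is missing
value_of_a = ROMAN_VALUES.get("A")

print(value_of_c, value_of_a)
```

Square brackets raise a KeyError for an unknown key, while .get() lets the caller decide how to handle a missing symbol.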

Iterating over a string

for ... in ... : loops are the simplest way to iterate over any sequence with a fixed step size.

Version 1 - using a 'for ... in ... :' loop
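As an illustration, the loop below visits each symbol of a numeral once and adds up the values, which is the plain-sum case from the introduction (no subtractive notation). The variable names are assumptions made for this sketch.

```python
# Lookup table: Roman symbol -> integer value (values from the introduction)
ROMAN_VALUES = {"M": 1000, "D": 500, "C": 100, "L": 50, "X": 10, "V": 5, "I": 1}

numeral = "CXXIII"
total = 0
for symbol in numeral:          # one symbol per iteration, left to right
    total += ROMAN_VALUES[symbol]

print(total)
```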

To test whether the last item has been reached, the current index is also needed. The built-in function enumerate() can be used to get it.

Using 'enumerate()' to get the index within a 'for ... in ... :' loop
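A short sketch of how enumerate() yields the index together with each symbol, so the last position can be detected. The names last_index, is_last and visited are illustrative only.

```python
numeral = "XIV"
last_index = len(numeral) - 1

visited = []
for index, symbol in enumerate(numeral):
    # True only for the final symbol, which has no successor to compare with
    is_last = index == last_index
    visited.append((index, symbol, is_last))

print(visited)
```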

Slicing a string

Parts of strings can be ‘sliced’ (= cut out) by using square brackets and indices. Be aware that indices start counting at 0 and not 1. Up to three numbers can be supplied as sequence[start:stop:step]; each of them can be omitted, and the item at the stop index itself is not included.

Slicing out a sequence of two letters of a string
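The following sketch slices two-letter parts out of a sample numeral. The sample string and variable names are assumptions for illustration.

```python
numeral = "MCMXCIV"

first_two = numeral[0:2]            # "MC": indices 0 and 1, stop index 2 excluded
pair_at_1 = numeral[1:3]            # "CM"

index = 5
pair = numeral[index:index + 2]     # "IV": two letters starting at a given index

# A slice that runs past the end is shortened instead of raising an error
beyond_end = numeral[6:8]           # "V"

every_second = numeral[::2]         # "MMCV": step of 2, start and stop omitted

print(first_two, pair_at_1, pair, beyond_end, every_second)
```

Because an out-of-range slice simply returns a shorter string, a two-letter slice taken at the last index does not need a separate bounds check.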

Go to solution.

Exercise 2

Approach

An alternative approach is to treat the 6 possible double-letter combinations

  • CM = 900, CD = 400, XC = 90, XL = 40, IX = 9, IV = 4

as an extension to the 7 single-letter symbols: test for a double-letter symbol first and, if that fails, use the single-letter symbol.

Complete the function that converts a Roman numeral to an integer.

Know how

Iterating over a string with varying step sizes

Depending on whether a two-letter combination has been found, the step size has to change. Here a while ...: loop might be the easier way, though it can also be done with a for ... in ... : loop (in a less obvious way).

Using a 'while ... :' loop to skip every second letter
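A minimal sketch, assuming a sample string and illustrative variable names: the index is advanced by hand, so the step size is under full control and could be changed from one iteration to the next.

```python
text = "MCMXCIV"

picked = []
index = 0
while index < len(text):
    picked.append(text[index])
    index += 2                  # step of 2: the following letter is skipped

print(picked)
```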

Using a 'for ... in ... :' loop to skip every second letter
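One possible way to do this, sketched under the assumption that a boolean flag (here called skip_next, a name chosen for illustration) marks the letter to be skipped. The for loop still visits every letter, but continue jumps over the flagged ones.

```python
text = "MCMXCIV"

picked = []
skip_next = False
for letter in text:
    if skip_next:
        skip_next = False       # reset the flag and ignore this letter
        continue
    picked.append(letter)
    skip_next = True            # request that the following letter is skipped

print(picked)
```

Setting the flag only under some condition (instead of always) turns this into a loop with a varying step size.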

Using a 'for ... in ... :' loop to skip every second letter in combination with 'enumerate()'
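The same flag technique can be combined with enumerate() when the index is needed as well, for example to slice out the current letter together with the next one. This is a sketch with illustrative names, not a complete solution to the exercise.

```python
text = "MCMXCIV"

pairs = []
skip_next = False
for index, letter in enumerate(text):
    if skip_next:
        skip_next = False       # this letter was already part of the previous slice
        continue
    pairs.append(text[index:index + 2])   # current letter plus its successor
    skip_next = True

print(pairs)
```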

Go to solution.