Object-oriented vs. Functional Modelling of Musical Arithmetic in Python
Classical vs. Functional Modelling of Musical Arithmetic in Python¶
I'm currently building a music theory library in Python, called Ophis.
import ophis
This is an attempt to create a utility that "understands" music theory and can manipulate music information, to be used as a base for other applications. This would be handy for all sorts of things, from music theory educational apps to AI composition.
In this notebook, we'll look at how I originally implemented basic musical arithmetic in Ophis, the problems with that approach, and why I am moving from a classical to a functional design.
A Classical OOP Design¶
My first approach in implementing this was classically object oriented, and influenced by an essentially Platonic ontology.
The idea was that musical building blocks would be, as much as possible, similar to integers.
# A `Chroma` is the *idea* of a note letter name
# Example: "A" or "D FLAT"
# 35 chromae are initialized to constants on load,
# representing all 7 letter names,
# with sharps, flats, double sharps, and double flats.
ophis.wcs # Western Chroma Set,
# the complete list of all initialized chromae
One of the main ideas here is that there is one and only one representation of the idea of C SHARP or F NATURAL. Moreover, the chromae can be inspected, and know how to represent themselves.
ophis.FSHARP.unicode
ophis.FSHARP.ascii
ophis.FSHARP.base
Chromae also carry all the logic needed for musical manipulation and mathematical representation.
int(ophis.FSHARP)
ophis.FSHARP.augment()
ophis.FSHARP.diminish()
A Pitch is a Chroma with an octave designation. Using the special __call__ method on Chroma, and the __repr__ method on Pitch, I was able to make their interactive representation intuitive.
# in Chroma class
def __call__(self, octave):
    return Pitch(self, octave)

# in Pitch class:
def __repr__(self):
    # octave is an integer, so convert it before concatenating
    return self.chroma.name + "(" + str(self.octave) + ")"
# The "standard Python" way to create a pitch.
ophis.Pitch(ophis.GFLAT, 2)
# The Ophis canonical way.
ophis.GFLAT(2)
Intervals (without octaves) and QualifiedIntervals (with octaves) have a similar relationship to each other as Chroma and Pitch.
Rather than initializing every possible musical interval, the qualities (major, minor, perfect, augmented, diminished) are initialized and callable, to create an intuitive API.
ophis.Major(2) # A Major second.
ophis.Perfect(4, 2) # A Perfect fourth, plus 2 octaves.
Function caching is used to ensure that only one of any interval is created. (Some experimental benchmarking showed that this would matter in large scores.)
id(ophis.minor(2).augmented()) == id(ophis.Major(2))
And, of course, you can use both types of intervals to manipulate chromae and pitches.
ophis.G + ophis.Major(2)
ophis.A(2) + ophis.Perfect(5)
ophis.FSHARP(1) + ophis.Major(2, 2)
All this lets you do complicated musical manipulation and representation.
(ophis.FFLAT + ophis.Perfect(5)).diminish().unicode
Obviously, all this is only the beginning of what is needed for a music theory library. But it is a beginning. The next submodule will build up Duration and TimeSignature, leading to the creation of Measure and eventually Score. My current plan is to use pandas.DataFrame for multi-voice scores, as that would allow cross-voice analysis in a way that multi-dimensional lists would not.
Problems Appear¶
So that's great, but...
I can't help but wonder if some of this is overwrought.
A number of interrelated concerns occurred to me while working on this implementation.
Logic is hard to reason about¶
The math of moving from note to note is riddled with off-by-one and modulo arithmetic problems.
- An interval representing no change (from a note to itself) is called a unison, represented with a 1. A difference of one step is called a second, and so on.
- The first scale degree is 1. (Not zero indexed.)
- We frequently think about scales as having eight notes, but in reality they only have seven. When this is zero indexed, the notes go from 0-6. This is fine for arithmetic, but when thinking as a musician it is jarring.
Because of this difficulty in clear thinking on my part, I often found myself using the guess-and-check method for remembering when to add or subtract a one.
I wrote rigorous tests along the way to keep these errors out, so everything ends up fine in the end. However, this made for slow and sometimes demoralizing progress, and I would hate to have to go back and reason about this code after being away from it.
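The off-by-one traps listed above can be made concrete with a small sketch. The function names here are hypothetical helpers for illustration, not part of Ophis.

```python
def stack_interval_numbers(a, b):
    """Combine two one-based interval numbers (unison = 1)."""
    # Both intervals count their starting note, so the shared note
    # is counted twice and has to be subtracted once.
    return a + b - 1


def stack_steps(a, b):
    """Combine two zero-based step counts (unison = 0)."""
    return a + b


def wrap_degree(degree):
    """Map any one-based scale degree onto 1..7 (degree 8 is degree 1)."""
    return (degree - 1) % 7 + 1


third, fifth = 3, 5
print(stack_interval_numbers(third, third))  # 5: a third plus a third is a fifth
print(stack_steps(third - 1, third - 1))     # 4: the same fifth, zero-indexed
print(wrap_degree(8))                        # 1: the "eighth" note is the first again
```

The zero-based version needs no correction term, which is why it is easier to compute with, even though the one-based names are what a musician reads and says.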
Incorrect assumptions about logical order¶
The first attempt to implement basic Chroma functionality assumed that Interval (the relationship between two chromae) would depend on Chroma. It turns out this is exactly backwards. Interval is logically prior to Chroma. There is no way to define abstract named pitches without their relationships already existing.
Practically speaking, discovering this simply meant I had to re-order some code. But this challenged my thinking about what the fundamental building blocks of music theory actually are.
Convoluted logic and utility data structures¶
Here's an example, the augment method from the Chroma class.
def augment(self, magnitude=1, modifier_preference="sharp"):
    """Return a chroma higher than the one given.

    Args:
        magnitude (:obj:`int`, :obj:`Interval`,
            or obj with an ``int`` value; optional):
            the distance to augment by.
            Integer values are interpreted as half steps.
            Defaults to 1.
        modifier_preference (:obj:`str`,
            ``'sharp'`` or ``'flat'``;
            optional)
            Defaults to ``'sharp'``.

    Examples:
        >>> C.augment()
        CSHARP
        >>> C.augment(1, 'flat')
        DFLAT
        >>> C.augment(minor(3))
        EFLAT
        >>> D.augment(2)
        E
        >>> E.augment()
        F
        >>> E.augment(2, 'flat')
        GFLAT
    """
    value_candidates = self.essential_set.chroma_by_value(
        int(self) + int(magnitude)
    )
    try:
        letter_candidates = self.essential_set.chroma_by_letter(
            self.base_num + magnitude.distance
        )
        solution, = value_candidates & letter_candidates
        return solution
    except:
        return value_candidates.enharmonic_reduce(modifier_preference)
If it isn't obvious, here's what it does:
- Calculate the integer value of the target Chroma and find the set of Chroma objects which have the integer value we're looking for.
- Try:
  - Calculate the letter name of the target Chroma and find the set of Chroma objects that have the letter name we're looking for.
  - Find and return the intersection of the integer-value set and the letter-name set, which should contain exactly one Chroma.
- Except:
  - Return a member of the integer-value set, basing the selection on some logic (defined elsewhere) that prefers sharps to flats or flats to sharps in certain instances.
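The steps above can be sketched with toy data. This is only an illustration of the set-intersection idea: the `CHROMAE` table and the helper names are assumptions made up for this example, and the real ChromaSet methods in Ophis may behave differently.

```python
# Hypothetical toy data: name -> (letter index, half steps from C).
CHROMAE = {
    "DSHARP": (1, 3),
    "EFLAT": (2, 3),
    "DDOUBLESHARP": (1, 4),
    "E": (2, 4),
    "FFLAT": (3, 4),
}


def by_value(half_steps):
    """All chroma names that sound at the given half-step value."""
    return {name for name, (_, hs) in CHROMAE.items() if hs == half_steps}


def by_letter(letter):
    """All chroma names spelled with the given letter index."""
    return {name for name, (ltr, _) in CHROMAE.items() if ltr == letter}


def augment_by_sets(start, interval):
    """Both arguments are (letter index, half steps) pairs."""
    value_candidates = by_value(start[1] + interval[1])
    letter_candidates = by_letter(start[0] + interval[0])
    # Exactly one name should have both the right sound and the right letter.
    solution, = value_candidates & letter_candidates
    return solution


print(augment_by_sets((0, 0), (2, 3)))  # C up a minor third -> EFLAT
print(augment_by_sets((0, 0), (2, 4)))  # C up a major third -> E
```

Two lookups and a set operation are needed to recover what is, in the end, just the pair of sums.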
This works, but it isn't at all how a musician thinks about this operation. Moreover, it depends on the essential_set, the collection of all initialized chromae. (Referred to above as wcs, the Western Chroma Set.) It would be bad enough if this was just used to keep the pool of initialized chromae, so that methods returning C Sharp always returned the same C Sharp. But it doesn't just do that. An inordinate amount of musical knowledge and logic crept into the ChromaSet class that defines the essential_set. While I'm positive that some of this is due to bad coding on my part, I think the bulk of it is due to bad conceptualization.
The final problem with this is that it is non-obvious. This code is hard to read and reason about, because it isn't clear what is actually happening.
Fragile Primitives¶
Python doesn't really allow you to protect object attributes or module-level constants. There are some things you can do to ensure object attributes aren't reassigned accidentally (and I've done them), but (as far as I can tell) module-level constants cannot be protected.
This is a problem, since the fundamental building blocks of music theory in the current implementation are initialized as constants. The object representing C Sharp is created and assigned to the name CSHARP. If that name gets reassigned, you are basically hosed. This could lead to hard-to-trace errors and frustrating interactive sessions.
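A short sketch of the failure mode, using a stand-in module rather than Ophis itself. All names here (`fake_ophis`, the `Chroma` enum) are hypothetical. For comparison, it also shows one standard-library option, `enum.Enum`, whose members refuse reassignment; whether that would suit Ophis is an open question, and the enum class name itself can still be rebound.

```python
import enum
import types

# A stand-in module holding a "constant".
fake_ophis = types.ModuleType("fake_ophis")
fake_ophis.CSHARP = (0, 1)

# Nothing stops a caller from rebinding the name; no error is raised.
fake_ophis.CSHARP = "oops"
rebound = fake_ophis.CSHARP


class Chroma(enum.Enum):
    C = (0, 0)
    CSHARP = (0, 1)


# Enum members reject reassignment with an AttributeError.
try:
    Chroma.CSHARP = "oops"
    enum_rebound = True
except AttributeError:
    enum_rebound = False

print(rebound)        # oops
print(enum_rebound)   # False
```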
Poor isomorphism to numbers¶
One of the design goals of Ophis is to be able to treat musical concepts as numbers. That's why the arithmetic operators are implemented and everything has an integer value. I wanted it to be easy for math utilities to operate on pitches and intervals. This would enable things like advanced theoretical analysis and machine learning.
But, they aren't numbers. They just aren't.
You can't (meaningfully) have a Chroma with a float, decimal, or fractional value. This means that microtones are not presently accounted for and will require an extension, the logic of which I can only guess at.
You also can't meaningfully multiply or divide values. Offhand, I'm not sure why you would want to do so, but I can imagine approaches to musical analysis where it would be needed.
Further, even with supported integer-based operations, using any standard math tool requires notes from a score to be converted into numbers, manipulated or analyzed, and converted back. There's no direct access to Ophis "primitives" in Numpy, SciKitLearn, or anything else.
These problems piled up over time as I implemented the basic logic and worked out the implications. Technical debt accumulates through a process of small compromises and justifications. By the time I became aware of the scope of the problem, I had two thoughts:
- Re-architecting everything would take too long to be worthwhile. I would probably get disheartened and give up.
- I can refactor the internals in the future to make things a bit clearer and cleaner. In the meantime, good documentation would make the code maintainable.
So, my plan was to just keep moving. But then, thinking about the isomorphism problem, I realized another poorly-mapped isomorphism.
Poor isomorphism between Chroma and Intervals¶
Or really, no explicit isomorphism at all. And this is a problem because these are really the same thing.
I had implemented the Interval class, and written all the logic for how intervals are inverted, augmented, and diminished. This requires an understanding of the interplay between interval distances (third, fourth, sixth) and their qualities (Major, minor, Perfect), and how many half-steps each are. And of course there's that zero-indexing stuff to think about (second = 1, third = 2).
Then I implemented the Chroma class, and wrote almost the same logic (but just a bit different) for how pitches are augmented and diminished (pitches aren't inverted). This requires an understanding of the interplay between note letter names (C, D, E), how those letter names map to a zero-indexed numerical representation (C = 0, D = 1, E = 2), and how modifiers like sharp and flat affect the total number of half-steps from C (the origin point in modern music theory).
But these are, as I said, exactly the same thing.
Every note can be represented as an interval from C. And not only can it be represented that way, but that is exactly how it was already defined. There is no other reasonable way to (numerically) define notes.
Here's an example in case this isn't clear:
- E Natural is a Major Third away from C.
- In our zero-indexed representation of intervals, a Third is 2.
- A Major Third is 4 half-steps.
Those two numbers, (2, 4), are an integral part of the definition of E Natural; without them, you can't do any of the manipulation that makes the Chroma meaningful.
print(ophis.Major(3).distance, int(ophis.Major(3)))
print(ophis.E.base_num, int(ophis.E))
Obviously, this holds for every other Chroma as well.
print(ophis.Perfect(5).distance, int(ophis.Perfect(5)))
print(ophis.G.base_num, int(ophis.G))
print(ophis.Augmented(6).distance, int(ophis.Augmented(6)))
print(ophis.ASHARP.base_num, int(ophis.ASHARP))
Further, it turns out that these two numbers are the only things you need to know in order to do any standard musical manipulation you might want to do.
g_or_p5 = (4,7) # Tuple representing G or a Perfect Fifth
e_or_maj3 = (2,4) # Tuple representing E or a Major Third
# Add tuples element wise.
sum_of_tuples = (
g_or_p5[0] + e_or_maj3[0],
g_or_p5[1] + e_or_maj3[1]
)
sum_of_tuples # (6,11)
g_augmented_by_maj3 = ophis.G.augment(ophis.Major(3))
e_augmented_by_p5 = ophis.E.augment(ophis.Perfect(5))
print(g_augmented_by_maj3)
print(e_augmented_by_p5)
print(ophis.B.base_num, int(ophis.B))
z = ophis.Perfect(5) + ophis.Major(3)
z
print(z.distance, int(z))
So any chroma and any interval can be represented by a two-tuple, while manipulations originally implemented as methods in different classes can be a unified set of pure functions that accept tuples as arguments.
Great.
But two-tuples don't provide all the additional information you need to notate pitches or otherwise make them understandable as music.
So we need some "translation" functions. This still involves a lot of "magic number" coding, but hopefully it can be condensed into a small set of mappings that are easy to reason about.
import bidict # efficient two-way indexing of dicts
primary_map = [
    # half steps, scale degree / interval number,
    # interval name, letter name, Perfect?
    # (False=Major)
    (0, 1, "unison", "C", True),     # 0
    (2, 2, "second", "D", False),    # 1
    (4, 3, "third", "E", False),     # 2
    (5, 4, "fourth", "F", True),     # 3
    (7, 5, "fifth", "G", True),      # 4
    (9, 6, "sixth", "A", False),     # 5
    (11, 7, "seventh", "B", False),  # 6
]
# Split primary map into bidicts for each value.
# For faster, more sensible referencing.
# This feels wrong and I need a better way.
# Maybe something with named tuple...
hs_map = bidict.bidict(
    {x: item[0] for x, item in enumerate(primary_map)}
)
interval_map = bidict.bidict(
    {x: item[1] for x, item in enumerate(primary_map)}
)
interval_name_map = bidict.bidict(
    {x: item[2] for x, item in enumerate(primary_map)}
)
name_map = bidict.bidict(
    {x: item[3] for x, item in enumerate(primary_map)}
)
quality_map = {
    x: item[4] for x, item in enumerate(primary_map)
}
# How to translate between
# diatonic intervals and modified intervals.
interval_quality_map = {
    True: bidict.bidict({   # Diatonic is Perfect
        -2: 'double diminished',
        -1: 'diminished',
        0: 'Perfect',
        1: 'Augmented',
        2: 'Double Augmented',
    }),
    False: bidict.bidict({  # Diatonic is Major
        -2: 'diminished',
        -1: 'minor',
        0: 'Major',
        1: 'Augmented',
        2: 'Double Augmented',
    }),
}
modifiers = bidict.bidict({
    -2: 'doubleflat',
    -1: 'flat',
    0: 'natural',
    1: 'sharp',
    2: 'doublesharp',
})
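One possible shape for the "something with named tuple" idea mentioned in the comments above, using only the standard library. This is a sketch under assumptions: the `Step` type, its field names, and the reverse-lookup dictionaries are invented for illustration and are not part of Ophis.

```python
from collections import namedtuple

# Hypothetical row type; one row per letter name / diatonic interval.
Step = namedtuple("Step", "half_steps number interval_name letter is_perfect")

PRIMARY_MAP = (
    Step(0, 1, "unison", "C", True),
    Step(2, 2, "second", "D", False),
    Step(4, 3, "third", "E", False),
    Step(5, 4, "fourth", "F", True),
    Step(7, 5, "fifth", "G", True),
    Step(9, 6, "sixth", "A", False),
    Step(11, 7, "seventh", "B", False),
)

# Reverse lookups, built once from the same rows.
INDEX_BY_LETTER = {step.letter: i for i, step in enumerate(PRIMARY_MAP)}
INDEX_BY_INTERVAL_NAME = {
    step.interval_name: i for i, step in enumerate(PRIMARY_MAP)
}

print(PRIMARY_MAP[4].letter)                         # G
print(PRIMARY_MAP[INDEX_BY_LETTER["E"]].half_steps)  # 4
print(INDEX_BY_INTERVAL_NAME["fifth"])               # 4
```

Forward lookups become attribute access on a single table, and only the two string columns need a reverse index.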
import functools

# Single Dispatch:
# two functions with the same signature.
#
# The type of the first argument determines
# which function is executed.
# This way, if a tuple is passed in,
# a string is returned,
# and if a string is passed in,
# a tuple is returned.
@functools.singledispatch
def chroma(x):
    return None

@chroma.register(tuple)
def _(xy):
    x, y = xy
    name = name_map[x]
    modifier = modifiers[y - hs_map[x]]
    return " ".join([name, modifier])

@chroma.register(str)
def _(letter, modifier):
    x = name_map.inv[letter]
    mod_diff = modifiers.inv[modifier]
    y = hs_map[x] + mod_diff
    return x, y
@functools.singledispatch
def interval(x):
    return None

@interval.register(tuple)
def _(xy):
    x, y = xy
    name = interval_name_map[x]
    q_mod = y - hs_map[x]
    # quality_map is keyed by the zero-indexed interval number (x),
    # not by the half-step count (y)
    q = interval_quality_map[quality_map[x]][q_mod]
    return " ".join([q, name])

@interval.register(str)
def _(q, n):  # quality, number
    x = n - 1
    is_perfect = quality_map[x]
    q_mod = interval_quality_map[is_perfect].inv[q]
    y = hs_map[x] + q_mod
    return x, y
def augment(a, b):
    # element-wise sum of two tuples
    return tuple(x + y for x, y in zip(a, b))

def diminish(a, b):
    # element-wise difference of two tuples
    return tuple(x - y for x, y in zip(a, b))
chroma((0,0))
interval((2,3))
chroma('D', 'sharp')
interval('Major', 3)
chroma(augment(chroma('C', 'sharp'), interval('minor', 3)))
%timeit chroma(
augment(chroma('C', 'sharp'), interval('minor', 3))
)
ophis.CSHARP.augment(ophis.minor(3))
%timeit ophis.CSHARP.augment(ophis.minor(3))
A functional approach:
- simplifies the math and logic
- preserves important isomorphisms
- requires much less code
- executes much faster
The only downside is that the API for interactive use is a little less elegant, but not so much as to be a problem.
Where to Go From Here¶
The quick functional implementation demonstrated here doesn't include all the things that the OO approach currently has.
Foremost, this version needs to include modulo arithmetic.
augment((6,11),(1,1)) # B plus a minor second. Should be (0,0), in musical logic.
# This result is meaningless.
chroma((7,12)) # KeyError
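A minimal sketch of what the modulo version might look like. The names `augment_mod`, `diminish_mod`, and `modifier_offset` are hypothetical. One assumption worth flagging: once half steps wrap, a note like C flat becomes (0, 11), so the translation step can no longer subtract directly and has to pick the nearest signed offset instead.

```python
LETTERS = 7      # letter names per octave
HALF_STEPS = 12  # half steps per octave
NATURAL_HALF_STEPS = (0, 2, 4, 5, 7, 9, 11)  # C D E F G A B


def augment_mod(a, b):
    """Add two (letter index, half steps) pairs, wrapping at the octave."""
    return ((a[0] + b[0]) % LETTERS, (a[1] + b[1]) % HALF_STEPS)


def diminish_mod(a, b):
    """Subtract b from a, wrapping at the octave."""
    return ((a[0] - b[0]) % LETTERS, (a[1] - b[1]) % HALF_STEPS)


def modifier_offset(pair):
    """Signed distance from the natural letter, e.g. -1 for a flat."""
    letter, half_steps = pair
    diff = (half_steps - NATURAL_HALF_STEPS[letter]) % HALF_STEPS
    return diff - HALF_STEPS if diff > HALF_STEPS // 2 else diff


print(augment_mod((6, 11), (1, 1)))   # B up a minor second -> (0, 0), C
print(diminish_mod((0, 0), (1, 1)))   # C down a minor second -> (6, 11), B
print(modifier_offset((0, 11)))       # C flat -> -1
print(modifier_offset((6, 0)))        # B sharp -> 1
```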
Additionally, I need to include octave designations. The arithmetic is almost included for free with the functional approach, but the translation functions don't support it.
# Maj Second (or D) and min 3, with a third term for octave designation
augment((1,2,1), (2,3,2))
# F, 2 octaves above Middle C
chroma((3,5,3)) # ValueError
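A sketch of how the third term could be handled, assuming it means "octave number" for a pitch and "number of extra octaves" for an interval. The function name is hypothetical. The carry is driven by the letter index, on the assumption that octave numbers follow the spelled letter rather than the sounding half-step value.

```python
def augment_with_octave(pitch, interval):
    """Add (letter index, half steps, octave) triples with an octave carry."""
    letters = pitch[0] + interval[0]
    half_steps = pitch[1] + interval[1]
    # Every time the letter index passes B, the octave number goes up by one.
    octave = pitch[2] + interval[2] + letters // 7
    return (letters % 7, half_steps % 12, octave)


# A in octave 2, up a Perfect fifth with no extra octaves.
print(augment_with_octave((5, 9, 2), (4, 7, 0)))  # (2, 4, 3): E in octave 3

# F sharp in octave 1, up a Major second plus 2 octaves.
print(augment_with_octave((3, 6, 1), (1, 2, 2)))  # (4, 8, 3): G sharp in octave 3
```

The translation functions would then only need to unpack three values and append the octave to the name.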
The translation functions and associated dictionaries need to be extended to include multiple representations such as Unicode, ASCII, and Lilypond.
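One way this extension might look is a suffix table per output format. This is a sketch, not the existing Ophis behaviour: `SUFFIXES` and `render` are invented names, the ASCII spelling is an arbitrary choice, and the LilyPond column assumes its default Dutch-style note names (`is` for sharp, `es` for flat).

```python
LETTER_NAMES = "CDEFGAB"
NATURAL_HALF_STEPS = (0, 2, 4, 5, 7, 9, 11)

# Modifier offset -> suffix, one table per output format.
SUFFIXES = {
    "ascii": {-2: "bb", -1: "b", 0: "", 1: "#", 2: "##"},
    "unicode": {
        -2: "\U0001D12B",  # musical symbol double flat
        -1: "\u266D",      # flat sign
        0: "",
        1: "\u266F",       # sharp sign
        2: "\U0001D12A",   # musical symbol double sharp
    },
    "lilypond": {-2: "eses", -1: "es", 0: "", 1: "is", 2: "isis"},
}


def render(pair, style="ascii"):
    """Render a (letter index, half steps) pair in the given style."""
    letter, half_steps = pair
    name = LETTER_NAMES[letter]
    if style == "lilypond":
        name = name.lower()
    return name + SUFFIXES[style][half_steps - NATURAL_HALF_STEPS[letter]]


print(render((3, 6), "ascii"))     # F#
print(render((3, 6), "lilypond"))  # fis
print(render((2, 3), "unicode"))   # E followed by the flat sign
```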
Finally, I might also do some experimentation with a hybrid approach that would keep the OO API intact. However, that might get too complicated.
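A rough sketch of what such a hybrid could look like: thin tuple subclasses that keep the operator-based API while delegating the arithmetic to the same pure function. The class names and constructor signature are assumptions for illustration and do not match the current Ophis classes.

```python
def augment(a, b):
    """Element-wise sum of two tuples."""
    return tuple(x + y for x, y in zip(a, b))


class Pair(tuple):
    """Hypothetical base: a (letter index, half steps) pair with operators."""

    def __new__(cls, letter, half_steps):
        return super().__new__(cls, (letter, half_steps))

    def __add__(self, other):
        # Delegate to the pure function, then re-wrap in the left operand's type.
        return type(self)(*augment(self, other))


class Chroma(Pair):
    pass


class Interval(Pair):
    pass


G = Chroma(4, 7)
MAJOR_THIRD = Interval(2, 4)

result = G + MAJOR_THIRD
print(result)                      # (6, 11), the pair for B
print(isinstance(result, Chroma))  # True
```

Because the objects are still tuples, they can be passed straight to the functional helpers without conversion.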
Is Functional Really Better?¶
I don't know.
The difference in execution speed can probably be resolved by cleaning up the OO implementation and importing the tuple-based arithmetic instead of the intersection-of-sets approach. I suspect that this would also make the OO code easier to read and reason about.
I'm not at all convinced that there is something inherently wrong with object oriented code. However, I do think that a classical paradigm promotes a "different things are different things" mentality. In some cases this is probably helpful. In music, at least in this case, it obscured a fundamental sameness between two important concepts. Functional programming forced me to recognize that sameness. Someone else might have recognized it anyway, and written a good OO implementation first.
If there is a generalizable lesson here, I think it might be this: Think through a few different approaches and paradigms. Try coding up more than one logical implementation in the domain. See how that sheds light on the underlying problems and data structure.
Also, don't be afraid to redesign things. Especially if you don't have any users yet.