
Learn About Python Decorators by Writing a Function Dispatcher

Python decorators transform how functions work.

@some_decorator
def some_function():
    return something

>>> some_function()
Something else!

Between the core language and the standard library, there are several decorators that come with Python. Also, many popular frameworks provide decorators to help eliminate boilerplate and make building applications faster and easier.

Decorators are easier to use than to understand.

This article will try to help you understand how decorators work and how to write them. To do that, we'll write a basic implementation of a dispatch function, which will conditionally call function implementations based on the value of an argument.

A Problem in Need of Decorators

How often have you written code that looks something like this?

def if_chain_function(status):
    if status == "red":
        # large block of code
    elif status == "orange":
        # large block of code
    elif status == "yellow":
        # large block of code
    else:
        # some default large block of code

Most of us have written some version of that more than we'd like to admit. Some languages even have a special mechanism (usually called switch) for doing it.

But it has problems.

  • As the number of cases increases, it becomes hard to read.
  • The code blocks in each case will likely evolve independently, some getting refactored into clean functions, some not, and some being a mix of inline code and custom function calls.
  • New cases have to be added explicitly in this function. This precludes pluggable cases from external modules, and also just adds mental load.
  • If the cases are anything other than simple comparisons, the whole thing quickly becomes difficult to reason about.

To solve these problems, making the code more readable and easier to extend, we're going to look at function dispatching.

A little bit about function dispatching

Conventionally, function dispatching is related to the type of the argument. That is, function dispatching is a mechanism for doing different things, depending on whether you pass in an int or a str or a list or whatever.

Python is dynamically typed, so you don't have to specify that a function only accepts some specific type as an argument. But if those different types have to be handled differently, you might end up with code that looks eerily similar to the code above.

def depends_on_type(x):
    if type(x) == str:
        # large block of code
    elif type(x) == int:
        # large block of code
    elif type(x) == list:
        # large block of code
    else:
        # some default large block of code

This has all the same problems mentioned above. But, unlike the first example, Python already has a solution for this one: functools.singledispatch.

This is a decorator which transforms a function into a single-dispatch generic function. You then register other functions against it, specifying a type of object (that is, a class name). When the function is called, it:

  1. looks up the type of the first argument
  2. checks its registry for that type
  3. executes the function registered for that type
  4. if the type wasn't registered, the original function is executed

If this sounds complicated, don't worry. Using singledispatch is simpler than explaining it.

In [1]:
import functools

@functools.singledispatch
def dispatch_on_type(x):
    # some default logic
    print("I am the default implementation.")

@dispatch_on_type.register(str)
def _(x):
    # some stringy logic
    print(f"'{x}' is a string.")

@dispatch_on_type.register(int)
def _(x):
    # some integer logic
    print(f"{x} is an integer.")

@dispatch_on_type.register(list)
def _(x):
    # some list logic
    print(f"{x} is a list.")
In [2]:
dispatch_on_type(3.5)
I am the default implementation.
In [3]:
dispatch_on_type("STRING")
'STRING' is a string.
In [4]:
dispatch_on_type(1337)
1337 is an integer.
In [5]:
dispatch_on_type([1, 3, 3, 7])
[1, 3, 3, 7] is a list.

You can do all sorts of cool things with functools.singledispatch, but it doesn't solve the problem in the code at the top of the page. For that, we're going to create a decorator similar to singledispatch that dispatches based on the value of the first argument instead of the type.

Along the way we'll learn more about how decorators work.

Writing Decorators

The @ decorator syntax is syntactic sugar: it passes a function to another function, then assigns the returned function to the original function's name.


Again, this is easier to show than to explain.

In [6]:
def a_decorator(func):
    return func

# The sweet decorator way...
@a_decorator
def some_function():
    print("Some function.")

# Which has exactly the same effect as...
def some_other_function():
    print("Some other function.")
some_other_function = a_decorator(some_other_function)

A decorator is just a function that takes a function and returns a function.

When used with the @ syntax,

  1. the decorator function is called, with the decorated function passed as an argument;
  2. the return value of the decorator function is assigned to the same name as the decorated function.
  3. When you call the decorated function, you are actually calling the function that was returned by the decorator (which may or may not call the original function's code).

But the above example returned the original function without altering it. The point of decorators is to return something other than the original function, in order to transform the function in some way.

To do this, we usually define another function inside the decorator, and then return that.

In [7]:
def never_two(func):
    def wrapper(*args, **kw):
        x = func(*args, **kw)
        return x if x != 2 else 3
    return wrapper

@never_two
def add(x, y):
    return x + y
In [8]:
add(1, 1)
3

The wrapper function is defined inside never_two, but it is not executed when never_two is executed (which happens at the line where @never_two appears). Notice that it isn't called anywhere. (That is, you don't see anything like wrapper(1, 1).)

Instead, the wrapper function is returned by never_two, and assigned to the name add. Meanwhile, the code in the original add definition is inside wrapper, where it is called func.

When add(1,1) is called:

  1. The code defined in wrapper is executed, because it was assigned to add.
  2. The arguments passed into add (the two 1s) are passed on to func when it is called at x = func(*args, **kw).
  3. The code originally defined at add (return x + y) is executed, because that code was assigned to func.
  4. The output of func (the original add) is compared to 2, and altered accordingly.
  5. The code defined under wrapper (currently being called add) returns 3.

Two points might be helpful here:

  • Think of a function as just everything from the parentheses onward, excluding the name. Once you think of a function as a block of code that accepts arguments, and which can be assigned to any name, things get a little easier to understand.

  • The (*args, **kw) is a way to collect, pass on, and then unpack all the positional and keyword arguments. A full treatment is beyond the scope of this article. For now, just notice that whatever is passed into wrapper is simply passed on to func.
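As a quick illustration of that pass-through behavior (a made-up example, not from the original article):

```python
def show_args(func):
    def wrapper(*args, **kw):
        # Positionals arrive as a tuple, keywords as a dict,
        # and both are forwarded to func unchanged.
        print("args:", args, "kw:", kw)
        return func(*args, **kw)
    return wrapper

@show_args
def add(x, y, scale=1):
    return (x + y) * scale

add(1, 2, scale=10)  # prints: args: (1, 2) kw: {'scale': 10}
```

Whatever signature add has, wrapper never needs to know it.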

Writing a Dispatch Decorator

Let's look at the syntax of functools.singledispatch again, and think about what we need to do to emulate it for values instead of types.

In [9]:
@functools.singledispatch
def dispatch_on_type(x):
    print("I am the default implementation.")

@dispatch_on_type.register(str)
def _(x):
    print(f"'{x}' is a string.")

Decorators that Return Decorators

Notice that we actually have two decorators:

  • functools.singledispatch
  • dispatch_on_type.register

This means inside singledispatch, the decorated function (in this case, dispatch_on_type) is being assigned an additional attribute, .register, which is also a decorator function.

That might look something like:

In [10]:
def outer_decorator(func):
    def inner_decorator(func):
        def inner_wrapper():
            print("Inner wrapper.")
        return inner_wrapper
    def wrapper():
        print("The wrapper function.")
    wrapper.decorator = inner_decorator
    return wrapper

@outer_decorator
def a_function():
    print("Original a_function.") # This will never execute.

@a_function.decorator
def another_function():
    print("Original another_function.") # This will never execute.
In [11]:
a_function()
The wrapper function.
In [12]:
another_function()
Inner wrapper.

Unpacking that a bit:

  • outer_decorator defines two functions, inner_decorator and wrapper
  • wrapper is returned by outer_decorator, so it is executed when a_function is called
  • inner_decorator is assigned as an attribute of wrapper, so a_function.decorator becomes a usable decorator
  • inner_decorator defines inner_wrapper and returns it, so inner_wrapper is executed when another_function is called

Decorators with Arguments

You may have noticed that up until now, the decorators created in this article did not include parentheses or arguments when attached to functions. This is because the decorated function is itself passed as the only argument to the function call.

But when registering functions against types, singledispatch included an argument.

In [13]:
@functools.singledispatch
def dispatched(arg):
    pass

@dispatched.register(str)
def _(arg):
    pass

Incredibly, the way to achieve this is to nest yet another function into the decorator.

That is because, really, register isn't a decorator. Instead, register is a function which returns a decorator when passed an argument.

Let's take our previous example and expand it to include this idea.

In [14]:
def outer_decorator(func):
    def faux_decorator_w_arg(arg):
        def actual_decorator(func):
            def inner_wrapper():
                print(f"Inner wrapper. arg was: {arg}")
            return inner_wrapper
        return actual_decorator
    def wrapper():
        print("The wrapper function.")
    wrapper.decorator = faux_decorator_w_arg
    return wrapper

@outer_decorator
def a_function():
    print("Original a_function.") # This will never execute.

@a_function.decorator("decorator_argument")
def another_function():
    print("Original another_function.") # This will never execute.
In [15]:
a_function()
The wrapper function.
In [16]:
another_function()
Inner wrapper. arg was: decorator_argument
In [15]:
The wrapper function.
In [16]:
Inner wrapper. arg was: decorator_argument

Putting it Together

So now we know how to create decorators that return decorators and that accept arguments. With this, plus a dictionary that maps registered values to functions, we can create a dispatch-on-value decorator.

In [17]:
def dispatch_on_value(func):
    """Value-dispatch function decorator.

    Transforms a function into a value-dispatch function,
    which can have different behaviors based on the value
    of the first argument.
    """
    registry = {}

    def dispatch(value):
        try:
            return registry[value]
        except KeyError:
            return func

    def register(value, func=None):
        if func is None:
            return lambda f: register(value, f)
        registry[value] = func
        return func

    def wrapper(*args, **kw):
        return dispatch(args[0])(*args, **kw)

    wrapper.register = register
    wrapper.dispatch = dispatch
    wrapper.registry = registry

    return wrapper
In [18]:
@dispatch_on_value
def react_to_status(status):
    print("Everything's fine.")

@react_to_status.register("red")
def _(status):
    # Red status is probably bad.
    # So we need lots of complicated code here to deal with it.
    print("Status is red.")
In [19]:
react_to_status("red")
Status is red.

There are a few things here which might not be obvious. So let's take a closer look.

def dispatch(value):
    try:
        return registry[value]
    except KeyError:
        return func

This is called by wrapper, and is the mechanism that determines which registered function is executed. It looks in the registry and returns the appropriate function (without executing it). If the value isn't registered, this will raise a KeyError. The except block catches that error and returns the original function.

def register(value, func=None):
    if func is None:
        return lambda f: register(value, f)
    registry[value] = func
    return func

This acts as both the faux_decorator_w_arg and the actual_decorator from the earlier example.

It can be called with one or two positional arguments; if the second one is omitted, it is set to None.

At @react_to_status.register("red"), it is being called with only the value argument. This causes the lambda expression to be returned, with value already bound. (That is, the return value is lambda f: register("red", f).)

This is the same as:

if func is None:
    def actual_decorator(func):
        return register(value, func)
    return actual_decorator

But the lambda expression is a bit easier to read, once you know what it is doing.

This returned lambda function is then the actual decorator, and is executed with the wrapped function as its one argument. The function argument is then passed to register, along with the value that was bound when the lambda was created.

Now register runs again, but this time it has both arguments. The if func is None is skipped, and the function is added to the registry with value as the key. The function is returned back to the point when the register decorator was called, but it gets assigned to the name _, because we never need to call it directly.
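A side effect of this design: because register accepts both arguments directly, it can also be used without decorator syntax at all. Here is a sketch reusing the decorator from In [17] (repeated so the cell runs standalone), with a hypothetical handle_orange function:

```python
def dispatch_on_value(func):
    # (Same decorator as In [17], repeated for self-containment.)
    registry = {}

    def dispatch(value):
        try:
            return registry[value]
        except KeyError:
            return func

    def register(value, func=None):
        if func is None:
            return lambda f: register(value, f)
        registry[value] = func
        return func

    def wrapper(*args, **kw):
        return dispatch(args[0])(*args, **kw)

    wrapper.register = register
    wrapper.dispatch = dispatch
    wrapper.registry = registry
    return wrapper

@dispatch_on_value
def react_to_status(status):
    print("Everything's fine.")

def handle_orange(status):
    print("Status is orange.")

# Both arguments supplied, so the lambda branch is skipped entirely:
react_to_status.register("orange", handle_orange)

react_to_status("orange")  # prints: Status is orange.
```

functools.singledispatch supports the same direct-call style of registration.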

def wrapper(*args, **kw):
        return dispatch(args[0])(*args, **kw)

This is the function that actually gets executed when react_to_status is called. It calls dispatch with the first argument (args[0]), which returns the appropriate function. The returned function is immediately called, with *args, **kw passed in. Any output from the function is returned to the caller of react_to_status, which completes the entire dispatch process.

Going Further

This tutorial looked at value dispatch in order to dig into how decorators work. It does not provide a complete implementation for a practical value dispatch decorator.

For example, in practice you'd probably want value dispatch to include:

  • values within a range
  • values in a collection
  • values matching a regular expression
  • values meeting criteria defined in a fitness or sorting function
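As a hedged sketch of how such extensions might look (illustrative code only, not any existing library): a registry of (predicate, function) pairs, checked in registration order, covers ranges, collections, and fitness functions in one mechanism.

```python
def dispatch_on_predicate(func):
    # Registry holds (predicate, implementation) pairs,
    # checked in registration order.
    registry = []

    def register(predicate, impl=None):
        if impl is None:
            return lambda f: register(predicate, f)
        registry.append((predicate, impl))
        return impl

    def wrapper(*args, **kw):
        for predicate, impl in registry:
            if predicate(args[0]):
                return impl(*args, **kw)
        return func(*args, **kw)  # default: the decorated function

    wrapper.register = register
    return wrapper

@dispatch_on_predicate
def describe(n):
    return "no match"

@describe.register(lambda n: n < 0)
def _(n):
    return "negative"

@describe.register(lambda n: n in range(100, 200))
def _(n):
    return "in the 100s"

describe(-5)   # -> "negative"
describe(150)  # -> "in the 100s"
describe(7)    # -> "no match"
```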

And we didn't even talk about additional functools features, like functools.wraps, that help with introspection and the other problems created by decorators.

For a more complete, production-ready implementation of this idea, see Dispatch on Value by minimind.


The final form of dispatch_on_value was based heavily on the Ouroboros implementation of functools.singledispatch.

Object-oriented vs. Functional Modelling of Musical Arithmetic in Python


I'm currently building a music theory library in Python, called Ophis.

In [1]:
import ophis

This is an attempt to create a utility that "understands" music theory and can manipulate musical information, to be used as a base for other applications. This would be handy for all sorts of things, from music theory educational apps to AI composition.

In this notebook, we'll look at how I originally implemented basic musical arithmetic in Ophis, the problems with that approach, and why I am moving from a classical to a functional design.

A Classical OOP Design

My first approach in implementing this was classically object oriented, and influenced by an essentially Platonic ontology.

The idea was that musical building blocks would be, as much as possible, similar to integers.

In [2]:
# A `Chroma` is the *idea* of a note letter name 
#     Example: "A" or "D FLAT"
# 35 chromae are initialized to constants on load, 
#   representing all 7 letter names, 
#   with sharps, flats, double sharps, and double flats.

ophis.wcs # Western Chroma Set, 
          # the complete list of all initialized chromae

One of the main ideas here is that there is one and only one representation of the idea of C SHARP or F NATURAL. Moreover, the chromae can be inspected, and know how to represent themselves.


Chromae also carry all the logic needed for musical manipulation and mathematical representation.


A Pitch is a Chroma with an octave designation. Using the special __call__ method on Chroma, and the __repr__ method on Pitch, I was able to make their interactive representation intuitive.

# in Chroma class:

def __call__(self, octave):
    return Pitch(self, octave)

# in Pitch class:

def __repr__(self):
    return str(self.chroma) + "(" + str(self.octave) + ")"
In [9]:
# The "standard Python" way to create a pitch.
ophis.Pitch(ophis.GFLAT, 2)
In [11]:
# The Ophis canonical way.
ophis.GFLAT(2)

Intervals (without octaves) and QualifiedIntervals (with octaves) have a similar relationship to each other as Chroma and Pitch.

Rather than initializing every possible musical interval, the qualities (major, minor, perfect, augmented, diminished) are initialized and callable, to create an intuitive API.

In [12]:
ophis.Major(2) # A Major second.
In [13]:
ophis.Perfect(4, 2) # A Perfect fourth, plus 2 octaves.

Function caching is used to ensure that only one of any interval is created. (Some experimental benchmarking showed that this would matter in large scores.)

In [14]:
id(ophis.minor(2).augmented()) == id(ophis.Major(2))
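The caching mechanism itself isn't shown above; a minimal sketch of the idea using functools.lru_cache (hypothetical names, not Ophis's actual code):

```python
import functools

@functools.lru_cache(maxsize=None)
def make_interval(quality, number):
    # Stand-in for an Interval constructor. Identical arguments
    # always return the same cached object, so identity checks pass.
    return (quality, number)

make_interval("Major", 2) is make_interval("Major", 2)  # -> True
```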

And, of course, you can use both types of intervals to manipulate chromae and pitches.

In [15]:
ophis.G + ophis.Major(2)
In [16]:
ophis.A(2) + ophis.Perfect(5)
In [17]:
ophis.FSHARP(1) + ophis.Major(2, 2)

All this lets you do complicated musical manipulation and representation.

In [18]:
(ophis.FFLAT + ophis.Perfect(5)).diminish().unicode

Obviously, all this is only the beginning of what is needed for a music theory library. But it is a beginning. The next submodule will build up Duration and TimeSignature, leading to the creation of Measure and eventually Score. My current plan is to use pandas.DataFrame for multi-voice scores, as that would allow cross-voice analysis in a way that multi-dimensional lists would not.

Problems Appear

So that's great, but...

I can't help but wonder if some of this is overwrought.

A number of interrelated concerns occurred to me while working on this implementation.

Logic is hard to reason about

The math of moving from note to note is riddled with off-by-one and modulo arithmetic problems.

  • An interval representing no change (from a note to itself) is called a unison, represented with a 1. A difference of one step is called a second, and so on.
  • The first scale degree is 1. (Not zero indexed.)
  • We frequently think about scales as having eight notes, but in reality they only have seven. When this is zero indexed, the notes go from 0-6. This is fine for arithmetic, but when thinking as a musician it is jarring.

Because of this difficulty in clear thinking on my part, I often found myself using the guess-and-check method for remembering when to add or subtract a one.

I wrote rigorous tests along the way to keep these errors out, so everything ends up fine in the end. However, this made for slow and sometimes demoralizing progress, and I would hate to have to go back and reason about this code after being away from it.
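For a feel of the arithmetic involved, here is a tiny illustrative sketch (not Ophis code) of moving a letter name up by an interval number, with the off-by-one correction and the modulo wrap both visible:

```python
# Zero-indexed letters: C=0, D=1, ... B=6.
LETTERS = ["C", "D", "E", "F", "G", "A", "B"]

def letter_up(letter, interval_number):
    # A "fifth" spans only four letter steps, so subtract
    # one before adding; wrap around the 7-letter cycle.
    start = LETTERS.index(letter)
    return LETTERS[(start + interval_number - 1) % 7]

letter_up("D", 5)  # D up a fifth -> "A"
letter_up("B", 2)  # B up a second wraps around -> "C"
```

Forget either the `- 1` or the `% 7` and the result is silently wrong, which is exactly the class of bug described above.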

Incorrect assumptions about logical order

The first attempt to implement basic Chroma functionality assumed that Interval — the relationship between two chromae — would depend on Chroma. It turns out this is exactly backwards. Interval is logically prior to Chroma. There is no way to define abstract named pitches without their relationships already existing.

Practically speaking, discovering this simply meant I had to re-order some code. But this challenged my thinking about what the fundamental building blocks of music theory actually are.

Convoluted logic and utility data structures

Here's an example, the augment method from the Chroma class.

def augment(self, magnitude=1, modifier_preference="sharp"):
    """Return a chroma higher than the one given.

    Args:
        magnitude (:obj:`int`, :obj:`Interval`,
                   or obj with an ``int`` value; optional):
            the distance to augment by.
            Integer values are interpreted as half steps.
            Defaults to 1.
        modifier_preference (:obj:`str`,
                             ``'sharp'`` or ``'flat'``; optional):
            Defaults to ``'sharp'``.

    Examples:
        >>> C.augment()

        >>> C.augment(1, 'flat')

        >>> C.augment(minor(3))

        >>> D.augment(2)

        >>> E.augment()

        >>> E.augment(2, 'flat')
    """
    value_candidates = self.essential_set.chroma_by_value(
        int(self) + int(magnitude))
    try:
        letter_candidates = self.essential_set.chroma_by_letter(
            self.base_num + magnitude.distance)
        solution, = value_candidates & letter_candidates
        return solution
    except (AttributeError, ValueError):
        return value_candidates.enharmonic_reduce(modifier_preference)

If it isn't obvious, here's what it does:

  • Calculate the integer value of the target Chroma and find the set of Chroma objects which have the integer value we're looking for.
  • Try:
    • Calculate the letter name of the target Chroma and find the set of Chroma that have the name value we're looking for.
    • Find and return the intersection of the integer-value set and the note-name value set.
  • Except:
    • Return a member of the integer-value set, basing the selection on some logic (defined elsewhere) that prefers sharps to flats or flats to sharps in certain instances.

This works, but it isn't at all how a musician thinks about this operation. Moreover, it depends on the essential_set, the collection of all initialized chromae. (Referred to above as wcs, the Western Chroma Set.) It would be bad enough if this was just used to keep the pool of initialized chromae, so that methods returning C Sharp always returned the same C Sharp. But it doesn't just do that. An inordinate amount of musical knowledge and logic crept into the ChromaSet class that defines the essential_set. While I'm positive that some of this is due to bad coding on my part, I think the bulk of it is due to bad conceptualization.

The final problem with this is that it is non-obvious. This code is hard to read and reason about, because it isn't clear what is actually happening.

Fragile Primitives

Python doesn't really allow you to protect object attributes or module-level constants. There are some things you can do to ensure object attributes aren't reassigned accidentally (and I've done them), but (as far as I can tell) module-level constants cannot be protected.

This is a problem, since the fundamental building blocks of music theory in the current implementation are initialized as constants. The object representing C Sharp is created and assigned to the name CSHARP. If that name gets reassigned, you are basically hosed. This could lead to hard-to-trace errors and frustrating interactive sessions.
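For object attributes, one common guard (a sketch of the general technique, not necessarily what Ophis does) is a property with no setter:

```python
class Chroma:
    def __init__(self, name):
        self._name = name  # "private" by convention

    @property
    def name(self):
        # Read-only: with no setter defined, assignment to
        # c.name raises AttributeError.
        return self._name

c = Chroma("C sharp")
c.name        # -> "C sharp"
# c.name = "D"  # would raise AttributeError
```

Nothing comparable exists for module-level names: CSHARP = "oops" succeeds silently in any module that can see the constant.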

Poor isomorphism to numbers

One of the design goals of Ophis is to be able to treat musical concepts as numbers. That's why the arithmetic operators are implemented and everything has an integer value. I wanted it to be easy for math utilities to operate on pitches and intervals. This would enable things like advanced theoretical analysis and machine learning.

But, they aren't numbers. They just aren't.

You can't (meaningfully) have a Chroma with a float, decimal, or fractional value. This means that microtones are not presently accounted for and will require an extension, the logic of which I can only guess at.

You also can't meaningfully multiply or divide values. Offhand, I'm not sure why you would want to do so, but I can imagine approaches to musical analysis where it would be needed.

Further, even with supported integer-based operations, using any standard math tool requires notes from a score to be converted into numbers, manipulated or analyzed, and converted back. There's no direct access to Ophis "primitives" in NumPy, scikit-learn, or anything else.

These problems piled up over time as I implemented the basic logic and worked out the implications. Technical debt accumulates through a process of small compromises and justifications. By the time I became aware of the scope of the problem, I had two thoughts:

  • Re-architecting everything would take too long to be worthwhile. I would probably get disheartened and give up.
  • I can refactor the internals in the future to make things a bit clearer and cleaner. In the meantime, good documentation would make the code maintainable.

So, my plan was to just keep moving. But then, thinking about the isomorphism problem, I realized another poorly-mapped isomorphism.

Poor isomorphism between Chroma and Intervals

Or really, no explicit isomorphism at all. And this is a problem because these are really the same thing.

I had implemented the Interval class, and written all the logic for how intervals are inverted, augmented, and diminished. This requires understanding of the interplay between interval distances (third, fourth, sixth) and their qualities (Major, minor, Perfect), and how many half-steps each are. And of course there's that zero-indexing stuff to think about (second = 1, third = 2).

Then I implemented the Chroma class, and wrote almost the same logic (but just a bit different) for how pitches are augmented and diminished (pitches aren't inverted). This requires an understanding of the interplay between note letter names (C, D, E), how those letter names map to a zero-indexed numerical representation (C = 0, D = 1, E = 2), and how modifiers like sharp and flat affect the total number of halfsteps from C (the origin point in modern music theory).

But these are, as I said, exactly the same thing.

Every note can be represented as an interval from C. And not only can it be represented that way, but that is exactly how it was already defined. There is no other reasonable way to (numerically) define notes.

Here's an example in case this isn't clear:

  • E Natural is a Major Third away from C.
  • In our zero-indexed representation of intervals, a Third is 2.
  • A Major Third is 4 half-steps.

Those two numbers, (2, 4), are an integral part of the definition of E Natural; without them, you can't do any of the manipulation that makes the Chroma meaningful.

In [19]:
print(ophis.Major(3).distance, int(ophis.Major(3)))
print(ophis.E.base_num, int(ophis.E))
2 4
2 4

Obviously, this holds for every other Chroma as well.

In [20]:
print(ophis.Perfect(5).distance, int(ophis.Perfect(5)))
print(ophis.G.base_num, int(ophis.G))
4 7
4 7
In [21]:
print(ophis.Augmented(6).distance, int(ophis.Augmented(6)))
print(ophis.ASHARP.base_num, int(ophis.ASHARP))
5 10
5 10

Further, it turns out that these two numbers are the only things you need to know in order to do any standard musical manipulation you might want to do.

In [22]:
g_or_p5 = (4,7) # Tuple representing G or a Perfect Fifth
e_or_maj3 = (2,4) # Tuple representing E or a Major Third

# Add tuples element wise.
sum_of_tuples = (
    g_or_p5[0] + e_or_maj3[0],
    g_or_p5[1] + e_or_maj3[1]
)

sum_of_tuples # (6,11)
(6, 11)
In [23]:
g_augmented_by_maj3 = ophis.G.augment(ophis.Major(3))
e_augmented_by_p5 = ophis.E.augment(ophis.Perfect(5))

In [24]:
print(ophis.B.base_num, int(ophis.B)) 
6 11
In [25]:
z = ophis.Perfect(5) + ophis.Major(3)
In [26]:
print(z.distance, int(z)) 
6 11

So any chroma and any interval can be represented by a two-tuple, while manipulations originally implemented as methods in different classes can be a unified set of pure functions that accept tuples as arguments.


But two-tuples don't provide all the additional information you need to notate pitches or otherwise make them understandable as music.

So we need some "translation" functions. This still involves a lot of "magic number" coding, but hopefully it can be condensed into a small set of mappings that are easy to reason about.

In [27]:
import bidict # efficient two-way indexing of dicts

primary_map = [
    # half steps, scale degree / interval number,
    # interval name, letter name, Perfect?
    #                             (False=Major)
    (0,  1, "unison",  "C", True),  #0
    (2,  2, "second",  "D", False), #1
    (4,  3, "third",   "E", False), #2
    (5,  4, "fourth",  "F", True),  #3
    (7,  5, "fifth",   "G", True),  #4
    (9,  6, "sixth",   "A", False), #5
    (11, 7, "seventh", "B", False)  #6
]

# Split primary map into bidicts for each value.
#     For faster, more sensible referencing.
#     This feels wrong and I need a better way.
#     Maybe something with named tuple...

hs_map = bidict.bidict(
    {x:item[0] for x, item in enumerate(primary_map)})
interval_map = bidict.bidict(
    {x:item[1] for x, item in enumerate(primary_map)})
interval_name_map = bidict.bidict(
    {x:item[2] for x, item in enumerate(primary_map)})
name_map = bidict.bidict(
    {x:item[3] for x, item in enumerate(primary_map)})
quality_map = {
    x:item[4] for x, item in enumerate(primary_map)}

# How to translate between
# diatonic intervals and modified intervals.
interval_quality_map = {
    True: bidict.bidict({ # Diatonic is Perfect
        -2 : 'double diminished',
        -1 : 'diminished',
         0 : 'Perfect',
         1 : 'Augmented',
         2 : 'Double Augmented'
    }),
    False: bidict.bidict({ # Diatonic is Major
        -2 : 'diminished',
        -1 : 'minor',
         0 : 'Major',
         1 : 'Augmented',
         2 : 'Double Augmented'
    })
}

modifiers = bidict.bidict({
    -2 : 'doubleflat',
    -1 : 'flat',
     0 : 'natural',
     1 : 'sharp',
     2 : 'doublesharp'
})
In [28]:
import functools

# Single Dispatch:
#     two functions with the same signature.
#     The type of the first argument determines
#     which function is executed.
#     This way, if a tuple is passed in,
#         a string is returned,
#     and if a string is passed in,
#         a tuple is returned.

@functools.singledispatch
def chroma(x):
    return None

@chroma.register(tuple)
def _(xy):
    x,y = xy
    name = name_map[x]
    modifier = modifiers[y - hs_map[x]]
    return " ".join([name, modifier])

@chroma.register(str)
def _(letter, modifier):
    x = name_map.inv[letter]
    mod_diff = modifiers.inv[modifier]
    y = hs_map[x] + mod_diff
    return x,y

@functools.singledispatch
def interval(x):
    return None

@interval.register(tuple)
def _(xy):
    x,y = xy
    name = interval_name_map[x]
    q_mod = y - hs_map[x]
    q = interval_quality_map[quality_map[x]][q_mod]
    return " ".join([q, name])

@interval.register(str)
def _(q, n): # quality, number
    x = n - 1
    is_perfect = quality_map[x]
    q_mod = interval_quality_map[is_perfect].inv[q]
    y = hs_map[x] + q_mod
    return x, y

def augment(a, b):
    return tuple(map(sum,zip(a,b)))

def diminish(a, b):
    return tuple(y - b[x] for x,y in enumerate(a))
In [29]:
chroma((0, 0))
'C natural'
In [30]:
interval((2, 2))
'diminished third'
In [31]:
chroma('D', 'sharp')
(1, 3)
In [32]:
interval('Major', 3)
(2, 4)
In [33]:
chroma(augment(chroma('C', 'sharp'), interval('minor', 3)))
'E natural'
In [34]:
%timeit chroma(
    augment(chroma('C', 'sharp'), interval('minor', 3)))
22.3 µs ± 1.4 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [36]:
%timeit ophis.CSHARP.augment(ophis.minor(3))
101 µs ± 3.95 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

A functional approach:

  • simplifies the math and logic
  • preserves important isomorphisms
  • requires much less code
  • executes much faster

The only downside is that the API for interactive use is a little less elegant, but not so much as to be a problem.

Where to Go From Here

The quick functional implementation demonstrated here doesn't include all the things that the OO approach currently has.

Foremost, this version needs to include modulo arithmetic.

In [37]:
augment((7,11),(1,1)) # Should be (0,0), in musical logic.
(8, 12)
In [38]:
# This result is meaningless.
chroma((8,12))  # KeyError
KeyError                                  Traceback (most recent call last)
<ipython-input-38-114c113c80ec> in <module>()
      1 # This result is meaningless.
----> 2 chroma((8,12))

/Users/adamwood/amwenv/lib/python3.5/ in wrapper(*args, **kw)
    742     def wrapper(*args, **kw):
--> 743         return dispatch(args[0].__class__)(*args, **kw)
    745     registry[object] = func

<ipython-input-28-f2a6c2b4cc47> in _(xy)
     13 def _(xy):
     14     x,y = xy
---> 15     name = name_map[x]
     16     modifier = modifiers[y - hs_map[x]]
     17     return " ".join([name, modifier])

/Users/adamwood/amwenv/lib/python3.5/site-packages/bidict/ in proxy(self, *args)
     94         attr = getattr(self, attrname)
     95         meth = getattr(attr, methodname)
---> 96         return meth(*args)
     97     proxy.__name__ = methodname
     98     proxy.__doc__ = doc or "Like dict's ``%s``." % methodname

KeyError: 8
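One way to get the missing modulo behavior is to wrap both terms: note names wrap mod 7, half steps wrap mod 12. A rough sketch of the idea (`augment_mod` is a hypothetical helper, not part of the library):

```python
def augment_mod(a, b, moduli=(7, 12)):
    # Add term-by-term, wrapping each term at its modulus:
    # 7 note names, 12 half steps.
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))

# B (6, 11) augmented by a minor second (1, 1) wraps around to C:
augment_mod((6, 11), (1, 1))  # → (0, 0)
```

The translation functions would then always receive in-range tuples, avoiding the KeyError above.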

Additionally, I need to include octave designations. The arithmetic is almost included for free with the functional approach, but the translation functions don't support it.

In [39]:
# Maj Second (or D) and min 3, with a third term for octave designation
augment((1,2,1), (2,3,2)) 
(3, 5, 3)
In [40]:
# F, 2 octaves above Middle C
chroma((3,5,3)) # ValueError
ValueError                                Traceback (most recent call last)
<ipython-input-40-3c4313b92e77> in <module>()
----> 1 chroma((3,5,3)) # F, 2 octaves above Middle C

/Users/adamwood/amwenv/lib/python3.5/ in wrapper(*args, **kw)
    742     def wrapper(*args, **kw):
--> 743         return dispatch(args[0].__class__)(*args, **kw)
    745     registry[object] = func

<ipython-input-28-f2a6c2b4cc47> in _(xy)
     12 @chroma.register(tuple)
     13 def _(xy):
---> 14     x,y = xy
     15     name = name_map[x]
     16     modifier = modifiers[y - hs_map[x]]

ValueError: too many values to unpack (expected 2)
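One possible shape for the octave-aware arithmetic is to let overflow in the half-step term carry into the octave term. A sketch under that assumption (`augment_oct` is a made-up name, not Ophis code):

```python
def augment_oct(a, b):
    # Tuples are (name index, half steps, octave); overflow past
    # 12 half steps carries into the octave term.
    name = (a[0] + b[0]) % 7
    hs = a[1] + b[1]
    octave = a[2] + b[2] + hs // 12
    return (name, hs % 12, octave)

augment_oct((1, 2, 1), (2, 3, 2))  # → (3, 5, 3)
```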

The translation functions and associated dictionaries need to be extended to include multiple representations such as Unicode, ASCII, and Lilypond.

Finally, I might also do some experimentation with a hybrid approach that would keep the OO API intact. However, that might get too complicated.

Is Functional Really Better?

I don't know.

The difference in execution speed can probably be resolved by cleaning up the OO implementation and importing the tuple-based arithmetic instead of the union-of-sets approach. I suspect that this would also make the OO code easier to read and reason about.

I'm not at all convinced that there is something inherently wrong with object oriented code. However, I do think that a classical paradigm promotes a "different things are different things" mentality. In some cases this is probably helpful. In music, at least in this case, it obscured a fundamental sameness between two important concepts. Functional programming forced me to recognize that sameness. Someone else might have recognized it anyway, and written a good OO implementation first.

If there is a generalizable lesson here, I think it might be this: think through a few different approaches and paradigms. Try coding up more than one logical implementation in the domain. See how that sheds light on the underlying problems and data structures.

Also, don't be afraid to redesign things. Especially if you don't have any users yet.

Registering Functions Against Object Methods in Python

My big side project right now is a music theory library in Python, called Ophis. Among many other concerns, I'm trying to make the API as natural and easy to use as possible. This often means finding ways of creating objects other than ClassName(args).

Ophis has the classes Chroma and Pitch. A chroma is a note name without an octave (the idea of C), while a pitch is a chroma with a specified octave (middle C).

The problem with this is that the conventional way of referring to a pitch would then be:

ophis.Pitch(ophis.C, 0)

As you can see, Ophis has already initialized all the note names (chromae) you would need. We could do the same with pitches...

C0 = Pitch(C, 0)
C1 = Pitch(C, 1)

# later, in user code...


...but I think we all know the problem with that. It requires initializing several hundred pitch objects that may never be used. Most songs don't use every note. And every physical note has multiple names because of enharmonic spelling (F♯ == G♭).

So, what if the API looked like this?

ophis.C(0)

That's cool. Pretty easy to do, too.

class Chroma:

  # ...

  def __call__(self, octave):
    return Pitch(self, octave)
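A minimal, self-contained sketch of that idea (stand-in classes, not the real Ophis implementations):

```python
class Pitch:
    def __init__(self, chroma, octave):
        self.chroma = chroma
        self.octave = octave

class Chroma:
    def __init__(self, name):
        self.name = name

    def __call__(self, octave):
        # Calling a chroma produces a Pitch at that octave.
        return Pitch(self, octave)

C = Chroma("C")
middle_c = C(0)
# middle_c.chroma is C; middle_c.octave is 0
```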

What if we went deeper?

Once you realize this is a good idea, the next thing you realize is.... what about chords?

ophis.Chord(ophis.C, Major)

Well, that looks pretty similar, doesn't it?

So, um... okay...

class Chroma:

    def __call__(self, x):
        try:
            return Pitch(self, x)
        except TypeError:
            return Chord(self, x)

There are problems with this.

  • Definitions for Pitch and Chord are in modules that get loaded after Chroma. This doesn't create any errors (because the function isn't run on load), but still feels wrong.
  • It is brittle. If I change the name of Pitch or Chord, I have to go back and change it here. The tightly-wound nature of music terminology means I have long-since given up the idea of loose coupling, but I'm trying to make these types of dependencies only go up the conceptual ladder, not back down it.
  • What if I want to add more things to this method? Eventually I'm going to end up creating a series of type checks.

When I was working through this, I didn't see any way around a series of type checks, but I thought I could solve the first two problems with some creative coding.

I decided I could register functions into a dict, stored on the class. The keys for the dict would be types, and the values would be the functions to run when __call__ is called with that particular type as an argument. These functions could be registered at the point when the type that the function is supposed to return is created.

Something like...

class Chroma:

    _callable_funcs = dict()

    def __call__(self, x, x_type=None):

        # Register x as the handler for x_type if it is callable;
        # otherwise, dispatch on the type of x.
        if callable(x):
            self.__class__._callable_funcs[x_type] = x
        else:
            return self.__class__._callable_funcs[type(x)](self, x)

# This code has not been tested.
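Filled in, a working version of that sketch might look like this (the handler and names here are illustrative, not Ophis code):

```python
class Chroma:
    _callable_funcs = dict()

    def __call__(self, x, x_type=None):
        # Callables get registered as the handler for x_type;
        # anything else is dispatched on its own type.
        if callable(x):
            self.__class__._callable_funcs[x_type] = x
        else:
            return self.__class__._callable_funcs[type(x)](self, x)

c = Chroma()
# Register a handler for int arguments, at the point where the
# target type would be defined:
c(lambda self, octave: ("pitch", octave), x_type=int)
c(4)  # → ("pitch", 4)
```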

I got (a version of) this to work, and I was feeling pretty darn proud of myself for thinking of this solution, and implementing it.

Then I had this feeling like this was all very familiar. Maybe I had read about this type of thing?

I quickly discovered three things:

Unfortunately, I have two problems:

  • The @singledispatch decorator only looks at the first argument of a function call. The first argument of a method call is always self. So, out of the box, this doesn't work for instance methods.
  • @singledispatch was added in v3.4, making it still a little newish. Since I'm writing a utility library for others to use, and not my own application, it seems unwise to rely on something that everyone might not have.

But, now I can do two things:

  • See if anyone has already figured out a way to apply @singledispatch to a method. (Someone has.)
  • Potentially re-implement @singledispatch myself, for backwards compatibility.



from functools import singledispatch, update_wrapper
# A re-implementation of @singledispatch
# has been left as an exercise for the reader
# because I haven't done one yet.

def method_dispatch(func):
    """An extension of functools.singledispatch,
    which looks at the argument after self."""
    dispatcher = singledispatch(func)
    def wrapper(*args, **kw):
        return dispatcher.dispatch(args[1].__class__)(*args, **kw)
    wrapper.register = dispatcher.register
    update_wrapper(wrapper, func)
    return wrapper


class Chroma():

    @method_dispatch
    def __call__(self, x):
        return self

import chroma as ch

class Pitch:

    def __init__(self, chroma, octave=0):
        self.chroma = chroma
        self.octave = int(octave)

ch.Chroma.__call__.register(int, Pitch)

# In user code:

ophis.C(0) == ophis.Pitch(ophis.C, 0)
# True
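Outside of Ophis, the same pattern can be checked with a toy class (the names here are purely illustrative):

```python
from functools import singledispatch, update_wrapper

def method_dispatch(func):
    # Like singledispatch, but dispatch on the argument after self.
    dispatcher = singledispatch(func)
    def wrapper(*args, **kw):
        return dispatcher.dispatch(args[1].__class__)(*args, **kw)
    wrapper.register = dispatcher.register
    update_wrapper(wrapper, func)
    return wrapper

class Greeter:
    @method_dispatch
    def greet(self, x):
        return "hello, whatever that is"

    @greet.register(int)
    def _(self, x):
        return "hello, number %d" % x

g = Greeter()
g.greet(3)      # → 'hello, number 3'
g.greet("bob")  # → 'hello, whatever that is'
```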

And finally, to encourage this usage...

class Pitch:

    # ...

    def __repr__(self):
        return "".join([
            self.chroma.__repr__(), "(",
            self.octave.__repr__(), ")",
        ])

# At a terminal...

>>> ophis.Pitch(ophis.C, 0)
C(0)

Feels Pythonic, yes?

Intersection of Non-Empty Sets in Python

Suppose you generate several sets on the fly, and you want to find the elements that are in all the sets. That's easy: it's the intersection of sets.

# One syntax option
result = set_one & set_two & set_three

# Another option
result = set.intersection(set_one, set_two, set_three)

But let's suppose that one or more of your sets is empty. The intersection of any set and an empty set is an empty set. But, that's not what you want. (Well, it wasn't what I wanted, anyway.)

Suppose you want the intersection of all non-empty sets.

List comprehension

If the sets are in a list, you can remove the empties. Then unpack the list into the set.intersection() function.

list_of_sets = [set_one, set_two, set_three]

# Empty sets evaluate to false,
# so will be excluded from list comp.
non_empties = [x for x in list_of_sets if x]

solution_set = set.intersection(*non_empties)

The asterisk before non_empties unpacks the list into a series of positional arguments. This is needed because set.intersection() takes an arbitrary number of sets, not an iterable full of sets. (It's the same asterisk as in *args in function definitions.)

(Note: You could use a filter instead of a list comprehension, but Guido thinks a list comprehension is better. I agree.)

With iterable unpacking (tuple unpacking)

In my case, I was generating the sets in my code, and the solution set always contained only one item. And I wanted the item, not a set with the item. So...

# initialize an empty list
list_of_sets = []

# each time I create a set,
# append set to list when it is created,
# instead of naming them individually
list_of_sets.append( thing_that_generates_a_set() )

# drop the empties, find the intersection
# and unpack the remaining single element
solution, = set.intersection(*[x for x in list_of_sets if x])

The comma after solution turns the assignment into a tuple unpacking. If you unpack a collection of one, you get the single item.

By the way, if you end up with more than one item in your collection, and only want the first item, you can do:

first_item, *_ = some_collection

The * indicates a variable number of positional arguments (it's the same asterisk as in *args and in passing the list to set.intersection() above), and the underscore is used as a convention for "not using this stuff."

# you could have done this instead

first_item, *stuff_i_will_not_care_about = some_collection

I'll be using that *_ below, in the actual code.

Why would you ever do this?

The generalized problem

From a pool of items, there are three attributes to select for. Specifying any two of them should produce one and only one result.

More specifically...

Musical intervals.

A musical interval has:

  • a quality (Major, Minor, Perfect, Augment, or Diminished)
  • a number (Unison (1), Second (2), Third (3) ... Octave (8))
  • a distance of half_steps (for example, a major third is 4 half steps)

If you know any two of these, you can select the correct one.

Some actual code

class Interval():

  # ... all sorts of things removed ...

  instances = set()
  # all instances of Interval

  @classmethod
  def get_intervals(cls, *, quality=None, number=None, half_steps=None):
      """Return a set of intervals."""

      candidate_sets = []

      candidate_sets.append({x for x in cls.instances if x.quality == quality})

      candidate_sets.append({x for x in cls.instances if x.number == number})

      candidate_sets.append({x for x in cls.instances if x.half_steps == half_steps})

      candidate_sets = [x for x in candidate_sets if len(x) > 0]

      return set.intersection(*candidate_sets)

  @classmethod
  def get_interval(cls, quality=None, number=None, half_steps=None):
      """Return a single interval."""

      try:
          interval, = cls.get_intervals(quality=quality, number=number, half_steps=half_steps)

      # if there was not one and only one result
      except ValueError:

          # only select by half_steps
          candidates = [x for x in cls.instances if half_steps == x.half_steps]

          # select the first one,
          # based on quality priority:
          # Perfect, Major, Minor, Dim, Aug
          interval, *_ = sorted(candidates, key=lambda x: x.quality.priority)

      return interval

In the actual code, there's a bunch of other things going on, but this is the general idea.
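To see the general idea in isolation, here is a toy, runnable version with a simplified interval type (the names and data are made up for illustration):

```python
from collections import namedtuple

Iv = namedtuple("Iv", "quality number half_steps")

instances = {
    Iv("major", 3, 4),
    Iv("minor", 3, 3),
    Iv("perfect", 5, 7),
}

def get_interval(quality=None, number=None, half_steps=None):
    candidate_sets = [
        {x for x in instances if x.quality == quality},
        {x for x in instances if x.number == number},
        {x for x in instances if x.half_steps == half_steps},
    ]
    # Drop the empty sets, intersect the rest,
    # and unpack the single remaining element.
    interval, = set.intersection(*[s for s in candidate_sets if s])
    return interval

get_interval(quality="major", number=3)  # → Iv('major', 3, 4)
```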

Another approach

For my specific use case, another approach is simply to not create a set for the unspecified attribute.

if quality is not None:
    candidate_sets.append({x for x in cls.instances if x.quality == quality})

if number is not None:
    candidate_sets.append({x for x in cls.instances if x.number == number})

if half_steps is not None:
    candidate_sets.append({x for x in cls.instances if x.half_steps == half_steps})

In my working code, I actually do both. This allows for a potentially meaningful result even if something is specified incorrectly. I could have decided to let bad input cause explicit failure, but I think I'd rather not in this case.

So... what's the point?

This post looks like a tutorial on list comprehension. Or maybe set operations. But really this post is about problem solving while writing code.

The code solution to this problem is really easy... but only if you've figured out the problem you need to solve.

I started with the following problem:

Find the intersection of all non-empty sets, from an arbitrary pool of sets, not knowing which ones would be empty.

So I started Googling variations on that theme. But there aren't any "intersection of just the good sets" functions. Then I tried to start writing a question for Stack Overflow, and as soon as I had written the title, I knew the answer.

Starting with a collection of sets, drop the empty sets and find the intersection of the remaining sets.

As soon as I broke my one problem into two steps, the problem was immediately solved:

  1. Create a new collection without the empties. (List comp.)
  2. Find the intersection of that list.

At the same moment I realized these steps, it also became clear that the original group of sets should be a collection, not just several unrelated objects.

So, the moral of the story is...

If you can't find the solution to your specific problem, restate your problem as a series of steps.


I hate the word verbiage.

First, we need to deal with the fact that it is the wrong word. Most of the time, when people say verbiage, they really mean verbage --- that is, the wording. Verbiage, properly, means excessive wordiness, not the specifics of word choice.

But this isn't what I hate about it. I would hate it just as much if it meant precisely what everyone uses it to mean. My problem is with the idea itself. I hate what people are saying when they say verbiage.

Every time I have ever heard the word verbiage, the person has been talking about the precise way that something is worded. The context is always about improving something.

  • Can you make this more clear by fixing up the verbiage?
  • After you get the first draft of the design done, ask Adam to help you clean up the verbiage.
  • Maybe we can change the verbiage on this form to make it more user friendly.

Without fail, a request to work on the verbiage is symptomatic of a deeply flawed design and engineering process. We got to this point because people were decorating, not designing, and now we are going to try to get out of it by changing the words the user sees.

This causes more problems, of course.

The reason the words aren't clear and precise in the UI is that the mental model developed by the engineering team is either confused or just plain wrong. In order to make the application easy to use, our Verbiage Specialist has to overlay a new mental model --- often, the one that should have been used in the first place. This new mental model, and the collection of verbages that go with it, will be imprecise and incomplete because the Verbiage Engineering Team can't tell the developers to restructure the database and rename all the application's variables. The result is that the UI becomes temporarily easier to use, but at the cost of taking on additional Verbiage Debt. Somewhere deep in an internal wiki or Confluence page is an OVM (Object-Verbiage Mapper) glossary telling you that dev:event_property => user:"Device Status". But nobody reads internal wiki pages, so the problem just gets worse.

You cannot fix an application by redecorating the UI. Fixing the verbiage is just redecorating + technical debt. If you find yourself fixing up the verbiage, the problems are much deeper.

So how do you avoid Verbiage Debt?

Stop treating writers as Verbiage Technicians and think of them as Verbiage Architects. (I'm sure there's a good word for this already.) Your Verbiage Team, along with your Pictures of Things Engineers, need to be involved from the beginning with the design of your application, and they need to be fully-fledged members of the engineering team --- not hired hands, consultants, helpers, or otherwise after-the-facters.

Building software has more to do with creating mental models than it does with writing code. Humans create mental models in language and pictures.

Your language and pictures people are as important as your coders.

Designing vs. Decorating

My wife spent some time in an Interior Design master's degree program. One of the things that frequently frustrated her was the conflation, by people outside the industry, of interior design and interior decorating.

  • "Oh, so like, you're learning how to pick out furniture and stuff."
  • "Can you help me pick paint colors in my bedroom?"
  • "That's cool, like that show on HGTV."

Decorating is primarily about aesthetics --- how things look. Design is about function --- how things work. There is certainly overlap between the professions, but their focus and concern is very different.

At least, though, nearly everyone in the industry --- and certainly everyone at her school --- understood the difference. Since my wife was there, the school has actually changed the name of the program to Interior Architecture, to make the focus clearer.

I'm not sure the software industry as a whole understands the difference between decorating and design. Part of the problem is that we don't use the word "decorator" to describe people with graphics skills and no sense of the underlying software. Everyone is a "designer." The best we have done is to try to make distinctions between "UX Design" and "Graphic Design."

In fact, I think the push in the last decade or so to use the word "UX" is an attempt to make the distinction. Unfortunately, I don't think it has helped. Like Tech Writers calling themselves "Documentation Specialists," the change in label has been driven as much by a desire for a cooler resume as by any real change in practices. The distinction we need to make is not between "graphics" and "UX," and certainly not between "UX" and "UI" (as if those are, you know, actually different things, really). The distinction we need to make is between design and decoration.

Have you ever sat in a redesign review that solved exactly none of the problems of the original design? The new thing looks better, but it functions the same. Decorating.

Have you ever been involved in a process where some non-engineer Product Manager drew pictures of screens and buttons, and then someone with Photoshop skills and no coding experience turned that into a mockup? Decorating.

Have you ever been asked, after the graphics person has completed an entire set of screen mockups, to "help with some of the verbiage" in order to make things more clear? Decorating.

Any process that separates out the work of contributors --- first the engineers do something and then hand it off to the graphics person and then the tech writer writes about it later --- will tend toward decorating. Design requires people to actually talk to each other, preferably in the same room. Design requires that a person drawing and labelling a form input understand the conceptual model the form is interacting with.

I suggest we stop futzing with labels for types of people and buzzwords that feel helpful but aren't. This problem cannot be solved by finding an even cooler replacement word for "UX," and then blogging about how "UX is dead, we're doing XZ now." Just keep "design" and "decoration" in your head as an evaluative tool. Look at how things are being done and ask yourself --- is this designing or is it decorating? Then, if there's too much decorating, don't spend a lot of energy convincing people about the difference. Just begin to change the process.

And don't let someone with Photoshop skills redesign an app they don't understand and have never used.

Docs as Code

The core practices:

  • Docs are written in plain text formats such as Markdown or reStructured Text.
  • Docs are stored as flat files, not database entries.
  • Docs are authored in a code editor of the writer's choice, not a monolithic authoring application.
  • Docs are kept under version control.
  • Doc versions are organized in parallel to product versions.
  • Docs are built and deployed from source in an automated process that mirrors product deployment.
  • Docs are automatically tested for internal consistency and compliance to style guides.
  • Whenever reasonable, writers use the same tools and processes as developers.
  • Writers are integrated into the development team.

The benefits:
  • Writers have more control over their authoring environment.
  • Less friction in the authoring process.
  • Elimination of inconsistencies between docs and product.
  • Less need for human proofreading.
  • Coordinated releases of docs with product.
  • Developers are more likely to contribute to docs.
  • Writers and developers have more awareness of and respect for each others' work.
  • Authoring and deployment tools are mostly free; hosting requires less overhead.

DocOps Isn't Just the Fun Part

Somewhere in the last year I decided I was into DocOps.

What that really meant for me is that I am into Docs-as-code, which is a related trend, but not quite the same. I care about things like single-source documents (DRY), version control, plain text editing, style linting, and automated deployment. I write little Python or Bash scripts to pipe tools together and customize the output of static site generators. I'm learning a lot, having a lot of fun, and finally weaving together a number of different skill sets and interests I've picked up over the years (writing, coding, project management).

When I was the only writer at a startup, this was all really effective. I could fool myself into thinking I was doing DocOps. And maybe I was, but only in that particular context.

But now I work at a big, hulking enterprise company. And all of a sudden it is clear that DocOps isn't just the fun technology bits, just as DevOps isn't just about knowing how to deploy Docker on Kubernetes. It's about dealing with people and dealing with organizations.

I just want to stand up my docs somewhere. "Give me SSH access to a directory with a public URL." At the startup, I just made a decision and had live docs published my second or third day there. At the enterprise? Not so simple. My tooling has to go through security checks. Engineers have to sign off on deployment processes. Customer service has a vested interest in how documents are delivered. Can we integrate with the Salesforce knowledge base? How do I pip install from behind a firewall?

If I'm into DocOps, this is what I'm into. Not just hacking on writing tools (as much fun as that is), but also being effective in an organization. I was very effective in a startup, where hacking on things was how the organization operated. Now I have to level up and learn how to be effective at scale.

The Real Reason I Love Static Site Generators

There's a lot to like about static site generators like Jekyll, Nikola, and Sphinx.

  • Hosting is much simpler, and can usually be done for free.
  • Static sites are inherently more secure than dynamic ones.
  • Very fast page load times.
  • Authoring in a code editor that I have control over.
  • Markdown and reStructured Text are both faster to type than HTML or rich content in a WYSIWYG editor.
  • Version control.
  • The ability to manage the build and deploy process like code.

There are probably more benefits I'm not thinking of at the moment. When I first started using Jekyll, my main motivation was wanting to simplify hosting and exert control over authoring. I discovered the other benefits along the way, and they have really changed my professional life.

But I've realized there's one thing that has come to matter the most to me:

Static sites revive and make real the notion of a document on the web.

In database-backed CMSes, the pretty URL is a noble lie. Content is smeared around in a database and accessed through ?id=1234 parameters or internal query mechanisms. This is fine, and really the only way to handle massive amounts of content.

But the web was built to serve documents, not database results. In an age where content-as-data is on such hyperdrive that people think a single-page app blog system is a reasonable idea, it is calming to use a technology that works the way the web was always supposed to work.

And this has as much to do with the mental model as with the technology. (Maybe more.) The individual documents that make up a static site are handled as documents before being processed to HTML. If I want to change the content on some blog post, I edit a file on my local computer. I don't have to log in and use an application. It is transparent, and there's a direct relationship between a single file in my source and a single URI on my site. Now it feels like the URI actually identifies a resource, and is not just a cleverly-disguised search pattern.

I understand why we moved past the web of documents. But if you're producing documents, maybe it's the right model.

File Names

There are only two hard things in Computer Science: cache invalidation and naming things.
-- Phil Karlton

I cannot help you with cache invalidation.
-- Adam Michael Wood

I recently saw a question about file names in the Episcopal Communicators Facebook Group:

Question about file names.

This is a question about filenames for websites.

When we first developed our website, our consultant told me that when we put a file on there, it's important to give the file a date and a unique and descriptive name.

While that works for some files, it doesn't for others. It caused me to end up with a lot of old files on my website.

What I changed was that I stopped changing file names. So instead of mileage_rates_2016.pdf, I just call it mileage_rates.pdf. That way every link is correct, everywhere on the site.

However, when we link to outside websites, like the wider church's site, we end up with obsolete links. Case in point: the Manual of Business Methods:

We had full_manual_updated_09-30-2013.pdf.

And now the link is full_manual_updated_012815_0.pdf

Is there any need to give dates to files like this? It's important for the organization to archive old versions, but is there any need to have unique names so that websites like ours end up with older versions?

I summed up a few file name best practices, but... I have a lot to say about this topic. File naming is one of those weird little things I have irrationally strong feelings about, and the ubiquity of bad file naming practices is a constant source of rage in my life.
