Thursday November 26 2020

Haskell Proposal: Simplify Deriving - Sat Nov 21 16:00

Haskell’s type classes and deriving facilities are a killer feature for type safety and extensibility. Over nearly 30 years they’ve acquired quite a bit of cruft and language extensions. With DerivingVia, we now have the ability to dramatically simplify the deriving story.

This post outlines several changes to the language that would hopefully be adopted with the next version of the language standard. They get less reasonable and more dramatic as the post goes on.

GHC has a ton of extensions that only serve to unlock additional type classes for the “stock” deriving strategy: Derive{Functor,Foldable,Traversable,Generic,Lift,etc}. We can remove all of these extensions by folding them into the stock deriving strategy.
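As a small illustration of the status quo, here is a sketch in which every class in the deriving clause is handled by the stock strategy, yet several of them are gated behind their own extension. The `Tree` type and the `example` value are made up for this post.

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable, DeriveGeneric #-}

import GHC.Generics (Generic)

-- A hypothetical binary tree. All six classes below are derived by the
-- stock strategy; the last four each need their own extension.
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Show, Eq, Functor, Foldable, Traversable, Generic)

-- A sample value for experimenting in GHCi.
example :: Tree Int
example = Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf)
```

Under the proposal, the pragma line would simply disappear and the deriving clause would stay the same.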

DeriveAnyClass is a footgun. It allows you to write any type class in a deriving clause. It pastes in an “empty” instance, relying on default method implementations (usually supplied through DefaultSignatures) to fill in the methods.

DefaultSignatures is used to give a single default implementation of a type class if the underlying type matches a more restrictive constraint. This is primarily used to provide Generic-based implementations with very little syntax.

By privileging a single default, it makes any other possible defaults less useful and less discoverable.
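To make the combination concrete, here is a minimal sketch of a class with one Generic-based default, derived through DeriveAnyClass. The `TypeName` and `GTypeName` classes and the `Foo` type are hypothetical names invented for the example.

```haskell
{-# LANGUAGE DefaultSignatures, DeriveAnyClass, DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts, FlexibleInstances #-}

import GHC.Generics

-- Hypothetical class: report the name of a value's type.
class TypeName a where
  typeName :: a -> String
  -- The one privileged default, usable whenever the type is Generic.
  default typeName :: (Generic a, GTypeName (Rep a)) => a -> String
  typeName = gTypeName . from

-- Helper class that reads the name from the datatype metadata.
class GTypeName f where
  gTypeName :: f p -> String

instance Datatype d => GTypeName (M1 D d f) where
  gTypeName = datatypeName

-- DeriveAnyClass writes `instance TypeName Foo` with an empty body,
-- so the default above is the implementation that gets used.
data Foo = Foo Int Bool
  deriving (Generic, TypeName)
```

Nothing at the use site says which implementation `Foo` received; you have to read the class definition to find out.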

The DeriveAnyClass utility is subsumed by DerivingVia.
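A sketch of the same idea expressed with DerivingVia instead, assuming a made-up `TypeName` class: each possible default becomes a named newtype, so the choice is visible where the instance is derived. `GenericName` and `Anonymous` are hypothetical wrappers, not library types.

```haskell
{-# LANGUAGE DerivingVia, DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts, FlexibleInstances, UndecidableInstances #-}

import GHC.Generics

-- Hypothetical class with no privileged default.
class TypeName a where
  typeName :: a -> String

class GTypeName f where
  gTypeName :: f p -> String

instance Datatype d => GTypeName (M1 D d f) where
  gTypeName = datatypeName

-- One named implementation: read the name from Generic metadata.
newtype GenericName a = GenericName a

instance (Generic a, GTypeName (Rep a)) => TypeName (GenericName a) where
  typeName (GenericName x) = gTypeName (from x)

-- Another, equally discoverable implementation: refuse to say.
newtype Anonymous a = Anonymous a

instance TypeName (Anonymous a) where
  typeName _ = "<anonymous>"

data Foo = Foo Int Bool
  deriving Generic
  deriving TypeName via (GenericName Foo)

newtype Secret = Secret String
  deriving TypeName via (Anonymous Secret)
```

Neither implementation is special here, and adding a third is just another newtype.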

GeneralizedNewtypeDeriving is subsumed by DerivingVia, also.
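A minimal sketch of DerivingVia doing the job usually given to GeneralizedNewtypeDeriving: naming the underlying type in the via clause reuses its instances through coercion. The `Age` type is a made-up example.

```haskell
{-# LANGUAGE DerivingVia #-}

-- With GeneralizedNewtypeDeriving the second clause would read
-- `deriving newtype (Eq, Ord, Num)`.
newtype Age = Age Int
  deriving (Show)                  -- stock: shows the constructor
  deriving (Eq, Ord, Num) via Int  -- reuse Int's instances by coercion
```

In GHCi, `Age 3 + 4` typechecks because the numeric literal goes through the derived `fromInteger`.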

Now that there are only two strategies, we can get rid of DerivingStrategies.

Currently, you must write the complete type in a DerivingVia clause.

This can be cumbersome for a very large type.

It’s also annoyingly repetitive, and can lead to errors.

A wildcard can be used to indicate either:

a. The underlying type of a newtype, or
b. The type of the data declaration.
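Here is a sketch of both cases as they must be written today, with my reading of the wildcard form shown only in comments, since no GHC accepts it. The `Env`, `Opaque`, and `Credentials` names are hypothetical.

```haskell
{-# LANGUAGE DerivingVia #-}

-- Case (a): the via type repeats the newtype's underlying type in full.
newtype Env = Env [(String, [Int])]
  deriving (Eq, Semigroup, Monoid) via [(String, [Int])]
  -- hypothetical wildcard form: deriving (Eq, Semigroup, Monoid) via _

-- A made-up wrapper whose Show instance hides the contents.
newtype Opaque a = Opaque a

instance Show (Opaque a) where
  show _ = "<opaque>"

-- Case (b): the via type repeats the name of the data declaration.
data Credentials = Credentials String String
  deriving Show via (Opaque Credentials)
  -- hypothetical wildcard form: deriving Show via (Opaque _)
```

The repetition is mild here, but it grows with the size of the underlying type and with every extra type parameter.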

There are two ways to derive things: StandaloneDeriving and attached deriving. Attached deriving is redundant, but convenient. StandaloneDeriving is more powerful, but less convenient. Attached deriving clauses don’t work with GADTs.
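The GADT limitation is easy to see in a small sketch. The `Shape` and `Expr` types below are invented for illustration; GHC rejects an attached `deriving Show` on a GADT like `Expr` and suggests the standalone form instead.

```haskell
{-# LANGUAGE GADTs, StandaloneDeriving #-}

-- Attached deriving: fine for an ordinary data type.
data Shape = Circle Double | Square Double
  deriving (Show, Eq)

-- A GADT whose constructors refine the type index.
data Expr a where
  IntLit  :: Int -> Expr Int
  BoolLit :: Bool -> Expr Bool
  If      :: Expr Bool -> Expr a -> Expr a -> Expr a

-- Standalone deriving: we state the instance head ourselves, so GHC
-- does not have to infer it from the constructors.
deriving instance Show (Expr a)
```

The standalone form costs an extra line and a repeated type name, which is the convenience gap the rest of the post tries to close.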

The problem with the above proposal is that it carries a significant syntax cost. The keyword deriving is repeated for each instance, the keyword instance is repeated, the via _ clause is repeated, and the type name is repeated. Multiple instances should be derivable with the same context.

In this block, we define the ToJSON and FromJSON instances using the same Generically via type. We can still use _ to refer to the type, since we know the type we’re deriving for: Foo. This recovers the syntax convenience of “attached deriving.”

This also recovers the convenience of attached deriving. Let’s look at the main point of StandaloneDeriving: GADTs. Otherwise we could just remove StandaloneDeriving (with the nice benefit/tragedy of banning orphan derived instances).

The _ refers to the type name, without any variables applied. So you need to apply the type variables in the instance head. That’s a bit annoying, but maybe it’s fine.

GHC provides a newtype Stock a = Stock a that hooks into DerivingVia somehow. Now we’re down to one deriving strategy.

OK, so maybe you don’t like getting rid of attached deriving. Let’s get rid of standalone deriving instead. We need StandaloneDeriving for two reasons:

The type variable f is in scope from the data declaration.

EDIT: @quickdudley and @nnotm have correctly pointed out that you also want to be able to define instances of a class in the module that defines the class. These are perfectly valid instances, and so we must keep StandaloneDeriving.

Alright, post is done. These ideas are certainly controversial and Bad, but man wouldn’t it be nice to have a simpler story around deriving and type class instances? The current story is so complex, and I think we can genuinely simplify Haskell-the-language by trimming some fat here.

EDIT: @i_am_tom posted a reference to the Concrete Class Dictionaries GHC proposal, which subsumes a lot of this.