I consider myself to be learning Haskell. I am proficient enough to solve Advent of Code and do some small projects with it. I love doing it, but I always feel like there’s more to it. Apparently there is: this blog post from Reasonably Polymorphic caught my attention, and I was probably exactly the intended audience.
What’s in the blog post?
It discusses the Builder pattern, where an object is created through repeated mutation, which, when transferred to Haskell, should be replaced by creating objects through Monoids and the corresponding Semigroup function <>.
I parse a programming language using parsec, and I did exactly what was proposed to improve my structure creation.
Before, my code was this
Old Code
data StructStatement = Variable VariableName VariableType
                     | Function Function.Function

data Struct = Struct
    { name :: String
    , variables :: [(VariableName, VariableType)]
    , functions :: [Function]
    }
    deriving (Show)

addVariable :: Struct -> VariableName -> VariableType -> Struct
addVariable (Struct sn vs fs) n t = Struct sn ((n, t) : vs) fs

addFunction :: Struct -> Function -> Struct
addFunction (Struct sn vs fs) f = Struct sn vs (f : fs)

accumulateStruct :: Struct -> StructStatement -> Struct
accumulateStruct s (Variable n t) = addVariable s n t
accumulateStruct s (Function f) = addFunction s f
Then a fold starting from Struct _ [] [] (which is basically mempty, I just realized) would get me the complete struct.
It is kind of ugly:
foldl accumulateStruct (Struct structIdentifier [] []) <$!> braces (many structMember)
Now my code is this
New Code
data Struct = Struct
    { name :: String
    , body :: StructBody
    }
    deriving (Show)

data StructBody = StructBody
    { variables :: [(VariableName, VariableType)]
    , functions :: [Function]
    }
    deriving stock (Generic, Show)
    deriving (Semigroup, Monoid) via Generically StructBody
Which is shorter and easier to use; the entire construction now looks like this:
mconcat <$!> UbcLanguage.braces (many structMember)
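To make the new construction concrete, here is a minimal self-contained sketch of it without the parser. It assumes GHC 9.4 or later (where GHC.Generics exports Generically), and the String type aliases, the helpers variableMember and functionMember, and the example values are hypothetical stand-ins, not the real parser code.

```haskell
{-# LANGUAGE DeriveGeneric      #-}
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE DerivingVia        #-}

import GHC.Generics (Generic, Generically (..))

-- Hypothetical stand-ins for the real parser types.
type VariableName = String
type VariableType = String
type Function     = String

data StructBody = StructBody
    { variables :: [(VariableName, VariableType)]
    , functions :: [Function]
    }
    deriving stock (Generic, Show, Eq)
    deriving (Semigroup, Monoid) via Generically StructBody

-- Each parsed member becomes a StructBody holding exactly one entry
-- (hypothetical helpers; a real structMember parser would return these).
variableMember :: VariableName -> VariableType -> StructBody
variableMember n t = mempty { variables = [(n, t)] }

functionMember :: Function -> StructBody
functionMember f = mempty { functions = [f] }

-- What `many structMember` might return for a small struct.
exampleMembers :: [StructBody]
exampleMembers =
    [ variableMember "x" "i32"
    , functionMember "getX"
    , variableMember "y" "i32"
    ]

-- mconcat merges all single-entry bodies field by field.
exampleBody :: StructBody
exampleBody = mconcat exampleMembers
```

Loaded in GHCi, exampleBody shows as StructBody {variables = [("x","i32"),("y","i32")], functions = ["getX"]}. Note that this keeps the members in source order, whereas the old foldl version prepended each member and so produced them reversed.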
I love the new construction method using Semigroup and Monoid. However, I no longer understand in depth what is going on. I have written my own instances of Semigroup and Monoid, and I assume these deriving clauses do something similar.
Handwritten Semigroup instance
instance Semigroup StructBody where
    (<>) s1 s2 = StructBody
        { variables = variables s1 <> variables s2
        , functions = functions s1 <> functions s2
        }
The Monoid instance is then trivial: just default all the fields to mempty.
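For completeness, a sketch of what that handwritten pair might look like as a self-contained module. The String type aliases are hypothetical stand-ins for the real parser types.

```haskell
-- Hypothetical stand-ins for the real parser types.
type VariableName = String
type VariableType = String
type Function     = String

data StructBody = StructBody
    { variables :: [(VariableName, VariableType)]
    , functions :: [Function]
    }
    deriving (Show, Eq)

-- Combine two bodies field by field, using each field's own Semigroup
-- (here: list concatenation).
instance Semigroup StructBody where
    s1 <> s2 = StructBody
        { variables = variables s1 <> variables s2
        , functions = functions s1 <> functions s2
        }

-- The identity element: every field is its own mempty (the empty list).
instance Monoid StructBody where
    mempty = StructBody
        { variables = mempty
        , functions = mempty
        }
```

Nothing in either instance mentions lists directly; the code only relies on each field type already being a Semigroup and a Monoid.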
I also have a dump of the generated class instances using -ddump-deriv -dsuppress-all:
Generated instances
instance Semigroup StructBody where
  (<>) :: StructBody -> StructBody -> StructBody
  sconcat :: NonEmpty StructBody -> StructBody
  stimes ::
    forall (b_a87f :: *). Integral b_a87f =>
                          b_a87f -> StructBody -> StructBody
  (<>)
    = coerce
        @(Generically StructBody
          -> Generically StructBody -> Generically StructBody)
        @(StructBody -> StructBody -> StructBody)
        ((<>) @(Generically StructBody))
  sconcat
    = coerce
        @(NonEmpty (Generically StructBody) -> Generically StructBody)
        @(NonEmpty StructBody -> StructBody)
        (sconcat @(Generically StructBody))
  stimes
    = coerce
        @(b_a87f -> Generically StructBody -> Generically StructBody)
        @(b_a87f -> StructBody -> StructBody)
        (stimes @(Generically StructBody))

instance Monoid StructBody where
  mempty :: StructBody
  mappend :: StructBody -> StructBody -> StructBody
  mconcat :: [StructBody] -> StructBody
  mempty
    = coerce
        @(Generically StructBody) @StructBody
        (mempty @(Generically StructBody))
  mappend
    = coerce
        @(Generically StructBody
          -> Generically StructBody -> Generically StructBody)
        @(StructBody -> StructBody -> StructBody)
        (mappend @(Generically StructBody))
  mconcat
    = coerce
        @([Generically StructBody] -> Generically StructBody)
        @([StructBody] -> StructBody) (mconcat @(Generically StructBody))
The documentation says that there is an instance (Generic a, Monoid (Rep a ())) => Monoid (Generically a), which is defined exactly like the generated instance GHC dumped (source). It uses the Monoid of (Rep a ()), which I cannot find defined anywhere.
Where does the monoid come from?
This is the generated type Rep
Generated
Derived type family instances:
type Rep StructBody = D1
                        ('MetaData "StructBody" "Ubc.Parse.Syntax.Struct" "main" 'False)
                        (C1
                           ('MetaCons "StructBody" 'PrefixI 'True)
                           (S1
                              ('MetaSel
                                 ('Just "variables")
                                 'NoSourceUnpackedness
                                 'NoSourceStrictness
                                 'DecidedLazy)
                              (Rec0 [(VariableName, VariableType)])
                            :*: S1
                                  ('MetaSel
                                     ('Just "functions")
                                     'NoSourceUnpackedness
                                     'NoSourceStrictness
                                     'DecidedLazy)
                                  (Rec0 [Function])))
but I cannot find a Monoid instance.
Do you know where I could learn about this?
Thank you for your time and attention
Edit: fixed a problem with a deriving clause, added a missing code block
Reading between the lines, the documentation has the key:
Generic instances … correspond to Semigroup and Monoid instances defined by pointwise lifting.
In more words: each generic type can be broken up into a tuple-like row of components, and the generic type admits a monoid/semigroup whenever every component in the row admits a monoid/semigroup. In your handwritten Semigroup instance, the given code is agnostic as to the types of variables and functions; all that matters is that they already have Semigroup instances of their own.
Let me answer the other question: where’s the monoid in the generated Rep? Well, there isn’t one! The Rep merely has a struct-like product of component types. If a monoid exists for each component, then a monoid for the entire struct exists (and is built from the obvious pointwise lifting!) but otherwise there isn’t a monoid derived from the struct itself. This should be a notable contrast from generic instances for e.g. Functor, where every Rep has exactly zero or one Functor due to the algebra of the semiring of types (there is an underlying algebraic equation with at most one possible solution).
Thank you for the detailed answer; especially the explanation ‘in more words’ and the link helped me understand what happens in this Monoid instance.
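The pointwise lifting described in the answer can be sketched by going through the generic representation by hand. This is only an illustration, assuming a reasonably recent base (4.12 or later), where GHC.Generics provides Semigroup and Monoid instances for the building blocks M1, K1 and (:*:) that make up Rep StructBody. The String type aliases and the names appendViaRep and emptyViaRep are hypothetical.

```haskell
{-# LANGUAGE DeriveGeneric #-}

import GHC.Generics (Generic, Rep, from, to)

-- Hypothetical stand-ins for the real parser types.
type VariableName = String
type VariableType = String
type Function     = String

data StructBody = StructBody
    { variables :: [(VariableName, VariableType)]
    , functions :: [Function]
    }
    deriving (Generic, Show, Eq)

-- Convert both values to their representation, combine them there,
-- and convert back. The (<>) used here is assembled from the instances
-- for M1 (metadata wrappers), (:*:) (the product of the two fields)
-- and K1 (each field, which delegates to the list Semigroup).
appendViaRep :: StructBody -> StructBody -> StructBody
appendViaRep a b = to (from a <> from b :: Rep StructBody ())

-- The identity element, assembled the same way from each field's mempty.
emptyViaRep :: StructBody
emptyViaRep = to (mempty :: Rep StructBody ())
```

If one of the field types had no Semigroup instance, the K1 instance for that field would be missing and appendViaRep would fail to type-check, which matches the point that the monoid comes from the components rather than from the Rep itself.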