How to abstract over a “back and forth” transforma

2020-07-02 10:07发布

Consider this example (from https://codereview.stackexchange.com/questions/23456/crtitique-my-haskell-function-capitalize):

import Data.Char

capWord [] = []
capWord (h:t) = toUpper h : map toLower t

capitalize = unwords . map capWord . words

Is there a good way to abstract over the "back and forth" transformation, e.g. unwords . f . words? The best I could come up was

class Lift a b | a -> b where
  up :: a -> b
  down :: b -> a

instance Lift String [String] where
  up = words
  down = unwords

lifted :: (Lift a b) => (b -> b) -> a -> a
lifted f = down . f . up

capitalize = lifted (map capWord)

but it feels not very flexible, and needs MultiParamTypeClasses, FunctionalDependencies, TypeSynonymInstances and FlexibleInstances - which might be an indicator that it goes slightly over the top.

标签: haskell
5条回答
一夜七次
2楼-- · 2020-07-02 10:25

It's indeed not flexible enough! How would you lift a function to work on a line-by-line basis? You're going to need a newtype wrapper for that! Like so

newtype LineByLine = LineByLine { unLineByLine :: String }

instance Lift LineByLine [String] where
    up = lines . unLineByLine
    down = LineByLine . unlines

But now there is no good reason to prefer the word-by-word version over the line-by-line one.

I would just use unwords . map f . words, to me that's the idiomatic "Apply f to all the words and put them back together". If you do this more often, consider writing a function.

查看更多
闹够了就滚
3楼-- · 2020-07-02 10:34

You could use a lens for this. Lenses are quite a lot more general than this I think, but anything where you have such bidirectional functions can be made into a lens.

For example, given words and unwords, we can make a worded lens:

worded :: Simple Iso String [String]
worded = iso words unwords

Then you can use it to apply a function inside the lens, e.g. lifted f x becomes (worded %~ f) x. The only downside of lenses is that the library is extremely complicated, and has many obscure operators like %~, even though the core idea of a lens is actually quite simple.

EDIT: This is not an isomorphism

I had incorrectly assumed that unwords . words is equivalent to the identity function, and it is not, because extra spaces between words are lost, as correctly pointed out by several people.

Instead, we could use a much more complicated lens, like the following, which does preserve the spacing between words. Although I think it's still not an isomorphism, this does at least mean that x == (x & worded %~ id), I hope. It is on the other hand, not in the least a very nice way of doing things, and not very efficient. It is possible that an indexed lens of the words themselves (rather than a list of the words) may be better, although it permits fewer operations (I think, it's really hard to tell when lenses are involved).

import Data.Char (isSpace)
import Control.Lens

worded :: Simple Lens String [String]
worded f s =
    let p = makeParts s
    in fmap (joinParts p) (f (takeParts p))

data Parts = End | Space Char Parts | Word String Parts

makeParts :: String -> Parts
makeParts = startPart
    where
      startPart [] = End
      startPart (c:cs) =
          if isSpace c then Space c (startPart cs) else joinPart (Word . (c:)) cs

      joinPart k [] = k [] End
      joinPart k (c:cs) =
          if isSpace c then k [] (Space c (startPart cs)) else joinPart (k . (c:)) cs

takeParts :: Parts -> [String]
takeParts End = []
takeParts (Space _ t) = takeParts t
takeParts (Word s t) = s : takeParts t

joinParts :: Parts -> [String] -> String
joinParts _ [] = []
joinParts (Word _ End) (ws@(_:_:_)) = unwords ws
joinParts End ws = unwords ws
joinParts (Space c t) ws = c : joinParts t ws
joinParts (Word _ t) (w:ws) = w ++ joinParts t ws
查看更多
Emotional °昔
4楼-- · 2020-07-02 10:35

Like DarkOtter suggested, Edward Kmett's lens library has you covered, but Lens is too weak and Iso is slightly too strong since unwords . words isn't an identity. You could try a Prism instead.

wordPrism :: Prism' String [String]
wordPrism = prism' unwords $ \s ->
   -- inefficient, but poignant
   if s == (unwords . words) s then Just (words s) else Nothing

Now you can define capitalize as

capitalize' :: String -> String
capitalize' = wordPrism %~ map capWord
-- a.k.a    = over wordPrism (map capWord)

but this has fairly pathological default behavior for your case. For Strings which can't be mapped as isomorphisms (strings with multiple spaces in a row inside of them) over wordPrism g == id. There ought to be an "over if possible" operator for Prisms, but I don't know of one. You could define it though:

overIfPossible :: Prism s t a b -> (a -> b) -> (s -> Maybe t)
overIfPossible p f s = if (isn't p s) then Nothing else Just (over p f s)

capitalize :: String -> Maybe String
capitalize = wordPrism `overIfPossible` map capWord

Now, really, both of these are pretty unsatisfactory since what you really want is to capitalize all words and retain the spacing. For this (words, unwords) is too weak generally due to the non-existence of isomorphism that I've highlighted above. You'd have to write your own custom machinery which maintains spaces after which you'd have an Iso and could use DarkOtter's answer directly.

查看更多
Evening l夕情丶
5楼-- · 2020-07-02 10:41

I'd say the best answer is "no, because abstracting over that doesn't buy you anything". In fact your solution is far less flexible: there can be only one instance of Lift String [String] in scope and there are more ways to split string into a list of strings than just words/unwords (which means you'll start throwing newtypes or even more arcane extensions into the mix). Keep it simple — the original capitalize is just fine the way it is.

Or, if you really insist:

lifted :: (a -> b, b -> a) -> (b -> b) -> a -> a
lifted (up, down) f = down . f . up

onWords = lifted (words, unwords)
onLines = lifted (lines, unlines)

capitalize = onWords $ map capWord

Conceptually the same thing as your typeclass, except without abusing typeclass machinery so much.

查看更多
干净又极端
6楼-- · 2020-07-02 10:47

Your lifted is actually the same as dimap from Data.Profunctor:

onWords = dimap words unwords
capitalize = onWords (map capWord)

That might not be the direction of generalization you thought about. But look at the type of the equivalent function in Control.Functor from category-extras:

dimap :: Bifunctor f (Dual k) k k => k b a -> k c d -> k (f a c) (f b d)

This version generalizes it over everything which is both a QFunctor and a co-PFunctor. Not that useful in everyday scenarios, but interesting.

查看更多
登录 后发表回答