How to use Scala Cats Validated the correct way?

Posted 2020-07-04 08:04

My use case is as follows:

  1. I am using Cats to validate my config. My config file is in JSON.
  2. I deserialize the config file into my case class Config using lift-json and then validate it using Cats. I am using this as a guide.
  3. My motivation for using Cats is to collect all errors, if any are present, at validation time.

My problem is that the examples given in the guide are of this form:

case class Person(name: String, age: Int)

def validatePerson(name: String, age: Int): ValidationResult[Person] = {
  (validateName(name), validate(age)).mapN(Person)
}

But in my case I have already deserialized my config into my case class (a sample is below), and I then pass it in for validation:

case class Config(source: List[String], dest: List[String], extra: List[String])

def validateConfig(config: Config): ValidationResult[Config] = {
  (validateSource(config.source), validateDestination(config.dest))
    .mapN { case _ => config }
}

The difference here is mapN { case _ => config }. Since I already have a config, I don't want to create it anew from its members if everything is valid. This arises because I am passing the config itself to the validate function, not its members.

A person at my workplace told me this is not the correct way, as Cats Validated provides a way to construct an object if its members are valid. The object should not exist or should not be constructible if its members are invalid. Which makes complete sense to me.

So should I make any changes? Is what I'm doing above acceptable?

PS: The Config above is just an example. My real config can have other case classes as its members, which can themselves depend on other case classes.

1 answer
forever°为你锁心
Reply #2 · 2020-07-04 08:09

One of the central goals of the kind of programming promoted by libraries like Cats is to make invalid states unrepresentable. In a perfect world, according to this philosophy, it would be impossible to create an instance of Config with invalid member data (through the use of a library like Refined, where complex constraints can be expressed in and tracked by the type system, or simply by hiding unsafe constructors). In a slightly less perfect world, it might still be possible to construct invalid instances of Config, but discouraged, e.g. through the use of safe constructors (like your validatePerson method for Person).
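As a minimal sketch of the "hiding unsafe constructors" idea, assuming cats-core is on the classpath: the nonEmpty rule, the error messages and the Config.validated name below are hypothetical stand-ins for the real checks, and ValidationResult is assumed to be an alias for ValidatedNec[String, A]. Config is written as a plain class here because, on Scala 2, a case class with a private constructor still gets a public apply and copy by default.

```scala
import cats.data.ValidatedNec
import cats.syntax.all._

object ConfigValidation {
  // Assumed alias: errors are plain strings accumulated in a NonEmptyChain.
  type ValidationResult[A] = ValidatedNec[String, A]

  // The constructor is private, so code outside the companion object
  // cannot build a Config without going through Config.validated.
  final class Config private (
      val source: List[String],
      val dest: List[String],
      val extra: List[String]
  )

  object Config {
    // Hypothetical rule standing in for the real validateSource / validateDestination.
    private def nonEmpty(field: String, values: List[String]): ValidationResult[List[String]] =
      if (values.nonEmpty) values.validNec
      else s"$field must not be empty".invalidNec

    // Safe constructor: runs every check, collects all failures,
    // and only builds a Config when all checks pass.
    def validated(
        source: List[String],
        dest: List[String],
        extra: List[String]
    ): ValidationResult[Config] =
      (nonEmpty("source", source), nonEmpty("dest", dest))
        .mapN((s, d) => new Config(s, d, extra))
  }
}
```

With this shape, Config.validated(Nil, Nil, Nil) reports both failures at once, while new Config(...) does not compile anywhere outside the companion object.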

It sounds like you're in an even less perfect world where you have instances of Config that may or may not contain invalid data, and you want to validate them to get "new" instances of Config that you know are valid. This is totally possible, and in some cases reasonable, and your validateConfig method is a perfectly legitimate way to solve this problem, if you're stuck in that imperfect world.

The downside, though, is that the compiler can't track the difference between the already-validated Config instances and the not-yet-validated ones. You'll have Config instances floating around in your program, and if you want to know whether they've already been validated or not, you'll have to trace through all the places they could have come from. In some contexts this might be just fine, but for large or complex programs it's not ideal.
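One way to let the compiler track that difference without rebuilding the config from its members is to wrap the already-deserialized value in a second type. This is only a sketch under assumptions: ValidConfig, from, describe and the nonEmpty rule are hypothetical names, and cats-core is assumed to be available.

```scala
import cats.data.ValidatedNec
import cats.syntax.all._

object TrackedValidation {
  type ValidationResult[A] = ValidatedNec[String, A]

  // The shape the JSON is deserialized into; it may hold invalid data.
  final case class Config(source: List[String], dest: List[String], extra: List[String])

  // Hypothetical wrapper with a private constructor: holding a ValidConfig
  // is evidence that the checks in ValidConfig.from have passed.
  final class ValidConfig private (val config: Config)

  object ValidConfig {
    // Hypothetical rule standing in for the real checks.
    private def nonEmpty(field: String, values: List[String]): ValidationResult[List[String]] =
      if (values.nonEmpty) values.validNec
      else s"$field must not be empty".invalidNec

    // Validates an existing Config and wraps it unchanged on success.
    def from(config: Config): ValidationResult[ValidConfig] =
      (nonEmpty("source", config.source), nonEmpty("dest", config.dest))
        .mapN((_, _) => new ValidConfig(config))
  }

  // Downstream code asks for ValidConfig, so passing a raw Config is a type error.
  def describe(valid: ValidConfig): String =
    s"${valid.config.source.size} source(s) -> ${valid.config.dest.size} destination(s)"
}
```

The original Config value is reused as-is, much like mapN { case _ => config }, but the result type now records that validation happened.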

To sum up: ideally you'd validate Config instances whenever they are created (possibly even making it impossible to create invalid ones), so that you don't have to remember whether any given Config is good or not—the type system can remember for you. If that's not possible, because of e.g. APIs or definitions you don't control, or if it just seems too burdensome for a simple use case, what you're doing with validateConfig is totally reasonable.


As a footnote, since you say above that you're interested in looking in more detail at Refined, what it provides for you in a situation like this is a way to avoid even more functions of the shape A => ValidationResult[A]. Right now your validateName method, for example, probably takes a String and returns a ValidationResult[String]. You can make exactly the same argument against this signature as I have against Config => ValidationResult[Config] above—once you're working with the result (by mapping a function over the Validated or whatever), you just have a string, and the type doesn't tell you that it's already been validated.

What Refined allows you to do is write a method like this:

def validateName(in: String): ValidationResult[Refined[String, SomeProperty]] = ...

…where SomeProperty might specify a minimum length, or the fact that the string matches a particular regular expression, etc. The important point is that you're not validating a String and returning a String that only you know something about—you're validating a String and returning a String that the compiler knows something about (via the Refined[A, Prop] wrapper).
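A minimal sketch of such a method, assuming both cats-core and the refined library are on the classpath. NonEmpty is used here purely as a stand-in for SomeProperty, and the Name alias is hypothetical; refineV performs the check at runtime and returns an Either whose Left carries refined's own error message.

```scala
import cats.data.ValidatedNec
import cats.syntax.all._
import eu.timepit.refined.api.Refined
import eu.timepit.refined.collection.NonEmpty
import eu.timepit.refined.refineV

object RefinedName {
  type ValidationResult[A] = ValidatedNec[String, A]

  // Hypothetical alias: a String the compiler knows has passed the NonEmpty check.
  type Name = String Refined NonEmpty

  // refineV yields Either[String, Name]; toValidatedNec lifts it into
  // the accumulating ValidationResult used elsewhere.
  def validateName(in: String): ValidationResult[Name] =
    refineV[NonEmpty](in).toValidatedNec
}
```

A function that takes a Name rather than a String can then rely on the check having run, and the underlying string is still available through .value.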

Again, this may be (okay, probably is) overkill for your use case—you just might find it nice to know that you can push this principle (tracking validation in types) even further down through your program.
