I love Faker, I use it in my seeds.rb
all the time to populate my dev environment with real-ish looking data.
I've also just started using Factory Girl which also saves a lot of time - but when i sleuth around the web for code examples I don't see much evidence of people combining the two.
Q. Is there a good reason why people don't use faker in a factory?
My feeling is that by doing so I'd increase the robustness of my tests by seeding random - but predictable - data each time, which hopefully would increase the chances of a bug popping up.
But perhaps that's incorrect and there is either no benefit over hard coding a factory or I'm not seeing a potential pitfall. Is there a good reason why these two gems should or shouldn't be combined?
Some people argue against it, as here.
DO NOT USE RANDOM ATTRIBUTE VALUES
One common pattern is to use a fake data library (like Faker or Forgery) to generate random values on
the fly. This may seem attractive for names, email addresses or
telephone numbers, but it serves no real purpose. Creating unique
values is simple enough with sequences:
FactoryGirl.define do
sequence(:title) { |n| "Example title #{n}" }
factory :post do
title
end
end
FactoryGirl.create(:post).title # => 'Example title 1'
Your randomised
data might at some stage trigger unexpected results in your tests,
making your factories frustrating to work with. Any value that might
affect your test outcome in some way would have to be overridden,
meaning:
Over time, you will discover new attributes that cause your test to
fail sometimes. This is a frustrating process, since tests might fail
only once in every ten or hundred runs – depending on how many
attributes and possible values there are, and which combination
triggers the bug. You will have to list every such random attribute in
every test to override it, which is silly. So, you create non-random
factories, thereby negating any benefit of the original randomness.
One might argue, as Henrik Nyh does, that random values help you
discover bugs. While possible, that obviously means you have a bigger
problem: holes in your test suite. In the worst case scenario the bug
still goes undetected; in the best case scenario you get a cryptic
error message that disappears the next time you run the test, making
it hard to debug. True, a cryptic error is better than no error, but
randomised factories remain a poor substitute for proper unit tests,
code review and TDD to prevent these problems.
Randomised factories are therefore not only not worth the effort, they
even give you false confidence in your tests, which is worse than
having no tests at all.
But there's nothing stopping you from doing it if you want to, just do it.
Oh, and there is an even easier way to inline a sequence in recent FactoryGirl, that quote was written for an older version.
It's up to you.
In my opinion is a very good idea to have random data in tests and it always helped me to discover bugs and corner cases I didn't think about.
I never regret to have random data. All the points described by @jrochkind would be correct (and you should read the other answer before reading this one) but it's also true that you can (and should) write that in your spec_helper.rb
config.before(:all) { Faker::Config.random = Random.new(config.seed) }
this will make so that you have repeatable tests with repeatable data as well. If you don't do that then you have all the problems described in the other answer.
I like to use Faker and usually do so when working with larger code bases. I see the following advantages and disadvantages when using Faker with Factory Girl:
Possible disadvantages:
- A bit harder to reproduce the exact same test scenario (at least RSpec works around this by displaying the random number generator seed every time and allows you to reproduce the exact same test with it)
- Generating data wastes a bit of performance
Possible advantages:
- Makes data displayed usually more humanly comprehensible. When creating test-data manually, people tend to all kinds of short-cuts to avoid the tediousness.
- Building factories with Faker for tests at the same time provides you with the means of generating nice demo data for presentations.
- You could randomly discover edge case bugs when running the tests a lot