How to benchmark unit tests in Python without addi

I have a Python project with a bunch of tests that have already been implemented, and I'd like to begin benchmarking them so I can compare the performance of the code, servers, etc. over time. Locating the files in a manner similar to Nose was no problem because I have "test" in the names of all my test files anyway. However, I'm running into trouble when trying to execute these tests dynamically.

As of right now, I'm able to run a script that takes a directory path as an argument and returns a list of filepaths like this:

import os

def getTestFiles(directory):
    fileList = []
    print("Searching for 'test' in " + directory)
    # check the directory itself, not its parent
    if not os.path.isdir(directory):
        # InputError is not a built-in exception, so ValueError is raised instead
        raise ValueError("Not a valid directory: " + directory)
    for root, dirs, files in os.walk(directory):
        for f in files:
            if "test" in f and f.endswith(".py"):
                fileList.append(os.path.join(root, f))
    return fileList

# returns a list like this:
# [  'C:/Users/myName/Desktop/example1_test.py',
#    'C:/Users/myName/Desktop/example2_test.py',
#    'C:/Users/myName/Desktop/folder1/example3_test.py',
#    'C:/Users/myName/Desktop/folder2/example4_test.py'...  ]

The issue is that these files can be structured differently (unittest.TestCase classes, module-level test functions, or a mix of both), which I'm trying to figure out how to handle. For example:

TestExampleOne:

import unittest

import dummy1
import dummy2
import dummy3

class TestExampleOne(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        pass  # set up

    def test_one(self):
        pass  # test stuff
    def test_two(self):
        pass  # test stuff
    def test_three(self):
        pass  # test stuff

    # etc...

TestExampleTwo:

import dummy1
import dummy2
import dummy3

def setup():
    try:
        pass  # config stuff
    except Exception as e:
        logger.exception(e)

def test_one():
    pass  # test stuff
def test_two():
    pass  # test stuff
def test_three():
    pass  # test stuff

# etc...

TestExampleThree:

import unittest

import dummy1
import dummy2
import dummy3

def setup():
    try:
        pass  # config stuff
    except Exception as e:
        logger.exception(e)

class TestExampleTwo(unittest.TestCase):
    def test_one(self):
        pass  # test stuff
    def test_two(self):
        pass  # test stuff
    # etc...

class TestExampleThree(unittest.TestCase):
    def test_one(self):
        pass  # test stuff
    def test_two(self):
        pass  # test stuff
    # etc...

# etc...

I would really like to be able to write one module that searches a directory for every file containing "test" in its name, and then executes every unit test in each file, providing execution time for each test. I think something like NodeVisitor is on the right track, but I'm not sure. Even an idea of where to start would be greatly appreciated. Thanks

The nose test runner can discover the tests, along with their setup/teardown functions and methods.

The nose-timer plugin would help with benchmarking:

A timer plugin for nosetests that answers the question: how much time does every test take?

Demo:

Imagine you have a package named test_nose with the following scripts inside:

test1.py:

import time
import unittest

class TestExampleOne(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        cls.value = 1

    def test_one(self):
        time.sleep(1)
        self.assertEqual(1, self.value)

test2.py:

import time

value = None

def setup():
    global value
    value = 1

def test_one():
    time.sleep(2)
    assert value == 1

test3.py:

import time
import unittest

value = None

def setup():
    global value
    value = 1

class TestExampleTwo(unittest.TestCase):
    def test_one(self):
        time.sleep(3)
        self.assertEqual(1, value)

class TestExampleThree(unittest.TestCase):
    def test_one(self):
        time.sleep(4)
        self.assertEqual(1, value)

Install the nose test runner:
```
pip install nose
```
Install the nose-timer plugin:
```
pip install nose-timer
```

Run the tests:

$ nosetests test_nose --with-timer
....
test_nose.test3.TestExampleThree.test_one: 4.0003s
test_nose.test3.TestExampleTwo.test_one: 3.0010s
test_nose.test2.test_one: 2.0011s
test_nose.test1.TestExampleOne.test_one: 1.0005s
----------------------------------------------------------------------
Ran 4 tests in 10.006s

OK

In a terminal, the result is also conveniently color-highlighted. The coloring can be controlled by the --timer-ok and --timer-warning arguments.

Note that the time.sleep(n) calls were added to slow the tests down artificially so that the impact is clearly visible. Also note that the value variable is set to 1 in the setup functions and methods and is then asserted to be 1 in the test functions and methods; this way you can see that the setup code actually runs.
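For comparison, the same per-test timing idea can be sketched with only the standard library's unittest on Python 3. This is a swapped-in technique, not a description of how nose-timer works internally, and the names TimingResult and time_tests are hypothetical. It records a clock reading in startTest and stopTest, so each measurement covers setUp, the test method, and tearDown, but not setUpClass.

```python
import time
import unittest


class TimingResult(unittest.TextTestResult):
    """Hypothetical result class that records wall-clock time per test."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.timings = {}    # test id -> elapsed seconds
        self._started = {}   # test id -> clock reading at start

    def startTest(self, test):
        self._started[test.id()] = time.perf_counter()
        super().startTest(test)

    def stopTest(self, test):
        super().stopTest(test)
        elapsed = time.perf_counter() - self._started.pop(test.id())
        self.timings[test.id()] = elapsed


def time_tests(suite):
    """Run a unittest suite and return {test id: seconds}."""
    runner = unittest.TextTestRunner(resultclass=TimingResult, verbosity=0)
    result = runner.run(suite)
    return result.timings
```

A suite for a whole directory could be built with unittest.defaultTestLoader.discover(directory, pattern="*test*.py"). Unlike nose, unittest discovery only collects TestCase classes, so module-level test functions such as those in test2.py would need to be wrapped (for example with unittest.FunctionTestCase) before they show up in the timings.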

UPD (running nose with nose-timer from script):

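from pprint import pprint
import nose
from nosetimer import plugin

# use a separate name so the imported plugin module is not shadowed
timer_plugin = plugin.TimerPlugin()
timer_plugin.enabled = True
timer_plugin.timer_ok = 1000
timer_plugin.timer_warning = 2000
timer_plugin.timer_no_color = False

nose.run(plugins=[timer_plugin])
# _timed_tests is a private attribute, so its name and shape may differ between nose-timer versions
result = timer_plugin._timed_tests
pprint(result)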

Save this as test.py and pass a target directory to it:

python test.py /home/example/dir/tests --with-timer

The result variable would contain:

{'test_nose.test1.TestExampleOne.test_one': 1.0009748935699463,
 'test_nose.test2.test_one': 2.0003929138183594,
 'test_nose.test3.TestExampleThree.test_one': 4.000233173370361,
 'test_nose.test3.TestExampleTwo.test_one': 3.001115083694458}
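Since the goal is to compare performance over time, a dictionary like the one above could be appended to a log file after every run and diffed later. The following is only a sketch on Python 3: the helper names (append_run, load_runs, compare), the JSON-lines file layout, and the label field are all assumptions for illustration, not part of nose or nose-timer. It also assumes the dictionary maps test names to plain floats, as in the output shown above.

```python
import json
import time


def append_run(timings, path, label=""):
    """Append one benchmark run as a single JSON line."""
    record = {"timestamp": time.time(), "label": label, "timings": timings}
    with open(path, "a") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


def load_runs(path):
    """Return all recorded runs, oldest first."""
    with open(path) as fh:
        return [json.loads(line) for line in fh if line.strip()]


def compare(old, new):
    """Return {test name: new seconds - old seconds} for tests present in both runs."""
    return {name: new[name] - old[name] for name in new if name in old}
```

A positive difference means the test got slower between the two runs. Tests that exist in only one of the runs are left out of the comparison.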