measure elapsed time, amount of memory and cpu use

I'm executing an external program through Python. I want to know what is the best choice for calling the outside program, with subprocess.Popen() or with subprocess.call(). Also, I need to measure elapsed time, the amount of memory and CPU used by the external program. I've heard of psutil, but I don't really know which to choose.

Also, I need to measure elapsed time, the amount of memory and CPU used by the external program.

(I'm going to assume you only need the information available in your platform's rusage. And, since Windows has no such thing at all, I'm also going to assume you don't care about Windows. If you need additional information that's only available in some platform-specific way (reading out of Linux's proc filesystem, or calling AIX's monitor APIs, or whatever), you pretty much can't do this with the stdlib, and the psutil answer is the only one.)

The subprocess library wraps up calling fork, then an execv-family function in the child and a waitpid-family function in the parent. (You can see this by starting with the source to call and tracing down into the other calls from there.)

Unfortunately, the easy way to get resource usage from a child is to call wait3 or wait4, not wait or waitpid. So subprocess gets you maddeningly close to what you want, but not quite there.
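To make the fork/exec/wait4 sequence concrete, here is a minimal POSIX-only sketch that does by hand what subprocess does, but reaps the child with os.wait4 so the rusage comes back along with the exit status. The name run_with_rusage is just something I made up for illustration, and this skips everything else subprocess handles for you (pipes, close_fds, error reporting from the child, and so on).

```python
import os
import sys


def run_with_rusage(argv):
    """Fork, exec argv in the child, and reap it with wait4 (POSIX only).

    Returns (returncode, rusage); returncode is negative if the child
    was killed by a signal, mirroring subprocess's convention.
    """
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the target program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; exit without running parent cleanup.
            os._exit(127)
    # Parent: wait4 is waitpid plus a resource-usage struct for that child.
    _, status, rusage = os.wait4(pid, 0)
    if os.WIFEXITED(status):
        returncode = os.WEXITSTATUS(status)
    else:
        returncode = -os.WTERMSIG(status)
    return returncode, rusage


if __name__ == "__main__":
    rc, ru = run_with_rusage([sys.executable, "-c", "sum(range(10**6))"])
    print("exit {}: {}s user, {}s system".format(rc, ru.ru_utime, ru.ru_stime))
```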

But you've got a few options:

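If you only have one child process, resource.getrusage(resource.RUSAGE_CHILDREN) is all you need, because it reports the combined usage of every child that has already been waited for.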
You can launch the process, then use psutil (or platform-specific code) to get resource information from proc.pid before reaping the child.
You can use psutil to do everything, leaving subprocess behind.
You can subclass subprocess.Popen to override its wait method.
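The first option can be sketched roughly like this, using only the stdlib on a POSIX system. The function name measure and the dictionary keys are my own choices, not anything standard. It takes the rusage before and after the call so that CPU time from earlier children is subtracted out, and it uses time.perf_counter for the wall-clock time. Note that ru_maxrss is a high-water mark over all reaped children rather than a running total, so it can't be subtracted the same way; it's also reported in kilobytes on Linux but bytes on macOS.

```python
import resource
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd to completion and report wall time, CPU time and peak RSS."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    returncode = subprocess.call(cmd)  # blocks until the child is reaped
    elapsed = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "returncode": returncode,
        "elapsed": elapsed,                          # wall-clock seconds
        "user": after.ru_utime - before.ru_utime,    # user CPU seconds
        "system": after.ru_stime - before.ru_stime,  # system CPU seconds
        # Peak RSS of the largest child waited for so far (platform units).
        "max_rss": after.ru_maxrss,
    }


if __name__ == "__main__":
    stats = measure([sys.executable, "-c", "sum(range(10**6))"])
    print(stats)
```

This only gives meaningful per-program numbers if nothing else in your process is reaping children at the same time.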

The last one is a lot simpler than it sounds. If you look at the source, there are only 3 places where os.waitpid is actually called, and only one of them will be the one that affects your code; I think it's the one in _try_wait. So (untested):

import os
import subprocess

class ResourcePopen(subprocess.Popen):
    # _try_wait is a private CPython detail and may change between versions.
    def _try_wait(self, wait_flags):
        """All callers to this function MUST hold self._waitpid_lock."""
        try:
            # wait4 behaves like waitpid but also returns the child's rusage.
            # Since Python 3.5 (PEP 475) EINTR is retried automatically, so
            # the old _eintr_retry_call helper is no longer needed.
            (pid, sts, res) = os.wait4(self.pid, wait_flags)
        except ChildProcessError:
            # This happens if SIGCLD is set to be ignored or waiting
            # for child processes has otherwise been disabled for our
            # process.  This child is dead, we can't get the status.
            pid = self.pid
            sts = 0
        else:
            self.rusage = res
        return (pid, sts)

def resource_call(*popenargs, timeout=None, **kwargs):
    """Run command with arguments.  Wait for command to complete or
    timeout, then return the returncode attribute and resource usage.

    The arguments are the same as for the Popen constructor.  Example:

    retcode, rusage = resource_call(["ls", "-l"])
    """
    with ResourcePopen(*popenargs, **kwargs) as p:
        try:
            retcode = p.wait(timeout=timeout)
            return retcode, p.rusage
        except:
            p.kill()
            p.wait()
            raise

And now:

retcode, rusage = resource_call(['spam', 'eggs'])
print('spam used {}s of system time'.format(rusage.ru_stime))

Compare that to using a hybrid with psutil (which won't even work when used this way on many platforms…):

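import subprocess
import psutil  # third-party package, not part of the stdlib

p = subprocess.Popen(['spam', 'eggs'])
ps = psutil.Process(p.pid)
p.wait()
# The child has already been reaped here, so this can raise psutil.NoSuchProcess.
print('spam used {}s of system time'.format(ps.cpu_times().system))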

Of course the latter isn't more complex for no good reason. It's more complex because it's a whole lot more powerful and flexible: you can also get all kinds of data that rusage doesn't include, you can get information every second while the process is running instead of waiting until it's done, you can use it on Windows, and so on…