The challenge
I need a password management tool that will be invoked by other processes (scripts of all sorts: Python, PHP, Perl, etc.) and that can identify and verify the calling script in order to perform access control: either return the password or exit with -1.
The current implementation
After looking into various frameworks, I have decided to use Python's keepassdb, which is able to handle KeePass 1.x backend database files, and to build my own access control overlay on top of it (since this can later be customized and integrated with our LDAP for user/group access). Access control is done by overloading the notes field of each entry to include a list of SHA-256 hashes of the scripts that are allowed to access the password. (Note that this also verifies that the script has not been changed by anyone.)
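A minimal sketch of how such a notes-based allowlist could be checked. The layout of the notes text (hex digests mixed with free-form text) and the helper names allowed_hashes and is_allowed are assumptions for illustration, not part of keepassdb:

```python
import hashlib
import re

# Assumption: the notes field contains SHA-256 hex digests (64 hex characters),
# possibly mixed with free-form text such as comments.
_SHA256_RE = re.compile(r"\b[0-9a-fA-F]{64}\b")


def allowed_hashes(notes):
    """Return the set of SHA-256 hex digests found in an entry's notes text."""
    return {m.lower() for m in _SHA256_RE.findall(notes or "")}


def is_allowed(script_bytes, notes):
    """True if the SHA-256 of script_bytes is listed in the notes text."""
    digest = hashlib.sha256(script_bytes).hexdigest()
    return digest in allowed_hashes(notes)
```

Because the digest covers the script's contents, any edit to the script changes the digest and the lookup fails until the entry's notes are updated.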
The password manager is called with the -p parameter, which is the PID of the calling script/application, and performs the following steps:
- Look recursively "up" from its own PID through its parents. The caller PID has to be found before we reach process 1, which is init with parent 0. This way we are sure we know who called this password manager instance.
- Get the full command line of that (parent) process and analyse it, looking for scripting languages including python, perl, php, bash, bat, groovy, etc. (shlex is used for this).
- Figure out the absolute path of the script and calculate its SHA-256 hash.
- Compare this to the database values. If it exists, the script is allowed to have the password, which is returned on stdout in a standard format. If not, exit with -1.
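The first step can be sketched roughly as follows. This is not the actual walk_ppids from the tool; the names ps_ppid and is_ancestor are hypothetical, and the parent lookup is passed in as a parameter so the walk can be exercised without real processes:

```python
import subprocess


def ps_ppid(pid):
    """Return the parent PID of pid via `ps`, or None if pid does not exist."""
    out = subprocess.run(
        ["ps", "-o", "ppid=", "-p", str(int(pid))],
        capture_output=True, text=True,
    ).stdout.strip()
    return int(out) if out else None


def is_ancestor(claimed_pid, start_pid, get_ppid=ps_ppid):
    """Walk up from start_pid; True only if claimed_pid is met before PID 1."""
    claimed_pid = int(claimed_pid)
    pid = get_ppid(start_pid)
    seen = set()  # guards against a lookup function that loops
    while pid is not None and pid > 1 and pid not in seen:
        if pid == claimed_pid:
            return True
        seen.add(pid)
        pid = get_ppid(pid)
    return False
```

In the tool this would be called as something like is_ancestor(pid_from_dash_p, os.getpid()), rejecting the request when it returns False.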
The problem
The above implementation works nicely for legitimate scripts, but it is very easy to confuse. Let caller.py be a script that is allowed access to a specific entry e. When it runs, the command line looks like python /path/to/caller.py arg1 arg2. The code that parses the command line is:
# Excerpt: walk_ppids, hash_file, lg and pid are defined elsewhere in the tool.
cmd = walk_ppids(pid)
lg.debug(cmd)
if cmd is False:
    lg.error("PID %s is not my parent process or not a process at all!" % pid)
    sys.exit(-1)
cmd_parts = shlex.split(cmd)
running_script = ""
# Take the first token that ends in a known script extension.
for p in cmd_parts:
    if re.search(r"\.(php|py|pl|sh|groovy|bat)$", p, re.I):
        running_script = p
        break
if not running_script:
    lg.error("Cannot identify this script's name/path!")
    sys.exit(-1)
running_script = os.path.abspath(running_script)
lg.debug("Found "+running_script)
phash = hash_file(open(running_script, 'rb'), hashlib.sha256())
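The hash_file helper is not shown above. A minimal sketch that matches the call signature used there, assuming it returns the hex digest and reads the file in chunks (the blocksize parameter is my own addition):

```python
import hashlib


def hash_file(fileobj, hasher, blocksize=65536):
    """Feed fileobj into hasher in fixed-size chunks and return the hex digest."""
    # iter() with a sentinel stops as soon as read() returns an empty bytes object.
    for chunk in iter(lambda: fileobj.read(blocksize), b""):
        hasher.update(chunk)
    return hasher.hexdigest()
```

Opening the file in a with block (with open(running_script, 'rb') as fh: phash = hash_file(fh, hashlib.sha256())) would also make sure the handle is closed.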
The command line of the parent process is acquired using:
os.popen("ps -p %s -o args=" % ppid).read().strip()
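Since os.popen passes the string through a shell, a PID value that reaches this call unvalidated could inject shell commands. A sketch of the same ps call built as an argument list instead; the function names ps_args_command and parent_cmdline are hypothetical:

```python
import subprocess


def ps_args_command(pid):
    """Build the argv list for `ps -p PID -o args=`; rejects non-numeric PIDs."""
    # int() raises ValueError for anything that is not a plain integer.
    return ["ps", "-p", str(int(pid)), "-o", "args="]


def parent_cmdline(pid):
    """Return the command line of pid as reported by ps, or '' if not found."""
    result = subprocess.run(ps_args_command(pid), capture_output=True, text=True)
    return result.stdout.strip()
```

Note that ps prints the arguments joined by spaces, so an argument that itself contains a space cannot be told apart from two arguments once the string is split again. On Linux, /proc/PID/cmdline keeps the arguments NUL-separated.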
Now, the easiest way to confuse the parsing code above is to create a shell script without the .sh extension that takes caller.py as its first argument. The shell script does not use its arguments; instead it invokes the password manager, querying for the entry e. The command line would look like fake_sh ./caller.py, and thus the above code returns the password... which is the wrong thing to do.
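The confusion can be reproduced in isolation. This sketch wraps the same extension scan in a function (guess_script is a name made up for the example) and feeds it the two command lines:

```python
import re
import shlex

SCRIPT_RE = re.compile(r"\.(php|py|pl|sh|groovy|bat)$", re.I)


def guess_script(cmdline):
    """Return the first token that ends in a known script extension, else ''."""
    for token in shlex.split(cmdline):
        if SCRIPT_RE.search(token):
            return token
    return ""


print(guess_script("python /path/to/caller.py arg1 arg2"))  # /path/to/caller.py
print(guess_script("fake_sh ./caller.py"))                  # ./caller.py
```

Both calls resolve to caller.py, so both would be hashed as the allowed script even though the second process never runs it.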
The Questions
One would assume that this is a common problem, solved long ago without programmers hard-coding passwords into scripts/apps, but I researched it for a few days and could not find anything that works in a similar way. I understand that this question is rather open-ended, so I will accept answers to the following:
- Am I re-inventing the wheel? Is there a framework/software that will do something similar?
- Is this the correct approach, relying on PIDs? Is there another way?
- Implementation-wise, could the code posted be improved to be more robust and not so easily confused? (the shlex analysis part)
Improvement: Making the rules more strict
The first step was to confirm that the correct extension runs on the correct interpreter, which means that caller.py cannot run on /bin/bash.
Similar vulnerabilities can be exploited with python, for example with the command python -W ./caller.py ./myUberHack.py. A command line analyzer that looks for the first .py argument to the interpreter will think that caller.py is running... which it is not.
Building all the invocation rules for all interpreters would be too time-consuming, so I hard-code the assumptions. These are stored in a tuple and each line is:
And the validation code now is:
Can't think of anything better at the moment...
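For illustration only, stricter rules of the kind described in this section might take a shape like the following. The RULES table and strict_script are hypothetical and are not the original tuple or validation code; the sketch assumes the only accepted form is interpreter, then script, then arguments, with no interpreter options:

```python
import os
import re
import shlex

# Hypothetical rule table: script extension -> pattern the interpreter's
# basename must match.
RULES = {
    ".py": re.compile(r"^python(\d+(\.\d+)*)?$"),
    ".pl": re.compile(r"^perl$"),
    ".php": re.compile(r"^php$"),
    ".sh": re.compile(r"^(ba)?sh$"),
}


def strict_script(cmdline):
    """Return the script path only for `<interpreter> <script> [args...]`."""
    parts = shlex.split(cmdline)
    if len(parts) < 2:
        return None
    interpreter, script = os.path.basename(parts[0]), parts[1]
    if script.startswith("-"):
        return None  # interpreter options are rejected outright
    ext = os.path.splitext(script)[1].lower()
    rule = RULES.get(ext)
    if rule is None or not rule.match(interpreter):
        return None
    return script
```

This rejects both fake_sh ./caller.py and python -W ./caller.py ./myUberHack.py, but it still trusts the interpreter's name: an executable that is merely called python would pass, so comparing against absolute interpreter paths would be stricter.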