Fastest way to compare strings in Python

Posted 2019-04-22 23:33

I'm writing a script in Python that will allow the user to input a string, which will be a command that instructs the script to perform a specific action. For the sake of argument, I'll say my command list is:

lock
read
write
request
log

Now, I want the user to be able to enter the word "log" and have it perform a specific action, which is very simple. However, I would also like to match partial words. For example, if a user enters "lo", it should match "lock", because it appears higher in the list. I've tried using strncmp from libc via ctypes to accomplish this, but have yet to make heads or tails of it.

10 answers
仙女界的扛把子
Reply #2 · 2019-04-23 00:15

Replace with your favorite string compare function. Fairly fast, and to the point.

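# `commands` is your command list; next() returns the first match in list order, or None.
matches = (x for x in commands if x[:len(stringToSearchFor)] == stringToSearchFor)
print(next(matches, None))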
再贱就再见
Reply #3 · 2019-04-23 00:16

jaro_winkler() in python-Levenshtein might be what you're looking for.
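
python-Levenshtein is a third-party package, so as a rough stand-in here is a sketch of the same fuzzy-matching idea using the standard library's difflib. Note that difflib uses a different similarity measure than Jaro-Winkler, and that the helper name closest_command and the 0.6 cutoff are assumptions made for illustration only.

```python
import difflib

cmds = ["lock", "read", "write", "request", "log"]

def closest_command(user_input, commands, cutoff=0.6):
    """Return the command most similar to user_input, or None if nothing is close enough."""
    # get_close_matches returns candidates sorted best-first by similarity ratio.
    matches = difflib.get_close_matches(user_input, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(closest_command("lok", cmds))   # lock
print(closest_command("wrte", cmds))  # write
print(closest_command("zzz", cmds))   # None
```

Unlike a prefix match, this picks the most similar command rather than the first one in list order, so it is better suited to forgiving typos than to abbreviations.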

我命由我不由天
Reply #4 · 2019-04-23 00:18
import timeit

cmds = ["test"] * 10000  # 10,000 commands, all of which match the test prefix

def get_cmds(user_input):
    return [c for c in cmds if c.startswith(user_input)]

if __name__=='__main__':
    t = timeit.Timer("get_cmds('te')", "from __main__ import get_cmds")
    print("%0.3f seconds" % t.timeit(number=1))

#>>> 0.008 seconds

So basically, per my comment, you're asking how to optimise an operation that takes no measurable time or CPU. I used 10,000 commands here and the test string matches every one just to show that even under extreme circumstances you could still have hundreds of users doing this and they would never see any lag.

看我几分像从前
Reply #5 · 2019-04-23 00:22

If I understand your question correctly, you want a snippet that returns the answer as soon as it has it, without traversing further through your command list. This should do what you want:

def check_input(some_string, code_book):
    # any() short-circuits: it stops at the first command that starts with some_string.
    return any(cmd.startswith(some_string) for cmd in code_book)
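
If the matched command itself is needed rather than a yes/no answer, the same early-exit idea can be sketched with next() over a lazy generator. The function name first_match is hypothetical, and the command list is the one from the question.

```python
def first_match(prefix, commands):
    """Return the first command (in list order) that starts with prefix, or None."""
    # The generator is lazy, so next() stops scanning at the first hit.
    return next((c for c in commands if c.startswith(prefix)), None)

cmds = ["lock", "read", "write", "request", "log"]
print(first_match("lo", cmds))  # lock
print(first_match("re", cmds))  # read
print(first_match("x", cmds))   # None
```

Because the list is scanned in order, "lo" resolves to "lock" rather than "log", which is the behaviour the question asks for.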
Viruses.
Reply #6 · 2019-04-23 00:23

This is optimized at runtime like you requested... (although most likely not needed)

Here is a simple bit of code that takes an input dictionary mapping each command to a function and produces an output dictionary mapping every non-duplicate sub-command (prefix) to the same function.

So you run this when you start your service, and then you have 100% optimized lookups. I am sure there is a more clever way to do this, so feel free to edit.

commands = {
  'log': log_function,
  'exit': exit_function,
  'foo': foo_function,
  'line': line_function,
  }

cmap = {}
kill = set()
for command in commands:
  for pos in range(1, len(command) + 1):  # every prefix length, including the full command
    subcommand = command[0:pos]
    if subcommand in cmap:
      kill.add(subcommand)
      del(cmap[subcommand])
    if subcommand not in kill:
      cmap[subcommand] = commands[command]

# cmap is now the following. Note that the duplicate prefix 'l' (shared by 'log' and 'line') has been removed.
{
  'lo': log_function,
  'log': log_function,
  'e': exit_function,
  'ex': exit_function,
  'exi': exit_function,
  'exit': exit_function,
  'f' : foo_function,
  'fo' : foo_function,
  'foo' : foo_function,
  'li' : line_function,
  'lin' : line_function,
  'line' : line_function,
}
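
For reference, here is a self-contained sketch of the same prefix-map idea that can be run as-is. The handler functions are hypothetical stand-ins that simply return their own name, and build_prefix_map is a name chosen here for illustration.

```python
def build_prefix_map(commands):
    """Map every unambiguous prefix of each command name to that command's handler."""
    cmap = {}
    ambiguous = set()
    for name, handler in commands.items():
        for end in range(1, len(name) + 1):
            prefix = name[:end]
            if prefix in ambiguous:
                continue  # already known to be shared by two or more commands
            if prefix in cmap:
                # A second command shares this prefix: drop it for good.
                del cmap[prefix]
                ambiguous.add(prefix)
            else:
                cmap[prefix] = handler
    return cmap

# Hypothetical stand-in handlers; each just returns its own name.
def log_function():
    return "log"

def exit_function():
    return "exit"

def foo_function():
    return "foo"

def line_function():
    return "line"

commands = {
    "log": log_function,
    "exit": exit_function,
    "foo": foo_function,
    "line": line_function,
}

cmap = build_prefix_map(commands)
print(sorted(cmap))
print(cmap["li"]())   # line
print(cmap.get("l"))  # None, because 'l' is shared by 'log' and 'line'
```

One caveat of this approach: if one command is a complete prefix of another (say "log" and a hypothetical "logout"), the shorter command's full name is treated as ambiguous and dropped too, so it would need special handling.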
别忘想泡老子
Reply #7 · 2019-04-23 00:26

If you are accepting input from a user, then why are you worried about the speed of comparison? Even the slowest technique will be far faster than the user can perceive. Use the simplest, most understandable code you can, and leave efficiency concerns for tight inner loops.

cmds = [
    "lock",
    "read",
    "write",
    "request",
    "log",
    ]

def match_cmd(s):
    matched = [c for c in cmds if c.startswith(s)]
    if matched:
        return matched[0]
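
As a usage sketch, the matcher can be wired to a dispatch table so that a partial command runs the matching action. The handlers dictionary and the run function below are hypothetical and only return a string so that the flow is easy to follow.

```python
cmds = ["lock", "read", "write", "request", "log"]

def match_cmd(s):
    matched = [c for c in cmds if c.startswith(s)]
    if matched:
        return matched[0]

# Hypothetical handlers: one trivial action per command.
handlers = {name: (lambda name=name: "running " + name) for name in cmds}

def run(user_input):
    """Normalise the input, resolve it to a command, and call its handler."""
    text = user_input.strip().lower()
    if not text:
        # An empty prefix would otherwise match the first command in the list.
        return "empty command"
    cmd = match_cmd(text)
    if cmd is None:
        return "unknown command: " + text
    return handlers[cmd]()

print(run("lo"))     # running lock
print(run("log"))    # running log
print(run("  WR "))  # running write
print(run("zz"))     # unknown command: zz
```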