I am currently searching for a way to find patterns in a list of integers, but the method I am going to use would be applicable to strings and other lists with different elements of course. Now let me explain what I am looking for.
I want to find the longest repeating pattern in a list of integers. For example,
[1, 2, 3, 4, 1, 2, 3]
# This list would give 1, 2, 3
Overlapping patterns should be discarded. ( Not certain )
[1, 1, 1, 1, 1]
# Should give 1, 1 Not 1, 1, 1, 1
Here is what did not help me.
Finding patterns in a list (Did not understand the logic behind the first answer, very little explanation. And second answer solves the problem only if the pattern is known before solving.)
Finding integer pattern from a list (Pattern is given and number of occurrence is wanted. Different than my question.)
Longest common subsequence problem (Most people dealed with this problem however it is not close to mine. I need consecutive elements while searching for a pattern. However in this, seperate elements also counted as subsequences.)
Here what I tried.
def pattern(seq):
n = len(seq)
c = defaultdict(int) # Counts of each subsequence
for i in xrange(n):
for j in xrange(i + 1, min(n, n / 2 + i)):
# Used n / 2 because I figured if a pattern is being searched
# It cant be longer that the half of the list.
c[tuple(seq[i:j])] += 1
return c
As you see, It finds all the sublists and check for repeats. I found this approach a bit naive(and inefficient) and I am in need of a better way. Please help me. Thanks in advance.
Note1: The list is predetermined but because of my algorithms failure, I can only check some parts of the list before freezing the computer. So the pattern I am trying to find can very well be longer than the half of the search list, It can even be longer than the search list itself because I am searching only a part of the original list.If you present a better method than I am using, I can search a larger part of the original list so I will have a better chance at finding the pattern. (If there is one.)
Note2: Here is a part of the list if you want to test it yourself. It really seems like that there is a pattern, but I cannot be sure before I test it with a reliable code. Sample List
Note3: I approach this as a serious problem of data mining. And will try to learn if you make a long explanation. This feels like a much more important problem than LCS, however LCS is much more popular :D This method I am trying to find feels like the methods scientists use to find DNA patterns.
The Code
Ignoring the "no overlapping" requirement, here's the code I used:
Using that, I found 8 distinct patterns of length 303 in your dataset. The program ran pretty fast, too.
Pseudocode Version
I have an answer. It works. (without overlapping) but it is for python3