Why we can't reliably check palindromes in one pass

I came across the concept of a "palindrome". I am trying to understand by reading through wikipedia

http://en.wikipedia.org/wiki/Palindrome#Computation_theory

This item catches my attention

This means that it is not possible for a computer with a finite amount of memory to reliably test palindromes in a single pass.

I thought to check if the given string "palindrome" is pretty straight forward? I got a quick code.

public class Utils {
    private static final String SPECIAL_CHARACTERS_REGEX = "[\\s!,]";

    private static String removeSpecialCharacters(String string) {
        return string.replaceAll(SPECIAL_CHARACTERS_REGEX, "");
    }

    public static boolean isPalindrome(String string) {
        String str = string.toLowerCase();
        str = removeSpecialCharacters(str);

        if (str.isEmpty()) {
            return false;
        }

        int head = 0;
        int tail = str.length() - 1;
        while (head < tail) {
            char h = str.charAt(head);
            char t = str.charAt(tail);

            if (h != t) {
                return false;
            }

            head++;
            tail--;
        }

        return true;
    }
}

      

At first glance it seems to work fine.

public static void main(String[] args) {
    String s = "";
    System.out.println(s + " -> " + Utils.isPalindrome(s)); // false

    s = "1";
    System.out.println(s + " -> " + Utils.isPalindrome(s)); // true

    s = "12";
    System.out.println(s + " -> " + Utils.isPalindrome(s)); // false

    s = "123";
    System.out.println(s + " -> " + Utils.isPalindrome(s)); // false

    s = "taco cat";
    System.out.println(s + " -> " + Utils.isPalindrome(s)); // true

    s = "A man, a plan, a canal, Panama!";
    System.out.println(s + " -> " + Utils.isPalindrome(s)); // true

    s = "Amor, Roma";
    System.out.println(s + " -> " + Utils.isPalindrome(s)); // true
}

      

If so, may I know why wikipedia states that it is not possible to check palindromes in one pass? Am I missing something?

+3


source to share


3 answers


The problem isn't checking if the string is already a palindrome in memory. If you can get the string into memory, the check can be done easily, but you've already used one pass, reading the string into memory, so the check is the second pass.

But this will only work if the entire string fits into memory. Since the premise is that memory is finite, this means that you cannot check if strings that are larger than the memory capacity are palindromes, which is what the sentence you quoted suggests.



In contrast, there are many checks that you can do with finite memory on any length of strings you want. For example, you can check if the length of a string is divisible by 5. You can check whether each should be immediately followed a

in a string b

. In general, you can check if a string matches any regular expression (here I mean a regular expression in the mathematical sense, as opposed to patterns recognized by the "regular expression" libraries.) But since you cannot describe a set of all palindromes using a regular expression, you cannot check in one pass that an arbitrarily long string is a palindrome using only a limited amount of memory.

+7


source


You just missed the first line before the said line which says: -

In automata theory, the set of all palindromes in a given alphabet is a typical example of a language that is context free but not regular.

Here they are talking about listing all possible palindromes in a given alphabet.

Tell us about the binary alphabet, A = {0,1}. Given language -> number of palindromes on alphabet A. There can be an infinite number of palindromes such as 1,0,11,00,101,111, ... and so on.

In the case of palindrome languages , it is at least impossible to get an idea of ​​the middle element and store that in memory (track) in a single pass on finite memory systems . To do this, you need to keep track of all sorts of characters in the string being evaluated, and how do you determine that the incoming characters change to the one you were visiting in just one pass on a finite memory system?

Wikipedia also states: -



In addition, a set of palindromes cannot be reliably tested by a deterministic push automaton. When reading a palindrome from left to right, it is essentially impossible to find "middle" until all words have been fully read.

Such a language, which consists of all such strings, cannot be evaluated in a single pass on a system with finite memory and therefore, due to limited memory constraints, cannot be an ordinary language (an ordinary language can be defined as a language recognized by a finite state machine --- This language cannot be recognized on a finite-memory system because finite-memory systems cannot have multiple passes ). Thus, the language cannot evaluate all these sets of strings for palindromes. Hence, this is definitely an example of a finite memory system .

This question comes back to one of the well-known questions of automata theory: -

For the E language, E = {0^i 1^j | i > j}

it is not regular. And, therefore, it cannot be proven on a finite memory machine. You need to reformulate the lemma theorem to prove that the given language is not regular. Then you will need to bring pushdown-automata back here. But this also has its limits (let's not talk about it here)

Also, the next line clearly intends to say that modern computers with huge memory and latest technology involving many passes will easily achieve the same --->

(For practical purposes with modern computers, this limitation only applies to incredibly long strings of letters.) // Again, when memory runs out on modern computers, while checking for palindromes.

+2


source


If the string is n long, your code uses O (log (n)) memory (indices head

and tail

need O (log (n)) bits). Thus, your program needs an unlimited amount of memory to be able to test arbitrarily long strings.

Application analysis of the complexity of the real code is difficult and usually not theoretically sound: it is obvious that in Java head

and tail

has ints and 32 bits each. But complexity analysis deals with arbitrarily large costs, not just large ones, sufficient for all practical purposes that are supported by real programming languages, and the difference can be hard to come to terms with (as here).

0


source







All Articles