Removing all non-letter characters from a string in C#

I want to remove all non-letter characters from a string. By that I mean everything that is not a letter of the alphabet or an apostrophe. This is the code I have.

public static string RemoveBadChars(string word)
{
    char[] chars = new char[word.Length];
    for (int i = 0; i < word.Length; i++)
    {
        char c = word[i];

        if ((int)c >= 65 && (int)c <= 90)
        {
            chars[i] = c;
        }
        else if ((int)c >= 97 && (int)c <= 122)
        {
            chars[i] = c;
        }
        else if ((int)c == 44)
        {
            chars[i] = c;
        }
    }

    word = new string(chars);

    return word;
}

      

It's close, but doesn't quite work. The problem is this:

[in]: "(the"
[out]: " the"

      

This gives me a space instead of "(". I want to completely remove the character.

+3



5 answers


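A regex would be simpler, but to answer your question: the problem with your code is that you use i as the index for both reading and writing. When a character is skipped, its slot in chars keeps the default value '\0', which is what shows up as the "space" in your output. Use a separate index for writing, and build the string from only the part of the array that was filled. So, something like this:

public static string RemoveBadChars(string word)
{
    char[] chars = new char[word.Length];
    int myindex = 0; // next free position in the output array
    for (int i = 0; i < word.Length; i++)
    {
        char c = word[i];

        if ((int)c >= 65 && (int)c <= 90) // 'A'..'Z'
        {
            chars[myindex] = c;
            myindex++;
        }
        else if ((int)c >= 97 && (int)c <= 122) // 'a'..'z'
        {
            chars[myindex] = c;
            myindex++;
        }
        else if ((int)c == 39) // 39 is the apostrophe; 44 would be a comma
        {
            chars[myindex] = c;
            myindex++;
        }
    }

    // Copy only the filled part; the rest of the array still holds '\0'.
    word = new string(chars, 0, myindex);

    return word;
}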
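
The same idea can also be written with a StringBuilder, which removes the need to track the write index by hand. This is only a sketch: the class name CharFilter and the method name KeepLettersAndApostrophes are made up for illustration, and it assumes that only ASCII letters and the apostrophe should be kept.

```csharp
using System.Text;

public static class CharFilter
{
    // Hypothetical helper: keeps A-Z, a-z and apostrophes, drops everything else.
    public static string KeepLettersAndApostrophes(string word)
    {
        // The result can never be longer than the input, so reserve that much.
        var sb = new StringBuilder(word.Length);
        foreach (char c in word)
        {
            bool isUpper = c >= 'A' && c <= 'Z';
            bool isLower = c >= 'a' && c <= 'z';
            if (isUpper || isLower || c == '\'')
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

With this, CharFilter.KeepLettersAndApostrophes("(the") returns "the" with no leftover placeholder character.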

      

+2



There is a method in the Char class that can help. Use Char.IsLetter() to find the letters (with an extra check for the apostrophe), then pass the result to the string constructor:

using System;
using System.Linq; // Where() and ToArray()

var input = "(the;':";

var result = new string(input.Where(c => Char.IsLetter(c) || c == '\'').ToArray());

      



Output: the'
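
One thing to keep in mind: Char.IsLetter() returns true for any Unicode letter, not only A-Z and a-z, so an accented character would be kept as well. If only ASCII letters should survive, a sketch along these lines may be closer to what the question asks for. AsciiFilter and KeepAsciiLetters are hypothetical names used only for this example.

```csharp
using System.Linq;

public static class AsciiFilter
{
    // Hypothetical helper: keeps only A-Z, a-z and the apostrophe.
    public static string KeepAsciiLetters(string input)
    {
        return new string(input
            .Where(c => (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '\'')
            .ToArray());
    }
}
```

For an input such as "caf\u00e9's", this returns "caf's", whereas the Char.IsLetter() version would keep the accented letter.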

+6



Use a regular expression (Regex) instead.

public static string RemoveBadChars(string word)
{
    Regex reg = new Regex("[^a-zA-Z']");
    return reg.Replace(word, string.Empty);
}

      

If you want to keep spaces:

Regex reg = new Regex("[^a-zA-Z' ]");

      

+4



private static Regex badChars = new Regex("[^A-Za-z']");

public static string RemoveBadChars(string word)
{
    return badChars.Replace(word, "");
}

      

This creates a regular expression consisting of a character class (enclosed in square brackets) that matches anything that is not (the leading ^ within the character class) A-Z, a-z, or '. It then defines a function that replaces whatever matches the expression with an empty string.
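
To see the negated character class at work, here is a small self-contained sketch built around the same pattern. The class name RegexDemo is made up for illustration, and the readonly modifier is an addition that simply prevents the shared Regex from being reassigned.

```csharp
using System.Text.RegularExpressions;

public static class RegexDemo
{
    // Same pattern as in the answer: anything that is not A-Z, a-z or an apostrophe.
    private static readonly Regex BadChars = new Regex("[^A-Za-z']");

    public static string RemoveBadChars(string word)
    {
        // Every matching character is replaced with nothing, i.e. removed.
        return BadChars.Replace(word, "");
    }
}
```

RegexDemo.RemoveBadChars("(the") returns "the", "don't!" becomes "don't", and "a1b2 c3" becomes "abc", since digits and spaces fall outside the allowed set.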

+1



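The question asks to remove non-alphabetic characters. A shorter pattern is \W, but note that it matches non-word characters, so digits and underscores are kept and apostrophes are removed.

public static string RemoveNoneLetterChars(string word)
{
    Regex reg = new Regex(@"\W");
    return reg.Replace(word, String.Empty); // or reg.Replace(word, " ") to leave a space instead
}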
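
For comparison, \W is the negation of \w, and \w covers letters, digits, and the underscore, so the two patterns do not behave the same way. The sketch below runs both over one input; PatternComparison and its method names are hypothetical and exist only for this example.

```csharp
using System.Text.RegularExpressions;

public static class PatternComparison
{
    // Removes non-word characters: digits and '_' survive, apostrophes do not.
    public static string StripNonWord(string word)
    {
        return Regex.Replace(word, @"\W", string.Empty);
    }

    // Removes everything except A-Z, a-z and the apostrophe.
    public static string StripNonLetters(string word)
    {
        return Regex.Replace(word, "[^a-zA-Z']", string.Empty);
    }
}
```

For the input "it's_1!", StripNonWord returns "its_1" while StripNonLetters returns "it's", which is the behavior the question describes.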

      

0

