Determine if a string is a stop word or not in Python

I want to determine whether a string is a stop word or not. I wrote the Python code below for this, but it does not give the correct result:

stopwords = [ "a","about","above","after","again","against","all","am","an","and","any","are","aren't","as","at","be","because","been","before","being","below","between","both","but","by","can't","cannot","could","couldn't","did","didn't","do","does","doesn't","doing","don't","down","during","each","few","for","from","further","had","hadn't","has","hasn't","have","haven't","having","he","he'd","he'll","he's","her","here","here's","hers","herself","him","himself","his","how","how's","i","i'd","i'll","i'm","i've","if","in","into","is","isn't","it","it's","its","itself","let's","me","more","most","mustn't","my","myself","no","nor","not","of","off","on","once","only","or","other","ought","our","ours","ourselves","out","over","own","same","shan't","she","she'd","she'll","she's","should","shouldn't","so","some","such","than","that","that's","the","their","theirs","them","themselves","then","there","there's","these","they","they'd","they'll","they're","they've","this","those","through","to","too","under","until","up","very","was","wasn't","we","we'd","we'll","we're","we've","were","weren't","what","what's","when","when's","where","where's","which","while","who","who's","whom","why","why's","with","won't","would","wouldn't","you","you'd","you'll","you're","you've","your","yours","yourself","yourselves"];
file="C:/Python26/test.txt";
f=open("stopwords.txt",'w');
with open(file,'r') as rf:
    lines = rf.readlines();
    for word in lines:
        if word in stopwords:
            f.write(word.strip("\n")+"\t"'1'"\n");            
        else:
            f.write(word.strip("\n")+"\t"'0'"\n");
    f.close();

      

As a result, I get 0 for every token/line stored in the test.txt file.



2 answers


You are comparing whole lines against the list of stop words, because you iterate over the lines returned by rf.readlines() and not over single words. Each line also still ends with "\n", so even a line that holds a single word never matches an entry in the list. You need to iterate over every word in every line, so an additional loop is required. Add an inner loop like the one below to visit each word on each line:



for line in lines:
    for word in line.split():  # split() splits the line on whitespace and drops the newline
        if word in stopwords:
            f.write(word + "\t1\n")
        else:
            f.write(word + "\t0\n")
f.close()  # close the output file once, after all lines have been processed
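The same idea can be sketched as a small self-contained function, which makes it easy to check without touching any files. The helper name tag_stop_words, the abbreviated stop word list, and the lowercasing step are assumptions made for illustration; they are not part of the original code.

```python
# Abbreviated stop word list for illustration only (assumption);
# a frozenset gives fast membership tests.
STOP_WORDS = frozenset(["a", "about", "the", "is", "not"])

def tag_stop_words(lines, stop_words=STOP_WORDS):
    # Hypothetical helper: returns one "word<TAB>flag<NEWLINE>" string per word,
    # with flag "1" for stop words and "0" for everything else.
    tagged = []
    for line in lines:
        for word in line.split():  # split() also discards the trailing newline
            # Lowercasing is an added assumption so that "The" matches "the".
            flag = "1" if word.lower() in stop_words else "0"
            tagged.append(word + "\t" + flag + "\n")
    return tagged

print("".join(tag_stop_words(["The cat\n", "is not here\n"])))
```

The returned strings can be passed straight to f.write(), so the file handling from the question stays unchanged.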

      



The problem is in the way you split each line. A good option is to use a list comprehension: split each line into a list of words and iterate over that list.

stopwords = ["a", "about", "above", "after", "again", "against", "all", "am", "an", "and", "any", "are", "aren't", "as", "at", "be", "because", "been", "before", "being", "below", "between", "both", "but", "by", "can't", "cannot", "could", "couldn't", "did", "didn't", "do", "does", "doesn't", "doing", "don't", "down", "during", "each", "few", "for", "from", "further", "had", "hadn't", "has", "hasn't", "have", "haven't", "having", "he", "he'd", "he'll", "he's", "her", "here", "here's", "hers", "herself", "him", "himself", "his", "how", "how's", "i", "i'd", "i'll", "i'm", "i've", "if", "in", "into", "is", "isn't", "it", "it's", "its", "itself", "let's", "me", "more", "most", "mustn't", "my", "myself", "no", "nor", "not", "of", "off", "on", "once", "only", "or", "other", "ought", "our", "ours", "ourselves", "out", "over", "own", "same", "shan't", "she", "she'd", "she'll", "she's", "should", "shouldn't", "so", "some", "such", "than", "that", "that's", "the", "their", "theirs", "them", "themselves", "then", "there", "there's", "these", "they", "they'd", "they'll", "they're", "they've", "this", "those", "through", "to", "too", "under", "until", "up", "very", "was", "wasn't", "we", "we'd", "we'll", "we're", "we've", "were", "weren't", "what", "what's", "when", "when's", "where", "where's", "which", "while", "who", "who's", "whom", "why", "why's", "with", "won't", "would", "wouldn't", "you", "you'd", "you'll", "you're", "you've", "your", "yours", "yourself", "yourselves"]

def stop_word_test(test_word):
    if test_word in stopwords:
        return test_word.strip("\n")+"\t"'1'"\n"
    else:
        return test_word.strip("\n")+"\t"'0'"\n"

with open("c:\\stopwords.txt", 'w') as write_file:
    with open("C:\\test.txt", 'r') as r_file:
        [write_file.write(value) for value in [stop_word_test(word) for line in r_file.readlines() for word in "".join((char if char.isalpha() else " ") for char in line).split()]]

      



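In the example above, every character that is not a letter is replaced with a space before the line is split, so punctuation never stays attached to a word. Note that this also splits contractions at the apostrophe: "don't" becomes "don" and "t", so the contraction entries in the list never match.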
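To keep contractions such as "don't" in one piece, one option is to extract tokens with a regular expression instead of blanking out every non-letter. This is only a sketch: the pattern, the lowercasing, and the function name tokenize are assumptions, and it handles plain ASCII letters only.

```python
import re

# Assumed token pattern: a run of letters, optionally followed by
# apostrophe-joined runs of letters (e.g. "don't", "they've").
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)*")

def tokenize(line):
    # Hypothetical helper: lowercase the line, then return every match of the pattern.
    return TOKEN_RE.findall(line.lower())

print(tokenize("Don't stop, it's here!"))
```

Each token returned by tokenize() can then be passed to stop_word_test() as before.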

Also, there is no need for ; at the end of a statement in Python.
