Define words in Python

This might seem like a duplicate of: Does Python define a word?

However, it is not a duplicate, because I am trying to implement the answer from that thread in my own code (it works for the OP there, but not for me).

Here is my function:

def define_word(user_define_input):
    srch = str(user_define_input[1])
    output_word = urllib.request.urlopen(""+srch+"?s=t")
    output_word = output_word.read()
    items=re.findall('<meta name="description" content="'+".*$",output_word,re.MULTILINE)
    for output_word in items:
        y=output_word.replace('<meta name="description" content="','')
        z=y.replace(' See more."/>','')
        m=re.findall('at, a free online dictionary with pronunciation, synonyms and translation. Look it up now! "/>',z)
        if m==[]:
            if z.startswith("Get your reference question answered by"):
                print ("Word not found!")
                print (z)
        print ("Word not found!")



>>> print (user_define_input) #to show what is in the list
>>> define <word entered> #prints out the list, in this case, the program ignores user_define_input[0] and looks for [1] which is the targeted word


Also, this contains a bit of HTML :/ Sorry, but that's what the other answer used.

This is the error I get when I try to use it:

File "/Users/******/GitHub/Multitool/", line 104, in define_word
items=re.findall('<meta name="description" content="'+".*$",output_word,re.MULTILINE)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/", line 210, in findall
return _compile(pattern, flags).findall(string)
TypeError: can't use a string pattern on a bytes-like object


Note: this is line 104 of the .py file:

items=re.findall('<meta name="description" content="'+".*$",output_word,re.MULTILINE)


Line 210 is the last line of this function:

def findall(pattern, string, flags=0):
    """Return a list of all non-overlapping matches in the string.

    If one or more capturing groups are present in the pattern, return
    a list of groups; this will be a list of tuples if the pattern
    has more than one group.

    Empty matches are included in the result."""
    return _compile(pattern, flags).findall(string) #line 210


If anything here is confusing, please tell me (I'm also not sure which tags to add for this :/). Thank you in advance. Feel free to change anything or even rewrite the whole thing, but please keep these names:

  • define_word (for function name)
  • user_define_input

If you would like to see git for this, please visit this link:


Edit: adding

output_word = output_word.decode()


or changing the read to

output_word = output_word.read().decode('iso-8859-2')


gave this output when I entered define test:

Test definition, the means by which the presence, quality, or genuineness of anything is determined; a means of trial.<meta property="og:url" content=""/><link rel="shortcut icon" href=""/><!--[if lt IE 9]><link rel="respond-proxy" id="respond-proxy" href="" /><![endif]--><!--[if lt IE 9]><link rel="respond-redirect" id="respond-redirect" href="" /><![endif]--><link rel="search" type="application/opensearchdescription+xml" href="" title=""/><link rel="publisher" href=""/><link rel="canonical" href=""/><link rel="stylesheet" href="" type="text/css" media="all"/><link rel="stylesheet" href="" type="text/css" media="all"/><script type="text/javascript">var searchURL="";var CTSParams={"infix":"","clkpage":"dic","clksite":"dict","clkld":0};</script>
Word not found!




3 answers

output_word = output_word.decode()


converts the bytes to a string.
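To make the effect of the decode step concrete, here is a minimal sketch that runs without any network access. The byte string `raw_page` is a made-up stand-in for the downloaded page, and the pattern with a non-greedy capture group is my own variation, not the pattern from the question.

```python
import re

# Hypothetical stand-in for the bytes that a network read would return.
raw_page = b'<meta name="description" content="Test definition, a means of trial. See more."/>'

# A str pattern; the capture group stops at the closing quote.
pattern = '<meta name="description" content="(.*?)"'

try:
    re.findall(pattern, raw_page)   # str pattern applied to bytes
    mixing_types_failed = False
except TypeError:
    mixing_types_failed = True

text_page = raw_page.decode()       # bytes -> str, UTF-8 by default
descriptions = re.findall(pattern, text_page)
print(descriptions)
```

The first call raises the same TypeError as in the question, and the second one works once the data is a str. The non-greedy group may also help with the trailing markup in the question's output, since `.*$` runs to the end of the line and the page apparently puts many tags on one line.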


This is the latest state of the script from the chat (still far from perfect ...):

import requests
from lxml import html

def define_word(word):
    # The base URL in front of "{}" is missing in the source.
    response = requests.get("{}?s=t".format(word))
    tree = html.fromstring(response.text)
    title = tree.xpath('//title/text()')
    defs = tree.xpath('//div[@class="def-content"]/text()')
    # print(defs)

    # Join the text nodes, then keep only the non-empty lines.
    defs = ''.join(defs)
    defs = defs.split('\n')
    defs = [d for d in defs if d]
    for d in defs:
        print(d)





urllib.request.urlopen(...).read() returns a bytes object. The exception indicates that you cannot apply a Python string (str) regular expression pattern to a bytes object.

The bytes will (usually) be encoded text, and in this case it looks like UTF-8 encoded data. Therefore, you need to decode the bytes into a Python string so that a string pattern can be applied to it:

output_word = urllib.request.urlopen(""+srch+"?s=t")
output_word = output_word.read().decode('utf8')


This should fix the problem for you.

You need to know which encoding to use. This can be determined by looking at the Content-Type response header, which for this URL is Content-Type: text/html; charset=UTF-8. Also, since this is HTML content, you can look for the tag <meta http-equiv="Content-type" ... in the page itself.
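As a rough sketch of reading the charset out of such a header value, the standard library's email.message.Message can parse it. The helper name `charset_from_content_type` and the UTF-8 fallback are assumptions of mine, not part of the question's code.

```python
from email.message import Message

def charset_from_content_type(header_value, default="utf-8"):
    """Return the charset named in a Content-Type header value, or default."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset in lowercase, or None.
    return msg.get_content_charset() or default

raw = b"caf\xc3\xa9"  # example UTF-8 encoded bytes
encoding = charset_from_content_type("text/html; charset=UTF-8")
decoded = raw.decode(encoding)
print(encoding, decoded)
```

With a response object from urllib.request.urlopen, response.headers.get_content_charset() should give the same lookup directly, since the headers object is built on the same Message class.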


Finally, you can use the requests library, which will handle this for you:

import requests
r = requests.get(""+srch+"?s=t")
output_word = r.text




After a few changes, this is the code I stuck with, although it still has a few flaws.

import requests
from lxml import html

def define_word(user_define_input):
    try:
        # The base URL in front of "{}" is missing in the source.
        response = requests.get("{}?s=t".format(user_define_input[1]))
    except IndexError:
        print("You have not entered a word!")
        return  # nothing to look up, so stop here
    tree = html.fromstring(response.text)
    title = tree.xpath('//title/text()')
    defs = tree.xpath('//div[@class="def-content"]/text()')
    defs = ''.join(defs)
    defs = defs.replace("() ", "")
    defs = defs.split('\n')
    defs = [d for d in defs if d]
    for d in defs:
        print(d)

and this to break the user input into a list with two elements:

def split_line_test(user_input):
    global user_define_input
    user_define_input = user_input.split()
    if (user_define_input[0] == "define"): #define is user_define_input[0] while user_define_input[1] is the word that will be searched up
        return True
    if (user_define_input[0] == "weather"): #you can ignore this, it is for my other function
        return True
    return False
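One of the remaining flaws is that the function above relies on a global and fails on empty input. A possible variant, only as a sketch: the name `parse_command` and the `known_commands` parameter are hypothetical, and it returns the pieces instead of storing them globally.

```python
def parse_command(user_input, known_commands=("define", "weather")):
    """Split a line into (command, argument); return None if it is not a known command."""
    parts = user_input.split()
    if not parts or parts[0] not in known_commands:
        return None
    # The argument is optional, so "define" alone yields ("define", None).
    argument = parts[1] if len(parts) > 1 else None
    return parts[0], argument

print(parse_command("define test"))
```

The caller can then check for None, or for a missing argument, before building the list that define_word expects.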


So thank you for helping me fix the code :)


