Python - concatenating string characters with space

I am trying to handle string input. I first joined the input with \n, so I could have each word on one line (this is what I need):

some
random
words
written

      

and convert it to something like this:

s o m e
r a n d o m
w o r d s
w r i t t e n

      

But for some reason I get an extra space at the beginning of some lines, though not every line. There are no spaces in the joined input; I checked on purpose. I don't know where these extra spaces come from.

Here's my code:

import re

input = "some random words written"
string = '\n'.join(re.findall(r"\w{4,}", input))  # regex because I need words of at least 4 characters
space = " ".join(string)
print(space)

      

This gives me something like this:

s o m e
 r a n d o m
 w o r d s
 w r i t t e n

      

Can anyone explain why?


4 answers


I wouldn't use regular expressions for this.

[x for x in input.split() if len(x) > 3]

      

... will filter out words less than 4 characters long.

[' '.join(y) for y in [x for x in input.split() if len(x) > 3]]

      



... will turn this into a list of "words" with each word "spaced".

So you can do it all in one expression:

'\n'.join([' '.join(y) for y in [x for x in input.split() if len(x) > 3]])

      

It is often best to build functional code snippets with an iterative, bottom-up approach like the one I showed here. Also, regular expressions tend to be slower and more error-prone than plain string methods: you rely on a complex matching engine to interpret and apply your patterns. When you can avoid them, that's usually a good thing; the code is likely to run faster and be more reliable.
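As a rough sketch of that bottom-up approach, each step can be given its own name and checked before the next one is stacked on top. The variable names here (text, long_words, spaced_words, result) are illustrative only and do not come from the question:

```python
text = "some random words written"

# Step 1: keep only the words with at least four characters.
long_words = [word for word in text.split() if len(word) > 3]

# Step 2: put a space between the letters of each word.
# A str is iterable, so ' '.join(word) joins its characters.
spaced_words = [" ".join(word) for word in long_words]

# Step 3: put each spaced-out word on its own line.
result = "\n".join(spaced_words)
print(result)
```

Once each intermediate list looks right, the three steps can be collapsed into the one-liner.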



Try the following:

'\n'.join(' '.join(i) for i in text.split() if len(i) >= 4)

      

First, keep only the words that are four or more letters long.

Then join each of these words with a space. Since a str is iterable, this places a space between each letter.

Finally, join the results with \n and you're done!

>>> text = "some random words written"
>>> print('\n'.join(' '.join(i) for i in text.split() if len(i) >= 4))
s o m e
r a n d o m
w o r d s
w r i t t e n

      

Your solution doesn't work because the newlines are themselves characters in the string. join puts a space between every pair of adjacent characters, so it also puts one between each \n and the first letter of the next word.
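To make the stray spaces visible, it can help to print the repr of the joined string. This is a small sketch that reuses the question's regex; the names text, lines, buggy and fixed are made up for the illustration:

```python
import re

text = "some random words written"

# The question's approach: join the words with newlines first...
lines = "\n".join(re.findall(r"\w{4,}", text))

# ...then join every character, newlines included, with a space.
buggy = " ".join(lines)
print(repr(buggy))

# Joining the letters of each word first keeps the newlines clean.
fixed = "\n".join(" ".join(word) for word in lines.split("\n"))
print(repr(fixed))
```

In the first repr, every \n has a space on both sides; the one after it is what shows up at the start of the second and later lines.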



You can do it with one generator expression and no regex:

strg = "some random words written"
print('\n'.join(' '.join(word) for word in strg.split() if len(word) > 3))

      

I started out just like another answer here; mine is very similar, but since my solution is a little shorter, I posted it anyway.

Also, input is a built-in function (and string is the name of a standard-library module); avoid these variable names.
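A minimal sketch of what goes wrong when a built-in name is reused: once input is bound to a string, the built-in function is no longer reachable under that name in that scope. The function name shadowing_demo is hypothetical, used only for this illustration:

```python
def shadowing_demo():
    # Rebinding the name hides the built-in input() inside this function.
    input = "some random words written"
    try:
        input("Enter some text: ")  # tries to call a str, not the built-in
    except TypeError as exc:
        return str(exc)
    return None


print(shadowing_demo())
```

Using a neutral name such as text or strg avoids the problem entirely.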



You can split the string into a list instead of using a regex, i.e.

print("\n".join(' '.join(x) for x in input.split() if len(x) > 3 ))

      

If you really need to use regex (after import re):

print("\n".join(' '.join(x) for x in re.findall(r'\w{4,}', input)))

      


Output:

s o m e
r a n d o m
w o r d s
w r i t t e n
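The two variants give the same result for the sample input, but they are not equivalent in general: split() only breaks on whitespace, while \w{4,} only matches runs of word characters. A small sketch with punctuation in the input (the sample string and variable names are made up for this comparison):

```python
import re

text = "some words, well-known ones"

# split() keeps punctuation attached, and it counts towards the length.
by_split = [word for word in text.split() if len(word) > 3]

# \w{4,} drops punctuation and splits hyphenated words apart.
by_regex = re.findall(r"\w{4,}", text)

print(by_split)
print(by_regex)
```

Which behaviour is wanted depends on the real input, so this may be a reason to keep the regex after all.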

      
