Downloading files based on an input list in Python

Question

I am trying to download files based on a given list. My script mostly works, but it only downloads the first item and does not iterate over the rest of the list. I changed the code a bit, and now it treats all items in the list as a single item and fails with the error "No such file or directory: data\item1\nitem2\nitem3\nitem4\nitem5.txt". Here is the part of my code that parses the input list.

def get_data(name):
    file_name = os.path.join("data", name + ".txt")
    if not os.path.exists(file_name):
        sys.exit(-1)

    inF = open(file_name, "r") 
    lines = inF.readlines()
    data = ''.join(lines)

    return data

EDIT:

def download_final_data_for_data(data):
    url = "http://www.example.com/"+ data
    url_file = urlopen(url)
    soup = BeautifulSoup(url_file)
    soup1 = str(soup)
    pattern=re.compile(r'''>final_data(.*?)</a>''')  
    data = pattern.findall(soup1)
    final_data_number = ''.join(data)
    return final_data_number

def get_data(name):
    data_list = []
    file_name = os.path.join("data", name + ".txt")
    if not os.path.exists(file_name):
        sys.exit(-1)

    inF = open(file_name, "r") 
    lines = inF.readlines()
    for line in lines:
        data = line.strip()
        if len(data) > 1:
            data_list.append(data)
        else:
            data_list.append(sys.argv[1])
    return data_list

+3

python list file parsing download

abn 08 Sep 14 at 18:16

5 answers

It's hard to tell without seeing more context and a concrete example of the input, but it looks like name contains something like item1\nitem2\nitem3\nitem4\nitem5. Did you print it out to check?

I notice that you join lines together into a single string, data. If you did something like this with the parameter name, I would expect to see exactly what you describe.

I guess what you probably want to do is something like:

for fn in name:
    get_data(fn.strip())  # strip off possible trailing \n

but without joining name first. If name is already a single string, as you described, you need to do something like this:

name = name.split('\n')
for fn in name:
    get_data(fn)
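
To illustrate the diagnosis above, here is a minimal sketch of how a newline-joined string ends up inside one path, and how splitting it first yields one path per item. The item names come from the error message in the question; build_paths is a hypothetical helper and not part of the asker's script.

```python
import os

def build_paths(names):
    """Return one data/<name>.txt path per non-empty name (hypothetical helper)."""
    return [os.path.join("data", n.strip() + ".txt") for n in names if n.strip()]

joined = "item1\nitem2\nitem3"

# Passing the joined string directly produces a single bogus path
# that still contains the newlines.
bad_path = os.path.join("data", joined + ".txt")

# Splitting first gives one path per item.
good_paths = build_paths(joined.splitlines())
```

Printing repr(bad_path) next to good_paths makes the difference visible, which is also a quick way to check what name really contains.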

+2

OldGeeksGuide 08 Sep 14 at 18:31

for name in namelist.split('\\'):
    data = get_data(name)

+2

gkusner 08 Sep 14 at 18:31

I am assuming you are passing a string to get_data(), since otherwise you would get a concatenation error. If so, because the filename in your error includes line breaks and you join everything at the end of the method, I assume you are joining the contents read from the file into one string before passing it in. However, I cannot be sure without seeing what "name" is.

If this is what you are doing, I would suggest using file.readlines() and passing each line to get_data. It would look like this:

for name in file.readlines():
    data = get_data(name.strip())  # strip the trailing newline kept by readlines()

Otherwise, if for some reason you need to read it all into one string, you can try:

names = name.split('\n')
for name in names:
    data = get_data(name)

+2

cacarpenter89 08 Sep 14 at 18:36

Since your statements are not very clear, I will try to show the skeleton of my way of solving such a problem.

You can use argparse to tell the program to use specific files as link lists.

Argparse provides a CLI in the script below, which you can call as follows.

python ./script.py -i list.txt -o ./

This would download everything into the current directory (note that the download itself is not implemented). You can also pass a bunch of list files at once via python ./script.py -i lists/* -o ./

import argparse   

def parseList(file):   # Parse the file, remove newlines/empty lines

    with open(file, 'r') as f: 
        lines = [line.strip() for line in f if line.strip()]
    return lines


def downloadLinks(links, output): # DOWNLOAD ALL THE LINKS!
    for link in links:
        print("Download me: %s" % link)

if __name__ == '__main__':

    ap = argparse.ArgumentParser('File Downloader')

    ap.add_argument('-i','--input',nargs='+', required=True, help='Path to the download list')
    ap.add_argument('-o','--output',required=True, help='Path to the output directory')

    args = vars(ap.parse_args())



    for file in args['input']:  # loop over all input files and process them
        parsedList = parseList(file)
        downloadLinks(parsedList, args['output'])
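
The skeleton above leaves the actual download unimplemented. One possible way to fill it in is sketched below, assuming Python 3 and the standard library's urllib.request.urlretrieve. The names target_path and download_links, the "index.html" fallback filename, and the injectable fetch parameter are all assumptions made for this example, not part of the answer.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve

def target_path(link, output_dir):
    """Derive a local filename from the last path segment of the URL."""
    # Fall back to a fixed name when the URL path ends in "/" (assumption).
    name = os.path.basename(urlparse(link).path) or "index.html"
    return os.path.join(output_dir, name)

def download_links(links, output_dir, fetch=urlretrieve):
    """Fetch every link into output_dir and return the local paths written."""
    os.makedirs(output_dir, exist_ok=True)
    saved = []
    for link in links:
        path = target_path(link, output_dir)
        fetch(link, path)  # urlretrieve(url, filename) saves the response body
        saved.append(path)
    return saved
```

Passing fetch as a parameter keeps the loop testable without network access. Note that two links ending in the same filename would overwrite each other in this sketch.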

+2

xoryouyou 17 Sep 14 at 12:29

user1941126 · Accepted Answer · 2014-09-16T22:01:27+0000

I can see where the problem is. The problematic part:

file_name = os.path.join("data", name + ".txt")

To get the correct filename, you have to iterate over the names somehow. To get a list of your names from what you have read (as the code stands right now), do

namelist = name.split("\n")  # this gives you a list that you can work on
                             # alternatively, read the file line by line (which you don't at the moment)

What your code does is concatenate "data", all the names you read (newlines included), and the ".txt" suffix into a single path. Anyway, just do

for name in namelist:
    #do stuff with name
    file_name = os.path.join("data",name+".txt")
    ....
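
Putting the pieces of this answer together, a minimal sketch could look like the following. It assumes that name holds the newline-separated items and that each item has a matching file in the data directory. The function name get_all_data, the data_dir parameter, and the choice to skip missing files instead of calling sys.exit are assumptions made for illustration.

```python
import os

def get_all_data(name, data_dir="data"):
    """Map each item in the newline-separated string to its file contents."""
    result = {}
    for item in name.split("\n"):
        item = item.strip()
        if not item:
            continue  # ignore blank lines
        file_name = os.path.join(data_dir, item + ".txt")
        if not os.path.exists(file_name):
            continue  # assumption: skip missing files rather than exit
        with open(file_name, "r") as f:
            result[item] = f.read()
    return result
```

Each value in the returned dictionary can then be passed on to the download step one item at a time.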
