Why are my pictures corrupted after downloading and writing them in Python?
Foreword
This is my first post on Stack Overflow, so I apologize if I got something wrong. I searched the internet and Stack Overflow extensively for a solution to my problem, but I couldn't find anything.
Situation
I'm building a digital photo frame with my Raspberry Pi that also automatically downloads photos from my wife's Facebook page. Luckily, I found someone who was working on something similar:
https://github.com/samuelclay/Raspberry-Pi-Photo-Frame
A month ago this gentleman added the download_facebook.py script. This is what I needed! So, a few days ago, I started working on this script to get it running in my Windows environment first (before moving it to the Pi). Unfortunately, there is no documentation specific to this script, and I lack experience with Python.
Based on the line "from urllib import urlopen", I assume this script was written for Python 2.x, because in Python 3.x the equivalent import is "from urllib import request".
So I installed Python 2.7.9 interpreter and had fewer problems than when I tried to work with Python 3.4.3 interpreter.
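For reference, the import difference described above can be bridged with a fallback import. This is only a minimal sketch and not part of the original script: it tries the Python 3 location first and falls back to the Python 2 one.

```python
# Sketch only: make urlopen available under both Python 2 and Python 3.
try:
    # Python 3: urlopen lives in the urllib.request submodule.
    from urllib.request import urlopen
except ImportError:
    # Python 2: urlopen is exposed directly by the urllib module.
    from urllib import urlopen

print(callable(urlopen))
```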
Problem
I got the script to download pictures from a Facebook account; however, the downloaded photos are corrupted.
Here are photos of the problem: http://imgur.com/a/3u7cG
I originally used Python 3.4.3 and had problems with my urlrequest(url) function (see the code at the bottom of the post) and how it handled the image data. I tried decoding the data with different encodings such as UTF-8 and UTF-16; according to the content headers, the encoding is UTF-8 (I think).
Conclusion
I'm not entirely sure whether the problem is with downloading the image or with writing the image to a file.
If anyone can help me with this, I would be very grateful! Also, let me know what I can do to improve my posts in the future.
Thank you in advance.
Code
from urllib import urlopen
from json import loads
from sys import argv
import dateutil.parser as dateparser
import logging

# plugin your username and access_token (Token can be get and
# modified in the Explorer Get Access Token button):
# https://graph.facebook.com/USER_NAME/photos?type=uploaded&fields=source&access_token=ACCESS_TOKEN_HERE
FACEBOOK_USER_ID = "**USER ID REMOVED"
FACEBOOK_ACCESS_TOKEN = "** TOKEN REMOVED - GET YOUR OWN **"


def get_logger(label='lvm_cli', level='INFO'):
    """
    Return a generic logger.
    """
    format = '%(asctime)s - %(levelname)s - %(message)s'
    logging.basicConfig(format=format)
    logger = logging.getLogger(label)
    logger.setLevel(getattr(logging, level))
    return logger


def urlrequest(url):
    """
    Make a url request
    """
    req = urlopen(url)
    data = req.read()
    return data


def get_json(url):
    """
    Make a url request and return as a JSON object
    """
    res = urlrequest(url)
    data = loads(res)
    return data


def get_next(data):
    """
    Get next element from facebook JSON response,
    or return None if no next present.
    """
    try:
        return data['paging']['next']
    except KeyError:
        return None


def get_images(data):
    """
    Get all images from facebook JSON response,
    or return None if no data present.
    """
    try:
        return data['data']
    except KeyError:
        return []


def get_all_images(url):
    """
    Get all images using recursion.
    """
    data = get_json(url)
    images = get_images(data)
    next = get_next(data)
    if not next:
        return images
    else:
        return images + get_all_images(next)


def get_url(userid, access_token):
    """
    Generates a useable facebook graph API url
    """
    root = 'https://graph.facebook.com/'
    endpoint = '%s/photos?type=uploaded&fields=source,updated_time&access_token=%s' % \
        (userid, access_token)
    return '%s%s' % (root, endpoint)


def download_file(url, filename):
    """
    Write image to a file.
    """
    data = urlrequest(url)
    path = 'C:/photos/%s' % filename
    f = open(path, 'w')
    f.write(data)
    f.close()


def create_time_stamp(timestring):
    """
    Creates a pretty string from time
    """
    date = dateparser.parse(timestring)
    return date.strftime('%Y-%m-%d-%H-%M-%S')


def download(userid, access_token):
    """
    Download all images to current directory.
    """
    logger = get_logger()
    url = get_url(userid, access_token)
    logger.info('Requesting image direct link, please wait..')
    images = get_all_images(url)
    for image in images:
        logger.info('Downloading %s' % image['source'])
        filename = '%s.jpg' % create_time_stamp(image['created_time'])
        download_file(image['source'], filename)


if __name__ == '__main__':
    download(FACEBOOK_USER_ID, FACEBOOK_ACCESS_TOKEN)
To answer the question of why @Alastair's solution from the comments worked:
f = open(path, 'wb')
From https://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files :
On Windows, 'b' appended to the mode opens the file in binary mode, so there are also modes like 'rb', 'wb', and 'r+b'. Python on Windows makes a distinction between text and binary files; the end-of-line characters in text files are automatically altered slightly when data is read or written. This behind-the-scenes modification to file data is fine for ASCII text files, but it'll corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files. On Unix, it doesn't hurt to append a 'b' to the mode, so you can use it platform-independently for all binary files.
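To make the corruption described in that passage concrete, here is a small simulation that runs on any platform. It is a sketch rather than the actual C runtime behaviour: the helper simulate_windows_text_write and the byte string fake_jpeg are made up for illustration, and the helper simply mimics the newline translation that text mode applies on Windows when writing.

```python
def simulate_windows_text_write(data):
    """Mimic text-mode writing on Windows: every LF byte becomes CR LF."""
    return data.replace(b"\n", b"\r\n")


# Hypothetical stand-in for image data. Real JPEG files contain
# 0x0A bytes purely by chance, scattered through the compressed data.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00\x0a\x10\x0a" + b"\xff\xd9"

corrupted = simulate_windows_text_write(fake_jpeg)

# Each 0x0A byte gained a 0x0D in front of it, so the data is longer
# and every byte after the first insertion is shifted.
print(len(fake_jpeg), len(corrupted))
print(corrupted == fake_jpeg)
```

Every extra byte shifts the rest of the image data, which is consistent with photos that start out fine and then turn into garbage partway down.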
(I was on a Mac, which explains why the problem was not reproduced for me.)
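Applied to the script, the fix amounts to opening the output file in binary mode before writing the downloaded bytes. The following is a minimal sketch of that idea; the helper name write_binary, the sample payload, and the temporary path are assumptions for illustration, and in the original script the same change would go inside download_file.

```python
import os
import tempfile


def write_binary(data, path):
    """Write raw bytes to path without any newline translation."""
    # 'wb' disables text-mode processing; the with block closes the
    # file even if the write raises an exception.
    with open(path, "wb") as f:
        f.write(data)


# Made-up payload containing CR and LF bytes, as image data might.
payload = b"\xff\xd8\r\n\n\x00\xff\xd9"
path = os.path.join(tempfile.mkdtemp(), "example.jpg")
write_binary(payload, path)

# Reading back in binary mode should return exactly what was written.
with open(path, "rb") as f:
    round_trip = f.read()
print(round_trip == payload)
```

Using 'wb' is harmless on Unix and macOS, so the same code can run unchanged on the Raspberry Pi.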