How do I use variables and OR conjunctions in a SQL statement in Python?
I have a pandas object named res that holds one list of IDs per row.
I want to use each row's list as the WHERE condition of a SQL query and store the results in an array:
        ids
grupos
0       [160, 161, 365, 386, 471]
1       [296, 306]
This is what I was trying to insert into the SQL query:
listado = [None]*len(res)
# We store the hashtags that best describe the groups
# We iterate over the people of a group to construct the WHERE condition
print "res : ", res
for i in (0,len(res)):
    conn = psycopg2.connect(**params)
    cur = conn.cursor()
    listado = [None]*len(res)
    for i in (0,len(res)):
        print "res[i:p] : ", res.iloc[i]['ids']
        cur.execute("""SELECT COUNT(swipe.eclipse_id), subscriber_hashtag.hashtag_id FROM subscriber_hashtag
        -- join so that the ads/eclipses a user likes are linked to those in the hashtag mapping table
        INNER JOIN eclipse_hashtag ON eclipse_hashtag.hashtag_id = subscriber_hashtag.hashtag_id
        -- join so that the users are linked to those in the hashtag mapping table
        LEFT OUTER JOIN swipe ON subscriber_hashtag.subscriber_id = swipe.subscriber_id
        -- retrieve the "likes"
        WHERE subscriber_hastag.subscriber_id in (%s)
        GROUP BY subscriber_hashtag.hashtag_id
        ORDER BY COUNT(swipe.eclipse_id) DESC;""",(res.iloc[i]['ids']))
        n = cur.fetchall()
        listado[i] = [{"count": elem[0], "eclipse_id": elem[1]} for elem in n]
Data for a reproducible example
Providing additional information:
subscriber_id hashtag_id
160 345
160 347
161 345
160 334
161 347
306 325
296 362
306 324
296 326
161 322
160 322
The expected output is:
{0:[324,1],[325,1],[326,1],[362,1], 1 : [345,2],[347,2],[334,1]}
Current error message
ERROR: An unexpected error occurred while entering tokenization The following trace may be corrupted or invalid Error message: ("EOF on multi-line string", (1, 50))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-44-f7c3c5b81303> in <module>()
39 WHERE subscriber_hastag.subscriber_id in (%s)
40 GROUP BY subscriber_hashtag.hashtag_id
---> 41 ORDER BY COUNT(swipe.eclipse_id) DESC;""",(res.iloc[i]['ids']))
42
43 n = cur.fetchall()
TypeError: not all arguments converted during string formatting
Have a look at tuple adaptation in psycopg2:
Python tuples are converted to a syntax suitable for the SQL IN operator and for representing a composite type.
Pass the ids as a tuple query argument, so that your argument to execute() is a 1-tuple containing the tuple of ids, and drop the manual parentheses around %s. At the moment your (res.iloc[i]['ids']) is nothing more than a sequence expression in redundant parentheses, so execute() uses it as the argument sequence, which raises the TypeError: your argument sequence has more arguments than the query has placeholders.
Try (tuple(res.iloc[i]['ids']),) instead. Note the comma; omitting it is a very common mistake. All in all:
cur.execute("""SELECT COUNT(swipe.eclipse_id),
                      subscriber_hashtag.hashtag_id
               FROM subscriber_hashtag
               INNER JOIN eclipse_hashtag ON eclipse_hashtag.hashtag_id = subscriber_hashtag.hashtag_id
               LEFT OUTER JOIN swipe ON subscriber_hashtag.subscriber_id = swipe.subscriber_id
               WHERE subscriber_hashtag.subscriber_id in %s
               GROUP BY subscriber_hashtag.hashtag_id
               ORDER BY COUNT(swipe.eclipse_id) DESC;""",
            (tuple(res.iloc[i]['ids']),))
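The difference between (x) and (x,) can be checked without a database. The sketch below uses plain Python % formatting as a stand-in for what execute() does with its arguments. This is only an analogy, since psycopg2 adapts and quotes the values itself, and the sample ids are the first row from the question.

```python
ids = [160, 161, 365, 386, 471]

# Parentheses alone do not create a tuple: this is still the very same list.
not_a_tuple = (ids)
print(type(not_a_tuple).__name__)

# The trailing comma makes a 1-tuple whose only element is the tuple of ids.
params = (tuple(ids),)
print(len(params))

# Analogy with plain string formatting: five values for one placeholder
# fail with the same kind of TypeError as in the traceback.
try:
    "IN (%s)" % tuple(ids)
except TypeError as exc:
    message = str(exc)
print(message)

# One value (itself a tuple) for one placeholder formats without error.
rendered = "IN %s" % params
print(rendered)
```

The last line also hints at why the manual parentheses around %s should go: the tuple already brings its own.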
Your for-loop is a little odd, since you are iterating over the 2-tuple (0, len(res)), which visits only the indices 0 and len(res). Perhaps you meant range(len(res)). You can also just iterate over the pandas Series:
for i, ids in enumerate(res['ids']):
    ...
    cur.execute(..., (tuple(ids),))
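Putting the pieces together, the whole loop might look roughly like the sketch below. It assumes Python 3 and an already open psycopg2 cursor passed in as cur. The helper name hashtag_counts_per_group is hypothetical, and the result key is named hashtag_id (rather than eclipse_id as in the question) because that is the column the query selects.

```python
QUERY = """
    SELECT COUNT(swipe.eclipse_id), subscriber_hashtag.hashtag_id
    FROM subscriber_hashtag
    INNER JOIN eclipse_hashtag
        ON eclipse_hashtag.hashtag_id = subscriber_hashtag.hashtag_id
    LEFT OUTER JOIN swipe
        ON subscriber_hashtag.subscriber_id = swipe.subscriber_id
    WHERE subscriber_hashtag.subscriber_id IN %s
    GROUP BY subscriber_hashtag.hashtag_id
    ORDER BY COUNT(swipe.eclipse_id) DESC;
"""


def hashtag_counts_per_group(cur, groups):
    """Run QUERY once per group of ids and collect the rows (hypothetical helper).

    cur    -- a DB-API cursor, assumed here to be a psycopg2 cursor
    groups -- an iterable of id lists, e.g. res['ids'] from the question
    """
    listado = []
    for ids in groups:
        if len(ids) == 0:
            # An empty tuple would render as "IN ()", which is a SQL syntax error.
            listado.append([])
            continue
        # One placeholder, so exactly one parameter: a 1-tuple holding the ids.
        cur.execute(QUERY, (tuple(ids),))
        listado.append(
            [{"count": count, "hashtag_id": hashtag_id}
             for count, hashtag_id in cur.fetchall()]
        )
    return listado
```

With a real connection this would be called as listado = hashtag_counts_per_group(cur, res['ids']), opening the connection once before the call rather than inside the loop.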