Which Twitter API should I use to fetch large numbers of tweets for NLP research?
I would like to extract as many tweets as possible that contain a given keyword (usually a company name).
I'm using the Twitter Search API, but it's limited to "recent tweets". So for a relatively rare keyword, I can only get 500 tweets at most.
Twitter says you shouldn't use the search API for research. So which API should I use?
2 answers
To get lots of tweets containing specific keywords, use the Streaming API's statuses/filter endpoint.
First, create a file (e.g. "tracking.txt") containing the track terms, with keywords separated by commas. Terms can include hashtags. For example, I used the following to get tweets that contain a link and certain hashtags:
track=http #baby,http #family,http #children, ...
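If you prefer to generate the file from the shell, something like the following should work. This is only a sketch: the `keywords` variable is a name made up for illustration, and its value just repeats the three terms shown above (the rest of the list is elided).

```shell
# Hypothetical variable holding the comma-separated track terms.
keywords="http #baby,http #family,http #children"

# Write the form body (track=...) that curl will later send with -d @tracking.txt.
printf 'track=%s' "$keywords" > tracking.txt

# Show what was written.
cat tracking.txt
```

Within one term, space-separated words must all appear in the tweet; commas separate alternative terms.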
Then use curl to POST the file and redirect the resulting stream to a file. Be sure to substitute your own Twitter username and password.
curl -d @tracking.txt https://stream.twitter.com/1/statuses/filter.json -uAnyTwitterUser:Password > stream.json
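Once the stream has been running for a while, you can check how many tweets were captured. The sketch below assumes the stream writes one JSON object per line, possibly interleaved with blank keep-alive lines. It runs against a tiny made-up file (`sample_stream.json`, hypothetical content) instead of real output; point it at stream.json for actual data.

```shell
# Hypothetical two-tweet sample with a blank keep-alive line in between.
printf '%s\n' '{"text":"first http://example.com #baby"}' '' \
  '{"text":"second http://example.com #family"}' > sample_stream.json

# Count lines that start a JSON object, skipping blank keep-alive lines.
count=$(grep -c '^{' sample_stream.json)
echo "captured tweets: $count"
```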