Hi all,
I am new to Python and Multivac, and I am trying to download tweets from last year's elections.
I am having trouble downloading more than 10,000 tweets for a specific query, and I do not fully understand the pagination system.
Here is the request:
total = requests.get('https://api.iscpif.fr/v2/pvt/politic/france/twitter/search?q=2017LeDebat&output=id_str,ca,tx,usr.tmz&since=2017-05-03&until=2017-05-04&count=1&from=1&api_key=' + api_key).json()['results']['total']
from_arg = 1
print('number of tweets', total)
while from_arg < total / 100:
    print('Doing tweet {}'.format(from_arg))
    results = requests.get('https://api.iscpif.fr/v2/pvt/politic/france/twitter/search?q=2017LeDebat&output=id_str,ca,tx,usr.tmz&since=2017-05-03&until=2017-05-04&count=100&from=' + str(int(from_arg)) + '&api_key=' + api_key).json()['results']['hits']
    write_tweets(results, "tweetMacronLepen.json")
    from_arg += 1
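A side note on the loop bound, in case it matters: if `from` is a 1-based page index (this is only my assumption, I have not confirmed it), then `from_arg < total / 100` stops before the last page. For example, with 250 tweets it would request pages 1 and 2 only. Below is a minimal sketch of the page list I probably want instead; `page_numbers` is a hypothetical helper of mine, not part of the API.

```python
import math


def page_numbers(total, count=100, first_page=1):
    """Return every page index needed to cover `total` results.

    Assumption: the API's `from` parameter is a page index starting at
    `first_page`, and each page holds up to `count` results.
    """
    n_pages = math.ceil(total / count)  # round up so a partial last page is kept
    return list(range(first_page, first_page + n_pages))
```

With this, the loop would become `for from_arg in page_numbers(total):`, so the final, partially filled page is requested too.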
Is it possible to download more than 10k tweets in one shot? Usually, after I download 100 pages, I receive this error:
Doing tweet 99
Doing tweet 100
Doing tweet 101
Traceback (most recent call last):
  File "tweet_v5page.py", line 17, in <module>
    results = requests.get('https://api.iscpif.fr/v2/pvt/politic/france/twitter/search?q=@MLP_Officiel&output=id_str,ca,tx,usr.tmz&since=2017-05-03&until=2017-05-04&count=100&from=' + str(int(from_arg)) + '&api_key=' + api_key).json()['results']['hits']
KeyError: 'results'
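The `KeyError` only tells me that the JSON has no `results` key, not what the API actually answered. Below is a minimal sketch of how I could surface the raw response instead of indexing it blindly. `extract_hits` is a hypothetical helper, and the expected response shape is an assumption based on the successful requests above.

```python
def extract_hits(payload):
    """Return the list of hits, or raise an error that shows the raw payload.

    Assumption: a successful response looks like
    {"results": {"total": ..., "hits": [...]}}, as in the requests above.
    Anything else is treated as an error response from the API.
    """
    results = payload.get("results") if isinstance(payload, dict) else None
    if not isinstance(results, dict) or "hits" not in results:
        # Include the payload so the API's own error message becomes visible.
        raise RuntimeError("Unexpected API response: {!r}".format(payload))
    return results["hits"]
```

In the loop, this would replace the direct indexing, for example `results = extract_hits(requests.get(url).json())`, where `url` is the same request string as before.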
What do I need to change in my code to download the next 100 pages? When I change "from=" to 1, 2, 3, it sometimes works and the next 100 pages of tweets are downloaded, but sometimes it does not and the exact same tweets are downloaded again.
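To check whether I really get the same tweets twice, I could filter each page against the IDs already written. This is a rough sketch, assuming each hit is a dict with a top-level `id_str` field (I requested it in `output=`, but the exact structure of a hit is my assumption); `dedupe_by_id` is a hypothetical helper.

```python
def dedupe_by_id(tweets, seen_ids):
    """Return only the tweets whose id_str is not yet in seen_ids.

    `seen_ids` is a set that is updated in place, so it can be shared
    across pages. Assumption: every hit has a top-level "id_str" key.
    """
    fresh = []
    for tweet in tweets:
        tweet_id = tweet["id_str"]
        if tweet_id not in seen_ids:
            seen_ids.add(tweet_id)
            fresh.append(tweet)
    return fresh
```

Used as `fresh = dedupe_by_id(results, seen_ids)` with `seen_ids = set()` created once before the loop, an empty `fresh` list for a new page would confirm that the page only repeated earlier tweets.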
Thank you for your help!
Edgar