Hashtag info scraper

    Authored by Jan Maria Kopankiewicz

    Run it, then force-kill it once enough posts have been collected; the script does not limit how many posts it fetches.
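
A force kill is needed because the loop over the hashtag's posts is not capped. One alternative, sketched here with a stand-in generator rather than a real `HashtagLooter`, is to bound the loop with `itertools.islice`. `fake_medias` and `MAX_POSTS` are illustrative names and are not part of instalooter or the original script.

```python
from itertools import count, islice

MAX_POSTS = 5  # illustrative cap; pick whatever sample size you need


def fake_medias():
    """Stand-in for looter.medias(): an endless stream of post-like dicts."""
    for n in count(1):
        yield {"id": str(n)}


# islice stops pulling from the generator after MAX_POSTS items,
# so the loop ends on its own instead of needing a force kill.
collected = [media["id"] for media in islice(fake_medias(), MAX_POSTS)]
print(collected)
```

Assuming `looter.medias()` behaves like any other iterable, the same cap would presumably read `for media in islice(looter.medias(), MAX_POSTS):` in the script.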

    dependencies:

    • emoji
    • instalooter

    sample output:

    coronavirus.txt
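
The output file holds one `; `-separated line per post, in the order given by the header the script writes. A minimal sketch of reading such a line back is shown below; `parse_line` and `FIELDS` are hypothetical names, and the sample line is made up to match the format rather than taken from real output.

```python
FIELDS = [
    "id",
    "edge_media_to_comment",
    "comments_disabled",
    "taken_at_timestamp",
    "edge_media_to_caption",
]


def parse_line(line):
    """Split one output line into a dict keyed by the header names."""
    # maxsplit=4 keeps any '; ' inside the caption in the last field.
    values = line.rstrip("\n").split("; ", 4)
    return dict(zip(FIELDS, values))


# Made-up line in the script's format, not real scraped data.
sample = "123; 4; False; 1584000000; stay home; stay safe :face_with_medical_mask:\n"
row = parse_line(sample)
print(row["edge_media_to_caption"])
```

Every value comes back as a string (including `"None"` for missing fields), so numeric columns still need converting before use.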

    ig_hashtag.py 1.26 KiB
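    from emoji import demojize
    from instalooter.looters import HashtagLooter

    HASHTAG = 'coronavirus'
    OUTPUT_FILE = "{}.txt".format(HASHTAG)
    # Exceptions that mean "this field is missing or has an unexpected shape".
    # A bare `except:` would also swallow KeyboardInterrupt.
    MISSING = (KeyError, IndexError, TypeError, ValueError, AttributeError)


    def img_info(media):
        """Append one '; '-separated line describing a post to OUTPUT_FILE."""
        # Each field falls back to None so one odd post does not stop the run.
        try:
            _id = media['id']
        except MISSING:
            _id = None

        try:
            edge_media_to_comment = media['edge_media_to_comment']['count']
        except MISSING:
            edge_media_to_comment = None

        try:
            comments_disabled = media['comments_disabled']
        except MISSING:
            comments_disabled = None

        try:
            taken_at_timestamp = int(media['taken_at_timestamp'])
        except MISSING:
            taken_at_timestamp = None

        try:
            # Replace emoji with :names: and collapse whitespace so the
            # caption stays on a single line.
            caption = media['edge_media_to_caption']['edges'][0]['node']['text']
            edge_media_to_caption = ' '.join(demojize(caption).split())
        except MISSING:
            edge_media_to_caption = None

        nfo = "{}; {}; {}; {}; {}".format(
            _id, edge_media_to_comment, comments_disabled,
            taken_at_timestamp, edge_media_to_caption)

        # Open per post so everything written so far survives a force kill.
        with open(OUTPUT_FILE, "a+", encoding="utf-8") as f:
            f.write("{}\n".format(nfo))


    looter = HashtagLooter(HASHTAG)

    # Header line; the file is opened in append mode, so a rerun adds it again.
    with open(OUTPUT_FILE, "a+", encoding="utf-8") as f:
        f.write("id; edge_media_to_comment; comments_disabled; taken_at_timestamp; edge_media_to_caption\n")

    for media in looter.medias():
        img_info(media)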
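
The try/except blocks in `img_info` all follow the same pattern: walk a path into nested data and fall back to `None` if any step is missing. As a sketch, that pattern could be written once in a small helper. `dig` is a hypothetical name, not part of instalooter or the script above, and the dict is hand-written to resemble the fields the script reads.

```python
def dig(obj, *path):
    """Follow keys/indexes into nested dicts and lists; None if a step is missing."""
    try:
        for step in path:
            obj = obj[step]
    except (KeyError, IndexError, TypeError):
        return None
    return obj


# Hand-written dict shaped like the fields the script reads; not real data.
media = {
    "id": "123",
    "edge_media_to_comment": {"count": 4},
    "edge_media_to_caption": {"edges": []},  # a post without a caption
}

comment_count = dig(media, "edge_media_to_comment", "count")
caption = dig(media, "edge_media_to_caption", "edges", 0, "node", "text")
comments_disabled = dig(media, "comments_disabled")
print(comment_count, caption, comments_disabled)
```

Catching only lookup errors, rather than everything, means Ctrl+C still interrupts the run.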