InstaLoader – an OSINT Tool for Scraping Instagram Metadata

Introduction

There are a lot of Instagram OSINT tools out there. InstaLoader is one of my favorites. It’s achieved this status by providing a massive amount of data while maintaining its user-friendliness. InstaLoader does the following:

  • downloads public and private profiles, hashtags, user stories, feeds and saved media,
  • downloads comments, geotags and captions of each post,
  • automatically detects profile name changes and renames the target directory accordingly,
  • allows fine-grained customization of filters and where to store downloaded media.

InstaLoader lets you pull hashtags, user stories, feeds, captions, and saved media. This is your foundation when conducting an investigation. After analyzing the extracted profile information, you’ll have a solid picture of what that profile reveals through public information. Next, you can download the comments and geotags of each post. Analyzing the comments on each post gives you the next “link” in your analysis. If the user genuinely interacts with a commenter, you can “pivot” to that commenter’s profile and repeat the InstaLoader process to build out the network. This can let you discover profile 1’s activity outside their own page by searching the comments on the second profile’s posts. On that note, let’s discuss a problem unique to investigating on Instagram.
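To make the “pivot” idea concrete, here is a minimal sketch of ranking commenters as pivot candidates. It assumes you have already reduced the downloaded comments to a mapping of post to commenter usernames; that mapping, the function name rank_pivot_candidates, and the sample usernames are all hypothetical and not part of InstaLoader.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, Mapping


def rank_pivot_candidates(
    comments_by_post: Mapping[str, Iterable[str]],
    target: str,
    min_posts: int = 2,
) -> list[tuple[str, int]]:
    """Rank commenters by the number of distinct posts they commented on."""
    posts_per_commenter: Counter[str] = Counter()
    for commenters in comments_by_post.values():
        # Count each commenter once per post so one long thread does not dominate.
        for username in set(commenters):
            if username != target:
                posts_per_commenter[username] += 1
    ranked = [(u, n) for u, n in posts_per_commenter.items() if n >= min_posts]
    # Most frequent first; ties are broken alphabetically for stable output.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Hypothetical data: post identifier -> usernames that commented on it.
sample = {
    "post_a": ["alice", "bob", "alice"],
    "post_b": ["alice", "carol", "target_user"],
    "post_c": ["bob", "alice"],
}
print(rank_pivot_candidates(sample, target="target_user"))
```

A recurring commenter is only a lead, not proof of a relationship, so treat the ranking as a way to decide which profile to look at next.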

Instagram

Instagram poses a problem for the OSINT investigator that other social media platforms like Twitter do not: you can’t see a user’s activity beyond their profile page. To elaborate: you can see what a user posts, but you cannot search for other content they’ve interacted with. This presents a problem when identifying scams, fraudulent accounts, potential human traffickers, etc. Because of this, you need to collect as much data as you can in order to write effective reports and draw reasonable conclusions. Conducting link analysis on the comments underneath user-generated content is one method. Another is to analyze the hashtags the user follows, which are publicly available, in search of external user activity. Once you find a profile that user 1 has interacted with, you can search within that profile for more information.
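The search for external activity can be sketched as a simple lookup across the profiles you have pivoted to. This assumes the same kind of hand-built mapping as before, this time keyed by profile and then by post; the function name find_external_activity and all usernames are hypothetical.

```python
from __future__ import annotations

from typing import Iterable, Mapping


def find_external_activity(
    comments_by_profile: Mapping[str, Mapping[str, Iterable[str]]],
    user: str,
) -> dict[str, list[str]]:
    """Return, per pivot profile, the posts on which `user` left a comment."""
    hits: dict[str, list[str]] = {}
    for profile, comments_by_post in comments_by_profile.items():
        if profile == user:
            continue  # the user's own posts are not external activity
        posts = sorted(
            post_id
            for post_id, commenters in comments_by_post.items()
            if user in set(commenters)
        )
        if posts:
            hits[profile] = posts
    return hits


# Hypothetical data: profile -> post identifier -> commenter usernames.
collected = {
    "user_1": {"p1": ["user_2"]},
    "user_2": {"p2": ["user_1", "user_3"], "p3": ["user_3"]},
    "user_3": {"p4": ["user_1"], "p5": ["user_1", "user_2"]},
}
print(find_external_activity(collected, user="user_1"))
```

Each hit is a place where user 1 was active outside their own page, which is exactly the information Instagram does not let you search for directly.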

InstaLoader Setup

InstaLoader is one of the easiest tools I’ve ever used. Setup is only two steps.

$ pip3 install instaloader

$ instaloader profile [profile ...]

That’s it. Because it’s on PyPI, you can install it with pip and you’re ready to go. In the second command, replace profile with one or more usernames. Not much more to say here; the tool’s documentation covers the rest.
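InstaLoader can also be imported as a Python module, which is handy when you want the metadata without the media. The sketch below is a minimal example under a few assumptions: it relies on the library’s Instaloader and Profile.from_username entry points and on the shortcode, date_utc, caption and caption_hashtags attributes of a post, while the function names and the stand-in posts are my own. The summarizing step is kept separate so it can be tried offline.

```python
from __future__ import annotations

from datetime import datetime
from types import SimpleNamespace
from typing import Any, Iterable


def summarize_posts(posts: Iterable[Any], limit: int = 10) -> list[dict[str, Any]]:
    """Reduce post objects to plain dicts that are easy to store or diff."""
    records = []
    for index, post in enumerate(posts):
        if index >= limit:
            break  # stop early; iterating a whole profile can be slow
        records.append({
            "shortcode": post.shortcode,
            "date_utc": post.date_utc.isoformat(),
            "caption": post.caption or "",
            "hashtags": sorted(post.caption_hashtags),
        })
    return records


def fetch_profile_summary(username: str, limit: int = 10) -> list[dict[str, Any]]:
    """Fetch recent posts of a public profile (needs network access)."""
    import instaloader  # imported lazily so summarize_posts works without it

    loader = instaloader.Instaloader()
    profile = instaloader.Profile.from_username(loader.context, username)
    return summarize_posts(profile.get_posts(), limit=limit)


# Offline demo with stand-in objects that mimic the attributes used above.
fake_posts = [
    SimpleNamespace(shortcode="abc123", date_utc=datetime(2020, 1, 2, 3, 4, 5),
                    caption="Hello #World", caption_hashtags=["world"]),
    SimpleNamespace(shortcode="def456", date_utc=datetime(2020, 1, 1),
                    caption=None, caption_hashtags=[]),
]
print(summarize_posts(fake_posts, limit=1))
```

Calling fetch_profile_summary with a real username performs live requests, and anonymous access may be rate limited or require a login.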

Use Cases

As a general disclaimer, InstaLoader shouldn’t be used for nefarious or creepy reasons. As with most tools, the good guys and the bad guys can both use it for their own ends. Don’t be a bad guy. On a more technical level, you can use InstaLoader to extract data, generate archives, do link analysis, and more. Let’s take a look at a few ways you can use InstaLoader for good.

Collecting evidence as proof of a scam. In a previous post I did on Nigerian Prince Scams, I used InstaLoader to prove that an account wasn’t who it claimed to be. Because the account is now deleted, let me give you the background. An account claiming to be the prince of a Nigerian Emir was convincing people to buy into a crypto scam, using photos of that prince to appear legitimate.

The first step was to identify which prince they were impersonating, then prove that the account was a fraud. We did this by showing that every photo of the prince on the account was already publicly available: we used InstaLoader to download the photos and their metadata, then used reverse image search and Google Dorks to find an article where the photos had been published. That was the first red flag. Next, we found multiple accounts using the same language as the account in question around what we alleged to be a crypto scam. Second red flag. After collecting more information on Instagram, we had a case.
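Spotting accounts that use the same language can be partly automated once the captions are downloaded. The sketch below is not the method used in that investigation; it is one simple option, Jaccard similarity over the words in each account’s captions, with a hypothetical threshold, hypothetical account names, and made-up captions.

```python
from __future__ import annotations

import re
from typing import Iterable, Mapping


def vocabulary(captions: Iterable[str]) -> set[str]:
    """Lowercased word set across all captions of one account."""
    words: set[str] = set()
    for caption in captions:
        words.update(re.findall(r"[a-z0-9']+", caption.lower()))
    return words


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared words divided by total distinct words (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_accounts(
    captions_by_account: Mapping[str, Iterable[str]],
    suspect: str,
    threshold: float = 0.5,  # assumed cutoff; tune it on your own data
) -> list[tuple[str, float]]:
    """Accounts whose caption vocabulary overlaps the suspect's, best first."""
    suspect_vocab = vocabulary(captions_by_account[suspect])
    scores = []
    for account, captions in captions_by_account.items():
        if account == suspect:
            continue
        score = jaccard(suspect_vocab, vocabulary(captions))
        if score >= threshold:
            scores.append((account, round(score, 2)))
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Made-up captions for illustration only.
captions = {
    "suspect": ["Invest in crypto today"],
    "account_a": ["invest in crypto TODAY!"],
    "account_b": ["Sunset at the beach"],
}
print(similar_accounts(captions, suspect="suspect"))
```

A high score only says two accounts share wording; it still takes a human to decide whether that points to a coordinated scam or a common phrase.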

That’s just one example of what you can do with the information extracted by InstaLoader. I’ll likely follow up on this post with a counter human trafficking example in the future.

OSINT Insight

InstaLoader may be one of the most robust Instagram OSINT tools. While I love all the features and the ability to customize, I do have issues with the formatting of the file output. This is something that could be tweaked, I’m sure, but it’s the main reason I used Instalooter for my face_recognition demo instead of InstaLoader. To be more specific, instead of dumping all of the photos into one folder, each photo gets its own folder with metadata attached. This is great for organization, but it adds time to my workflow.
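If the nested output gets in the way, one workaround is to flatten it after the download. The sketch below uses only the Python standard library and makes no assumption about InstaLoader’s exact layout: it copies every image it finds under a source folder into one destination folder. The function name, the suffix list, and the folder names in the usage note are my own choices.

```python
from __future__ import annotations

import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def flatten_images(source: Path, destination: Path) -> list[Path]:
    """Copy every image under `source` into the single folder `destination`."""
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    # sorted() materializes the listing before any file is copied.
    for path in sorted(source.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # leave metadata files and directories where they are
        if destination in path.parents:
            continue  # skip files already sitting in the destination
        # Prefix with the parent folder name to reduce name collisions.
        target = destination / f"{path.parent.name}_{path.name}"
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

For example, flatten_images(Path("downloads"), Path("all_photos")) leaves the original folders and metadata untouched and gives you one folder to point an image tool at. InstaLoader also has --dirname-pattern and --filename-pattern options for controlling the layout at download time; check the documentation for the placeholders they accept.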

Because Instagram is owned by Facebook, it is likely that Instagram will receive privacy-protecting measures similar to Facebook’s in the future. This puts the success of tools like InstaLoader in limbo, in my opinion. Additionally, regular users will likely shift towards private accounts, limiting what these tools can access. Keep these things in mind before building an infrastructure around web scraping tools like InstaLoader.
