Collecting OSINT from eCommerce Data on Amazon

There’s a wide variety of information that you can gather using open sources.  Email addresses, phone numbers, and social media accounts are all common sources of open source intelligence.  Applications of this data vary, but it is usually involved in some sort of investigation, where you try to correlate data and find a meaningful connection.  Something that I’ve always been interested in is eCommerce and the data behind it.  Most users of eCommerce data are marketers or businesses, but what sort of application does it have for the security industry?  It certainly has value for competitive intelligence, but could you use eCommerce data from websites like Amazon for trend analysis with a security application?

Here’s a quick guide on how to collect OSINT from eCommerce data.  What we’ll be looking at in this article is how to use free tools (mostly in the form of Chrome extensions) to find data about products and sellers on Amazon.  If you’ve got a grasp on basic economics and fundamental data science, this should be a breeze for you and very enlightening.  Maybe you’ll even give selling on Amazon a shot.  I certainly did.

0. Understanding Amazon’s Metrics

The primary metric we will be looking at is sales rank.  Sales rank is, essentially, a measure of how well a product sells on Amazon.  The lower the rank, the better it sells.  For example, let’s say one book is ranked 300 on Amazon and another is ranked 1000.  The book ranked 300 sells far more copies than the one ranked 1000.  A key thing to remember is that sales rank differs based on the product category.  The ranking is relative to the total number of products in a category.  Amazon has a lot more books than it has auto parts, so the same difference in rank is more significant in auto parts.  Here’s how knowing sales rank will help with OSINT.  Let’s say you’re investigating a controversial author on Amazon.  Knowing the sales rank of their book will allow you to calculate a rough estimate of their monthly revenue from books.  Let’s look at an example.

Here is a visualization of the data scraped from “The Anarchist Cookbook” by William Powell over the last 365 days.

[Screenshot: Keepa graph for “The Anarchist Cookbook”, last 365 days]

Don’t worry, there’s a lot going on here.  What I want you to focus on is the part of the graph depicted in green.  This is the sales rank.  Let me explain what’s going on here.  The lower the data sits on the Y axis, the lower the sales rank, and the closer the book is to rank #1, the better it is selling.  At the time of this writing, “The Anarchist Cookbook” is ranked roughly 30,000 in books.  This roughly equates to 290-310 sales per month, or about 10 per day.  Here’s the interesting part.  Look at the time between August and January.  The sales went way down.  Why is this?  Price stayed relatively flat during that time, so it is unlikely to explain the drop in demand.  Is 365 days too short a period to judge this observation?  Let’s look at the entire dataset.
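Before moving on, here is a minimal Python sketch of the arithmetic above.  The 290-310 monthly sales range comes from the estimate quoted in this article.  The price is a placeholder I made up for illustration, and the function names are my own.  Note that this gives gross sales, not what the author actually takes home.

```python
def per_day(monthly_sales, days_in_month=30):
    """Convert a monthly sales estimate into an average daily figure."""
    return monthly_sales / days_in_month


def gross_revenue(monthly_sales, unit_price):
    """Gross monthly revenue: units sold times the listing price."""
    return monthly_sales * unit_price


# Monthly sales range quoted above for a rank of roughly 30,000 in books.
MONTHLY_LOW, MONTHLY_HIGH = 290, 310

# Placeholder price for illustration only; substitute the real listing price.
ASSUMED_PRICE = 15.00

daily_low = per_day(MONTHLY_LOW)
daily_high = per_day(MONTHLY_HIGH)
revenue_low = gross_revenue(MONTHLY_LOW, ASSUMED_PRICE)
revenue_high = gross_revenue(MONTHLY_HIGH, ASSUMED_PRICE)

print(f"Daily sales: {daily_low:.1f} to {daily_high:.1f}")
print(f"Gross monthly revenue: ${revenue_low:,.2f} to ${revenue_high:,.2f}")
```

Swap in the real price and your own sales estimate for whatever product you are investigating.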

Here is a visualization of the data scraped from “The Anarchist Cookbook” since it was published on Amazon (2627 days ago).

[Screenshot: Keepa graph for “The Anarchist Cookbook”, full history]

Now we have all the information available, but the results are about the same.  We see that from July 2015 to January 2017, the demand for “The Anarchist Cookbook” was about the same.  Then, between January 2017 and January 2018, the demand went way down.  What happened during this time that would drop the demand?  If you think you have an idea, leave your comment below (you can’t say the first year of Donald Trump or Brexit; I don’t want to start a troll hole).  But wait, why has the demand suddenly gone back to normal in 2018?  It may be worth investigating.  Now that you’ve seen the data in action, I’ll show you how to get it set up.
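If you would rather not eyeball the graph, one way to compare two periods is to average the sales rank in each.  This is only a sketch: the (date, rank) pairs below are made up for illustration and are not the real data for this book, and `mean_rank` is a helper name of my own.  It assumes you have some way of getting dated rank samples out of your tool of choice.

```python
from datetime import date
from statistics import mean


def mean_rank(samples, start, end):
    """Average sales rank of the samples dated in [start, end)."""
    ranks = [rank for day, rank in samples if start <= day < end]
    if not ranks:
        raise ValueError("no samples in the requested period")
    return mean(ranks)


# Made-up (date, sales rank) samples, for illustration only.
samples = [
    (date(2016, 3, 1), 30000),
    (date(2016, 9, 1), 32000),
    (date(2017, 3, 1), 61000),
    (date(2017, 9, 1), 59000),
]

baseline = mean_rank(samples, date(2015, 7, 1), date(2017, 1, 1))
dip = mean_rank(samples, date(2017, 1, 1), date(2018, 1, 1))

# A higher average rank means weaker demand, so a ratio above 1 is a drop.
ratio = dip / baseline
print(f"Baseline rank: {baseline}, later rank: {dip}, ratio: {ratio:.2f}")
```

Remember that rank is not sales.  A ratio like this tells you the direction and rough size of a change, not how many copies were sold.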

1. Keepa Chrome Extension

The graphs you see above are created using the free Keepa Extension for Chrome.  This is an industry standard for Amazon sellers and does all the heavy lifting for you.  You can adjust your parameters with the options on the right of the graph and eliminate a lot of the noise.  I recommend making an account (it’s free) with Keepa.  I’ve never gotten any spam from them, which I appreciate.  There’s really no setup for this one.  Install the extension, then view your first product.  One thing I would recommend is to right-click the extension, select options, then enable “Display product’s stock quantity for some merchants on offer pages”.  This will let you know how many units each seller has in stock, a further metric to take into consideration.

2. OSINT Application

So what can you use this information for other than tracking sales data for a book?  Let me give you another example of how I used OSINT from Amazon data to find useful insights.  Remember the Charlottesville protest in August of 2017?  If not, just google “Charlottesville protest” and you’ll get spun up pretty quickly.  In a nutshell, a bunch of Antifa trolls went against a bunch of Alt-Right trolls, a lot of people got hurt, and one person died.  This was a pretty big deal in the media and put President Trump in the hot seat.  Politics aside, something interesting happened that’s relevant to this case study.  A bunch of Alt-Right trolls started buying tiki torches en masse to light during their marches.  Here’s what it looked like.

[Image: Charlottesville protesters carrying tiki torches]

The first thing I noticed here was the variety of different brands and styles of tiki torches in this photo.  This clearly wasn’t a wholesale deal and one person didn’t supply them all.  This is likely the result of a “hey go buy tiki torches” or something to that effect on an Alt-Right forum or chat before the march took place.  Check out that guy’s mustache.  Anyway, I wanted to see if there was a dramatic shift in demand for tiki torches between July and August of 2017 on Amazon when all of this was going on, so I looked for the best-selling tiki torch on Amazon at the time.  Here’s the product I found.  What I found is that the sales rank for this particular tiki torch was at its absolute lowest (that is, it was selling at its best) on August 21, 2017 (4 days after Charlottesville).  This is no doubt a stretch as correlations go, but it’s an interesting observation that could be further exploited with other OSINT.
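The check I did by eye, finding the day with the lowest rank and seeing how far it sits from an event, can be sketched in a few lines of Python.  The dates and ranks below are made up for illustration and are not the tiki torch data, and `best_rank_offset` is a helper name of my own.

```python
from datetime import date


def best_rank_offset(samples, event_day):
    """Return (day, rank, days_after_event) for the lowest rank in samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    best_day, best_rank = min(samples, key=lambda sample: sample[1])
    return best_day, best_rank, (best_day - event_day).days


# Made-up event date and (date, sales rank) samples, for illustration only.
event = date(2020, 1, 10)
samples = [
    (date(2020, 1, 5), 900),
    (date(2020, 1, 10), 650),
    (date(2020, 1, 13), 120),
    (date(2020, 1, 20), 700),
]

day, rank, offset = best_rank_offset(samples, event)
print(f"Best rank {rank} on {day}, {offset} days after the event")
```

A negative offset would mean the best rank came before the event, which is a quick way to rule out a connection.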

I hope you enjoyed this guide and introduction to collecting OSINT using eCommerce data, specifically from Amazon.  Please follow me on Twitter or subscribe to this blog for more articles like this or other useful tools.  I will start conducting in-depth case studies soon that will further explore techniques like the one found in this article and others.  Stay tuned!
