07 Scraping for Volume

January 29, 2022

It took almost two weeks, but I was able to scrape 215 reverse image links and collect 125,559 images. Some of these images are not of the woman holding the flag, and I'm sure plenty of them are duplicates. The next step is to gather those images together and remove the duplicates, so I can get a total count of how many variations of this flag I was able to collect.
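I haven't done the duplicate removal yet, so the following is only a rough sketch of how a first pass might look, not the method I ended up using. It assumes the scraped images sit in a local folder, and it only catches byte-for-byte identical files by hashing their contents with SHA-256. The same picture saved at a different size or compression level would not match and would need something like perceptual hashing instead. The function names here (file_digest, group_by_content, count_unique) are my own placeholders.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large images never load fully into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_content(image_dir: Path) -> dict[str, list[Path]]:
    """Map each content digest to every file under image_dir that has it."""
    groups: dict[str, list[Path]] = {}
    for path in sorted(image_dir.rglob("*")):
        if path.is_file():
            groups.setdefault(file_digest(path), []).append(path)
    return groups


def count_unique(image_dir: Path) -> tuple[int, int]:
    """Return (total files, number of distinct file contents)."""
    groups = group_by_content(image_dir)
    total = sum(len(paths) for paths in groups.values())
    return total, len(groups)
```

Calling count_unique(Path("images")) on a hypothetical images folder would return the total file count alongside the count of distinct files. Any digest in group_by_content that maps to more than one path is a set of exact duplicates.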

The goal is to get as many variations of the image as possible. I landed on a process that uses Google's reverse image search to yield pages of flag variations.

Example of Reverse Image Search Results

Process

While working on this project, I have built up a sizeable list of brands that use the image in question. This list was the starting point for collecting the images.

Starting from a product page, I found that I got the best results by using Google Lens to generate a reverse image link (as opposed to uploading the image, or pasting the image URL directly into Google's reverse image search). If the results offered an "All sizes" option, I opened it and copied that URL. I then added the URL to my spreadsheet of URLs to be scraped, which I worked through methodically, pulling the images from each one.

I did this for each flag variation on the product page, to try to get the most results possible.
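For anyone wanting to keep a similar list outside of a spreadsheet app, here is a minimal sketch of the bookkeeping side, assuming the list is exported as a CSV file. The column names (brand, url, status, image_count) and the helper functions are hypothetical, chosen for illustration, and are not the layout of my actual spreadsheet. The sketch only tracks which reverse image links are still waiting to be scraped. It does not do any scraping itself.

```python
import csv
from pathlib import Path

# Hypothetical column layout for the URL list.
FIELDS = ["brand", "url", "status", "image_count"]


def load_rows(path: Path) -> list[dict[str, str]]:
    """Read every row of the CSV into a list of dicts keyed by column name."""
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def save_rows(path: Path, rows: list[dict[str, str]]) -> None:
    """Write the rows back out with a header line."""
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def add_url(rows: list[dict[str, str]], brand: str, url: str) -> bool:
    """Append url as pending unless it is already listed; return True if added."""
    if any(row["url"] == url for row in rows):
        return False
    rows.append({"brand": brand, "url": url, "status": "pending", "image_count": ""})
    return True


def pending(rows: list[dict[str, str]]) -> list[dict[str, str]]:
    """Return the rows that have not been scraped yet."""
    return [row for row in rows if row["status"] != "done"]


def mark_done(rows: list[dict[str, str]], url: str, image_count: int) -> None:
    """Record that url was scraped and how many images it yielded."""
    for row in rows:
        if row["url"] == url:
            row["status"] = "done"
            row["image_count"] = str(image_count)
            return
    raise KeyError(url)
```

Refusing to add a URL twice keeps the same results page from being scraped more than once, and storing the image count per row makes it easy to total the images collected across all links afterwards.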

Results
