This script performs a reverse DNS lookup against a list of IP addresses. I use it to identify genuine Googlebot requests when carrying out log file analysis.
It takes an Excel file (.xlsx) called logs.xlsx with a sheet named ‘Sheet1’ and looks for IPs in a column called ip. It then performs a reverse lookup on the unique values and exports an Excel file called validated_logs.xlsx, which contains all of the data from logs.xlsx.
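For reference, here is a minimal sketch of how a script like this might be put together, assuming pandas and openpyxl are installed; the hostname and verified_googlebot output columns are illustrative additions rather than a guaranteed match for the script’s actual output:

```python
# Minimal sketch: reverse DNS validation of Googlebot IPs from an Excel log export.
# Assumes pandas + openpyxl; the 'hostname' and 'verified_googlebot' columns
# are illustrative, not necessarily what the original script writes.
import socket

import pandas as pd


def reverse_lookup(ip):
    """Return the PTR hostname for an IP, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


df = pd.read_excel("logs.xlsx", sheet_name="Sheet1")

# Resolve each unique IP only once, then map the results back onto every row.
hostnames = {ip: reverse_lookup(ip) for ip in df["ip"].unique()}
df["hostname"] = df["ip"].map(hostnames)

# Genuine Googlebot requests resolve to googlebot.com or google.com hostnames.
df["verified_googlebot"] = df["hostname"].fillna("").apply(
    lambda host: host.endswith((".googlebot.com", ".google.com"))
)

df.to_excel("validated_logs.xlsx", index=False)
```

A stricter check would also forward-resolve each returned hostname and confirm it maps back to the original IP, which is the two-step verification Google recommends for identifying genuine Googlebot traffic.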
- Last year I moved this blog from WordPress to Hugo, hosted on Netlify. As part of this move, I wanted to make the site as fast as possible and made a number of improvements, including adding support for WebP images. This can be achieved in Hugo by creating the following shortcode: {{ $image := .Params.src }} {{ $type_arr := split $image "." }} {{ $srcbase := index $type_arr 0 }} {{ $srcext := index $type_arr 1 }} {{ $.
- This page describes some ways to extract search engine hits from a website’s log files. Extracting Hits from Apache Log Files To extract just the Googlebot hits on the site using the GNU/Linux terminal, try this: grep 'Googlebot\/' access.log > googlebot_access.log That will write the Googlebot hits to a new logfile called googlebot_access.log. You can also pipe that output into another command, for example to extract only the URLs that Googlebot is requesting:
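  One way to do this, assuming the common Apache combined log format (where the request path is the seventh whitespace-separated field), is: grep 'Googlebot/' access.log | awk '{print $7}' | sort -u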
- Recently, I’ve been trying to better understand what constitutes ‘good outreach’ in an attempt to increase response rates and overall placements. Ignoring the quality of the actual content being outreached — which, I believe, will always be the number one factor — there appear to be three main components. These are: 1. Outreach targets: The quality of the contact list, e.g. relevance to the writer/site. 2. The Pitch: The language used in the initial email — subject line, copy, length of copy, link inclusion, attachment inclusion, etc.
- Using log files for SEO analysis is a great way to uncover issues that you may have otherwise missed. This is because, unlike third-party spiders, they allow you to see exactly how Googlebot is crawling a site. If you’re an SEO professional looking to carry out your own log file analysis, then the chances are you’ll have to request the files through your own, or your client’s, dev team.
- Unless you’ve been living under a rock for the past week, you’ve probably read or heard something about #TheDress. The debate went viral last Thursday, resulting in a massive influx of publicity for the retailer Roman Originals and an overnight 347% increase in sales for the garment in question. Yet, beyond an initial spike in product interest, what will the legacy of #TheDress be for the brand? Additionally, based on previous academic research, what can we learn from the story’s success?