How To Avoid a Duplicate Content Disaster When Blogging With ‘Ghost’

If you're like me and enjoy using Ghost as a blogging platform because it's lightweight, pretty, and makes your content easily accessible, then you want to make sure it is configured correctly. Otherwise you face a duplicate content disaster that could destroy your SEO efforts.

Ghost Blog. Image taken from August Kleimo.

Let's begin.

After spending a few hours debugging a rather silly issue with the Ghost blogging platform, I decided to write a guide for anyone who runs into the same problem down the line. We had suddenly ended up with duplicate sites indexed in Google, one on our URL and one accessible directly via the site's IP address. Hopefully this helps you fix it quickly and avoid any SEO penalties.

What sets this guide apart from others is that I demonstrate my debugging and thinking process, and the logical steps I take to figure out the causes of issues (a valuable skill to have when running any kind of online business).

Debugging

It started a few weeks ago with a little drop in search engine rankings. Posts that had previously ranked on the first page in the top three positions had now dropped to page 4, page 5 and so forth.

Duplicate Content Google Penalty. Thanks to Cognitive SEO for the image. Check them out!

After running a search for our site, specifically this blog, I noticed that Google had indexed two versions of it: one reached directly by the IP address, and one through our URL.

This was a bad sign.

So I opened Google Webmaster Tools and looked through the sitemap panel. There were several warnings and errors, mostly about sitemaps not registering correctly: the blog was generating sitemaps and linking posts with the IP address as the primary URL instead of blog.guessbox.io.

This led to Google de-indexing about 50% of our content straight off the bat. I figured it must have been a one-off error, so I manually fixed the URLs in the sitemaps and resubmitted them.

Yet nothing changed, because while Google was crawling those good URLs it was also finding the duplicate IP-address-based URLs pointing to the same content.
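A quick way to spot this kind of problem is to scan a sitemap (or a page's HTML source) for URLs whose host is a bare IP address. Here is a minimal sketch of such a check. The function name find_ip_urls and the file name sitemap.xml are placeholders of mine, and it only looks for IPv4 addresses, so treat it as a rough filter rather than a full audit.

```shell
# Print every http(s) URL in the input whose host is a bare IPv4 address.
# Reads the files given as arguments, or stdin if there are none.
find_ip_urls() {
  grep -Eoh 'https?://[0-9]{1,3}(\.[0-9]{1,3}){3}(:[0-9]+)?[^"<[:space:]]*' "$@" | sort -u
}

# Examples (sitemap.xml is a placeholder for a downloaded copy of your sitemap):
#   find_ip_urls sitemap.xml
#   curl -s https://blog.guessbox.io/ | find_ip_urls
```

If this prints nothing, no IP-based URLs were found in the input.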

Further penalties were imposed on our content. Sigh.

Troubleshooting The Next Steps

After digging around, I quickly realized that the error must lie in the config files, which I'll talk about more later in this article.

I amended the config.js file to include the correct URL structure and re-uploaded it to my server.

And...

Nothing changed.

Random links on the blog still had our IP address in front of them (e.g. the homepage link, the RSS feed link, etc.).

To save time, I thought: what if I just hard-code the correct URLs in the right spots in the default.hbs and post.hbs files? These are effectively template files for what appears on a post page and other sections of a typical blog. If you have ever used WordPress, you are probably familiar with the idea.

If you ever need to access your config.js file, here are the Linux commands I used.

Access config.js File via SSH for Ghost

ls -l (simply lists all files in the current directory)
cd (changes directory)
pwd (prints the working/current directory path)
cd .. (goes back to the parent folder)

Hard-coding the URLs fixed some of the links, but only temporarily, and it was a really messy workaround. So I carried on.

The Short Answer

I couldn't figure out what the problem was. Everything I had read on the subject said just edit the config file(s) and you'll be good to go.

But none of those guides mentioned that editing those files and re-uploading them to your server is useless if you don't restart your Ghost server afterward!

Use the following command to restart your Ghost server (not your Google Cloud instance, which must already be running). This is done via SSH (your hosting provider should be able to help you set up an SSH connection if you don't know how to do it).

$ sudo /opt/bitnami/ctlscript.sh restart

Then don't forget to remove or rename the bnconfig file found in the Ghost application folder (the path is in the command below). If you don't, and you accidentally restart your server instance (be it on Google Cloud, AWS, Azure, etc.), your blog's URL structure will revert to how it was in the beginning: incorrect.

Use this command to do so (it just renames the bnconfig file to bnconfig.back):

sudo mv /opt/bitnami/apps/ghost/bnconfig /opt/bitnami/apps/ghost/bnconfig.back
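If you would rather not clobber an earlier backup by accident, the rename can be wrapped in a small guard. This is only a sketch: the function name disable_bnconfig is made up for illustration, and it takes the folder as an argument instead of hard-coding the Bitnami path.

```shell
# Rename bnconfig to bnconfig.back inside the given folder, but only if
# bnconfig exists and no backup with that name is already there.
disable_bnconfig() {
  dir=$1
  if [ ! -e "$dir/bnconfig" ]; then
    echo "nothing to do: $dir/bnconfig not found"
    return 0
  fi
  if [ -e "$dir/bnconfig.back" ]; then
    echo "refusing to overwrite existing $dir/bnconfig.back" >&2
    return 1
  fi
  mv "$dir/bnconfig" "$dir/bnconfig.back"
}

# Example (needs the same privileges as the plain mv command, e.g. a root shell):
#   disable_bnconfig /opt/bitnami/apps/ghost
```

Running it a second time is harmless: once bnconfig has been renamed, the function reports that there is nothing to do.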

Development vs Production Environments

The Ghost blogging platform comes with both a production environment and a development environment. If you are installing it on your own server rather than using Google Cloud Launcher via Bitnami (like we did), it will be set to development by default, and you will need to manually change it to production once your blog is live.

How To Change Ghost Blog Environment. Read more on the official Ghost site.

But before you do this, make sure your config files are populated with the correct URLs. If you used Google Cloud Launcher, the config file is located at the following path:
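As far as I know, config.js-based versions of Ghost pick the environment from the NODE_ENV variable and fall back to development when it is not set. Assuming that holds for your install, the sketch below shows the same fallback logic in shell form, so you can see which section of the config file would apply. The function name ghost_env is hypothetical and is not part of Ghost.

```shell
# Report which config.js section would apply for the current shell,
# assuming the environment is taken from NODE_ENV and defaults to
# development when NODE_ENV is unset or empty.
ghost_env() {
  printf '%s\n' "${NODE_ENV:-development}"
}

# Examples:
#   ghost_env                       # uses whatever NODE_ENV is set to right now
#   NODE_ENV=production ghost_env   # reports the production environment
```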

/home/bitnami/apps/ghost/htdocs/config.js

You can use the following command to open it up directly for editing via SSH.

sudo pico /home/bitnami/apps/ghost/htdocs/config.js

Within that file, you will find settings for both the development and production environments, along with the attributes that go with them.

Where it says url:, remove the IP address of the server and put in your blog URL, like so:

config = {
    // ### Production
    // When running Ghost in the wild, use the production environment.
    // Configure your URL and mail settings here
    production: {
        url: 'https://blog.guessbox.io',
        mail: {
            transport: 'SMTP',
            options: {
                service: 'Mailgun',
                auth: {
                    user: 'your-username@yourdomain',
                    pass: 'yourpassword'
                }
            }
        },
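Before restarting, it can help to double-check that no url setting in the config file still points at an IP address. The following is a rough sketch, assuming each url value is written in the url: '...' form shown in the config excerpt above. The function name config_ip_urls is my own, and it only recognizes IPv4 addresses.

```shell
# Print, with line numbers, every line of the given config file that sets
# url: to an http(s) address whose host is a bare IPv4 address.
# Prints nothing (and returns non-zero) when no such line exists.
config_ip_urls() {
  grep -En "url: *['\"]https?://[0-9]{1,3}(\.[0-9]{1,3}){3}" "$1"
}

# Example:
#   config_ip_urls /home/bitnami/apps/ghost/htdocs/config.js
```

The line numbers make it easy to jump straight to the offending setting in your editor.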

You can also set the same URL for your development environment, unless of course you make frequent development changes to your blog, in which case you may want a separate URL to test things on.

Note: I know I have said that you must be connected to your server via SSH on the command line to make these changes. However, that is not strictly true. There are plenty of apps that make the process much simpler by offering a graphical user interface for you to work within.

Coda 2 is a personal favorite when it comes to working over SSH from a GUI app. I say this genuinely because, as you'll notice, there is no affiliate ID hidden in that link.

Lots of great content coming up this week, so make sure to subscribe so you never miss a tutorial, feature unveiling or new case study.