How Copying and Pasting from MS Word Can Wreak Havoc on Your Blog

Denamico April 7, 2015



Blogging takes time and focus. And sometimes, it includes a routine (get coffee, turn off email and social notices, close the door, etc.). But whether you use a routine or get straight down to business, chances are you have a preferred tool for composing blog posts.

Some people prefer composing directly in the blogging platform, and others like to draft the post in an editor like Microsoft Word or Google Docs.

But if you’re copying and pasting blog posts from Word to your blogging platform, you could be wreaking havoc on your blog! Here are the details on how you could be compromising your site and how to fix it...

What Copying and Pasting from MS Word Does to Your Blog

It seems like an innocent enough action: copy from Word and paste into your blog. But if you’ve ever taken a look at the HTML of that freshly pasted content, you’ll notice a lot of excessive, non-productive code (aka “junk code” or “bloated code”).

Take a look at this example…

Here’s a screenshot of a few paragraphs typed into MS Word. As you can see, the formatting is pretty simple. There’s a header, some basic paragraph text, and a secondary subheader to accentuate text.


Note what happens to the HTML when the above paragraphs are copied and pasted into a WYSIWYG blogging editor.


Put simply, it's a mess. And the above code includes only the header and first two paragraphs!

By contrast, this is what the HTML view looks like when the unnecessary bloat code is removed. Note that the screenshot below includes the entire excerpt:


Why You Should Care: SEO Impacts

It’s pretty easy to see that the second HTML screenshot is a lot easier to read. The interesting thing here is… the second version isn’t just easier for people to read, but it’s a lot easier for search engines to read too!

So, if you want to do everything you can to improve blog post SEO, you’ll want to remove the excess bloat code too. (Or be careful not to add it in the first place! More on that below.)
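To put a rough number on the gap between the two versions, here is a minimal Python sketch. The markup in it is hypothetical: it is typical of what Word-pasted HTML looks like, not the exact code from the screenshots above, and the helper name markup_overhead is made up for this illustration.

```python
# Illustrative only: markup typical of Word-pasted HTML,
# not the exact code shown in the screenshots.
WORD_PASTED = (
    '<p class="MsoNormal" style="margin-bottom:0in;line-height:normal">'
    '<span style="font-size:12.0pt;font-family:&quot;Times New Roman&quot;,serif;'
    'mso-fareast-font-family:Calibri">Blogging takes time and focus.'
    '<o:p></o:p></span></p>'
)
CLEAN = "<p>Blogging takes time and focus.</p>"
VISIBLE_TEXT = "Blogging takes time and focus."


def markup_overhead(html: str, visible_text: str) -> int:
    """Return how many characters of the HTML are markup rather than visible text."""
    return len(html) - len(visible_text)


if __name__ == "__main__":
    print("Word-pasted overhead:", markup_overhead(WORD_PASTED, VISIBLE_TEXT))
    print("Clean overhead:", markup_overhead(CLEAN, VISIBLE_TEXT))
```

Both strings show the reader the same sentence. The only difference is how much extra markup surrounds it.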

Why You Should Care: User Experience (UX)

Have you ever made a few minor edits to your blog (say, deleting an extra paragraph space) only to have that small gesture make a huge impact on your paragraph style? Maybe the entire font size of that paragraph just changed. Or perhaps the alignment is thrown off. And no matter how many times you try to adjust the font size or paragraph alignment, things just won’t change!

A lot of those challenges are attributable to the bloat code that was copied from your word processor. Along with all of that junk code came a lot of inline styles: code attached directly to an individual element that instructs your web browser how to present it. So when you try to force even more inline styles into an already crowded space, things can get messy pretty quickly. (You should also know that inline styles generally override the global CSS styles that your website uses. These global CSS styles are important because they reinforce a consistent look-and-feel across your site as well as overall brand alignment.)
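If you are comfortable with a little scripting, the cleanup can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not a complete sanitizer: the lists of attributes and tags to drop (DROP_ATTRS, DROP_TAGS) and the function name strip_inline_styles are choices made for this example, so adjust them to what your own pasted HTML contains.

```python
from html import escape
from html.parser import HTMLParser

# Assumed for this sketch: the attributes and wrapper tags that usually carry pasted styling.
DROP_ATTRS = {"style", "class", "lang"}
DROP_TAGS = {"span", "font", "o:p"}


class InlineStyleStripper(HTMLParser):
    """Rebuilds the HTML while leaving out inline styling and wrapper tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _open_tag(self, tag, attrs, closing=">"):
        if tag in DROP_TAGS:
            return
        kept = ""
        for name, value in attrs:
            if name in DROP_ATTRS:
                continue
            if value is None:  # boolean attribute such as "disabled"
                kept += f" {name}"
            else:
                kept += f' {name}="{escape(value, quote=True)}"'
        self.out.append(f"<{tag}{kept}{closing}")

    def handle_starttag(self, tag, attrs):
        self._open_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._open_tag(tag, attrs, closing=" />")

    def handle_endtag(self, tag):
        if tag not in DROP_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def handle_comment(self, data):
        pass  # drop comments, including Word's conditional comments


def strip_inline_styles(html: str) -> str:
    parser = InlineStyleStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    pasted = ('<p class="MsoNormal" style="font-size:12.0pt">'
              '<span style="color:red">Hi &amp; bye</span><o:p></o:p></p>')
    print(strip_inline_styles(pasted))
```

With the inline styles gone, your site's global CSS is back in control of how the paragraph looks.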

After much fumbling around with the WYSIWYG editor, you may be able to force things into getting “close enough.” And sometimes, “close enough” is what’s saving you from pulling out your hair or from having an Office Space moment with your laptop.

But, when you leave the junk code in place along with those paragraphs that just won’t resize, you’re at risk of compromising your website visitors’ overall user experience (UX). A blog post with odd text sizes and off-brand styles will distract your site visitors from their primary experience. Additionally, these style issues can reflect poorly on your overall brand by painting a picture of unprofessionalism.

What Can You Do to Prevent Bloated HTML in Blog Posts?

1. Compose directly in your blogging editor.

This is the quickest, most immediate solution. Just skip the word processor altogether and compose right in your blogging platform. If you use WordPress, try taking advantage of the Distraction Free Writing tool. This tool hides everything on your screen except your essential composition tools.


2. Use the ‘Paste as Text’ button.

Some blogging editors, such as WordPress, have a ‘Paste as Text’ button. By activating this function, you can paste in text from MS Word or another word processor, and the blogging editor will automatically strip out the HTML formatting.


3. Paste into a plain text editor first.

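If your blogging platform does not have a ‘Paste as Text’ function, you can paste text into a plain text editor (such as Notepad) first. The plain text editor discards the formatting, so when you copy the text from there and paste it into your blogging platform, only the words come along.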

4. Test various word processors.

Microsoft Word is among the worst offenders when it comes to inserting junk code. This is primarily because the HTML that Word hands over when you copy is built to preserve Word's own formatting, so it carries Office-specific markup (MsoNormal classes, mso- style properties, o:p tags, conditional comments, and Vector Markup Language, or VML, for graphics) that your blog has no use for. Unfortunately, translating this markup into clean HTML is never a smooth transition. Other tools, such as Google Docs, will not insert nearly as much bloat code as MS Word, though you'll still need to go in and remove some excess code.
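Options 2 and 3 above both come down to the same idea: throw away the markup and keep only the words. As a rough illustration of that idea (not how WordPress or any particular editor actually implements its button), here is a minimal Python sketch using the standard library. The BLOCK_TAGS set and the paste_as_text name are assumptions made for this example.

```python
from html.parser import HTMLParser

# Assumed for this sketch: tags whose closing marks the end of a paragraph of text.
BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collects visible text and ignores every tag and attribute."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n\n")  # paragraph break after each block element

    def handle_data(self, data):
        self.parts.append(data)


def paste_as_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    chunks = "".join(parser.parts).split("\n\n")
    # Normalize whitespace inside each paragraph and drop empty ones.
    paragraphs = [" ".join(chunk.split()) for chunk in chunks]
    return "\n\n".join(p for p in paragraphs if p)


if __name__ == "__main__":
    pasted = ('<h2 style="margin:0in">Title</h2>'
              '<p class="MsoNormal"><span>Hello <b>world</b></span></p>')
    print(paste_as_text(pasted))
```

Note the trade-off: this removes intentional formatting such as bold text and links along with the junk, so you will need to reapply those in your blogging editor.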


Using any one of the above approaches will dramatically reduce or eliminate bloat code in your blog posts. But if you blog regularly, it's a good idea to develop a basic understanding of HTML and CSS. This knowledge is a valuable addition to your blogging toolbox. So in the future, when those pesky paragraphs just won’t align correctly, you’ll feel at ease switching over to the HTML view and cleaning up the inline styles.

Image by Tim Bartel via Flickr, licensed under CC BY-SA