Convert Word HTML to W3C Valid code in Dreamweaver

May 21st, 2007 by Steve

If you have ever used Microsoft Word to create a web page, you may have noticed a lot of extra code when you examined the generated HTML. If you have Word HTML that you want to clean up rather than rebuild the page from scratch, try the following steps. In Dreamweaver, start a new document, then switch to the Design view.
Open the Word-generated HTML page in a browser (I prefer Firefox, but IE seems to work for this too). Select all (you can use the Edit menu) and copy the page. In Dreamweaver, paste into the Design view. Switch to Code view, and you will probably notice some OLE links (something like this: <A name="OLE_LINK8">) along with random empty tags. Remove all of these.
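If there are a lot of these leftovers, the manual removal described above could also be scripted once the markup is saved to a file. The following is only a rough sketch in Python, not something Dreamweaver provides: the function name strip_word_leftovers and the list of tags treated as "empty" are assumptions made for illustration, and regular expressions are a blunt tool for HTML, so check the result by eye.

```python
import re

# Anchors such as <A name="OLE_LINK8">...</A>; quotes are optional and
# matching is case-insensitive because pasted markup varies.
OLE_ANCHOR = re.compile(
    r'<a\s+name=["\']?OLE_LINK\d+["\']?\s*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)

# Assumed list of tags that are safe to drop when they hold only whitespace.
# Anchors are deliberately left out: an empty named anchor can be a real target.
EMPTY_TAG = re.compile(
    r'<(p|span|font|b|i|u|strong|em)\b[^>]*>\s*</\1\s*>',
    re.IGNORECASE,
)


def strip_word_leftovers(html: str) -> str:
    """Unwrap OLE_LINK anchors (keeping their text) and drop empty tags."""
    previous = None
    # Repeat until nothing changes: removing one empty tag can leave
    # its parent empty, and OLE anchors are sometimes nested.
    while previous != html:
        previous = html
        html = OLE_ANCHOR.sub(r'\1', html)
        html = EMPTY_TAG.sub('', html)
    return html


if __name__ == "__main__":
    sample = '<p><a name="OLE_LINK8">Hello</a> <span></span>world</p><p><b> </b></p>'
    print(strip_word_leftovers(sample))
```

Paragraphs that contain only a non-breaking space entity are left alone by this sketch, since those are sometimes used on purpose as spacers.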
Next, in the Commands menu, click “Clean Up Word HTML”. This gets rid of the extra style sheet garbage created by Word.
If you want the pages to validate (W3C), links may need to be fixed. If any of them had an encoded ampersand (&amp;), it was changed to a bare ampersand (&) when you pasted the page into Dreamweaver, so you will need to change them all back. Be careful if you use Find and Replace: any ampersands that are already encoded (&amp;) will be broken, because the replacement turns them into &amp;amp;. You may want to just do a “Find” first and list all the occurrences of the ampersand.
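One way around the Find and Replace problem is to encode only the ampersands that do not already begin an entity. Here is a minimal sketch in Python, assuming the page has been saved to a file and read into a string; the name escape_bare_ampersands is made up for this example, and the pattern is a simplification that treats anything shaped like &name; or &#123; as already encoded.

```python
import re

# An "&" that is NOT followed by a named entity (amp;), a decimal
# reference (#38;), or a hex reference (#x26;).
BARE_AMPERSAND = re.compile(
    r'&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)'
)


def escape_bare_ampersands(html: str) -> str:
    """Encode bare ampersands as &amp; and leave existing entities alone."""
    return BARE_AMPERSAND.sub('&amp;', html)


if __name__ == "__main__":
    link = '<a href="page.php?a=1&b=2&amp;c=3">Tom &amp; Jerry</a>'
    print(escape_bare_ampersands(link))
```

Running the function a second time on its own output changes nothing, which is the property a plain Find and Replace lacks.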
As with all our tips, feel free to add a helpful comment.
