
Convert .htm to .aspx

This article is less a tip or a piece of code than my answer to a problem I had, which you might find useful.
When I started this website back in, ummmm, 2000 or so, I used FrontPage for ease of use, or created documents in Word and saved them as .HTM pages that I exported to the site.
In those early days the server I was running on didn't support ASP.Net. It was a Yahoo site, and it was only later that I migrated to Brinkster and their beta ASP.Net-enabled hosting. Having a fair amount of existing content, and being a bit of a lazy fellow, I dipped my toe into the ASP.Net pool very slowly and left large parts of my site as HTML. This has remained true until today.

Recently, I decided that master pages were a nice way to give a consistent look to the site, or to certain groups of pages, but with about a hundred and fifty files to convert I had always been reluctant to actually do the work. Well, this week I decided that I had had enough and began transforming .HTM pages into .ASPX.

Generally, if an .HTM page doesn't have any special features, you can just rename it to .ASPX. If, however, like me, you want to apply a master page, then the process is a little more laborious, especially with a hundred or so pages to do by hand. Basically, one must create an ASPX page which nominates a master page, then copy the contents of the HTML HEAD and BODY tags into the correct content placeholders. Doing this by hand was such a pain that I elected to write a wee bit of code rather than visit each page and transform it myself.

Html2Aspx

To answer the needs of my site I created the Html2Aspx application that has gone through the entire site and reworked all the htm files, adding master pages and renaming all the old files just in case something went wrong.

DISCLAIMER:

I have created this tool because it suits my needs and because I know that damage to my site is my responsibility. If you go ahead and use this tool, by downloading it or by using the source code published here you implicitly agree that any damage caused to your site or to whatever site you apply the changes to is ENTIRELY YOUR RESPONSIBILITY.

The tool is in the form of a WPF application which can be downloaded from here. If you download this file you do so at your own risk.

So, how does it all work? Well, really, all we need is a path to the website that needs modification. From there we can find the set of master pages available and select one if we desire. The simple user interface allows this choice.

[Screenshot: the Html2Aspx user interface]
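The article does not show how the tool builds its list of master pages. One plausible approach, sketched here under the assumption that the master pages sit directly in the site root and use the .master extension, is a simple directory scan. The class and method names (`MasterPageFinder`, `FindMasterPages`) are made up for this illustration and are not taken from the tool.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class MasterPageFinder
{
    // Returns the file names (not full paths) of all *.master files found
    // directly inside siteRoot, sorted alphabetically. Returns an empty
    // list if the folder does not exist.
    public static List<string> FindMasterPages(string siteRoot)
    {
        if (!Directory.Exists(siteRoot))
            return new List<string>();

        return Directory.GetFiles(siteRoot, "*.master")
                        .Select(p => Path.GetFileName(p))
                        .OrderBy(name => name, StringComparer.OrdinalIgnoreCase)
                        .ToList();
    }

    public static void Main()
    {
        // Build a throwaway "site" so the sketch can run anywhere.
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(root);
        File.WriteAllText(Path.Combine(root, "Faq.master"), "");
        File.WriteAllText(Path.Combine(root, "Tips.master"), "");
        File.WriteAllText(Path.Combine(root, "index.htm"), "");

        // Lists the two .master files and ignores index.htm.
        foreach (string name in FindMasterPages(root))
            Console.WriteLine(name);

        Directory.Delete(root, true);
    }
}
```

In a WPF window the returned names could be bound to a list box or combo box so the user can pick one.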

When you've made all the relevant choices you can run the conversion. In my case I just applied my GDI+ FAQ master page to all of the files. I will sort out the various ones that need the Windows Forms tips and tricks master, etc., at a later date.

Pressing the convert button runs the following bit of code:

        // Needs: using System.IO; using mshtml; (add a reference to Microsoft.mshtml).
        // "master" is assumed to be a field of the window that holds the path of the
        // selected master page; it is set elsewhere and not shown here.
        private void button2_Click(object sender, RoutedEventArgs e)
        {
            if (Directory.Exists(textBox1.Text))
            {
                foreach (string s in Directory.GetFiles(textBox1.Text, "*.htm"))
                {
                    // Change only the extension; string.Replace would also alter
                    // any ".htm" that happens to appear elsewhere in the path.
                    string aspxfile = Path.ChangeExtension(s, ".aspx");

                    if (checkBox1.IsChecked == true)
                    {
                        // Apply master page
                        string MasterTemplate = @"<%@ Page Title="""" Language=""C#"" MasterPageFile=""~/{0}"" AutoEventWireup=""true"" %>

<asp:Content ID=""Content1"" ContentPlaceHolderID=""head"" Runat=""Server"">
{1}
</asp:Content>
<asp:Content ID=""Content2"" ContentPlaceHolderID=""ContentPlaceHolder1"" Runat=""Server"">
{2}
</asp:Content>

";
                        string htmlContent = File.ReadAllText(s);

                        // Let MSHTML parse the page so the head and body can be picked out
                        IHTMLDocument2 doc = (IHTMLDocument2)new mshtml.HTMLDocument();
                        doc.write(htmlContent);
                        doc.close();

                        IHTMLElement head = null;
                        IHTMLElement body = null;

                        foreach (IHTMLElement lmnt in doc.all)
                        {
                            if (lmnt.tagName.ToLower() == "head")
                                head = lmnt;
                            if (lmnt.tagName.ToLower() == "body")
                                body = lmnt;
                        }

                        if (head != null && body != null)
                        {
                            // got what we need: keep the original under a new extension
                            string oldfile = Path.ChangeExtension(s, ".oldhtm");
                            File.Move(s, oldfile);
                            listBox2.Items.Add(string.Format("Renamed {0} to {1}", s, oldfile));

                            // The template already supplies "~/", so pass only the file name.
                            // WriteAllText overwrites any existing .aspx file.
                            File.WriteAllText(aspxfile, string.Format(MasterTemplate,
                                System.IO.Path.GetFileName(master),
                                head.innerHTML,
                                body.innerHTML));

                            listBox2.Items.Add(string.Format("Created {0}", aspxfile));
                        }
                        else
                            listBox2.Items.Add(string.Format("Couldn't find head and / or body in {0}. File skipped.", s));
                    }
                    else
                    {
                        // No master page: simply rename the file
                        File.Move(s, aspxfile);
                        listBox2.Items.Add(string.Format("Renamed {0} to {1}", s, aspxfile));
                    }
                }
            }
        }
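A note on deriving the new file name, since it is easy to get wrong: `String.Replace` rewrites every ".htm" it finds in a path, whereas `Path.ChangeExtension` touches only the final extension. The sketch below uses made-up paths and hypothetical helper names (`NaiveRename`, `SafeRename`) purely to show the difference.

```csharp
using System;
using System.IO;

public static class ExtensionDemo
{
    // Replaces every ".htm" occurrence anywhere in the path.
    public static string NaiveRename(string path)
    {
        return path.Replace(".htm", ".aspx");
    }

    // Changes only the extension of the final path segment.
    public static string SafeRename(string path)
    {
        return Path.ChangeExtension(path, ".aspx");
    }

    public static void Main()
    {
        // Illustrative paths only; none of these files need to exist.
        string[] samples =
        {
            "site/faq.htm",
            "site/old.htm.backup/faq.htm",
            "site/page.html"
        };

        foreach (string sample in samples)
        {
            Console.WriteLine("{0}", sample);
            Console.WriteLine("  naive: {0}", NaiveRename(sample));
            Console.WriteLine("  safe:  {0}", SafeRename(sample));
        }
    }
}
```

For the second sample the naive version also renames the folder (site/old.aspx.backup/faq.aspx), and for the third it produces site/page.aspxl. The third case matters more than it looks: on the .NET Framework, a search pattern with a three-character extension such as "*.htm" can also return .html files.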

So, if the checkbox to use the master page is unchecked, I just rename the file. If the checkbox is checked and a master page file is selected, the following process takes place:

Each .htm file is loaded into an HTMLDocument.

The head and body tags are discovered; if there is no head or no body, the file is skipped. Otherwise the original file is renamed to .oldhtm.

The basic ASPX page with a master is provided in the form of a format string. It is shown here as it appears inside the C# verbatim string, which is why every quote is doubled:

<%@ Page Title="""" Language=""C#"" MasterPageFile=""~/{0}"" AutoEventWireup=""true"" %>

<asp:Content ID=""Content1"" ContentPlaceHolderID=""head"" Runat=""Server"">
{1}
</asp:Content>
<asp:Content ID=""Content2"" ContentPlaceHolderID=""ContentPlaceHolder1"" Runat=""Server"">
{2}
</asp:Content>

The name of the master page file, the inner HTML of the <HEAD> and the inner HTML of the <BODY> are used to populate the correct content holders. The file is then saved as .ASPX and voila! There you have it: an .ASPX file with a master page (or not) generated from an existing .HTM file.
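To make the substitution concrete, here is a small self-contained sketch that fills the same template with sample values. The helper name `BuildPage`, the master page name "Faq.master" and the sample head and body fragments are invented for the illustration.

```csharp
using System;

public static class TemplateDemo
{
    // Same shape as the template in the article:
    // {0} = master page file name, {1} = head content, {2} = body content.
    private const string MasterTemplate = @"<%@ Page Title="""" Language=""C#"" MasterPageFile=""~/{0}"" AutoEventWireup=""true"" %>

<asp:Content ID=""Content1"" ContentPlaceHolderID=""head"" Runat=""Server"">
{1}
</asp:Content>
<asp:Content ID=""Content2"" ContentPlaceHolderID=""ContentPlaceHolder1"" Runat=""Server"">
{2}
</asp:Content>
";

    // Builds the text of an .aspx page from the three pieces.
    public static string BuildPage(string masterFileName, string headHtml, string bodyHtml)
    {
        return string.Format(MasterTemplate, masterFileName, headHtml, bodyHtml);
    }

    public static void Main()
    {
        string page = BuildPage(
            "Faq.master",
            "<title>Hello</title>",
            "<p>Hello, world.</p>");

        Console.WriteLine(page);
    }
}
```

Two details are worth noticing. The template already contains "~/" in front of {0}, so the first argument should be the bare file name; passing "~/Faq.master" would produce "~/~/Faq.master". And the ContentPlaceHolderID values ("head" and "ContentPlaceHolder1") have to match the placeholder IDs declared in the chosen master page.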

What happens when someone requests an HTM page on your site now?

Ha haaa.. I have an article for that!


Copyright © Bob Powell 2000-2013.  All rights reserved.