As a web developer I feel too much of my time is spent on CSS. I am trying to come up with a solution where I can write re-usable CSS i.e. classes and reference these classes in the HTML without additional code in ASPX or ASCX files etc. or code-behind files. I want an intermediary which links up HTML elements with CSS classes.
What I want to achieve:
- Modify HTML immediately before transmission
- Select elements in the HTML
- Based on rules defined elsewhere (e.g. in a text file relating to the page currently being processed):
  - Add a CSS class reference to multiple HTML elements
  - Add multiple CSS class references to a single HTML element
How I envisage this working:
- Extend ASP.NET functions which generate final HTML
- Grab all the HTML as a string
- Pass the string into a constructor for an object with querying (e.g. XPath) methods
- Go through list of global rules e.g. for child ul of first div then class = "navigation"
- Go through list of page specific rules e.g. for child ul of first div then class &= " home"
- Get processed HTML from object e.g. obj.ToString
- ASP.NET to resume page generation using processed HTML
So what I need to know is:
- Where / how can I extend ASP.NET page generation functions (to get all HTML of page)
- What classes have element / node querying methods and access to attributes
Thanks for your help in advance.
P.S. I am developing ASP.NET Web Forms websites with VB.NET code-behinds running on IIS 7.
Check out my CsQuery project: https://github.com/jamietre/csquery or on nuget as "CsQuery".
This is a C# (.NET 4) port of jQuery. In basic performance tests (included in the project test suite) selectors are about 100 times faster than HTML Agility Pack + Fizzler (a css selector add-on for HAP); it's plenty fast for manipulating the output stream in real time on a typical web site. If you are amazon.com or something, of course, YMMV.
My initial purpose in developing this was to manipulate HTML from a content management system. Once I had it up and running, I found that using CSS selectors and the jQuery API is a whole lot more fun than using web controls and started using it as a primary HTML manipulation tool for server-rendered pages, and built it out to cover pretty much all of CSS, jQuery and the browser DOM. I haven't touched a web control since.
To intercept HTML in webforms with CsQuery you do this in the page codebehind:
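A minimal sketch of such a Render override follows. It assumes the `WebForms.CreateFromRender` helper and the `CsQueryHttpContext` type in the `CsQuery.Web` namespace behave as described in the project's README; the page class name and the two rules are hypothetical, taken from the question's example.

```csharp
using System.Web.UI;
using CsQuery;
using CsQuery.Web;

// Hypothetical page class; any Web Forms page code-behind would do.
public partial class HomePage : Page
{
    protected override void Render(HtmlTextWriter writer)
    {
        // Assumed helper: runs base.Render into a buffer and parses the output.
        CsQueryHttpContext csqContext =
            WebForms.CreateFromRender(Page, base.Render, writer);

        // The parsed output of this page, usable like a jQuery object.
        CQ doc = csqContext.Dom;

        // Global rule from the question: child ul of the first div.
        doc["div"].First().Children("ul").AddClass("navigation");

        // Page-specific rule: append a second class to the same elements.
        doc["div"].First().Children("ul").AddClass("home");

        // Write the modified HTML to the real response writer.
        csqContext.Render();
    }
}
```

In a real project the two rules would more likely be read from a per-page rules file than hard-coded, which is what the question asks for.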
To do the same thing in ASP.NET MVC please see this blog post describing that.
There is basic documentation for CsQuery on GitHub. Apart from getting HTML in and out, it works pretty much like jQuery. The WebForms object above is just to help you handle interacting with the HtmlTextWriter object and the Render method. The general-purpose usage is very simple.
Additionally, pretty much the entire browser DOM is available using the same methods you use in a browser. The indexer [0] returns the first element in the selection set, like jQuery; if you are used to writing JavaScript to manipulate HTML, it should be very familiar.
Of course in C# you have a wealth of other general-purpose tools like LINQ at your disposal. Alternatively:
When you're done manipulating the document, you'll probably want to get the HTML out:
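The general-purpose flow described above (parse, select, modify, render) can be sketched as follows. The HTML string and class names are made up for illustration; `CQ.Create`, the selector indexer, `Select`, `AddClass` and `Render` are the CsQuery calls assumed here.

```csharp
using System;
using CsQuery;

public static class CsQueryBasics
{
    public static void Main()
    {
        // Hypothetical input; in practice this is the page output.
        string html =
            "<div><ul><li><a href='/'>Home</a></li></ul></div>" +
            "<div><ul><li>Other</li></ul></div>";

        // Parse the HTML into a CQ object (the jQuery-like wrapper).
        CQ dom = CQ.Create(html);

        // Select with a CSS selector via the indexer and add a class.
        dom["li > a"].AddClass("foo");

        // Alternatively, the Select method takes the same selector string.
        dom.Select("div > ul").AddClass("list");

        // A numeric index returns a DOM element rather than a CQ object.
        IDomObject firstDiv = dom["div"][0];
        Console.WriteLine(firstDiv.NodeName);

        // Get the modified HTML back out as a string.
        Console.WriteLine(dom.Render());
    }
}
```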
That's all there is to it. There are a vast number of methods on the CQ object, covering all the jQuery DOM manipulation techniques. There are also utility methods for handling JSON, and it has extensive support for dynamic and anonymous types to make passing data structures (e.g. a set of CSS classes) as easy as possible -- much like jQuery.
Some More Advanced Stuff
I don't recommend doing this unless you are familiar with lower-level tinkering with ASP.NET's HTTP workflow. None of it is impossible, but there will be a learning curve if you've never heard of an HttpHandler.
If you want to skip the WebForms engine altogether, you can create an IHttpHandler that automatically parses HTML files. This would definitely perform better than overlaying on the ASPX engine -- who knows, maybe even faster than doing a similar amount of server-side processing with web controls. You can then register your handler using web.config for specific extensions (like htm and html).
Yet another way to automatically intercept is with routing. You can use the MVC routing library in a webforms app with no trouble; here's one description of how to do this. Then you can create a route that matches whatever pattern you want (again, perhaps *.html) and pass handling off to a custom IHttpHandler or class. In this case, you're doing everything: you will need to look at the path, load the file from the file system, parse it with CsQuery, and stream the response.
Using either mechanism, you'll need a way to tell your project what code to run for each page, of course. That is, just because you've created a nifty HTML parser, how do you then tell it to run the correct "code behind" for that page?
MVC does this by just locating a controller with the name of "PageNameController.cs" and calling a method that matches the name of the parameter. You could do whatever you want; e.g. you could add an element:
Your generic handler code could look for such an element, and then use reflection to locate the correct named class & method to call. This is pretty involved, and beyond the scope of this answer; but if you're looking to build a whole new framework or something this is how you would go about it.
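As a rough sketch of the handler approach, assuming it is registered for .htm/.html requests in web.config: the class name `HtmlFileHandler` and the hard-coded rule are hypothetical, and the reflection-based "code behind" lookup is left out.

```csharp
using System.IO;
using System.Web;
using CsQuery;

// Hypothetical handler that serves static HTML files after post-processing.
public class HtmlFileHandler : IHttpHandler
{
    // No per-request state is kept, so one instance can be reused.
    public bool IsReusable
    {
        get { return true; }
    }

    public void ProcessRequest(HttpContext context)
    {
        // Map the requested URL to a file on disk.
        string path = context.Request.PhysicalPath;
        if (!File.Exists(path))
        {
            context.Response.StatusCode = 404;
            return;
        }

        // Load and parse the file with CsQuery.
        CQ dom = CQ.Create(File.ReadAllText(path));

        // Example rule only; real rules would come from a per-page source.
        dom["div"].First().Children("ul").AddClass("navigation");

        // Stream the processed HTML back to the client.
        context.Response.ContentType = "text/html";
        context.Response.Write(dom.Render());
    }
}
```

A production version would also want caching and encoding handling, which this sketch skips.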
Intercepting the content of the page prior to it being sent is rather simple. I did this a while back on a project that compressed content on the fly: http://optimizerprime.codeplex.com/ (It's ugly, but it did its job and you might be able to salvage some of the code). Anyway, what you want to do is the following:
1) Create a Stream object that saves the content of the page until Flush is called. For instance I used this in my compression project: http://optimizerprime.codeplex.com/SourceControl/changeset/view/83171#1795869 Like I said before, it's not pretty. But my point is that you'll need to create your own Stream class that will do what you want (in this case give you the string output of the page, parse/modify the string, and then output it to the user).
2) Assign it to the page's filter property (Page.Response.Filter). Note that you need to do it rather early on so you can catch all of the content. I did this with an HTTP module that ran on the PreRequestHandlerExecute event. But if you did something like this:
That would also most likely work.
3) You should be able to use something like Html Agility Pack to parse the HTML and modify it from there.
That to me seems like the easiest approach.
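The three steps above can be sketched roughly as follows. The class name `HtmlRewriteFilter` and the XPath rule are hypothetical. As a deliberate departure from the description, this sketch rewrites the buffered HTML in Close rather than Flush, on the assumption that Flush may be called before the whole page has been written.

```csharp
using System;
using System.IO;
using System.Text;
using HtmlAgilityPack;

// Hypothetical response filter: buffers the page, rewrites it once, passes it on.
public class HtmlRewriteFilter : Stream
{
    private readonly Stream _inner;
    private readonly Encoding _encoding;
    private readonly MemoryStream _buffer = new MemoryStream();
    private bool _closed;

    public HtmlRewriteFilter(Stream inner, Encoding encoding)
    {
        _inner = inner;
        _encoding = encoding;
    }

    // Step 1: collect everything ASP.NET writes instead of sending it.
    public override void Write(byte[] buffer, int offset, int count)
    {
        _buffer.Write(buffer, offset, count);
    }

    // Intentionally empty: output is only produced when the stream closes.
    public override void Flush() { }

    public override void Close()
    {
        if (_closed) return;
        _closed = true;

        // Step 3: parse the buffered HTML with Html Agility Pack.
        var doc = new HtmlDocument();
        doc.LoadHtml(_encoding.GetString(_buffer.ToArray()));

        // Example rule: ul children of the first div (null when nothing matches).
        HtmlNodeCollection lists = doc.DocumentNode.SelectNodes("(//div)[1]/ul");
        if (lists != null)
        {
            foreach (HtmlNode ul in lists)
            {
                string existing = ul.GetAttributeValue("class", "");
                ul.SetAttributeValue("class", (existing + " navigation").Trim());
            }
        }

        // Send the modified HTML down the original filter chain.
        byte[] output = _encoding.GetBytes(doc.DocumentNode.OuterHtml);
        _inner.Write(output, 0, output.Length);
        _inner.Flush();
        _inner.Close();
        base.Close();
    }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }
    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotSupportedException();
    }
    public override void SetLength(long value)
    {
        throw new NotSupportedException();
    }
}
```

Step 2 would then be a single assignment made early in the request, for example in a page's Page_Load or in an HTTP module: `Response.Filter = new HtmlRewriteFilter(Response.Filter, Response.ContentEncoding);`. A real filter should probably also check that the response content type is HTML before rewriting.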