I need to inline css from a stylesheet in c#.
Like how this works.
http://www.mailchimp.com/labs/inlinecss.php
The css is simple, just classes, no fancy selectors.
I was contemplating using a regex (?<rule>(?<selector>[^{}]+){(?<style>[^{}]+)})+
to strip the rules from the css, and then attempting to do simple string replaces where the classes are called, but some of the html elements already have a style tag, so I'd have to account for that as well.
Is there a simpler approach? Or something already written in c#?
UPDATE - Sep 16, 2010
I've been able to come up with a simple CSS inliner provided your html is also valid xml. It uses a regex to get all the styles in your <style />
element. Then converts the css selectors to xpath expressions, and adds the style inline to the matching elements, before any pre-existing inline style.
Note, that the CssToXpath is not fully implemented, there are some things it just can't do... yet.
CssInliner.cs
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml.Linq;
using System.Xml.XPath;
namespace CssInliner
{
public class CssInliner
{
private static Regex _matchStyles = new Regex("\\s*(?<rule>(?<selector>[^{}]+){(?<style>[^{}]+)})",
RegexOptions.IgnoreCase
| RegexOptions.CultureInvariant
| RegexOptions.IgnorePatternWhitespace
| RegexOptions.Compiled
);
public List<Match> Styles { get; private set; }
public string InlinedXhtml { get; private set; }
private XElement XhtmlDocument { get; set; }
public CssInliner(string xhtml)
{
XhtmlDocument = ParseXhtml(xhtml);
Styles = GetStyleMatches();
foreach (var style in Styles)
{
if (!style.Success)
return;
var cssSelector = style.Groups["selector"].Value.Trim();
var xpathSelector = CssToXpath.Transform(cssSelector);
var cssStyle = style.Groups["style"].Value.Trim();
foreach (var element in XhtmlDocument.XPathSelectElements(xpathSelector))
{
var inlineStyle = element.Attribute("style");
var newInlineStyle = cssStyle + ";";
if (inlineStyle != null && !string.IsNullOrEmpty(inlineStyle.Value))
{
newInlineStyle += inlineStyle.Value;
}
element.SetAttributeValue("style", newInlineStyle.Trim().NormalizeCharacter(';').NormalizeSpace());
}
}
XhtmlDocument.Descendants("style").Remove();
InlinedXhtml = XhtmlDocument.ToString();
}
private List<Match> GetStyleMatches()
{
var styles = new List<Match>();
var styleElements = XhtmlDocument.Descendants("style");
foreach (var styleElement in styleElements)
{
var matches = _matchStyles.Matches(styleElement.Value);
foreach (Match match in matches)
{
styles.Add(match);
}
}
return styles;
}
private static XElement ParseXhtml(string xhtml)
{
return XElement.Parse(xhtml);
}
}
}
CssToXpath.cs
using System.Text.RegularExpressions;
namespace CssInliner
{
public static class CssToXpath
{
public static string Transform(string css)
{
#region Translation Rules
// References: http://ejohn.org/blog/xpath-css-selectors/
// http://code.google.com/p/css2xpath/source/browse/trunk/src/css2xpath.js
var regexReplaces = new[] {
// add @ for attribs
new RegexReplace {
Regex = new Regex(@"\[([^\]~\$\*\^\|\!]+)(=[^\]]+)?\]", RegexOptions.Multiline),
Replace = @"[@$1$2]"
},
// multiple queries
new RegexReplace {
Regex = new Regex(@"\s*,\s*", RegexOptions.Multiline),
Replace = @"|"
},
// , + ~ >
new RegexReplace {
Regex = new Regex(@"\s*(\+|~|>)\s*", RegexOptions.Multiline),
Replace = @"$1"
},
//* ~ + >
new RegexReplace {
Regex = new Regex(@"([a-zA-Z0-9_\-\*])~([a-zA-Z0-9_\-\*])", RegexOptions.Multiline),
Replace = @"$1/following-sibling::$2"
},
new RegexReplace {
Regex = new Regex(@"([a-zA-Z0-9_\-\*])\+([a-zA-Z0-9_\-\*])", RegexOptions.Multiline),
Replace = @"$1/following-sibling::*[1]/self::$2"
},
new RegexReplace {
Regex = new Regex(@"([a-zA-Z0-9_\-\*])>([a-zA-Z0-9_\-\*])", RegexOptions.Multiline),
Replace = @"$1/$2"
},
// all unescaped stuff escaped
new RegexReplace {
Regex = new Regex(@"\[([^=]+)=([^'|""][^\]]*)\]", RegexOptions.Multiline),
Replace = @"[$1='$2']"
},
// all descendant or self to //
new RegexReplace {
Regex = new Regex(@"(^|[^a-zA-Z0-9_\-\*])(#|\.)([a-zA-Z0-9_\-]+)", RegexOptions.Multiline),
Replace = @"$1*$2$3"
},
new RegexReplace {
Regex = new Regex(@"([\>\+\|\~\,\s])([a-zA-Z\*]+)", RegexOptions.Multiline),
Replace = @"$1//$2"
},
new RegexReplace {
Regex = new Regex(@"\s+\/\/", RegexOptions.Multiline),
Replace = @"//"
},
// :first-child
new RegexReplace {
Regex = new Regex(@"([a-zA-Z0-9_\-\*]+):first-child", RegexOptions.Multiline),
Replace = @"*[1]/self::$1"
},
// :last-child
new RegexReplace {
Regex = new Regex(@"([a-zA-Z0-9_\-\*]+):last-child", RegexOptions.Multiline),
Replace = @"$1[not(following-sibling::*)]"
},
// :only-child
new RegexReplace {
Regex = new Regex(@"([a-zA-Z0-9_\-\*]+):only-child", RegexOptions.Multiline),
Replace = @"*[last()=1]/self::$1"
},
// :empty
new RegexReplace {
Regex = new Regex(@"([a-zA-Z0-9_\-\*]+):empty", RegexOptions.Multiline),
Replace = @"$1[not(*) and not(normalize-space())]"
},
// |= attrib
new RegexReplace {
Regex = new Regex(@"\[([a-zA-Z0-9_\-]+)\|=([^\]]+)\]", RegexOptions.Multiline),
Replace = @"[@$1=$2 or starts-with(@$1,concat($2,'-'))]"
},
// *= attrib
new RegexReplace {
Regex = new Regex(@"\[([a-zA-Z0-9_\-]+)\*=([^\]]+)\]", RegexOptions.Multiline),
Replace = @"[contains(@$1,$2)]"
},
// ~= attrib
new RegexReplace {
Regex = new Regex(@"\[([a-zA-Z0-9_\-]+)~=([^\]]+)\]", RegexOptions.Multiline),
Replace = @"[contains(concat(' ',normalize-space(@$1),' '),concat(' ',$2,' '))]"
},
// ^= attrib
new RegexReplace {
Regex = new Regex(@"\[([a-zA-Z0-9_\-]+)\^=([^\]]+)\]", RegexOptions.Multiline),
Replace = @"[starts-with(@$1,$2)]"
},
// != attrib
new RegexReplace {
Regex = new Regex(@"\[([a-zA-Z0-9_\-]+)\!=([^\]]+)\]", RegexOptions.Multiline),
Replace = @"[not(@$1) or @$1!=$2]"
},
// ids
new RegexReplace {
Regex = new Regex(@"#([a-zA-Z0-9_\-]+)", RegexOptions.Multiline),
Replace = @"[@id='$1']"
},
// classes
new RegexReplace {
Regex = new Regex(@"\.([a-zA-Z0-9_\-]+)", RegexOptions.Multiline),
Replace = @"[contains(concat(' ',normalize-space(@class),' '),' $1 ')]"
},
// normalize multiple filters
new RegexReplace {
Regex = new Regex(@"\]\[([^\]]+)", RegexOptions.Multiline),
Replace = @" and ($1)"
},
};
#endregion
foreach (var regexReplace in regexReplaces)
{
css = regexReplace.Regex.Replace(css, regexReplace.Replace);
}
return "//" + css;
}
}
struct RegexReplace
{
public Regex Regex;
public string Replace;
}
}
And some tests
[TestMethod]
public void TestCssToXpathRules()
{
var translations = new Dictionary<string, string>
{
{ "*", "//*" },
{ "p", "//p" },
{ "p > *", "//p/*" },
{ "#foo", "//*[@id='foo']" },
{ "*[title]", "//*[@title]" },
{ ".bar", "//*[contains(concat(' ',normalize-space(@class),' '),' bar ')]" },
{ "div#test .note span:first-child", "//div[@id='test']//*[contains(concat(' ',normalize-space(@class),' '),' note ')]//*[1]/self::span" }
};
foreach (var translation in translations)
{
var expected = translation.Value;
var result = CssInliner.CssToXpath.Transform(translation.Key);
Assert.AreEqual(expected, result);
}
}
[TestMethod]
public void HtmlWithMultiLineClassStyleReturnsInline()
{
#region var html = ...
var html = XElement.Parse(@"<html>
<head>
<title>Hello, World Page!</title>
<style>
.redClass {
background: red;
color: purple;
}
</style>
</head>
<body>
<div class=""redClass"">Hello, World!</div>
</body>
</html>").ToString();
#endregion
#region const string expected ...
var expected = XElement.Parse(@"<html>
<head>
<title>Hello, World Page!</title>
</head>
<body>
<div class=""redClass"" style=""background: red; color: purple;"">Hello, World!</div>
</body>
</html>").ToString();
#endregion
var result = new CssInliner.CssInliner(html);
Assert.AreEqual(expected, result.InlinedXhtml);
}
There are more tests, but, they import html files for the input and expected output and I'm not posting all that!
But I should post the Normalize extension methods!
private static readonly Regex NormalizeSpaceRegex = new Regex(@"\s{2,}", RegexOptions.None);
public static string NormalizeSpace(this string data)
{
return NormalizeSpaceRegex.Replace(data, @" ");
}
public static string NormalizeCharacter(this string data, char character)
{
var normalizeCharacterRegex = new Regex(character + "{2,}", RegexOptions.None);
return normalizeCharacterRegex.Replace(data, character.ToString());
}
Excellent question.
I have no idea if there is a .NET solution, but I found a Ruby program called Premailer that claims to inline CSS. If you want to use it you have a couple options:
Since you're already 90% of the way there with your current implementation, why don't you use your existing framework but replace the XML parsing with an HTML parser instead? One of the more popular ones out there is the HTML Agility Pack. It supports XPath queries and even has a LINQ interface similar to the standard .NET interface provided for XML so it should be a fairly straightforward replacement.
I have a project on Github that makes CSS inline. It's very simple, and support mobile styles. Read more on my blog: http://martinnormark.com/move-css-inline-premailer-net
Chad, do you necessarily have to add the CSS inline? Or could you maybe be better off by adding a
<style>
block to your<head>
? This will in essence replace the need for a reference to a CSS file as well plus maintain the rule that the actual inline rules override the ones set in the header/referenced css file.(sorry, forgot to add the quotes for code)
Here is an idea, why dont you make a post call to http://www.mailchimp.com/labs/inlinecss.php using c#. from analysis using firebug it looks like the post call needs 2 params html and strip which takes values (on/off) the result is in a param called text.
here is a sample on how to make a post call using c#
I'd recommend using an actual CSS parser rather than Regexes. You don't need to parse the full language since you're interested mostly in reproduction, but in any case such parsers are available (and for .NET too). For example, take a look at antlr's list of grammars, specifically a CSS 2.1 grammar or a CSS3 grammar. You can possibly strip large parts of both grammars if you don't mind sub-optimal results wherein inline styles may include duplicate definitions, but to do this well, you'll need some idea of internal CSS logic to be able to resolve shorthand attributes.
In the long run, however, this will certainly be a lot less work than a neverending series of adhoc regex fixes.