Blog providers such as Tumblr and Blogger allow users to write scripts in their own blogs.
This lets users easily add AdSense, Analytics and counters to their blogs.
How can I keep both security and customization?
What kind of scripts should I filter?
Thx :)
If every blog is going to be on its own domain (not a shared second-level domain like blogname.myblog.com!), chances are there is no need to filter anything at all.
The Same Origin Policy will prevent sites from having access to anything important (like session cookies that could be hijacked to break into other blogs, or administrative URLs).
There is always the danger of a malicious user adding an iframe pointing to a malware-infected site, or doing something else evil. But there is no chance for you to stop that reliably. Every hosting company allowing their clients to upload HTML has the exact same problem. I guess nothing can be done against that except oversight, having each blogger sign some Terms & Conditions, and kicking out anybody who violates them.
If you are planning to run the blogs on a shared domain, it becomes potentially more difficult, because blogs could access each other's cookies, and possibly those of the admin area. There'd be a number of things that you would have to be aware of.
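To make the difference between the two setups concrete, here is a small sketch that runs in Node.js. It compares origins with the standard URL class and uses a simplified version of cookie domain matching. The host names and the helpers sameOrigin and cookieSentTo are made up for illustration.

```javascript
// Two URLs share an origin only if scheme, host and port all match.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Simplified cookie domain matching: a cookie set with Domain=myblog.com
// is sent to myblog.com itself and to every subdomain of it.
function cookieSentTo(cookieDomain, host) {
  return host === cookieDomain || host.endsWith("." + cookieDomain);
}

// Separate domains: different origins, nothing shared.
console.log(sameOrigin("https://alice.example/", "https://bob.example/")); // false
console.log(cookieSentTo("alice.example", "bob.example")); // false

// Shared domain: still different origins, but a cookie scoped to the
// parent domain reaches every blog under it.
console.log(sameOrigin("https://alice.myblog.com/", "https://bob.myblog.com/")); // false
console.log(cookieSentTo("myblog.com", "alice.myblog.com")); // true
console.log(cookieSentTo("myblog.com", "bob.myblog.com")); // true
```

So on a shared domain the risk comes mainly from cookies scoped to the parent domain. Keeping session cookies host-only (no Domain attribute) and HttpOnly is one way to reduce it.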
This is a hard problem, and it really depends on how stringent you are trying to be. One way would be to have users write in a new language that you preprocess into JS, so that only the things you allow are possible. Another way is to try to blacklist obvious things to avoid XSS and cookie stealing.
The real issue is that you can't just do a string find-and-replace:
alert(document.cookie)
can be written
゚ω゚ノ= /`m´)ノ ~┻━┻ //´∇`/ [''];
o=(゚ー゚) ==3; c=(゚Θ゚) =(゚ー゚)-(゚ー゚);
(゚Д゚) =(゚Θ゚)= (o^^o)/
(o^^o);(゚Д゚)={゚Θ゚: '' ,゚ω゚ノ :
((゚ω゚ノ==3) +'') [゚Θ゚] ,゚ー゚ノ :(゚ω゚ノ+
'')[o^^o -(゚Θ゚)] ,゚Д゚ノ:((゚ー゚==3)
+'')[゚ー゚] }; (゚Д゚) [゚Θ゚] =((゚ω゚ノ==3) +'') [c^^o];(゚Д゚) ['c'] = ((゚Д゚)+'') [ (゚ー゚)+(゚ー゚)-(゚Θ゚)
];(゚Д゚) ['o'] = ((゚Д゚)+'')
[゚Θ゚];(゚o゚)=(゚Д゚) ['c']+(゚Д゚)
['o']+(゚ω゚ノ +'')[゚Θ゚]+ ((゚ω゚ノ==3)
+'') [゚ー゚] + ((゚Д゚) +'') [(゚ー゚)+(゚ー゚)]+ ((゚ー゚==3) +'')
[゚Θ゚]+((゚ー゚==3) +'') [(゚ー゚) -
(゚Θ゚)]+(゚Д゚) ['c']+((゚Д゚)+'')
[(゚ー゚)+(゚ー゚)]+ (゚Д゚) ['o']+((゚ー゚==3)
+'') [゚Θ゚];(゚Д゚) [''] =(o^^o) [゚o゚] [゚o゚];(゚ε゚)=((゚ー゚==3) +'') [゚Θ゚]+
(゚Д゚) .゚Д゚ノ+((゚Д゚)+'') [(゚ー゚) +
(゚ー゚)]+((゚ー゚==3) +'') [o^^o
-゚Θ゚]+((゚ー゚==3) +'') [゚Θ゚]+ (゚ω゚ノ +'') [゚Θ゚]; (゚ー゚)+=(゚Θ゚); (゚Д゚)[゚ε゚]='\'; (゚Д゚).゚Θ゚ノ=(゚Д゚+
゚ー゚)[o^^o -(゚Θ゚)];(o゚ー゚o)=(゚ω゚ノ
+'')[c^^o];(゚Д゚) [゚o゚]='\"';(゚Д゚) [''] ( (゚Д゚) [''] (゚ε゚+(゚Д゚)[゚o゚]+
(゚Д゚)[゚ε゚]+(゚Θ゚)+ (゚ー゚)+ (゚Θ゚)+
(゚Д゚)[゚ε゚]+(゚Θ゚)+ ((゚ー゚) + (゚Θ゚))+
(゚ー゚)+ (゚Д゚)[゚ε゚]+(゚Θ゚)+ (゚ー゚)+ ((゚ー゚)
+ (゚Θ゚))+ (゚Д゚)[゚ε゚]+(゚Θ゚)+ ((o^^o) +(o^^o))+ ((o^^o) - (゚Θ゚))+ (゚Д゚)[゚ε゚]+(゚Θ゚)+ ((o^^o) +(o^^o))+
(゚ー゚)+ (゚Д゚)[゚ε゚]+((゚ー゚) + (゚Θ゚))+
(c^^o)+ (゚Д゚)[゚ε゚]+(゚Θ゚)+ (゚ー゚)+
(゚ー゚)+ (゚Д゚)[゚ε゚]+(゚Θ゚)+ ((゚ー゚) +
(゚Θ゚))+ ((゚ー゚) + (o^^o))+
(゚Д゚)[゚ε゚]+(゚Θ゚)+ (゚ー゚)+ (o^^o)+
(゚Д゚)[゚ε゚]+(゚Θ゚)+ ((o^^o) +(o^^o))+
((゚ー゚) + (゚Θ゚))+ (゚Д゚)[゚ε゚]+(゚Θ゚)+
((゚ー゚) + (゚Θ゚))+ ((゚ー゚) + (゚Θ゚))+
(゚Д゚)[゚ε゚]+(゚Θ゚)+ (゚ー゚)+ ((゚ー゚) +
(゚Θ゚))+ (゚Д゚)[゚ε゚]+(゚Θ゚)+ ((゚ー゚) +
(゚Θ゚))+ ((o^^o) +(o^^o))+
(゚Д゚)[゚ε゚]+(゚Θ゚)+ ((o^^o) +(o^^o))+
(゚ー゚)+ (゚Д゚)[゚ε゚]+((゚ー゚) + (゚Θ゚))+
((o^^o) +(o^^o))+ (゚Д゚)[゚ε゚]+(゚Θ゚)+
(゚ー゚)+ (o^^o)+ (゚Д゚)[゚ε゚]+(゚Θ゚)+
((゚ー゚) + (゚Θ゚))+ ((゚ー゚) + (o^^o))+
(゚Д゚)[゚ε゚]+(゚Θ゚)+ ((゚ー゚) + (゚Θ゚))+
((゚ー゚) + (o^^o))+ (゚Д゚)[゚ε゚]+(゚Θ゚)+
((゚ー゚) + (゚Θ゚))+ (o^^o)+
(゚Д゚)[゚ε゚]+(゚Θ゚)+ ((゚ー゚) + (゚Θ゚))+
(゚Θ゚)+ (゚Д゚)[゚ε゚]+(゚Θ゚)+ (゚ー゚)+ ((゚ー゚)
+ (゚Θ゚))+ (゚Д゚)[゚ε゚]+((゚ー゚) + (゚Θ゚))+ (゚Θ゚)+ (゚Д゚)[゚o゚]) (゚Θ゚)) ('');
A silly example, but you can see how hard it is to manually filter this.
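The same point can be shown with something much shorter. Below is a minimal sketch of a naive string blacklist and a script that slips past it. The blacklist entries and the function name naiveFilterAllows are made up for illustration.

```javascript
// A naive blacklist: reject any script that mentions these strings.
const BLACKLIST = ["document.cookie", "alert(", "eval("];

// Returns true when the filter would let the script through.
function naiveFilterAllows(source) {
  return !BLACKLIST.some((bad) => source.includes(bad));
}

const plain = "alert(document.cookie)";
// Does the same thing in a browser, but none of the blacklisted
// strings appear literally in the source text.
const disguised = "window['al' + 'ert'](document['coo' + 'kie'])";

console.log(naiveFilterAllows(plain));     // false
console.log(naiveFilterAllows(disguised)); // true
```

Adding more patterns does not fix this, because the names can be assembled at runtime in any number of ways.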
There's no way to filter anything, but JavaScript can run in independent contexts. The blog framework executes its own (important) JavaScript code in one context and lets the blogger execute code in another. Those contexts cannot reach each other: the blogger can't write JS that conflicts with, negates or corrupts the blog framework's JS.
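One possible way to get such a separate context (an assumption on my part, not necessarily the mechanism meant above) is a sandboxed iframe. The sketch below only builds the markup as a string, so it runs in Node.js. The helper names escapeAttr and sandboxedFrame are hypothetical.

```javascript
// Escape a string for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Wrap blogger-supplied script in a sandboxed iframe. With "allow-scripts"
// but without "allow-same-origin", the frame runs in a unique opaque origin,
// so it cannot read the parent page's cookies or touch its DOM.
function sandboxedFrame(bloggerScript) {
  const inner = "<script>" + bloggerScript + "<\/script>";
  return '<iframe sandbox="allow-scripts" srcdoc="' + escapeAttr(inner) + '"></iframe>';
}

console.log(sandboxedFrame('alert("hi")'));
// <iframe sandbox="allow-scripts" srcdoc="<script>alert(&quot;hi&quot;)</script>"></iframe>
```

The trade-off is that code inside the frame cannot customize the surrounding blog page either, so this suits self-contained widgets better than theme scripts.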
If a user adds malicious JS to their own page, then there should be some sort of banning mechanism. I would focus more on making sure your login, authentication and posting procedures are secure, so that visitors who are not logged in can't inject script tags into an unsuspecting user's page body.
If you're going to allow authenticated users to put JS in their posts, you're already making your situation difficult. If you want to allow AdSense or Google Analytics, you can make your job simpler by creating some kind of widget for each of them that can be placed in content. You can then check the JS code pasted into those widgets against what you would expect for AdSense, Analytics, and so on.
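A stricter variant of the widget idea is to accept only an ID from the blogger and generate the script tag yourself, so no pasted code is ever stored. This is a rough sketch: the ID pattern, the loader URL and the function name analyticsWidget are assumptions here and should be checked against the provider's current documentation.

```javascript
// Assumed ID format: "G-" followed by uppercase letters and digits.
const MEASUREMENT_ID = /^G-[A-Z0-9]{4,20}$/;

// Build the analytics widget from a validated ID instead of accepting
// pasted script. Returns null when the ID does not match the pattern.
function analyticsWidget(id) {
  if (!MEASUREMENT_ID.test(id)) return null;
  // Assumed loader URL; take the real snippet from the provider's docs.
  return '<script async src="https://www.googletagmanager.com/gtag/js?id=' + id + '"><\/script>';
}

console.log(analyticsWidget("G-ABC123XYZ"));
console.log(analyticsWidget('G-1"><script>alert(1)</script>')); // null
```

Because the blogger controls only a short, pattern-checked string, there is nothing to de-obfuscate.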
Rich has a great point that code can be obscured to the point of being unrecognisable. Still, you could filter for URLs that take people offsite, and potentially you could require any URLs in the JavaScript to be either local or pointing to, say, Google's CDN.
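A rough sketch of such a URL check follows. The allowed hosts, the base URL and the function names are hypothetical, and it only sees URLs that appear literally in the source.

```javascript
// Hosts that scripts may reference (hypothetical list).
const ALLOWED_HOSTS = new Set(["myblog.example", "ajax.googleapis.com"]);

// Find absolute and protocol-relative URLs written literally in the source.
function findUrls(source) {
  return source.match(/(?:https?:)?\/\/[^\s'"<>)]+/g) || [];
}

// Return every literal URL whose host is not on the allowlist.
function disallowedUrls(source) {
  return findUrls(source).filter((raw) => {
    try {
      const url = new URL(raw, "https://myblog.example/");
      return !ALLOWED_HOSTS.has(url.hostname);
    } catch (err) {
      return true; // unparsable URLs are treated as disallowed
    }
  });
}

const script =
  'load("https://ajax.googleapis.com/lib.js"); load("http://evil.example/x.js");';
console.log(disallowedUrls(script)); // [ 'http://evil.example/x.js' ]

// A host assembled at runtime contains no literal URL, so nothing is flagged.
console.log(disallowedUrls("var h = ['evil', 'example'].join('.');")); // []
```

As the second call shows, this catches careless or honest cases only, so it works best combined with the other measures discussed here.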