This is the question I asked yesterday. I was able to get the required data. The final data is like this. Please follow this link.
I tried the following code to get all the infobox data:
content = content.split("}}\n");
for (k in content)
{
    if (content[k].search("Infobox") == 2)
    {
        var infobox = content[k];
        alert(infobox);
        infobox = infobox.replace("{{", "");
        alert(infobox);
        infobox = infobox.split("\n|");
        //alert(infobox[0]);
        var infohtml = "";
        for (l in infobox)
        {
            if (infobox[l].search("=") > 0)
            {
                var line = infobox[l].split("=");
                infohtml = infohtml + "<tr><td>" + line[0] + "</td><td>" + line[1] + "</td></tr>";
            }
        }
        infohtml = "<table>" + infohtml + "</table>";
        $('#con').html(infohtml);
        break;
    }
}
I initially assumed that each element is enclosed in {{ }}, so I wrote this code. However, it does not return the entire infobox data. There is this element
{{Sfn|National Informatics Centre|2005}}
occurring inside the infobox, and its closing }} ends my infobox data early.
Doing this without JSON seems far simpler. Please help me.
Have you tried DBpedia? Afaik they provide template usage information. There is also a toolserver tool named Templatetiger, which does template extraction from the static dumps (not live).
However, I once wrote a tiny snippet to extract templates from wikitext in javascript:
It handles templates nested one level deep, but it is still very error-prone. Parsing wikitext with regular expressions is as evil as trying to do it on HTML :-)
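One way to avoid both the regexp and the split on "}}\n" is to count brace depth while scanning, so that a nested {{Sfn|...}} no longer ends the infobox early. The following is only a minimal sketch of that idea, not the snippet mentioned above; the function names extractTemplates, splitTopLevel and parseInfobox are made up for illustration, and it ignores triple braces, comments, nowiki sections and unnamed parameters whose value contains "=".

```javascript
// Hypothetical helper: return the inner text of every top-level {{...}} template.
function extractTemplates(wikitext) {
  var templates = [];
  var depth = 0;
  var start = -1;
  for (var i = 0; i < wikitext.length - 1; i++) {
    var pair = wikitext.substr(i, 2);
    if (pair === "{{") {
      if (depth === 0) start = i + 2; // remember where the outermost template begins
      depth++;
      i++; // skip the second brace
    } else if (pair === "}}" && depth > 0) {
      depth--;
      if (depth === 0) templates.push(wikitext.slice(start, i));
      i++;
    }
  }
  return templates;
}

// Hypothetical helper: split on "|" only outside nested {{...}} and [[...]].
function splitTopLevel(body) {
  var parts = [];
  var depth = 0;
  var current = "";
  for (var i = 0; i < body.length; i++) {
    var pair = body.substr(i, 2);
    if (pair === "{{" || pair === "[[") {
      depth++; current += pair; i++;
    } else if ((pair === "}}" || pair === "]]") && depth > 0) {
      depth--; current += pair; i++;
    } else if (body[i] === "|" && depth === 0) {
      parts.push(current); current = "";
    } else {
      current += body[i];
    }
  }
  parts.push(current);
  return parts;
}

// Hypothetical helper: turn a template body into { name, params }.
function parseInfobox(body) {
  var parts = splitTopLevel(body);
  var result = { name: parts[0].trim(), params: {} };
  for (var k = 1; k < parts.length; k++) {
    var eq = parts[k].indexOf("="); // split on the first "=" only
    if (eq > 0) {
      result.params[parts[k].slice(0, eq).trim()] = parts[k].slice(eq + 1).trim();
    }
  }
  return result;
}
```

With the question's code, the template body whose name starts with "Infobox" could be passed to parseInfobox and the resulting params object rendered as table rows, instead of splitting on "\n|" and "=".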
It may be easier to query the parse-tree from the api: api.php?action=query&prop=revisions&rvprop=content&rvgeneratexml=1&titles=.... From that parsetree you will be able to extract the templates easily.
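As a rough illustration, the query above could be assembled like this. The endpoint URL, the format=json parameter and the function name buildParseTreeUrl are assumptions added for the example; only the other parameters come from the query string above. Whether rvgeneratexml is still accepted depends on the MediaWiki version of the wiki being queried, and newer versions may require action=parse with prop=parsetree instead.

```javascript
// Hypothetical helper: build the API URL for the parse-tree query described above.
function buildParseTreeUrl(endpoint, title) {
  var params = new URLSearchParams({
    action: "query",
    prop: "revisions",
    rvprop: "content",
    rvgeneratexml: "1", // ask for the parse tree alongside the wikitext
    titles: title,
    format: "json"      // assumption: response format, not part of the original query
  });
  return endpoint + "?" + params.toString();
}

// Example endpoint (assumption): the English Wikipedia API.
var url = buildParseTreeUrl("https://en.wikipedia.org/w/api.php", "New Delhi");
```

The returned parse tree is XML, so the templates and their parts can then be read with an XML parser rather than by splitting strings.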