How can I access un-rendered (markdown) content in

Posted 2019-02-12 18:26

Question:

From reading Jekyll's template data documentation, one might think that the way to access un-rendered content would be page.content; but as far as I can tell, this provides the content of the post as already rendered by the markdown parser.

I need a solution that accesses the raw (original markdown) content directly, rather than simply trying to convert the html back to markdown.

Background on use case

My use case is the following: I use the pandoc plugin to render markdown for my Jekyll site, using the 'mathjax' option to get pretty equations. However, mathjax requires javascript, so the equations do not display in the RSS feed, which I generate by looping over site.posts and emitting post.content like so:

 {% for post in site.posts %}
 <entry>
   <title>{{ post.title }}</title>
   <link href="{{ site.production_url }}{{ post.url }}"/>
   <updated>{{ post.date | date_to_xmlschema }}</updated>
   <id>{{ site.production_url }}{{ post.id }}</id>
   <content type="html">{{ post.content | xml_escape }}</content>
 </entry>
 {% endfor %}

As the xml_escape filter implies, post.content here is already HTML. If I could get the raw content (imagine post.contentraw or some such existed), then I could easily add a filter that uses pandoc with the "webtex" option to generate images for equations when building the RSS feed, e.g.:

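require 'pandoc-ruby'

# Liquid filter: convert raw markdown to HTML with pandoc, rendering
# equations as images via the --webtex option.
module TextFilter
  def webtex(input)
    # A symbol option is turned into the matching long flag (--webtex).
    PandocRuby.new(input, :webtex).to_html
  end
end

Liquid::Template.register_filter(TextFilter)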

But as I get content with the equations already rendered in html+mathjax instead of the raw markdown, I'm stuck. Converting back to markdown doesn't help, since it doesn't convert the mathjax (simply garbles it).

Any suggestions? Surely there's a way to call the raw markdown instead?

Answer 1:

Here's the trouble that I think you'll have: https://github.com/mojombo/jekyll/blob/master/lib/jekyll/convertible.rb https://github.com/mojombo/jekyll/blob/master/lib/jekyll/site.rb

From my reading, for a given post/page self.content is replaced by the result of running self.content through Markdown and Liquid, at line 79 in convertible.rb:

self.content = Liquid::Template.parse(self.content).render(payload, info)

Posts are rendered before pages, as seen at lines 37-44 and 197-211 in site.rb:

def process
  self.reset
  self.read
  self.generate
  self.render
  self.cleanup
  self.write
end

... ...

def render
  payload = site_payload
  self.posts.each do |post|
    post.render(self.layouts, payload)
  end

  self.pages.each do |page|
    page.render(self.layouts, payload)
  end

  self.categories.values.map { |ps| ps.sort! { |a, b| b <=> a } }
  self.tags.values.map { |ps| ps.sort! { |a, b| b <=> a } }
rescue Errno::ENOENT => e
  # ignore missing layout dir
end

By the time you get to rendering this page, self.content has been rendered to HTML - so it isn't a case of stopping it rendering. It's already done.

However, Generators (https://github.com/mojombo/jekyll/wiki/Plugins) run before the render stage: in process, self.generate is called before self.render. So, as far as I can tell from reading the source, you should be able to write a fairly trivial generator that copies self.content into another attribute (such as self.raw_content), which you can then access as raw Markdown in your templates via {{ page.raw_content }}.
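
The generator idea can be sketched roughly as follows. This is a minimal, untested-against-a-live-site sketch, not the definitive implementation. It assumes the Jekyll version discussed here, where site.posts and site.pages are plain arrays and each post/page exposes content and a data hash whose keys are visible to Liquid. The names RawContent, RawContentGenerator, and the raw_content key are hypothetical, as is the file name. Storing the copy in the data hash rather than in a new Ruby attribute is a design choice of this sketch, made so that no change to to_liquid is needed.

```ruby
# _plugins/raw_content.rb (hypothetical file name)

# Copies each document's unrendered source into its data hash, so it
# survives after the render stage overwrites `content` with HTML.
module RawContent
  KEY = 'raw_content' # hypothetical key name; pick anything unused

  # `documents` is any list of objects responding to `content` and `data`
  # (assumed here to be Jekyll posts or pages).
  def self.preserve(documents)
    documents.each do |doc|
      # dup so later in-place changes to content cannot alter the copy
      doc.data[KEY] = doc.content.to_s.dup
    end
  end
end

# Guarded so the file also loads outside Jekyll (e.g. in a unit test).
# Inside a Jekyll build, Jekyll::Generator is always defined.
if defined?(Jekyll::Generator)
  module Jekyll
    class RawContentGenerator < Generator
      safe true

      # Called during the generate stage, i.e. before render.
      def generate(site)
        RawContent.preserve(site.posts)
        RawContent.preserve(site.pages)
      end
    end
  end
end
```

With this in place, the feed entry from the question could presumably use {{ post.raw_content | webtex | xml_escape }} in place of {{ post.content | xml_escape }}, so the webtex filter receives the original markdown.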