Using pybtex to convert from bibtex to formatted H

Posted 2019-09-16 01:31

Question:

I'm using Django and storing BibTeX in my model. I want to pass my view the reference as a formatted HTML string that looks like the Harvard reference style.

Using the method described in "Pybtex does not recognize bibtex entry", I can convert a BibTeX string into a pybtex BibliographyData object. Based on the docs at https://pythonhosted.org/pybtex/api/formatting.html, I believe it should be possible to get from this object to HTML, but I can't get it working.

Pybtex seems to be set up for use from the command line rather than from Python, and there are very few examples of it being used as a library on the internet. Has anyone done anything like this? Perhaps it would be easier to pass the BibTeX to my template and use a JavaScript library like https://github.com/pcooksey/bibtex-js to get an approximation of the Harvard style?

Answer 1:

To do that I adapted some code from here. I am not sure what this particular formatting style is called, but you can most likely change or edit it. This is how it looks:

import io
import six
import pybtex.database.input.bibtex
import pybtex.plugin

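# Look up the 'plain' formatting style and the HTML backend by plugin name,
# then instantiate each one.
pybtex_style = pybtex.plugin.find_plugin('pybtex.style.formatting', 'plain')()
pybtex_html_backend = pybtex.plugin.find_plugin('pybtex.backends', 'html')()
pybtex_parser = pybtex.database.input.bibtex.Parser()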

my_bibtex = '''
@Book{1985:lindley,
author =    {D. Lindley},
title =     {Making Decisions},
publisher = {Wiley},
year =      {1985},
edition =   {2nd},
}
'''

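# Parse the BibTeX string into a BibliographyData object.
data = pybtex_parser.parse_stream(six.StringIO(my_bibtex))
# Apply the formatting style to every entry.
data_formatted = pybtex_style.format_entries(six.itervalues(data.entries))
# Render the formatted entries as HTML into an in-memory buffer.
output = io.StringIO()
pybtex_html_backend.write_to_stream(data_formatted, output)
html = output.getvalue()

print(html)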

This generates the following HTML-formatted reference:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head><meta name="generator" content="Pybtex">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Bibliography</title>
</head>
<body>
<dl>
<dt>1</dt>
<dd>D.&nbsp;Lindley.
<em>Making Decisions</em>.
Wiley, 2nd edition, 1985.</dd>
</dl></body></html>
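
Note that the output above is a complete HTML document, while the question asks for a string that can be dropped into an existing page. One possible way to bridge that gap, sketched here with only the standard library, is to pull the contents of each `<dd>` element out of the generated document. The function name `extract_entries` is hypothetical and not part of pybtex, and the regular expression assumes the exact `<dl>`/`<dt>`/`<dd>` layout shown above, so treat this as a rough sketch rather than a robust HTML parser.

```python
import re


def extract_entries(html_document):
    """Return the inner HTML of every <dd> element in a pybtex HTML document.

    Assumes the simple, non-nested <dd>...</dd> layout shown in the
    output above; this is not a general-purpose HTML parser.
    """
    return [
        match.strip()
        for match in re.findall(r"<dd>(.*?)</dd>", html_document, flags=re.DOTALL)
    ]


# The document generated by the code above, copied verbatim.
sample_html = """<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head><meta name="generator" content="Pybtex">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Bibliography</title>
</head>
<body>
<dl>
<dt>1</dt>
<dd>D.&nbsp;Lindley.
<em>Making Decisions</em>.
Wiley, 2nd edition, 1985.</dd>
</dl></body></html>
"""

entries = extract_entries(sample_html)
for entry in entries:
    print(entry)
```

Each returned string is an HTML fragment, so in a Django template it would have to be rendered with the `safe` filter to avoid being escaped. That is only appropriate if the stored BibTeX comes from a trusted source.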


Answer 2:

I've noticed that the command-line pybtex-format tool produces decent HTML output:

$ pybtex-format myinput.bib myoutput.html

So I went to the source code at pybtex/database/format/__main__.py and found an incredibly simple solution that worked like a charm for me:

from pybtex.database.format import format_database
format_database('myinput.bib', 'myoutput.html', 'bibtex', 'html')

Here are my input and output files, starting with the input (myinput.bib):

@inproceedings{Batista18b,
        author   = {Cassio Batista and Ana Larissa Dias and Nelson {Sampaio Neto}},
        title    = {Baseline Acoustic Models for Brazilian Portuguese Using Kaldi Tools},
        year     = {2018},
        booktitle= {Proc. IberSPEECH 2018},
        pages    = {77--81},
        doi      = {10.21437/IberSPEECH.2018-17},
        url      = {http://dx.doi.org/10.21437/IberSPEECH.2018-17}
}

And the resulting myoutput.html:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head><meta name="generator" content="Pybtex">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Bibliography</title>
</head>
<body>
<dl>
<dt>1</dt>
<dd>Cassio Batista, Ana&nbsp;Larissa Dias, and Nelson <span class="bibtex-protected">Sampaio Neto</span>.
Baseline acoustic models for brazilian portuguese using kaldi tools.
In <em>Proc. IberSPEECH 2018</em>, 77–81. 2018.
URL: <a href="http://dx.doi.org/10.21437/IberSPEECH.2018-17">http://dx.doi.org/10.21437/IberSPEECH.2018-17</a>, <a href="https://doi.org/10.21437/IberSPEECH.2018-17">doi:10.21437/IberSPEECH.2018-17</a>.</dd>
</dl></body></html>