How to automatically apply ISBN hyphenation?

Posted 2020-05-16 05:42

Question:

I've got ISBNs (both 10-digit and 13-digit) without the dashes. Now I'm looking for a way to add those dashes automatically.

I found some useful information here: http://www.isbn.org/standards/home/isbn/international/hyphenation-instructions.asp

But I'm not sure it's doable at all, because the publisher identifier has a variable length, and without knowing that length it may not be possible to determine the correct positions for the dashes.

Does anybody know if it's possible somehow?

Thanks a lot!

Answer 1:

You can deduce the length of the publisher identifier if you have the full range tables.

Example 1. ISBN 0141439564 (Penguin: Great Expectations)

  • The group identifier is 0 (English language).
  • The publisher ranges for this group are 00–19, 200–699, 7000–8499, 85000–89999, 900000–949999, and 9500000–9999999.
  • The next two digits are 14, which is in the range 00–19, so the publisher has 2 digits.
  • So the hyphenated form is 0-14-143956-4

Example 2. ISBN 2253004227 (Poche: Germinal)

  • The group identifier is 2 (French language)
  • The publisher ranges for this group are 00–19, 200–349, 35000–39999, 400–699, 7000–8399, 84000–89999, 900000–949999, 9500000–9999999
  • The next three digits are 253, which is in the range 200–349, so the publisher has 3 digits
  • So the hyphenated form is 2-253-00422-7
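The two examples can be turned into a small lookup routine. Below is a minimal Python sketch, not a complete implementation: the `RANGES` table and the function names are made up for illustration, the table holds only the ranges quoted above for groups 0 and 2, and it assumes a one-digit group identifier. Real range tables cover many more groups and are revised over time. For 13-digit ISBNs it assumes the 978 prefix, which reuses the ISBN-10 group and publisher ranges.

```python
# Publisher ranges copied from the two examples above (assumption: these are
# the only groups handled; a real implementation needs the full range table).
RANGES = {
    "0": ["00-19", "200-699", "7000-8499", "85000-89999",
          "900000-949999", "9500000-9999999"],
    "2": ["00-19", "200-349", "35000-39999", "400-699", "7000-8399",
          "84000-89999", "900000-949999", "9500000-9999999"],
}


def _hyphenate_core(digits: str) -> str:
    """Hyphenate the 10 characters: group, publisher, title, check."""
    group, body, check = digits[0], digits[1:9], digits[9]
    if group not in RANGES:
        raise ValueError(f"no range table for group {group!r}")
    for entry in RANGES[group]:
        low, high = entry.split("-")
        prefix = body[:len(low)]
        # Equal-length digit strings compare correctly as plain strings.
        if low <= prefix <= high:
            return f"{group}-{prefix}-{body[len(low):]}-{check}"
    raise ValueError("publisher prefix not in any known range")


def hyphenate_isbn(isbn: str) -> str:
    """Hyphenate an unhyphenated ISBN-10, or an ISBN-13 starting with 978."""
    if len(isbn) == 10:
        return _hyphenate_core(isbn)
    if len(isbn) == 13 and isbn.startswith("978"):
        return "978-" + _hyphenate_core(isbn[3:])
    raise ValueError("expected an ISBN-10 or a 978-prefixed ISBN-13")


print(hyphenate_isbn("0141439564"))
print(hyphenate_isbn("2253004227"))
```

The function only places the dashes; it does not verify the check digit.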

You can check your algorithm at the Library of Congress's ISBN hyphenation tool.



Answer 2:

For Python, you can use the python-stdnum, isbnid, or isbn_hyphenate libraries. They all hyphenate ISBNs using the range table mentioned in the other answer.



Answer 3:

I wrote the following JavaScript function to hyphenate ISBNs. I know there is also isbnjs, but I think this one is more compact and easier to include in other projects.

https://gist.github.com/aurimasv/6693537



Answer 4:

Take a look at https://pypi.python.org/pypi/isbntools. It lets you hyphenate ISBNs and much more, such as extracting, cleaning, and transforming them, and fetching metadata.

It is a library that you can use in your own program, but it also installs several scripts that you can run from the command line.