Code golf - hex to (raw) binary conversion

Posted 2019-01-17 08:18

In response to this question asking about hex to (raw) binary conversion, a comment suggested that it could be solved in "5-10 lines of C, or any other language."

I'm sure that for (some) scripting languages that could be achieved, and would like to see how. Can we prove that comment true, for C, too?

NB: this doesn't mean hex to ASCII binary - specifically the output should be a raw octet stream corresponding to the input ASCII hex. Also, the input parser should skip/ignore white space.

edit (by Brian Campbell) May I propose the following rules, for consistency? Feel free to edit or delete these if you don't think these are helpful, but I think that since there has been some discussion of how certain cases should work, some clarification would be helpful.

  1. The program must read from stdin and write to stdout (we could also allow reading from and writing to files passed in on the command line, but I can't imagine that would be shorter in any language than stdin and stdout)
  2. The program must use only packages included with your base, standard language distribution. In the case of C/C++, this means their respective standard libraries, and not POSIX.
  3. The program must compile or run without any special options passed to the compiler or interpreter (so, 'gcc myprog.c' or 'python myprog.py' or 'ruby myprog.rb' are OK, while 'ruby -rscanf myprog.rb' is not allowed; requiring/importing modules counts against your character count).
  4. The program should read integer bytes represented by pairs of adjacent hexadecimal digits (upper, lower, or mixed case), optionally separated by whitespace, and write the corresponding bytes to output. Each pair of hexadecimal digits is written with most significant nibble first.
  5. The behavior of the program on invalid input (characters besides [0-9a-fA-F \t\r\n], spaces separating the two characters in an individual byte, an odd number of hex digits in the input) is undefined; any behavior (other than actively damaging the user's computer or something) on bad input is acceptable (throwing an error, stopping output, ignoring bad characters, treating a single character as the value of one byte, are all OK)
  6. The program may write no additional bytes to output.
  7. Code is scored by fewest total bytes in the source file. (Or, if we wanted to be more true to the original challenge, the score would be based on lowest number of lines of code; I would impose an 80 character limit per line in that case, since otherwise you'd get a bunch of ties for 1 line).
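For reference, the behaviour these rules describe can be sketched in ungolfed Python 3. This is only an illustration, not one of the submitted answers: the function name `hex_to_bytes` is made up for the example, and raising `ValueError` on bad input is just one of the behaviours rule 5 permits.

```python
def hex_to_bytes(text):
    """Convert ASCII hex digit pairs, optionally separated by whitespace, to raw bytes."""
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces drops spaces, tabs and newlines (rule 4).
    digits = "".join(text.split())
    # bytes.fromhex accepts upper, lower or mixed case and raises
    # ValueError on non-hex characters or an odd number of digits.
    return bytes.fromhex(digits)


# Example call (assumed sample input): mixed case, spaces and a newline.
print(hex_to_bytes("48 65 6c\n6C 6f"))
```

To use this as a stdin-to-stdout filter, one option is `sys.stdout.buffer.write(hex_to_bytes(sys.stdin.read()))` after `import sys`; writing to the binary buffer avoids the trailing newline and text encoding that `print` would add, which rule 6 forbids.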

16 answers
三岁会撩人
Reply #2 · 2019-01-17 08:55
.

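This is a language called "Hex!". Its only purpose is to read hex data from stdin and write it to stdout. Hex! is parsed by a simple Python script:

import sys

# Read the Hex! program from the file named on the command line,
# or prompt for it when no file is given.
try:
  data = open(sys.argv[1], 'r').read().strip()
except IndexError:
  data = raw_input("hex!> ")
except Exception as e:
  print "Error occurred:", e
  sys.exit(1)  # data was never set, so stop here

# The only valid Hex! program is a single dot.
if data == ".":
  hex = raw_input()
  print int(hex, 16)
else:
  print "parsing error"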
看我几分像从前
Reply #3 · 2019-01-17 08:58

A 31-character Perl solution:

s/\W//g,print(pack'H*',$_)for<>

Rolldiameter
Reply #4 · 2019-01-17 08:58

I know Jon posted a (cleaner) LINQ solution already. But for once I am able to use a LINQ statement which modifies a string during its execution and abuses LINQ's deferred evaluation without getting yelled at by my co-workers. :p

string hex = "FFA042";
byte[] bytes =
    hex.ToCharArray()
       .Select(c => ('0' <= c && c <= '9') ? 
                         c - '0' :
                         10 + (('a' <= c) ? c - 'a' : c - 'A'))
       .Select(c => (hex = hex.Remove(0, 1)).Length > 0 ? (new int[] {
           c,
           hex.ToCharArray()
                 .Select(c2 => ('0' <= c2 && c2 <= '9') ?
                                    c2 - '0' :
                                    10 + (('a' <= c2) ? c2 - 'a' : c2 - 'A'))
                 .FirstOrDefault() }) : ( new int[] { c } ) )
       .Where(c => (hex.Length % 2) == 1)
       .Select(ca => ((byte)((ca[0] << 4) + ca[1]))).ToArray();

1 statement formatted for readability.

Update

Support for spaces and an odd number of hex digits (89A is treated as 08 9A)

byte[] bytes =
    hex.ToCharArray()
       .Where(c => c != ' ')
       .Reverse()
       .Select(c => (char)(c | 32) % 39 - 9)
       .Select(c => 
           (hex =
                new string('0', 
                           (2 + (hex.Replace(" ", "").Length % 2)) *
                                hex.Replace(" ", "")[0].CompareTo('0')
                                                       .CompareTo(0)) +
                hex.Replace(" ", "").Remove(hex.Replace(" ", "").Length - 1))
              .Length > 0 ? (new int[] {
                        hex.ToCharArray()
                           .Reverse()
                           .Select(c2 => (char)(c2 | 32) % 39 - 9)
                           .FirstOrDefault(), c }) : new int[] { 0, c } )
                     .Where(c => (hex.Length % 2) == 1)
                     .Select(ca => ((byte)((ca[0] << 4) + ca[1])))
                     .Reverse().ToArray();

Still one statement. It could be made much shorter by running Replace(" ", "") on the hex string at the start, but that would be a second statement.

Two interesting points with this one. The first is how to track the character count without the help of outside variables other than the source string itself. The second: while solving this I found that char y.CompareTo(x) just returns "y - x", while int y.CompareTo(x) returns -1, 0 or 1. So char y.CompareTo(x).CompareTo(0) gives a char comparison that returns -1, 0 or 1.
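The digit conversion used in the updated statement, `(c | 32) % 39 - 9`, maps a hex digit character to its value without any range checks. Below is a small Python 3 sketch of that arithmetic, assuming ASCII input that contains only valid hex digits; the helper name `nibble` is made up here.

```python
def nibble(ch):
    """Map one ASCII hex digit to its value 0..15 using (c | 32) % 39 - 9."""
    # OR-ing with 32 sets the lowercase bit: 'A'..'F' become 'a'..'f',
    # while '0'..'9' (codes 48..57) already have that bit set and are unchanged.
    code = ord(ch) | 32
    # 48..57 % 39 gives 9..18 and 97..102 % 39 gives 19..24,
    # so subtracting 9 yields 0..9 and 10..15 respectively.
    return code % 39 - 9


# Check the mapping over every valid digit in both cases.
for expected, (lower, upper) in enumerate(zip("0123456789abcdef", "0123456789ABCDEF")):
    assert nibble(lower) == expected
    assert nibble(upper) == expected
```

Characters outside the hex range produce meaningless values rather than an error, which the challenge's rule on invalid input allows.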

够拽才男人
Reply #5 · 2019-01-17 09:02

edit Checkers has reduced my C solution to 46 bytes, which was then reduced to 44 bytes thanks to a tip from BillyONeal plus a bugfix on my part (no more infinite loop on bad input, now it just terminates the loop). Please give credit to Checkers for reducing this from 77 to 46 bytes:

main(i){while(scanf("%2x",&i)>0)putchar(i);}

And I have a much better Ruby solution than my last, in 38 bytes (down from 42, thanks to Joshua Swank for the regexp suggestion):

STDIN.read.scan(/\S\S/){|x|putc x.hex}

original solutions

C, in 77 bytes, or two lines of code (would be 1 if you could put the #include on the same line). Note that this has an infinite loop on bad input; the 44 byte solution with the help of Checkers and BillyONeal fixes the bug, and simply stops on bad input.

#include <stdio.h>
int main(){char c;while(scanf("%2x",&c)!=EOF)putchar(c);}

It's even just 6 lines if you format it normally:

#include <stdio.h>
int main() {
  char c;
  while (scanf("%2x",&c) != EOF)
    putchar(c);
}
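Why the `!= EOF` condition loops forever on bad input while `> 0` stops comes down to what `scanf` returns: 1 after a successful conversion, 0 when the next character cannot start a hex number (that character is left unread), and `EOF` at end of input. The rough Python 3 model below mimics this. The names `scan_hex2` and `convert` are invented for the sketch, `max_calls` exists only so the model terminates, and details of real `scanf` such as the optional sign and `0x` prefix accepted by `%x` are ignored.

```python
import re

EOF = -1  # stand-in for C's EOF constant in this model

# Optional leading whitespace, then one or two hex digits (like "%2x").
_HEX2 = re.compile(r"\s*([0-9A-Fa-f]{1,2})")


def scan_hex2(text, pos):
    """Model of scanf("%2x", &i): return (result, value, new_pos)."""
    match = _HEX2.match(text, pos)
    if match:
        return 1, int(match.group(1), 16), match.end()
    if text[pos:].strip() == "":
        return EOF, None, len(text)  # only whitespace left: end of input
    return 0, None, pos  # matching failure: the bad character is not consumed


def convert(text, keep_going, max_calls=1000):
    """Run the read/putchar loop while keep_going(result) holds.

    Returns (output bytes, number of scan calls made).
    """
    out, pos, calls, value = bytearray(), 0, 0, 0
    while calls < max_calls:
        result, new_value, pos = scan_hex2(text, pos)
        calls += 1
        if not keep_going(result):
            break
        if result == 1:
            value = new_value
        out.append(value)  # putchar runs even when nothing new was read
    return bytes(out), calls


# "> 0" stops at the first bad character; "!= EOF" never sees EOF,
# because the bad character is never consumed, and runs into the guard.
stops = convert("41 42 zz", lambda r: r > 0)
spins = convert("41 42 zz", lambda r: r != EOF)
print(stops[1], spins[1])
```

On well-formed input both conditions behave the same, since the loop then ends with `EOF` either way.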

Ruby, 79 bytes (I'm sure this can be improved):

STDOUT.write STDIN.read.scan(/[^\s]\s*[^\s]\s*/).map{|x|x.to_i(16)}.pack("c*")

Both take input from STDIN and write to STDOUT.

家丑人穷心不美
Reply #6 · 2019-01-17 09:06

A 39-character Perl one-liner:

y/A-Fa-f0-9//dc,print pack"H*",$_ for<>

Edit: wasn't really accepting uppercase, fixed.

我想做一个坏孩纸
Reply #7 · 2019-01-17 09:09

Brian's 77-byte C solution can be improved to 44 bytes, thanks to leniency of C with regard to function prototypes.

main(i){while(scanf("%2x",&i)>0)putchar(i);}