I am trying to find a Python solution that can execute the OpenRefine operations in the JSON below, including their Python (Jython) expressions, without the OpenRefine server running. My OpenRefine JSON contains mappings and custom Python commands for each field of any properly formatted CSV file, so this is more than basic JSON reading. Here is one example of OpenRefine JSON that contains only regex mappings:
[
{
"op": "core/text-transform",
"description": "Text transform on cells in column Sleep using expression jython:import re\n\nvalue = re.sub(\"h0\", \"h\",value)\n\nvalue = re.sub(\"h\",\"*60+\", value)\n\nreturn eval(value)\n\n \nreturn eval(value.replace(\"h\", \"*60+\"));",
"engineConfig": {
"mode": "row-based",
"facets": []
},
"columnName": "Sleep",
"expression": "jython:import re\n\nvalue = re.sub(\"h0\", \"h\",value)\n\nvalue = re.sub(\"h\",\"*60+\", value)\n\nreturn eval(value)\n\n \nreturn eval(value.replace(\"h\", \"*60+\"));",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
}
]
One solution is to process the JSON element by element, handling each type of operation separately, but there may be easier solutions using existing packages.
Python: 3.5.2
OS: Debian 9
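The element-by-element approach mentioned above can be sketched as a small dispatcher keyed on each entry's "op" field. This is only an illustration, not an existing package: the names rename_column, remove_column, HANDLERS and run_operations are hypothetical, rows are assumed to be plain dicts, and the field names of the two sample operations (core/column-rename, core/column-removal) are assumed to match what OpenRefine exports.

```python
import json


def rename_column(rows, op):
    # Assumed fields for "core/column-rename": oldColumnName, newColumnName.
    old, new = op["oldColumnName"], op["newColumnName"]
    for row in rows:
        row[new] = row.pop(old)
    return rows


def remove_column(rows, op):
    # Assumed field for "core/column-removal": columnName.
    for row in rows:
        row.pop(op["columnName"], None)
    return rows


# One handler per operation type; anything missing is reported, not skipped.
HANDLERS = {
    "core/column-rename": rename_column,
    "core/column-removal": remove_column,
}


def run_operations(rows, operations):
    """Apply each operation of the exported history in order."""
    for op in operations:
        handler = HANDLERS.get(op["op"])
        if handler is None:
            raise NotImplementedError(
                "unsupported operation: {}".format(op["op"]))
        rows = handler(rows, op)
    return rows


HISTORY = json.loads("""[
  {"op": "core/column-rename",
   "oldColumnName": "Sleep", "newColumnName": "SleepMinutes"},
  {"op": "core/column-removal", "columnName": "Notes"}
]""")

rows = [{"Name": "Ann", "Sleep": "7h30", "Notes": "x"}]
result = run_operations(rows, HISTORY)
```

Raising on unknown operations makes it obvious which parts of a history are not yet covered, at the cost of having to write one handler per operation type.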
The pyrefine project aims to do exactly that, but it is still a work in progress and supports very few operations. Contributors are welcome!
Here are some additional projects:
I do not know of any ready-made solution, and I am not sure how best to do it. A workaround might be to transform the Jython scripts into functions and then apply them to your CSV using pandas.
Result on your JSON:
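A minimal sketch of that workaround, under several assumptions: the helper names compile_jython and apply_operation are made up for illustration, only the `value` variable is provided to the script (OpenRefine also exposes `cell`, `cells`, `row` and `rowIndex`), only the "keep-original" error policy is handled, and the script must be valid Python 3 even though Jython is Python 2. It uses the standard csv module so that it runs without extra packages; with pandas, the compiled function could be passed to a column's `apply` instead. Because it relies on `exec` and `eval`, it should only be run on operation files you trust.

```python
import csv
import io
import textwrap


def compile_jython(expression, func_name="transform"):
    """Wrap the body of a 'jython:' expression in a function of `value`."""
    prefix = "jython:"
    if not expression.startswith(prefix):
        raise ValueError("not a Jython expression")
    body = textwrap.indent(expression[len(prefix):], "    ")
    source = "def {}(value):\n{}".format(func_name, body)
    namespace = {}
    exec(source, namespace)  # runs arbitrary code: trusted input only
    return namespace[func_name]


def apply_operation(rows, op):
    """Apply one core/text-transform operation to a list of dict rows."""
    func = compile_jython(op["expression"])
    column = op["columnName"]
    for row in rows:
        try:
            row[column] = func(row[column])
        except Exception:
            # "keep-original" leaves the cell untouched on failure.
            if op.get("onError") != "keep-original":
                raise
    return rows


# The operation from the question, reduced to the fields used here.
OPERATION = {
    "op": "core/text-transform",
    "columnName": "Sleep",
    "expression": 'jython:import re\n\nvalue = re.sub("h0", "h",value)'
                  '\n\nvalue = re.sub("h","*60+", value)'
                  '\n\nreturn eval(value)\n\n \n'
                  'return eval(value.replace("h", "*60+"));',
    "onError": "keep-original",
}

SAMPLE_CSV = "Name,Sleep\nAnn,7h30\nBob,8h05\nCid,n/a\n"  # made-up data
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
apply_operation(rows, OPERATION)
# rows[0]["Sleep"] is now 450, rows[1]["Sleep"] is 485,
# and rows[2]["Sleep"] stays "n/a" because the script raised an error.
```

Generating a function per expression keeps the original script text intact, which makes it easier to compare the Python behaviour with what OpenRefine itself would produce.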
Finally, you can add these lines at the top of result.py, launch the script, and pray...
I have found a variety of alternatives and try to compare them with one another below. Pyrefine is so far the only genuine Python solution.
Alternatives
I. A partial solution here creates a dictionary in R with Python to do the conversions. It does not implement GREL edits, Jython/Python edits or Clojure edits.
where the output could be edited to Python format.
II. P3-batchrefine is coded mostly in Java, with some Python. It lets you run the transformations in the following way (not a genuine Python solution unless you are fine with calling external Java libraries).
III. Pyrefine is a genuine Python solution and aims to work in the following way (copied from its docs):
Further Information on parsing OpenRefine JSON