What I want: I want to parse a text file in the following format:
{"business_id": "rncjoVoEFUJGCUoC1JgnUA", "full_address": "8466 W Peoria Ave\nSte 6\nPeoria, AZ 85345", "open": true, "categories": ["Accountants", "Professional Services", "Tax Services", "Financial Services"], "city": "Peoria", "review_count": 3, "name": "Peoria Income Tax Service", "neighborhoods": [], "longitude": -112.241596, "state": "AZ", "stars": 5.0, "latitude": 33.581867000000003, "type": "business"}
{"business_id": "0FNFSzCFP_rGUoJx8W7tJg", "full_address": "2149 W Wood Dr\nPhoenix, AZ 85029", "open": true, "categories": ["Sporting Goods", "Bikes", "Shopping"], "city": "Phoenix", "review_count": 5, "name": "Bike Doctor", "neighborhoods": [], "longitude": -112.10593299999999, "state": "AZ", "stars": 5.0, "latitude": 33.604053999999998, "type": "business"}
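where each line is a separate JSON object. I want to parse it into a form that rpart can take as an argument.
I could iterate over each line, but according to this SO answer it is more R-like to use the apply function than to loop over each row individually. However, I can't get this to work.
For each row in an R dataframe
Problem: when I run my code, I get this error: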
Error in apply(yelp_df, 1, fromJSON) : dim(X) must have a positive length
My code:
#!/usr/bin/Rscript
require(graphics)
require(RJSONIO)
con <- file("yelp_phoenix_academic_dataset/yelp_academic_dataset_business.json", "r")
yelp_df <- readLines(con) # rather than guessing the system's optimal buffer size, I'll just read everything into memory
apply(yelp_df, 1, fromJSON)
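
A likely cause of the error, offered as a sketch rather than a confirmed diagnosis: readLines() returns a plain character vector, which has no dim attribute, while apply() expects a matrix or array. Iterating with lapply() avoids that. In the example below, the two input strings are shortened stand-ins for lines of the file, and the choice of columns for the data frame is an assumption made only for illustration.

```r
library(RJSONIO)

# Shortened stand-ins for two lines of the file (assumption: same
# one-object-per-line layout as the real data)
yelp_lines <- c(
  '{"business_id": "rncjoVoEFUJGCUoC1JgnUA", "city": "Peoria", "review_count": 3, "stars": 5.0}',
  '{"business_id": "0FNFSzCFP_rGUoJx8W7tJg", "city": "Phoenix", "review_count": 5, "stars": 5.0}'
)

# A character vector has no dimensions, which is what apply() complains about
no_dims <- is.null(dim(yelp_lines))

# lapply() walks over the elements of a vector and parses each line
parsed <- lapply(yelp_lines, fromJSON)

# Pick a few fields (illustrative choice) and bind one row per business
rows <- lapply(parsed, function(x) {
  data.frame(
    business_id  = x[["business_id"]],
    city         = x[["city"]],
    review_count = x[["review_count"]],
    stars        = x[["stars"]],
    stringsAsFactors = FALSE
  )
})
yelp_small <- do.call(rbind, rows)
```

With the real file, yelp_lines would be the result of readLines(con). Fields such as categories hold a variable number of values per business, so they would need separate handling before the result could go into a flat data frame.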