How much time does grid.py take to run?

I am using libsvm for binary classification. I wanted to try grid.py, as it is said to improve results. I ran the script on five files in separate terminals, and it has been running for more than 12 hours.

This is the current state of my five terminals:

[root@localhost tools]# python grid.py sarts_nonarts_feat.txt>grid_arts.txt
Warning: empty z range [61.3997:61.3997], adjusting to [60.7857:62.0137]
         line 2: warning: Cannot contour non grid data. Please use "set dgrid3d".
Warning: empty z range [61.3997:61.3997], adjusting to [60.7857:62.0137]
         line 4: warning: Cannot contour non grid data. Please use "set dgrid3d".

[root@localhost tools]# python grid.py sgames_nongames_feat.txt>grid_games.txt
Warning: empty z range [64.5867:64.5867], adjusting to [63.9408:65.2326]
         line 2: warning: Cannot contour non grid data. Please use "set dgrid3d".
Warning: empty z range [64.5867:64.5867], adjusting to [63.9408:65.2326]
         line 4: warning: Cannot contour non grid data. Please use "set dgrid3d".

[root@localhost tools]# python grid.py sref_nonref_feat.txt>grid_ref.txt
Warning: empty z range [62.4602:62.4602], adjusting to [61.8356:63.0848]
         line 2: warning: Cannot contour non grid data. Please use "set dgrid3d".
Warning: empty z range [62.4602:62.4602], adjusting to [61.8356:63.0848]
         line 4: warning: Cannot contour non grid data. Please use "set dgrid3d".

[root@localhost tools]# python grid.py sbiz_nonbiz_feat.txt>grid_biz.txt
Warning: empty z range [67.9762:67.9762], adjusting to [67.2964:68.656]
         line 2: warning: Cannot contour non grid data. Please use "set dgrid3d".
Warning: empty z range [67.9762:67.9762], adjusting to [67.2964:68.656]
         line 4: warning: Cannot contour non grid data. Please use "set dgrid3d".

[root@localhost tools]# python grid.py snews_nonnews_feat.txt>grid_news.txt
Wrong input format at line 494
Traceback (most recent call last):
  File "grid.py", line 223, in run
    if rate is None: raise "get no rate"
TypeError: exceptions must be classes or instances, not str

I redirected the output to files, but so far those files contain nothing. The following files were created:

sbiz_nonbiz_feat.txt.out
sbiz_nonbiz_feat.txt.png
sarts_nonarts_feat.txt.out
sarts_nonarts_feat.txt.png
sgames_nongames_feat.txt.out
sgames_nongames_feat.txt.png
sref_nonref_feat.txt.out
sref_nonref_feat.txt.png
snews_nonnews_feat.txt.out (empty)

Each .out file contains just one line of information, and the .png files are gnuplot plots.

But I don't understand what the gnuplot output and warnings above mean. Should I re-run the jobs?

Can anyone tell me roughly how long this script might take if each input file contains about 144,000 lines?

Thanks and regards

Tags: machine-learning gnuplot libsvm

4 Answers

smile是对你的礼貌

#2 · 2019-05-31 11:34

Change:

if rate is None: raise "get no rate"

in line 223 in grid.py to:

if rate is None: raise ValueError("get no rate")
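For context, modern Python interpreters reject a raised string with a TypeError, which is why the traceback shows that error instead of the intended "get no rate" message. A minimal sketch of the difference; the function names `old_style`, `new_style` and `describe` are made up for illustration, and the exact TypeError wording varies between Python versions:

```python
def old_style(rate):
    # Mirrors the original line 223: raising a bare string is not allowed.
    if rate is None:
        raise "get no rate"

def new_style(rate):
    # The corrected form raises a real exception instance.
    if rate is None:
        raise ValueError("get no rate")

def describe(func):
    """Call func(None) and report the exception type and message it produced."""
    try:
        func(None)
    except Exception as exc:
        return type(exc).__name__, str(exc)
    return None

old_result = describe(old_style)
new_result = describe(new_style)
print(old_result[0])
print(new_result)
```

With the corrected line, the message that reaches the terminal is the one grid.py meant to report.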

Also, try adding:

gnuplot.write("set dgrid3d\n")

after this line in grid.py:

gnuplot.write("set contour\n")

This should fix the warnings and the TypeError, but I am not sure it will make that run work, since grid.py seems to think your data has no rate.

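The "Wrong input format at line 494" message printed just before the traceback suggests that the news file contains a malformed line, which would explain the missing rate. Below is a minimal sketch for locating such lines, assuming the usual LIBSVM sparse text format (`label index:value ...` with ascending integer indices). The helper names `check_line` and `find_bad_lines` are hypothetical, and the check is only an approximation of what the LIBSVM tools themselves accept:

```python
import io

def check_line(line):
    """Return None if the line looks like 'label idx:val idx:val ...', else a reason."""
    parts = line.split()
    if not parts:
        return "empty line"
    try:
        float(parts[0])
    except ValueError:
        return "label is not a number"
    previous_index = -1
    for token in parts[1:]:
        index_text, sep, value_text = token.partition(":")
        if not sep:
            return "feature %r has no ':'" % token
        try:
            index = int(index_text)
            float(value_text)
        except ValueError:
            return "feature %r is not int:float" % token
        if index <= previous_index:
            return "feature indices are not ascending"
        previous_index = index
    return None

def find_bad_lines(lines):
    """Yield (line_number, reason) for every line that fails the check."""
    for number, line in enumerate(lines, start=1):
        reason = check_line(line)
        if reason is not None:
            yield number, reason

# Small in-memory sample; lines 2 and 3 are deliberately malformed.
sample = io.StringIO("1 1:0.5 3:1.2\n-1 2:0.1 2:0.4\n1 4:abc\n-1 1:0.3\n")
bad = list(find_bad_lines(sample))
for number, reason in bad:
    print(number, reason)
```

To check a real file, pass an open file object such as `open("snews_nonnews_feat.txt")` in place of `sample`.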

爷的心禁止访问

#3 · 2019-05-31 11:36

I guess grid.py is trying to find the optimal value for C (or Nu)?

I don't have an answer for the amount of time it will take, but you might want to try this SVM library, even though it's an R package: svmpath.

As described on its page, it will compute the entire "regularization path" for a two-class SVM classifier in about as much time as it takes to train an SVM using one value of your penalty parameter C (or Nu).

So, instead of training and cross-validating an SVM with a value x for your C parameter, then doing all of that again for x+1, x+2, and so on, you can train the SVM once and then query its predictive performance for different values of C after the fact.

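The repeated train-and-validate loop described above can be sketched as follows. This only illustrates the brute-force approach; it is not the code of grid.py or svmpath. `cv_accuracy` is a hypothetical stand-in for an expensive cross-validated training run, replaced here by a toy function so the sketch runs on its own:

```python
def cv_accuracy(c):
    # Hypothetical stand-in: a real version would train an SVM with penalty c
    # and return its cross-validation accuracy. This toy curve peaks at c = 8.
    return 90.0 - abs(c - 8.0)

def brute_force_search(candidates, score):
    """Evaluate every candidate once and return (best_candidate, best_score, calls)."""
    best_c, best_score, calls = None, float("-inf"), 0
    for c in candidates:
        current = score(c)  # one full train + validate cycle per candidate
        calls += 1
        if current > best_score:
            best_c, best_score = c, current
    return best_c, best_score, calls

candidates = [2.0 ** k for k in range(-5, 16, 2)]  # 2^-5, 2^-3, ..., 2^15
best_c, best_score, calls = brute_force_search(candidates, cv_accuracy)
print(best_c, best_score, calls)
```

The number of expensive calls grows with the number of candidates, which is the cost a regularization-path method is meant to avoid.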

一纸荒年 Trace。

#4 · 2019-05-31 11:40

The LIBSVM FAQ speaks to your question:

Q: Why grid.py/easy.py sometimes generates the following warning message?

Warning: empty z range [62.5:62.5], adjusting to [61.875:63.125]
Notice: cannot contour non grid data!

Nothing is wrong and please disregard the message. It is from gnuplot when drawing the contour.

As a side note, you can parallelize your grid.py operations. The libSVM tools directory README file has this to say on the matter:

Parallel grid search

You can conduct a parallel grid search by dispatching jobs to a cluster of computers which share the same file system. First, you add machine names in grid.py:

ssh_workers = ["linux1", "linux5", "linux5"]

and then setup your ssh so that the authentication works without asking a password.

The same machine (e.g., linux5 here) can be listed more than once if it has multiple CPUs or has more RAM. If the local machine is the best, you can also enlarge the nr_local_worker. For example:

nr_local_worker = 2
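Conceptually, a local worker count like this bounds how many grid points are evaluated at the same time. Here is a rough sketch of that idea using Python's standard `concurrent.futures` module. It is not how grid.py itself is implemented, and `evaluate` is a hypothetical placeholder for launching one cross-validation run:

```python
from concurrent.futures import ThreadPoolExecutor

nr_local_worker = 2  # same name as the grid.py setting, used here only as a pool size

def evaluate(point):
    # Hypothetical placeholder: a real job would run one cross-validation
    # for this (log2c, log2g) pair, e.g. by starting an external process.
    log2c, log2g = point
    return point, -abs(log2c - 3) - abs(log2g + 5)

# Every (log2c, log2g) combination to try; the ranges are illustrative.
grid = [(log2c, log2g) for log2c in range(-5, 16, 2) for log2g in range(3, -16, -2)]

# At most nr_local_worker evaluations are in flight at any moment.
with ThreadPoolExecutor(max_workers=nr_local_worker) as pool:
    results = list(pool.map(evaluate, grid))

best_point, best_score = max(results, key=lambda item: item[1])
print(len(results), best_point, best_score)
```

Raising the worker count shortens the wall-clock time only while there are idle CPU cores to run the extra jobs.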

In my Ubuntu 10.04 installation, grid.py is actually /usr/bin/svm-grid.py.


兄弟一词,经得起流年.

#5 · 2019-05-31 11:50

Your data is large: 144,000 lines, so this will take some time. I have used data of a similar size and it took up to a week to finish. If you are using images, which I suppose you are given the size of the data, try resizing the images before creating the data. You should get approximately the same results with the resized images.
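A back-of-the-envelope calculation shows why the run is slow. Assuming grid.py's documented default ranges (log2c from -5 to 15 in steps of 2, log2g from 3 to -15 in steps of -2) and 5-fold cross-validation, the search trains several hundred models. The per-training time below is a made-up placeholder, not a measurement:

```python
def count_steps(begin, end, step):
    """Number of values in begin, begin+step, ... up to and including end."""
    return len(range(begin, end + (1 if step > 0 else -1), step))

c_values = count_steps(-5, 15, 2)    # assumed default log2c range
g_values = count_steps(3, -15, -2)   # assumed default log2g range
folds = 5                            # assumed default number of CV folds

grid_points = c_values * g_values
trainings = grid_points * folds

minutes_per_training = 10.0  # placeholder: time one training run on your own data
estimated_hours = trainings * minutes_per_training / 60.0
print(grid_points, trainings, round(estimated_hours, 1))
```

Replacing the placeholder with the measured time of a single training run on one of the files gives a rough estimate for the whole search.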
