Syntax error, insert "… VariableDeclaratorId" to complete FormalParameterList

Posted 2019-05-12 08:55

Question:

I am facing some issues with this code:

import edu.uci.ics.crawler4j.crawler.CrawlConfig;
import edu.uci.ics.crawler4j.crawler.CrawlController;
import edu.uci.ics.crawler4j.fetcher.PageFetcher;
import edu.uci.ics.crawler4j.robotstxt.RobotstxtConfig;
import edu.uci.ics.crawler4j.robotstxt.RobotstxtServer;

public class Controller {

     String crawlStorageFolder = "/data/crawl/root";
     int numberOfCrawlers = 7;

     CrawlConfig config = new CrawlConfig();
     config.setCrawlStorageFolder(crawlStorageFolder);
     /*
      * Instantiate the controller for this crawl.
      */
     PageFetcher pageFetcher = new PageFetcher(config);
     RobotstxtConfig robotstxtConfig = new RobotstxtConfig();
     RobotstxtServer robotstxtServer = new RobotstxtServer(robotstxtConfig, pageFetcher);
     CrawlController controller = new CrawlController(config, pageFetcher, robotstxtServer);

     /*
      * For each crawl, you need to add some seed urls. These are the first
      * URLs that are fetched and then the crawler starts following links
      * which are found in these pages
      */
     controller.addSeed("http://www.ics.uci.edu/~lopes/");
     controller.addSeed("http://www.ics.uci.edu/~welling/");
     controller.addSeed("http://www.ics.uci.edu/");
     /*
      * Start the crawl. This is a blocking operation, meaning that your code
      * will reach the line after this only when crawling is finished.
      */
     controller.start(MyCrawler.class, numberOfCrawlers);
 }

I am getting the following error:

"Syntax error, insert "... VariableDeclaratorId" to complete FormalParameterList" on config.setCrawlStrorageFolder(crawlStorageFolder)

Answer 1:

You can't have arbitrary code like that directly in the class body. Statements must go inside a method (or a constructor, or an initializer block).
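For illustration, here is a minimal sketch (the Example class and run method are just placeholder names, not taken from the original code) showing where statements are and are not allowed:

public class Example {

    String field = "field declarations with initializers are fine in the class body";

    // A bare statement here, such as System.out.println(field);,
    // would not compile: statements are not allowed directly in the class body.

    {
        System.out.println("instance initializer block: statements allowed");
    }

    public Example() {
        System.out.println("constructor: statements allowed");
    }

    public void run() {
        System.out.println("method: statements allowed");
    }
}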



Answer 2:

Your code is in the class body. Put it in a main method to run it. Note that the CrawlController constructor throws a checked exception, so main also has to declare throws Exception (or wrap the calls in a try/catch).

import edu.uci.ics.crawler4j.crawler.CrawlConfig;
import edu.uci.ics.crawler4j.crawler.CrawlController;
import edu.uci.ics.crawler4j.fetcher.PageFetcher;
import edu.uci.ics.crawler4j.robotstxt.RobotstxtConfig;
import edu.uci.ics.crawler4j.robotstxt.RobotstxtServer;

public class Controller {

    public static void main(String[] args) throws Exception {

        String crawlStorageFolder = "/data/crawl/root";
        int numberOfCrawlers = 7;

        CrawlConfig config = new CrawlConfig();
        config.setCrawlStorageFolder(crawlStorageFolder);

        /*
         * Instantiate the controller for this crawl.
         */
        PageFetcher pageFetcher = new PageFetcher(config);
        RobotstxtConfig robotstxtConfig = new RobotstxtConfig();
        RobotstxtServer robotstxtServer = new RobotstxtServer(robotstxtConfig, pageFetcher);
        CrawlController controller = new CrawlController(config, pageFetcher, robotstxtServer);

        /*
         * For each crawl, you need to add some seed URLs. These are the first
         * URLs that are fetched, and then the crawler starts following links
         * found in these pages.
         */
        controller.addSeed("http://www.ics.uci.edu/~lopes/");
        controller.addSeed("http://www.ics.uci.edu/~welling/");
        controller.addSeed("http://www.ics.uci.edu/");

        /*
         * Start the crawl. This is a blocking operation, meaning that your code
         * will reach the line after this only when crawling is finished.
         */
        controller.start(MyCrawler.class, numberOfCrawlers);
    }
}
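The example above also references MyCrawler.class, which is not shown in the question. A minimal sketch of such a class, assuming the crawler4j 4.x WebCrawler API (the shouldVisit signature differs in older 3.x releases), might look like:

import edu.uci.ics.crawler4j.crawler.Page;
import edu.uci.ics.crawler4j.crawler.WebCrawler;
import edu.uci.ics.crawler4j.url.WebURL;

public class MyCrawler extends WebCrawler {

    // Only follow links that stay on the ics.uci.edu site.
    @Override
    public boolean shouldVisit(Page referringPage, WebURL url) {
        return url.getURL().toLowerCase().startsWith("http://www.ics.uci.edu/");
    }

    // Called once a page has been fetched and parsed.
    @Override
    public void visit(Page page) {
        System.out.println("Visited: " + page.getWebURL().getURL());
    }
}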