I'm trying to write code in Java to read a file by couple of threads and count the words in them. Each thread should read different lines. It counts words well (when I let 1 thread run) but my threads are reading same line and increments line counter at the same time. I was sure that the synchronized
keyword in read method will fix it,but it didn't. What should I do to fix it?
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;
public class WordCounterr implements Runnable {
private static Hashtable<String, Integer> ht = new Hashtable<String, Integer>();
private int lineCounter;
private String path;
private int tNumber;
//private final AtomicInteger whichLine = new AtomicInteger();
private static int whichLine;
private static boolean flag;
public WordCounterr(String path,int num){
lineCounter = 0;
//whichLine = 0;
flag= false;
this.path=path;
tNumber = num;
}
public void countWords(String s) throws IOException{
char[] c = s.toCharArray();
String str="";
char ch;
for(int k=0;k<c.length;k++){
ch=c[k];
if((ch>40 && ch<91) ||(ch>96 && ch<123)){
if(ch>40 && ch<91)
ch+=32;
str+=ch;
}
else if(ch==32 ||k==c.length-1){
if(str.length()>1){ //sprawdzamy czy funkcja znalazla juz
if(ht.containsKey(str)) //takie slowo
ht.put(str,ht.get(str)+1); //znalazla - powiekszamy wartosc przy kluczu
else
ht.put(str,1); //nie znalazla - dodajemy slowo do Hashtable
}
str="";
}
}
}
public synchronized void read(String path) throws IOException{
BufferedReader buf=new BufferedReader(new FileReader(path));
String linia ;
for(int i=0;i<whichLine;i++){
linia=buf.readLine();
}
if((linia=buf.readLine())!=null){
System.out.println(linia);
countWords(linia);
lineCounter++;
System.out.println("watek nr:"+tNumber+"ktora linia:"+whichLine);
whichLine++;
/*try{
Thread.sleep(100);
}catch(InterruptedException el){
System.out.println(el.toString());
}*/
} else
setFlag(true);
buf.close(); //pamietamy o zamknieciu pliku
}
public synchronized void print(){
if(getFlag()){
setFlag(false);
System.out.println(ht);
}
System.out.println("watek nr: "+tNumber+", przeanalizowano "+ lineCounter+ "linii tekstu");
}
public void setFlag(boolean val){
flag=val;
}
public boolean getFlag(){
return flag;
}
@Override
public void run() {
try{
while(getFlag()==false) {
read(path);
Thread.yield(); //let other thread read
try {
Thread.sleep(100);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}catch(IOException ex){
System.out.println(ex.toString());
}//catch(InterruptedException el){
// System.out.println(el.toString());
//}
print();
}
public static void main(String[] args) throws IOException, InterruptedException{
String path = args[0];
int tNum = Integer.parseInt(args[1]);
Thread[] thread = new Thread[tNum]; // tablica w?tków
for (int i = 0; i < tNum; i++){
thread[i] =new Thread(new WordCounterr(path,i));
}
for (int i = 0; i < tNum; i++)
thread[i].start();
}
}
I am guessing that it still will be inefficiently reading file content. Try change the synchronization point. It should be to placed in read method. This method reads whole file content. Rather that try synchronized just reading next line of this file. You can achieve it by putting to each WordCounterr instance the same reader file instance and synchronized only process of moving pointer to next line read content of this line. Counting words in the line can be done without synchronization and only updating HashTable should be synchronized. Reading file content in parallel can be synchronized as below:
The synchronized modifier is defined so:
it is not possible for two invocations of synchronized methods on the same object to interleave.
You are calling the method
read
in each of yourThreads
.However you are not calling the same
read
method because you are passing new instances ofWordCounterr
to each newThread
. This means you are calling the method on different objects which will not be effected by the synchronized modifier.To fix this try:
Rather than:
I hope this helps :)