How do I read (and parse) a file and then append t

Posted 2019-02-15 08:09

Question:

I am trying to read from a file in Haskell, but I keep getting this error:

*** Exception: neo.txt: openFile: resource busy (file is locked)

This is my code.

import Data.Char
import Prelude
import Data.List
import Text.Printf
import Data.Tuple
import Data.Ord
import Control.Monad
import Control.Applicative((<*))

import Text.Parsec
    ( Parsec, ParseError, parse        -- Types and parser
    , between, noneOf, sepBy, many1    -- Combinators
    , char, spaces, digit, newline     -- Simple parsers
    )

These are the movie fields.

type Title = String
type Director = String
type Year = Int
type UserRatings = (String,Int) 
type Film = (Title, Director, Year , [UserRatings])
type Period = (Year, Year)
type Database = [Film]

These are the parsers for each of the types, used to read the file correctly.

-- Parse a double-quoted string literal
stringLit :: Parsec String u String
stringLit = between (char '"') (char '"') $ many1 $ noneOf "\"\n"

-- Parse a comma-separated list of string literals
listOfStrings :: Parsec String u [String]
listOfStrings = stringLit `sepBy` (char ',' >> spaces)

-- Parse a non-negative integer
intLit :: Parsec String u Int
intLit = fmap read $ many1 digit
-- Or `read <$> many1 digit` with Control.Applicative

-- Parse a string literal immediately followed by an integer
stringIntTuple :: Parsec String u (String, Int)
stringIntTuple = liftM2 (,) stringLit intLit

film :: Parsec String u Film
film = do
    -- alternatively `title <- stringLit <* newline` with Control.Applicative
    title <- stringLit
    newline
    director <- stringLit
    newline
    year <- intLit
    newline
    userRatings <- stringIntTuple
    newline
    return (title, director, year, [userRatings])

films :: Parsec String u [Film]
films = film `sepBy` newline

This is the main program (type "main" in WinGHCi to start it)

-- The Main
main :: IO ()
main = do
    putStr "Enter your Username:  "
    name <- getLine
    filmsDatabase <- loadFile "neo.txt"
    appendFile "neo.txt" (show filmsDatabase)
    putStrLn "Your changes to the database have been successfully saved."

This is the loadFile function

loadFile :: FilePath -> IO (Either ParseError [Film])
loadFile filename = do
    database <- readFile filename
    return $ parse films "Films" database

The text file is named neo.txt and contains some films like this:

"Blade Runner"
"Ridley Scott"
1982
("Amy",5), ("Bill",8), ("Ian",7), ("Kevin",9), ("Emma",4), ("Sam",7), ("Megan",4)

"The Fly"
"David Cronenberg"
1986
("Megan",4), ("Fred",7), ("Chris",5), ("Ian",0), ("Amy",6)

Just copy and paste everything, put a text file like this in the same directory, and run it to see the error I described.

Answer 1:

Whoopsy daisy, being lazy
tends to make file changes crazy.
File's not closed, as supposed
thus the error gets imposed.
This small guile, by loadFile
is what you must reconcile.
But don't fret, least not yet,
I will show you, let's get set.


Like many other IO functions in System.IO, readFile doesn't actually consume any input. It's lazy. Therefore, the file isn't closed until all of its content has been consumed (until then, the handle is semi-closed):

The file is read lazily, on demand, as with getContents.

We can demonstrate this on a shorter example:

main = do
  let filename = "/tmp/example"
  writeFile filename "Hello "
  contents <- readFile filename
  appendFile filename "world!"   -- error here

This fails because contents is never consumed, so the file is still open when appendFile runs. If you consume all of the content (for example by printing it, taking its length, or similar), it no longer fails:

main = do
  let filename = "/tmp/example2"
  writeFile filename "Hello "
  content <- readFile filename
  putStrLn content
  appendFile filename "world!"   -- no error

Therefore, we need either something that really closes the file, or we need to make sure that we've read all the contents before we try to append to the file.

For example, you can use withFile together with some "magic" function force that makes sure that the content really gets evaluated:

readFile' filename = withFile filename ReadMode $ \handle -> do
  theContent <- hGetContents handle
  force theContent

However, force is tricky to implement. You could use bang patterns, but they evaluate the list only to WHNF (that is, to its first cons cell, so basically just the first character). You could use the functions from deepseq, but that adds another dependency and is probably not allowed in your assignment/exercise.
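One way to approximate force with nothing but base is to evaluate the length of the string, since length has to walk the whole list before it can return. The following is only a minimal sketch: readFileStrict, demo and the scratch file example.txt are hypothetical names chosen for illustration, and evaluate comes from Control.Exception.

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Hypothetical helper: read the whole file before the handle is closed.
readFileStrict :: FilePath -> IO String
readFileStrict path = withFile path ReadMode $ \handle -> do
  contents <- hGetContents handle
  -- length traverses every cons cell, so this pulls the entire file in
  _ <- evaluate (length contents)
  return contents

-- Small demonstration: read, then append to the same file.
demo :: IO String
demo = do
  let path = "example.txt"      -- hypothetical scratch file
  writeFile path "Hello "
  _ <- readFileStrict path      -- file is fully read and closed here
  appendFile path "world!"      -- so appending no longer fails
  readFileStrict path
```

Running demo from GHCi or from a main action should return the combined contents of the file.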

Or you could use any function that will somehow make sure that all elements are evaluated or sequenced. In this case, we can use a small trick and mapM return:

readFile' filename = withFile filename ReadMode $ \handle -> do
  theContent <- hGetContents handle
  mapM return theContent

It's good enough, but you would use something like pipes or conduit instead in production.

The other method is to make sure that we've really used all the contents. This can be done by using another parsec parser method instead, namely runParserT. We can combine this with our withFile approach from above:

parseFile :: ParsecT String () IO a -> FilePath -> IO (Either ParseError a) 
parseFile p filename = withFile filename ReadMode $ \handle ->
  hGetContents handle >>= runParserT p () filename

Again, withFile makes sure that we close the file. We can use this now in your loadFile:

loadFile :: FilePath -> IO (Either ParseError [Film])
loadFile filename = parseFile films filename

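This version of loadFile won't keep the file locked anymore. Note that it only type-checks if the parsers are given the more general type Monad m => ParsecT String u m a, because Parsec String u a fixes the underlying monad to Identity.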
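As a rough sketch of how parsers can be written so that parseFile accepts them, the block below declares them against ParsecT with an arbitrary monad m instead of Parsec. The record is deliberately cut down to a title and a year; titleAndYear, titlesAndYears, demo and the file name films-demo.txt are hypothetical names used only for this illustration, and ending the top-level parser with eof is an added assumption so that the whole input is consumed.

```haskell
import System.IO (IOMode (ReadMode), hGetContents, withFile)
import Text.Parsec
  ( ParseError, ParsecT, between, char, digit, eof, many, many1
  , newline, noneOf, runParserT )

-- Same parsers as in the question, but polymorphic in the underlying monad m.
stringLit :: Monad m => ParsecT String u m String
stringLit = between (char '"') (char '"') (many1 (noneOf "\"\n"))

intLit :: Monad m => ParsecT String u m Int
intLit = read <$> many1 digit

-- Hypothetical cut-down record: a title line followed by a year line.
titleAndYear :: Monad m => ParsecT String u m (String, Int)
titleAndYear = do
  title <- stringLit
  _ <- newline
  year <- intLit
  _ <- newline
  return (title, year)

-- Finishing with eof makes the parser read the input to the very end.
titlesAndYears :: Monad m => ParsecT String u m [(String, Int)]
titlesAndYears = many titleAndYear <* eof

-- parseFile as in the answer: the parser runs in IO inside withFile.
parseFile :: ParsecT String () IO a -> FilePath -> IO (Either ParseError a)
parseFile p path = withFile path ReadMode $ \handle ->
  hGetContents handle >>= runParserT p () path

-- Parse a scratch file, then append to it once the handle is closed.
demo :: IO (Either ParseError [(String, Int)])
demo = do
  let path = "films-demo.txt"   -- hypothetical scratch file
  writeFile path "\"Blade Runner\"\n1982\n"
  result <- parseFile titlesAndYears path
  appendFile path "\"The Fly\"\n1986\n"
  return result
```

Because the parsers are polymorphic in m, the same definitions still work with the pure parse function, where m is Identity.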



Answer 2:

The problem is that readFile doesn't actually read the entire file into memory immediately; it opens the file and instantly returns a string. As you "look at" the string, behind the scenes the file is being read. So when readFile returns, the file is still open for reading, and you can't do anything else with it. This is called "lazy I/O", and many people consider it to be "evil" precisely because it tends to cause problems like the one you currently have.

There are several ways you can go about fixing this. Probably the simplest is to just force the whole string into memory before continuing. Calculating the length of the string will do that — but only if you "use" the length for something, because the length itself is lazy. (See how this rapidly becomes messy? This is why people avoid lazy I/O.)

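The simplest thing you could try is printing the number of films loaded right before you try to append to the database. Since loadFile returns an Either, match on the result first:

main = do
  putStr "Enter your Username:  "
  name <- getLine
  filmsDatabase <- loadFile "neo.txt"
  case filmsDatabase of
    Left err -> putStrLn $ "Parse error: " ++ show err
    Right fs -> putStrLn $ "Loaded " ++ show (length fs) ++ " films."
  appendFile "neo.txt" (show filmsDatabase)
  putStrLn "Your changes to the database have been successfully saved."

It's kind of evil that what looks like a simple print message is actually fundamental to making the code work though! Note also that this only releases the file if the parser consumes the input all the way to the end (for example by finishing with eof); if parsing stops early, the rest of the file is never read and the handle stays open.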

The other alternative is to save the new database under a different name, and then delete the old file and rename the new one over the top of the old one. This does have the advantage that if the program were to crash half way through saving, you haven't just lost all your stuff.
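A minimal sketch of that save-then-rename idea, assuming the directory package is available and that a temporary name next to the target is free to use. saveViaRename, demo and database-demo.txt are hypothetical names for illustration. renameFile replaces an existing destination, so no separate delete step is shown; on some platforms the rename may still fail while another handle on the old file is open.

```haskell
import System.Directory (renameFile)

-- Hypothetical helper: write the new contents to a temporary file,
-- then move it over the original in one step.
saveViaRename :: FilePath -> String -> IO ()
saveViaRename path contents = do
  let tmp = path ++ ".tmp"   -- assumed not to be used by anything else
  writeFile tmp contents     -- a crash here leaves the original untouched
  renameFile tmp path        -- replaces the old file with the new one

-- Small demonstration on a scratch file.
demo :: IO String
demo = do
  let path = "database-demo.txt"   -- hypothetical scratch file
  writeFile path "old"
  saveViaRename path "new"
  readFile path
```

Keeping the temporary file in the same directory as the target means the rename stays on a single filesystem.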