
Reading from and writing to the filesystem is one of the basic tasks in any programming language.
The File class
Ruby provides the File class as a tool for creating, reading, and changing files.
File.open
File.open is a versatile method. It does different things depending on its second argument, the mode. File.open('some_file_name.txt', 'w') opens a file for writing, while File.open('some_file_name.txt', 'r') opens a file for reading.
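To see how the second argument changes what you're allowed to do with a file, here's a small sketch. The file name and the use of the system temp directory are my own choices for illustration; any writable path would do.

```ruby
require "tmpdir"

# Hypothetical scratch file in the system temp directory.
path = File.join(Dir.tmpdir, "open_modes_example.txt")

# "w" opens the file for writing, creating it if it doesn't exist.
writer = File.open(path, "w")
writer.puts "hello"
writer.close

# "r" opens the same file for reading only.
reader = File.open(path, "r")
contents = reader.read

# Writing through a File opened in read mode raises an IOError.
write_failed = false
begin
  reader.puts "this won't work"
rescue IOError => e
  write_failed = true
  puts "Can't write in read mode: #{e.message}"
ensure
  reader.close
end

puts contents
File.delete(path)
```

The same File class is behind both objects; only the mode decides which operations succeed.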
Create a new file
We'll use File.open with the "w" mode for its second argument. This creates the file, or empties it if it already exists. It returns an instance of the File class:
file = File.open("some_file_name.txt", "w")
So we can now use the instance methods of that class. puts, for example, will write a string to the file followed by a newline, just like when we use puts to write to the standard output:
file.puts "I wrote this with ruby!"
After you're done working with a file, make sure you close it!
file.close
That's because, when you opened it, you started a kind of conversation with the operating system, which created a "file descriptor" to make note of the fact that your Ruby script is interacting with this file. For that reason, when you're done working with it, you need to tell the operating system that your business is done with the file. Otherwise, the operating system will keep this file descriptor around for as long as your Ruby script is still running. On a potentially long-lived process, like a web server, this could lead to lots and lots of file descriptors being built up in the system.
On the other hand, there are some Ruby scripts that are very short and may omit closing files for this reason: since their process is going to terminate shortly, they rely on the fact that the operating system will remove their file descriptors when they terminate.
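If you do manage the file by hand, one common way to make sure close always runs is to put it in an ensure clause. This is only a sketch; the path below is a hypothetical scratch file I made up for the example.

```ruby
require "tmpdir"

# Hypothetical scratch file in the system temp directory.
path = File.join(Dir.tmpdir, "ensure_close_example.txt")

file = File.open(path, "w")
begin
  file.puts "I wrote this with ruby!"
ensure
  # Runs whether or not the write raised, so the descriptor is always released.
  file.close
end

is_closed = file.closed?
puts is_closed # true

File.delete(path)
```

File#closed? is a handy way to check whether a given File object still holds its descriptor.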
With a block:
File.open also has a handy block syntax that closes the file for you when the block is finished executing. I recommend using this style of File.open to avoid forgetting to close your files.
File.open("some_file_name.txt", "w") { |file| file.puts "I wrote this with ruby!"}
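Here's a quick sketch of two things the block form gives you: File.open returns whatever the block returns, and the file gets closed even if the block raises. The handle variable and the scratch path are just devices I'm using to peek at the File object afterwards.

```ruby
require "tmpdir"

# Hypothetical scratch file in the system temp directory.
path = File.join(Dir.tmpdir, "block_form_example.txt")

handle = nil # lets us inspect the File object after the block ends

# The block's last expression becomes the return value of File.open.
result = File.open(path, "w") do |file|
  handle = file
  file.puts "I wrote this with ruby!"
  :finished
end

closed_after_block = handle.closed?
puts result.inspect     # :finished
puts closed_after_block # true

# The file is closed even when the block raises.
closed_after_error = nil
begin
  File.open(path, "w") do |file|
    handle = file
    raise "something went wrong"
  end
rescue RuntimeError
  closed_after_error = handle.closed?
end
puts closed_after_error # true

File.delete(path)
```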
Back to the file system:
Let's have a look now at the file we created. I'm closing my irb session now and looking at the filesystem where I ran irb:
$ ls some_file_name.txt
If I print out the contents of that file I see:
$ cat some_file_name.txt
I wrote this with ruby!
Which is just as I expected. Great!
Let's try reading a file from Ruby now. Back in an irb session, I'll use the File.open method again, this time passing "r", for "read", as the second argument:
irb(main):0> File.open("some_file_name.txt", "r") { |file| puts file.read }
I wrote this with ruby!
=> nil
File#read
File#read is pretty simple. It returns the entire contents of the file as a single string, which we then printed with puts.
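There is also a class-level File.read that takes a path and handles the opening and closing for you. Here's a small sketch comparing the two; the scratch path is hypothetical, and I'm using File.write just to set up something to read.

```ruby
require "tmpdir"

# Hypothetical scratch file in the system temp directory.
path = File.join(Dir.tmpdir, "read_example.txt")
File.write(path, "I wrote this with ruby!\n")

# Instance method: read through a File object we opened ourselves.
from_instance = File.open(path, "r") { |file| file.read }

# Class method: opens the file, reads everything, and closes it in one call.
from_class = File.read(path)

puts from_class
puts from_instance == from_class # true
puts from_class.class            # String

File.delete(path)
```

Either way, nothing is printed until you hand the string to puts yourself.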
File#each
If we take a look at the ancestors of the File class, we notice that it includes the Enumerable module:
irb(main):0> File.ancestors
=> [File, IO, File::Constants, Enumerable, Object, Kernel, BasicObject]
Each line of a file opened for reading is an element you can iterate over, so you can use many of the Enumerable methods we just learned on the lines of a file.
irb(main):0> File.open("some_file_name.txt", "r") do |file|
irb(main):1* file.each { |line| puts line.length }
irb(main):1> end
24
=> #<File:some_file_name.txt (closed)>
That will print out the number of characters on each line, including the newline character at the end. We only have one line, so there's just one entry here.
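To give a feel for the other Enumerable methods, here's a sketch that uses map and select on the lines of a file. The three-line sample file is something I made up for the example, and I open the file separately for each pass.

```ruby
require "tmpdir"

# Hypothetical scratch file with three short lines.
path = File.join(Dir.tmpdir, "enumerable_example.txt")
File.write(path, "apple\nbanana\ncherry\n")

# map: collect the length of each line, without its trailing newline.
lengths = File.open(path, "r") do |file|
  file.map { |line| line.chomp.length }
end

# select: keep only the lines that contain the letter "n".
with_n = File.open(path, "r") do |file|
  file.select { |line| line.include?("n") }
end

p lengths # [5, 6, 6]
p with_n  # ["banana\n"]

File.delete(path)
```

Notice that the lines keep their trailing newline unless you strip it with chomp.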
I/O streams and cursors
Files behave differently from many iterable object types in Ruby (such as arrays, sets, and hashes), in that they require special treatment if you want to iterate over them more than once. Observe:
irb(main):0> File.open("some_file_name.txt", "r") do |file|
irb(main):1* file.each { |line| puts line.length }
irb(main):1* file.each { |line| puts line }
irb(main):1> end
24
=> #<File:some_file_name.txt (closed)>
Strange! It did the first line of our block but it seemed to skip the second line. Let's try something different.
irb(main):0> File.open("some_file_name.txt", "r") do |file|
irb(main):1* puts file.count
irb(main):1* puts file.count
irb(main):1> end
1
0
=> nil
Even stranger! It would seem that the length of the file is changing, just by counting its contents. That's not what's happening, though.
Files are read by means of something called I/O streams, which are actually defined and governed by your operating system, not Ruby.
This makes sense if you think about it. Ruby is one process running inside a larger system, so to speak to other systems or files, it has to speak in a common way that all the processes and systems on the computer can understand. That "common way" of communicating between different elements of the operating system is called I/O. Specifically, files are read and written using I/O Streams.
Streams have a "cursor," which keeps track of your current position in the file. When you call File#each or File#count in the above examples, it reads from the cursor's current location (which is the beginning of the file when first opened) until the end of the file. At the end of this operation, the cursor is at the end of the file, so you would have to reset the cursor to the beginning of the file in order to read again. Otherwise, the next read operation just picks up where the last one left off, at the end of the file, and finds nothing left to read. Fortunately, you can move the cursor back to the beginning of the file by using the method File#rewind.
irb(main):0> File.open("some_file_name.txt", "r") do |file|
irb(main):1* puts file.count
irb(main):1* file.rewind
irb(main):1* puts file.count
irb(main):1> end
1
1
=> nil
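You can watch the cursor move with File#pos, which reports the current position, and File#eof?, which tells you whether you've hit the end. This is a rough sketch against a hypothetical scratch file holding the same single line as before.

```ruby
require "tmpdir"

# Hypothetical scratch file in the system temp directory.
path = File.join(Dir.tmpdir, "cursor_example.txt")
File.write(path, "I wrote this with ruby!\n")

start_pos = end_pos = rewound_pos = nil
at_end = second_read = third_read = nil

File.open(path, "r") do |file|
  start_pos = file.pos    # 0: a freshly opened file starts at the beginning
  file.read               # consume everything
  end_pos = file.pos      # the cursor now sits at the end of the file
  at_end = file.eof?      # true
  second_read = file.read # "": nothing left to read
  file.rewind
  rewound_pos = file.pos  # 0 again
  third_read = file.read  # the full contents once more
end

puts start_pos           # 0
puts at_end              # true
puts second_read.inspect # ""
puts rewound_pos         # 0
puts third_read          # I wrote this with ruby!

File.delete(path)
```

The empty string from the second read is the same effect as the count of 0 we saw earlier: the data is still there, but the cursor is already past it.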
The main takeaway from this chapter, other than learning how to do basic read and write operations, is that interacting with files requires a different way of thinking: your file operations are using system I/O, and will need special treatment accordingly. Remember to close files that you opened (it happens automatically if you use File.open with the block format), and that iterating over an input stream (in Ruby, represented by the iterable elements of a File object) depends on a cursor, which has to be manually rewound if you want to iterate again.