File IO in NodeJS

If you haven't already set up your Node sample project, you can quickly do so by following these instructions.

Working with files is a critical skill for all programmers.

File IO refers to file input and output:

  1. File input is when your program reads the contents of a file (the file's contents are 'input' into your program). It is also known as reading a file, usually from the computer's hard drive.
  2. File output is when your program writes content to a file on the computer's hard disk or to another computer on the network. It is also known as writing a file.

When using JavaScript in the browser, you cannot do file IO without jumping through some hoops. This is for security reasons. When a browser loads a web page from the internet, it will not allow the JavaScript code to do file IO operations unless certain security checks are completed. Imagine if you visited a web page that could read the files on your hard drive and send the information to other computers on the internet. This would put your privacy in grave danger! Or imagine a web page that wrote a malicious file to your computer. For these reasons, browsers are made to prevent JavaScript code from doing file IO operations. Browsers, it is said, put your JavaScript code into a 'sandbox' and won't allow it to interact with many features of the underlying operating system.

But with NodeJS, you can do file IO. This is because in NodeJS your program does not run within the confines of the browser. NodeJS applications run directly on your operating system. This is very similar to how Java applications run on computers: a runtime environment, such as the Java Virtual Machine (JVM), executes your code. When you install NodeJS, it includes the node runtime environment, which is the program that executes your node code. The node runtime environment does not have the same restrictions that are applied to code running in a browser environment.

Here's a great article on working with files in NodeJS

We'll start out by exploring how to perform file output in NodeJS.

Create a file named file-io-example.js inside the samples folder. Then put this code into it, which will create a file on your computer's hard drive:

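const fs = require("fs");

const content = "Hello World!";
const fileName = "my-file.txt";
// Build the full path to the new file, next to this script.
const path = __dirname + "/" + fileName;
console.log(path);

// Write the file; the callback runs once the write has finished or failed.
fs.writeFile(path, content, (err) => {
  if (err) {
    console.error(err);
  } else {
    console.log("file written successfully");
  }
});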

We'll talk about this code in a minute, but run it first by entering this command in the terminal (make sure you run this command from the project folder):

node samples/file-io-example.js

To start a program in NodeJS, you enter node in the terminal, followed by the relative path to the file that has your code in it. This starts the node runtime and tells it to execute the code in your file.

Now look inside the samples folder. You should see a file named my-file.txt, which should contain 'Hello World!'.

You have just successfully completed a file output operation (also known as 'writing' a file)!

Now, about the code:

The first line 'imports' the fs module so that you can use it in your script ('fs' stands for 'file system'). This module is an object which includes methods that do file IO operations.

In NodeJS, you can call the require() function to import a module.

The content constant is the content that we want to write into the file.

The fileName constant is the name of the file that we want to create.

The path constant uses a built-in variable named __dirname, which NodeJS automatically sets to the absolute path of the folder that contains the current script (on Windows, this is a path that starts from the drive letter, such as C:). In our case that is the samples folder we created earlier, no matter which folder you run the node command from. We've concatenated a slash and our fileName onto that path.

The first console log will display the path constant in the terminal when you run the script.

The fs module happens to be an object that has various methods (but some modules in NodeJS may be functions, or classes, or even constants). So, after the first console log, the writeFile() method of the fs object is invoked.

The writeFile() method takes 3 parameters (note that the last one is the 'callback'):

  1. The path to the file that is being created (C:\some\path\somefile.txt)
  2. The content to put into the file
  3. The callback function to be invoked (by the Node runtime) when the file is done being written

Many functions and methods in NodeJS require you to pass in a callback function as a parameter. This is Node's way of dealing with operations that could potentially take a long time to complete (such as reading or writing a very, very large file).

The callback function allows you to run code when the operation completes. The heavy use of callback functions in NodeJS is its way of dealing with asynchronous programming, which we'll talk more about later in the program. But, if you want to learn more now, here's a link on asynchronous programming in JavaScript.

If there was a problem writing the file, then Node will pass an error object as the callback's first parameter. So you can check whether that parameter is null or undefined (both are falsy). If it is, then the operation succeeded; otherwise there was some sort of error, and the error parameter (which is an object) should contain some details about it.

The concept of passing in an 'error' object when the callback is invoked is known as an 'error-first callback'. Here's more info on error-first callbacks.
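You can follow the same convention in your own functions. The sketch below uses a hypothetical function named divide(), written only for this example. To keep things simple it invokes the callback right away, rather than after a slow background operation as the fs methods do.

```javascript
// divide() is a hypothetical function written for this example.
// It follows the error-first convention: the callback's first
// argument is an error (or null), and the second is the result.
function divide(a, b, callback) {
  if (b === 0) {
    callback(new Error("cannot divide by zero"));
    return;
  }
  callback(null, a / b);
}

divide(10, 2, (err, result) => {
  if (err) {
    console.error(err.message);
  } else {
    console.log(result); // 5
  }
});

divide(10, 0, (err, result) => {
  if (err) {
    console.error(err.message); // cannot divide by zero
  } else {
    console.log(result);
  }
});
```

Because the error always comes first, the code that calls the function can use the same if (err) check no matter which function it is calling.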

Now let's try out file input by adding this to file-io-example.js:

fs.readFile(path, "utf-8", (err, fileContents) => {
  if (err) {
    console.error(err);
  } else {
    console.log(fileContents);
  }
});

Run the script again (node samples/file-io-example.js) and the terminal should show you the contents of my-file.txt!

The readFile() method takes 3 parameters:

  1. The path to the file that is to be read into the program
  2. The encoding that should be used to decode the file's bytes into text (we talked about ascii and unicode encoding in the Java class). In this case we are using utf-8 encoding.
  3. The callback function, which will be invoked by NodeJS when the file input operation completes. The callback function has two parameters:
     1. The first one will be an 'error' object, but only if an error occurred while trying to read in the contents of the file. If no error occurred, then the parameter will be null or undefined (both are falsy).
     2. The second parameter will be the contents of the file (assuming there was no error).
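To see what the encoding parameter changes, the sketch below reads the same file twice: once without an encoding and once with "utf-8". It is only an illustration. It writes its own file into the operating system's temp folder (found with os.tmpdir()), and the file name file-io-encoding-demo.txt is made up for this example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// A made-up file name in the temp folder, so the sample file is left alone.
const filePath = path.join(os.tmpdir(), "file-io-encoding-demo.txt");

fs.writeFile(filePath, "Hello World!", (writeErr) => {
  if (writeErr) {
    console.error(writeErr);
    return;
  }

  // No encoding given: the callback receives the raw bytes in a Buffer.
  fs.readFile(filePath, (err, data) => {
    if (err) {
      console.error(err);
      return;
    }
    console.log(Buffer.isBuffer(data)); // true
    console.log(data.length); // 12 (one byte per character in this text)
  });

  // Encoding given: the bytes are decoded into a string for you.
  fs.readFile(filePath, "utf-8", (err, text) => {
    if (err) {
      console.error(err);
      return;
    }
    console.log(typeof text); // string
    console.log(text); // Hello World!
  });
});
```

The two reads run independently of each other, so their output may appear in either order.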

To observe the asynchronous manner in which the program runs, put a console log at the end of the file. When you re-run the program you'll see that the console log you just added will actually execute before the file is written and read.
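Here is one way to make that order visible, sketched as a separate script. The numbered messages show the order in which the lines should print. The read is started inside the write's callback, which guarantees that the file has been written before it is read. The file name and the temp folder location are made up for this example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// A made-up file name in the operating system's temp folder.
const filePath = path.join(os.tmpdir(), "file-io-order-demo.txt");

console.log("1. starting the write");

fs.writeFile(filePath, "Hello World!", (writeErr) => {
  if (writeErr) {
    console.error(writeErr);
    return;
  }
  console.log("3. write finished, starting the read");

  // Starting the read here means it cannot begin before the write is done.
  fs.readFile(filePath, "utf-8", (readErr, fileContents) => {
    if (readErr) {
      console.error(readErr);
      return;
    }
    console.log("4. read finished: " + fileContents);
  });
});

console.log("2. end of the script reached");
```

Message 2 prints before messages 3 and 4 because Node only invokes a callback after the code that is currently running has finished.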

NodeJS offers synchronous versions of the methods we've been using. These versions do not require you to pass in a callback function. Instead, they make your program wait until the file IO operation completes.

Add this code to file-io-example.js:

fs.writeFileSync(path, content);
console.log("SYNCHRONIZED VERSION: file written successfully!");

const fileContents = fs.readFileSync(path, "utf-8");
console.log("SYNCHRONIZED VERSION: " + fileContents);

Note that readFileSync() returns the contents of the file, rather than passing it as a parameter to a callback function.

If you run the script again (node samples/file-io-example.js) you'll see in the console log that even though the synchronous versions are called after the asynchronous ones, they complete first. This is because synchronous functions and methods block the program (stop it from continuing to run subsequent lines of code) until they complete.
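The sketch below isolates that behavior in a separate script. It starts an asynchronous write first and then does a synchronous write, using two different made-up file names in the operating system's temp folder so the two writes don't interfere with each other. The numbered messages show the order in which the lines should print.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Two made-up file names in the temp folder.
const asyncPath = path.join(os.tmpdir(), "file-io-async-demo.txt");
const syncPath = path.join(os.tmpdir(), "file-io-sync-demo.txt");

// Started first, but its callback runs last.
fs.writeFile(asyncPath, "written asynchronously", (err) => {
  if (err) {
    console.error(err);
    return;
  }
  console.log("3. asynchronous write finished");
});
console.log("1. asynchronous write started");

// Blocks here until the file has been written.
fs.writeFileSync(syncPath, "written synchronously");
console.log("2. synchronous write finished");
```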

Asynchronous file operations are carried out in the background by the Node runtime, on threads that are separate from the one running your JavaScript code. This allows the subsequent lines in your program to run WHILE the file operation is in progress. Threads are a complicated topic in programming, but they allow a program to multi-task. Various programming languages have different ways of dealing with threads. In NodeJS your own code runs on a single thread, and callbacks are how your program finds out that the background work has finished. If you use synchronous methods, then your code will execute in the order it appears, and any operations that take a long time will 'block' the rest of the program from running until they complete.

You have now seen how to do basic file IO operations (in NodeJS), which is a very important task in programming.

Follow up questions

  1. What are some things that could cause an error when attempting to write a file to the hard disk?
  2. What are some things that could cause an error when attempting to read a file from the hard disk?
  3. What is asynchronous programming? This question may take us some time to understand.
  4. Why do browsers restrict/prevent file IO operations?

Here's an article on synchronous vs asynchronous methods and functions

Here's one on NodeJS, threads, and async programming