File IO in NodeJS
If you haven't already set up your Node sample project, you can quickly do so by following these instructions.
Working with files is a critical skill for all programmers.
File IO refers to file input and output:
- file input is when your program reads the contents of a file (the file's contents are 'input' into your program). File input is also known as reading a file (usually from the computer's hard drive) into your program.
- file output is when your program writes content to a file on the computer's hard disk, or to another computer on the network. File output is also known as writing a file.
File IO allows you to build applications whose data is persistent. Your program can save data to a file by doing a file output operation. Then the data will 'persist' even if the computer is turned off. When the computer is turned back on and your app is launched, it can do a file input operation and read the contents of the file into the program so that the user can continue working with their data.
When using JavaScript in the browser, you cannot do file IO without jumping through some hoops. This is for security reasons. When a browser loads a web page from the internet, it will not allow the JavaScript code to do file IO operations unless certain security checks are completed. Imagine if you visited a web page, and the JavaScript code could read the files on your hard drive. It could then send the information to other computers on the internet. This would put your privacy in grave danger! Or imagine that you visited a web page that wrote a malicious file to your computer. For these reasons, browsers are made to prevent JavaScript code from doing file IO operations. Browsers are designed to protect you, and by default they won't allow JavaScript code to interact with many features of the underlying operating system.
But with NodeJS, you can do file IO operations. This is because in NodeJS your JavaScript code does not run within the confines of the browser. NodeJS applications run directly on your operating system. This is very similar to how Java applications run on computers: a runtime environment, such as the Java Virtual Machine (JVM), executes your code, and has access to the file system. When you install NodeJS, it includes the Node runtime environment, which is the program that executes your JavaScript code. The Node runtime environment does not have the same restrictions that are applied to JavaScript code that runs in the browser.
Let's experiment with file IO operations in NodeJS.
Create a file named file-io-example.js inside the samples folder of your node-sample-project. Then put this code into it, which will create a file on your computer's hard drive:
// Import the fs module (which is an object that has useful methods for doing file IO)
const fs = require("fs");

// The content that we want to put into a file
const content = "Hello World!";

// The name of the file that we want to create
const fileName = "my-file.txt";

// The full path to the file we want to create (starting from the root of the drive)
const filePath = __dirname + "/" + fileName;
console.log("FULL PATH TO FILE: " + filePath);

// Invoke the writeFile() method of the fs object
fs.writeFile(filePath, content, (error) => {
  if (error) {
    console.log(error);
  } else {
    console.log("file written successfully");
  }
});
We'll talk about this code in a minute, but run it first by entering this command in the terminal (make sure you run this command from the project folder):
node samples/file-io-example.js
To start a program in NodeJS, you enter node in the terminal, followed by the relative path to the file that has your code in it. This starts the Node runtime and tells it to execute the JavaScript code in your program.
Now look inside the samples folder. You should see a file named my-file.txt, which should have the contents 'Hello World!' in it.
You have just successfully completed a file output operation (also known as 'writing' a file)!
Now, about the code:
The first line 'imports' the fs module so that you can use it in your script ('fs' stands for 'file system'). This module is included with NodeJS, and it's an object that has various methods for doing file IO operations.
In NodeJS, you can call the require() function to import a module.
The content constant is the content that we want to write into the file.
The fileName constant is the name of the file that we want to create.
The filePath constant uses a built-in constant named __dirname, which NodeJS will automatically set as the path to the current folder starting from the root of the drive. The current folder is the one that contains your program (file-io-example.js), which is the samples folder we created earlier. We've concatenated our fileName into the path. I put a console log under this line so that you can see the full path to the file that our program creates.
The fs Module in NodeJS
When you import modules in NodeJS (like we did on the first line of our program), they could be objects, functions, classes, or even other data types. You can think of a NodeJS module as a component (since we've been discussing component based design). As mentioned, the fs module happens to be an object that has various methods for working with the file system ('fs' = file system).
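To see that imported modules really can be different kinds of values, here is a short sketch that you can run as its own script. It imports three modules that ship with NodeJS (fs, path, and events) and prints what typeof reports for each. The choice of path and events is just for illustration; they are not otherwise used in this lesson.

```javascript
// Three modules that are included with NodeJS
const fs = require("fs");
const path = require("path");
const EventEmitter = require("events");

// The fs module is an object, and its methods are functions
console.log(typeof fs);           // "object"
console.log(typeof fs.writeFile); // "function"

// The path module is also an object with function methods
console.log(typeof path.join);    // "function"

// The events module exports a class (typeof reports classes as "function")
console.log(typeof EventEmitter); // "function"
```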
The writeFile() method of the fs object takes 3 parameters (note that the last one is the 'callback'):
1. The path to the file that is being created (for example: C:\some\path\somefile.txt)
2. The content to put into the file
3. The callback function to be invoked (by the Node runtime) when the file is done being written.
Many functions and methods in NodeJS require you to pass in a callback function as a parameter. This is Node's way of dealing with operations that could potentially take a long time to complete (such as reading or writing a very, very large file).
The callback function allows you to run code when the operation completes. It's a lot like event handling, where you are waiting for an event to occur, and when it does then your callback/event handler will be triggered. In this case, the callback is triggered when the content has been written into the file.
The heavy use of callback functions in NodeJS is its way of dealing with asynchronous programming, which we'll talk more about later in the program. But, if you want to learn more now, here's a link on asynchronous programming in JavaScript.
Notice that the callback has a parameter named error. If there was a problem writing the file, then Node will pass in an 'error' object as the argument. If the operation succeeded, Node passes in null instead, which is falsy. So you can check whether the parameter is truthy: if it is, there was some sort of error, and the error object will contain details about it; otherwise the operation succeeded.
The convention of making the error the first parameter of the callback is known as the 'error-first callback' pattern.
Here's more info on the error-first callback pattern in NodeJS
Now let's try out a file input operation by adding this code to file-io-example.js:
fs.readFile(filePath, "utf-8", (error, fileContents) => {
  if (error) {
    console.log(error);
  } else {
    console.log("FILE CONTENTS: " + fileContents);
  }
});
Run the script again (node samples/file-io-example.js) and the terminal should show you the contents of my-file.txt!
The readFile() method of the fs module/object takes 3 parameters:
- The path to the file that is to be read into the program
- The encoding that should be used to read (decode) the file, in this case utf-8 encoding. utf-8 is commonly used to encode text files (don't worry about this right now).
- The callback function, which will be invoked by NodeJS when the file input operation completes. The callback function has two parameters:
  - The first one will be an 'error' object, but only if an error occurs while trying to read in the contents of the file. If no error occurred, then the parameter will be null (which is falsy).
  - The second parameter will be the contents of the file (assuming there was no error)
To observe the asynchronous manner in which the program runs, put a console log at the end of the file. When you re-run the program you'll see that the console log you just added will actually execute before the file is written and read. This is because the program continues to run and execute lines of code even before the input/output operations complete. This is known as 'asynchronous' code execution, and NodeJS makes heavy use of it.
NodeJS offers synchronous versions of the methods we've been using (this lesson also calls them the 'synchronized' versions). These versions do not require you to pass in a callback function. Instead, they cause your program to wait until the file IO operation completes.
Add this code to file-io-example.js:
fs.writeFileSync(filePath, content);
console.log("SYNCHRONIZED VERSION: file written successfully!");
const fileContents = fs.readFileSync(filePath, "utf-8");
console.log("SYNCHRONIZED VERSION: " + fileContents);
Note that readFileSync() returns the contents of the file, rather than passing it as a parameter to a callback function.
If you run the script again (by entering node samples/file-io-example.js in the terminal) you'll see in the console log that even though the synchronous versions are called after the asynchronous ones, they complete first. This is because synchronous functions and methods block the program (stop it from continuing to run subsequent lines of code) until the file operation completes.
Asynchronous functions and methods hand their work to the Node runtime, which carries it out in the background (for file IO, on worker threads that the runtime manages for you). This allows the subsequent lines in your program to run WHILE the file operation is in progress. Threads are a complicated topic in programming, but they allow a program to multi-task. Various programming languages have different ways of dealing with threads. In NodeJS, your own JavaScript code runs on a single thread, and callbacks are how the runtime tells your code that the background work has finished. If you use synchronous methods, then your lines of code will execute in order (synchronously), and any operations that take a long time will 'block' the rest of the program from running until they complete.
You have now seen how to do basic file IO operations (in NodeJS), which is a very important task in programming.
We covered some heavy concepts in this activity! The important things for you to focus on are:
- The fs module is a built-in module in NodeJS, and it is an object that has various methods for working with the file system (such as file IO operations).
- NodeJS makes heavy use of callbacks, where the first parameter is used to indicate an error in the operation.
Using callbacks can make your code messy! In the Web 3 class you'll learn about some alternative ways of dealing with asynchronous code. In this project, and in the final project, we'll avoid callbacks in our file IO operations by using the synchronous versions of the functions (readFileSync() and writeFileSync()).
Commit your changes to your Git repository.
Then create a separate branch, so that we can easily return our project to the current state if we need to. Run this command:
git branch 2-file-io-complete
Finally, make sure to push this branch and your main/master branch to GitHub.
Follow-up questions
- What are some things that could cause an error when attempting to write a file to the hard disk?
- What are some things that could cause an error when attempting to read a file from the hard disk?
- What is asynchronous programming? This question may take us some time to understand.
- Why do browsers restrict/prevent file IO operations?
Additional Resources
In the next lesson, you'll build A Very Simple NodeJS Web Server