UNIX commands for file handling

Mitul Vaghela
5 min read · Jun 26, 2020

Recently, I was assigned a project to test an ETL application developed by one of my team members. I had to work with some of the largest files I had ever encountered. They ran into gigabytes and wouldn’t open in any mainstream editor. When I tried opening them, they slowed the whole system down to the point where I had to restart the computer.

I knew I had to figure out a better way to deal with this problem. While searching for solutions online, I found that a lot of Stack Overflow answers recommended using UNIX commands. I started exploring them and kept a list of the commands I found useful so I could share it with the community later.

Without further ado, let’s look at the commands.

First things first, here is how to open a terminal:

Linux (Ubuntu): press Ctrl + Alt + T

Mac: open Launchpad, type Terminal, and click the Terminal app

1. List command

Generally, files will be shared with you by developers over a cloud storage solution (AWS, Azure) or a NAS drive. Before you begin working on them, it’s a good idea to list all the files and get basic details such as file names, sizes and time stamps. You can use the list command for this.

To show the contents of a directory, use this command:

ls (list)

To show hidden files and folders, add -a to the command:

ls -a (list with hidden contents)

To show file size and time stamp, add -l to the command:

ls -l (long listing with size and time stamp)

To display a recursive listing (folders inside folders) of all files and folders, add -R to the command:

ls -R (recursive listing)
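The flags above can be combined. Here is a minimal sketch, assuming a scratch directory created with mktemp; the directory and file names (batch, customers.csv and so on) are made up for illustration and are not from the project described here.

```shell
# Scratch directory with hypothetical file names, so nothing real is touched.
workdir=$(mktemp -d)
mkdir -p "$workdir/batch/archive"
printf 'id,name\n1,alice\n' > "$workdir/batch/customers.csv"
printf 'old data\n' > "$workdir/batch/archive/old.csv"
touch "$workdir/batch/.hidden_marker"

ls "$workdir/batch"        # names only
ls -a "$workdir/batch"     # also shows entries whose names start with a dot
ls -lh "$workdir/batch"    # long format; -h prints sizes as K, M, G
ls -laR "$workdir/batch"   # flags combine: long, hidden, recursive

# Capture two listings so they can be inspected after cleanup.
plain_listing=$(ls "$workdir/batch")
hidden_listing=$(ls -a "$workdir/batch")
rm -r "$workdir"
```

With files that run into gigabytes, the -h flag is the one I would reach for first, since raw byte counts are hard to read at that size.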

2. Move command (covers copy as well)

Chances are that you will want to move files from one location to another before file processing starts. In some cases, you might want to rename them. The move command is very handy in such cases.

mv (move). It supports moving single files, multiple files and directories.

To move a file from one directory to another, pass the file and then the target directory (the names here are placeholders):

mv file.txt target_dir/

To move multiple files into a directory, list all of them before the target directory:

mv file1.txt file2.txt target_dir/

If you pass -i to the move command, it will prompt you before overwriting a file. If you want to copy the files instead of moving them, replace mv with cp.

Note, however, that cp creates a full copy of the file, so with files this large it can take a while (sometimes even hours). Use it only when you really need a second copy.
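To make the move, rename and copy cases concrete, here is a small sketch that runs in a scratch directory. The incoming and staging folders and the part*.csv files are hypothetical names chosen for the example.

```shell
# Scratch directory with hypothetical folders and files.
workdir=$(mktemp -d)
mkdir "$workdir/incoming" "$workdir/staging"
printf 'a\n' > "$workdir/incoming/part1.csv"
printf 'b\n' > "$workdir/incoming/part2.csv"
printf 'c\n' > "$workdir/incoming/part3.csv"

# Move one file into another directory.
mv "$workdir/incoming/part1.csv" "$workdir/staging/"

# Move several files at once; the last argument is the target directory.
mv "$workdir/incoming/part2.csv" "$workdir/incoming/part3.csv" "$workdir/staging/"

# Rename a file by moving it to a new name in the same directory.
mv "$workdir/staging/part1.csv" "$workdir/staging/customers.csv"

# Copy instead of move; the original stays in place.
cp "$workdir/staging/customers.csv" "$workdir/incoming/customers_backup.csv"

# When typing these by hand, mv -i or cp -i asks before overwriting a file.

staged=$(ls "$workdir/staging")
incoming=$(ls "$workdir/incoming")
rm -r "$workdir"
```

After these steps, staging holds customers.csv, part2.csv and part3.csv, and incoming holds only the backup copy.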

3. Word count command

When you read the files and load them into a database, the first thing you want to do is make sure that all the records have been loaded successfully, which basically means comparing the record count between the files and the database. Because the files run into gigabytes, opening them in an editor can bring your computer to a halt. The good news is that you can get the record count of a file without opening it.

wc (word count). It gives you the number of lines (read: records), words and characters in a file.

[Screenshot: demo file used for the word count command]

To get the record count of a file, use the following command:

wc -l (counts the number of lines in a file)

If you want to count the number of characters, use -m; for words, use -w.

Now let’s create a copy of demo.txt and call it demo1.txt, with exactly the same contents. We now have 12 lines in demo.txt and 12 lines in demo1.txt.

You can multiply the power of the word count command by using the pipe character (|). For example, cat demo.txt demo1.txt | wc -l prints the total number of records across both files, which is 24 here.

The cat command concatenates the contents of all the files and channels the output to the word count command through the pipe.
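The word count steps can be sketched end to end as follows. The 12-line demo.txt generated here is a stand-in with made-up contents ("record 1" to "record 12"), since the original demo file is not reproduced in this post.

```shell
workdir=$(mktemp -d)

# Hypothetical stand-in for demo.txt: 12 short records.
for i in 1 2 3 4 5 6 7 8 9 10 11 12; do
  echo "record $i"
done > "$workdir/demo.txt"
cp "$workdir/demo.txt" "$workdir/demo1.txt"

wc -l "$workdir/demo.txt"      # line count followed by the file name
wc -w "$workdir/demo.txt"      # word count
wc -m "$workdir/demo.txt"      # character count

# Concatenate both files and count the lines of the combined stream.
cat "$workdir/demo.txt" "$workdir/demo1.txt" | wc -l

# Reading from stdin drops the file name; tr removes the padding
# that some wc implementations put in front of the number.
lines_one=$(wc -l < "$workdir/demo.txt" | tr -d ' ')
lines_all=$(cat "$workdir/demo.txt" "$workdir/demo1.txt" | wc -l | tr -d ' ')
words_one=$(wc -w < "$workdir/demo.txt" | tr -d ' ')
rm -r "$workdir"
```

Capturing the count in a variable like this makes it easy to compare against the record count returned by the database.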

4. Head command

When you receive a new batch of files, you will want to quickly verify that they comply with the agreed contract. By contract, I mean the header and footer signatures and the file schema, among other things. You can use the head command to read the first 10 lines of a file and print them to the console.

[Screenshot: head command]

For multiple files, just keep adding file names separated by spaces. You can print a custom number of lines by adding -n to the command, followed by a number.

[Screenshot: head -n command]
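Here is a short sketch of the head variants described above. The file contents (an id,name header followed by 11 rows) are invented for the example.

```shell
workdir=$(mktemp -d)

# Hypothetical 12-line file: a header row followed by 11 records.
{
  echo "id,name"
  for i in 1 2 3 4 5 6 7 8 9 10 11; do
    echo "$i,employee_$i"
  done
} > "$workdir/demo.txt"
cp "$workdir/demo.txt" "$workdir/demo1.txt"

head "$workdir/demo.txt"                              # first 10 lines
head -n 3 "$workdir/demo.txt"                         # first 3 lines
head -n 1 "$workdir/demo.txt" "$workdir/demo1.txt"    # first line of each file

# Capture the header row and the default line count for checking.
header=$(head -n 1 "$workdir/demo.txt")
default_count=$(head "$workdir/demo.txt" | wc -l | tr -d ' ')
rm -r "$workdir"
```

When several files are passed, head prints a small banner with each file name before its lines, which helps when comparing headers across a batch.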

5. Tail command

Like head, you can use the tail command to read the last 10 lines of a file. It also accepts -n followed by a number.

[Screenshot: tail command]
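The tail command can be tried out the same way. In this sketch the file ends with a made-up footer record (TRAILER,11) to show how a footer signature could be checked; the actual footer format depends on your contract.

```shell
workdir=$(mktemp -d)

# Hypothetical 12-line file: 11 records followed by a footer record.
{
  for i in 1 2 3 4 5 6 7 8 9 10 11; do
    echo "$i,employee_$i"
  done
  echo "TRAILER,11"
} > "$workdir/demo.txt"

tail "$workdir/demo.txt"         # last 10 lines
tail -n 3 "$workdir/demo.txt"    # last 3 lines

# The footer is the last line; the default output starts at line 3 of 12.
footer=$(tail -n 1 "$workdir/demo.txt")
default_first=$(tail "$workdir/demo.txt" | head -n 1)
rm -r "$workdir"
```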

6. Grep command

How about searching for some text in a file? We can use the grep command for that. In essence, this command prints all the lines that match a pattern. To understand what this means, let’s say that the file contains employee records. If you search for an employee name using grep, you will get all the details of that employee, because the command prints the entire matching line.

Consider this file:

[Screenshot: employee list file]

To print all the lines matching a text value, pass the text and then the file name to grep.

[Screenshot: searching text in a file]
[Screenshot: multiple search results]

To print the number of matching lines instead of the lines themselves, add -c to the command.

[Screenshot: count of all matching text]

To ignore case when searching, add -i to the command.
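The three grep variants can be sketched against a small employee file. The records below are invented for illustration and are not the ones from the original employee list.

```shell
workdir=$(mktemp -d)

# Hypothetical employee list: id, name, department.
cat > "$workdir/employees.txt" <<'EOF'
101,John Smith,Engineering
102,Priya Patel,Finance
103,john doe,Engineering
104,Maria Garcia,Sales
EOF

grep "Engineering" "$workdir/employees.txt"      # prints every matching line
grep -c "Engineering" "$workdir/employees.txt"   # prints how many lines match
grep -i "john" "$workdir/employees.txt"          # matches John and john

# Capture the results for checking.
exact=$(grep "John" "$workdir/employees.txt")
match_count=$(grep -c "Engineering" "$workdir/employees.txt")
ignore_case_count=$(grep -ic "john" "$workdir/employees.txt")
rm -r "$workdir"
```

Note that -c counts matching lines, so a line that contains the text twice is still counted once.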
