Standard Streams and File Manipulation in Linux: Explained in Detail

This is the second post in the ongoing Linux Masterclass Series.

Three Standard Streams

A stream here means a flow of data into or out of a program. That data is usually plain text.

Stdin (Standard Input)

Stdin has the code (file descriptor) 0. A command reads its standard input and gives an output in response. That output can be a standard output or a standard error.

Stdout (Standard Output)

Stdout has the code (file descriptor) 1. Standard output can be sent to three places in your system:

  1. The terminal window
  2. A file
  3. A pipe, which passes it on to another command

If you want to store the text output in a file, you can use the > symbol. For example, to store the output of the ls command in a file called output.txt, write the command ls > output.txt. If output.txt doesn’t exist, the shell creates it for you automatically.

Now, if you go to a different folder and want to store a command’s Stdout in the same file, you need to specify the path to output.txt, like ls > ~/output.txt (this assumes output.txt sits in your home directory).

With >, the previous ls output in the file is overwritten by the new one. You can avoid this and keep adding new entries without overwriting previous ones by using the >> symbol, like ls >> output.txt
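
As a minimal sketch of the difference between > and >>, the commands below work in a throwaway directory so that nothing in your home folder is touched. The names demo and notes.txt are made up for this illustration.

```shell
# Work in a temporary directory (hypothetical names, for illustration only).
workdir=$(mktemp -d)
cd "$workdir"
mkdir demo
touch demo/notes.txt

ls > output.txt         # creates output.txt and writes the listing into it
ls demo > output.txt    # overwrites the previous contents
ls demo >> output.txt   # appends instead of overwriting

cat output.txt          # notes.txt appears twice: one overwrite, one append
```

Running cat after each step is an easy way to watch the file being replaced or growing.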

Notice that the previous Documents folder entry is still there while new entries are added.

Controlled Stdout with less command

If you ever try to read a large file in the terminal, for example with the command ‘cat /var/log/syslog’, it floods your terminal with an extremely long output, and scrolling up and down through it becomes a chore. To tackle this, you can run ‘less /var/log/syslog’ instead. It still displays the file in the terminal, but one screen at a time, and when you exit by typing ‘q’ the output typically disappears, because it was never written into the terminal’s scrollback.

Stderr (Standard Error)

Stderr has the code (file descriptor) 2. Let’s understand how Stderr is different from Stdout with an example. If you write the command lg > output.txt, where lg is not a real command, the shell reports an error. But will that error be saved in output.txt? Let’s see what happens.

The output.txt file did get overwritten, but it is empty. That is because an error message is not standard output. It goes to standard error, and > redirects only Stdout. The command produced no Stdout, hence the file is empty.

But let’s say you still want to store Stderr in output.txt. For that, you can use 2> instead of >, like lg 2> output.txt. We write 2 here because it is the code that denotes Stderr.
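
Here is a small sketch of the same idea that you can run safely. It uses ls on a path that does not exist instead of the made-up lg command, since ls reliably writes its complaint to Stderr. The file names out.txt and err.txt are assumptions for this example.

```shell
workdir=$(mktemp -d)

# ls fails because the path does not exist, so it writes only to Stderr.
# "|| true" just ignores the non-zero exit status of the failing command.
ls "$workdir/missing" > "$workdir/out.txt" || true    # error still shows on the terminal
ls "$workdir/missing" 2> "$workdir/err.txt" || true   # error is captured in err.txt

wc -c < "$workdir/out.txt"   # 0 bytes: nothing was written to Stdout
cat "$workdir/err.txt"       # the error message from ls
```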

How to use the pipe to redirect Standard Output as Standard Input

Let’s say you want to use the less command with ls -la /etc. If you try to combine them by simply passing one command to the other as an argument, you get an error: less does not run ls, it treats ls as the name of a file or directory to open.

You can use a pipe ( | ) here to use the ls and less commands together, like ls -la /etc | less, and it works just fine.

Let’s understand how the above command works behind the scenes.

The ls -la /etc part of the command produced a Stdout. But rather than being printed on the terminal, it was passed by the | (pipe) as Stdin to the less part of the command.

Summary: Pipe allows you to take Stdout of one command and pass it as Stdin to another command.
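
To see a pipe at work without an interactive pager, here is a rough sketch that pipes ls into wc -l, which counts the lines it receives on Stdin. The three file names are hypothetical.

```shell
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt" "$workdir/c.txt"

# Stdout of ls becomes Stdin of wc, which counts the lines it receives.
count=$(ls "$workdir" | wc -l)
echo "$count"   # 3, one line per file

# The same idea with a pager (interactive, so it is commented out here):
# ls -la /etc | less
```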

Environment Variables

If you write echo $HOME, you’ll get something like /home/lenovo. If you do echo $USER, you’ll get lenovo.

Here HOME and USER are environment variables. In simple words, environment variables are named pieces of useful information that the shell and its child processes use. You can check out all the environment variables set in your current session by using the env command.

Environment variables are not constant. Some of them change based on what you’re doing in the system. For example, if you write the command echo $PWD, where PWD is an environment variable, you’ll get an output like /home/lenovo. But if you change your present working directory with the cd command, the value of PWD changes with it as well.

Now, if you were wondering how the pwd command knows your location: the shell keeps track of your current directory, stores it in the PWD environment variable, and gives it out or uses it whenever needed.
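
The following sketch shows PWD changing as you move around. It assumes a /tmp directory exists, which is the case on typical Linux systems. The variable names start, seen, and came_from are made up for the example.

```shell
echo "${HOME:-unset}"   # your home directory
echo "${USER:-unset}"   # your user name, if the variable is set

start=$PWD              # remember where we started
cd /tmp
seen=$PWD               # PWD follows the cd
came_from=$OLDPWD       # OLDPWD holds the directory we just left
echo "$seen"            # /tmp
echo "$came_from"
cd "$start"             # go back
```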

Everything is a file in Linux, even the commands

If you type echo $PATH, you’ll see a colon-separated list of all the directories in which the shell looks for the file of the command you have entered. If the file doesn’t exist in any of them, you get the error message ‘command not found’ on Stderr. The shell is saying that the file which executes that specific command was not found in any of the directories stored in the PATH environment variable.

If you want to check the command files in one of these directories, you can list it with ls, for example ls /usr/bin, or for a better experience ls /usr/bin | less (to quit less, type ‘q’).

Commonly used commands like ‘apt’, ‘ls’, and ‘npm’ (if installed) exist as files in ‘/usr/bin’.
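
As a quick sketch, you can split PATH into one directory per line and ask the shell where it finds the file for a given command. The variable name ls_path is made up here, and the exact location printed depends on your system.

```shell
# Print each directory in PATH on its own line.
echo "$PATH" | tr ':' '\n'

# Ask the shell where it finds the file for the ls command.
ls_path=$(command -v ls)
echo "$ls_path"

# Check that the result is an executable file.
[ -x "$ls_path" ] && echo "ls is an executable file"
```

In an interactive shell where ls is an alias, command -v prints the alias definition instead of a path. A few commands, such as cd, are built into the shell itself and have no file of their own.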

Basic shell commands to interact with files

head command

If you type head [filename or filepath], it shows you the first ten lines of the file. If you want to see more or fewer lines, you can add the ‘-n’ flag and the number of lines, like ‘head -n 5 /var/log/syslog’
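
Since /var/log/syslog may not exist or be readable on every system, here is a sketch that builds its own 20-line sample file (a hypothetical stand-in, one number per line) and runs head on it.

```shell
sample=$(mktemp)
seq 1 20 > "$sample"   # a 20-line file containing the numbers 1 to 20

head "$sample"         # first ten lines: 1 to 10
head -n 5 "$sample"    # first five lines: 1 to 5
```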

tail command

The tail command does the same thing as the head command, but for the last ten lines. If you want to control the number of lines displayed, use the -n flag with the desired number.
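
A matching sketch for tail, again on a made-up 20-line sample file:

```shell
sample=$(mktemp)
seq 1 20 > "$sample"   # a 20-line file containing the numbers 1 to 20

tail "$sample"         # last ten lines: 11 to 20
tail -n 3 "$sample"    # last three lines: 18, 19, 20
```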

If you want tail to keep running and print new lines to your terminal as they are added to the file, you can use the -f flag (press Ctrl+C to stop).

For example, when I opened another terminal window while tail -f was running, the Stdout in the terminal got updated with a line about it (the third-last line of the Standard output).

sort command

The sort command sorts the lines of a file, alphabetically by default. You can also reverse the sort order with the -r flag.
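
A minimal sketch with a made-up three-line file:

```shell
fruits=$(mktemp)
printf 'pear\napple\nmango\n' > "$fruits"

sort "$fruits"      # apple, mango, pear
sort -r "$fruits"   # pear, mango, apple
```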

tr command

It takes Standard input (Stdin), translates it by replacing or deleting characters, and writes the result to Standard Output (Stdout). Because tr reads only from Stdin, you typically use it with a pipe, like cat output.txt | tr a-z A-Z, which converts lowercase letters to uppercase.
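
Here is a short sketch of tr on a made-up one-line file. The ranges are quoted so the shell passes them to tr untouched.

```shell
greeting=$(mktemp)
echo "hello linux" > "$greeting"

cat "$greeting" | tr 'a-z' 'A-Z'   # prints HELLO LINUX
tr 'a-z' 'A-Z' < "$greeting"       # same result, feeding the file to Stdin directly
echo "hello linux" | tr -d 'l'     # -d deletes characters: prints heo inux
```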

uniq command

Let’s say you have multiple lines with duplicate values. The uniq command collapses repeated lines so that each value is displayed only once in the standard output. If you use the -c flag, it also shows the number of occurrences of each value.

If you only want the non-repeated values, you can use the -u flag. In my example, only four values are not repeated, and hence only those are displayed. To see the opposite of this, which is only the duplicated values, you can use the -d flag.
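
The flags can be tried on a small made-up file in which the duplicates sit next to each other:

```shell
letters=$(mktemp)
printf 'a\na\nb\nc\nc\nc\n' > "$letters"

uniq "$letters"      # a, b, c
uniq -c "$letters"   # with counts: 2 a, 1 b, 3 c
uniq -u "$letters"   # only the non-repeated line: b
uniq -d "$letters"   # only the repeated lines: a, c
```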

A problem you might face with uniq command

Let’s say the file contains duplicate values that are not adjacent to each other. Will the uniq command still work?

The answer is no. In this example, the repeated values are v, d, and r, but uniq removes the duplicates only for ‘r’, not for ‘v’ and ‘d’. This is because uniq compares only adjacent lines, and only r’s repetitions are next to each other.

The solution

A handy solution is to use the sort command first and pass its Stdout as Stdin to the uniq command via a pipe, like sort output.txt | uniq. Sorting places identical lines next to each other, so uniq can catch them.

Now ‘v’ and ‘d’ no longer repeat in the Stdout.
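
The problem and the fix can be reproduced with a small sketch. The file contents below are an assumption that mirrors the v, d, and r values mentioned above, not the exact file from the example.

```shell
letters=$(mktemp)
printf 'v\nd\nr\nr\nv\nd\n' > "$letters"

uniq "$letters"          # v, d, r, v, d (only the adjacent r is collapsed)
sort "$letters" | uniq   # d, r, v
```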

wc command

The wc command shows several counts for a file. If you write the command wc output.txt, it gives you three numbers, which denote the number of lines, the number of words, and the file size in bytes.

If you write wc -l [filename or path], you’ll get only the number of lines.

If you write wc -w [filename or path], you’ll get only the number of words.

If you write wc -c [filename or path], you’ll get only the size of the file in bytes.
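
A small sketch with a made-up two-line file. Reading the file through < keeps the file name out of the output, which is handy when you only want the number.

```shell
text=$(mktemp)
printf 'one two\nthree\n' > "$text"

wc "$text"        # lines, words, bytes, then the file name: 2 3 14
wc -l < "$text"   # 2 lines
wc -w < "$text"   # 3 words
wc -c < "$text"   # 14 bytes
```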

grep command

It allows you to search text for lines that match a string or a regular expression. For example, if you write env | grep PWD, you’ll get every line of the env output that contains PWD. The env command produces a standard output, and the pipe symbol (|) passes it to the grep command as standard input.

Not only does it show you PWD, but also OLDPWD, which holds the previous directory you were in. If you write the cd - command, it sends you back to that directory (OLDPWD).
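
To make the matching behaviour easy to see, this sketch runs grep on a few sample lines standing in for env output. The values are hypothetical, and your real env output will differ.

```shell
# Sample lines standing in for env output (hypothetical values).
sample='HOME=/home/lenovo
PWD=/home/lenovo/Documents
OLDPWD=/home/lenovo
USER=lenovo'

echo "$sample" | grep PWD       # prints the PWD and OLDPWD lines
echo "$sample" | grep '^PWD='   # a regular expression: only the line starting with PWD=

# On your own system:
# env | grep PWD
```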

Thank you for reading :)

To read the previous blog of this Linux Series, click here

To see a video lecture of this blog, go to this youtube video

Follow me on Twitter here

Do comment your thoughts or anything unique you learned below!

Did you find this article valuable?

Support Kshitij Sharma by becoming a sponsor. Any amount is appreciated!