Pipes, Lists & Redirection

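This article introduces pipes, lists and redirection on UNIX systems and how to use them. Worked examples show how these features connect several commands to automate a data-processing flow. It also explains the different list symbols and their meanings, and looks at how input and output redirection work.

Original: http://www.injunea.demon.co.uk/pages/page208.htm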

Now, as promised, a closer look at the pipe, list, and redirection characters and their functionality.

Pipe Dreams:

Pipes are a UNIX feature which allows you to connect several commands together in one line and pass data from one to the next much like a chain of firemen sending buckets of water down a line. The data in the bucket is processed by each command and then passed on to the next command without ever coming up for air. This happens because of two things:

  1. Most UNIX commands get input from stdin and pass output to stdout
  2. The pipe symbol (|) directs UNIX to connect stdout from the first command to the stdin of the second command.

So that sounds simple. How does it work in practice, and what does a command pipe look like? The example below shows several pipes made up of groups of commonly piped commands. You will see examples of these syntax structures somewhere in most scripts.

Example pipes

	line_count=`wc -l $filename | cut -c1-8`
	process_id=`ps -ef \
		   | grep $process \
		   | grep -v grep \
		   | cut -f1 -d\	`
	upper_case=`echo $lower_case | tr '[a-z]' '[A-Z]'`

In all cases the pipeline has been used to set a variable to the output of the last command in the pipe. In the first example, the wc -l command counts the number of lines in the file named by the variable $filename. Its output text is then piped to the cut command, which snips off the first 8 characters and passes them on to stdout, hence setting the variable line_count.
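As a hedged variation on the first example: many current wc implementations print the count followed by the file name rather than a fixed eight-character column, so cutting the first 8 characters may not isolate the number. The sketch below assumes a POSIX shell with mktemp available and sends the file down the pipe with cat, so wc has no name to print. The sample file is hypothetical and exists only for the demonstration.

```shell
# Hypothetical sample file, created only for this demonstration.
filename=`mktemp`
printf 'one\ntwo\nthree\n' > "$filename"

# cat sends the file down the pipe, so wc sees only stdin and prints
# just the count; tr strips the padding that some wc versions add.
line_count=`cat "$filename" | wc -l | tr -d ' '`
echo "line_count=$line_count"

rm -f "$filename"
```

The variable ends up holding the bare number, which is safe to use in later arithmetic or comparisons.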

In the second example, the pipeline has been folded using the backslash and we are searching for the process_id or PID of an existing command running somewhere on the system. The ps -ef command lists the whole process table from the machine. Piping this through to the grep command will filter out everything except any line containing our wanted process string. This will not return one line however, as the grep command itself also has the process string on its command line. So by passing the data through a second grep -v grep command, any lines containing the word grep are also filtered out. We now have just the one line we need and the last thing is to get the PID from the line. Here the PID is taken to be the first field on the line, so piping through a version of cut using the field option finally gets the PID we are looking for. Note the field option delimiter character is an escaped tab character here. Always test the layout and the blank characters that UNIX commands return; they are not always what you would think they are. On many systems, for instance, ps -ef prints the user name first and the PID second, padded with spaces rather than tabs, so the field number and delimiter may need adjusting.
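Because the column layout of ps differs between systems, the following sketch runs the same grep, grep -v, field-extraction chain over a few canned lines laid out like typical ps -ef output (UID, PID, PPID and so on). The sample lines, the sample_ps function and the process name mydaemon are all made up for illustration; awk is swapped in for cut here because it splits on runs of spaces.

```shell
process=mydaemon   # hypothetical process name

# Canned lines standing in for `ps -ef` output; not real data.
sample_ps() {
    echo "UID        PID  PPID  C STIME TTY          TIME CMD"
    echo "root      1234     1  0 09:00 ?        00:00:01 /usr/sbin/mydaemon"
    echo "alice     5678  4321  0 09:05 pts/0    00:00:00 grep mydaemon"
}

# Keep lines naming the process, drop the grep itself,
# then print the second whitespace-separated field (the PID column).
process_id=`sample_ps \
           | grep "$process" \
           | grep -v grep \
           | awk '{ print $2 }'`
echo "process_id=$process_id"
```

On a real system, sample_ps would be replaced by ps -ef, after checking which column actually holds the PID.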

You should be able to work out the last example yourself based on just the variable names alone. Note the shorthand version of the complete alphabet we used for the tr in Example Function Syntax.

Lists:

Lists look similar to pipes except the pipe symbol '|' is replaced by one of the following list symbols between each command in the list: ';', '&', '&&', or '||', and optionally terminated by ';' or '&'. The semi-colon character is treated by the shell as a command separator, just like a newline, so a list of commands separated by semi-colons behaves in much the same way as a list of commands on separate lines would behave (hence the name). The difference is that all the list types can be executed in the current shell or in a sub-shell by utilising a slightly different syntax, and the output from the completed list can be redirected (see below) or piped (see above). The two syntax forms for shell locations of lists are shown in Current Shell and Sub-Shell below. Hint - It's all in the brackets.

Current Shell:

{ command; command; command; }

Sub-Shell:

( command; command; command; )
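A minimal sketch of the practical difference between the two forms, assuming a POSIX shell: a variable assigned inside round brackets is changed only in the sub-shell, while one assigned inside curly brackets is changed in the current shell. The variable names are illustrative only.

```shell
value=original

# Sub-shell: the assignment happens in a child shell and is lost afterwards.
( value=changed_in_subshell; echo "inside ( ): $value" )
after_subshell=$value

# Current shell: the assignment is still in effect after the closing brace.
{ value=changed_in_braces; echo "inside { }: $value"; }
after_braces=$value

echo "after ( ): $after_subshell"
echo "after { }: $after_braces"
```

The same applies to cd and other changes of shell state, which is why the sub-shell form is handy for a temporary change of directory.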

The other list symbols change the way the list is processed and they have the following meanings:

  1. &  Asynchronously executes the preceding pipeline (as a background task)
  2. && Execute only if preceding command or pipe terminated with zero exit status (i.e. it exited okay)
  3. || Execute only if preceding command or pipe terminated with non-zero exit status (i.e. it failed)
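The three symbols can be tried out with the standard true and false commands, which do nothing except exit with a zero and a non-zero status respectively. This is a small sketch assuming a POSIX shell with mktemp available; the scratch file is hypothetical.

```shell
# && runs the second command only after a zero exit status.
and_result=`true && echo "ran after success"`

# || runs the second command only after a non-zero exit status.
or_result=`false || echo "ran after failure"`

# & starts the list in the background; wait pauses until it has finished.
tmp_file=`mktemp`   # hypothetical scratch file
{ echo "background done" > "$tmp_file"; } &
wait
background_result=`cat "$tmp_file"`
rm -f "$tmp_file"

echo "$and_result / $or_result / $background_result"
```

A common idiom built on these is a guard such as cd "$dir" || exit 1, which stops a script as soon as a vital step fails.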

Redirects:

The matter of redirecting input and output follows a similar principle to that of piping. The significant differences are that redirects work with files, not commands, and there is a limit to how many you can put on one line - depending on the open file descriptors. Whereas the pipe connects one command's output to the next command's input, a redirect tells a command to put its output into a file or collect its input from a file. There is also a difference in the syntax due to the way UNIX executes its commands.

Normally UNIX will try and find an executable file somewhere on the command path ($PATH variable) which matches the first word on the current command line. So the first command in a pipe gets found, then executed, and data is piped on to the next command. Because redirecting is for files and not commands, a file cannot simply take the place of the first command on the line; it has to be attached to a command with a redirect symbol. Take another look at the last pipe in the above example. Rewriting this command as a redirect would give the following:

tr '[a-z]' '[A-Z]' < $in_file > $out_file

Now you can see the difference. The command comes first, the in_file is directed in by the less_than sign (<) and the out_file is pointed at by the greater_than sign (>). Each redirect names a single file: the in_file name should resolve to exactly one file (a wild card matching several files will not feed them all in), and the out_file must be unique. Just remember the in_file points its arrow at the command, while the out_file gets pointed at.
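The redirected tr command can be tried with a pair of scratch files. The sketch below assumes a POSIX shell with mktemp available; the file contents are made up. It also shows one common way to feed several files to a command, which is to concatenate them with cat and pipe the result in.

```shell
in_file=`mktemp`    # hypothetical input file
out_file=`mktemp`   # hypothetical output file
printf 'hello world\n' > "$in_file"

# Same translation as the pipe example, but reading and writing files.
tr '[a-z]' '[A-Z]' < "$in_file" > "$out_file"
result=`cat "$out_file"`
echo "$result"

# Several input files: concatenate them with cat and pipe the result in.
more_file=`mktemp`
printf 'second file\n' > "$more_file"
combined=`cat "$in_file" "$more_file" | tr '[a-z]' '[A-Z]'`
echo "$combined"

rm -f "$in_file" "$out_file" "$more_file"
```

Quoting the variables, as done here, keeps the redirect working when a file name contains spaces.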

The redirect arrows can also be doubled up as in the next example. Here the output from the cat command is a file as before. The double greater_than (>>) directs the output to be appended to the file, if it already exists. If the file does not exist, it is created. The single arrow form, as used above, would always create a new file if there was none there, or overwrite an existing file.

On the input side the double arrow has a slightly different meaning. Here, where the single arrow gets the input from a file, the double arrow gets its input from the shell file that is currently executing. You may be wondering how the shell can tell where the end of this input is and where the continuing shell script restarts. Well, that is the reason for the word following the double less_than sign. The word is a marker that the shell will look out for when reading the input stream. When the word shows up, input will stop and the script will continue.

Example redirected cat

cat >> $out_file << EOF
first line of data
second line of data
more data
the end of the data
EOF

In this case I have used the flag word EOF to indicate the End Of File. However, any word will be acceptable as long as it does not appear on a line by itself within the data. What I generally do is use EOA, EOB, EOC, etc., if I create several files within a script. Capitals are not required (although the closing word must match the opening one exactly), but they do make the flags easy to match up as a pair when reading the script.
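To see the single and double output arrows side by side, the following sketch writes to one scratch file twice, using a different flag word for each block of input. It assumes a POSIX shell with mktemp available; the file and its contents are hypothetical.

```shell
out_file=`mktemp`   # hypothetical output file

# A single arrow creates the file, or overwrites what was there.
cat > "$out_file" << EOA
first line of data
EOA

# A double arrow appends to the existing contents.
cat >> "$out_file" << EOB
second line of data
EOB

line_count=`wc -l < "$out_file" | tr -d ' '`
contents=`cat "$out_file"`
echo "$line_count lines:"
echo "$contents"
rm -f "$out_file"
```

Had the second cat used a single arrow, the file would hold only the second line.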

There is one more thing you can do with this redirected input from a script file. Look at this next example and you will see a minus sign between the double arrows and the flag name:

cat >> $out_file <<-EOF

This instructs the shell to remove leading tab characters from all lines in the input stream, including the line holding the matched flag word. This handy trick allows you to use a code indent, which makes reading much easier. The following example is a copy of the previous code segment in this new, easier to read format.

Example indented cat

cat >> $out_file <<-EOF
	first line of data
	second line of data
	more data
	the end of the data
EOF

Now it looks more like a code block and the flag word stands out too. The section which is indented is easily understood to be the contents of the created file. This feature is not available in the C Shell.
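Leading tabs are easily turned into spaces when text is copied around, and the minus form strips tabs only. To keep the tabs explicit, the sketch below uses printf to write a tiny script in which the data lines and the flag word are each indented by one tab (written as \t), then runs it with sh. The scratch script is hypothetical and assumes mktemp is available.

```shell
script=`mktemp`   # hypothetical scratch script

# printf writes the tabs explicitly: both data lines and the EOF marker
# start with one tab inside the generated script.
printf 'cat <<-EOF\n\tfirst line of data\n\tsecond line of data\n\tEOF\n' > "$script"

# Running the script shows the data with the leading tabs removed.
result=`sh "$script"`
echo "$result"
rm -f "$script"
```

If the indent were made of spaces instead, the spaces would be kept in the output and an indented flag word would no longer be recognised.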

In addition, the redirect arrows can actually redirect input and output to and from the stdio files, known by their descriptor numbers (0, 1 and 2). These are the default input and output files for UNIX, usually connected to keyboard (0), display (1) and errors (2). Thus the syntax:

<&digit

uses the file associated with file descriptor digit as the standard input. The same goes for standard output if you reverse the arrow. You can also associate one file with another as in this example:

ls -l   $directory/*.log   >   $out_file   2>&1

Here we see an ls command outputting to out_file. At the end however, is another redirect which is indicating that stderr (file descriptor 2) should also be sent into the stdout (file descriptor 1), which in this case is our out_file. To say the same thing in C Shell the syntax looks simpler, but is harder to read because the descriptor numbers are missing:

ls -l   $directory/*.log   >&   $out_file
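In the Bourne-style syntax the order of the two redirects matters, because the shell processes them from left to right. The following sketch, assuming a POSIX shell with mktemp available, uses a hypothetical function named both that writes one line to each stream, and counts how many lines land in the file for each ordering.

```shell
# Hypothetical helper that writes one line to each output stream.
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

log_a=`mktemp`
log_b=`mktemp`

# stdout goes to the file first, then stderr is pointed at the same place.
both > "$log_a" 2>&1

# Reversed: stderr is pointed at the old stdout before stdout moves,
# so the stderr line appears on the screen and only one line is filed.
both 2>&1 > "$log_b"

count_a=`wc -l < "$log_a" | tr -d ' '`
count_b=`wc -l < "$log_b" | tr -d ' '`
echo "file-first order captured $count_a lines; reversed order captured $count_b"
rm -f "$log_a" "$log_b"
```

So to collect both streams in one file, the file redirect is written first and the 2>&1 after it.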

Don't forget, you can use this mechanism to create files with any content you like, generated from any other combination of commands. Here is an example of a menu file listing the files in a directory. This can then be displayed on the screen and the user's choice selected quite simply.

Example simple menu

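count=0
# Build the menu file: one numbered line per file in the directory
for file in `ls -1 $source_directory`
do
    count=`expr $count + 1`
    echo "$count:	$file"  >> $menu_file
done
echo "Please select a number from this menu"
cat $menu_file
# read takes the variable name without the $ sign
read choice
echo "Thanks"
# Look up the chosen line and keep the text after the colon
filename=`grep $choice $menu_file | cut -f2 -d:`
echo "You chose [$filename]"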

This example is very simplistic however and will not cope with filenames that contain digits or filename lists longer than 9 lines. Both of these conditions could lead to the grep returning more than one line, which is an error condition (See - Simple Menu Functions for a better solution to these problems).
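One possible way around both problems, offered as a sketch rather than as the approach taken in Simple Menu Functions, is to anchor the grep pattern so that the chosen number must sit at the start of the line and be followed directly by the colon. The pick_file function and the file names below are hypothetical, the tab after the colon has been dropped to keep the lookup simple, and mktemp is assumed to be available.

```shell
# Hypothetical helper: print the file name stored on menu line $2 of file $1.
# The ^ anchor and the trailing colon mean that choice 1 matches only "1:",
# not "10:" or a file name that happens to contain a 1.
pick_file() {
    grep "^$2:" "$1" | cut -f2 -d:
}

menu_file=`mktemp`   # scratch menu built from made-up file names
count=0
for file in report1.log data10.log notes.txt a.log b.log c.log d.log e.log f.log g.log
do
    count=`expr $count + 1`
    echo "$count:$file" >> "$menu_file"
done

first=`pick_file "$menu_file" 1`
tenth=`pick_file "$menu_file" 10`
echo "1 -> [$first], 10 -> [$tenth]"
rm -f "$menu_file"
```

This still assumes file names contain no colons or spaces; a name with a colon would be cut short by the cut command.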
