LaTeX

What is a text formatting language?

As you have seen, the text editors that are usually provided with UNIX (vi and emacs) cannot do all of the text formatting that most people have gotten used to on micro-computer word processors such as Microsoft Word or WordPerfect. From this, some people jump to the conclusion that those tasks are better suited to personal computers, and that larger systems or those running the UNIX operating system are better for programming and other work that does not require the fancy layout of newsletters and technical papers. Actually, there are many different text formatting software packages available for UNIX systems. Some of the more popular ones are nroff, troff, TeX and LaTeX. In this chapter we will look into the LaTeX package.

Text formatting languages are in many ways more powerful than most of the word processors available today, but in order to get this extra power, they are usually more difficult to learn and to use. Text formatting packages operate on a different principle than the common word processors. In a word processor, the text is entered into the document using the standard printable characters on the keyboard, and the directions for how to format it are entered with the function keys, with control characters or through pull-down menus. In UNIX, if these keys are available, they are most likely mapped to directions for the terminal or the editor, so they can't also be used for formatting directions. Because the regular printable keys of the keyboard must be used to give the commands as well as to enter the text, the resulting file is a plain text (ASCII) file, unlike the files of most word processors, and there is no WYSIWYG (what-you-see-is-what-you-get) capability.

Files written to use a text formatting language must be translated into another form before printing or viewing. This translation is very much like the way a file written in a programming language must be compiled. The process is as follows: the source file is created in an editor, translated into a device-independent file, and then translated again into a form the printer or display understands.
The commands used are usually specific to the site where you are working. On our class account UNIX machines we need to take the following steps:

% vi filename.tex

Puts you into an editor to create the source file. LaTeX usually needs the file name to end in the .tex extension in order to be sure it is being given the source file.

% latex filename.tex

Creates several new files in the current directory. The most important of these is the one that ends with the .dvi extension. This is the "device independent" file. It holds the formatted result of the text and commands in the source file, but it is no longer a text file, so it can't be displayed with the more command. Other files created or updated include the .aux (auxiliary) and the .toc (table of contents) files. If there are any major errors in the source file so that it can't be translated, LaTeX will put you into an interactive system to make the changes. Since we don't have time to go through all of those commands, type the command "q" and press <enter> if you see the question mark (?) prompt. This should put you back at the shell prompt so you can go into an editor to fix the error in the source file.

The .dvi file can be displayed on some more advanced terminals by using a software package, or it may be printed directly, for example in our "glue" labs on campus. On our class cluster machines, we need to translate it again so it can go to a printer.

% dvips filename.dvi > filename.ps

This command translates the dvi file into a PostScript file. On many systems the command to do this is "postscript," but there is no standard. The file name after the redirection can be anything; I just used the .ps extension so it is easy to see which version it is and which source file it came from. This PostScript file can then be printed on any PostScript printer. Sending it to a text printer causes many pages of garbage to be printed. We send it to a PostScript printer on UMDnet (our campus-wide network) by using the following command.

% qpr -q printername-ps filename.ps

Don't forget the -ps on the name of any printer that will assume the file is text unless told it is PostScript. The printer name is most likely on a sign near the printer in the lab, with directions on how to access it as a PostScript printer and how to access it as a text printer.
What is LaTeX?

TeX is a very powerful text formatting program written by Donald Knuth and first released in 1978. TeX is too powerful to be easy for many people to use, so Leslie Lamport developed LaTeX in the mid-1980s to create a more user-friendly interface to TeX. It allows the user to make use of "macros", which are variable settings and format selections already prepared the way they would usually be used. A knowledgeable user of LaTeX can still use all of the power of TeX. A TeX user must describe how each thing should look and where it should be placed on the page. In LaTeX, on the other hand, the user gives instructions that say what the text is rather than exactly how it should be formatted. Once told what it is, the macro takes over and figures out all of the details of size, font and page placement.

LaTeX is not a program written specifically for running on a mainframe using the UNIX operating system. There are also versions of LaTeX written for IBM PCs and for the Macintosh. It fits into a course on UNIX because, when using a UNIX system, this is the text formatting language many people choose to use, especially in the academic environment.

In LaTeX the delimiter between words and/or commands is "white space", as it is in the UNIX shell. White space can be a single space, multiple spaces, a tab or a hard return. Since all of these are viewed as one delimiter, the lines you type in will not necessarily stay as lines, because the system will automatically assume you would like the lines filled. The automatic fill can be turned off, but not having to worry about which lines are filled as you are typing into an editor is helpful. A standard practice is to have the end of each sentence end a line. The only white space that is special is two end-of-line characters in a row (a blank line). Just like in the editors, this is viewed as the separation between paragraphs.
There is also a problem because not all of the characters you would like to have in the finished product appear on the keyboard. Some of these extra characters are entered through the use of commands as described below. Others are combinations of keys that do appear on the keyboard. For example, in most formal writing the quotation marks curl inward. On your keyboard you have one set of quotation marks, which looks like " (straight up and down, curled neither way). In order to have quotes that curl clockwise you need to type two apostrophes in a row, and in order to get quotes that curl the other way you use two accents (back quotes) in a row. If you use only one, or put a space in between, there is no special meaning: it is simply an apostrophe or an accent. When two are placed in a row, they are changed into curled quotation marks.
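The white space and quotation mark rules above can be tried with a small fragment like the following. It is only a sketch meant to sit in the body of a source file, and the sentences themselves are made up for illustration.

```latex
% Fragment for the body of a document.
These   words    are separated
by uneven white space,
but they are filled into one paragraph.

A blank line (two end-of-line characters in a row) starts a new paragraph.

She said, ``Hello.''       % two back quotes open, two apostrophes close
It's a `single' quote.     % one of each gives single quotation marks
```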
How does it recognize formatting directions?

In LaTeX there are three different types of directions that must be told apart from the text of the document around them. These three things are "declarations," "commands" and "environments." When interpreting the document, LaTeX usually recognizes these directions because they start with a \ (backslash). There are some special characters, such as the %, which are interpreted as a command without having a \ in front. The % causes the characters that follow it, up to the end of the line, to be ignored by the LaTeX interpreter; this is called a "comment". In order to use these special characters as themselves, you must put a backslash in front (e.g. \& prints a & in the text). This is actually a command telling LaTeX to print the "&", not a prevention of interpretation (as when the shell interprets a backslash), but it is difficult to see any real difference between these two meanings of the backslash.

Declarations usually set some kind of environment parameter which affects the whole document. These are used to set standards so that different occurrences of the same type of structure (such as a list or a title) appear the same each time, without your having to remember how you set it the last time.

Commands affect the document where they occur in the text. The example above (the \&) is a command, but most commands use words rather than symbols. For example, the command \today will be replaced with the date the source file was translated into a dvi file, and the command \LaTeX will translate into the logo for the LaTeX software. Other commands require arguments and/or allow options (these words mean the same as they did when used to describe shell commands). The options follow the command name and are placed in [] (square brackets), while the arguments are placed in {} (curly braces). There may be more than one option or more than one argument, depending on the command.

So a command will look like one of the following: \commandname, \commandname{argument} or \commandname[option]{argument}.
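Concrete instances of these shapes, using only commands that appear elsewhere in this chapter, might look like the lines below. They are separate illustrations rather than one document (the \documentstyle line belongs in the preamble), and the footnote text is made up.

```latex
\today                           % a command name alone
\LaTeX                           % another command with no option or argument
\footnote{A short note.}         % a command with one argument
\documentstyle[11pt]{article}    % a command with one option and one argument
```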
Other than for the single symbol commands, LaTeX knows that the command name is over
because it encounters a backslash, an open curly brace, an open square bracket or some white
space. The fact that white space is used both as a delimiter between commands and text and
to represent space you would like to appear in the document causes some problems. If you
write a sentence in which a command such as \LaTeX is followed by a space and then more
text, the space only ends the command name and does not appear in the printed document, so
the next word runs up against the logo. Typing a backslash followed by a space after the
command puts the space back.

The last type of direction that needs to be given is called an environment because it
changes the general interpretation of the commands within it. Environments will have a
clear beginning and ending point. This is called the "scope" of the environment.
The scope may be determined by a pair of curly braces around the command and the affected
text, or by a matching \begin and \end pair.
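Both points, the space after a command name and the scope of a direction, can be seen in a short sketch such as this one. The sentences are invented for illustration, and \em and the center environment are used only as convenient examples of scoped directions.

```latex
\LaTeX is fun.        % the space only ends the command name: no space is printed
\LaTeX\ is fun.       % backslash-space puts a real space after the logo
\LaTeX{} is fun.      % an empty pair of braces ends the name and keeps the space

This is {\em emphasized} text.   % \em affects only the text inside the braces

\begin{center}                   % scope marked by a \begin and \end pair
This line is centered.
\end{center}
```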
What does a LaTeX source file look like?

A LaTeX source file has two distinct parts: the preamble and the body. The preamble holds the decisions that are global to the whole document. The body has a beginning marker and an end marker and contains all of the rest of the commands and the text for the document. In outline, the preamble comes first, followed by the \begin{document} marker, the text and commands of the body, and the \end{document} marker.
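A minimal sketch of such a file, using the \documentstyle form this chapter describes, is shown below. The 11pt option and the sentence in the body are arbitrary choices for illustration. Note that current versions of LaTeX spell the first line \documentclass instead of \documentstyle.

```latex
% --- preamble: decisions that apply to the whole document ---
\documentstyle[11pt]{article}

% --- body: everything between the two markers ---
\begin{document}
This is the text of the document, translated on \today\ with \LaTeX.
\end{document}
```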
The most important declaration in the preamble (and the only one that is required) is the \documentstyle declaration. The document style may be something that is provided with LaTeX, such as article or book, or it may be something a knowledgeable TeX user has written specifically for the site where you are working. At a university someone may have written a special document style that makes all of the decisions for how to lay out a thesis, so every thesis will look alike. The command to choose a style looks like the following:

\documentstyle[options]{stylename}

where the options are things that change the already written style a little bit, such as two columns rather than one, or 11 point type rather than the default 10 point.

The body (between the \begin{document} and the \end{document}) can be further broken up into sections. The sections available are determined by the document style chosen. For example, a book is broken into chapters but an article is not. The order is usually chapter, section, subsection, subsubsection, paragraph and subparagraph. (There is also a section called "part" but this is different in use from the others.) Each of these sections must be started with its own command and must be properly nested: a subsection belongs to the section that precedes it and ends when the next section at the same or a higher level begins. The format for a sectioning command is

\sectionname{title}

In this format the title will show both at this point in the text and in any table of contents derived for this source file. The title is also assigned a number based on the next in sequence for that level. For example, if we are just finishing chapter 4, the next chapter title will be numbered 5, the first section within that chapter is numbered 5.1, the first subsection 5.1.1, etc. These numbers can be modified with the use of the \setcounter command:

\setcounter{sectionname}{value}

This tells LaTeX to assume that the counter for that level currently holds the given value rather than the default, so the next title at that level is numbered one higher.
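As a sketch of how the numbering and \setcounter interact, consider the following small article. The section titles are invented, and the numbers in the comments assume the standard article style.

```latex
\documentstyle{article}
\begin{document}

\section{Introduction}          % numbered 1
\subsection{Background}         % numbered 1.1
\subsection{Goals}              % numbered 1.2

\setcounter{section}{4}         % act as if four sections have already gone by
\section{Results}               % numbered 5
\subsection{First experiment}   % numbered 5.1

\end{document}
```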
The size of the text and the placing of the text has already been determined by the document style chosen. It can be overridden, but this practice should be avoided since it takes the power away from LaTeX.
What are the Different Modes of LaTeX?

Within the body of the source file, LaTeX is always acting in one of three modes. These three modes are "math mode", "paragraph mode" and "LR (left-to-right) mode". The first two modes are described in detail below. The third is not used often. It is similar to paragraph mode except that LaTeX will not put in line or page breaks where it feels they are necessary. Most of the commands mentioned below as part of paragraph mode can actually be used in either of the other two, but they are most often used in paragraph mode.
What can be done with the math mode?

Actually there are three different environments within the class named math mode. They are "math", "displaymath" and "equation." A formula written within any of these environments looks very similar to the same formula in one of the other math mode environments. The only differences are where the formula is placed on the page and whether it is given a number for cross reference. One important thing to remember about the math environments is that they do not evaluate the formulas or even check whether they are legal uses of the mathematical symbols. Math mode is only a method that allows you to write using the notation and symbolism of mathematics.

In the "math" environment, the formula is placed within the text that surrounds it. In "displaymath" it is placed on its own line and centered so it is separated from the surrounding text. In "equation" it is positioned the same as it is for displaymath except it is also numbered with an equation number (usually placed on the far right side of the page) for reference later.

The math environment is indicated by the \begin{math} and \end{math} pair, which can be shortened to \( and \) or to a dollar sign ($) to mark the beginning and another to mark the end. The displaymath environment uses either \begin{displaymath} and \end{displaymath} or \[ and \]. The equation environment uses \begin{equation} and \end{equation}. So any of the following indicate that the formula will be interpreted according to the rules of math mode.

math: \begin{math} formula \end{math} or \( formula \) or $ formula $
displaymath: \begin{displaymath} formula \end{displaymath} or \[ formula \]
equation: \begin{equation} formula \end{equation}

One of the main things about the math mode is the fonts that are assumed. In the typesetting of mathematical documents, the Roman typeface is used for constants and italics for variables. Because it would be difficult for people to remember all of these font changes while writing a formula, the math mode does this automatically. The math mode also allows the use of special symbols that are only used in mathematics.
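The three environments can be compared side by side with a fragment like this one. The formula a + b = c is just a stand-in chosen for illustration.

```latex
% math: the formula stays inside the line of text
The sum \( a + b \) can also be written $a + b$
or \begin{math} a + b \end{math}.

% displaymath: the formula is centered on a line of its own
\begin{displaymath}
a + b = c
\end{displaymath}

\[ a + b = c \]

% equation: like displaymath, but an equation number is added
\begin{equation}
a + b = c
\end{equation}
```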
Like most other parts of LaTeX, the spacing between elements or around these symbols does not change the look of the resultant file. The mathematical symbols available on the keyboard (with the exception of the curly braces) can be typed into the formula directly; any that are not on the keyboard have a command that prints that character. The curly braces, since they are used to indicate scope, cannot be typed in directly. Instead you need to use the commands \{ and \} to have them print. To write a simple formula using these symbols, tell the system that you are going into a math mode (for example with \begin{displaymath}), type the formula, and then tell the system that you are going out of math mode (with the matching \end{displaymath}). For special characters we must use the command that displays that character. Some of the characters available are the Greek letters, the arrows, the "log-like" functions and the variable-sized symbols.
A Greek letter command written with the first letter uppercase corresponds to the uppercase Greek letter, and the others to the lowercase letter. With the arrows, an uppercase first letter indicates that the arrow is double. The "log-like" functions, which actually get written as words into the text, are written with a command rather than by just typing the letters, because the letters alone would be interpreted as three variables, not the name of a function. The variable-sized symbols (as the name implies) become larger as needed to fit appropriately into the formula that uses them.

Within the special characters, there is also a command which allows you to write any of the uppercase alphabetic characters using the calligraphy often used in the naming of functions. This command is \cal followed by the letter to be written in this fancy notation:

{\cal F}

Another piece of notation that is commonly used in math and very easy in LaTeX (while it is difficult, if not impossible, in an editor) is subscripting and superscripting with numbers, letters or other formulas. In order to superscript (raise the number above the normal base line and usually make it smaller) use the ^ (caret), and in order to subscript use the _ (underscore). For example

x_i = x^2 + 3

(read x sub i equals x squared plus three). In order to make it clear what is being subscripted, we often need to use curly braces to indicate the scope. The formula

x_{i +1} = x_i + 1

is not always true, because on the left side the x is subscripted with "i+1", while on the right side the x is subscripted with i and the value x sub i has 1 added to it. These can even be embedded inside of each other.

x_{i^2} \neq x_i^2

Here on the left the i is squared and on the right the x is squared. The \neq in the middle is the "not equal to" symbol. The commands _ and ^ assume that only the next single character is the argument to be raised or lowered, unless they are told differently by the scoping rules.
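The symbol and scripting rules above can be sketched as a handful of displayed formulas. The particular formulas are arbitrary illustrations; \gamma, \Gamma, \rightarrow, \Rightarrow, \log and \sum are standard commands chosen as one example from each family of symbols.

```latex
\[ \gamma \neq \Gamma \]                    % lowercase and uppercase Greek letter
\[ a \rightarrow b \qquad a \Rightarrow b \] % single arrow and double arrow
\[ \log x \neq log x \]                     % a function name versus three variables l, o, g
\[ \sum_{i=1}^{n} x_i \]                    % a variable-sized symbol with limits
\[ {\cal F} = \{ x_i, x_{i+1} \} \]         % calligraphic letter and printed braces
\[ x_{i^2} \neq x_i^2 \]                    % braces decide what is raised or lowered
```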
Some other common notations in math are the root symbol and fractions. The root symbol
command takes one argument which tells what will be placed under the root symbol. For
example \sqrt{x+5} means the square root of the quantity x+5. It can also take an option
if you wish to indicate that it is not the square root, but instead a root of another order.
For example \sqrt[3]{x+5} means the cube root of the quantity x+5.
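A short sketch of roots and fractions follows. The \frac command takes the numerator and the denominator as its two arguments, as in the \frac{hello}{world} example later in this chapter; the formulas themselves are made up.

```latex
\[ \sqrt{x+5} \]               % square root of the quantity x+5
\[ \sqrt[3]{x+5} \]            % the option [3] makes it a cube root
\[ \frac{x+1}{\sqrt{y}} \]     % \frac{numerator}{denominator}
```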
What can be done in Paragraph Mode?

In the two text modes, there are not as many special symbols, but there is more flexibility for the fonts (or the style that the letters can be printed using). Some of the most common are emphasis (em), boldface (bf), sans serif (sf), small capitals (sc), italics (it), roman (rm) and typewriter (tt). The size of these can also be changed to tiny, scriptsize, footnotesize, small, normalsize, large, Large, LARGE, huge and Huge. Not all of the sizes in all of the different fonts are available on all systems. On some you may get a warning message as the file is being translated that another is being substituted. The other confusing thing about size and font is that when a size change is indicated, LaTeX will assume that you want that size in Roman font, so the two commands {\Huge {\sc text}} and {\sc {\Huge text}} will not be the same on all systems.

Some of the things done in text layout were (or even still are) difficult to do in word processors. These features are easier when using a text formatting language because of the level of processing that is done before the file is printed. The processing allows the whole file to be looked at and compiled before the text that needs to be printed is displayed. These features include footnotes (and margin notes), tables of contents (and tables of figures or equations) and lists in general.

To do footnotes, all you really need is the \footnote command. This would be used as follows:

LaTeX\footnote{text formatting language by Leslie Lamport}

This line in the source code would cause the word LaTeX to appear in the regular portion of the document, surrounded by the other text (which isn't shown here). It would be immediately followed by a superscripted number indicating the number of the footnote. That number would appear again at the bottom of the page, followed by the text about Leslie Lamport.
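The font, size and footnote commands can be tried with a fragment like the one below. It uses the declaration forms named above inside braces; the sample words are arbitrary, and, as noted, some size and font combinations may be substituted on a given system.

```latex
{\em emphasized}, {\bf boldface}, {\sf sans serif}, {\sc Small Capitals},
{\it italics}, {\rm roman} and {\tt typewriter}.

{\small small text}, {\large large text} and {\Huge very large text}.

% The order of size and font changes can matter on some systems:
{\Huge {\sc text}} versus {\sc {\Huge text}}

Many papers are written in \LaTeX\footnote{text formatting language
by Leslie Lamport} because the footnote numbers are assigned automatically.
```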
If the number that would be assumed is not correct, this statement can be preceded by the command \setcounter{footnote}{number}, which allows you to set it to another number. The type of number is assumed to be arabic, but this can also be changed. If you do not want numbers to be used at all, you may use the \fnsymbol command, which allows you to use the dagger, asterisk, double dagger, etc. as footnote marks.

To do a table of contents, all you need is the command \tableofcontents at the point in the document where the table is to appear. This is controlled by the sectioning discussed earlier and by \setcounter{tocdepth}{num}, which allows you to specify which of those entries will be included in the table of contents. The table of contents is maintained in the file that ends in the .toc extension. Since the table of contents is usually at the beginning of the document, LaTeX cannot yet know the page numbers of the section headers when it reaches the table, so the first time LaTeX is run on a source file, the .toc file is only created. Each time after that, the old .toc file is used for the table of contents and the .toc file is updated as LaTeX progresses through the document. Because of this, two passes by LaTeX over the document may be required to create an accurate table of contents.

Lists are difficult in a word processor because as you type in the list items, you often assign numbers to them to tell where they appear in the list. If you later want to add another item (before any already existing item) you need to renumber all of the items that follow the new item. To get rid of this problem in LaTeX, you do not tell which item this is in the list; you only tell that this is the beginning of a new item. When the source file is processed, the proper mark is assigned to each item based on the type of list and marker chosen. The three most common types of lists are itemize, enumerate and description. Itemize puts the same mark before each item in the list.
By default this mark is a bullet at the first level, progressing to a dash, a star and a dot as lists are nested. For the enumerate list, each item is assumed to be marked with the next arabic number. This can easily be changed to roman or alphabetic as required. The third type of list (description) is used to make entries similar to those in a dictionary, where one word is usually the label. For this type of list, the label must be included with the item entry or it will be assumed to be blank. Each of the list environments starts with \begin{listtype} and ends with \end{listtype}. Anything inside will be assumed to be part of the list. The begins and ends must match and be properly nested. Each item in the list is then preceded by the \item[option] command, where the option holds the new mark if you do not want to use the one that would be assumed.

Since "white space" is viewed as a single unit, some things are more difficult to do in LaTeX than they are in most word processors. For example, if you are writing a poem where the ends of the lines do matter (so you don't want LaTeX to fill the lines as it sees fit), you need to tell it not to do the filling. Pressing enter at the end of the lines where you would like a line to end will not do it. One way to tell LaTeX to end a line is to use the \\ command or the \linebreak command.

Using tabs is another thing that is made difficult by the "white space" interpretation. Pressing the tab key is no different in LaTeX than pressing the space bar or the enter key, so the tab key cannot be used for tabbing. Even if you lay something out nicely using the tabs as understood by your terminal, unless it is in the verbatim environment, those tabs will be removed. The important things to remember are that the tabbing environment needs to be started and that setting tabs is a different action from using them. The tabbing environment is used with

\begin{tabbing} text \end{tabbing}

where the text is the area in which the tabs will be in effect. Setting tabs is done with the \= command and using the tabs is done with the \> command. If you know where on the page you would like the tab stops to go, you can use the \hspace command to move that far to the right before setting the stop, or the \= can be mixed in with text, in which case it refers to the position where that character would be printed. For example

\hspace*{4 cm}\=some text \=\hspace{2 cm}\= \kill

This line would place 3 tab stops. The * after the first \hspace tells LaTeX to keep the space even though it comes at the start of a line, so the first tab stop is placed 4 centimeters from the left margin. The text in the middle says to put the second tab stop after the amount of space it takes to print this text. The third is placed 2 centimeters after the second. The \kill command tells LaTeX not to print this line in the final document, but it does not remove the tab stops. They can be used on any line that follows this one in the tabbing environment by using the \> to move to the next tab stop.

The tabular environment is not as similar to the tabbing environment as the name seems to imply. The tabular environment is used for setting up tables and placing text into those tables. For example

\begin{tabular}{col def} table \end{tabular}

The column definition tells it how many columns there will be and how each will be formatted. For example ||c|l||r| says there are three columns. The first has a double bar before it and is centered. The first is separated from the second by a single bar. The second is left justified, a double bar separates it from the third, and the third is right justified with a single bar after it. Within the table, columns are separated by the & character and rows by the \\ command. Some other common commands are \multicolumn, which allows text to spread over more than one column, \hline, which draws a horizontal line across the whole table, and \cline, which allows you to specify which columns have a horizontal line.
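The list, line-ending, tabbing and tabular constructs described above can be sketched together as follows. All of the item text, the verse and the table entries are invented for illustration; the tab-setting line and the column definition are the ones discussed in the text.

```latex
% Lists: the marks and numbers are assigned when the file is processed
\begin{enumerate}
\item First item
\item Second item
  \begin{itemize}
  \item A nested item marked with the default mark
  \item[+] An item whose mark is given as the option
  \end{itemize}
\end{enumerate}

\begin{description}
\item[vi] A text editor.
\item[emacs] Another text editor.
\end{description}

% Line endings that matter: \\ ends a line where you say so
Roses are red,\\
Violets are blue.

% Tabbing: the first line sets three tab stops and is not printed
\begin{tabbing}
\hspace*{4 cm}\=some text \=\hspace{2 cm}\= \kill
one \> two \> three \> four \\
a \> b \> c \> d
\end{tabbing}

% Tabular: three columns defined by ||c|l||r|
\begin{tabular}{||c|l||r|}
\hline
Item & Description & Count \\
\hline
1 & apples  & 12 \\
\cline{2-3}
2 & oranges & 7 \\
\hline
\multicolumn{2}{||c||}{Total} & 19 \\
\hline
\end{tabular}
```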
How can you create new commands and environments?

These commands (and the others provided with LaTeX) may not be enough for what you want to do. In order to allow you to create your own commands and environments, LaTeX provides two commands, \newcommand and \newenvironment. The \newcommand takes two arguments (and is therefore like an alias): the first argument tells what you would like to type to use the command and the second tells what it will mean when you use that command. For example

\newcommand{\hi}{$\frac{hello}{world}$}

says that from now on when I type the command \hi it will be replaced with a mathematical fraction to say hello world. The \newenvironment command takes three arguments. The first is the name of the
environment while the second tells what will happen when you turn it on and the third
tells what happens when you turn it off. For example, if you are having trouble remembering
that enumerate means the same as a numbered list, you could make a new environment named
numlist that does the same thing as enumerate.
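A sketch of both definitions in use is given below. The numlist definition is one plausible way to write the environment described above, simply turning enumerate on and off, and the item text is made up.

```latex
% Definitions (these can go in the preamble)
\newcommand{\hi}{$\frac{hello}{world}$}
\newenvironment{numlist}{\begin{enumerate}}{\end{enumerate}}

% Use in the body
Typing the new command prints the fraction: \hi

\begin{numlist}
\item This item is numbered 1.
\item This item is numbered 2.
\end{numlist}
```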