Formulas in R

Many commands in R accept a call of the following form:

command(formula, data=datasetName, ...otherOptions)
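
As a concrete instance of this pattern, the sketch below uses two base R commands with the built-in mtcars data set. The data set and the variable names mpg, cyl and wt are chosen here for illustration only; they are not part of the original notes.

```r
# mtcars is a data set that ships with R.

# command = aggregate, formula = mpg ~ cyl, data = mtcars, other option: FUN = mean
# Computes the mean of mpg separately for each value of cyl.
means <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(means)

# command = lm, formula = mpg ~ wt, data = mtcars
# Fits a linear model of mpg as a function of wt.
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))
```

In both calls the formula comes first, the data set is passed through the data argument, and any further options follow.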

A formula is a special syntax in R that in general takes one of the following forms:

~ rhs
lhs ~ rhs
~ rhs | groups
lhs ~ rhs | groups

A formula consists of a tilde (~) followed by a "right-hand-side" expression, typically the name of a variable. It can optionally have a "left-hand-side" expression before the tilde, and it may end with a vertical bar (|) followed by a "groups" section. Each section contains the variables to be considered, or sometimes expressions involving those variables; multiple variables are typically separated by a plus sign (+).
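
The parts of a formula can be inspected directly, since a formula is an ordinary R object. The following is a small sketch; the variable names (mpg, wt, hp, cyl) are illustrative assumptions borrowed from the built-in mtcars data set, and creating a formula does not require those variables to exist.

```r
# A one-sided formula: only a right-hand side.
f1 <- ~ wt
# A two-sided formula: left-hand side, tilde, right-hand side.
f2 <- mpg ~ wt
# Several right-hand-side variables separated by plus signs.
f3 <- mpg ~ wt + hp
# A formula with a groups section after the vertical bar.
f4 <- mpg ~ wt | cyl

class(f2)      # formulas have class "formula"
length(f1)     # 2: the tilde and the right-hand side
length(f2)     # 3: the tilde, the left-hand side and the right-hand side
all.vars(f3)   # names of all variables mentioned in the formula
all.vars(f4)   # includes the variable from the groups section
```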

The variables on the left-hand side are the target variables: we are interested in how they are influenced by the other variables. In a scatterplot, the left-hand side corresponds to the y axis.
The variables on the right-hand side are the source variables. If there is no left-hand side, we are interested in these variables themselves; otherwise, we are interested in how they influence the variables on the left-hand side.
The variables in the groups section are used to break down the results. In a graph, for example, there would be a separate panel for each value of the group variables.
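
The roles described above can be seen in a plot. The sketch below uses the base R functions plot and coplot with the built-in mtcars data set; both the choice of functions and the data set are assumptions made for illustration, and other plotting commands that accept the same kind of formula (for example xyplot in the lattice package) may be what a given course uses instead.

```r
# Left-hand side (mpg) goes on the y axis, right-hand side (wt) on the x axis.
plot(mpg ~ wt, data = mtcars)

# Same relationship, broken down by the groups section:
# one panel for each value of the number of cylinders.
coplot(mpg ~ wt | factor(cyl), data = mtcars)

# Number of distinct values of the grouping variable, one panel each.
n_panels <- nlevels(factor(mtcars$cyl))
print(n_panels)
```

Here cyl is wrapped in factor() so that each distinct number of cylinders is treated as its own group rather than as a numeric range.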
