When you work with experimental data, the measurements are usually scattered: the experiment never matches the theory perfectly. In these cases, the researcher needs a mathematical model that best fits the scattered data. The simplest way to find such a model is a linear regression. Let’s learn how to do it in \LaTeX.

Suppose you obtain a series of data from an experiment that measures the position ($r$) versus time ($t$) of a particle moving with constant velocity. The data can be stored in a file named \verb|r_vs_t.dat|, placed in the same folder as your main project file.

    t       r
    0       1
    1.2     1.78
    2.3     4.495
    3.4     5.21
    4.1     4.665
    5.6     5.64
    6.5     7.225
    7.2     7.68
    8.1     6.265
    9.3     8.045
    10.7    8.955
    % Data file
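If you would rather keep everything in a single source file, one possible approach is to let \LaTeX{} write the data file during compilation. The sketch below assumes a \LaTeX{} release from late 2019 or newer, where the \verb|filecontents*| environment accepts the \verb|overwrite| option; on older installations this may require the separate \verb|filecontents| package instead.

```latex
% Sketch: place this before \documentclass (or in the preamble).
% It writes r_vs_t.dat into the project folder on every run.
\begin{filecontents*}[overwrite]{r_vs_t.dat}
t       r
0       1
1.2     1.78
2.3     4.495
3.4     5.21
4.1     4.665
5.6     5.64
6.5     7.225
7.2     7.68
8.1     6.265
9.3     8.045
10.7    8.955
\end{filecontents*}
```

The starred form is used so that no header comment is added to the generated file.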

## 1. Plot data from external file in \LaTeX

To plot this data, we can use the \verb|\addplot| command along with the \verb|table| option and specify the names of the columns we want to plot, as follows:

    \documentclass{standalone}

    \usepackage{tikz}
    \usepackage{pgfplots}
    \usepackage{pgfplotstable}
    \pgfplotsset{compat = newest}

    \begin{document}
    \begin{tikzpicture}
        \begin{axis}[
            xmin = 0, xmax = 11,
            ymin = 0, ymax = 11,
            width = \textwidth,
            height = 0.75\textwidth,
            xtick distance = 1,
            ytick distance = 1,
            grid = both,
            minor tick num = 1,
            major grid style = {lightgray},
            minor grid style = {lightgray!25},
            legend cell align = {left},
            legend pos = north west
        ]
            \addplot[teal, only marks] table[x = t, y = r] {r_vs_t.dat};
        \end{axis}
    \end{tikzpicture}
    \end{document}

Notice that the preamble loads the \verb|pgfplotstable| package. It extends the table handling used by \verb|\addplot table| and, more importantly, it will let us compute the linear regression for our data. The code above also sets some extra options in the \verb|axis| environment that change the appearance of the grid, the limits of the plot and the position of the legend. Finally, the \verb|only marks| option in the \verb|\addplot| command produces a scatter plot.

## 2. Linear regression in \LaTeX

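To draw the regression line, pass the \verb|create col/linear regression| key as the \verb|y| column of the \verb|table| option:

    \addplot[options] table[
        x = column_name,
        y = {create col/linear regression = {y = column_name}}
    ] {data_file_name.dat};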
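For reference, here is a sketch of what this key computes, assuming the default unweighted least-squares fit. The symbols $n$ (number of data points), $t_i$ and $r_i$ (the individual measurements) are introduced here for illustration; $a$ and $b$ correspond to the macros \verb|\pgfplotstableregressiona| and \verb|\pgfplotstableregressionb|.

```latex
% Least-squares line r = a t + b through the points (t_i, r_i), i = 1, ..., n
\[
    a = \frac{n \sum_{i} t_i r_i - \sum_{i} t_i \sum_{i} r_i}
             {n \sum_{i} t_i^2 - \left( \sum_{i} t_i \right)^2},
    \qquad
    b = \frac{\sum_{i} r_i - a \sum_{i} t_i}{n}
\]
```

For a particle moving with constant velocity, the slope $a$ estimates the velocity and the intercept $b$ estimates the initial position.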

Applied to our data file, the complete code is:

    \documentclass{standalone}

    \usepackage{tikz}
    \usepackage{pgfplots}
    \usepackage{pgfplotstable}
    \pgfplotsset{compat = newest}

    \begin{document}
    \begin{tikzpicture}
        \begin{axis}[
            xmin = 0, xmax = 11,
            ymin = 0, ymax = 11,
            width = \textwidth,
            height = 0.75\textwidth,
            xtick distance = 1,
            ytick distance = 1,
            grid = both,
            minor tick num = 1,
            major grid style = {lightgray},
            minor grid style = {lightgray!25},
            xlabel = {Time ($t$)},
            ylabel = {Position ($r$)},
            legend cell align = {left},
            legend pos = north west
        ]
            % Scatter plot of the measured data
            \addplot[teal, only marks] table[x = t, y = r] {r_vs_t.dat};

            % Regression line computed from the same file
            \addplot[thick, orange] table[
                x = t,
                y = {create col/linear regression = {y = r}}
            ] {r_vs_t.dat};

            % Legend entries, in the order of the \addplot commands
            \addlegendentry{Data}
            \addlegendentry{
                Linear regression: $ r =
                \pgfmathprintnumber{\pgfplotstableregressiona} \cdot t
                \pgfmathprintnumber[print sign]{\pgfplotstableregressionb} $
            }
        \end{axis}
    \end{tikzpicture}
    \end{document}

## 3. Print the equation of the linear regression
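In the code of the previous section, the legend entry prints the fitted equation through the macros \verb|\pgfplotstableregressiona| (slope) and \verb|\pgfplotstableregressionb| (intercept), formatted with \verb|\pgfmathprintnumber|; the \verb|print sign| option makes the intercept appear with an explicit plus or minus sign. The following is a minimal sketch of that idea on its own. The macro names \verb|\slope| and \verb|\intercept| are hypothetical, chosen here for illustration, and storing the values with \verb|\xdef| is an optional precaution in case a later regression overwrites the original macros.

```latex
\documentclass{standalone}
\usepackage{pgfplots}
\usepackage{pgfplotstable}
\pgfplotsset{compat = newest}

\begin{document}
\begin{tikzpicture}
    \begin{axis}[legend pos = north west]
        % Computing the fit also defines the two regression macros.
        \addplot[thick, orange] table[
            x = t,
            y = {create col/linear regression = {y = r}}
        ] {r_vs_t.dat};

        % Copy the coefficients into global macros (hypothetical names).
        \xdef\slope{\pgfplotstableregressiona}
        \xdef\intercept{\pgfplotstableregressionb}

        % Print r = a * t + b, rounding both numbers to three decimals.
        \addlegendentry{
            $ r = \pgfmathprintnumber[precision = 3]{\slope} \cdot t
            \pgfmathprintnumber[precision = 3, print sign]{\intercept} $
        }
    \end{axis}
\end{tikzpicture}
\end{document}
```

The \verb|precision| key is optional; without it, \verb|\pgfmathprintnumber| falls back to its default rounding.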

