Econometrics has been defined as “the application of mathematics and statistical methods to economic data” and described as the branch of economics “that aims to give empirical content to economic relations.” More precisely, it is “the quantitative analysis of actual economic phenomena based on the concurrent development of theory and observation, related by appropriate methods of inference.” An influential introductory economics textbook describes econometrics as allowing economists “to sift through mountains of data to extract simple relationships.” The first known use of the term “econometrics” (in cognate form) was by Paweł Ciompa in 1910. Ragnar Frisch is credited with coining the term in the sense that it is used today.
The main purposes of econometrics are to give empirical content to economic theory by formulating economic models in testable form, to estimate those models, and to test whether they should be accepted or rejected.
For example, consider one of the basic relationships in economics: the relationship between the price of a commodity and the quantities of that commodity that people wish to purchase at each price (the demand relationship). According to economic theory, an increase in the price would lead to a decrease in the quantity demanded, holding other relevant variables constant so as to isolate the relationship of interest. A mathematical equation can be written that describes the relationship between quantity, price, other demand variables like income, and a random term ε to reflect simplification and imprecision of the theoretical model:
Q = β0 + β1·Price + β2·Income + ε.
Regression analysis could be used to estimate the unknown parameters β0, β1, and β2 in the relationship, using data on price, income, and quantity. The model could then be tested for statistical significance as to whether an increase in price is associated with a decrease in the quantity, as hypothesized: β1 < 0.
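As a minimal sketch of this estimation step, the following Python code simulates price, income, and quantity data and recovers the three parameters by ordinary least squares. The "true" coefficient values, the sample size, and the helper names `solve_linear_system` and `ols` are assumptions made for illustration only; they do not come from any real data set or library.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Simulated data: the coefficients below are made up for illustration.
rng = random.Random(0)
TRUE_B0, TRUE_B1, TRUE_B2 = 50.0, -2.0, 0.5
rows, quantity = [], []
for _ in range(2000):
    price = rng.uniform(1.0, 10.0)
    income = rng.uniform(20.0, 80.0)
    eps = rng.gauss(0.0, 1.0)  # random term in the demand equation
    rows.append([1.0, price, income])  # leading 1.0 is the intercept column
    quantity.append(TRUE_B0 + TRUE_B1 * price + TRUE_B2 * income + eps)

b0_hat, b1_hat, b2_hat = ols(rows, quantity)
print(f"b0={b0_hat:.2f}, b1={b1_hat:.2f}, b2={b2_hat:.2f}")
```

With simulated data of this kind the estimate of β1 should come out negative and close to the assumed value, which is the sign restriction the hypothesis β1 < 0 refers to. A formal test would also require a standard error for the estimate, which this sketch omits.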
There are complications even in this simple example. It is easy to mistake statistical significance for economic significance, yet statistical significance is neither necessary nor sufficient for economic significance. Moreover, in order to estimate the theoretical demand relationship, the observations in the data set must be price and quantity pairs collected along a stable demand schedule. If that condition is not satisfied, a more sophisticated model or econometric method may be necessary to derive reliable estimates and tests.
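The gap between statistical and economic significance can be illustrated with a small simulation. The sketch below assumes a very small true slope (0.02, an arbitrary illustrative value) and a very large sample. The estimated slope then has a large t-statistic even though its magnitude is negligible relative to the noise in the outcome. The helper name `simple_ols` is hypothetical.

```python
import math
import random


def simple_ols(x, y):
    """Return slope, standard error, and t-statistic for y = a + b*x + e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se, slope / se


rng = random.Random(1)
n = 200_000
TRUE_SLOPE = 0.02  # assumed tiny effect, chosen only for illustration
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [TRUE_SLOPE * xi + rng.gauss(0.0, 1.0) for xi in x]

slope, se, t_stat = simple_ols(x, y)
print(f"slope={slope:.4f}, se={se:.4f}, t={t_stat:.1f}")
```

Whether an effect of this size matters is a question about economic magnitudes, which the t-statistic alone cannot answer.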
Theoretical econometrics examines the statistical properties of econometric procedures. Such properties include the power of hypothesis tests and efficiency of estimators and of survey-sampling methods. Applied econometrics uses theoretical econometrics and real-world data for assessing economic theories, developing econometric models, analyzing economic history, and forecasting.
Econometrics may use standard statistical models to study economic questions, but most often it applies them to observational data rather than to data from controlled experiments. In this respect, the design of observational studies in econometrics is similar to the design of studies in other observational disciplines, such as astronomy, epidemiology, sociology and political science. Analysis of data from an observational study is guided by the study protocol, although exploratory data analysis may be useful for generating new hypotheses.
Economics often analyzes systems of equations and inequalities, such as supply and demand hypothesized to be in equilibrium. Consequently, the field of econometrics has developed methods for identification and estimation of simultaneous-equation models. These methods are analogous to methods used in other areas of science, such as the field of system identification in systems analysis and control theory. Such methods may allow researchers to estimate models and investigate their empirical consequences, without directly manipulating the system.
In recent decades, econometricians have increasingly turned to the use of experiments to evaluate the often-contradictory conclusions of observational studies. Here, controlled and randomized experiments provide statistical inferences that may yield better empirical performance than do purely observational studies.
One of the fundamental statistical methods used by econometricians is regression analysis. For an overview of a linear implementation of this framework, see linear regression. Regression methods are important in econometrics because economists typically cannot use controlled experiments. Econometricians often seek illuminating natural experiments in the absence of evidence from controlled experiments. Observational data may be subject to omitted-variable bias and a list of other problems that must be addressed using causal analysis of simultaneous-equation models.
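Omitted-variable bias can be made concrete with a short simulation. In the sketch below, the outcome y depends on an observed regressor x and on a second variable z that is correlated with x. All coefficient values are assumptions chosen for illustration. Regressing y on x alone attributes part of the effect of z to x, while including z recovers the assumed coefficient.

```python
import random


def cov(a, b):
    """Sample covariance (divides by n; the scaling cancels in the ratios below)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / n


rng = random.Random(2)
n = 20_000
B_X, B_Z = 1.0, 2.0  # assumed true coefficients on x and z
z = [rng.gauss(0.0, 1.0) for _ in range(n)]
x = [0.8 * zi + rng.gauss(0.0, 1.0) for zi in z]  # x is correlated with z
y = [B_X * xi + B_Z * zi + rng.gauss(0.0, 1.0) for xi, zi in zip(x, z)]

# Short regression: y on x only, with z omitted.
short_slope = cov(x, y) / cov(x, x)

# Long regression: y on x and z, using the closed form for two regressors.
sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
sxy, szy = cov(x, y), cov(z, y)
long_slope = (szz * sxy - sxz * szy) / (sxx * szz - sxz ** 2)

# Standard omitted-variable formula: bias = B_Z * cov(x, z) / var(x).
predicted_bias = B_Z * sxz / sxx

print(f"short={short_slope:.3f}, long={long_slope:.3f}, "
      f"predicted bias={predicted_bias:.3f}")
```

In observational data the omitted variable is typically unobserved, so the long regression is not available. That is one reason natural experiments and the methods for simultaneous-equation models are used.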
Data sets to which econometric analyses are applied can be classified as time-series data, cross-sectional data, panel data, and multidimensional panel data. Time-series data sets contain observations over time; for example, inflation over the course of several years. Cross-sectional data sets contain observations at a single point in time; for example, many individuals’ incomes in a given year. Panel data sets contain both time-series and cross-sectional observations. Multidimensional panel data sets contain observations across time, cross-sectionally, and across some third dimension. For example, the Survey of Professional Forecasters contains forecasts for many forecasters (cross-sectional observations), at many points in time (time-series observations), and at multiple forecast horizons (a third dimension).
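One way to picture the four kinds of data set is by the keys needed to index an observation. The sketch below uses plain Python dictionaries. Every number in it is a made-up placeholder rather than real data, and the unit names and helper functions are hypothetical.

```python
# Every number below is a made-up placeholder, not real data.

# Time series: one unit observed over several periods (inflation rate by year).
time_series = {2019: 1.8, 2020: 1.2, 2021: 4.7}

# Cross section: many units observed in a single period (income by person).
cross_section = {"person_a": 41_000, "person_b": 58_500, "person_c": 36_200}

# Panel: each key combines a unit and a period.
panel = {
    ("person_a", 2020): 40_000, ("person_a", 2021): 41_000,
    ("person_b", 2020): 57_000, ("person_b", 2021): 58_500,
}

# Multidimensional panel: (forecaster, survey date, forecast horizon in quarters).
multi_panel = {
    ("forecaster_1", "2021Q1", 1): 2.0, ("forecaster_1", "2021Q1", 4): 2.3,
    ("forecaster_2", "2021Q1", 1): 1.9, ("forecaster_2", "2021Q1", 4): 2.6,
}


def cross_section_of(panel_data, period):
    """Fix the period of a panel to obtain a cross section."""
    return {unit: v for (unit, t), v in panel_data.items() if t == period}


def time_series_of(panel_data, unit):
    """Fix the unit of a panel to obtain a time series."""
    return {t: v for (u, t), v in panel_data.items() if u == unit}


incomes_2021 = cross_section_of(panel, 2021)
person_a_history = time_series_of(panel, "person_a")
print(incomes_2021)
print(person_a_history)
```

Fixing one key of a panel yields either a cross section or a time series, which is the sense in which panel data contain both kinds of observation.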
Econometric analysis may also be classified on the basis of the number of relationships modeled. Single-equation methods model a single variable (the dependent variable) as a function of one or more explanatory (or independent) variables. In many econometric contexts, the commonly used ordinary least squares method may not recover the theoretical relation desired or may produce estimates with poor statistical properties, because the assumptions for valid use of the method are violated. One widely used remedy is the method of instrumental variables (IV). For an economic model described by more than one equation, simultaneous-equation methods may be used to remedy similar problems, including two IV variants: two-stage least squares (2SLS) and three-stage least squares (3SLS).
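The following sketch illustrates why ordinary least squares can fail in a simultaneous-equation setting and how two-stage least squares can help. It simulates a supply and demand system in equilibrium, with a hypothetical supply shifter `z` that affects supply but not demand and therefore serves as an instrument for price. The intercepts, slopes, and error variances are all assumptions chosen for illustration.

```python
import random


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(3)
n = 20_000
DEMAND_SLOPE, SUPPLY_SLOPE = -1.0, 1.0  # assumed true slopes
price, quantity, shifter = [], [], []
for _ in range(n):
    z = rng.gauss(0.0, 1.0)    # supply shifter (the instrument)
    u_d = rng.gauss(0.0, 1.0)  # demand shock
    u_s = rng.gauss(0.0, 1.0)  # supply shock
    # Demand: q = 10 + DEMAND_SLOPE * p + u_d
    # Supply: q = 2 + SUPPLY_SLOPE * p + z + u_s
    # Setting demand equal to supply and solving for the equilibrium price:
    p = (10 - 2 + u_d - z - u_s) / (SUPPLY_SLOPE - DEMAND_SLOPE)
    q = 10 + DEMAND_SLOPE * p + u_d
    price.append(p)
    quantity.append(q)
    shifter.append(z)

# Naive OLS of quantity on price mixes movements along both curves.
_, ols_slope = fit_line(price, quantity)

# Stage 1: regress price on the instrument and keep the fitted values.
a1, b1 = fit_line(shifter, price)
price_hat = [a1 + b1 * z for z in shifter]

# Stage 2: regress quantity on the fitted price.
_, tsls_slope = fit_line(price_hat, quantity)

print(f"OLS slope={ols_slope:.3f}, 2SLS slope={tsls_slope:.3f}")
```

In this setup the demand shock moves both price and quantity, so the OLS slope is pulled away from the assumed demand slope, while the 2SLS slope uses only the price variation induced by the supply shifter. The second-stage standard errors from a plain regression on fitted values are not valid and are not computed here.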
Other important unifying or distinguishing methods include the Method of Moments, Generalized Method of Moments (GMM), time series analysis, and Bayesian methods.
Computational concerns are important for evaluating econometric methods and for use in decision making. Such concerns include mathematical well-posedness: the existence, uniqueness, and stability of any solutions to econometric equations. Another concern is the numerical efficiency and accuracy of software. A third concern is the usability of econometric software.