Matrix in MATLAB: Create and Manipulate Matrices

This post discusses matrices in MATLAB. A matrix (a two-dimensional, rectangular arrangement used to store multiple data elements in an easily accessible format) is the most basic data structure in MATLAB. The elements of a matrix can be numbers, characters, logical values (true or false), or other MATLAB data types. MATLAB also supports data structures with more than two dimensions, referred to as multidimensional arrays. MATLAB is a matrix-based computing environment in which all data entered is stored as a matrix or array.

The MATLAB environment uses the term matrix for a variable that contains real or complex numbers. These numbers are arranged in a two-dimensional grid. An array is, more generally, a vector, matrix, or higher-dimensional grid of numbers. All variables in MATLAB are multidimensional arrays, no matter what type of data they store. A matrix is a two-dimensional array often used for linear algebra.

It is assumed in this MATLAB tutorial that you know some of the basics of how to define and manipulate vectors in MATLAB software. We will discuss the following:

  1. Defining Matrix in MATLAB
  2. Matrix Operations in MATLAB
  3. Matrix Functions in MATLAB

Define or Create a Matrix in MATLAB

Defining a matrix in MATLAB is similar to defining a vector in MATLAB. To define a matrix, treat it as a column of row vectors.

>> A=[1 2 3; 4 5 6; 7 8 9]

Note that spaces (or commas) separate the elements within a row, and semicolons separate the rows of matrix A. Square brackets are used to construct matrices. Individual matrix and vector entries are referenced with parentheses. For example, A(2,3) refers to the element in the second row and third column of matrix A, which is 6 here.
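
As a minimal sketch of the indexing rules described above, the following reuses the 3-by-3 matrix from the text. The variable names a23, row2, and col3 are chosen here for illustration only.

```matlab
% Build the 3-by-3 matrix from the text and read individual entries.
A = [1 2 3; 4 5 6; 7 8 9];

a23  = A(2, 3);   % element in row 2, column 3 (equals 6)
row2 = A(2, :);   % the whole second row: [4 5 6]
col3 = A(:, 3);   % the whole third column: [3; 6; 9]

A(2, 3) = 60;     % assignment uses the same parenthesised indexing
```

The colon on its own stands for "all indices along this dimension", which is why A(2, :) returns a full row.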

A matrix in MATLAB is a type of variable used for mathematical and statistical computation. The following are some examples of creating a matrix in MATLAB and extracting its elements.

>> A=rand(6, 6)
>> B=rand(6, 4)
>> A(1:4, 3)    % column vector consisting of the first four entries of the third column of A
>> A(:, 3)      % the third column of A
>> A(1:4, :)    % the first four rows of A

Convenient Matrix-Building Functions

eye –> identity
zeros –> matrix of zeros
ones –> matrix of ones
diag –> create or extract diagonal elements of a matrix
triu –> upper triangular part of a matrix
tril –> lower triangular part of a matrix
rand –> randomly generated matrix
hilb –> Hilbert matrix
magic –> magic square
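
The functions listed above could be tried out as in the following sketch. The sizes and variable names are arbitrary choices made for illustration.

```matlab
I = eye(3);          % 3-by-3 identity matrix
Z = zeros(2, 3);     % 2-by-3 matrix of zeros
O = ones(2, 3);      % 2-by-3 matrix of ones
M = magic(4);        % 4-by-4 magic square (rows and columns share one sum)
d = diag(M);         % given a matrix, diag extracts the main diagonal as a column
D = diag([1 2 3]);   % given a vector, diag builds a diagonal matrix
U = triu(M);         % upper triangular part of M (zeros below the diagonal)
L = tril(M);         % lower triangular part of M (zeros above the diagonal)
H = hilb(3);         % Hilbert matrix with H(i,j) = 1/(i+j-1)
R = rand(3, 2);      % 3-by-2 matrix of uniform random numbers between 0 and 1
```

Note that diag works in both directions: it extracts a diagonal from a matrix and builds a diagonal matrix from a vector.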

Matrix Operations in MATLAB

Many mathematical operations can be applied to matrices and vectors in MATLAB, such as addition, subtraction, multiplication, and division of matrices, etc.

Matrix or Vector Multiplication

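If $x$ and $y$ are both column vectors, then $x'*y$ is their inner (or dot) product, which requires the vectors to have equal length, and $x*y'$ is their outer product. The outer product is a matrix and should not be confused with the cross product of two three-element vectors.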
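
A small sketch of these vector products, using two example vectors chosen here for illustration:

```matlab
x = [1; 2; 3];
y = [4; 5; 6];

inner = x' * y;    % scalar: 1*4 + 2*5 + 3*6 = 32
outer = x * y';    % 3-by-3 matrix with outer(i,j) = x(i)*y(j)
elem  = x .* y;    % element-wise product: [4; 10; 18]
c     = cross(x, y);  % cross product of 3-element vectors: [-3; 6; -3]
```

The cross function is included only to show that the cross product is a three-element vector, while the outer product is a matrix.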

Matrix division

Let $A$ be an invertible square matrix and $b$ be a compatible vector (a column vector in the first case below, a row vector in the second), then

x = A\b is the solution of A * x = b
x = b/A is the solution of x * A = b

These are the backslash (\) and slash (/) operators, also known as mldivide and mrdivide, respectively.
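
As a hedged illustration of the two division operators, the sketch below solves a small system with the backslash operator (A\b) and a row-vector system with the slash operator. The matrix and vectors are made-up examples.

```matlab
A = [2 1; 1 3];      % invertible 2-by-2 matrix
b = [3; 5];          % column vector

x = A \ b;           % left division: solves A*x = b, giving [0.8; 1.4]

c = [3 5];           % row vector
y = c / A;           % right division: solves y*A = c

% Residuals should be zero up to rounding error.
res_left  = norm(A*x - b);
res_right = norm(y*A - c);
```

Using A\b is generally preferred over inv(A)*b because it solves the system directly rather than forming the inverse first.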

Matrix Functions in MATLAB

MATLAB has many functions for computing with matrices, including factorizations, decompositions, and summary quantities. Some important matrix functions in MATLAB are

eig –> eigenvalues and eigenvectors
eigs –> like eig, for large, sparse matrices
chol –> Cholesky factorization
svd –> singular value decomposition
svds –> like svd, for large, sparse matrices
inv –> inverse of matrix
lu –> LU factorization
qr –> QR factorization
hess –> Hessenberg form
schur –> Schur decomposition
rref –> reduced row echelon form
expm –> matrix exponential
sqrtm –> matrix square root
poly –> characteristic polynomial
det –> determinant of matrix
size –> size of an array
length –> length of a vector
rank –> rank of matrix
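
A few of the functions listed above can be exercised as in this sketch. The 2-by-2 matrix is an arbitrary example, picked so that the results are easy to check by hand.

```matlab
A = [4 1; 2 3];

d      = det(A);      % determinant: 4*3 - 1*2 = 10
r      = rank(A);     % rank: 2, since A is invertible
Ainv   = inv(A);      % inverse of A
[V, D] = eig(A);      % columns of V are eigenvectors, diag(D) holds eigenvalues
p      = poly(A);     % characteristic polynomial coefficients: [1 -7 10]
[m, n] = size(A);     % number of rows and columns

lambda = sort(diag(D));   % eigenvalues in ascending order: 2 and 5
```

The eigenvalues are the roots of the characteristic polynomial returned by poly, which gives a quick consistency check between the two functions.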

To learn more about the use of matrices in MATLAB, see the MATLAB Help.

Sufficient Estimators and Sufficient Statistics

Introduction to Sufficient Estimator and Sufficient Statistics

An estimator $\hat{\theta}$ is sufficient if it uses so much of the information in the sample that no other estimator could extract additional information from the sample about the population parameter being estimated.

The sample mean $\overline{X}$ uses all the values in the sample. For many common models, such as a normal population with known variance, it is a sufficient estimator of the population mean $\mu$.

Sufficient estimators are often used to develop the estimator that has minimum variance among all unbiased estimators (MVUE).

If a sufficient estimator exists, no other estimator from the sample can provide additional information about the population being estimated.

If there is a sufficient estimator, then there is no need to consider any of the non-sufficient estimators. A good estimator is a function of sufficient statistics.

Let $X_1, X_2,\cdots, X_n$ be a random sample from a probability distribution with unknown parameter $\theta$. A statistic (estimator) $U=g(X_1, X_2,\cdots, X_n)$ is sufficient for $\theta$ if the conditional distribution of the sample, given the observed value of $U$, does not depend upon the population parameter $\theta$.

Sufficient Statistics Example

The sample mean $\overline{X}$ is sufficient for the population mean $\mu$ of a normal distribution with known variance. Once the sample mean is known, no further information about $\mu$ can be obtained from the sample itself. The median, by contrast, is not sufficient for the mean: even if the median of the sample is known, knowing the sample itself would provide further information about $\mu$.

Mathematical Definition of Sufficiency

Suppose that $X_1,X_2,\cdots,X_n \sim p(x;\theta)$. $T$ is sufficient for $\theta$ if the conditional distribution of $X_1,X_2,\cdots, X_n|T$ does not depend upon $\theta$. Thus
\[p(x_1,x_2,\cdots,x_n|t;\theta)=p(x_1,x_2,\cdots,x_n|t)\]
This means that we can replace $X_1,X_2,\cdots,X_n$ with $T(X_1,X_2,\cdots,X_n)$ without losing information.
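
As a worked illustration that is not part of the original text, assume $X_1,X_2,\cdots,X_n$ are independent Bernoulli($\theta$) variables and take $T=\sum_{i=1}^{n} X_i$. For any sequence $x_1,x_2,\cdots,x_n$ of zeros and ones with $\sum x_i = t$,
\[p(x_1,x_2,\cdots,x_n;\theta)=\theta^{t}(1-\theta)^{n-t}, \qquad P(T=t;\theta)=\binom{n}{t}\theta^{t}(1-\theta)^{n-t}\]
so that
\[p(x_1,x_2,\cdots,x_n|t;\theta)=\frac{\theta^{t}(1-\theta)^{n-t}}{\binom{n}{t}\theta^{t}(1-\theta)^{n-t}}=\frac{1}{\binom{n}{t}}\]
The right-hand side does not involve $\theta$, so under this assumed model $T$ is sufficient for $\theta$.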

For further reading visit: https://en.wikipedia.org/wiki/Sufficient_statistic

Creating Frequency Distribution Table

Using descriptive statistics, we can organize data to reveal its general pattern, see where the values tend to concentrate, and expose extreme or unusual values. Let us start learning about the frequency distribution table and its construction.

Frequency and Frequency Distribution

A frequency distribution is a compact form of data in a table that displays the categories of observations according to their magnitudes and frequencies, such that similar or identical numerical values are grouped. The categories are also known as groups, class intervals, or simply classes. The classes must be mutually exclusive, so that each observation falls into exactly one class. The number of values falling in a particular class is called the frequency of that class, denoted by $f$.

A frequency distribution table shows a summarized grouping of data divided into mutually exclusive classes, together with the number of occurrences in each class. It is a way of converting raw (ungrouped or unorganized) data into grouped or organized data to show results of sales, production, income, loan, death rates, height, weight, temperature, etc.

Relative Frequency

The relative frequency of a category is the proportion of the observed frequency to the total frequency. It is obtained by dividing the observed frequency by the total frequency and is denoted by r.f. The sum of the r.f. column should be one, apart from rounding errors. Multiplying the relative frequency of a class by 100 gives the percentage occurrence of that class. A relative frequency captures the relationship between a class frequency and the total number of observations.
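
The relative frequency calculation can be sketched in MATLAB as follows. The class frequencies are hypothetical values made up for illustration.

```matlab
f = [5 8 12 10 5];     % hypothetical class frequencies (total of 40)

rf  = f / sum(f);      % relative frequency of each class
pct = 100 * rf;        % percentage occurrence of each class

total_rf = sum(rf);    % should equal one, apart from rounding error
```

Here the third class has relative frequency 12/40 = 0.3, or 30 percent of the observations.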

The Frequency Distribution Table may be made for continuous data, discrete data, and categorical data (for both qualitative and quantitative data). It can also be used to draw some graphs such as histograms, line charts, bar charts, pie charts, frequency polygons, Pareto Charts, Scatter diagrams, stem and leaf displays, etc.

Steps of Creating a Frequency Distribution Table

  1. Decide on the number of classes. The number of classes is usually between 5 and 20. Too many or too few classes might not reveal the basic shape of the data set, and such a frequency distribution will be difficult to interpret. An approximate number of classes may be determined by the formula (known as Sturges' rule):
    \[\text{Number of Classes} = C = 1 + 3.3 \log_{10} (n)\]
    \[\text{or} \quad C = \sqrt{n} \quad \text{(approximately)}\]where $n$ is the total number of observations in the data.
  2. Calculate the range of the data ($\text{Range} = \text{Max} - \text{Min}$) by finding the minimum and maximum data values. The range will be used to determine the class interval or class width.
  3. Decide on the width of the class, denoted by $h$ and obtained by
    \[h = \frac{\text{Range}}{\text{Number of Classes}}= \frac{R}{C} \]
    Generally, the class interval or class width is the same for all classes. The classes all taken together must cover at least the distance from the lowest value (minimum) in the data set to the highest (maximum) value. Also note that equal class intervals are preferred in frequency distribution, while unequal class intervals may be necessary in certain situations to avoid a large number of empty or almost empty classes.
  4. Decide the individual class limits and select a suitable starting point for the first class. The starting point is arbitrary; it may be less than or equal to the minimum value. Usually, it is chosen a little below the minimum value in such a way that the midpoint (the average of the lower and upper class limits of the first class) is properly placed.
  5. Take an observation and mark a vertical bar (|) for the class it belongs to. A running tally is kept till the last observation. Tally marks are usually grouped in fives to make counting easier.
  6. Find the frequencies, relative frequencies, cumulative frequencies, etc., as required.
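
The steps above can be sketched in MATLAB roughly as follows. The data set is hypothetical, and starting the first class exactly at the minimum and rounding the class width up are simplifying assumptions made here, not requirements of the method.

```matlab
% Hypothetical data set of n = 20 observations (illustration only).
x = [12 15 17 21 22 23 25 26 28 29 31 32 34 35 37 38 41 44 47 50];
n = numel(x);

C = ceil(1 + 3.322 * log10(n));     % step 1: Sturges' rule, rounded up
R = max(x) - min(x);                % step 2: range of the data
h = ceil(R / C);                    % step 3: class width, rounded up

% Step 4: class limits, with the first class starting at the minimum.
% Each class is the half-open interval [lowerLim, upperLim).
lowerLim = min(x) + (0:C-1) * h;
upperLim = lowerLim + h;

% Steps 5 and 6: count the observations in each class.
% (If C*h happened to equal R exactly, the maximum would need a wider last class.)
f = zeros(1, C);
for k = 1:C
    f(k) = sum(x >= lowerLim(k) & x < upperLim(k));
end

rf = f / n;          % relative frequencies
cf = cumsum(f);      % cumulative frequencies
```

For this data the sketch gives 6 classes of width 7, starting at 12, with frequencies 3, 4, 5, 4, 2, and 2.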

A frequency distribution is said to be skewed when its mean and median are different. The kurtosis of a frequency distribution is the concentration of scores at the mean, or how peaked the distribution appears if depicted graphically, for example, in a histogram. If the distribution is more peaked than the normal distribution, it is said to be leptokurtic; if less peaked, it is said to be platykurtic.
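
One common way to quantify skewness and kurtosis is through central moments, sketched below in MATLAB. The sample is hypothetical, and the moment-based formulas are one convention among several; they are not taken from the text above.

```matlab
x = [2 3 3 4 4 4 5 5 6 12];    % hypothetical sample with a long right tail

m  = mean(x);
m2 = mean((x - m).^2);         % second central moment (population variance)
m3 = mean((x - m).^3);         % third central moment
m4 = mean((x - m).^4);         % fourth central moment

skew = m3 / m2^(3/2);          % positive for a right-skewed sample
kurt = m4 / m2^2;              % equals 3 for a normal distribution

gap = mean(x) - median(x);     % mean exceeds median in this right-skewed sample
```

Under this convention, a kurtosis above 3 indicates a leptokurtic distribution and a value below 3 a platykurtic one.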

Further Reading: Frequency Distribution Table

Frequently Asked Questions

  • What is a frequency distribution table?
  • What is meant by mutually exclusive classes?
  • What is relative frequency?
  • What are the steps used for creating a frequency distribution table?
