Statistics is the collection and analysis of data drawn from a sample group to study and interpret their applicability to the larger population.
What Is Statistics?
Statistics is a branch of applied mathematics that involves the collection, description, analysis, and interpretation of data drawn from a sample of a larger population. Statistical sampling is used in medicine, finance, marketing, and many other fields to increase understanding and inform decision-making.
The mathematical theories behind statistics rely heavily on differential and integral calculus, linear algebra, and probability theory.
Key Takeaways
- Statistics involves calculating mathematical probabilities based on data collected from a sample group.
- The two major areas of statistics are descriptive and inferential.
- The work of statisticians is used in virtually all scientific disciplines as well as in finance, medicine, the humanities, government, and manufacturing.
Understanding Statistics
Statistics are used in virtually all scientific disciplines such as the physical and social sciences as well as in business, medicine, the humanities, government, and manufacturing. Statistics is a branch of applied mathematics that draws on calculus and linear algebra and that developed from the application of mathematical tools to probability theory.
The core idea is that we can learn about the properties of large sets of objects or events (a population) by studying the characteristics of a smaller number of similar objects or events (a sample). Gathering comprehensive data about an entire population is too costly, difficult, or impossible in many cases, so statistics starts with a sample that can be conveniently or affordably observed.
Statisticians measure and gather data about the individuals or elements of a sample and analyze this data to generate descriptive statistics. They can then use these observed characteristics of the sample data to make inferences or educated guesses about the unmeasured characteristics of the broader population. These population characteristics are known as parameters.
Fast Fact
Statistics dates back centuries. An early record of correspondence between French mathematicians Pierre de Fermat and Blaise Pascal in 1654 is often cited as an early example of statistical probability analysis.
Descriptive and Inferential Statistics
The two major areas of statistics are descriptive statistics and inferential statistics. Descriptive statistics describes the properties of sample and population data. Inferential statistics uses those properties to test hypotheses and draw conclusions.
Descriptive statistics include mean or average, variance, skewness, and kurtosis. Inferential statistics include linear regression analysis, analysis of variance or ANOVA, logit/Probit models, and null hypothesis testing.
Descriptive Statistics
Descriptive statistics focus mostly on the central tendency, variability, and distribution of sample data. Central tendency refers to an estimate of the typical value of a characteristic in a sample or population. It includes descriptive statistics such as mean, median, and mode.
Variability refers to a set of statistics that show how much difference there is among the elements of a sample or population along the characteristics measured. It includes metrics such as range, variance, and standard deviation.
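As a rough illustration, the variability measures named above can be computed with Python's standard `statistics` module. The data set is made up for the example and isn't drawn from any real sample.

```python
import statistics

# Hypothetical observations (illustrative only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: distance between the largest and smallest observation.
data_range = max(data) - min(data)                    # 7

# Treating the data as a whole population (divide by n).
population_variance = statistics.pvariance(data)      # 4
population_std_dev = statistics.pstdev(data)          # 2.0

# Treating the data as a sample of a larger population (divide by n - 1).
sample_variance = statistics.variance(data)
sample_std_dev = statistics.stdev(data)

print(data_range, population_variance, population_std_dev)
print(round(sample_variance, 3), round(sample_std_dev, 3))
```

The sample versions divide by n - 1 rather than n, so they come out slightly larger than the population versions for the same numbers.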
The distribution refers to the overall “shape” of the data. This can be depicted on a chart such as a histogram or a dot plot and includes properties such as the probability distribution function, skewness, and kurtosis.
Descriptive statistics can also describe differences between observed characteristics of the elements of a data set. They can help us understand the collective properties of the elements of a data sample and form the basis for testing hypotheses and making predictions using inferential statistics.
Inferential Statistics
Inferential statistics is a tool used by statisticians to draw conclusions about the characteristics of a population from the characteristics of a sample. It's also used to determine how certain the statistician can be of the reliability of those conclusions. Based on sample size and distribution, statisticians can calculate the probability that statistics will provide an accurate picture of the corresponding parameters of the whole population from which the sample is drawn.
Inferential statistics are used to make generalizations about large groups such as estimating average demand for a product by surveying the buying habits of a sample of consumers or attempting to predict future events. This might mean projecting the future return of a security or an asset class based on returns in a sample period.
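One common way to express how reliable such a generalization is would be a confidence interval around the sample mean. The sketch below is only an illustration: the purchase counts are hypothetical, and it uses a normal approximation for simplicity, whereas a t-distribution is usually preferred for a sample this small.

```python
import math
import statistics

# Hypothetical survey: monthly purchases reported by 10 sampled consumers.
purchases = [3, 5, 4, 6, 2, 5, 4, 3, 6, 2]

sample_mean = statistics.mean(purchases)

# Standard error of the mean: sample standard deviation / sqrt(sample size).
standard_error = statistics.stdev(purchases) / math.sqrt(len(purchases))

# z-value for a two-sided 95% interval under the normal approximation (assumption).
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * standard_error

lower, upper = sample_mean - margin, sample_mean + margin
print(f"mean = {sample_mean}, 95% interval = ({lower:.2f}, {upper:.2f})")
```

A larger sample shrinks the standard error, which narrows the interval around the estimate.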
Regression analysis is a widely used technique of statistical inference. It's used to determine the strength and nature of the relationship between a dependent variable and one or more explanatory or independent variables. The output of a regression model is often analyzed for statistical significance. A statistically significant result from testing or experimentation is one that isn't likely to have occurred randomly or by chance.
Statistical significance suggests that the results are attributable to a specific cause explained by the data.
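To make the idea of regression concrete, here is a minimal sketch of a one-variable least-squares fit written in plain Python. The function name `fit_line` and the data are hypothetical, and the sketch reports only the slope, intercept, and R-squared. It doesn't test for statistical significance, which would need additional steps.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared and cross deviations from the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Share of the variation in y explained by the fitted line.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data: x is the independent variable, y the dependent variable.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept, r_squared = fit_line(x, y)
print(round(slope, 2), round(intercept, 2), round(r_squared, 2))
```

The slope describes the nature of the relationship (how much y changes per unit of x) and R-squared gives one measure of its strength.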
Important
Having statistical significance is important for academic disciplines or practitioners who rely heavily on analyzing data and research.
Mean, Median, and Mode
The terms “mean,” “median,” and “mode” fall under the umbrella of central tendency. They describe an element that’s typical in a given sample group. You can find the mean by adding the numbers in the group and dividing the result by the number of observations in the data set.
The middle number in an ordered set is the median. Half of all included numbers are higher than the median and half are lower. The median home value in a neighborhood would be $350,000 if five homes were located there and valued at $500,000, $400,000, $350,000, $325,000, and $300,000. Two values are higher and two are lower.
The mode is the value that appears most frequently in the data set. It can fall anywhere between the lowest and highest values.
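The three measures can be checked with Python's standard `statistics` module. The home values come from the example above. The shoe sizes are a hypothetical data set added here only to show a mode, since no home value repeats.

```python
import statistics

# Home values from the neighborhood example above.
home_values = [500_000, 400_000, 350_000, 325_000, 300_000]

mean_value = statistics.mean(home_values)      # 375000
median_value = statistics.median(home_values)  # 350000

# Hypothetical data with a repeated value, used to illustrate the mode.
shoe_sizes = [8, 9, 9, 10, 11, 9, 10]
mode_size = statistics.mode(shoe_sizes)        # 9

print(mean_value, median_value, mode_size)
```

The mean ($375,000) sits above the median ($350,000) here because the $500,000 home pulls the average upward.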
Understanding Statistical Data
Statistics is rooted in variables. A variable is a characteristic or attribute of an item that can be counted or measured. A car can have variables such as make, model, year, mileage, color, or condition. Statistics allows us to better understand trends and outcomes by combining the variables across a set of data such as the colors of all cars in a parking lot.
There are two types of variables.
Qualitative Variables
Qualitative variables are specific attributes that are often non-numeric. Examples of qualitative variables in statistics include gender, eye color, or city of birth. Qualitative data is most often used to determine what percentage of an outcome occurs for any given qualitative variable. Qualitative analysis often doesn't rely on numbers. Determining what percentage of women own a business, for example, is an analysis of qualitative data.
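A percentage breakdown of a qualitative variable can be sketched with `collections.Counter` from Python's standard library. The eye colors below are invented for the example.

```python
from collections import Counter

# Hypothetical observations of one qualitative variable.
eye_colors = ["brown", "blue", "brown", "green", "brown",
              "blue", "brown", "hazel", "brown", "blue"]

# Count how often each category occurs, then convert counts to percentages.
counts = Counter(eye_colors)
percentages = {color: 100 * n / len(eye_colors) for color, n in counts.items()}

for color, pct in percentages.items():
    print(f"{color}: {pct:.0f}%")
```

The categories themselves stay non-numeric. Only their frequencies are turned into numbers.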
Quantitative Variables
The second type of variable in statistics is quantitative variables. These are studied numerically and only carry meaning when they’re tied to a non-numerical descriptor. This information is rooted in numbers. The mileage a car is driven is a quantitative variable but the number 60,000 holds no value unless it's understood that it's the total number of miles driven.
Quantitative variables can be further broken into two categories. Discrete variables can take only certain values, which implies that there are gaps between the possible values. The number of points scored in a football game is a discrete variable because there can be no decimals and a team can't score only one point.
Statistics also makes use of continuous quantitative variables. These values run along a scale. Discrete values have limitations but continuous variables are often measured into decimals. Any value within possible limits can be obtained when measuring the height of the football players and the heights can be measured down to 1/16 of an inch if not further.
Statistical Levels of Measurement
There are several resulting levels of measurement after analyzing variables and outcomes. Statistics can quantify outcomes in four ways.
Nominal-level Measurement
There’s no numerical or quantitative value in this measurement and qualities aren't ranked. Nominal-level measurements are instead simply labels or categories assigned to other variables. It’s easiest to think of nominal-level measurements as non-numerical facts about a variable.
Example: The name of the U.S. president elected in 2020 was Joseph Robinette Biden Jr.
Ordinal-level Measurement
Outcomes can be arranged in an order using this measurement but all data values have the same value or weight. They’re numerical but ordinal-level measurements can’t be subtracted from each other in statistics because only the position of the data point matters. Ordinal levels are often incorporated into nonparametric statistics and compared against the total variable group.
Example: American Fred Kerley was the second-fastest man at the 2020 Tokyo Olympics based on 100-meter sprint times.
Interval-level Measurement
Outcomes can be arranged in order in this measurement but differences between data values may now have meaning. Two data points are often used to compare the passing of time or changing conditions within a data set. There's often no “starting point” for the range of data values. Calendar dates or temperatures may not have a meaningful intrinsic zero value.
Example: Inflation hit 8.6% in May 2022. The last time inflation was this high was in December 1981.
Ratio-level Measurement
Outcomes can be arranged in order with this measurement and differences between data values now have meaning. Unlike interval-level measurement, however, there’s a starting point or “zero value” that gives each data value additional meaning. The ratio between data values has meaning including its distance away from zero.
Example: The lowest meteorological temperature recorded was -128.6 degrees Fahrenheit in Antarctica in 1983.
Statistics Sampling Techniques
It's often not possible to access data from every data point within a population to gather statistical information. Statistics relies instead on different sampling techniques to create a representative subset of the population that’s easier to analyze. There are several primary types of sampling in statistics.
Simple Random Sampling
Simple random sampling calls for every member within the population to have an equal chance of being selected for analysis. The entire population is used as the basis for sampling and any random generator based on chance can select the sample items. Maybe 100 individuals are lined up and 10 are chosen at random.
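The 10-out-of-100 example can be sketched with Python's standard `random` module. The numbering of individuals from 1 to 100 and the fixed seed are assumptions made so the illustration is repeatable.

```python
import random

# Assume the 100 individuals are numbered 1 through 100.
population = list(range(1, 101))

# A seeded generator keeps the illustration repeatable (the seed value is arbitrary).
rng = random.Random(42)

# Draw 10 individuals without replacement; each has an equal chance of selection.
simple_random_sample = rng.sample(population, 10)
print(sorted(simple_random_sample))
```

Dropping the seed would give a different sample on every run, which is what you'd want in practice.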
Systematic Sampling
Systematic sampling calls for a random sample as well but its technique is slightly modified to make it easier to conduct.
A single random number is generated to determine the starting point and individuals are then selected at a specified regular interval until the sample size is complete. If 100 individuals are lined up and numbered and the random starting point is the seventh individual, every subsequent ninth individual is selected until 10 sample items have been selected. It would look like this: 7th, 16th, 25th, and so on.
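The same example can be written as a short Python sketch. The function name `systematic_sample` is hypothetical, and the starting point is fixed at 7 to match the text. In practice it would be drawn at random.

```python
def systematic_sample(population, start, interval, size):
    """Pick every `interval`-th member beginning at 1-based position `start`."""
    return population[start - 1::interval][:size]


# Assume the 100 individuals are numbered 1 through 100.
population = list(range(1, 101))

# Start at the 7th individual and take every 9th one until 10 are selected.
sample = systematic_sample(population, start=7, interval=9, size=10)
print(sample)  # [7, 16, 25, 34, 43, 52, 61, 70, 79, 88]
```

A random start could be generated with something like `random.randint(1, 9)` so that every individual still has a chance of being picked.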
Stratified Sampling
Stratified sampling calls for more control over your sample. The population is divided into subgroups based on similar characteristics. You would then calculate how many people from each subgroup would represent the entire population. Maybe 100 individuals are grouped by gender and race. A sample from each subgroup is then taken in proportion to how representative that subgroup is of the population.
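A proportional stratified sample might be sketched as follows. The subgroup labels, their sizes, and the function name `stratified_sample` are all hypothetical, and the simple rounding used here works only because the proportions divide evenly.

```python
import random


def stratified_sample(strata, sample_size, rng):
    """Sample each subgroup in proportion to its share of the population."""
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding may need adjusting for uneven splits.
        k = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population of 100 split into three subgroups of 60, 30, and 10.
strata = {
    "A": list(range(1, 61)),
    "B": list(range(61, 91)),
    "C": list(range(91, 101)),
}

sample = stratified_sample(strata, sample_size=10, rng=random.Random(0))
print({name: len(members) for name, members in sample.items()})
```

With a sample of 10, the subgroups contribute 6, 3, and 1 members, mirroring their 60%, 30%, and 10% shares of the population.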
Cluster Sampling
Cluster sampling calls for subgroups as well but each subgroup should be representative of the population. The entire subgroup is randomly selected instead of randomly selecting individuals within a subgroup.
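The difference from stratified sampling shows up clearly in code: whole subgroups are drawn at random and every member of a chosen subgroup is included. The clusters below (10 blocks of 5 households each) are hypothetical.

```python
import random

# Hypothetical population: 10 city blocks (clusters) with 5 households each.
clusters = {
    f"block_{i}": [f"household_{i}_{j}" for j in range(1, 6)]
    for i in range(1, 11)
}

# Seeded generator so the illustration is repeatable (the seed value is arbitrary).
rng = random.Random(0)

# Randomly select 2 whole clusters rather than individual households.
chosen_clusters = rng.sample(sorted(clusters), 2)

# Every member of each selected cluster enters the sample.
cluster_sample = [member for name in chosen_clusters for member in clusters[name]]
print(chosen_clusters, len(cluster_sample))
```

This only gives a fair picture of the population if each cluster resembles the population as a whole.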
Fast Fact
Not sure which Major League Baseball player should have won Most Valuable Player last year? Statistics is often used to determine value and is frequently cited when the award for the best player is announced. Statistics can include batting average, number of home runs hit, and stolen bases.
Uses of Statistics
Statistics is prominent in finance, investing, business, and a wide scope of sectors. Much of the information you see and the data you’re given is derived from statistics used in all facets of a business.
- Statistics in investing include average trading volume, 52-week low, 52-week high, beta, and correlation between asset classes or securities.
- Statistics in economics include gross domestic product (GDP), unemployment, consumer pricing, inflation, and other economic growth metrics.
- Statistics in marketing include conversion rates, click-through rates, search quantities, and social media metrics.
- Statistics in accounting include liquidity, solvency, and profitability metrics across time.
- Statistics in information technology include bandwidth, network capabilities, and hardware logistics.
- Statistics in human resources include employee turnover, employee satisfaction, and average compensation relative to the market.
Why Is Statistics Important?
Statistics is used to conduct research, evaluate outcomes, develop critical thinking, and make informed decisions about a set of data. Statistics can be used to inquire about almost any field of study to investigate why things happen, when they occur, and whether reoccurrence is predictable.
What’s the Difference Between Descriptive and Inferential Statistics?
Descriptive statistics are used to describe or summarize the characteristics of a sample or data set such as a variable’s mean, standard deviation, or frequency. Inferential statistics employ any number of techniques to relate variables in a data set to each other. An example would be using correlation or regression analysis. These can then be used to estimate forecasts or infer causality.
Who Uses Statistics?
Statistics are used whenever data are collected and analyzed and are used widely across an array of applications and professions. These include government agencies, academic research, and investment analysis.
How Are Statistics Used in Economics and Finance?
Economists collect and look at all sorts of data ranging from consumer spending and housing starts to inflation and GDP growth. Analysts and investors collect data about companies, industries, sentiment, and market data on price and volume. The use of inferential statistics in these fields is known as econometrics.
Several important financial models including the capital asset pricing model (CAPM), modern portfolio theory (MPT), and the Black-Scholes options pricing model rely on statistical inference.
The Bottom Line
Statistics is the practice of analyzing data and drawing inferences from the sample results. Statistics is used across a variety of fields from governmental agencies to finance to gather conclusions about a given data set.
The study of statistics can lead to a career as a statistician but it can also be a handy metric in everyday life. Statistics can be used to gain insights on probable outcomes of objects or events whether you’re analyzing the odds that your favorite team will win the Super Bowl before you place a bet, gauging the viability of an investment, or determining whether you’re being comparatively overcharged for a product or service.