Statistics 1



  • Week 1 : July 25th and July 30th


  • Week 2 : August 1st and August 6th
    • Key Features of Numeric Data: Centre, Spread and Shape; Five-number Summary, IQR, Outliers.
    • R: scan function, Mean/trimmed Mean
    • Plotting Histogram (along with options), Stem-Leaf plot.
    • Installing Packages (external) and using Datasets in R
    • Scatter Plot in R.
    • Dice Experiment: Simulated the uniform distribution on {1,2,3,4,5,6}, generated 300 samples of sums of 5 uniform random variables, and plotted the histogram.
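The dice experiment above was carried out in R in class. As a rough illustration only, here is a minimal Python sketch of the same idea; the function names, the fixed seed, and the text-based histogram are assumptions made for this example and are not taken from the course material.

```python
import random
from collections import Counter


def simulate_dice_sums(n_samples=300, n_dice=5, seed=0):
    """Return n_samples sums, each the total of n_dice fair die rolls."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    return [
        sum(rng.randint(1, 6) for _ in range(n_dice))  # one sum of n_dice rolls
        for _ in range(n_samples)
    ]


def text_histogram(values):
    """Return one line per observed value: the value and a bar of '*'."""
    counts = Counter(values)
    return [f"{v:2d} | {'*' * counts[v]}" for v in sorted(counts)]


sums = simulate_dice_sums()
print("\n".join(text_histogram(sums)))
```

With 5 dice per sum the histogram should already look roughly bell-shaped around 17.5, which is the point of the experiment.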


  • Week 3 : August 8th and August 13th
    • Density Plot : how R does density estimates, choice of bandwidths.
    • Box plot along with options.
    • Transformation of data.


  • Week 4 : August 15th (holiday) and August 20th
    • Discussed Study on Maternal Smoking and Infant Mortality.
    • Normal Distribution: calculating pdf, cdf (Tail probabilities) and generating samples in R
    • Checks to see if Data is Normal: 68-95-99.7 rule; Kurtosis and Skewness; Normal Q-Q plot.
    • Understanding probabilistic statements and their implications for an experiment.
  • Reading In Class Worksheet: chapter one from
    Stat Labs: Mathematical Statistics Through Applications by Deborah Nolan and Terry P. Speed [Stat Labs Website] [Stat Labs Data]
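The 68-95-99.7 check listed for Week 4 compares the fraction of data within 1, 2 and 3 standard deviations of the mean against the Normal probabilities. The class worked in R; the following is only a Python sketch of that comparison, with a simulated sample standing in for real data. The helper name `coverage_within_k_sd` and the seed are assumptions for illustration.

```python
import random
from statistics import NormalDist, mean, stdev


def coverage_within_k_sd(data, k):
    """Fraction of observations within k sample standard deviations of the sample mean."""
    m, s = mean(data), stdev(data)
    return sum(abs(x - m) <= k * s for x in data) / len(data)


# Theoretical probabilities P(|Z| <= k) for a standard Normal Z.
std_normal = NormalDist()
theoretical = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

# Simulated Normal sample (assumption: stands in for the data being checked).
rng = random.Random(1)
sample = [rng.gauss(0, 1) for _ in range(10_000)]
observed = {k: coverage_within_k_sd(sample, k) for k in (1, 2, 3)}

for k in (1, 2, 3):
    print(f"within {k} sd: observed {observed[k]:.4f}, Normal {theoretical[k]:.4f}")
```

Observed fractions far from the Normal values suggest the data are not Normal; close agreement is only a rough check and does not replace the Q-Q plot.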


  • Week 5: August 22nd and August 27th


  • Week 6 : August 29th and September 3rd


  • Week 7: September 5th and September 17th

    • In R : Simulating Law of Large Numbers
    • Using optim function in R to calculate absolute deviation line.
    • Working with Temperature Data and Belgium calls data.
    • Comparing with robust linear regression line using rlm function.
    • Simple Linear Regression: Estimator for error variance, Distribution of slope estimator.
  • Sep 17th Lecture by Rajesh Sundaresan
    Slides
    Rajesh Sundaresan is from the ECE department and the Robert Bosch Centre for Cyber-Physical Systems at the Indian Institute of Science. He works in the areas of communication, computation, and control over networks.
    Perceptual Distance and Visual Search: We will discuss a visual neuroscience experiment designed to quantify the similarity between two objects as perceived by human subjects. As an example, a chair with an arm rest and a chair without an arm rest are objects that are more similar to each other than a chair with an arm rest and a table. The quantification involves attaching numbers to the similarities. We will then discuss two models for the similarity and will study how to compare them. In the process we will learn about the gamma distribution, its parameters, an equality of means test for the gamma distribution, a suitable statistic for this test, and how this statistic can be used to compare the two models.


  • Week 8: September 19th and September 24th
    • Sampling distributions : $\chi^2$, $F$, and $t$.
    • Confidence intervals under Normality assumption: Variance known and unknown cases. [Notes]
    • Hypothesis Testing: Null, Alternate, Level of Significance and p-value. [Notes]
    • Comparing means from two populations
    • Testing for group means, Analysis of Variance (introduction)
    • Notes
  • Sep 24th Lecture by Kalyani Ramachandran
    Slides
    Kalyani Ramachandran is an entrepreneur running a startup in diagnostics. She has a PhD in Molecular Biology and Human Genetics from Johns Hopkins University, and has obtained a grant from the Government of India for the startup's R&D.
    • Title: Null hypothesis rejection: experimental data analysis
    • Abstract: Hypothesis testing, specifically rejection of the null hypothesis, is a commonly used method of statistical inference for scientific experimental data. In healthcare, it is used in many applications, including clinical drug trials. A p-value below the chosen significance level gives us confidence to reject the null hypothesis. One should check both the statistical and the experimental assumptions to interpret the data correctly.


  • Week 9 : September 26th and October 1st
    • Testing in R: understanding $t$ distribution versus Normal distribution, prop.test in R, performing z-test in R by explicitly writing the function, and numerically performing $t$ test.
    • One-way Analysis of Variance: testing whether group means are equal for the scaled search times from page 17 of Rajesh Sundaresan's Slides

  • Sep 27th Lecture by Rajeeva L. Karandikar
    Slides
    Rajeeva L. Karandikar is currently the director of Chennai Mathematical Institute (for the last 9 years). He is a probabilist who has been active in applying statistical ideas to real-life problems. In particular, he has been involved with opinion polls for Indian parliamentary and state assembly elections over the last two decades. Earlier, he had a long stint at ISI, as a student (M Stat & Ph D) at ISI Kolkata and as faculty at ISI Delhi for over two decades.
    • Title: What determines accuracy of inference based on Sampling: Sampling fraction or Sample size
    • Abstract: The talk will focus on sampling and inference based on sample data. One important issue to be addressed is how to arrive at a suitable sample size given the objective. Most people think of sample size as a proportion of the population; we will illustrate why that is not correct, and why what is relevant is the sample size and not the sampling fraction.
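The claim in the abstract above can be illustrated with the textbook standard error of a sample proportion under simple random sampling without replacement, sqrt(p(1-p)/n · (N-n)/(N-1)). The sketch below is not from the talk; the population sizes, the sample sizes, and the function name `proportion_standard_error` are assumptions chosen for illustration.

```python
import math


def proportion_standard_error(population_size, sample_size, p=0.5):
    """Standard error of a sample proportion under simple random sampling
    without replacement, including the finite population correction."""
    fpc = (population_size - sample_size) / (population_size - 1)
    return math.sqrt(p * (1 - p) / sample_size * fpc)


# Same sample size (2,000), very different sampling fractions.
se_small_pop = proportion_standard_error(1_000_000, 2_000)    # fraction 0.2%
se_large_pop = proportion_standard_error(100_000_000, 2_000)  # fraction 0.002%

# Same sampling fraction (0.2%), very different sample sizes.
se_frac_small = proportion_standard_error(100_000, 200)
se_frac_large = proportion_standard_error(100_000_000, 200_000)

print(f"n = 2000, N = 1e6:   {se_small_pop:.5f}")
print(f"n = 2000, N = 1e8:   {se_large_pop:.5f}")
print(f"n = 200, N = 1e5:    {se_frac_small:.5f}")
print(f"n = 200000, N = 1e8: {se_frac_large:.5f}")
```

The first two standard errors are nearly identical even though the sampling fractions differ by a factor of 100, while the last two differ greatly even though the sampling fraction is the same. Unless the sample is a sizeable share of the population, accuracy is driven by n.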
  • Oct 1st Lecture by Rituparna Sen
    Slides
    Rituparna Sen obtained her PhD in statistics from the University of Chicago after completing BStat and MStat from ISI. She taught at the University of California, Davis before joining ISI.
    • Title: Stylized facts of the Indian Stock Market
    • Abstract: Stylized facts are properties that are common across various markets and time domains. These properties offer a way to generalize stock price behavior irrespective of the instruments used. Lists of several such stylized facts are available in the literature for the developed western markets. In this talk, we'll present the analysis of historical daily data for eleven years of the fifty constituent stocks of the NIFTY index traded on the National Stock Exchange to check for the stylized facts. It is observed that while some stylized facts of other markets are also true in Indian markets, there are some significant deviations.
    • Related articles:


  • Week 10: October 3rd and October 8th (holiday)
  • Banana Muffin Challenge notes
    • Method of Moments Estimator
    • Maximum Likelihood Estimator
    • Chapter 9, Section 9-9.3
      from Probability and Statistics with Examples using R
      Siva Athreya, Deepayan Sarkar, and Steve Tanner Version: April 25th, 2016



  • Week 11: October 10th and October 15th
    • Erdős-Rényi Graphs.
    • M.L.E. for connection probabilities
  • October 10th Lecture by Dootika Vats
    Slides
    Dootika completed her undergraduate degree in Mathematics at Lady Shri Ram College, Delhi University, a Masters in Statistics at Rutgers University, and a PhD in Statistics at the University of Minnesota, Twin Cities. She then did a two-year postdoc at the University of Warwick in England, before joining the Department of Mathematics and Statistics at IIT Kanpur as an Assistant Professor in July 2019. She works in the area of Bayesian computation, specifically on Markov chain Monte Carlo algorithms.
    We will introduce the concept of Monte Carlo simulations and Monte Carlo integration, both from a statistical perspective and in terms of its utility in practically relevant problems. As with most things in statistics, the driving question in many Monte Carlo integration systems is the problem of quantifying variability. We will discuss the challenges in quantifying variability and constructing confidence intervals for multidimensional Monte Carlo problems, including the problem of multiple comparisons.
    • $\chi^2$-goodness of fit test.
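As a small one-dimensional illustration of the Monte Carlo integration and variability ideas in the October 10th abstract, the sketch below estimates the integral of exp(-x^2) over (0,1) by averaging over Uniform(0,1) draws and attaches a CLT-based 95% confidence interval. It is not taken from the lecture; the integrand, the sample size, the seed, and the function name `monte_carlo_integral` are assumptions for illustration, and the multidimensional and multiple-comparison issues discussed in the talk are not addressed here.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def monte_carlo_integral(f, n_draws=100_000, seed=0, level=0.95):
    """Estimate the integral of f over (0,1) with a CLT-based confidence interval."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    values = [f(rng.random()) for _ in range(n_draws)]
    estimate = mean(values)
    std_error = stdev(values) / math.sqrt(n_draws)  # Monte Carlo standard error
    z = NormalDist().inv_cdf(0.5 + level / 2)       # about 1.96 for 95%
    return estimate, (estimate - z * std_error, estimate + z * std_error)


estimate, (lower, upper) = monte_carlo_integral(lambda x: math.exp(-x * x))
exact = math.sqrt(math.pi) / 2 * math.erf(1)  # closed form, for comparison only

print(f"estimate {estimate:.5f}, 95% CI ({lower:.5f}, {upper:.5f}), exact {exact:.5f}")
```

The width of the interval shrinks at rate 1/sqrt(n_draws), so quadrupling the number of draws roughly halves it.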


  • Week 12 : October 17th and October 22nd
    • $\chi^2$-test for independence.
    • Testing for slope in Simple Linear Regression.
    • Back to Basics: understanding Correlation
    • Linear Regression

  • Week 13: October 24th and October 29th
  • Class Board Photos :
    (October 24th) Set 1 Set 2
  • Class Board Photos :
    (October 29th) Photos
    • October 24th, 2019: Simulating Random Variables from Uniform $(0,1)$
    • October 29th, 2019: Bootstrap and Jackknife methods.
  • October 24th Lecture by Arjun Gopalaswamy
    Slides
    After completing his B.E. in Industrial Engineering and Management from BMS College of Engineering, Bangalore University, he changed tracks and went on to do his Master's in Wildlife Ecology and Conservation at the University of Florida, Gainesville with a Minor in Statistics. He went on to do his D.Phil (PhD) at the University of Oxford. He is currently the Science Advisor, Global Programs, Wildlife Conservation Society (USA) and carries out research projects on iconic wildlife, such as tigers, lions, cheetahs and elephants in Asia and Africa. He specializes in the field of statistical ecology and is involved in developing innovative statistical models to help understand wildlife populations better.
    The fields of ecology and the environmental sciences restrict our use of hypothesis testing for drawing meaningful inference. A primary reason for this is that we rarely conduct "experiments" in these disciplines, because we simply cannot, due to the primary issue of scale. In this lecture, I will talk about how this fundamental issue of scale called for a change in how ecologists practice their craft and how statisticians (and "statistical ecologists") helped them do so. In the process we will discuss how the practice forced a shift from hypothesis testing paradigms to hypothesis discrimination paradigms using model selection, and how hierarchical models replaced conventional models for experiments with real data. As I work on issues related to wildlife, I will draw some examples from wildlife ecology and conservation and will discuss one such likelihood (called occupancy models) that hugely benefited wildlife applications and extended to a range of other disciplines.
  • October 29th Lecture by V. Venugopal
    Slides
    Trained as a civil engineer (specialising in Hydrology), Venu is at the Centre for Atmospheric and Oceanic Sciences, IISc. His present interests are to quantify and understand the space-time characteristics of tropical rain and its extremes.
    We will try to quantify and understand patterns of tropical rain using satellite retrievals. We will begin with the question: At any given instant in time, what fraction of the tropics receives rain? We will then try to build a story around that to identify a few "intuitively appealing" statistical attributes of rain. Time permitting, we will see how the (probability) distribution of rain behaves as we aggregate in space and/or time.

  • Week 14: October 31st and November 5th
    • Using Random Number Table.
    • Simulating Geometric and Binomial data.
    • Worksheet on:
      • $\chi^2$ goodness of fit,
      • Maximum Likelihood Estimator, and
      • Bootstrap

  • Final Exam week



Last Modified: October 14th, 2019.