STA10003 FOUNDATIONS OF STATISTICS

ASSIGNMENT PART 2

This Assignment – Part 2 is worth 20% of your final mark for STA10003.

The Industry Scenario

You are a new graduate researcher at a social science and psychological sciences research institute, and you have been given a dataset to analyse which relates to a study of Californian adults in 2020. The survey, titled the California Health Interview Survey [CHIS], collects extensive information regarding health status, health conditions, health-related behaviours, health insurance coverage and other health-related issues and demographic information. You have been tasked with conducting the initial analysis of some variables using SPSS and writing brief reports.

Data Preparation

For Assignment Part 2 you can use the random sample of 1500 of the 6259 observations you generated from the original data file STA10003 SP1 2022 Assignment Data.sav. You do not need to generate a new random sample. If you have misplaced the random sample drawn for Assignment Part 1, you should draw a new random sample as per the instructions contained in the Assignment Part 1 document.

The data file STA10003 SP1 2022 Assignment Data.sav can be found on Canvas under Assignments > Assignment – Part 1.

Submission Instructions

• Your submission must be a single Word file or PDF file.
• Although a cover page is not required, you should include your name and student number within the document [e.g., in the footer].
• You must submit your file via Canvas by the specified due date and time. Only the last document you submit will be retained by Canvas.
• Once submitted, please review your submission to ensure the correct file has been submitted.
• This is an individual assignment. Do not share your work with other students.
They will have a different random sample of data, so any copying will be detected.

ASSIGNMENT – PART 2

For your Assignment Part 2, you are required to complete the first three [3] questions by producing the appropriate analyses and writing the relevant report for each question. You are also required to complete Question 4, which contains short answer questions.

For each of the first three questions requiring SPSS, include the relevant output – tables and graphs. Note that many of the variables have similar names, so it is important that the correct variable be selected to address the question asked.

Question 1: Cigarette Consumption

The variable Cigarettes gives an indication of the number of cigarettes each Californian adult reported smoking in the previous day. Research indicated that American adults smoke, on average, 15 cigarettes per day. A respiratory researcher has claimed that Californian adults smoke more. Conduct a one-sample t-test using the Cigarettes variable to test this claim.

Produce the relevant output and write a one-sample t-test report based on your output in the style presented in Supplement G: Report writing for Hypothesis Tests. Include the relevant output with your answer.

Question 2a: Psychological Distress

The variable PsychDistressScore gives an indication of psychological distress displayed by each participant. Scores are based on the Kessler K6 screening scale, which asks participants six questions relating to how often they felt nervous, hopeless, restless, depressed, worthless and ‘everything is an effort’ over the previous month. Scores can range between 0 [no distress] and 24 [high distress].

A psychologist predicted that there is a difference in psychological distress score between people aged less than 40 years and people aged 40 or older.
Conduct an independent samples t-test using the PsychDistressScore and AgeGrp variables to test this claim.

Produce the relevant output and write an independent samples t-test report based on your output in the style presented in Supplement H: Report Writing for Independent Samples t-Test. Include the relevant output with your answer.

Question 2b: Assumption Checking for the independent samples t-test

Check and comment on the assumptions of the independent samples t-test produced in Q2a. Include the relevant output with your answer.

Question 3a: Walking for leisure and walking to get somewhere

The variable WalkLeisure gives an indication of the time each participant spent walking for leisure, and the variable WalkSomewhere gives an indication of the time each participant spent walking ‘to get somewhere’ in particular. Each of these variables was measured in minutes spent walking in the previous week.

A health researcher believes that there is a difference, on average, between the time spent walking for leisure and the time spent walking purposefully in order to get somewhere in particular. Conduct a paired samples t-test [related samples t-test] using the WalkLeisure and WalkSomewhere variables to test this prediction.

Produce the relevant output and write a paired samples t-test report based on your output in the style presented in Supplement I (Report Writing for Paired Samples t-Test).

Question 3b: Assumption Checking for the paired samples t-test

Check and comment on the assumptions of the paired samples t-test produced in Q3a. Include the relevant output with your answer.

Question 4: [does not require SPSS]

A dietician wants to investigate ice-cream consumption, as this is her favourite ‘treat’ food. She has recently learnt that Americans consume, on average, 20.8 litres of ice-cream per person per year.
She wondered if Australians' consumption of ice-cream differs from that of Americans, so she accessed the data from an Australian Health Survey collected in 2021 pertaining to consumption of various food groups. A random sample of the data collected provided information pertaining to the ice-cream consumption [litres per year] of Australians.

(a) What type of statistical test would be appropriate to investigate the dietician’s prediction?
(b) What is the population we can draw conclusions about in this study?
(c) The dietician predicted that, on average, the consumption of ice-cream [litres per year] by Australians differs from that consumed by Americans. She conducted the appropriate hypothesis test and obtained a p-value of 0.143. Based on this result, the dietician concluded that Australians consume 20.8 litres of ice-cream per year. Is this conclusion valid or not?
(d) Comment on the validity of this conclusion. Provide justification for your answer.

Prior to submitting your Assignment via Canvas, use the following checklist as a guide to ensure that all of the relevant information is provided.

Q1 – Should include [as appropriate]:
The One-sample t-test output, including any additional output produced to answer the question, and a report of the results of the One-sample t-test following the format of reports in Supplement G.

Q2 – Should include [as appropriate]:
The Independent samples t-test output, including any additional output produced to answer the question, and a report of the results of the Independent samples t-test following the format of reports in Supplement H.

Q3 – Should include [as appropriate]:
The Paired samples t-test output and all other output produced to answer the question, and a report of the results of the Paired samples t-test following the format of reports in Supplement I.

Q4 – The answers should be presented with sections (a), (b), (c), and (d) clearly identified.

Checklist:
• Correct variable used to produce output [note that many of the variables have similar names so it is important to double-check that the correct variable has been used]
• Correct procedure performed
• Correct test values used
• All figures quoted in report correct according to your own output
• Including 95% confidence interval interpretations
• Significance interpreted correctly (i.e. not suggesting that the finding is significant when it is not or vice versa)
• Correctly referring to the sample or population where / when appropriate
• Proofreading of reports for errors

STA10003 Assignment Part 2 Marking Rubric [out of 27]:

Q1: One-sample t-test [5 marks]

0 – Incorrect procedure and/or no report, report covers no relevant/correct features of output.
1 – Correct procedure, but incorrect variable or comparison and/or major errors in report.
2 – Correct output. Report presented following format used in course materials. Report has no major errors, however not all components are included, and report has multiple minor errors.
3 – Correct output. Report presented following format used in course materials. Report has no major errors, however not all components are included, or report has multiple minor errors.
4 – Correct output. Report presented following format used in course materials. Report has only 1-2 minor errors.
5 – Correct output. Report presented following format used in course materials. Report has no errors and is clearly and concisely