Analysis Of Australian Income Tax Lodgement Process
Dataset and Analysis
After the end of each financial year, Australians lodge an income tax return, mainly in one of two ways: as self-preparers or through registered tax agents. This research report briefly discusses and analyses several features of the lodgement processes used to lodge returns with the Australian Taxation Office (ATO).
The report analyses a dataset that includes a few numerical and categorical variables: Gender, range of age, Lodgement process, Total income amount and Total deduction amount. The proportion of Australians who lodge a tax return with the help of a tax agent is of primary interest. The variability among age groups with respect to lodgement method is also examined. The relation between “total income” and “lodgement method” is investigated. Finally, the report examines whether there is a relationship between total income and deduction amount.
The dataset was gathered from internet resources and is therefore secondary in nature. It contains a total of 1000 observations. Gender and Lodgement process are categorical (nominal) variables; range of age, total income amount and total deduction amount are treated as quantitative variables.
The online software “StatKey” is used for the analysis. To describe the five variables, the modules “One quantitative variable”, “One categorical variable”, “One quantitative variable and one categorical variable”, “Two Categorical Variables” and “Two Quantitative Variables” are used. Then randomisation hypothesis tests, namely “Test for Single Mean”, “Test for Single Proportion”, “Test for Difference in Means” and “Test for Difference in Proportions”, are carried out. Lastly, the more advanced randomisation tests “ANOVA for Difference in Means” and “ANOVA for Regression” are run for the reflection part.
In addition, “MS Excel” is used for the analysis. First, the “Analysis ToolPak” add-in, which appears as “Data Analysis” in MS Excel, is installed for advanced analysis. Bar charts, box plots, linear regression analysis and hypothesis tests are then produced in Excel with the help of this tool.
The graph provides the frequency distribution of the two genders. Note that “1” indicates “Female” and “2” indicates “Male”. The heights of the bars are proportional to the frequencies of the two genders.
Out of 1000 observations, the number of “Female” (1) is 474, a proportion of 0.474, whereas the number of “Male” (0) is 526, a proportion of 0.526.
The pie chart shows the distribution of frequencies of the two genders.
Gender and Lodgement Method
The graph of range of age for the 1000 observations shows that the age-group codes run from 0 to 11. Age group “9” has the highest count (> 120), whereas age group “11” has the lowest count (< 40).
The histogram of range of age shows the distribution over all 1000 observations. The most frequent interval is 10 to 11, and the least frequent interval is 1 to 2.
The box plot displays the “five-number summary” of range of age, showing the location and spread of the distribution across all observations.
The descriptive statistics table of range of age for the 1000 observations shows that age group “9” has the highest count (125) and age group “11” has the lowest count (38). The mean of range of age is 5.859 and the standard deviation is 3.127 (Holcomb 2016). The lowest and highest values of range of age are 0 and 11. The 1st quartile, 2nd quartile (median) and 3rd quartile of the distribution are 3.5, 6 and 9 respectively.
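The five-number summary described above can be computed with a few lines of Python. The sketch below is only an illustration: the list `example_codes` is hypothetical and is not the report's data, and the "inclusive" quartile rule is an assumption, since StatKey and Excel may use slightly different quartile conventions.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # method="inclusive" interpolates between data points, treating the
    # list as the whole population; other tools may use other rules.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

# Hypothetical age-group codes, NOT taken from the report's dataset.
example_codes = [1, 3, 4, 5, 7, 8, 10]

summary = five_number_summary(example_codes)
print(summary)  # (1, 3.5, 5.0, 7.5, 10)
```

Applied to the actual column of age-group codes, the same function would return the values that a box plot draws as its whiskers, box edges and median line.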
The graph of Lodgement process shows the two kinds of tax-lodging processes: “self-prepare” and “agents”. The number of people who lodge through an agent is considerably larger than the number who self-prepare.
Out of all sampled observations, 246 people (proportion = 0.246) lodge their tax by self-prepare and 754 people (proportion = 0.754) lodge their tax through agents.
The graph shows the frequency distribution of the two types of lodgement process.
The graph of total income amount for all observations shows that almost all of the data are concentrated in the range from $0 to $200000. People generally do not have a total income amount greater than $300000, and observations above $300000 can be regarded as prominent outliers. Note that the distribution of total income amount is highly positively (right) skewed, with a long tail towards high incomes. Most people in the sample therefore have comparatively modest incomes.
The histogram of total income amount indicates that almost all people earn between $0 and $100000 per month. A few people earn between $500000 and $700000.
The box plot shows the location measures and distribution of total income amount. All three quartiles of total income amount are below $100000. Many outliers are detected in this box plot.
Range of Age and Lodgement Method
The descriptive statistics for all observations show that the mean total income amount is $58782.986 and the standard deviation is $62863.622. The lowest total income amount is -$150 and the highest is $869090. The 1st quartile, 2nd quartile (median) and 3rd quartile of total income amount are $24538.5, $46761 and $72206 respectively.
The graph indicates the distribution of total deduction amount. Most of the data lie in the interval from $0 to $10000. The distribution of total deduction amount is highly positively (right) skewed, and it is rare to have a total deduction amount of more than $150000.
The graph of total deduction amount shows that almost all people in the sample have a total deduction amount between $0 and $5000, which is by far the most frequent interval. Some people have total deduction amount in the range of $500000 to $700000 monthly. Very few people have a total deduction amount in the range $10000 to $50000.
The box plot indicates the location measures and distribution of total deduction amount. All three quartiles (1st quartile, median and 3rd quartile) lie below $5000. Deduction amounts over $30000 are marked as outliers.
The descriptive statistics for all observations show that the mean total deduction amount is $2173.759 and the standard deviation is $4162.795. The total deduction amount ranges from $0 to $49832. The three quartiles of total deduction amount are $141, $643 and $2599.5.
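The direction of skew can be checked from the reported summary statistics alone. The sketch below uses Pearson's median skewness coefficient, 3 * (mean - median) / standard deviation, which is a rough indicator rather than the moment-based skewness a statistics package would report. The function name is our own; the numbers are the means, medians and standard deviations quoted in the descriptive statistics for total income amount and total deduction amount.

```python
def pearson_median_skewness(mean, median, std_dev):
    """Pearson's second skewness coefficient: positive means right-skewed."""
    return 3 * (mean - median) / std_dev

# Summary statistics as reported in the descriptive statistics tables.
income_skew = pearson_median_skewness(58782.986, 46761, 62863.622)
deduction_skew = pearson_median_skewness(2173.759, 643, 4162.795)

print(round(income_skew, 3), round(deduction_skew, 3))
```

Both coefficients come out positive because each mean is larger than its median, which points to a long right tail in both distributions.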
The graph and frequency table of the two qualitative variables show that, among all 1000 people, 474 are female and 526 are male. The self-prepare lodgement method is used by 246 people and lodgement via agents by 754 people.
Australians who self-prepare have a mean range of age of 6.65, and those who lodge via agents have a mean range of age of 5.601. The overall mean of range of age is 5.859.
Moreover, the median range of age is 7.5 for people who self-prepare, 6 for people who lodge via agents, and 6 overall. The spread of range of age, in terms of standard deviation, is greater for self-prepare than for lodgement via agents.
Within the lodgement group shown, age group “4” has the highest count (82) and age group “11” has the lowest count (23).
Total Income Amount and Lodgement Method
Within the lodgement group shown, age group “9” has the highest count (47) and age group “2” has the lowest count (8).
Australians who lodge their tax by self-prepare have a mean total monthly income amount of $46391.293, and those who lodge through registered agents have a mean of $62825.899. The overall mean total monthly income amount is $58782.986. Total income amounts for lodgement via agents have a greater spread and higher quartiles than total income amounts for self-prepare.
Australians who lodge their tax through a registered agent have a higher mean deduction amount ($2510.082) than those who self-prepare ($1142.915). According to the standard deviation, the spread of total monthly deduction amount is also greater for agents than for self-prepare. All three quartiles of total deduction amount are greater for lodgement via agents than for self-prepare.
The graph shows the distribution of deduction amount by lodgement method.
The scatter plot of monthly total deduction amount (Y) against monthly total income amount (X) is given here. The trend line is also shown in the scatter plot.
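As a consistency check on the group means quoted in this part of the report, the overall mean of each variable should equal the mean of the two group means weighted by group size (246 self-preparers and 754 agent lodgers). The sketch below uses only figures from the report; the function name `combined_mean` is our own.

```python
def combined_mean(groups):
    """Overall mean from a list of (group_size, group_mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n

# (count, mean) for self-prepare and for lodgement via agents.
age_overall = combined_mean([(246, 6.65), (754, 5.601)])
income_overall = combined_mean([(246, 46391.293), (754, 62825.899)])
deduction_overall = combined_mean([(246, 1142.915), (754, 2510.082)])

print(round(age_overall, 3))
print(round(income_overall, 3))
print(round(deduction_overall, 3))
```

The weighted means agree with the overall means in the descriptive statistics tables (5.859 for range of age, $58782.986 for total income amount and $2173.759 for total deduction amount).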
The total monthly income amount has a greater mean than the total deduction amount ($58782.986 > $2173.759). The total income amount is also more dispersed than the total deduction amount ($62863.622 > $4162.795). The Pearson correlation coefficient (r) is 0.427. Hence, there is a moderate positive correlation between total monthly income amount and total monthly deduction amount.
Our linear regression model takes “Total Income Amount” as the independent (explanatory) variable and “Total Deduction Amount” as the response variable. The slope of the fitted linear regression model (b) is 0.028 and the intercept (a) is 512.828.
The linear regression model is Y = a + b*X (It 2015).
The fitted linear regression model is “Total amount of deduction” = 512.828 + 0.028 * “Total amount of income”.
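The fitted equation can be used directly for prediction. The short sketch below evaluates it at the sample mean income of $58782.986; the function and constant names are our own, and the coefficients are the rounded values reported above.

```python
INTERCEPT = 512.828  # a, as reported
SLOPE = 0.028        # b, as reported (rounded to three decimals)

def predict_deduction(total_income):
    """Predicted total deduction amount for a given total income amount."""
    return INTERCEPT + SLOPE * total_income

# Prediction at the sample mean of total income amount.
prediction_at_mean = predict_deduction(58782.986)
print(round(prediction_at_mean, 2))
```

A least-squares line passes through the point of means, so the prediction at the mean income should be close to the mean deduction of $2173.759. The small gap (the sketch gives about $2158.75) is what one would expect from the slope being rounded to 0.028.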
As the R² (the “coefficient of determination”) is 0.182, total income amount explains only 18.2% of the variation in total deduction amount. The calculated F-statistic is 222.147 with a p-value of approximately 0.0, which is less than 0.05. Hence, we reject the null hypothesis of no linear relationship between the response and the predictor at the 0.05 level of significance. It can therefore be inferred that total monthly deduction amount depends significantly on total monthly income amount, although income accounts for only a small share of the variation in deductions.
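For a simple linear regression with one predictor, the coefficient of determination is the square of the Pearson correlation, and the regression F-statistic follows from it as F = R² / (1 - R²) * (n - 2). The sketch below applies these identities to the reported r = 0.427 and n = 1000; because r is rounded to three decimals, the result only approximates the reported F of 222.147.

```python
n = 1000   # number of observations
r = 0.427  # reported Pearson correlation (rounded)

# In simple linear regression, R-squared is the squared correlation.
r_squared = r ** 2

# F-statistic with 1 and n - 2 degrees of freedom.
f_statistic = r_squared / (1 - r_squared) * (n - 2)

print(round(r_squared, 3), round(f_statistic, 1))
```

An F-statistic of this size on 1 and 998 degrees of freedom is far beyond the 5% critical value (roughly 3.85), which is why the p-value is reported as approximately zero even though R² is small.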
Total Deduction Amount and Lodgement Method
The “Randomized trial” procedure in “StatKey” gives a 95% interval of 0.724 to 0.777 for the proportion of tax returns lodged through agents. In addition, a randomisation test with a null proportion of 0.75 is carried out.
The “Randomized trial” procedure in “StatKey” gives a 95% interval of $56381.024 to $64080.981 for the mean total monthly income amount. Note that the randomisation test is carried out with a null mean of $60000 for total monthly income amount.
The “Randomized trial” procedure in “StatKey” gives a 95% interval of $1735.612 to $2291.685 for the mean total monthly deduction amount. Note that the randomisation test is carried out with a null mean of $2000 for total deduction amount.
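A randomisation test for a single proportion can be sketched with Python's standard library. The code below is an illustration under stated assumptions, not a reproduction of StatKey's procedure: it simulates 2000 samples of size 1000 under the null proportion 0.75, counts how often the simulated count is at least as far from 750 as the observed 754, and reports the central 95% of the simulated proportions. The function name, the number of repetitions and the seed are our own choices.

```python
import random

def randomization_test_proportion(successes, n, p0, reps=2000, seed=1):
    """Two-tailed randomisation test of H0: p = p0 for a single proportion."""
    rng = random.Random(seed)
    # Simulate the number of "successes" in n trials under the null.
    null_counts = sorted(
        sum(rng.random() < p0 for _ in range(n)) for _ in range(reps)
    )
    expected = p0 * n
    # Share of simulated counts at least as extreme as the observed one.
    extreme = sum(
        abs(count - expected) >= abs(successes - expected)
        for count in null_counts
    )
    # Central 95% of the simulated null proportions.
    lower = null_counts[int(0.025 * reps)] / n
    upper = null_counts[int(0.975 * reps) - 1] / n
    return extreme / reps, (lower, upper)

p_value, null_range = randomization_test_proportion(754, 1000, 0.75)
print(p_value, null_range)
```

Under these assumptions the p-value is large, so an observed proportion of 0.754 is compatible with a null value of 0.75. The central 95% of the simulated proportions falls at roughly 0.72 to 0.78, which is close to the 0.724 to 0.777 range quoted above; the report does not state whether StatKey's figure was obtained this way.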
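For comparison with the randomisation tests of the two means, the classical one-sample t statistic can be computed from the reported summary statistics. This is a sketch of the standard formula t = (mean - null mean) / (s / sqrt(n)), not StatKey's randomisation procedure; the function name is our own.

```python
import math

def one_sample_t(sample_mean, std_dev, n, null_mean):
    """Classical one-sample t statistic from summary statistics."""
    standard_error = std_dev / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error

# Total income amount against the null mean of $60000.
t_income = one_sample_t(58782.986, 62863.622, 1000, 60000)
# Total deduction amount against the null mean of $2000.
t_deduction = one_sample_t(2173.759, 4162.795, 1000, 2000)

print(round(t_income, 3), round(t_deduction, 3))
```

Both statistics are smaller in absolute value than 1.96, the approximate two-tailed 5% critical value for a sample of this size, so under the classical test neither null mean would be rejected.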
Tests of analysis of variance (ANOVA):
The mean range of age is greater for self-prepare than for lodgement via agents (6.7 > 5.6). The standard deviations of range of age are almost equal for the two lodgement methods (3.2 vs. 3.1).
Lodgement process vs. total monthly income amount:
The mean total income amount is lower for self-prepare and higher for lodgement via agents ($62825.9 > $46391.3). The standard deviation of total income amount is also lower for self-prepare than for lodgement by registered agents ($68626.6 > $37874.7).
Lodgement Process vs. total monthly deduction amount:
The mean total deduction amount is lower for self-prepare ($1142.9) and higher for lodgement by registered agents ($2510.1). The dispersion of total deduction amount, in terms of standard deviation, is also lower for self-prepare than via agents ($4609.6 > $1984.6).
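The group means and standard deviations listed for the three comparisons are enough to compute a classical one-way ANOVA F-statistic. The sketch below does so from the rounded summaries and the group sizes 246 (self-prepare) and 754 (agents). It illustrates the classical formula rather than StatKey's randomisation ANOVA, and the report does not quote F values for these comparisons, so the results are only indicative.

```python
def anova_f_from_summaries(groups):
    """One-way ANOVA F from a list of (n, mean, std_dev) per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-group and within-group sums of squares.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# (n, mean, std_dev) for self-prepare and for lodgement via agents.
f_age = anova_f_from_summaries([(246, 6.7, 3.2), (754, 5.6, 3.1)])
f_income = anova_f_from_summaries(
    [(246, 46391.3, 37874.7), (754, 62825.9, 68626.6)]
)
f_deduction = anova_f_from_summaries(
    [(246, 1142.9, 1984.6), (754, 2510.1, 4609.6)]
)

print(round(f_age, 1), round(f_income, 1), round(f_deduction, 1))
```

With two groups, each F has 1 and 998 degrees of freedom, and values well above the 5% critical value of roughly 3.85 indicate a significant difference in means between the two lodgement methods.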
A paired two-sample t-test for the equality of means is carried out. The t-statistic is 29.249 with 999 degrees of freedom (d.f.). The two-tailed p-value is approximately 0.0, which is less than 0.05. At the 5% significance level, we therefore reject the null hypothesis that the mean total income amount and the mean total deduction amount are equal (Kim 2015).
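The paired t-statistic can be reconstructed from the summary statistics, assuming the reported Pearson correlation of 0.427 between total income amount and total deduction amount refers to the same 1000 pairs. The variance of the paired differences is then sx² + sy² - 2 * r * sx * sy. The function name below is our own.

```python
import math

def paired_t_from_summaries(mean_x, sd_x, mean_y, sd_y, r, n):
    """Paired t statistic and degrees of freedom from summary statistics."""
    # Variance of the differences X - Y for correlated pairs.
    var_diff = sd_x ** 2 + sd_y ** 2 - 2 * r * sd_x * sd_y
    standard_error = math.sqrt(var_diff / n)
    return (mean_x - mean_y) / standard_error, n - 1

t_stat, df = paired_t_from_summaries(
    58782.986, 62863.622,  # total income amount: mean, std. deviation
    2173.759, 4162.795,    # total deduction amount: mean, std. deviation
    0.427, 1000,           # correlation and number of pairs
)
print(round(t_stat, 2), df)
```

The result is about 29.25 on 999 degrees of freedom, in line with the t-statistic reported from Excel.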
Conclusion:
According to the analysis in this report, the people of Australia generally prefer to lodge their tax through registered agents rather than by self-preparation. Australians who lodge through agents usually have a higher total income amount and a higher total deduction amount. The mean values of total income amount and total deduction amount are significantly different from each other. Total deduction amount and total monthly income amount have a moderate positive correlation, but total income explains only a small share of the variation in deductions, so the deduction amount cannot be estimated precisely from income alone.
References:
Holcomb, Z.C., 2016. Fundamentals of descriptive statistics. Routledge.
It, X., 2015. Simple linear regression.
Kim, T.K., 2015. T test as a parametric statistic. Korean journal of anesthesiology, 68(6), pp.540-546.
McClelland, G.H., Irwin, J.R., Disatnik, D. and Sivan, L., 2017. Multicollinearity is a red herring in the search for moderator variables: A guide to interpreting moderated multiple regression models and a critique of Iacobucci, Schneider, Popovich, and Bakamitsos (2016). Behavior research methods, 49(1), pp.394-402.