Yelp Reviews Analysis For Restaurants And Cafes

Introduction and Background

“Yelp” is an online platform that accepts and publishes reviews of local businesses and everyday experiences. Yelpers have written 71 million reviews to date. “Yelp” has become a crucial site, especially for small businesses, which can succeed or close down on the strength of their online reviews.

This data analysis examines Yelp reviews for restaurants and cafes and summarises the online reviews via sentiment analysis. The analysis is based on the file Yelp_reviews.csv, which contains 1,569,264 rows and 12 columns and is 122.5 MB in size. The data are analysed with the “R” software. The analysis is both qualitative and quantitative.

The average rating is satisfactory (mean = 3.743, SD = 1.311468). Overall, the reviews therefore indicate satisfaction.

Review length ranges from 0 to 1047 characters. The average review length is 126 characters (SD = 115.498).
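A minimal sketch of how review length and its summary statistics could be computed. The source states that the analysis was run in R and does not show its code, so this Python version is only an illustration, and the three review texts are invented. `stdev` uses the n - 1 denominator, as R's `sd` does.

```python
from statistics import mean, stdev

# Invented review texts, for illustration only (not taken from Yelp_reviews.csv).
reviews = ["Great food.", "", "Too slow and too loud."]

# Review length measured as the number of characters, as in the text above.
lengths = [len(text) for text in reviews]

print("min:", min(lengths), "max:", max(lengths))
print("mean:", mean(lengths))
print("sample SD:", stdev(lengths))
```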

A review contains 7.07 positive words on average (SD = 5.927). The number of positive words per review ranges from 0 to 94.

A review contains 2.55 negative words on average (SD = 3.25). The number of negative words per review ranges from 0 to 65.

Hence, positive words are far more frequent than negative words in the Yelp reviews.

The average net sentiment score is very low, with mean = 4.522 and standard deviation = 4.522. Its range is -59 to 80. The low average net sentiment is a cause for concern.
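The word counts and the net sentiment score can be sketched as follows. This assumes that net sentiment is the number of positive words minus the number of negative words, which is consistent with the reported means (7.07 - 2.55 ≈ 4.52) but is not stated explicitly in the source. The two word lists are tiny hypothetical stand-ins, since the source does not say which sentiment lexicon was used, and the code is Python rather than the R used for the analysis.

```python
import re

# Hypothetical mini-lexicons, for illustration only; the source does not
# name the sentiment lexicon behind its counts.
POSITIVE_WORDS = {"good", "great", "tasty", "friendly", "love"}
NEGATIVE_WORDS = {"bad", "slow", "rude", "cold", "bland"}


def sentiment_counts(text):
    """Return (positive count, negative count, net sentiment) for one review."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE_WORDS for token in tokens)
    neg = sum(token in NEGATIVE_WORDS for token in tokens)
    # Assumption: net sentiment = positive words - negative words.
    return pos, neg, pos - neg


review = "Great coffee and friendly staff, but the service was slow."
print(sentiment_counts(review))  # (2, 1, 1)
```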

Among the first twenty cases, the modal number of positive words is 3, which occurs 4 times.

The distribution of positive words in the first twenty cases is positively (right) skewed.

Among the first twenty cases, the modal number of negative words is 0, which occurs 8 times, followed by 2 and 3 negative words, with 4 occurrences.

The distribution of negative words in the first twenty cases is strongly positively (right) skewed.

Among the first 20 cases of net sentiment, only 3 values are negative, while the remaining 17 are not. The values of the first 20 cases lie in the interval from -4 to 12.

Among the first 20 cases, the most frequent net sentiment value is 3, which occurs 5 times, followed by 5, which occurs 3 times. Overall, the distribution is slightly negatively skewed.
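The mode and the direction of skew for a small sample can be checked numerically. The sketch below is in Python (the source used R), uses invented values rather than the actual first twenty cases, and uses the moment-based skewness coefficient, which is one of several common definitions and may not be the one behind the statements above.

```python
from statistics import mean, multimode


def skewness(values):
    """Moment-based skewness: positive means a longer right tail."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Invented word counts, for illustration only.
counts = [1, 2, 2, 3, 3, 3, 3, 4, 5, 6, 8, 12]

print("mode(s):", multimode(counts))
print("skewness:", round(skewness(counts), 3))
```

A positive result indicates right skew and a negative result indicates left skew.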

The comparison of review length across ratings indicates that:

  • Reviews with low ratings (1 and 2 stars) have the highest average review length, while reviews with the highest rating (5 stars) have the lowest average review length.
  • Reviews with 2 stars have the highest median review length, while reviews with the highest rating (5 stars) have the lowest median review length.
  • Review lengths are most dispersed for the lowest rating (1 star) and least dispersed for the highest rating (5 stars).
  • The interquartile range of review length is highest for low ratings and lowest for high ratings.

The bar plot shows that average review length is highest for low ratings (1 and 2 stars). Beyond 2 stars, the average review length decreases as the rating improves.
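The per-rating comparison of review length (mean, median and interquartile range) can be sketched as a simple group-by. This is a Python illustration with invented (stars, review length) pairs, not the R code or data behind the report. Note that `statistics.quantiles` uses a different default quantile method from R's `quantile`, so IQR values on real data may differ slightly between the two.

```python
from statistics import mean, median, quantiles

# Invented (stars, review_length) pairs, for illustration only.
rows = [(1, 100), (1, 150), (1, 200), (1, 250),
        (5, 50), (5, 60), (5, 70), (5, 80)]


def length_summary(rows):
    """Group review lengths by star rating and summarise each group."""
    by_stars = {}
    for stars, length in rows:
        by_stars.setdefault(stars, []).append(length)
    summary = {}
    for stars, lengths in sorted(by_stars.items()):
        q1, _, q3 = quantiles(lengths, n=4)  # quartile cut points
        summary[stars] = {
            "mean": mean(lengths),
            "median": median(lengths),
            "iqr": q3 - q1,
        }
    return summary


for stars, stats in length_summary(rows).items():
    print(stars, stats)
```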

On average, positive reviews are longer than negative reviews (127 > 118).

Businesses with highest and lowest rating:

The lowest rating (1 star) occurs most often for the business “6LM_Klmp3hOP0JmsMCKRqQ” and least often for the business “--D12rW_xO8GuYBomlg9zw”.

The 2-star rating occurs most often for the business “Xhg93cMdemu5pAMkDoEdtQ” and least often for the business “--lemggGHgoG6ipd_RMb-g”.

The moderate rating (3 stars) occurs most often for the business “Xhg93cMdemu5pAMkDoEdtQ” and least often for the business “--4Pe8BZ6gj57VFL5mUE8g”.

The high rating (4 stars) occurs most often for the business “4bEjOyTaDG24SY5TxsaUNQ” and least often for the business “--4Pe8BZ6gj57VFL5mUE8g”.

The highest rating (5 stars) occurs most often for the business “2e2e7WgqU1BnpxmQL5jbfw” and least often for the business “--qeSYxyn62mMjWvznNTdg”.
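One way to find such businesses is to count, for each business, how many reviews carry a given star rating and then take the largest and smallest counts. The sketch below assumes that this counting is what the maximum and minimum refer to, which the source does not spell out. It is written in Python rather than R, the business IDs are hypothetical placeholders, and ties are broken by the alphabetically smallest ID as an arbitrary choice.

```python
from collections import Counter

# Hypothetical (business_id, stars) pairs, for illustration only.
rows = [("biz_a", 1), ("biz_a", 1), ("biz_b", 1),
        ("biz_b", 5), ("biz_c", 5), ("biz_c", 5), ("biz_c", 5)]


def extremes_for_rating(rows, stars):
    """Return (business with most, business with fewest) reviews at `stars`.

    Only businesses with at least one review at that rating are considered.
    Ties go to the alphabetically smallest ID (an arbitrary assumption).
    """
    counts = Counter(biz for biz, s in rows if s == stars)
    if not counts:
        return None
    most = min(counts, key=lambda biz: (-counts[biz], biz))
    least = min(counts, key=lambda biz: (counts[biz], biz))
    return most, least


print(extremes_for_rating(rows, 1))  # ('biz_a', 'biz_b')
print(extremes_for_rating(rows, 5))  # ('biz_c', 'biz_b')
```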

The rating (stars) and the number of useful votes (votes_useful) are practically uncorrelated (correlation coefficient = -0.04897). “votes_useful” is moderately and positively correlated with the length of the reviews (correlation coefficient = 0.3258).
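The coefficients quoted above are presumably Pearson correlation coefficients, although the source does not name the method. A minimal Python sketch of that coefficient follows (the report itself used R); the two short lists are invented and only the column names `votes_useful` and `review_length` come from the text.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Invented values, for illustration only.
votes_useful = [0, 1, 1, 3, 4]
review_length = [50, 120, 90, 300, 410]

print(round(pearson(votes_useful, review_length), 4))
```

Values near 0 indicate little linear association, and values near +1 or -1 indicate a strong one.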

A linear regression model is fitted with “votes_useful” as the dependent variable and “stars” and “review_length” as the independent variables. The p-values show that review length has a significant association with votes_useful, whereas “stars” does not have a significant linear association with votes_useful.
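The regression described above has the form votes_useful = b0 + b1 × stars + b2 × review_length. The sketch below estimates the three coefficients by ordinary least squares through the normal equations, in plain Python instead of the R used in the report. The data are invented and constructed so that the true coefficients are known. It does not compute p-values; in R those would normally be read from `summary(lm(...))`.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def ols(y, *predictors):
    """Return [intercept, slope_1, ...] minimising the squared residuals."""
    X = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Invented data, built so that votes_useful = 1 + 0 * stars + 0.01 * review_length.
stars = [1, 2, 3, 4, 5, 3]
review_length = [200, 150, 100, 120, 80, 300]
votes_useful = [1 + 0.01 * length for length in review_length]

b0, b_stars, b_length = ols(votes_useful, stars, review_length)
print(round(b0, 4), round(b_stars, 4), round(b_length, 4))
```

In this constructed example the coefficient on stars is zero by design, mirroring the reported finding that only review length is associated with useful votes.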

Conclusion:

The overall analysis shows that positive feedback outweighs negative feedback, which points to an optimistic rather than a pessimistic attitude of people towards “lodging and food”. Low ratings are discussed at greater length than high ratings, as measured by review length. However, the average sentiment of people towards the restaurants and cafes is not satisfactory. The number of useful votes also has a statistically significant association with review length. Even so, positive feedback is slightly greater than negative feedback in almost all the cases discussed.
