CPSC 340

Assignment 3 - Simple Sentiment Analysis

 

Due: October 30


 

Objective

To be able to use hash tables in a program and to become familiar with the idea of sentiment analysis.


 

Task

Sentiment analysis is a simple form of natural language processing in which we try to determine if a piece of text is positive, negative or just neutral. There are many techniques for performing sentiment analysis. In this assignment we will use a simple approach.

We will do our analysis by assigning sentiment ratings to individual words and phrases. Ratings are integers on a scale from -5 (very negative) to 5 (very positive). We will fill a hash table in which the keys are the words or phrases and the values are their sentiment ratings. For instance, the key "crap" maps to the value -3, while the key "awesome" maps to the value 4.
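As a rough illustration of this lookup, the sketch below uses java.util.HashMap as a stand-in, since the interface of the course's HashTable class is not shown on this page. The class name SentimentLookupSketch and the choice to treat a missing word as 0 are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class SentimentLookupSketch {
    // Build a tiny table holding only the two entries quoted in the text.
    static Map<String, Integer> sampleTable() {
        Map<String, Integer> table = new HashMap<>();
        table.put("crap", -3);
        table.put("awesome", 4);
        return table;
    }

    // Return the rating for a word, or 0 when the word is not in the table
    // (assumption: unknown words are treated as neutral).
    static int rating(Map<String, Integer> table, String word) {
        return table.getOrDefault(word, 0);
    }

    public static void main(String[] args) {
        Map<String, Integer> table = sampleTable();
        System.out.println(rating(table, "awesome"));
        System.out.println(rating(table, "crap"));
        System.out.println(rating(table, "restaurant"));
    }
}
```

With the course's HashTable class, the put and lookup calls would be replaced by whatever methods that class provides.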

Some multi-word phrases will go in the table too. For example, the words "fed" and "up" are pretty neutral on their own, but the phrase "fed up" has a negative rating. Every key is either a single word or a two-word phrase; none is longer than that.
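One way to picture the phrase lookup is to join each word with the one before it and look the pair up as a single key. The sketch below is only an illustration under assumptions: it uses java.util.HashMap in place of the course's HashTable, the rating of -3 for "fed up" is a placeholder rather than the value from sentiments.txt, and adding the phrase rating on top of the single-word ratings is one possible reading of the task.

```java
import java.util.HashMap;
import java.util.Map;

public class PhraseLookupSketch {
    // Sum ratings over the words, checking each word and each adjacent pair.
    static int score(Map<String, Integer> table, String[] words) {
        int total = 0;
        String previous = null;
        for (String word : words) {
            // Single-word lookup; words not in the table contribute 0.
            total += table.getOrDefault(word, 0);
            // Two-word lookup: previous word, one space, current word.
            if (previous != null) {
                total += table.getOrDefault(previous + " " + word, 0);
            }
            previous = word;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> table = new HashMap<>();
        table.put("fed up", -3); // placeholder rating, not taken from the data file
        String[] words = {"i", "am", "fed", "up"};
        System.out.println(score(table, words));
    }
}
```

Because no phrase is longer than two words, remembering a single previous word is enough.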

We will use the sentiments.txt file to fill up our hash table. This data file was adapted from this source. The file has 2,454 words in it.
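The layout of sentiments.txt is not described on this page, so the following is a minimal sketch under an assumption: each line holds a word or two-word phrase followed by whitespace and an integer rating. If the real file uses a different separator, the split would need to change. The class name SentimentLoaderSketch is hypothetical, java.util.HashMap stands in for the course's HashTable, and the sample ratings for "fed up" are placeholders.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public class SentimentLoaderSketch {
    // Read "phrase <whitespace> rating" lines into a table.
    // Assumption: the rating is the last whitespace-separated token on the line.
    static Map<String, Integer> load(Scanner in) {
        Map<String, Integer> table = new HashMap<>();
        while (in.hasNextLine()) {
            String line = in.nextLine().trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            // Split at the last space or tab so two-word phrases stay intact.
            int cut = Math.max(line.lastIndexOf(' '), line.lastIndexOf('\t'));
            if (cut < 0) {
                continue; // no rating on this line
            }
            String phrase = line.substring(0, cut).trim();
            int rating = Integer.parseInt(line.substring(cut + 1));
            table.put(phrase, rating);
        }
        return table;
    }

    public static void main(String[] args) {
        // In-memory sample; "fed up" uses a placeholder rating.
        String sample = "awesome\t4\ncrap\t-3\nfed up\t-3\n";
        Map<String, Integer> table = load(new Scanner(sample));
        System.out.println(table.size());
    }
}
```

To read the real file, the same method could be given `new Scanner(new java.io.File("sentiments.txt"))`, whose constructor throws FileNotFoundException if the file is missing.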


 

Details

  1. Start by downloading HashTable.java which is the HashTable class we developed this week. You will use this for storing the table of words.
  2. Write code to load the data file into a hash table. The keys are strings (String) and the values are integers. You can assume the "sentiments.txt" file will be in the same directory as your program.
  3. Next, get user input from the keyboard. You should keep reading input until the user types the word "END".
  4. Remove all punctuation from the words you read and convert them to lower-case. This makes the input match the entries in the data file more reliably.
  5. Keep track of the total sentiment of the input. For each word, check if it's in the hash table. If so, add the sentiment into the running total.
  6. You will also have to check for the two-word phrases. To do this, keep track of the previous word you read as well, and look up the pair as a phrase.
  7. After reading all the input, print out the number of words, the total sentiment, and the average sentiment of the text (total sentiment divided by the number of words, shown with two decimal places).
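Steps 3 through 7 can be sketched roughly as follows. This is an illustration under stated assumptions rather than the expected solution: java.util.HashMap stands in for the course's HashTable, the table holds a few sample entries instead of the contents of sentiments.txt (the "fed up" rating is a placeholder), tokens are split on whitespace, and the names SentimentLoopSketch, normalize, analyze and report are all hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Scanner;

public class SentimentLoopSketch {
    // Step 4: strip ASCII punctuation and convert to lower-case.
    static String normalize(String token) {
        return token.replaceAll("\\p{Punct}", "").toLowerCase();
    }

    // Steps 3, 5 and 6: read tokens until "END"; returns {wordCount, totalSentiment}.
    static int[] analyze(Scanner in, Map<String, Integer> table) {
        int words = 0;
        int total = 0;
        String previous = null;
        while (in.hasNext()) {
            String token = in.next();
            if (token.equals("END")) {
                break;
            }
            String word = normalize(token);
            if (word.isEmpty()) {
                continue; // token was only punctuation
            }
            words++;
            total += table.getOrDefault(word, 0);
            if (previous != null) {
                total += table.getOrDefault(previous + " " + word, 0);
            }
            previous = word;
        }
        return new int[] {words, total};
    }

    // Step 7: format the three output lines, with the average to two decimals.
    static String report(int words, int total) {
        double average = (words == 0) ? 0.0 : (double) total / words;
        return String.format(Locale.ROOT, "Words: %d%nSentiment: %d%nOverall: %.2f",
                words, total, average);
    }

    public static void main(String[] args) {
        Map<String, Integer> table = new HashMap<>();
        table.put("awesome", 4);
        table.put("crap", -3);
        table.put("fed up", -3); // placeholder rating
        System.out.println("Enter text:");
        int[] result = analyze(new Scanner(System.in), table);
        System.out.println();
        System.out.println(report(result[0], result[1]));
    }
}
```

The cast to double in report matters: dividing two int values would truncate the average to a whole number before formatting.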

 

Example Runs

$ java SentimentAnalysis
Enter text:
The Mexican restaurant downtown is in a charming location and has a nice menu
of delicious fare.  They serve homemade tortillas that are soft and tasty, and
the soup is incredible.
END

Words: 31
Sentiment: 19
Overall: 0.61

$ java SentimentAnalysis
Enter text:
Game of Thrones final season was atrocious.  Fans largely hated where the plot
went and found the writing disappointing, with characters making decisions that
make no sense.
END

Words: 27
Sentiment: -12
Overall: -0.44

In this second example, note that the phrase "no sense" has to be matched!


 

General Requirements


 

Submitting

When you are done, submit your code for this program on Canvas.

Copyright © 2020 Ian Finlayson | Licensed under a Creative Commons Attribution 4.0 International License.