PART 1

Using the attached files of around 3200 tweets per person, show a histogram (frequency distribution) of the tweets of both Dave and Julia. Use `UTC` to create the time stamp. Remember that the case of column headers matters.

Make a dataframe of word frequency for each of Dave and Julia. Plot the frequencies against each other. Include a dividing line in red: words near the line are used with similar frequency by both tweeters, while words farther from it are shared less frequently.

Create a stacked chart comparing the odds ratios of the top 15 words used by each tweeter. Remove Twitter handles from the list of words. Calculate the word usage ratios (usage v. total) and display them on a log scale. Do you notice any interesting differences? Does anything stand out?

PART 2

Using the tweet files: Create time series charts for each tweeter showing how word usage has changed over time. Show this for three words. You may have to manipulate a parameter to show [...] Comment your code, line by line.

Show a graph for each tweeter revealing the ten words with the highest number of retweets. Comment your code, line by line.

Here is a link to the text. Use your own words so that there is no plagiarism. I need the .Rmd file so that I can execute the code on my Windows machine, and a Word doc of the output you executed. Always repeat the question you are answering. Thank you!

Requirements: as needed for code/graphs | .doc file
