Song of the Day

Main ideas

Coming up

Student Recommendations

Discussion guidelines

Privacy: Target(ed) Advertising

Data science teams at retailers like Target study consumer purchase history and demographics to increase sales. They employ strategies such as pricing items appropriately and sending targeted coupons.

A data scientist at Target analyzed the purchase histories of individuals who later signed up for a Target baby registry, looking for buying patterns that signal pregnancy.

“…buying … unscented lotion … supplements like calcium, magnesium and zinc. … scent-free soap and extra-big bags of cotton balls [and] hand sanitizers and washcloths…”

“…each shopper [received] a pregnancy prediction score … estimate her due date to within a small window [and] send coupons timed to very specific stages of her pregnancy.”

Later…

“My daughter got this in the mail!” he said. “She’s still in high school, and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?”

Even later…

“I had a talk with my daughter,” he said. “It turns out there’s been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology.”

The Hathaway Effect

Does Anne Hathaway News Drive Berkshire Hathaway’s Stock?

Misrepresenting Data

Carefully examine the visualizations below, then discuss the accompanying questions with your group.

What baby boomers think

Brexit poll results

Engineering & Cheese

Web scraping

A researcher is interested in the relationship between weather and sentiment (the positivity or negativity of posts) on Twitter. They want to scrape data from https://www.wunderground.com and join it to tweets posted in the same geographic area at the same time. One complication is that Weather Underground limits the number of data points that can be downloaded for free through its API (application programming interface). The researcher sets up six free accounts so they can collect the data they want in a shorter time frame.

Posting data from social media

A data analyst received permission to post a data set that was scraped from a social media site. The full data set included name, screen name, email address, geographic location, IP (Internet Protocol) address, demographic profiles, and preferences for relationships. The analyst removed name and email address from the data set in an effort to de-identify it.

Submission

To complete the lecture notes for today, push your annotated R Markdown file to GitHub.

Sources

Additional Information