
Digital Scholarship and Digital Humanities

Resources and information for students, researchers and faculty who are incorporating technology into their research, scholarship, and teaching.

Social Media Analysis

Facebook, Twitter, Reddit, Instagram, TikTok and other online platforms have created whole new avenues for research. "Social Media Analysis" can be seen as the intersection of text analysis and network analysis. The "text" is often in the form of discussions, and may include images, emoji, and other media. The "media" is often structured in the form of a network, where the relationships between people and ideas become the focus of study. Information presented here is geared towards helping you collect social media data, and includes tools tailored for analyzing and visualizing data from specific social media platforms. For a more in-depth look at data sources you may want to use for your research, see our Data Services guide.

Related techniques: Text Analysis, Qualitative Analysis, Web Scraping

This technique is part of the Analysis activity.

Theory & Methods

General Web Scraping Tools

Web Scraping Tools

Data Miner

free plan limited to 500 pages/month | data collection only | web-based, using browser extension

Data Miner is a commercial web scraping service that uses a browser extension as its primary interface. Many popular websites have existing templates you can use without doing any work, and building a customized scraper to pull data from tables is very straightforward. See this video overview and "how it works" page.

 

Octoparse

free plan limited to 10k records per export | data collection only | Windows only

Octoparse uses desktop software in conjunction with a large set of pre-configured templates to enable web scraping from websites, social media platforms, and more. You can also build a custom web scraper using visually oriented tools. Data can be exported to CSV and Excel formats.

 

Morph.io

free | data collection only | programming required if not using existing data sets

Morph.io interfaces with GitHub to facilitate the creation and sharing of scripts for data scraping in Python, Node.js, PHP, Ruby, and Perl. The system sets up a basic GitHub project for you, with a template in your chosen language. You then customize it (programming expertise required) to perform a specific data scraping task. Once a scraper has been built for a specific data source, it is added to a searchable directory. The site currently lists more than 10,000 publicly available scrapers and data sets.
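To give a sense of what a hand-written scraper involves, here is a minimal sketch in Python that pulls the rows out of an HTML table and converts them to CSV. It is an illustration only, not the Morph.io template or any tool's actual code. The names TableParser, scrape_table, rows_to_csv and SAMPLE_HTML are hypothetical, and the sample HTML is invented so that the script runs without touching a real website.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table rows found in an HTML string."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Serialize rows as CSV text (one line per row)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Invented sample page; a real scraper would download this HTML instead.
SAMPLE_HTML = """
<table>
  <tr><th>user</th><th>posts</th></tr>
  <tr><td>alice</td><td>12</td></tr>
  <tr><td>bob</td><td>7</td></tr>
</table>
"""

rows = scrape_table(SAMPLE_HTML)
print(rows_to_csv(rows))
```

In practice the HTML would come from a downloaded page, and you would need to check the site's terms of service before collecting data from it.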

Data Cleaning Tools

Data Cleaning

OpenRefine

free | Windows, Mac and Linux downloads | easy to learn

OpenRefine is a very powerful tool for data cleaning and transformation. It also allows you to augment your data with web services (such as looking up addresses to find geolocation data, or reconciling place names and objects with entity references from Wikidata or other linked data sources).
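One common cleaning task is merging variant spellings of the same value, such as a username or place name typed several different ways. The sketch below illustrates the general idea in Python by reducing each value to a normalized key and grouping values that share a key. It is a simplified illustration under our own assumptions, not OpenRefine's code, and the function names fingerprint and cluster are hypothetical.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Reduce a string to a normalized key for grouping variants."""
    # Split accented characters into base letter + accent, then drop accents.
    text = unicodedata.normalize("NFKD", value)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Ignore case, surrounding whitespace, and ASCII punctuation.
    text = text.strip().lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Sort and de-duplicate the words so word order does not matter.
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values sharing a key; return groups with more than one variant."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


names = ["Smith, John", "john smith", "Jane Doe"]
print(cluster(names))
```

Each group returned is a candidate for merging. A person should still review the groups, because different entities can occasionally produce the same key.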