Recent Projects
(click on the pictures to explore)
We build a novel interface to help readers make sense of news article clusters by automatically identifying, highlighting, and linking supporting evidence and contradictory statements across multiple documents. Interactive demo.
We seek insight into how modern QA systems work, and aim to produce efficiency gains along the way.
We study the extent to which cross-segment attention is needed for multi-segment reasoning tasks. We find that partially parallel segment processing (i.e., late interaction) can enable segment caching and reduce latency without harming performance on many reasoning tasks.
We aim to distil the knowledge of state-of-the-art QA systems, as well as state-of-the-art document representation, into passage-level "evidence graphs," and then to use an adaptive retriever to traverse the graph efficiently. This should render various forms of question answering more interpretable, faster, and more accurate.
We developed a new method for voice conversion that continuously shifts between speaker identities, creating an eerie effect. This achieves improvements in speech anonymization and naturalness, and I hope to develop extended uses for emotion and personality design in human-computer interaction.
We investigate the use of different curriculum learning strategies for fine-tuning expert-domain document representation systems. We achieve state-of-the-art results for scientific document representation, and introduce a new benchmark for legal document representation.
We demonstrate that language models can be a powerful tool to measure echo chambers and identify cross-community ideological similarity.
We performed a computational study of the behavior of "power users" on Reddit. We found that users move through the social network in strategic ways, and that as they rise to power they increasingly use their power to drive division. We suggest new strategies for taming radicalization and addressing misinformation networks online.
We use techniques from multilingual embedding alignment to automatically discover polarized word meanings across communities, enabling a large-scale, multidimensional, and unsupervised study of worldview differences online.
Using methods from link analysis, we identify the most influential and powerful users on Reddit. Then, we explore how they move across communities over time.
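As a rough illustration of what link analysis can look like, here is a minimal sketch that ranks users with PageRank computed by power iteration. PageRank is an assumption on my part for the sake of the example (the project description only says "link analysis"), and the reply graph, the user names, and the `pagerank` helper are all hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed edge list (illustrative only)."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution over all nodes.
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        # Each node passes its rank to the nodes it links to, in equal shares.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical reply graph: an edge (a, b) means user a replied to user b.
replies = [
    ("alice", "carol"),
    ("bob", "carol"),
    ("dave", "carol"),
    ("carol", "alice"),
]
scores = pagerank(replies)
top_user = max(scores, key=scores.get)
```

In this toy graph the user who receives the most replies ends up with the highest score; a real study would work with a much larger graph and might use a different centrality measure altogether.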
Using data collected from over 28,000,000 repositories on GitHub, we built a training dataset, designed an evaluation suite (based on practical downstream tasks), and trained a set of baseline models for representing GitHub repositories. Also see work on Java vs. Python, in submission at CSCW 2022.
I love to write beautiful code that makes beautiful things! Sometimes when I get hooked on a game (like NYT's "LetterBoxed") I write a program to let me play more than 1/day. Sometimes I write projects to help me visualize data. Sometimes I turn classic algorithms into art :)
I developed an economic plan to combat climate change. My "green homesteading" program would simultaneously create market incentives to promote energy farms, associate greentech with American entrepreneurship and innovation, and create a platform to address historic land inequity. You can read it here.
Some college friends and I built an app that helped people find the best dish on every menu. I worked on the machine learning pipeline, which involved named entity recognition, sentiment analysis, and working with data labeling contractors. I also built a trend-tracking tool for restaurants. The app is defunct, but I've documented some stuff here!
In college, I played Dungeons & Dragons with my friends. More than just playing, I loved to design worlds, characters, and even the tabletop objects for the game! At UChicago's Fab(ulous) Lab, Elise and I made artistic player tokens and laser-cut wooden architectural maps.
For an added layer of irony, I programmed it entirely using Elm.
As a research assistant intern at the Berkman Center, I assisted with research on privacy and EdTech. I also managed the Center's "This Week in Student Privacy" newsletter (complete with weekly cat pictures), and advised the MIT Media Lab's Lifelong Kindergarten team on communicating the remix philosophy :)
Although I started programming in Microworlds at a very young age, it was not until Scratch came about that I fell in love with the ability to build my own worlds, games, and systems with code. I attended the very first Scratch conference, and later had the privilege to work with the Scratch team.