In mid-2018, I discovered that a disproportionately large number of the people I followed on Twitter were from the U.S.A.
I had 2 months to go before my first job in Bangalore and decided that I needed to familiarize myself with the people in the Design/Tech scene there.
So I picked, at random, someone I admire who was already a part of Bangalore’s design scene and who followed a good chunk of people, and decided that their following list would be a good source for this.
While I could’ve simply checked out 1,000 Twitter profiles, I was also in the mood for a side project since I was itching to build something post-college.
StumbleUpon was a serendipity engine. It took you from one cool website to another. You could bookmark these sites to check them out later. I absolutely loved discovering cool content on the internet using it.
I combined this nostalgia with my special affection for personal websites to take a crack at solving this problem and build… * drum roll *… StumbleTwitter.
What StumbleTwitter does
You specify a Twitter account and the website extracts the profiles from its following list that have personal websites in their Twitter bio. It then presents these websites one at a time for you to stumble through. Once you visit a website and get a better idea of what the person does, you can tag them for future reference. Repeat until all websites are exhausted.
Well, it functioned. I started with 341 Twitter accounts and ended up classifying 192 of them. A significant number of the Indian designers I currently follow are a direct result of this experiment.
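The flow above can be sketched roughly as follows. This is a minimal illustration and not the original code: the profile records, the `build_queue` and `stumble` helpers, and the `"designer"` tag are all hypothetical stand-ins, and the real data came from Twitter rather than a hard-coded list.

```python
import random

# Hypothetical profile records; the field names are assumptions for this sketch.
profiles = [
    {"handle": "asha_designs", "website": "https://asha.design"},
    {"handle": "no_site_user", "website": None},
    {"handle": "ravi", "website": "https://ravi.dev"},
]

def build_queue(profiles, seed=None):
    """Keep only profiles that list a website and return them in random order."""
    with_sites = [p for p in profiles if p.get("website")]
    random.Random(seed).shuffle(with_sites)
    return with_sites

def stumble(queue, tags):
    """Yield profiles that have not been tagged yet, one at a time."""
    for profile in queue:
        if profile["handle"] not in tags:
            yield profile

tags = {}
queue = build_queue(profiles, seed=42)
for profile in stumble(queue, tags):
    # In the real site, this is where you would visit the website and pick a tag.
    tags[profile["handle"]] = "designer"
```

Once every profile in the queue has a tag, `stumble` yields nothing more, which matches the "repeat until exhausted" loop.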
An observation from dogfooding
I noticed myself rushing through websites. The goal I started off with was discovering new people. This should have meant giving each website the time it deserved. Instead, I found myself spending only the minimum amount of time required to figure out how I could best categorize them. How interesting a website was didn’t translate into extra time spent. (This is a qualitative observation, although I could have measured it quantitatively with some effort.)
- I suspect this had something to do with context. Since the website was presenting these links at random, I had no reason to care extra, which might qualify for ‘not a bug, it’s a feature’. But perhaps showing some of their top tweets along with the website could’ve given me a better reason to care.
- I had to switch BACK to the original tab in order to tag them, and the button was a ‘Submit & Next’. Perhaps I could’ve quelled this tendency not by removing the button but by letting myself tag a website while still on it. That way, I could tag people instantly based on first impressions without being hurried on to the next one.
One problem I solved
I extracted websites from Twitter bios, but I was only really interested in personal websites because I wanted to learn about the person, not the place they worked at. So when extracting the website URLs, I ran a similarity check between the Twitter handle and the personal website. This weeded out the accounts of those who use their company website as their website link. There may have been false negatives too.
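One way such a similarity check could look is sketched below. The post doesn't say which comparison was actually used, so this assumes the standard library's `difflib.SequenceMatcher`; the `domain_label` and `looks_personal` helpers and the 0.6 threshold are hypothetical choices for illustration.

```python
from difflib import SequenceMatcher
from urllib.parse import urlparse

def domain_label(url):
    """Return the first label of the host, e.g. 'janedoe' for https://www.janedoe.com/about."""
    host = urlparse(url if "://" in url else "https://" + url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host.split(".")[0]

def looks_personal(handle, url, threshold=0.6):
    """Guess whether a URL is a personal site by comparing it with the Twitter handle."""
    a = handle.lower().lstrip("@").replace("_", "")
    b = domain_label(url).replace("-", "")
    # ratio() is 1.0 for identical strings and approaches 0.0 for unrelated ones.
    return SequenceMatcher(None, a, b).ratio() >= threshold

personal = looks_personal("@janedoe", "https://www.janedoe.com")
company = looks_personal("janedoe", "https://bigcorp.io")
```

A check like this will miss personal sites whose domain has little in common with the handle, or that live on a subdomain such as `blog.`, which is one way the false negatives mentioned above could arise.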
A couple of problems I ran into
- I discovered a little too late that my method of storing data by writing to an on-disk file is actually NOT recommended when deploying and that Heroku erases these changes every 24 hours. This meant that while the website worked, it didn’t have any lasting value for me. I needed to figure out how to get databases to work in order to fix this and that proved to be a hill bigger than what I had time to climb back then.
- I was so hyper-fixated on getting all of this to a functional, running state that I didn’t consider some obvious cases. Some views (a list of 20+ results) didn’t render properly and overflowed.
How was this built?
- Extracted the Twitter data using Tweepy
- Wrote the data into a pickle file on disk
- Flask for the front-end and back-end
- Heroku for deploying the project
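The pickle step might have looked something like this. It's a minimal sketch, not the original code: the `load_tags` and `save_tags` helpers, the `tags.pickle` file name, and the dictionary shape are assumptions, and a temporary directory stands in for wherever the real file lived.

```python
import os
import pickle
import tempfile

def load_tags(path):
    """Return the saved tag dictionary, or an empty one if no file exists yet."""
    if not os.path.exists(path):
        return {}
    with open(path, "rb") as f:
        return pickle.load(f)

def save_tags(path, tags):
    """Overwrite the pickle file with the current tags."""
    with open(path, "wb") as f:
        pickle.dump(tags, f)

# Hypothetical location; a temp directory keeps the sketch self-contained.
path = os.path.join(tempfile.mkdtemp(), "tags.pickle")

tags = load_tags(path)
tags["asha_designs"] = "designer"
save_tags(path, tags)

reloaded = load_tags(path)
```

A file like this lives on the local disk of whatever machine runs the app, which is exactly the kind of change that didn't survive on Heroku.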
Thanks to @jjmojojjmojo on Twitter who suggested I use Flask instead of Django since it’s much more lightweight. Flask turned out to be extremely approachable and super fun to use.