DMI Consolidation
Developing Meaningful Indicators (DMI) was a module in USP taught by Dr. Charles Burke. DMI allowed us to understand, use, and visualise data in a way that prioritises creating an effect on others. This is in line with USP’s purpose to shape independent, adaptable thinkers and doers who will make an impact in the world.
This page serves as a consolidation of the notable work that I have produced, as well as a reflection of my takeaways from this module.
Iterations
This module was designed to promote iterative learning, where multiple “mini-projects” help with learning and development. The following are some of the key iterations that I have produced, presented chronologically. Taken together, they are also an indicator of my progress throughout the semester.
Sentiment Analysis
Inspired by a comment about Natural Language Processing (NLP) by Dr. Charles, my friend Ling Hui and I decided to try it out for ourselves. We initially hypothesised that people on Reddit would generally leave comments with positive sentiment since, after all, Reddit is a community platform where people share their ideas and stories with one another. We also thought that there would be some positive correlation between a comment’s sentiment score and its score in upvotes.
Among the graphs we produced was this scatter plot of average sentiment score against average score in upvotes.
We co-documented the entire process in a Medium article. The article was subsequently published in Towards Data Science and is titled “Trying to use a 30GB database for sentiment analysis”. More graphs can be found in the article.
30% Rule
This visualisation was inspired by a Reddit post which used the 30% rule to predict minimum wages in the US. The 30% rule states that one should spend no more than 30% of their salary on rental. Naturally, I used this idea and produced a graph for Singapore using median rental rates in different neighbourhoods.
I shared this visualisation on Reddit. Below are the links to the posts on the different subreddits. My friend Ling Hui helped me crosspost to r/singapore since I did not have enough karma at that time to post in it.
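The arithmetic behind the 30% rule can be sketched in a few lines of Python. This is only an illustration: the function name, the neighbourhood names, and the rent figures are hypothetical and are not the data used in the actual graph.

```python
RENT_SHARE_LIMIT = 0.30  # the 30% rule: rental should not exceed 30% of salary


def minimum_monthly_salary(monthly_rent: float, share: float = RENT_SHARE_LIMIT) -> float:
    """Return the smallest monthly salary for which rent stays within `share` of income."""
    if not 0 < share <= 1:
        raise ValueError("share must be in the interval (0, 1]")
    return monthly_rent / share


# Hypothetical median rents in SGD per month, for illustration only.
example_rents = {"Neighbourhood A": 2100.0, "Neighbourhood B": 3000.0}

# Implied "minimum wage" per neighbourhood under the 30% rule.
implied_salaries = {
    name: minimum_monthly_salary(rent) for name, rent in example_rents.items()
}

for name, salary in implied_salaries.items():
    print(f"{name}: at least SGD {salary:,.0f} per month")
```

Dividing the rent by 0.30 simply inverts the rule: if rent must be at most 30% of salary, then salary must be at least rent divided by 0.30.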
Comparing Machine Learning
My classmate Nicole used Azure Machine Learning on Excel to conduct sentiment analysis on over 4000 Pfizer-related tweets in order to determine how most people felt about the vaccine. Given my previous visualisation using AWS Comprehend, this got me asking, “Which of the AI models is more accurate?”
I then compared the scores resulting from both algorithms to produce the graph below.
This whole process was written up in a Medium article titled “Comparing AWS Comprehend and Azure ML on Excel”. I also shared another visualisation on Reddit.
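One simple way to compare two sentiment models on the same texts is to measure how often their labels agree. The sketch below is an assumption about how such a comparison could be done, not the method from the article; the function names and the five example labels are hypothetical.

```python
from collections import Counter


def agreement_rate(labels_a, labels_b):
    """Fraction of items on which two models assign the same sentiment label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both models must label the same number of items")
    if not labels_a:
        raise ValueError("at least one labelled item is required")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def disagreement_counts(labels_a, labels_b):
    """Count each (label_a, label_b) pair where the two models differ."""
    return Counter((a, b) for a, b in zip(labels_a, labels_b) if a != b)


# Hypothetical labels for five tweets, for illustration only.
aws_labels = ["positive", "neutral", "negative", "neutral", "positive"]
azure_labels = ["positive", "negative", "negative", "neutral", "neutral"]

rate = agreement_rate(aws_labels, azure_labels)
differences = disagreement_counts(aws_labels, azure_labels)

print(f"Agreement rate: {rate:.0%}")
for (aws, azure), count in differences.items():
    print(f"AWS said {aws}, Azure said {azure}: {count} tweet(s)")
```

Agreement only shows where the two models differ. Deciding which one is more accurate would additionally need a set of tweets labelled by hand to compare both models against.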
Property Cooling Measures
This next visualisation came about because I was discussing BTO-ing with my girlfriend and was wondering about BTO prices in Singapore. As we discussed, the idea of buying a resale flat came up as a viable option. I then came across this article that talked about how more property cooling measures were to be expected in the coming months.
I plotted the resale prices of HDBs over the years, as well as the various property cooling measures.
This was also shared on several subreddits on Reddit.
Singapore’s Minimum Wage
I collaborated with another classmate, Erika, who shared an interest in minimum wage in Singapore. We collected data from 100 countries and used a simple linear regression model to predict what Singapore’s minimum wage would be based on all the variables collected. Through that process, we discovered certain trends between the variables, one of which is represented below.
The entire process was written up over 2 Medium articles, both cowritten by Erika and me. The first is “If Singapore Had a Minimum Wage, what could it be?” and the second is “Stepwise Regression Tutorial in Python”. The visualisation was also shared on Reddit, where one of the posts was made by Erika.
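For readers unfamiliar with simple linear regression, here is a minimal sketch of fitting a line by ordinary least squares with a single predictor. It is not the model from the articles: the function name, the generic `predictor` variable, and the four data points are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def fit_simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares (closed form)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: one predictor per country and its minimum wage.
predictor = [1.0, 2.0, 3.0, 4.0]
minimum_wage = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_linear_regression(predictor, minimum_wage)

# Predict for a country with no minimum wage, given its predictor value.
predicted_wage = intercept + slope * 5.0
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={predicted_wage:.2f}")
```

With several predictors, the same idea extends to multiple regression, where a library would normally be used instead of hand-written formulas.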
Avengers
I had wanted to go back to the core idea of “developing a meaningful indicator” with a topic that was organically inspired, so I started listening to the people around me. The biggest thing that came up was my friend complaining that WandaVision fans were hypocrites. As of Avengers: Age of Ultron, both Wanda and Vision were not very well received. It was only after the hype of the television series WandaVision that the two characters became beloved.
This visualisation aims to be an indicator of the popularity of each Avenger.
Among all my visualisations, this was the most well received on Reddit.
Reorganisation
An informal hackathon was held by Dr. Burke as a way for the class to showcase our skills during the last week of the semester. It was held during the lecture time slot and topics were assigned to pairs at random. My partner Kok Lee and I were assigned the topic of “Reorganisation”. We addressed this topic through 3 broad aspects.
Firstly, the social aspect covered the relationship between midlife crisis and one’s social life.
Secondly, the economic aspect brought us to look at how Disney’s worth changed as they acquired various companies.
Lastly, the political aspect allowed us to find a way to represent a president’s influence using political realignment.
These were then shared on a private subreddit.
- r/meaningfulindicators Midlife Crisis
- r/meaningfulindicators Disney
- r/meaningfulindicators Political Realignment
Reflection
Looking back on all the visualisations I have made, I believe that I have come a long way, thanks to the guidance of Dr. Charles and my classmates. My biggest takeaway would be this: What are you trying to say and how can you do that?
Initially, it was clear that my graphs were merely showing data, taking raw numbers and giving them to someone in a picture. Of course, this was the intended first step to take, as planned out by Dr. Burke. Before becoming good at developing meaningful indicators, we must first dabble with data and experiment with it. That was what sentiment analysis and comparing machine learning algorithms were: taking something that was interesting to me, getting the data, and giving it wholesale to someone else.
The next natural progression would then be making the graphs nicer and more aesthetic. This involved playing with tools and their features. The first of these was definitely Tableau, which was where 30% Rule was birthed. After that, I used not only Tableau but also PowerPoint to produce the colourful property cooling measures. Lastly, DataWrapper was used to create Singapore’s minimum wage, which attempted to further spawn conversation using a catchy title.
However, a flashy graph is not enough. I draw upon this video that I came across and watched near the end of the semester. The video was Storytelling with Data by Cole Nussbaumer, a former Google employee who had returned to Google to talk about the new book that she had written about data visualisation.
She covered two main topics which were focusing attention and telling a story. Focusing attention involved the technical use of position and colours to bring the attention of the audience to where you want the audience to look. Telling a story involved attaching a narrative to the chart; audiences pay attention to your diagram when they know what you are trying to tell them.
This forms the basis of my main takeaway from this class, which is “What are you trying to say and how can you do that?”. This thought process ended up motivating the final few visualisations: Avengers and those in Reorganisation. In the midlife crisis chart especially, a large focus was placed on creating a message and using colours to convey that message.
This class is the most unique one that I have taken so far, with no assignments or tests, but with a strong learning objective so that we students can make a strong impact in the world. Ultimately, that is the most valuable thing, and the soft skills will be the ones that can bring us further in life. Thank you, Dr. Burke, for this experience, and for allowing me to remember that visualisations, like most things we do in life, are done to benefit others. Only when we truly see things from the perspective of our audience can we become effective thinkers and doers who make a difference in the world.
Summary of Impact
These are statistics collated as a personal measure of my impact on the world. The statistics are accurate as of 17 April 2021.
Platform / Subreddit | Title | Reads / Upvotes | Link |
---|---|---|---|
Medium | Trying to use a 30GB database for sentiment analysis | 365 | Link |
r/dataisbeautiful | [OC] Inspired to create “minimum wage” for Singapore based on 30% rule | 43 | Link |
r/singapore | [OC] Inspired to create “minimum wage” for Singapore based on 30% rule | 37 | Link |
Medium | Comparing AWS Comprehend and Azure ML on Excel | 159 | Link |
r/dataisbeautiful | [OC] Difference between Azure and AWS ML on same dataset | 15 | Link |
r/SentimentAnalysis | [OC] Difference between Azure and AWS ML on same dataset | 2 | Link |
r/dataisbeautiful | [OC] How effective are Singapore’s property cooling measures? | 46 | Link |
r/singapore | [OC] How effective are Singapore’s property cooling measures? | 167 | Link |
r/SingaporeRaw | [OC] How effective are Singapore’s property cooling measures? | 8 | Link |
Medium | Stepwise Regression Tutorial in Python | 1348 | Link |
r/dataisbeautiful | [OC] How are Civil Rights related to Minimum Wage? | 716 | Link |
r/SingaporeRaw | If Singapore Had a Minimum Wage, What Would it Be? | 11 | Link |
r/marvelstudios | Who is the Most Popular Avenger? | 970 | Link |
r/Marvel_Movies | Who is the Most Popular Avenger? | 88 | Link |
r/CaptainAmerica | Who is the Most Popular Avenger? | 47 | Link |
r/ironman | Who is the Most Popular Avenger? | 43 | Link |
r/Avengers | Who is the Most Popular Avenger? | 17 | Link |
Other Visualisations
A collection of all visualisations that I had created during this module can be found here.