Education

  • Ph.D. Department of Computer Science, University of California Santa Barbara, Santa Barbara, CA.
  • M.S. Department of Electrical and Computer Engineering, University of Cape Town, Cape Town, South Africa.

Work experience

  • 2023-present: Data Analyst II, Ookla, Seattle, WA
    • Created a methodology to benchmark end users’ quality of experience when accessing web pages across over 150 countries
    • Built an IP-subnet-based manipulation detection framework
    • Developed a novel methodology to automate anomaly detection, improving the accuracy and efficiency with which the system identifies and responds to irregular data patterns and streamlining operations for more reliable data analysis and decision-making.
    • Engineered and deployed an Airflow-based data engineering pipeline to run the anomaly detection methodology in production, markedly reducing manual intervention and enabling continuous, automated analysis through a more dynamic, scalable, and maintainable system.
    • Part of the Ookla for Good team, helping create a template for better engagement with universities and research communities.
  • 2018-2023: Ph.D. Computer Science, University of California Santa Barbara, Santa Barbara, CA
    • Designed and developed a new “broadband offering tool” and statistical frameworks to scalably collect and analyze the price points of broadband offerings from major internet service providers in the U.S. The tool has been used to collect over 10 million data points, enabling enhanced insight into the digital divide in the U.S. Currently focused on using this tool and methodology to characterize the presence and extent of digital inequity across geographic locations and demographics within the U.S. [in progress]
    • Developed a novel machine-learning-based Broadband Subscription Tier (BST) methodology to identify the subscription tier from which a given user’s speed test originates. The methodology significantly enhances our understanding of crowdsourced measurements, enabling effective and accurate use of large-scale crowdsourced datasets from Ookla and Measurement-Lab. [To appear in IMC 2022]
    • Curated a dataset of 17,000 tweets from Twitter and developed a natural language processing framework to detect and isolate power- and communication-outage-related tweets to assist first responders during natural disasters. Implemented 22 different machine learning algorithms, achieving close to 90% accuracy on the classification task. [Published in WWW 2020]
  • 2022: PhD Research Intern, IBM Thomas J. Watson Research Center, Yorktown Heights, NY
    • Contributed to an agent-based model (ABM) that simulates the impact of poor-quality internet connectivity on populations of different socioeconomic status. Built the model component that estimates the effect of poor-quality internet on different households, and contributed to a web application deployed on IBM Cloud.
    • Conducted a longitudinal analysis of a 30-million-record dataset to understand how internet quality has changed for different population groups. Applied statistical tests to quantify differences in internet quality between sub-populations, and deployed machine learning models to predict internet performance from demographic and infrastructure metrics. Work under review for publication.