Experienced Data Analyst proficient in R, Python, Tableau, JavaScript, and SQL. Skilled in cleaning and analyzing large datasets, merging data frames, and applying machine learning techniques to solve real-world problems. Holds a Bachelor's degree in Statistics with a minor in Mathematics from the University of California, Davis, and recently completed a Data Analytics Boot Camp certificate from the University of California, Berkeley. Committed to lifelong learning and professional development, and seeking a challenging opportunity to help companies and clients advance effectively and productively. Known for analytical problem-solving and collaboration across diverse groups, with a focus on delivering detailed and efficient analysis for stakeholders and consumers.
2025
San Francisco State University
Studied modern statistical and machine learning techniques, classical statistical theory, and practical applications. Students also develop and refine computational skills tailored for diverse datasets, with a particular focus on the large-scale data common in business, technology, and science. The entire curriculum is built on a solid foundation of statistical theory and an understanding of the mathematical principles behind various techniques and algorithms. This holistic approach equips students with a well-rounded skill set, preparing them for versatile applications in data analysis and decision-making across different industries.
2023
UC Berkeley Extension
Focused on the practical technical skills needed to analyze and solve data problems. Gained proficiency in a broad array of technologies, including Excel, Python and R programming, JavaScript charting, SQL databases, Tableau, and machine learning.
2022
University of California, Davis
Acquired quantitative and qualitative research and analytical skills to understand healthcare, social policy, and many other public datasets. Used the results of analysis to develop effective strategies for identifying possible solutions. My background in statistics and mathematics focused on fundamental linear algebra, data structure modeling, and analysis of algorithms.
The ultimate goal of this analysis is to predict possible pipe breaks. In this hypothetical scenario, we're given some dummy client data that closely follows what we typically see in the real world. The goal is to clean the data and do some basic preprocessing, as well as provide some insight into how the data are structured.
Tools used   :   Python, Jupyter Notebook, Pandas, Supervised Machine Learning, Logistic Regression
Category  : Machine Learning, Time Series Forecasting
Year     :   March 2023
The All-Star Game is a game between teams of outstanding players, and many baseball fans are interested in which players will be chosen for the upcoming All-Star Game. We made a prediction by applying machine learning techniques to MLB 2021 data, which was scraped from a web page. Visualizing the data helped us understand it better and surfaced some meaningful insights that might improve our prediction. Since our target variable is a binary categorical variable, we used logistic regression to train the model, which gave us the probability of each player being selected for the next All-Star Game.
Tools used   :   Python, Jupyter Notebook, Pandas, Supervised Machine Learning, Logistic Regression, BeautifulSoup
Category  : Machine Learning
Year     :   June 2022
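To illustrate how logistic regression turns player features into a selection probability, here is a minimal sketch in plain Python. It is not the project's actual code: the feature values are made-up, standardized placeholders rather than real MLB data, and the helper names `fit_logistic` and `predict_proba` are hypothetical.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss w.r.t. the score
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Probability that the binary target equals 1 for feature vector x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Made-up, standardized features for six hypothetical players (assumption).
X = [[-1.2, -0.8], [-0.5, -1.0], [-0.3, 0.1], [0.4, 0.2], [0.9, 1.1], [1.3, 0.7]]
y = [0, 0, 0, 1, 1, 1]  # 1 = selected as an All-Star, 0 = not selected

w, b = fit_logistic(X, y)
probs = [predict_proba(w, b, x) for x in X]
for features, p in zip(X, probs):
    print(features, round(p, 3))
```

In practice a library implementation such as scikit-learn's `LogisticRegression` would typically be used; the hand-written version only shows where the probabilities come from.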
A potentially hazardous asteroid is one with an orbit that can make close approaches to the Earth and that is large enough to cause significant regional damage in the event of an impact. By identifying potentially hazardous asteroids, we can assess and potentially prevent a collision between the Earth and an asteroid. To classify whether an asteroid is potentially hazardous or non-hazardous, we trained a Naive Bayes classifier, a Support Vector Machine, and a Decision Tree on NASA asteroid data. The Decision Tree model performed best, predicting hazardous asteroids with 99% accuracy.
Tools used   :   Python, Jupyter Notebook, Pandas, Supervised Machine Learning, Decision Tree
Category  : Machine Learning
Year     :   May 2022