July 2022

Data Manipulation
In Sql

The ongoing Wimbledon 2022 championships have inspired me to put a newly discovered excitement for data analysis into practice.

The Project

The goal of this project was to produce a Tableau visualisation comparing the success of this years womens finalists, Elena Rybakina and Ons Jabeur, an initial aim of three comparisons was set. These being: the variation of WTA Rank for each player throughout their career, the Wimbledon performance of each player throughout their respective careers and finally the road to this years final for both players. Additionally, a map of locations in which each player had previously won matches would be added to the visualisation and hence the data for this was prepared.

Wimbledon Career Comparison

The first task was to refine the dataset to display only the Wimbledon matches played by Elena Rybakina and Ons Jabeur, this years Ladies singles finalists. The original dataset described tennis matches with a winner_name column and a loser_name column. Therefore it was neccesary to generate a temporary table in which there is a column for the Player (either Jabeur or Rybakina) and the result of the match. Using the CASE function, matches were filtered based upon whether Jabeur and Rybakina were the winner or loser and inserted into a new table. The sum function was used to total the values from each match.

Career rank variation

The second point of focus was to understand how each of the players rankings had varied throughout their careers, therefore the data was refined to select games in which either Rybakina or Jabeur had played and extract the rank of the winner or loser, depending on which they had been in that match.

Next, a simple query to select the run of games for each player in this years competition was carried out.

Success Map

Finally, in order to use the geographical features within Tableau, the locations of each match needed to be assigned, rather than the tennis tournament name. A seperate list was produced assigning country names to the tournaments and competitions that vary in location based upon the year were seperately considered. Using a join, these were attatched to the data set and the data was grouped to total the number of matches won in each location.





Project Code

/* WTA Data from 2011-2022 */
select *
from PortfoiloProject..WTAdata
-- Wimbledon Career Stats by Player (Temp Table)
if OBJECT_ID('WimbledonCareers','U') Is not null
Drop Table WimbledonCareers
Create Table WimbledonCareers
(Date date, player nvarchar(255), opponent nvarchar(255), minutes numeric, breakpoints numeric,breakpointwins numeric,doublefaults numeric, aces numeric, winloss nvarchar(255))
Insert into WimbledonCareers
select tourney_date,winner_name as player,loser_name as opponent,minutes, l_bpFaced as breakpoints, (l_bpFaced-l_bpSaved) as breakpointwins, w_df as doublefaults, w_ace as aces,
CASE
when winner_name='Elena Rybakina' then 'Win'
when winner_name='Ons Jabeur' then 'Win'
else 'Loss'
end as winloss
from PortfoiloProject..WTAdata
where (winner_name='Elena Rybakina' or winner_name='Ons Jabeur') and tourney_name='Wimbledon'
Insert into WimbledonCareers
select tourney_date,loser_name as player,winner_name as opponent,minutes, w_bpFaced as breakpoints, (w_bpFaced-w_bpSaved) as breakpointwins, l_df as doublefaults, l_ace as aces,
CASE
when (loser_name='Elena Rybakina' or loser_name='Ons Jabeur') then 'Loss'
else 'Win'
end as winloss
from PortfoiloProject..WTAdata
where (loser_name='Elena Rybakina' or loser_name='Ons Jabeur') and tourney_name='Wimbledon'
delete from WimbledonCareers
where (player='Elena Rybakina' and opponent='Ons Jabeur')
or (opponent='Elena Rybakina' and player='Ons Jabeur')
select *
from WimbledonCareers
Select player, sum(minutes) as Totalmins,round((sum(breakpointwins)/sum(breakpoints)*100),1) as Breakpoint_Success, round(avg(doublefaults),1) as Avgdoublefaults, round(avg(aces),1) as Avgaces,
sum(case
when winloss='win' then 1 else 0
end)*100/count(winloss) as winpercent
from WimbledonCareers
group by player
-- Career Ranking variation
if OBJECT_ID('CareerRank','U') Is not null
Drop Table CareerRank
Create Table CareerRank
(Date date, player nvarchar(255), tourney_name nvarchar(255), wta_rank numeric)
Insert into CareerRank
select tourney_date, winner_name as player, tourney_name, winner_rank as wta_rank
from PortfoiloProject..WTAdata
where (winner_name='Elena Rybakina' or winner_name='Ons Jabeur')
Insert into CareerRank
select tourney_date, loser_name as player,tourney_name, loser_rank as wta_rank
from PortfoiloProject..WTAdata
where (loser_name='Elena Rybakina' or loser_name='Ons Jabeur')
select *
from CareerRank
group by Date, player, tourney_name, wta_rank
order by 2,1
-- Wimbledon 2022 Performance
select winner_name, loser_name, round, winner_ioc, loser_ioc, score, winner_rank, loser_rank
from PortfoiloProject..WTAdata
where year(tourney_date)='2022'
and tourney_name='Wimbledon'
and ((winner_name='Elena Rybakina' or winner_name='Ons Jabeur') or (loser_name='Elena Rybakina' or loser_name='Ons Jabeur'))
order by 1
--Success Map
--Matches won in different locations
--List of locations played at by the players
Select tourney_name
from PortfoiloProject..WTAdata
where (tourney_name Not like 'Fed Cup%' and tourney_name not like 'BJK Cup%' and tourney_name!='Olympics') and (winner_name='Elena Rybakina' or winner_name='Ons Jabeur')
group by tourney_name
--List of tournaments with varying locations based on year
Select tourney_name, year(tourney_date) as year
from PortfoiloProject..WTAdata
where (tourney_name like 'Fed Cup%' or tourney_name like 'BJK Cup%' or tourney_name='Olympics') and (winner_name='Elena Rybakina' or winner_name='Ons Jabeur')
group by tourney_name, year(tourney_date)
order by 2
--Join with seperate list to assign countries to each match played, grouped to find number of wins in each country
Select country, count(Country) as Wins, winner_name
from PortfoiloProject..WTAdata a
join PortfoiloProject..locationdata b
on a.tourney_name = b.tourney_name
where winner_name='Elena Rybakina' or winner_name='Ons Jabeur'
group by Country, winner_name
order by 1
Menu