Data Analysis: Developer 101 | Data Science Uncovered
A crisp analysis of the available data: student registrations and attendee information from GoToMeeting
Developer 101 | Data Science Uncovered
The following notebook is an analysis of an online webinar organised by Sathyabama Coding Club
- The data used here is private
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline
gotom_data = pd.read_excel("Developer 101_ Data Science Uncovered Attendees.xls")
reg_data = pd.read_excel("Developer 101 _ Data Science Uncovered (Responses).xlsx")
gotom_data.head(5)
# Row 5 of the export holds the column names; attendee rows start at row 6
attn_data = gotom_data.iloc[6:, :5].copy()
attn_data.columns = gotom_data.iloc[5, :5].values
attn_data.reset_index(drop=True, inplace=True)
attn_data.head()
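The header-promotion step above can be sketched on a small synthetic frame. This is only an illustration: the `raw` data, the `HEADER_ROW` constant and the `promote_header` helper are hypothetical and not part of the notebook or of the GoToMeeting export.

```python
import pandas as pd

# Synthetic stand-in for a raw export: metadata rows first, then the real header row.
raw = pd.DataFrame([
    ["Meeting report", None, None],
    ["Generated", "2020-05-01", None],
    ["Name", "Email Address", "Time in Session (minutes)"],
    ["Asha", "asha@example.com", 42],
    ["Ravi", "ravi@example.com", 17],
])

HEADER_ROW = 2  # position of the row holding the column names (assumed for this toy frame)

def promote_header(frame, header_row):
    """Return the rows below `header_row`, using that row's values as column names."""
    body = frame.iloc[header_row + 1:].copy()  # copy avoids modifying a view of `frame`
    body.columns = frame.iloc[header_row].tolist()
    return body.reset_index(drop=True)

clean = promote_header(raw, HEADER_ROW)
print(clean)
```

Taking an explicit `.copy()` before renaming the columns keeps pandas from warning about writes to a slice of the original frame.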
SESSION_DURATION = int(attn_data['Time in Session (minutes)'].max())
print("Session Duration in Minutes: ", SESSION_DURATION)
no_of_regs = len(reg_data)
REG_COUNT = no_of_regs
print("No of Registrations : ", REG_COUNT)
reg_data.Batch.value_counts()
sns.countplot(x="Batch", data=reg_data)
plt.title("Barplot on Academic year participation")
sns.countplot(x="Have you ever worked with Data Science before?", data=reg_data)
plt.title("Barplot over the background of participants in Data Science")
print(reg_data['Have you ever worked with Data Science before?'].value_counts())
# value_counts() sorts by frequency, so position 0 is the more common answer.
# This assumes "No" is the more common answer, as the printed counts above show.
web_mob_yes = reg_data["Have you ever worked with Data Science before?"].value_counts().iloc[1]
web_mob_no = reg_data["Have you ever worked with Data Science before?"].value_counts().iloc[0]
print("The percentage of registrants who have worked with Data Science before :",
      (web_mob_yes/no_of_regs)*100)
print("The percentage of registrants who have never worked with Data Science before :",
      (web_mob_no/no_of_regs)*100)
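A possible alternative to the positional lookup is to read the shares by answer label, so the result does not depend on which answer happens to be more frequent. The sketch below uses made-up answers and assumes the column contains the literal strings "Yes" and "No", which the notebook does not confirm.

```python
import pandas as pd

# Toy answers standing in for the registration column (values are made up).
answers = pd.Series(["No", "Yes", "No", "No", "Yes"], name="worked_before")

# normalize=True returns fractions instead of counts; multiply by 100 for percentages.
shares = answers.value_counts(normalize=True) * 100

# Look answers up by label, so the result does not depend on which one is more common.
pct_yes = shares.get("Yes", 0.0)
pct_no = shares.get("No", 0.0)

print(f"Worked with Data Science before: {pct_yes:.1f}%")
print(f"Never worked with Data Science before: {pct_no:.1f}%")
```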
sns.countplot(x="Knowledge of Python Programming language ",
data=reg_data)
sns.countplot(x="Do you think Math is Required for Machine Learning?",
data=reg_data)
reg_data["Where do you wish to use Data Science skills?"].value_counts().plot(kind='barh', figsize=(10,10))
Observation:
- The majority of participants wished to learn Data Science for:
  - Business Analytics
  - Predictions / Forecast
  - Natural Language Processing
  - Computer Vision
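Because each answer to this question is a comma-separated combination of choices, the bar chart counts combinations rather than individual areas. A minimal sketch of counting per area is shown below, using made-up answers; note that free-text "other" answers containing commas would also be split by this approach.

```python
import pandas as pd

# Toy multi-select answers (made up); each cell holds a comma-separated list of choices.
wishes = pd.Series([
    "Business Analytics, Prediction / Forecast",
    "Computer Vision",
    "Business Analytics, Computer Vision",
    "Prediction / Forecast, Natural Language Processing",
])

# Split each answer into a list, give every choice its own row, then count.
per_area = (
    wishes.str.split(",")
    .explode()
    .str.strip()
    .value_counts()
)
print(per_area)
```

On the real data the same chain could be applied to the "Where do you wish to use Data Science skills?" column before plotting.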
attn_data.head()
ATTENDEES_COUNT = len(attn_data['Name'].value_counts())
ATTENDEES_COUNT
# Cross-check: number of distinct attendee names via groupby
len(attn_data.groupby('Name'))
attn_data.groupby(['Name', 'Time in Session (minutes)']).sum().iloc[:,:0].head(20)
# Converting the 'Time in Session (minutes)' column values to int
attn_data['Time in Session (minutes)'] = pd.to_numeric(attn_data['Time in Session (minutes)'])
type(attn_data['Time in Session (minutes)'].iloc[0])
(attn_data['Time in Session (minutes)'] == attn_data['Time in Session (minutes)'].iloc[0]).all()
def time_agg(group_series):
    # If every value in the group is identical, keep one copy; otherwise add them up
    if (group_series==group_series.iloc[0]).all():
        return group_series.iloc[0]
    else:
        return group_series.sum()
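To see what this aggregation rule does, here is a small sketch on a made-up attendance log. The names and minutes are invented for illustration; only the `time_agg` logic mirrors the notebook.

```python
import pandas as pd

def time_agg(group_series):
    # Identical values are treated as one repeated record; otherwise the values are summed.
    if (group_series == group_series.iloc[0]).all():
        return group_series.iloc[0]
    return group_series.sum()

# Toy attendance log (made up): Asha has a duplicated row, Ravi rejoined once.
log = pd.DataFrame({
    "Name": ["Asha", "Asha", "Ravi", "Ravi", "Meena"],
    "Time in Session (minutes)": [40, 40, 10, 25, 55],
})

per_person = log.groupby("Name")["Time in Session (minutes)"].agg(time_agg)
print(per_person)
```

One consequence of this rule: an attendee who genuinely joined twice for the same number of minutes is counted once, so the rule trades that edge case for protection against duplicated rows.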
attn_data.groupby('Name', as_index=False).agg(time_agg)[['Name', 'Join Time', 'Leave Time', 'Time in Session (minutes)']]
atten_group_df = attn_data[['Name', 'Time in Session (minutes)', 'Email Address']].groupby('Name', as_index=False).agg(time_agg)
atten_group_df.sort_values(by=['Time in Session (minutes)'],ascending=False, inplace=True)
sns.catplot(x="Name", y="Time in Session (minutes)",
            data=atten_group_df, kind="bar",
            height=15, aspect=2,
            palette="muted")
# for value in plot:
# height = value.get_height()
# plt.text(value.get_x() + value.get_width()/2.,
# 1.002*height,'%d' % int(height), ha='center', va='bottom')
plt.xticks(rotation=45);
Individual time-spent analysis of attendees
sns.catplot(x="Name", y="Time in Session (minutes)",
            data=atten_group_df[atten_group_df["Time in Session (minutes)"] >= SESSION_DURATION//2],
            kind="bar",
            height=8, aspect=2,
            palette="muted")
plt.xticks(rotation=45);
atten_group_df[atten_group_df["Time in Session (minutes)"] >= SESSION_DURATION//2][['Name', 'Time in Session (minutes)']].set_index('Name')
atten_group_df
len(atten_group_df[atten_group_df["Time in Session (minutes)"] >= SESSION_DURATION//2].set_index('Name')['Time in Session (minutes)'])
registerd_attendes_ratio = (ATTENDEES_COUNT/REG_COUNT) * 100
print("Percentage of Students registered and attended the session {}".format(registerd_attendes_ratio))
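The ratio above divides the number of distinct attendee names by the number of registrations, which implicitly treats every attendee as a registrant. A stricter check would match the two tables on email address. The sketch below uses made-up tables; the `Email` column name on the registration side is hypothetical, since the notebook does not show the registration columns.

```python
import pandas as pd

# Toy tables (made up). The "Email" column name on the registration side is hypothetical.
registrations = pd.DataFrame(
    {"Email": ["a@example.com", "b@example.com", "c@example.com", "d@example.com"]}
)
attendance = pd.DataFrame(
    {"Email Address": ["A@example.com ", "c@example.com", "x@example.com"]}
)

def normalise(emails):
    # Trim whitespace and lower-case so the same address compares equal.
    return emails.str.strip().str.lower()

registered = set(normalise(registrations["Email"]))
attended = set(normalise(attendance["Email Address"]))

both = registered & attended        # registered and attended
walk_ins = attended - registered    # attended without a matching registration

pct_registered_who_attended = len(both) / len(registered) * 100
print(f"Registered and attended: {len(both)} ({pct_registered_who_attended:.1f}%)")
print(f"Attended without registering: {len(walk_ins)}")
```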
Registration Data Analysis
- Name of the Event: Developer 101 | Data Science Uncovered
- No of registrations: 186
- Registration count by batch:
  - 2021 : 76
  - Professional : 69
  - 2022 : 36
  - 2023 : 5
- No of registrations without prior knowledge of Data Science : 115 [61.82795698924731%]
- No of registrations with prior knowledge of Data Science : 71 [38.17204301075269%]
- No of registrations who are Beginners in Python Programming language : 76
- No of registrations who are Intermediate in Python Programming language : 98
- No of registrations who are Advanced in Python Programming language: 12
- Areas where registrants wish to use Data Science (answer combination : count):
  - Business Analytics, Prediction / Forecast, Natural Language Processing, Computer Vision : 39
  - Business Analytics, Prediction / Forecast : 24
  - Business Analytics, Prediction / Forecast, Natural Language Processing : 16
  - Prediction / Forecast, Natural Language Processing, Computer Vision : 15
  - Business Analytics : 14
  - Prediction / Forecast, Natural Language Processing : 11
  - Computer Vision : 11
  - Natural Language Processing, Computer Vision : 10
  - Prediction / Forecast : 10
  - Business Analytics, Computer Vision : 7
  - Business Analytics, Natural Language Processing : 6
  - Natural Language Processing : 4
  - Prediction / Forecast, Computer Vision : 4
  - Business Analytics, Natural Language Processing, Computer Vision : 3
  - Business Analytics, Prediction / Forecast, Computer Vision : 2
  - Natural Language Processing, Computer Vision, Deep Learning : 1
  - Business Analytics, Prediction / Forecast, Information Security : 1
  - Business Analytics, Prediction / Forecast, Natural Language Processing, Computer Vision, AI, Implementation in Web Apps : 1
  - Computer Vision, GAN : 1
  - Business Analytics, Prediction / Forecast, Natural Language Processing, Computer Vision, Almost every area : 1
  - Business Analytics, Prediction / Forecast, Use in data analysis for payments , health science and machine learning : 1
  - Business Analytics, Prediction / Forecast, Natural Language Processing, Computer Vision, BFSI,Medical health care, image processing etc.. : 1
  - AI Music Composer : 1
  - Business Analytics, Prediction / Forecast, Natural Language Processing, Computer Vision, Specific Industry : 1
  - Business Analytics, Prediction / Forecast, Natural Language Processing, Computer Vision, Recommendation : 1
Webinar Attendees Data Analysis
- No of attendees: 54
- No of students who stayed for at least half of the session duration: 28
- Attendees as a percentage of registrations: 29.03225806451613%
- Students who stayed for at least half of the session duration:
Sri Harish Teja Kummarikuntla Dikshita Basu Kajjal sneha gupta Sourav Kumar Rehan Razak sahib pratap singh Aditya Gowrish Menti reconnecting.... Suryanshu Singh Abhiram AJ Deepansh Neeraj Jayaram Dinesh L Alok Kumar Fireflies.ai Notetaker Gaurav Santosh Kumar Mugunthan Akash M Hardik Gupta keshav Mostlyinsane BVN PRANEETH Anand