Python, SQL, Data Pipeline Automation, Statistical Analysis (Correlation & Regression), Power BI (DAX & Visualization)

YouTube Channel Analytics

Description

This project built a complete data analytics pipeline to monitor and evaluate YouTube channel performance using the YouTube Data API v3, Python, and SQL. I engineered a system that automatically extracts video-level statistics such as view, like, and comment counts and stores daily snapshots in a structured SQL database for time-series analysis. This automation eliminated manual tracking and enabled consistent, scalable data collection across multiple video categories and upload periods.
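As a rough illustration of the extraction-and-snapshot step, the sketch below calls the public `videos.list` endpoint with `part=statistics` and writes one row per video per day. It is not the project's actual code: the function names, the `video_snapshots` table, and the use of SQLite as a stand-in for the SQL database are all assumptions made for the example.

```python
import json
import sqlite3
import urllib.parse
import urllib.request
from datetime import date

API_URL = "https://www.googleapis.com/youtube/v3/videos"


def fetch_video_stats(video_ids, api_key):
    """Request public statistics for a batch of video IDs (needs network and an API key)."""
    query = urllib.parse.urlencode(
        {"part": "statistics", "id": ",".join(video_ids), "key": api_key}
    )
    with urllib.request.urlopen(f"{API_URL}?{query}") as resp:
        return json.load(resp)


def _to_int(value):
    # The API returns counts as strings; a count can be absent (e.g. hidden likes).
    return int(value) if value is not None else None


def parse_video_stats(response):
    """Flatten a videos.list response into one dict per video."""
    rows = []
    for item in response.get("items", []):
        stats = item.get("statistics", {})
        rows.append({
            "video_id": item["id"],
            "views": _to_int(stats.get("viewCount")),
            "likes": _to_int(stats.get("likeCount")),
            "comments": _to_int(stats.get("commentCount")),
        })
    return rows


def store_snapshot(conn, rows, snapshot_date):
    """Insert one row per video for the given day (hypothetical schema)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS video_snapshots (
               snapshot_date TEXT NOT NULL,
               video_id TEXT NOT NULL,
               views INTEGER,
               likes INTEGER,
               comments INTEGER,
               PRIMARY KEY (snapshot_date, video_id)
           )"""
    )
    # INSERT OR REPLACE keeps a re-run on the same day from duplicating rows.
    conn.executemany(
        "INSERT OR REPLACE INTO video_snapshots VALUES (?, ?, ?, ?, ?)",
        [
            (snapshot_date.isoformat(), r["video_id"], r["views"], r["likes"], r["comments"])
            for r in rows
        ],
    )
    conn.commit()


# Offline demo with a hand-written response in the API's shape (made-up values).
sample_response = {
    "items": [
        {"id": "demo123", "statistics": {"viewCount": "1500", "likeCount": "120", "commentCount": "14"}}
    ]
}
demo_conn = sqlite3.connect(":memory:")
store_snapshot(demo_conn, parse_video_stats(sample_response), date.today())
print(demo_conn.execute("SELECT * FROM video_snapshots").fetchall())
```

Keying the table on the snapshot date and video ID is one way to make the daily job safe to re-run; a scheduler such as cron would call `fetch_video_stats` and `store_snapshot` once per day.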

Once the pipeline was in place, I ran exploratory and statistical analysis in Python to uncover patterns in audience engagement, retention, and growth. Correlation and regression modeling quantified how variables such as content type, video length, and upload frequency relate to audience reach and performance consistency, showing which factors were most strongly associated with each. These findings indicate how production and scheduling decisions could affect viewership and subscriber dynamics.
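To make the correlation and regression step concrete, here is a minimal sketch using only the standard library. The numbers are invented for illustration and are not results from this project; the function names and the choice of video length versus views as the pair of variables are likewise assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up example data: video length in minutes and views after 30 days.
video_minutes = [4.0, 6.5, 8.0, 10.0, 12.5, 15.0]
views_30d = [1200, 1850, 2100, 2600, 2900, 3700]

r = pearson_r(video_minutes, views_30d)
slope, intercept = fit_line(video_minutes, views_30d)
print(f"r = {r:.3f}; views = {slope:.1f} * minutes + {intercept:.1f}")
```

A coefficient near 1 or -1 signals a strong linear association, but neither the correlation nor the fitted slope shows that one variable causes the other. With several predictors at once, a multiple regression from a statistics library would be the natural next step.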

To make the results easier to interpret, I designed an interactive Power BI dashboard that visualizes audience growth, engagement ratios, traffic sources, and category-level performance. DAX calculations and dynamic filters let users explore trends interactively while reducing complex metrics to intuitive visuals. The dashboard gives a clear view of how well the content strategy is working and supports data-driven recommendations to improve engagement and visibility.

By combining API integration, statistical modeling, and visualization, the project demonstrated the value of automation and structured analytics for understanding how content performs on the platform. It connected raw platform data to actionable insights and provides a scalable framework for ongoing content performance evaluation and strategy optimization.

Key Outcomes

This project automated the extraction and analysis of YouTube data, turning separate video metrics into a single analytical workflow. By integrating the YouTube Data API with SQL-based storage and Power BI visualization, I created a unified platform that tracks audience engagement and growth across multiple time horizons. The analysis identified the factors most closely associated with audience retention: upload consistency, video duration, and content type all had significant correlations with engagement and growth. The project also serves as a model for how API-driven pipelines can support performance tracking, strategic decision-making, and continuous optimization of content strategy.

Future Considerations

Moving forward, this project can be expanded for greater scalability, depth, and accessibility. Migrating the SQL backend to a managed cloud service, such as a hosted PostgreSQL database or a data warehouse like BigQuery, would allow faster querying and support multi-channel analytics at scale. Adding predictive analytics, such as models that forecast future viewership, retention rates, or engagement trends, would turn the dashboard into a proactive strategy tool. Integrating private creator metrics through OAuth authentication could unlock deeper insights into demographics, watch time, and audience sources. Finally, rebuilding the dashboard in Streamlit or another web-hosted environment would make the analytics interactive and shareable online, turning this personal project into a complete end-to-end content intelligence solution.
