Y. Afisha Project: Revenue and Cohort Analysis Insights

The Y. Afisha Project delves into understanding revenue trends and customer behavior through comprehensive cohort analysis. This project aims to uncover key insights that can guide business strategy, particularly in enhancing customer retention and boosting revenue growth. By analyzing patterns over time and segmenting customers into cohorts based on their purchase behavior, the project sheds light on which strategies have the most significant impact on sustaining long-term business performance.

With a data-driven approach, the Y. Afisha Project helps stakeholders identify strengths and areas for improvement in customer engagement and revenue management. This analysis not only reveals how revenue evolves but also highlights the importance of retention strategies for maintaining a stable customer base.

Data Preparation

A solid data preparation phase was essential for the success of the Y. Afisha Project. This process involved meticulous cleaning and transformation to ensure the accuracy and reliability of the analysis. Key steps taken included:

  • Handling Missing Values: The dataset contained missing entries in some columns, notably in areas related to customer activity and revenue. A strategic approach was adopted where non-critical missing values were either left as NaN or filled using median values to maintain the data's integrity.
  • Date Transformations: Purchase dates were converted to datetime objects, allowing for easier extraction of features such as the month, year, and time-based segmentation needed for cohort analysis.
  • Customer Cohort Creation: Customers were grouped based on their first purchase date to create cohorts. This segmentation was crucial for tracking customer behavior over time and understanding how different groups contributed to revenue during their lifecycle.
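The three steps above can be sketched on a toy orders table. This is a minimal illustration, not the notebook's actual code: the sample rows are made up, and the helper columns `order_month` and `cohort_month` are assumed names.

```python
import numpy as np
import pandas as pd

# Made-up sample rows; the real dataset and its column names may differ.
orders = pd.DataFrame({
    "uid": [1, 1, 2, 3],
    "buy_ts": ["2017-06-05 10:00", "2017-07-10 18:30",
               "2017-07-01 09:15", "2017-07-20 21:00"],
    "revenue": [10.0, np.nan, 30.0, 20.0],
})

# Handling missing values: fill a numeric gap with the column median.
orders["revenue"] = orders["revenue"].fillna(orders["revenue"].median())

# Date transformations: parse timestamps and derive the first day of the purchase month.
orders["buy_ts"] = pd.to_datetime(orders["buy_ts"])
orders["order_month"] = orders["buy_ts"].dt.to_period("M").dt.to_timestamp()

# Cohort creation: each customer's cohort is the month of their first purchase.
orders["cohort_month"] = orders.groupby("uid")["order_month"].transform("min")
```

Using `transform("min")` keeps one row per order while attaching the cohort label to every row, which makes later group-bys by cohort straightforward.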

Data Preparation with Three Main DataFrames

A.- Visits DataFrame (`visitsdf`):

Contains server log data detailing user visits to Y. Afisha's platform from January 2017 to December 2018.

Key columns include user ID (`uid`), device type, start and end timestamps of the visit, and source IDs.

Actions taken:

  • Converted timestamps to datetime format for analysis.
  • Dropped entries with inconsistent data (e.g., end timestamps earlier than start timestamps).
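As a rough sketch of these two actions, assuming the visit columns are named `start_ts` and `end_ts` (only `uid` and `start_ts` are named in this article; the other column names and the sample rows are illustrative):

```python
import pandas as pd

# Made-up stand-in for visitsdf; the second row ends before it starts.
visitsdf = pd.DataFrame({
    "uid": [1, 2, 3],
    "device": ["desktop", "touch", "desktop"],
    "source_id": [1, 2, 3],
    "start_ts": ["2017-06-01 10:00:00", "2017-06-02 12:00:00", "2017-06-03 09:30:00"],
    "end_ts": ["2017-06-01 10:05:00", "2017-06-02 11:50:00", "2017-06-03 09:45:00"],
})

# Convert both timestamp columns to datetime.
for col in ["start_ts", "end_ts"]:
    visitsdf[col] = pd.to_datetime(visitsdf[col])

# Keep only visits whose end is not earlier than their start.
clean_visits = visitsdf[visitsdf["end_ts"] >= visitsdf["start_ts"]].reset_index(drop=True)
```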

B.- Orders DataFrame (`ordersdf`):

Captures user orders, with columns such as user ID (`uid`), purchase timestamp (`buy_ts`), and revenue.

Actions taken:

  • Parsed `buy_ts` as datetime objects.
  • Removed outliers and inconsistencies to maintain data accuracy.

C.- Costs DataFrame (`costsdf`):

Contains marketing costs associated with different dates, representing the expenses incurred by the platform for promotional activities.

Key steps:

  • Converted date columns to datetime format.
  • Verified consistency between marketing expenses and relevant timeframes.

Cohort Assignment and Merging

These dataframes were merged and aligned to enable a cohesive analysis:

  • Merging Process: The `visitsdf`, `ordersdf`, and `costsdf` dataframes were merged using user ID and date as primary keys. This merging allowed us to connect user interactions (visits), purchases (orders), and marketing expenses (costs) for a comprehensive view of user behavior and revenue generation.
  • Cohort Creation: Users were grouped into cohorts based on their first recorded visit (`start_ts`). This cohort label was then applied across all dataframes to track user activity, revenue contributions, and marketing impacts over time.
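One way to apply a first-visit cohort label to another dataframe is sketched below. The sample data and the `cohort_month` column name are assumptions for illustration; the notebook may join the tables differently.

```python
import pandas as pd

# Made-up visits and orders for two users.
visits = pd.DataFrame({
    "uid": [1, 1, 2],
    "start_ts": pd.to_datetime(["2017-06-03 10:00", "2017-08-15 12:00", "2017-07-20 09:00"]),
})
orders = pd.DataFrame({
    "uid": [1, 2, 2],
    "buy_ts": pd.to_datetime(["2017-08-15 12:10", "2017-07-20 09:30", "2017-09-01 14:00"]),
    "revenue": [15.0, 8.0, 12.0],
})

# Cohort = month of each user's first recorded visit.
first_visit = (
    visits.groupby("uid")["start_ts"].min()
    .dt.to_period("M").dt.to_timestamp()
    .rename("cohort_month")
    .reset_index()
)

# Attach the cohort label to every order of the same user.
orders_with_cohort = orders.merge(first_visit, on="uid", how="left")
```

A left merge keeps every order even if a user has no recorded visit; such rows would simply get a missing cohort label.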

These steps in data preparation set a strong foundation for analyzing revenue, user behavior, and cohort retention.

Key Findings and Analysis

After preparing and merging the three main dataframes (`visitsdf`, `ordersdf`, and `costsdf`), we uncovered significant insights into revenue trends, cohort behavior, and the impact of marketing expenses. Below are the main findings:

Revenue Trends Over Time

The analysis of revenue from January 2017 to December 2018 provided the following insights:

  • Steady Revenue Growth: The data displayed an upward trend in revenue over the analyzed period, with certain months showing pronounced revenue spikes. These increases often coincided with major marketing campaigns or seasonal peaks, indicating successful promotional strategies.
  • Seasonal Influences: Months such as December and July stood out due to higher revenue, likely reflecting holiday and mid-year sales promotions that boosted user spending.
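A monthly revenue series of the kind described here can be built with a simple group-by. The figures below are invented purely to show the mechanics:

```python
import pandas as pd

# Invented orders spanning three months.
orders = pd.DataFrame({
    "buy_ts": pd.to_datetime(
        ["2017-11-05", "2017-11-20", "2017-12-03", "2017-12-24", "2018-01-10"]
    ),
    "revenue": [10.0, 5.0, 20.0, 30.0, 12.0],
})

# Sum revenue per calendar month.
monthly_revenue = orders.groupby(orders["buy_ts"].dt.to_period("M"))["revenue"].sum()

# Month with the highest total revenue.
peak_month = monthly_revenue.idxmax()
```

Plotting `monthly_revenue` as a line or bar chart is then enough to spot spikes and seasonal peaks by eye.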


Revenue Distribution by Source and Device

The analysis of revenue by source and device provided valuable insights into user behavior and the contribution of different traffic sources. The visualization highlights the following:

Context and Data Interpretation:

  • Top Revenue Sources: The data showed that Source IDs 1 and 2 were the most significant contributors to total revenue, each generating over 1.6 million in revenue. This indicates that these sources are highly effective in driving user conversions and purchases.
  • Device Type Impact: Revenue from desktop devices formed the majority share across all sources, well ahead of touch devices (e.g., mobile). This finding suggests that desktop users make more purchases, or higher-value purchases, than touch-device users.
  • Mixed Performance of Other Sources: Other sources such as Source ID 5 also contributed notable revenue, although significantly lower than the top sources. Source IDs 9 and 10 had minimal revenue impact, indicating their limited effectiveness.
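The breakdown behind a chart like this can be sketched with a pivot table. The example assumes each order has already been attributed to a source and a device (for instance through the user's visits); the table name `attributed` and all numbers are hypothetical.

```python
import pandas as pd

# Hypothetical orders already attributed to a source and a device.
attributed = pd.DataFrame({
    "source_id": [1, 1, 2, 2, 9],
    "device": ["desktop", "touch", "desktop", "desktop", "touch"],
    "revenue": [100.0, 20.0, 80.0, 40.0, 5.0],
})

# Rows: sources; columns: device types; cells: total revenue.
revenue_by_source_device = attributed.pivot_table(
    index="source_id", columns="device", values="revenue",
    aggfunc="sum", fill_value=0,
)

# Total revenue per source across devices.
revenue_by_source_device["total"] = revenue_by_source_device.sum(axis=1)
```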

Strategic Insights:

  • Optimization Focus: The strong performance of Source IDs 1 and 2 suggests that marketing and engagement efforts should prioritize these channels to maximize returns. Investing in campaigns that cater specifically to desktop users can further boost revenue, as these users demonstrate higher spending power.
  • Mobile Engagement: While touch devices contributed less revenue overall, their consistent presence indicates an opportunity to enhance mobile user experience and potentially increase conversion rates. This could involve optimizing the website for mobile use or introducing targeted promotions for mobile users.
  • Diverse Strategy for Low-Impact Sources: Sources such as 9 and 10 may need reevaluation or specific strategies to improve their effectiveness. Alternatively, resources may be better allocated toward already successful sources or exploring new channels.

This analysis demonstrates how understanding revenue distribution across different sources and devices can inform data-driven marketing and platform strategies.

Cost Distribution by Source ID

The Cost Distribution by Source ID visualization reveals how marketing resources are allocated:

  • Source ID 3 dominates with 42.9% of the total costs, indicating a significant investment in this channel.
  • Source IDs 4 and 5 also contribute notably with 18.6% and 15.7%, respectively.
  • In contrast, Source IDs 9 and 10 account for only 1.7% and 1.8%, indicating minimal investment in these channels.

This data highlights the need to assess the ROI for high-cost sources and explore optimization strategies for lower-cost channels.
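Percentage shares like these come from dividing each source's total by the grand total. A minimal sketch, assuming a costs table with `source_id`, `dt`, and `costs` columns (assumed names) and made-up amounts:

```python
import pandas as pd

# Made-up daily marketing costs per source.
costs = pd.DataFrame({
    "source_id": [3, 3, 4, 10],
    "dt": pd.to_datetime(["2017-06-01", "2017-06-02", "2017-06-01", "2017-06-01"]),
    "costs": [30.0, 20.0, 40.0, 10.0],
})

# Total cost per source, then each source's share of the overall spend in percent.
cost_by_source = costs.groupby("source_id")["costs"].sum()
cost_share_pct = (cost_by_source / cost_by_source.sum() * 100).round(1)
```

The same two lines applied to a revenue column would give a revenue share per source, which makes the cost and revenue distributions directly comparable.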

Revenue Distribution by Source ID

The Revenue Distribution by Source ID visualization provides insights into how different sources contribute to overall revenue. Here’s a concise analysis:

  • Source ID 1 contributes the largest share at 34.8%, indicating it is a key driver of revenue.
  • Source ID 2 follows closely behind with 33.7%, highlighting its significant role in generating income.
  • Source ID 5 accounts for 18.0%, while Source IDs 3 and 4 contribute smaller shares of 7.8% and 4.8%, respectively.

This distribution showcases the reliance on specific sources for revenue generation and highlights areas where efforts could be concentrated for maximizing returns.


Cumulative Revenue by Source ID

The Cumulative Revenue by Source ID visualization illustrates how revenue accumulates over time from different sources. Here’s a concise analysis:

  • Source ID 1 and Source ID 2 show the steepest cumulative revenue growth, indicating they are the most effective sources for generating income over the analyzed period.
  • Source ID 3 demonstrates moderate growth, while Source IDs 4 and 5 exhibit slower accumulation, suggesting they contribute less significantly to overall revenue.
  • Sources 7, 9, and 10 have the least impact, with their cumulative revenues remaining low throughout the time frame.

This data indicates that focusing on the top-performing sources, particularly Source IDs 1 and 2, can optimize revenue strategies, while further investigation into the lower-performing sources could reveal opportunities for improvement.
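Cumulative curves of this kind can be derived from monthly totals with a grouped running sum. The sketch below uses invented monthly figures for two sources; the `monthly` table and its column names are assumptions.

```python
import pandas as pd

# Invented monthly revenue for two sources.
monthly = pd.DataFrame({
    "source_id": [1, 1, 1, 9, 9, 9],
    "month": pd.to_datetime(["2017-06-01", "2017-07-01", "2017-08-01"] * 2),
    "revenue": [100.0, 150.0, 200.0, 5.0, 3.0, 4.0],
})

# Running total of revenue within each source, in chronological order.
monthly = monthly.sort_values(["source_id", "month"])
monthly["cumulative_revenue"] = monthly.groupby("source_id")["revenue"].cumsum()

# One column per source, ready for a line plot.
cumulative = monthly.pivot(index="month", columns="source_id", values="cumulative_revenue")
```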

Number of Users by Source ID

The Number of Users by Source ID visualization provides valuable insights into user acquisition across different marketing channels. Here’s the analysis:

  • Top User Sources: Source IDs 3 and 4 stand out with the highest user counts, registering over 14,000 users each. This indicates that these sources are effective in attracting significant traffic to the platform.
  • Source IDs 1 and 2 also contribute a substantial number of users, with counts around 6,000 to 8,000. This suggests they are reliable channels for user acquisition.
  • Lower User Engagement: In contrast, Source IDs 7, 9, and 10 attracted far fewer users, indicating that these sources may not be as effective for driving traffic.

The data highlights the importance of focusing on high-performing sources for user acquisition, which can lead to increased revenue potential.
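User counts per source are typically computed as the number of distinct user IDs seen for each source. A small sketch with made-up visits:

```python
import pandas as pd

# Made-up visits; user 3 arrived through two different sources.
visits = pd.DataFrame({
    "uid": [1, 1, 2, 3, 3, 4],
    "source_id": [3, 3, 3, 4, 1, 10],
})

# Distinct users per source, largest first.
users_by_source = (
    visits.groupby("source_id")["uid"].nunique().sort_values(ascending=False)
)
```

With this definition a user who arrives through several sources is counted once for each of them, so the per-source counts can add up to more than the total number of users.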

Customer Retention and Churn Analysis

In this section, we focus on understanding customer retention rates and identifying factors contributing to churn across different cohorts. By analyzing how users interact with the platform over time, we can derive actionable insights to enhance customer engagement and loyalty.

Retention Rates by Cohort

The cohort analysis provided key insights into retention trends:

  • Declining Retention Rates: As cohorts aged, a noticeable decline in retention rates was observed. For example, cohorts from the beginning of 2017 showed higher retention in their initial months compared to those formed later. This suggests that while new users may initially engage well, sustaining their interest over time is a challenge.
  • Cohort Performance: The first few cohorts performed significantly better in terms of retention compared to subsequent cohorts. Strategies that worked well for early adopters may need to be adjusted to maintain interest among newer users.

The Customer Retention Matrix shows that the October 2017 cohort had the highest user count, 4,335, indicating strong engagement from that cohort. User counts are lower for newer cohorts, suggesting decreased engagement. Darker shades in the matrix represent higher user counts, so periods of strong retention stand out at a glance. This analysis underscores the need for targeted strategies to maintain engagement with newer customer cohorts.
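A matrix of user counts per cohort and cohort lifetime can be built as sketched below. The activity rows are invented, and `activity_month`, `cohort_month`, and `lifetime` are assumed column names rather than the notebook's own.

```python
import pandas as pd

# Invented activity log: one row per user per active month (duplicates allowed).
activity = pd.DataFrame({
    "uid": [1, 1, 2, 2, 3, 3],
    "activity_month": pd.to_datetime(
        ["2017-10-01", "2017-11-01", "2017-10-01", "2017-12-01", "2017-11-01", "2017-11-01"]
    ),
})

# Cohort = first active month of each user.
activity["cohort_month"] = activity.groupby("uid")["activity_month"].transform("min")

# Cohort lifetime in whole months since the cohort month.
activity["lifetime"] = (
    (activity["activity_month"].dt.year - activity["cohort_month"].dt.year) * 12
    + (activity["activity_month"].dt.month - activity["cohort_month"].dt.month)
)

# Rows: cohorts; columns: lifetime; cells: number of distinct active users.
user_counts = activity.pivot_table(
    index="cohort_month", columns="lifetime", values="uid", aggfunc="nunique"
)
```

Cells for months a cohort has not reached yet stay empty (NaN), which is why such matrices have a triangular shape when drawn as a heatmap.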

The Monthly Average Revenue by First Order Month matrix provides insights into average revenue generated from customers based on their first order month. Here are the key insights:

The highest average revenue appears in October 2017, at 26.8, indicating that customers who made their first purchase in that month tended to spend significantly more compared to other months.

In contrast, Source ID 10 shows the lowest average revenue, reflecting limited engagement or spending by users from that cohort.

The matrix also indicates a general trend where customers from earlier months tend to have higher average revenue, suggesting that initial engagement strategies were more effective during that period.

The Average Customer Purchase Size matrix reveals insights into the average spending behavior of customers based on their first order month and their cohort lifetime. Here are the key insights:

The highest average purchase size occurs for customers who made their first purchase in October 2017, reaching 26.8. This suggests that users acquired during this month tend to spend more than other cohorts, possibly due to effective marketing strategies or promotions.

As we move down the matrix, there is a general decline in average purchase sizes for newer cohorts, indicating that retention and purchase behavior may not be as strong for customers acquired later.

In particular, customers with a cohort lifetime of 0 months (i.e., their first purchase month) show varying average sizes but tend to be higher in the early months, suggesting initial engagement is crucial for spending behavior.
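As an illustration, the sketch below treats "average purchase size" as mean revenue per order within each first-order-month and lifetime cell. That definition, the sample orders, and the column names are assumptions; the notebook may compute the metric differently.

```python
import pandas as pd

# Invented orders for two customers.
orders = pd.DataFrame({
    "uid": [1, 1, 1, 2, 2],
    "order_month": pd.to_datetime(
        ["2017-10-01", "2017-10-01", "2017-11-01", "2017-11-01", "2017-12-01"]
    ),
    "revenue": [20.0, 30.0, 10.0, 8.0, 4.0],
})

# First order month per customer and months elapsed since then.
orders["first_order_month"] = orders.groupby("uid")["order_month"].transform("min")
orders["lifetime"] = (
    (orders["order_month"].dt.year - orders["first_order_month"].dt.year) * 12
    + (orders["order_month"].dt.month - orders["first_order_month"].dt.month)
)

# Mean revenue per order for each cohort and lifetime month.
avg_purchase_size = orders.pivot_table(
    index="first_order_month", columns="lifetime", values="revenue", aggfunc="mean"
)
```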

Churn Analysis

Understanding the factors that contribute to customer churn is critical for formulating retention strategies:

  • User Behavior Insights: Analysis indicated that users who visited the platform less frequently were more likely to churn. Therefore, increasing engagement through targeted campaigns can help mitigate this risk.
  • Marketing Impact: Churn rates appeared to correlate with marketing spend; periods of low marketing activity often coincided with higher churn rates. This emphasizes the importance of consistent engagement through effective marketing efforts.

The Cohorts: User Retention matrix provides crucial insights into how user retention varies across different cohorts based on their first activity month. Here are the key insights:

This matrix shows retention rates for various cohorts over their lifetime. For instance, the cohort that first engaged in June 2017 retains 7.9% of users by the end of the first month, dropping to 4.5% by the end of the 11th month.

The October 2017 cohort displays strong retention, keeping 7.9% of its users after one month, and then declines gradually, which reflects the need for consistent engagement strategies.

The overall trend indicates that retention rates decline significantly as the months progress for all cohorts, highlighting a common challenge in maintaining user engagement over time.
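Retention rates are usually obtained by dividing each cohort's active-user count in a given lifetime month by the cohort's initial size. A minimal sketch with made-up counts (not the project's data):

```python
import pandas as pd

# Made-up user counts: rows are cohorts, columns are lifetime months.
user_counts = pd.DataFrame(
    {0: [1000, 800], 1: [79, 60], 2: [50, None]},
    index=pd.to_datetime(["2017-06-01", "2017-07-01"]),
)
user_counts.index.name = "cohort_month"
user_counts.columns.name = "lifetime"

# Divide every row by its lifetime-0 count to get the retained share.
retention = user_counts.div(user_counts[0], axis=0)
retention_pct = (retention * 100).round(1)
```

Lifetime month 0 is always 100% under this definition, so heatmaps often omit that column to keep the color scale readable.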

The Cohorts: Cancel Rates matrix provides insights into the cancellation behavior of customers based on their first activity month. Here are the key insights:

The highest cancellation rate appears in January 2018, reaching 55.4%. This indicates that customers who started engaging with the platform during this period were more likely to cancel their accounts.

As we examine earlier cohorts, we see a gradual decline in cancellation rates, particularly for those who first engaged in June 2017, which has a low cancellation rate of 0.0%.

The matrix reveals a pattern where newer cohorts tend to experience higher cancellation rates over time, suggesting a challenge in retaining users acquired during those months.

Strategic Recommendations

To address the insights gained from the retention and churn analysis, the following strategies are recommended:

  • Implement enhanced engagement tactics through regular communication and personalized marketing efforts. Tailored promotions or reminders can effectively re-engage users who have shown reduced activity.
  • Establish feedback channels to better understand user needs and pain points. This feedback will inform adjustments to services or offers, ultimately improving user satisfaction and reducing churn.
  • Focus on targeted re-engagement campaigns for cohorts exhibiting declining engagement. Offering incentives can help encourage these users to return to the platform.

These recommendations underscore the necessity of proactive strategies to enhance user loyalty and support sustained revenue growth.

Conclusion

The analysis of user behavior, retention, churn, and revenue trends for Y. Afisha has provided valuable insights into the dynamics of customer engagement and profitability. Key findings highlighted the importance of understanding cohort performance, the relationship between marketing spend and revenue, and the varying behaviors of users across different sources and devices.

The recommendations outlined (enhancing user engagement through personalized marketing, establishing robust feedback mechanisms, and targeting retention campaigns) are crucial for addressing the challenges of declining engagement and churn rates. By refining marketing strategies and focusing on high-performing sources, Y. Afisha can leverage its data to foster user loyalty and drive sustained revenue growth.

Ultimately, a proactive and data-driven approach will empower Y. Afisha to adapt to changing user behaviors and market conditions, ensuring long-term success and profitability.

The complete analysis and technical details of the Y. Afisha Project are available in the dedicated GitHub repository, which contains the full Jupyter Notebook with code, visualizations, and explanations of the methodologies and insights derived throughout the project:

https://github.com/ricardosillercardenas/ricardo_siller_da_projects/blob/main/Y.Afisha_Project.ipynb


