This section lists questions to discuss with the stakeholder at the client meeting on May 8, 2023, in order to fine-tune the recommendations. The fine-tuned recommendations will be submitted by May 15, 2023.

The following are the follow-up questions for discussion with the stakeholder (client):

  1. Filtering recommendations made based on temporal insights

    1. The recommendations included weekend hours. Is ad space priced differently on weekends and weekdays? Should weekends be considered at all? Weekend ridership comes primarily from casual members. Would it be preferable to target hybrid workers (a mixture of annual and casual members) on weekdays only and ignore casual members on weekends?
  2. Filtering selected top stations and recommendations made based on geospatial insights

    1. Of the top ~90 recommended stations, would a smaller grouping of stations be more cost-effective? If so, some ideas for follow-up analysis include:
      1. (Discrete segmentation) Select a grouping based on

        1. performance, such as one of the following
          1. choose top 25 stations only
          2. choose top 50 stations only
          3. etc.
        2. location, such as one of the following
          1. choose all top-performing stations within downtown neighborhoods only
          2. choose stations within the subset of downtown neighborhoods that are closest to the MCU campus
          3. etc.
      2. (Data-driven segmentation) Perform further analysis to reduce the number of stations

        1. (option 1) cluster top ~90 stations or neighborhoods containing top ~90 stations
          1. if it is possible to identify k clusters using historical ridership and other station attributes, then choose top stations within one or more discovered clusters
        2. (option 2) forecast future ridership for top ~90 stations or neighborhoods containing top ~90 stations
          1. forecast future weekly or monthly departures and arrivals for the next four weeks (or one month) per
            1. station for the top ~90 stations, or
            2. neighborhood, for all neighborhoods containing the top ~90 stations
          2. (discrete) choose the stations that rank in the top
            1. 25, in terms of forecasted ridership (departures and arrivals)
            2. 50, in terms of forecasted ridership
          3. (data-driven) cluster using historical and forecasted ridership and other metadata
            1. identify k clusters based on historical and forecasted ridership
            2. choose stations within one or more discovered clusters

        In the recommendations, two groupings of stations were identified based on geospatial insights that were extracted during the EDA step. A high-level profile was provided for each grouping. The data-driven approach above has the added benefit of being able to create a non-geospatial profile for each cluster of stations that is discovered.

        In order to perform clustering at the

        • station level
          • the clustering dataset can be used as is
        • neighborhood level
          • based on published research on this topic, the distances between stations and the five closest points of interest in the clustering dataset would need to be replaced by the number of stations per neighborhood

        This approach could also help marketing initiatives fine-tune the digital ad content by identifying a content theme for each cluster of stations.

  3. Are there additional constraints on budget that should be incorporated into recommendations?
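
To make the segmentation ideas in item 2 more concrete for the discussion, the following is a minimal sketch, not the project's implementation. It assumes each station is described by two hypothetical features (average weekly trips and distance to campus), uses invented values, and swaps in a small pure-Python k-means with a deterministic farthest-point initialization in place of whatever clustering method is ultimately chosen. The names `STATIONS`, `top_n`, `standardize`, and `kmeans` are illustrative only.

```python
from statistics import mean

# Hypothetical records: (station_id, neighborhood, avg_weekly_trips, km_to_campus).
# Every value below is invented for illustration; it is not project data.
STATIONS = [
    ("S01", "Downtown A", 950.0, 0.4),
    ("S02", "Downtown A", 900.0, 0.6),
    ("S03", "Downtown B", 870.0, 0.5),
    ("S04", "Uptown", 320.0, 3.1),
    ("S05", "Uptown", 300.0, 2.8),
    ("S06", "Harbour", 280.0, 3.4),
]


def top_n(stations, n):
    """Discrete segmentation: keep the n stations with the highest ridership."""
    return sorted(stations, key=lambda s: s[2], reverse=True)[:n]


def standardize(rows):
    """Scale each feature column to zero mean and unit (population) variance."""
    cols = list(zip(*rows))
    means = [mean(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [tuple((x - m) / s for x, m, s in zip(r, means, stds)) for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=100):
    """Data-driven segmentation: Lloyd's k-means; returns one label per point."""
    # Deterministic farthest-point initialization (an assumption of this sketch).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(max_iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Recompute each center as the mean of its members (keep it if empty).
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(tuple(mean(c) for c in zip(*members)) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return labels


# Cluster on standardized features, then keep the top stations per cluster.
features = standardize([(s[2], s[3]) for s in STATIONS])
labels = kmeans(features, k=2)

clusters = {}
for station, label in zip(STATIONS, labels):
    clusters.setdefault(label, []).append(station)

for label, members in sorted(clusters.items()):
    print(label, [s[0] for s in top_n(members, 2)])
```

In practice, the feature tuples would come from the clustering dataset (station level) or from neighborhood-level aggregates, and the value of k would need to be chosen from the data rather than fixed at 2 as it is here.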