# Shopping cart analysis with R – Multi-layer pie chart

1st Jul ’15, 02:50 PM in Data Science

Sergey Bryl'

In this post, we will review a very interesting type of visualization, the Multi-layer Pie Chart, and use it for one of the marketing analytics tasks: shopping cart analysis. We will go from the initial data processing to the visualization of the shopping carts. I will share R code that does not require writing separate code for every layer of the chart. You can also find an example about how to create a Multi-layer Pie Chart here.

Ok, let’s suppose we have a list of first orders/carts that were bought by our clients. Each order consists of one or several products (or product categories). Our task is to visualize the relationships between products and see the share of orders that include each product or combination of products. The Multi-layer Pie Chart can help us draw each product and its intersections with others.

We load the necessary libraries with the following code:

We will simulate an example data set. Suppose we sell 4 products (or product categories): a, b, c and d, and each product can be sold with a different probability. Also, a client can purchase any combination of products, e.g. “a” or “a,b,a,d” and so on. Let’s do this with the following code:
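As a minimal sketch of such a simulation in base R (the seed, the number of orders, the probabilities, and the names ‘orders’, ‘items’ and ‘sim.order’ are assumptions made for illustration, not values taken from the post):

```r
# Sketch only: sample size, seed, and probabilities are assumed values.
set.seed(10)
n.orders <- 1000
products <- c("a", "b", "c", "d")
probs <- c(0.4, 0.3, 0.2, 0.1)  # assumed purchase probabilities

# Each order holds 1 to 4 line items drawn with replacement,
# so the same product may appear more than once (e.g. "a,b,a,d").
sim.order <- function() {
  k <- sample(1:4, 1)
  paste(sample(products, k, replace = TRUE, prob = probs), collapse = ",")
}

orders <- data.frame(
  orderId = seq_len(n.orders),
  items = replicate(n.orders, sim.order()),
  stringsAsFactors = FALSE
)
head(orders)
```

Sampling with replacement is a deliberate choice here: it produces repeated products within an order, which is exactly the case the processing step has to handle.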

After this, we will process the data to create a data frame for analysis. Specifically, we will:

• Remove the duplicates. For example, if the order contains the same product more than once (“a,b,a,d”), we want to exclude the effect of quantity,
• Combine the products into a new feature ‘cart’ that lists all unique products in the cart,
• Calculate the number of carts of each kind (‘num’ column).
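
The three steps above can be sketched in base R as follows. This is a minimal illustration on a tiny hand-made input; the ‘orders’ data frame and its ‘items’ column are assumed names, while ‘cart’, ‘num’, ‘prod.num’ and ‘prod.matrix’ follow the names used in the text.

```r
# Sketch with a tiny hand-made input; 'orders' and 'items' are assumed names.
orders <- data.frame(
  orderId = 1:6,
  items = c("a,b,a,d", "a", "b,a", "a,b", "d,b,a", "a"),
  stringsAsFactors = FALSE
)

# 1. Drop repeated products within an order and sort the rest,
#    so that "b,a" and "a,b" map to the same cart.
unique.items <- lapply(strsplit(orders$items, ","), function(x) sort(unique(x)))

# 2. Build the 'cart' feature from the unique products.
orders$cart <- vapply(unique.items, paste, character(1), collapse = ",")

# 3. Count the carts ('num') and the products per cart ('prod.num').
counts <- table(orders$cart)
prod.matrix <- data.frame(
  cart = names(counts),
  num = as.integer(counts),
  stringsAsFactors = FALSE
)
prod.matrix$prod.num <- lengths(strsplit(prod.matrix$cart, ","))
prod.matrix
```

Sorting the products before pasting them together matters: without it, “a,b” and “b,a” would be counted as two different carts.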

Let’s take a look at the resulting data frame with the head(prod.matrix) function:

From this point, we start working on our Multi-layer Pie Chart. My idea is to place orders that include one product into the core of the chart. To do that, we calculate the total number of products in each combination (‘prod.num’ value) and split the data frame into two: the first one (one.prod) will include carts with one product and the second one (sev.prod) carts with more than one product.
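
A minimal sketch of this split, assuming a data frame with the ‘cart’, ‘num’ and ‘prod.num’ columns described above (the numbers are made up for illustration):

```r
# Hypothetical input with made-up counts.
prod.matrix <- data.frame(
  cart = c("a", "b", "a,b", "a,b,d"),
  num = c(40L, 25L, 20L, 15L),
  prod.num = c(1L, 1L, 2L, 3L),
  stringsAsFactors = FALSE
)

# Carts with exactly one product go to the core of the chart,
# carts with several products go to the outer layers.
one.prod <- prod.matrix[prod.matrix$prod.num == 1, ]
sev.prod <- prod.matrix[prod.matrix$prod.num > 1, ]
```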

Now the data is ready for plotting. We will define parameters for the chart with the following code:

Note: we’ve defined the color palette with fourteen colors, including white for the gaps. This means that if you have more than thirteen unique products in the data set, you need to add extra colors. Finally, we will plot the Multi-layer Pie Chart with the following code:
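
One possible way to draw such a chart with base R graphics only is sketched below. It is an illustration under assumptions and not necessarily the approach or packages used for the original chart: the data, the three-color palette, and the helpers ‘draw.sector’ and ‘bounds’ are hypothetical. The sketch also assumes that the core and the outer rings each span the full circle, so slice widths are comparable only within the core or within the rings.

```r
# Sketch only: data, colours, and helper functions are hypothetical.
carts <- data.frame(
  cart = c("a", "b", "a,b", "a,c", "a,b,c"),
  num = c(40, 25, 20, 10, 5),
  stringsAsFactors = FALSE
)
carts$prod.num <- lengths(strsplit(carts$cart, ","))
one.prod <- carts[carts$prod.num == 1, ]
sev.prod <- carts[carts$prod.num > 1, ]

products <- c("a", "b", "c")
pal <- c(a = "tomato", b = "steelblue", c = "gold")  # one colour per product
blank <- "white"                                      # colour for the gaps

# Draw a ring segment between radii r0 and r1 and angles a0 and a1 (radians).
draw.sector <- function(r0, r1, a0, a1, col) {
  a <- seq(a0, a1, length.out = 60)
  polygon(c(r1 * cos(a), r0 * cos(rev(a))),
          c(r1 * sin(a), r0 * sin(rev(a))),
          col = col, border = "white")
}

# Angular boundaries of the slices: widths are proportional to 'num'
# and together cover the interval from 'from' to 'to'.
bounds <- function(num, from, to) {
  from + (to - from) * c(0, cumsum(num)) / sum(num)
}

plot(NULL, xlim = c(-1, 1), ylim = c(-1, 1), asp = 1,
     axes = FALSE, xlab = "", ylab = "", main = "Shopping carts")

# Core: single-product carts as an ordinary pie.
core <- bounds(one.prod$num, 0, 2 * pi)
for (i in seq_len(nrow(one.prod))) {
  draw.sector(0, 0.4, core[i], core[i + 1], pal[one.prod$cart[i]])
}

# Rings: one ring per product, one angular slot per multi-product cart.
# Looping over products avoids hand-written code for each layer.
slots <- bounds(sev.prod$num, 0, 2 * pi)
width <- 0.6 / length(products)
for (p in seq_along(products)) {
  r0 <- 0.4 + (p - 1) * width
  for (i in seq_len(nrow(sev.prod))) {
    has.p <- products[p] %in% strsplit(sev.prod$cart[i], ",")[[1]]
    draw.sector(r0, r0 + width, slots[i], slots[i + 1],
                if (has.p) pal[products[p]] else blank)
  }
}
legend("topright", legend = products, fill = pal, bty = "n")
```

In this layout a slot that is colored in several rings marks a cart containing all of those products, and a white segment means the product of that ring is absent from the cart.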

In case you want to add some statistics to the plot, like the total number of combinations or the share of each combination in the total, you just need to create this table and add it to the plot with the following code:
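
A minimal base R sketch of this idea, with made-up counts: it builds the table of shares and places it on the plot with legend(). The names ‘stat.tab’ and ‘labels’ are hypothetical, and plot.new() only stands in for the finished chart.

```r
# Hypothetical cart counts.
carts <- data.frame(
  cart = c("a", "b", "a,b", "a,b,d"),
  num = c(40L, 25L, 20L, 15L),
  stringsAsFactors = FALSE
)

# Statistics table: carts sorted by frequency, with their share in percent.
stat.tab <- carts[order(-carts$num), ]
stat.tab$share <- round(100 * stat.tab$num / sum(stat.tab$num), 1)

# One text line per combination.
labels <- sprintf("%s: %d carts (%.1f%%)",
                  stat.tab$cart, stat.tab$num, stat.tab$share)

plot.new()  # placeholder for the chart drawn earlier
legend("topleft", legend = labels,
       title = paste("Combinations:", nrow(stat.tab)),
       bty = "n", cex = 0.8)
```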

So, we have seen how the Multi-layer Pie Chart can help us draw each product and its intersections with others.