How to count distinct number of customers?

This site uses cookies. By continuing to browse this site, you are agreeing to our Cookie Policy.

  • How to count distinct number of customers?

    Hello All.

    I managed so far to install Palo, modelize a database, import data and generate nice reports with Palo.

    However I'm stuck with an issue that could anihilate all these efforts.

    How to count the number of clients buying a certain type of product for a certain period of time. I will give you a very simplified example:

    You have 3 customers: A, B and C.
    you have 4 products with 2 family: Fam1 with products P1 and P2 and Fam2 with products P3 and P4.

    Let say customer A bought 3 P1 and 4 P3.
    Customer B 5 P2
    Customer C 6 P1, 2 P2, 8 P3 and 1 P4.

    Therefore you will agree that 2 distinct customers bought P1 and 2 distinct customers bought P2.
    But only 3 distinct customers bought products of family Fam1!
    In other words the consolidation rule is not 2+2 = 4. but 2+2 = 3 because customer C bought P1 AND P2.

    Of course we could make the same example with the time dimension:
    3 active customers in month1, 4 in month 2 but total month1 + month2 is 5 and not 7 since two customers who bought in month 1 also bought in month 2.

    How can we do that in Palo??
    someone else must have had to deal with this kind of issue.
    Counting the number of active customers, new customers, lost customers, conquered customers etc. is a classic and very important KPI for many businesses.

    Thank you for any advice,


    The post was edited 1 time, last by gilberto ().

  • Hi to all Palo users again.

    I made some progress on the above topic.
    I now know exactly what I would need to solve the issue.
    A rule able to sum up the values of all child elements when these elements are themselves calculated by a rule.


    I have a customer dimension like this:
    ChannelA is the parent element with 3 child elements:
    Cust A
    Cust B
    Cust C.

    I have a rule let's call it is_active that calculates if the customer is active or not. If active then it sets the cell to 1 if not it sets it to 0.

    Let's say that Cust A is active and Cust C is active. I will have:
    Cust A =1.
    Cust B = 0.
    Cust C = 1.

    All I would need is a rule able to say that ChannelA = 2.

    Any idea how to do it?

    Thank you,


    The post was edited 1 time, last by gilberto ().

  • If you create a separate cube dimensioned by customer, product and time and have an intercube rule that says if sales for this customer product combination in your main cube > 0 then the value in this cube = 1 then you can sum them in each period to get a distinct count of customers who bought the product each period. Would this answer your requirement?
    Best wishes

    John Hobson
    The Planning Factory, Lytham, UK
  • Hello Pommie.
    First of all thank you for answering me.

    Either I do not understand right your proposal (most likely) or I don't see how it can work.

    Let's simplify as much as possible the example and take only two dimensions:



    as well as two measures, QTY and NB_ACTIVE.
    QTY is filled in the cube.
    NB_ACTIVE has to be calculated by a rule.

    What we want is to have the distinct number of active customers (QTY>0) for any product or family of product.

    For a specific customer, for instance CustA, NB_ACTIVE will have only two possible values, 0 if CustA is not active and 1 if Cust A is active.

    But for CHANNEL1 NB_ACTIVE should vary between 0 (no active customers for the products or product family) and 3 (all customers active).

    If I follow what you propose I manage to get the good values for each customers.
    In the second cube (call it "sales_work") I write a rule like this one:

    ['NB_ACTIVE'] = N: IF(GT(PALO.DATA("TEST_DN", "sales", !'Customers', !'Products', "Qty"), 0), 1, 0)

    But then how can I sum up correctly the value at the product family level?

    Right now if CustA is active for P1 and P2 then I have 2 instead of 1 for FAM1.

    And if I change the rule to set "ALL ELEMENTS" instead of "based elements" I've got the good value at the product family level.

    But for CHANNEL1 I have 1 instead of 2 or 3. (the rule applies to channel1 like if channel1 was a customer).

    I hope I'm clear :)
  • I did say that this solution only works for each period!!

    "you can sum them in each period to get a distinct count of customers who bought the product each period"

    I'm not sure without spending more time one this how best to suggest you approach the time issue there.

    I'll have a think over the weekend if no-one else suggests anything
    Best wishes

    John Hobson
    The Planning Factory, Lytham, UK
  • Hi everyone.. I have a similar problem at the moment..
    my idea is to create a subset for the specific scenario in a cell.
    set the data filter accordingly (Number of buys > =)

    Then change the palo.subset() function to palo.subsetsize() and this should give you the number of distinct subjects..
  • Hi everbody,

    The palo function PALO.SUBSETSIZE works like a SQL count distintc.
    The syntax are the same like PALO.SUBSET, but if you use the sice type, the function returns the number of the diferents elements instead of an elements array.

    [Blocked Image:]

    Jorge López Hernández

    Business Intelligence Consultant

    +34 912 868 220–