Efficient way of finding overlapping elements of sets and dispersing those elements b/w the sets

Discussion in 'AutoCAD' started by Leo, Jul 25, 2003.

  1. Leo

    Leo Guest

    Hi, I need some help with a problem we are trying to solve: We need an
    efficient way of finding the counts of different sets taking into
    account the overlap between the different sets. Also the counts of the
    different sets
    should take into account the way the overlaps occur between the sets.
    i.e. the counts to disperse the elements of the overlap between the
    two sets needs to take into account the universe count of each set and
    then use a ratio of the universe counts to disperse the overlapping
    elements.

    To better illustrate this, lets say there are two sets, whose
    universes are depicted by A and B respectively:

    Scenario I (Two Sets A & B):
    -----------------------------------

    Count (A) = 2000 // Universe Count for Set A

    Count (B) = 3000 // Universe Count for Set B

    Count (A & B) = 100 // Count for overlapping
    elements b/w Sets A & B

    Count (A - B) = 1900 // Count for elements
    belonging uniquely to Set A (and not B)

    Count (B - B) = 2900 // Count for elements
    belonging uniquely to Set B (and not A)


    Now the new counts for sets A & B will be computed as follows:

    Real_Count_A = Count(A - B) +
    Count(A & B) * [Count(A) / (Count(A) + Count(B))]

    Real_Count_B = Count(B - A) +
    Count(A & B) * [Count(B) / (Count(A) + Count(B))]

    i.e.

    Real_Count_A = 1900 +
    100 * [2000 / (2000 + 3000)]

    Real_Count_A = 1900 + 40 = 1940


    Real_Count_B = 2900 +
    100 * [2000 / (2000 + 3000)]

    Real_Count_B = 2900 + 60 = 2960


    So the count for Set A is made up of 1940 elements of which 1900
    elements belong exclusively to it plus the 40% of the 100 overlapping
    elements.
    Similarly the count for Set B is made up of 2960 elements of which
    2900 elements belong exclusively to it plus the 60% of the 100
    overlapping elements.
    [ This is because the ratio to disperse the overlapping
    elements b/w A and B was Count A / Count B = 2000 / 3000 = 40% / 60% ]


    Now if you were to extend this scenario to 3 Sets, the logic becomes:

    Scenario II (Three Sets A, B & C):
    -----------------------------------

    Real_Count_A = Count(A - B - C)
    + Count(A & B & ~C) * [Count(A) / (Count(A) + Count(B))]
    + Count(A & ~B & C) * [Count(A) / (Count(A) + Count(C))]
    + Count(A & B & C) * [Count(A) / (Count(A) + Count(B) + Count(C))]

    Real_Count_B = Count(B - A - C)
    + Count(B & A & ~C) * [Count(B) / (Count(A) + Count(B))]
    + Count(B & ~A & C) * [Count(B) / (Count(B) + Count(C))]
    + Count(A & B & C) * [Count(B) / (Count(A) + Count(B) + Count(C))]

    Real_Count_C = Count(C - A - B)
    + Count(C & B & ~A) * [Count(C) / (Count(B) + Count(C))]
    + Count(C & ~B & A) * [Count(C) / (Count(A) + Count(C))]
    + Count(A & B & C) * [Count(C) / (Count(A) + Count(B) + Count(C))]


    Scenario N: (N Sets)
    ---------------------

    If we generalize this for "N" Sets, I think there will be N * 2^(N-1)
    Total Terms to compute the counts for the "N" Sets.


    --------------------------------------------------------------------
    # of Sets = N Total Terms (N X 2 N-1 )
    --------------------------------------------------------------------
    1 1
    2 4
    3 12
    4 32
    5 80
    10 5120
    15 491520
    20 10485760
    --------------------------------------------------------------------

    If we use the above formula (using mathematical induction), there will
    be an
    exponential rise in the number of terms to be computed as the number
    of sets increases.

    What I would like to know is if there is any efficient method of
    finding the same counts (at least approximate counts) for the sets by
    using a more simpler formulae or other heuristic methods / tools?

    Any pointers will be greatly appreciated, Thanks!
    Leo
     
    Leo, Jul 25, 2003
    #1
Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.