*Note. In order to understand the inversion counting method in this blog, you need to be familiar with mergesort algorithm first. Especially the "divide and conquer paradigm" concept applied in mergesort.*

*I have explained that mergesort can be done in O(n*log(n)) in previous blog here*

**What is inversion**

**given an array A with length n, the number of inversion in this array equals to number of pairs (A[i], A[j]) with i < j and A[i] > A[j].**

example: array = {1,3,5,2,4,6}, the inversions are (5,2) (5,4) (3,2)

**Why study inversion**

**for example: to study the similarity of movie taste for two people.**

identify 6 movies and give them to the two to rank according to their favors.

input result from 1st person as array A and result from 2nd person as array B, then calculate the number of inversions to quantify their movie favor differences.

**The design of inversion counting**

1.

it uses

*"divide and conquer paradigm"*and algorithm as below:

count_inversion (array A)

count_inversion (left half of A i,j < n/2)

count_inversion (right half of A i,j > n/2)

count_split_inversion (i < n/2 < j)

at each level of count method, the operation carried out is "count_split_inversion".

**The key challenge**is to make sure "count_split_inversion" method can be done in O(n) so that inversion counting can be done with O(nlog(n)).

2,

**(brilliant idea!)**

Piggyback on mergesort!

**sort_and_**count_inversion (array A)

**sort_and_**count_inversion (left half of A i,j < n/2)

**sort_and_**count_inversion (right half of A i,j > n/2)

**merge_and**_count_split_inversion (i < n/2 < j)

The reason to add mergesort, which has O(nlog(n)) is that it will reduce the inversion count complexity from O(n^2) to O(nlog(n)), and we will see as below:

when do merge_and_count_split_inversion from "sorted left half A" and "sorted right half A", items are copied from them to an empty array (same as in mergesort). Now we can claim that: whenever we compare A[i] and A[j] if we found A[i] > A[j] (i<n/2<j) in this step , an inversion is found. this step is O(n).

Therefore, the computational complexity at each level is O(n) and the total complexity is O(nlog(n)).

**C# Implementation**

```
using System;
using System.Linq;
namespace InversionCount
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine("please give a integer for the size of array, system will generate internal values and do sorting for you");
int size = int.Parse(Console.ReadLine());
int[] count = new int[size];
RandomizeArray(ref count);
SortAndCountInversion(count);
Console.ReadLine();
}
public static int[] SortAndCountInversion(int[] A)
{
if (A.Length == 1)
return A;
int size = A.Length;
int halfSize = A.Length / 2;
int[] A1 = new int[halfSize];
int[] A2 = new int[A.Length - halfSize];
Split<int>(A, halfSize, out A1, out A2);
int[] ResultA = SortAndCountInversion(A1);
int[] ResultB = SortAndCountInversion(A2);
return MergeAndCoutInversion(ResultA, ResultB);
}
private static int[] MergeAndCoutInversion(int[] ResultA, int[] ResultB)
{
int resultSize = ResultA.Length + ResultB.Length;
int[] resultArray = new int[resultSize];
int i = 0, j = 0;
for (int k = 0; k < resultSize; k++)
{
if (i == ResultA.Length)
resultArray[k] = ResultB[j++];
else if (j == ResultB.Length)
resultArray[k] = ResultA[i++];
else if (ResultA[i] <= ResultB[j])
resultArray[k] = ResultA[i++];
else if (ResultA[i] > ResultB[j])
{
for (int s = i; s < ResultA.Length; s++)
{
Console.WriteLine("{0} - {1}", ResultA[s], ResultB[j]);
}
resultArray[k] = ResultB[j++];
}
else
{
Console.WriteLine("odd!");
}
}
return resultArray;
}
public static void Split<T>(T[] array, int index, out T[] first, out T[] second)
{
first = array.Take(index).ToArray();
second = array.Skip(index).ToArray();
}
private static void RandomizeArray(ref int[] count)
{
int size = count.Length;
Random random = new Random();
int randomInt;
for (int i = 0; i < size; i++)
{
randomInt = random.Next(0, size);
count[i] = randomInt;
}
}
}
}
```

## No comments:

## Post a Comment