Tuesday, June 30, 2015

C# Nutshell Chapter 3 Creating Types in C#

Object Initializers

To simplify object initialization, any accessible fields or properties of an object can
be set via an object initializer directly after construction.

public class Bunny
{
public string Name;
public bool LikesCarrots;
public bool LikesHumans;
public Bunny () {}
public Bunny (string n) { Name = n; }
}

// Note parameterless constructors can omit empty parentheses
Bunny b1 = new Bunny { Name="Bo", LikesCarrots=true, LikesHumans=false };
Bunny b2 = new Bunny ("Bo") { LikesCarrots=true, LikesHumans=false };

Properties

Properties look like fields from the outside, but internally they contain logic, like
methods do.
A property is declared like a field, but with a get/set block added. Here’s how to
implement CurrentPrice as a property:
public class Stock
{
decimal currentPrice; // The private "backing" field
public decimal CurrentPrice // The public property
{
get { return currentPrice; } set { currentPrice = value; }
}
}

Const v.s. static readonly

A static readonly field is also advantageous when exposing to
other assemblies a value that might change in a later version.
For instance, suppose assembly X exposes a constant as follows:
public const decimal ProgramVersion = 2.3;
If assembly Y references X and uses this constant, the value 2.3
will be baked into assembly Y when compiled. This means that
if X is later recompiled with the constant set to 2.4, Y will still
use the old value of 2.3 until Y is recompiled. A static
readonly field avoids this problem.
Another way of looking at this is that any value that might
change in the future is not constant by definition, and so should
not be represented as one.


Structs

A struct is similar to a class, with the following key differences:
  • • A struct is a value type, whereas a class is a reference type.
  • • A struct does not support inheritance (other than implicitly deriving from object, or more precisely, System.ValueType).

A struct can have all the members a class can, except the following:
  • • A parameterless constructor
  • • A finalizer
  • • Virtual members
public struct Point
{
int x = 1; // Illegal: cannot initialize field
int y;
public Point() {} // Illegal: cannot have
// parameterless constructor
public Point (int x) {this.x = x;} // Illegal: must assign field y
}

Class Access Modifiers

public
Fully accessible. This is the implicit accessibility for members of an enum or
interface.
internal
Accessible only within containing assembly or friend assemblies. This is the
default accessibility for non-nested types.
private
Accessible only within containing type. This is the default accessibility for
members of a class or struct.
protected
Accessible only within containing type or subclasses.

Class2 is accessible from outside its assembly; Class1 is not:

class Class1 {} // Class1 is internal (default)
public class Class2 {}

ClassB exposes field x to other types in the same assembly; ClassA does not:

class ClassA { int x; } // x is private (default)
class ClassB { internal int x; }

Functions within Subclass can call Bar but not Foo:

class BaseClass
{
void Foo() {} // Foo is private (default)
protected void Bar() {}
}
class Subclass : BaseClass
{
void Test1() { Foo(); } // Error - cannot access Foo
void Test2() { Bar(); } // OK
}

Constructors and Inheritance

A subclass must declare its own constructors. The base class’s constructors are
accessible to the derived class, but are never automatically inherited.

public class Baseclass
{
public int X;
public Baseclass () { }
public Baseclass (int x) { this.X = x; }
}
public class Subclass : Baseclass { }
//the following is illegal:
Subclass s = new Subclass (123);

Subclass must hence “redefine” any constructors it wants to expose. In doing so,
however, it can call any of the base class’s constructors with the base keyword:
public class Subclass : Baseclass
{
public Subclass (int x) : base (x) { }
}

If a constructor in a subclass omits the base keyword, the base type’s parameterless constructor is implicitly called. If the base class has no accessible parameterless constructor, subclasses are forced to use the base keyword in their constructors.

public class BaseClass
{
public int X;
public BaseClass() { X = 1; }  //without this, subclass has to use 'base' keyword
}
public class Subclass : BaseClass
{
public Subclass() { Console.WriteLine (X); } // 1
}


Using Interface or Superclass

As a guideline:
• Use classes and subclasses for types that naturally share an implementation.
• Use interfaces for types that have independent implementations.
Consider the following classes:
abstract class Animal {}
abstract class Bird : Animal {}
abstract class Insect : Animal {}
abstract class FlyingCreature : Animal {}
abstract class Carnivore : Animal {}
// Concrete classes:
class Ostrich : Bird {}
class Eagle : Bird, FlyingCreature, Carnivore {} // Illegal
class Bee : Insect, FlyingCreature {} // Illegal
class Flea : Insect, Carnivore {} // Illegal

The Eagle, Bee, and Flea classes do not compile because inheriting from multiple
classes is prohibited. To resolve this, we must convert some of the types to interfaces.
The question then arises, which types? Following our general rule, we could
say that
insects share an implementation, and birds share an implementation, so
they remain classes. In contrast, flying creatures have independent mechanisms
for flying, and carnivores have independent strategies for eating animals, 
so we
would convert FlyingCreature and Carnivore to interfaces:
interface IFlyingCreature {}
interface ICarnivore {}


Enum initialization

It is different from other type initialization. 

public enum BorderSide { Left, Right, Top, Bottom }
BorderSide topSide = BorderSide.Top;
bool isTop = (topSide == BorderSide.Top); // true


Enum Operator Type-safety Issues

Since an enum can be cast to and from its underlying integral type, the actual value
it may have may fall outside the bounds of a legal enum member

BorderSide b = BorderSide.Bottom;
b++; // No errors

An invalid BorderSide would break the following code:

void Draw (BorderSide side)
{
if (side == BorderSide.Left) {...}
else if (side == BorderSide.Right) {...}
else if (side == BorderSide.Top) {...}
else {...} // Assume BorderSide.Bottom
}

One solution is to add another else clause:
...
else if (side == BorderSide.Bottom) ...
else throw new ArgumentException ("Invalid BorderSide: " + side, "side");

Another workaround is to explicitly check an enum value for validity. The static
Enum.IsDefined method does this job:

BorderSide side = (BorderSide) 12345;
Console.WriteLine (Enum.IsDefined (typeof (BorderSide), side)); // False































Monday, June 29, 2015

C# Nutshell Chapter 1-2 Basics

"==" makes a compile-time decision

int x = 1;
int y = 1;
x == y // true

object x = 1;
object y = 1;
x == y // false; referential equality from object's == operator

"Equals" is resolved at runtime, according to the object's actual type

object x = 1;
object y = 1;
x.Equals(y);  //true


object class static helper method providing null-safe equality comparision

object x = 3, y = 3;
Console.WriteLine (object.Equals (x, y)); // True
x = null;
Console.WriteLine (object.Equals (x, y)); // False
y = null;
Console.WriteLine (object.Equals (x, y)); // True

implicitly assigned default value

static void Main()
{
int x;
Console.WriteLine (x); // Compile-time error
}


class Test
{
static int x;
static void Main() { Console.WriteLine (x); } // 0
}

"ref" and "out" keywords


  • "ref" - pass in as reference rather than value
  • "out" - no need to be assigned before going into the function, but must be assigned before comes out of the function. 
  • both "ef"and "out" are passed in by reference

"params" modifier

The params parameter modifier may be specified on the last parameter of a method
so that the method accepts any number of parameters of a particular type.

static int Sum (int a, int b, params int[] ints)
{
Console.WriteLine(a);
Console.WriteLine(b);
int sum = 0;
for (int i = 0; i < ints.Length; i++)
sum += ints[i]; // Increase sum by ints[i]
return sum;
}

Optional parameters

void Foo (int x = 23) { Console.WriteLine (x); }
Foo(); // 23

Mandatory parameters must occur before optional parameters in both the method declaration and the method call public method that’s called from another assembly requires recompilation of both assemblies.

void Foo (int x = 0, int y = 0) { Console.WriteLine (x + ", " + y); }
void Test()
{
Foo(1); // 1, 0
}

Named arguments

Rather than identifying an argument by position, you can identify an argument by
name.

void Foo (int x, int y) { Console.WriteLine (x + ", " + y); }
void Test()
{
Foo (y:2, x:1); // 1, 2
}

Named arguments are particularly useful in conjunction with optional parameters.
For instance, consider the following method:
void Bar (int a = 0, int b = 0, int c = 0, int d = 0) { ... }
We can call this supplying only a value for d as follows:
Bar (d:3);




















Friday, January 23, 2015

Finding closet pair

Challenge

Assuming multiple points in a 2D dimension, what is the pair of points with closest distance


Naive brute-force solution

Double loop through all points in the scope and find the closest pair will cost O(n^2). However, we can improve and reduce the operational time to O(nlogn) as a smart algorithm demonstrates below.


O(nlogn) for closest pair

note that this is not the details for this algorithm, only the essentials/magics used in it.

Essentials:
  1. Sort points by x-cooridnate and y-cooridnate respectively and store in two arrays => Px, Py. This step will have running time of O(nlogn) with mergesort, which we discussed in one  previous blog, or quick sort.
  2. utilizing "divide and conquer paradigm" and construct recursive function as below:
  3. split points by half => find the closest pair in left half => find the closest pair in right half => find the closest pair with 1 point in left half and 1 point in right half. here the last step, i.e.split_counting, is the difficulty.
  4. By first observing the split_counting step, it has running time of O(n^2), because all points are possible candidates. however, by proof we see that only the subsequent 7 points in Py array can make a closest pair, which is:
    1. for i = 0, i in Py
      1. for j = i + 1, j in [i ... i + 7]
        1. count_cloest_pair(Pi, Pj)
  5. in the nested loop above, it has running time of O(n) only! this is the beauty of this algorithm.

for this algorithm details and the mathematics proof, please google "closest pair algorithm".



why study algorithm - quotation

People who analyze algorithms have double happiness. First of all they experience the sheer beauty of elegant mathematical patterns that surround elegant computational procedures. Then they receive a practical payoff when their theories make it possible to get other jobs done more quickly and more economically.      D. E. Knuth

Friday, January 16, 2015

Mergesort used in inversion counting

Note. In order to understand the inversion counting method in this blog, you need to be familiar with mergesort algorithm first. Especially the "divide and conquer paradigm" concept applied in mergesort.

I have explained that mergesort can be done in O(n*log(n)) in previous blog here

What is inversion

given an array A with length n, the number of inversion in this array equals to number of pairs (A[i], A[j]) with i < j and A[i] > A[j].

example: array = {1,3,5,2,4,6}, the inversions are (5,2) (5,4) (3,2)

Why study inversion

for example: to study the similarity of movie taste for two people.
identify 6 movies and give them to the two to rank according to their favors.
input result from 1st person as array A and result from 2nd person as array B, then calculate the number of inversions to quantify their movie favor differences.

The design of inversion counting 
1.
it uses "divide and conquer paradigm" and algorithm as below:

count_inversion (array A)
    count_inversion (left half of A i,j < n/2)
    count_inversion (right half of A i,j > n/2)
    count_split_inversion (i < n/2 < j)

at each level of count method, the operation carried out is "count_split_inversion". The key challenge is to make sure "count_split_inversion" method can be done in O(n) so that inversion counting can be done with O(nlog(n)).

2,  (brilliant idea!)
Piggyback on mergesort!

sort_and_count_inversion (array A)
    sort_and_count_inversion (left half of A i,j < n/2)
    sort_and_count_inversion (right half of A i,j > n/2)
    merge_and_count_split_inversion (i < n/2 < j)

The reason to add mergesort, which has O(nlog(n)) is that it will reduce the inversion count complexity from O(n^2) to O(nlog(n)), and we will see as below:

when do merge_and_count_split_inversion from "sorted left half A" and "sorted right half A", items are copied from them to an empty array (same as in mergesort).  Now we can claim that: whenever we compare A[i] and A[j] if we found A[i] > A[j] (i<n/2<j) in this step , an inversion is found. this step is O(n).

Therefore, the computational complexity at each level is O(n) and the total complexity is O(nlog(n)).

C# Implementation
 using System;  
 using System.Linq;  
 namespace InversionCount  
 {  
   class Program  
   {  
     static void Main(string[] args)  
     {  
       Console.WriteLine("please give a integer for the size of array, system will generate internal values and do sorting for you");  
       int size = int.Parse(Console.ReadLine());  
       int[] count = new int[size];  
       RandomizeArray(ref count);  
       SortAndCountInversion(count);  
       Console.ReadLine();  
     }  
     public static int[] SortAndCountInversion(int[] A)  
     {  
       if (A.Length == 1)  
         return A;  
       int size = A.Length;  
       int halfSize = A.Length / 2;  
       int[] A1 = new int[halfSize];  
       int[] A2 = new int[A.Length - halfSize];  
       Split<int>(A, halfSize, out A1, out A2);  
       int[] ResultA = SortAndCountInversion(A1);  
       int[] ResultB = SortAndCountInversion(A2);  
       return MergeAndCoutInversion(ResultA, ResultB);  
     }  
     private static int[] MergeAndCoutInversion(int[] ResultA, int[] ResultB)  
     {  
       int resultSize = ResultA.Length + ResultB.Length;  
       int[] resultArray = new int[resultSize];  
       int i = 0, j = 0;  
       for (int k = 0; k < resultSize; k++)  
       {  
         if (i == ResultA.Length)  
           resultArray[k] = ResultB[j++];  
         else if (j == ResultB.Length)  
           resultArray[k] = ResultA[i++];  
         else if (ResultA[i] <= ResultB[j])  
           resultArray[k] = ResultA[i++];  
         else if (ResultA[i] > ResultB[j])  
         {  
           for (int s = i; s < ResultA.Length; s++)  
           {  
             Console.WriteLine("{0} - {1}", ResultA[s], ResultB[j]);  
           }  
           resultArray[k] = ResultB[j++];            
         }  
         else  
         {  
           Console.WriteLine("odd!");  
         }  
       }  
       return resultArray;  
     }  
     public static void Split<T>(T[] array, int index, out T[] first, out T[] second)  
     {  
       first = array.Take(index).ToArray();  
       second = array.Skip(index).ToArray();  
     }  
     private static void RandomizeArray(ref int[] count)  
     {  
       int size = count.Length;  
       Random random = new Random();  
       int randomInt;  
       for (int i = 0; i < size; i++)  
       {  
         randomInt = random.Next(0, size);  
         count[i] = randomInt;  
       }  
     }  
   }  
 }  







Thursday, January 15, 2015

MergeSort study

MergeSort algorithm learning notes:

I am studying algorithm course in coursera to sharpen my CS understanding.

Karatsuba algorithm (KA) is the first algorithm in this course opened my eyes. I never thought of any faster algorithm can do better/faster than the grade 3 learned calculating algorithm. I plan to implement KA in another thread in C# and post it here.

Back to track, this thread is about MergeSort algorithm (MA). Teacher Tim Roughgarden introduced the MA with a simple example and illustrate further step by step why MA is faster than most of other O(n2) algorithm. 

Here is MA's sorting complexity explanation:

  • Each level of MA calculation has maxi. 6j operations, given that j is the number of items in the current level array input in each recursive call). 
  • MA has sum of log2(n) + 1 levels (assume original level is level 0)
  • At any given level j, there are 2^j (2pwoer j) number of recursive calls
  • At any given level j, each recursive call  has array size n/2^j
We find out that the operations at any given level j is 2^j * (6 *n/2^j) = 6n (astonishingly independent of j ! isn't this cool!?)

Finally, 6n (log2(n) + 1) is the total operations in MA.

In summary, the reason Mergesort is fast is that its operations in each call is constant 6n and the level of recursive call is log2(n), which is flatter than f(n) = n. The drawback for mergesort is that it requires extra space to store the items when aggregating back from the left and right sub-streams.


C# implementation
 using System;  
 using System.Linq;  
 namespace Mergesort  
 {  
   class Program  
   {  
     static void Main(string[] args)  
     {  
       Console.WriteLine("please give a integer for the size of array, system will generate internal values and do sorting for you");  
       int size = int.Parse(Console.ReadLine());  
       Random random = new Random();  
       int randomInt;  
       int[] count = new int[size];  
       for (int i = 0; i < size; i++)   
       {  
         randomInt = random.Next(0, size);  
         count[i] = randomInt;  
       }  
       int[] countResult = Mergesort(count);  
       foreach (int i in countResult)  
       { Console.WriteLine(i); }  
       Console.ReadLine();  
     }  
     public static int[] Mergesort(int[] A)  
     {  
       if (A.Length == 1)  
         return A;  
       int size = A.Length;  
       int halfSize = A.Length / 2;  
       int[] A1 = new int[halfSize];  
       int[] A2 = new int[A.Length - halfSize];  
       Split<int>(A, halfSize, out A1, out A2);  
       int[] ResultA = Mergesort(A1);  
       int[] ResultB = Mergesort(A2);  
       return MergeTwoArray(ResultA, ResultB);  
     }  
     private static int[] MergeTwoArray(int[] ResultA, int[] ResultB)  
     {  
       int resultSize = ResultA.Length + ResultB.Length;  
       int[] resultArray = new int[resultSize];  
       int i = 0, j = 0;  
       for (int k = 0; k < resultSize; k++)  
       {  
         if (i == ResultA.Length)  
           resultArray[k] = ResultB[j++];  
         else if (j == ResultB.Length)  
           resultArray[k] = ResultA[i++];  
         else if (ResultA[i] <= ResultB[j])  
           resultArray[k] = ResultA[i++];  
         else  
           resultArray[k] = ResultB[j++];  
       }  
       return resultArray;  
     }  
     public static void Split<T>(T[] array, int index, out T[] first, out T[] second)  
     {  
       first = array.Take(index).ToArray();  
       second = array.Skip(index).ToArray();  
     }  
   }  
 }