miércoles, 27 de abril de 2011

LOGISTIC REGRESSION


Supervised Learning


LOGISTIC REGRESSION 


$MATLAB
% Logistic Regression
clc, close all
N = 100; n = 5;
x = sort( randn(N,1));
X = [x ones(N,1)];
theta = [3.5 2]';           % Column Vector
y = ( fsigmoid( theta'*X' ) ); y = y >= .5;
IDX = unidrnd( N, 1, n);
y( IDX) = mod( y( IDX)+1,2);
plot(x, y, '+r','linewidth', 2), hold on

% STOCHASTIC ASCENT RULE
thetahat = [ 0 0]';
theta = 1+thetahat; k = 1; MAXITER = 100;
alpha = .1;
while( norm(theta - thetahat) > 1e-3  && k < MAXITER)
    theta = thetahat;
    for j = 1:2
        for i = 1:N
            thetahat(j) = thetahat(j) + alpha*...
                ( y(i)- fsigmoid(thetahat'*X(i,:)' ) )*X(i,j);
        end
    end
    k = k + 1;
    h = fsigmoid(thetahat'*X');
    plot(x, h, '--', 'linewidth', 1)
end
axis tight, xlabel('x'), ylabel('y'), title('Logistic Regression'), grid


%%%%% Logistic Regression
function gz = fsigmoid( z)
gz = 1./(1+exp(-z));

domingo, 17 de abril de 2011

Rejection Sampling Algorithm

Rejection Sampling Algorithm


The Monte Carlo Principle: Draw samples from a Target Density p(x) known up to a proportionality constant. The rejection sampling algorithm generate RANDOM samples from known proposal distribution q(x) that satisfies p(x) <= Mq(x),  M-Bounded.

Algorithm :
  1. Sample xi~q(x) - (N(MU, SIGMA^2)) and u~U(0,1)
  2. if u < p(xi)/Mq(xi) accept propose RANDOM sample xi and
    • Increment i, ELSE REJECT
MATLAB script
% Monte Carlo Methods
% Accept Reject Sampling Algorithms
clc, close all, clear
x = linspace(-10, 10, 1e4);

c = 1; s = 2; f = 1.5; M = 5.2;
p_x = px( x, c, s, f );
q_x = normpdf(x, c, s);
Mq_x = M*q_x;

N = 1e3;                % N- Number of Samples,
i = 1; j = 0; tic
while(i <= N)
    u = rand; j = j + 1;
    xi = normrnd(c, s);
    if u < px( xi, c, s, f)/ ( M*normpdf(xi, c, s ));
        xrand(i) = xi;
        i = i + 1;
    end
   
end
rtime = toc
NofProposals = j - 1;

% Histogram of Random Samples,
[ xP, bw] = hist(xrand, 200);
xP = xP /( N*(bw(2)-bw(1)));

figure, bar(bw, xP), hold on
plot(x, p_x, 'r', x, q_x, '--y', x, Mq_x,'--g', 'LineWidth', 2)
legend('Random Samples','Target Distribution','Proposal Distribution','Scale Proposal Dist')
title('Rejection Sampling Algorithm '), xlabel('x'), ylabel('p_x')

Target Distribution:
function p_x = px( x, c, s, f);
% Target Distribution
p_x = abs( exp( -(x-c).^2/( 2*s^2 )) .* cos( f* x) );

MATLAB Simulation Results
N = 1e5;
NofProposals =  1.63e5;
Running Time = 2.2min Approx;
Acceptance Perc = 61%;
Almost 40% of the samples were discarded, 

This Link has a very good explanation of MCMC methods
 MCMC and Machine Learning:

Rejection Sampling Dev-C
The Main Advantage of c-implementation over MATLAB is that for 1e8 samples it took less than a minute
 while in MATLAB the same simulation last well beyond 20min. 

/* C Example */
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <time.h>
#include "MCMC.h"

#define N 10000000
#define PI 2*acos(0)

int main (argc, argv)
     int argc;
     char *argv[];
{
    /* Rejection Sampling Alg. Parameters */
    // Target Density Parameters
    int c = 1, s = 2;
    float f = 1.5;
    // Scaling Constant
    float M = 5.2;
   
  /* SET random seed using CPU Clock  */   
  unsigned seed;
  seed  = time(NULL);       
  srand(seed);

  /* Allocate Space for Samples */
  int np;
  double * xrand;
  np = N/1;                   // Number of Samples by EACH process
  xrand = malloc( np*sizeof(double)); // Allocate Space
 
  FILE *ft ;
  ft = fopen ( "RejectionSampling.txt", "w" ) ;
        if ( ft == NULL )
        {                                                              
               fclose ( ft ) ;
               return 0;
        }
       
  // Rejection Sampling Variables
  int i = 0, j = 0;    /* Counters */
  double u, xi;
 
  /* Rejection Sampling Algorithm */
  while( i < np ){
     u = U01();     // Random Sample
     ++j;
     xi = N01()*s + c;
     if( u < px( xi, c, s, f)/( M*qx( xi, c, s)) ){
        *(xrand+i) = xi;
        fprintf( ft, "%f \n", xi);
        ++i;
     }
  }
 
  fclose ( ft);
  return 0;
}

HEADER FILES
#include <math.h>
#include <stdlib.h>
#define PI 2*acos(0)

double px( double x, double c, double s, double f);
double qx( double x, double c, double s);
double U01();
double N01();

SOURCE FILES
#include "MCMC.h"

/* Target Density */
double px( double x, double c, double s, double f){
       return fabs( exp( -(x-c)*(x-c)/( 2*s*s) )* cos( f*x) );
}
/* Proposal Density */
double qx( double x, double c, double s){      
       return 1/sqrt(2*PI*s*s)* exp(-(x-c)*(x-c)/( 2*s*s) );
}

/* Uniform Random Number Generator */
double U01(){
   return (double)rand()/(double)RAND_MAX;
}
  
/*   www.taygeta.com/random/gaussian.html
     BOX-MULLER Algortihm,  */
double N01(){
  double x1, x2, w, y1, y2;
  do{
     x1 = 2.0*U01() -1.0;
     x2 = 2.0*U01() -1.0;
     w = x1*x1 + x2*x2;
  } while( w >= 1.0 );
  w = sqrt( (-2.0*log( w))/w);
  y1 = x1*w;
  y2 = x2*w;
  return y1;
}  

Dev-c Simulation 1e8 Samples


Rejection Sampling and MPI - Parallel Programming

Datos personales

Computational Science UTEP