Friday, October 7, 2016

Components of Big Data - Hadoop System

In this post I will explain the important components of a Hadoop system and give a very brief overview of each.

The diagram below shows the high-level components of a Hadoop system.


  • Master Node (MN)
    • Name Node (NN)
      • It is a daemon process that runs on the Master Node.
      • Takes care of reading the data file to be analyzed.
      • Splits the data file based on the configured block size; the default is 64 MB in Hadoop 1.x and 128 MB in Hadoop 2.x and later.
      • Distributes the split data file across multiple Data Nodes.
      • Maintains the index file that keeps track of where the data has been distributed. Think of this as the "Table of Contents" in a book.
      • Provides the Job Tracker with the location of the data files on the Data Nodes.
      • This is one part of the HDFS system in Hadoop.
    • Job Tracker (JT)
      • The Job Tracker is also a daemon process.
      • It is part of the Processing Engine of the Hadoop system.
      • It is responsible for running the program that analyzes the data and produces results.
      • The Job Tracker communicates with NN to identify the location of the data file. Once the Data Node locations are identified, it moves the program to those Data Nodes for execution.
      • JT tries its best to run the analysis on the node where the data is stored, so that it is processed faster.
      • If it cannot assign the task to a Task Tracker local to the data, it next looks for an available node in the same rack.
      • Once the Job Tracker receives output from the multiple Task Trackers, it has to run the program again to consolidate the output and generate the final output of the analysis.
      • An important role of the Job Tracker is to monitor all the tasks running on the Task Trackers and to start a new task if something fails.
  • Slave Node (SN)
    • Data Node (DN)
      • This is a daemon process that runs on a Slave Node.
      • There can be many Data Nodes, since a cluster can have more than one Slave Node.
      • The task of this process is to receive the data sent by the Name Node.
      • DN is responsible for maintaining the data received from NN in the system. It keeps track of these data files.
      • Together, NN and DN form and manage HDFS.
    • Task Tracker (TT)
      • This daemon process runs on the same node as the Data Node.
      • Programs sent from JT are received by this process and stored on the Slave Node.
      • After receiving the program, it starts it to analyze the data file.
      • Once the analysis is complete, it produces the result and shares it back with JT.
      • Together, JT and TT make up the MapReduce engine.
      • The Task Tracker keeps sending heartbeat signals to the Job Tracker so that the Job Tracker knows the process is running fine.
      • If TT fails to send a heartbeat to JT, JT re-initiates that task on another available TT.
  • HDFS
    • Hadoop Distributed File System
    • Two important concepts in HDFS:
      • Block Size
      • Fault Tolerance/Failure 
    • Name Node and Data Node create and manage HDFS.
    • Name Node is the master, which takes care of splitting the file and distributing it across multiple Data Nodes.
    • HDFS is a fail-safe system that protects stored data from loss; or rather, I should say the chances of losing data are very small.
    • HDFS ensures fault tolerance by keeping copies of the same data file on multiple Data Nodes. By default, 3 copies of each data file are maintained, so that if any one Data Node crashes, a backup copy can be used.
  • MapReduce
    • MapReduce comes under the processing engine of the Hadoop system.
    • It is a programming model that enables processing to run in parallel, distributed across multiple nodes.
    • Job Tracker and Task Tracker make up this Processing Engine of the Hadoop system.
    • There are basically two pieces of MapReduce:
    • The first part is Map, which analyzes the data file and produces an output that goes as input to the second part, called the Reducer. The Map task's job is to filter and sort the data.
    • The Reducer takes this processed output and creates a consolidated report.
  • Secondary Name Node
    • This process runs on another system in the Hadoop cluster.
    • Its checkpoint data is used for recovery when the main Name Node fails or goes down; it is not an automatic hot standby for the Name Node.
    • It keeps interacting with NN at regular intervals and creates a backup of the index file on a separate system.
    • The backup system can be the one where the Secondary Name Node is running, or another system in some other location or rack.
    • Its task is to help recreate a new Name Node by reading the FSImage and Edit Logs.
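
The block size and replication points above come down to simple arithmetic, so here is a minimal sketch of it in plain Java. The class and method names (HdfsBlockMath, blockCount, rawStorageBytes) and the 1 GB file size are my own assumptions for illustration; this is not the Hadoop API, only the calculation behind it.

```java
// Hypothetical helper for illustration only; it does not call any Hadoop API.
final class HdfsBlockMath {

    /** Number of blocks needed for a file; a partial last block still counts as one block. */
    static long blockCount(long fileSizeBytes, long blockSizeBytes) {
        if (fileSizeBytes < 0 || blockSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    /** Raw bytes stored across the cluster when every block is kept in several copies. */
    static long rawStorageBytes(long fileSizeBytes, int replicationFactor) {
        return fileSizeBytes * replicationFactor;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileSize = 1024 * mb; // assumed 1 GB input file

        System.out.println(blockCount(fileSize, 64 * mb));      // 16 blocks at 64 MB
        System.out.println(blockCount(fileSize, 128 * mb));     // 8 blocks at 128 MB
        System.out.println(rawStorageBytes(fileSize, 3) / mb);  // 3072 MB with 3 copies
    }
}
```

A larger block size means fewer blocks for the Name Node to track, while the replication factor multiplies the disk space the cluster needs.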

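To make the Map and Reduce pieces described above more concrete, here is a rough word-count sketch in plain Java. It runs in a single process and uses only java.util, so it is not the Hadoop MapReduce API; the class name MiniMapReduce and its methods are hypothetical and only mimic the flow of map, group by key, and reduce.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Single-process illustration of the map -> group -> reduce flow (not Hadoop code).
final class MiniMapReduce {

    /** Map step: turn one line of input into (word, 1) pairs. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    /** Reduce step: add up all the counts emitted for one word. */
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    /** Runs map on every split, groups the pairs by word, then reduces each group. */
    static Map<String, Integer> run(List<String> splits) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String split : splits) {
            for (Map.Entry<String, Integer> pair : map(split)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            result.put(group.getKey(), reduce(group.getValue()));
        }
        return result;
    }
}
```

In a real cluster each split would be mapped on a different node and the grouping would happen over the network; here everything happens in one loop just to show the data flow.
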
Tuesday, October 4, 2016

Load Testing of Rest Webservice Using SOAP UI

SOAP UI provides easy integration with different kinds of web services, such as SOAP and REST, along with a load testing feature. Below I will show how to set up SOAP UI for load testing.

To start with, you need to have SOAP UI installed on your system. In this example I will consume a REST web service.

The web service endpoint is http://localhost:9081/UserService/users

Let's start with the load testing setup now.

Start the SOAP UI tool, set up the REST web service call, and do a round of testing to ensure that the tool can connect to the service and get a response. The following screens show this.

Setup REST Project

Create New Project


Right click on the Project and click on create REST Project.


After providing the URL in the dialog box above, you will get the window below, where you can see various options such as:

  • Method
  • Endpoint
  • Resource
  • Input parameter

Once you have provided all the configuration required for your web service, run a test. When you see the expected response in the output window, you are ready for the next step, i.e. executing the load test.




Now, to set up load testing, start by creating a "Test Suite" and a "Test Case". Click the icon highlighted in red in the screenshot above. You will be prompted to enter a Test Suite name and then a Test Case name. Provide these details and proceed.



Once the above steps are completed, you can validate the test case by launching the test suite: double-click it in the left navigation as shown below. For this example I clicked on "UserTestSuit". In the window, click the green arrow button to launch the test suite; all steps should turn green. You can add Assertions to validate the response; I will cover this configuration in future posts.




Right-click the "Load Tests" menu in the screen above and add a New Load Test; you will get the following screens.



In the Load Test screen, SOAP UI provides various options for launching a load test:
  • Limit - Time for which you want to run the test.
  • Thread - Number of threads.
  • Strategy - Various options, such as:
    • Simple - For a simple load run; good for baselining application performance.
    • Variance - In this strategy, the load keeps varying over the specified period of the load test. If you set the thread count to 100 and Variance to 0.5, the thread count will vary between 50 and 150 over the load test period.
    • Burst - This strategy sends sudden bursts of user load by hitting the service for a specified time. You need to specify the Burst Delay, which is the gap between bursts, and the Burst Duration, which is how long each burst lasts.
    • Thread - In this strategy you can specify an initial number of threads and increase it over time; testing ends at the value specified in the End Threads drop-down.
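
The Variance numbers above can be checked with a few lines of Java. This is only a sketch of the arithmetic described in the list (threads plus or minus threads times variance); the class name VarianceBounds is hypothetical, and SOAP UI's internal implementation may differ in its details.

```java
// Illustrative arithmetic only; this is not SOAP UI code.
final class VarianceBounds {

    /** Returns {minThreads, maxThreads} for a base thread count and a variance factor. */
    static int[] bounds(int threads, double variance) {
        int delta = (int) Math.round(threads * variance);
        // The thread count cannot drop below zero, even for a variance above 1.0.
        return new int[] { Math.max(0, threads - delta), threads + delta };
    }

    public static void main(String[] args) {
        int[] range = bounds(100, 0.5);
        System.out.println(range[0] + " to " + range[1]); // 50 to 150
    }
}
```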


The load test can be started by clicking the green arrow. Once the test finishes, the tool shows high-level details in the grid below.
The important columns are TPS, Min, Max, and Avg time, plus the error count.

The tool also provides two graphs next to the green arrow start button. There you can see different metrics such as threads, average time, error count, and TPS. Data for the entire run can be exported from these graphs by clicking the export icon on the right side of the window.
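
If you export the raw response times, the same kind of summary can be recomputed by hand. The sketch below assumes a simple definition of TPS (completed requests divided by elapsed seconds); SOAP UI's own calculation may differ, and the class name LoadTestStats and the sample numbers are made up for illustration.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.LongSummaryStatistics;

// Hypothetical helper that summarizes response times from a load test run.
final class LoadTestStats {

    /** Builds a one-line summary of min, max, average, TPS and error count. */
    static String summarize(long[] responseTimesMs, int errorCount, double elapsedSeconds) {
        if (responseTimesMs.length == 0 || elapsedSeconds <= 0) {
            throw new IllegalArgumentException("need at least one sample and a positive duration");
        }
        LongSummaryStatistics stats = Arrays.stream(responseTimesMs).summaryStatistics();
        // Assumed definition: requests completed per second of test duration.
        double tps = stats.getCount() / elapsedSeconds;
        return String.format(Locale.ROOT, "min=%d max=%d avg=%.1f tps=%.1f errors=%d",
                stats.getMin(), stats.getMax(), stats.getAverage(), tps, errorCount);
    }

    public static void main(String[] args) {
        long[] sampleTimesMs = {120, 80, 100, 300}; // made-up sample data
        System.out.println(summarize(sampleTimesMs, 1, 2.0));
    }
}
```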


I hope this overview helps you get started with performance testing (PT) in SOAP UI. In upcoming posts I will show how to pass parameters between multiple steps in a load test.

thanks.

Saturday, October 1, 2016

Basic Authentication Setup Java Web Application

This post shows how to set up Basic Authentication in a web application.

I am using a Spring-based web service to demonstrate this. To start, please ensure that you have a Spring application configured properly. On top of that, I will show what changes need to be made to enable Basic Authentication.

Securing the application is one of the important activities that developers and designers have to keep in mind while designing. Basic authentication is one of the basic security mechanisms that can be enabled to secure a web application or web service.

With Basic Authentication, the application expects the consumer to pass the user name and password in a request header. If these values are not passed, the Spring framework returns a 401 Unauthorized error code.

Following are the steps that need to be followed.
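
To show what the consumer actually sends, here is a small sketch that builds the Authorization header value using only the JDK. The class name BasicAuthHeader is hypothetical; the user name and password are the sample values used in the security configuration later in this post.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper that builds the value of the HTTP "Authorization" header.
final class BasicAuthHeader {

    /** Returns "Basic " followed by the Base64 encoding of "user:password". */
    static String build(String user, String password) {
        String credentials = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Prints: Basic YXBpdXNlcjpwYXNzd29yZA==
        System.out.println(build("apiuser", "password"));
    }
}
```

Base64 is an encoding and not encryption, so anyone who can read the header can recover the password; Basic Authentication should therefore be used over HTTPS.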

Step1:

Add the Spring Security jar files to the application. The jar files are:
  • spring-security-core.jar
  • spring-security-config.jar
  • spring-security-web.jar
Step2:

Add the following lines to the "web.xml" file to enable the Spring Security filter. This filter is responsible for adding security to the URL pattern mentioned.
Also add an entry for the security XML file that we will configure in the next step; this file contains the basic authentication security configuration.

The code is shown below.

<servlet>
 <servlet-name>basicauth</servlet-name>
 <servlet-class>org.springframework.web.servlet.DispatcherServlet</servlet-class>
 <load-on-startup>1</load-on-startup>
</servlet>
<servlet-mapping>
 <servlet-name>basicauth</servlet-name>
 <url-pattern>/</url-pattern>
</servlet-mapping>
<listener>
 <listener-class>org.springframework.web.context.ContextLoaderListener</listener-class>
</listener>
<context-param>
 <param-name>contextConfigLocation</param-name>
 <param-value>  
           /WEB-INF/basicauth-servlet.xml,  
           /WEB-INF/basicauth-security.xml
        </param-value>
</context-param>
<!-- Spring Security -->
<filter>
 <filter-name>springSecurityFilterChain</filter-name>
 <filter-class>org.springframework.web.filter.DelegatingFilterProxy</filter-class>
</filter>
<filter-mapping>
 <filter-name>springSecurityFilterChain</filter-name>
 <url-pattern>/*</url-pattern>
</filter-mapping>

Step3:

Create the security XML file (named basicauth-security.xml in the web.xml above) and add the following code to enable security for the URLs that need it.

<?xml version="1.0" encoding="UTF-8" ?>
<beans xmlns="http://www.springframework.org/schema/beans"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
 xmlns:oauth="http://www.springframework.org/schema/security/oauth2"
 xmlns:context="http://www.springframework.org/schema/context"
 xmlns:sec="http://www.springframework.org/schema/security" 
 xmlns:mvc="http://www.springframework.org/schema/mvc"
 xsi:schemaLocation="http://www.springframework.org/schema/security/oauth2 
 http://www.springframework.org/schema/security/spring-security-oauth2-2.0.xsd
 http://www.springframework.org/schema/mvc 
 http://www.springframework.org/schema/mvc/spring-mvc-3.2.xsd
 http://www.springframework.org/schema/security 
 http://www.springframework.org/schema/security/spring-security-3.2.xsd 
 http://www.springframework.org/schema/beans
 http://www.springframework.org/schema/beans/spring-beans-4.1.xsd
 http://www.springframework.org/schema/context 
 http://www.springframework.org/schema/context/spring-context-4.1.xsd ">
<http auto-config="true"  use-expressions="true" xmlns="http://www.springframework.org/schema/security">
    <intercept-url pattern="/login" access="permitAll" />
    <intercept-url pattern="/**" access="hasRole('ROLE_USER')" />
    <http-basic />
</http>
<authentication-manager alias="authenticationManager" xmlns="http://www.springframework.org/schema/security">
  <authentication-provider >
    <user-service>
      <sec:user name="apiuser" password="password" authorities="ROLE_USER"/>
    </user-service>
  </authentication-provider>
</authentication-manager> 
</beans>


Here is an explanation of the tags:

<http> - The main tag, responsible for creating the proxy for all the URLs that this tag intercepts.
<intercept-url> - Specifies which URLs need to be behind access control and which do not. The pattern attribute specifies the URL pattern, and the access attribute specifies which access check is applied.
<http-basic> - Tells the proxy to prompt the user for a user name and password when the URL is accessed. This adds a BasicAuthenticationFilter and BasicAuthenticationEntryPoint to the configuration.
<authentication-manager> - The type of authentication is specified here. In the example above I have used configuration-based authentication, which means the user details are hardcoded in the same XML file.

Step4:

Create a Java file for testing the implementation. Below is the web service component that sits behind basic authentication security.

package com.infoblog;

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.ResponseBody;


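// Controller whose endpoints sit behind the Basic Authentication filter.
@Controller
@RequestMapping(value="/secure")
public class SecureController {

 // Handles GET /secure/test; reachable only by an authenticated user with ROLE_USER.
 @RequestMapping(value="/test", method=RequestMethod.GET)
 public @ResponseBody String secureFunction(){
  return "Success";
 }

}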


Step5:

Start the server and test the functionality. The browser will prompt you to enter the user details as shown below.




