  • Essay / Access control layer on top of Pig using XACML

    Apache Hadoop is an open source software framework for large-scale storage and processing of datasets on clusters of commodity hardware. Hadoop, an Apache top-level project, is built and used by a global community of contributors and users. Rather than relying on hardware to ensure high availability, the library is designed to detect and handle failures at the application layer itself. It thus provides a highly available service on top of a cluster of computers, each of which may be prone to failure.

    A small Hadoop cluster has a single master and multiple worker nodes. The master node consists of a JobTracker, a TaskTracker, a NameNode and a DataNode. A slave or worker node acts as both a DataNode and a TaskTracker, although it is possible to have data-only and compute-only worker nodes; these are normally used only in non-standard applications. Hadoop requires Java Runtime Environment (JRE) 1.6 or later. The standard startup and shutdown scripts require SSH to be configured between the nodes in the cluster.

    The Apache Hadoop framework is composed of the following modules: Hadoop Common, which contains libraries and utilities needed by the other Hadoop modules; Hadoop MapReduce, a programming model for large-scale data processing; the Hadoop Distributed File System (HDFS), a distributed file system that stores data and provides very high aggregate bandwidth across the cluster; and Hadoop YARN, a resource management platform that manages computing resources in clusters and uses them to schedule users' applications.

    The Hadoop Distributed File System is a distributed, scalable, and portable file system written in Java for the Hadoop framework. A Hadoop instance typically has a single NameNode; a cluster of DataNodes forms the HDFS cluster, as shown in the figure...... middle of document.......

    [12] More information about Apache Pig can be found at http://hortonworks.com/hadoop/pig/ on the web.
    [13] More information about XML and Security: Introduction to XACML (Access Control Policies in XML) can be found at https://community.emc.com/docs/DOC-7314 and http://dimacs.rutgers.edu/Workshops/Commerce/slides/crampton.pdf on the web.
    [14] More information about the XACML policy language can be found at http://wso2.com/library/articles/2011/10/understanding-xacml-policy-Language-xacml-Extended-assertion-markup-langue-. part-1/ on the web.
    [15] More information about authorization and authentication in Hadoop can be found at http://blog.cloudera.com/blog/2012/03/authorization-and-authentication-in-hadoop/ on the web.
    [16] More information about the HDFS architecture can be found at http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html#NameNode+and+DataNodes on the web.