Thursday, December 5, 2013

HDFS Configuration

Guide to setting up WSO2 SS for HDFS

This guide assumes you are using the embedded LDAP shipped with the product. If an external LDAP has to be configured in a production environment, additional configuration is required on the Kerberos side. The guide was written with a Linux environment in mind (Ubuntu, to be precise).

Kerberos Configuration

1. To install Kerberos in an Ubuntu environment

sudo apt-get install krb5-kdc krb5-admin-server

2. To configure the realms, edit the krb5.conf file (/etc/krb5.conf)

Take a backup of the original file, then add only the details below.
[libdefaults]
        default_realm = WSO2.ORG
        default_tkt_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1
        default_tgs_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1
        permitted_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1
        allow_weak_crypto = true
        renew_lifetime = 10m
        ticket_lifetime = 10m


[realms]
        WSO2.ORG = {
                kdc = 127.0.0.1:8000
                admin_server = 1.1.1.1:9999
        }

[domain_realm]
        .wso2.org = WSO2.ORG
        wso2.org = WSO2.ORG

[login]
        krb4_convert = true
        krb4_get_tickets = false

Note: Set renew_lifetime and ticket_lifetime according to your requirements; they define the lifetime of the tickets (TGT) requested from the client side. If you are using an external LDAP, the [realms] settings must match the external LDAP server settings.


The unit letters used in these parameters mean the following:

d - days
h - hours
m - minutes
s - seconds
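
To make the unit letters concrete, here is a minimal sketch that converts a lifetime value to seconds. The function name lifetime_to_seconds is made up for this illustration, and it only handles a single number followed by one unit letter (for example 10m), not the other formats krb5.conf may accept.

```shell
#!/bin/sh
# Convert a lifetime such as "10m" or "2h" to seconds.
# Hypothetical helper: handles one number followed by one unit letter (d, h, m, s).
lifetime_to_seconds() {
    value=${1%[dhms]}      # strip the trailing unit letter, if any
    unit=${1#"$value"}     # whatever remains is the unit
    case $value in
        ''|*[!0-9]*) echo "invalid lifetime: $1" >&2; return 1 ;;
    esac
    case $unit in
        d) echo $((value * 86400)) ;;
        h) echo $((value * 3600)) ;;
        m) echo $((value * 60)) ;;
        s) echo "$value" ;;
        *) echo "invalid lifetime: $1" >&2; return 1 ;;
    esac
}

lifetime_to_seconds 10m   # the ticket_lifetime used in this guide
```

With the values above, a ticket_lifetime of 10m means a ticket expires after 600 seconds, which is quite short; you will probably want a larger value outside of testing.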


3. Create a keytab with service principals



Use the ktutil utility to create a keytab containing the following service principals:

  • admin/node0 - password: admin
  • host/node0 - password: node0

To cache a principal's key from the ktutil prompt:

ktutil: addent -password -p <your principal> -k 1 -e <encryption algo>


To write the keytab for the service principals:

ktutil: write_kt <keytab file name>
e.g.
user@hostname:~$ ktutil
ktutil:  addent -password -p admin/node0@WSO2.ORG -k 1 -e des-cbc-md5
Password for admin/node0@WSO2.ORG:<admin>

ktutil:  addent -password -p host/node0@WSO2.ORG -k 1 -e des-cbc-md5
Password for host/node0@WSO2.ORG:<node0>

ktutil:  wkt hdfs2.keytab

ktutil: quit

The keytab file hdfs2.keytab is created in the directory from which ktutil was run.

Take a backup of, or rename, the carbon.keytab file originally packaged with the product, located in:
$WSO2SS_HOME/repository/conf/etc/hadoop/keytabs

Copy or move the hdfs2.keytab file to the above location and rename it to carbon.keytab.
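
The backup-and-replace step can be scripted. The following is only a sketch: install_keytab is a hypothetical helper name, and the .bak suffix for the backup is my own choice, not something the product requires.

```shell
#!/bin/sh
# Back up the packaged carbon.keytab (if present) and install a new keytab in its place.
# Usage: install_keytab <new keytab file> <keytabs directory>
# Hypothetical helper; the ".bak" backup name is an assumption.
install_keytab() {
    new_keytab=$1
    keytab_dir=$2
    target="$keytab_dir/carbon.keytab"

    [ -f "$new_keytab" ] || { echo "no such keytab: $new_keytab" >&2; return 1; }
    [ -d "$keytab_dir" ] || { echo "no such directory: $keytab_dir" >&2; return 1; }

    # Keep the original under a .bak name; never overwrite an earlier backup.
    if [ -f "$target" ] && [ ! -e "$target.bak" ]; then
        mv "$target" "$target.bak" || return 1
    fi
    cp "$new_keytab" "$target"
}

# Example, assuming WSO2SS_HOME is set in your shell:
# install_keytab hdfs2.keytab "$WSO2SS_HOME/repository/conf/etc/hadoop/keytabs"
```

Running it a second time leaves the first backup untouched, so the keytab originally shipped with the product stays recoverable.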


4. Set up WSO2 SS with the directory service (embedded LDAP) and KDC



Note that the configurations below are already added and enabled in the product, so you probably do not need to change anything when using the embedded LDAP.

A. To configure carbon.xml
$WSO2SS_HOME/repository/conf/carbon.xml
<EmbeddedLDAP>
 <!-- Port which embedded LDAP server runs -->
 <LDAPServerPort>10389</LDAPServerPort>
 <!-- Port which KDC (Kerberos Key Distribution Center) server runs -->
 <!--KDCServerPort>8000</KDCServerPort-->
 <KDCServerPort>8000</KDCServerPort>
</EmbeddedLDAP>
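
The kdc entry in krb5.conf (127.0.0.1:8000 in this guide) and the KDCServerPort in carbon.xml both name port 8000, and a client can only reach the embedded KDC if the two agree. A rough sketch of a consistency check follows; the function names are invented for this example, and the check is line-based text matching, not real XML or krb5.conf parsing.

```shell
#!/bin/sh
# Print the port of the first kdc entry in a krb5.conf-style file.
krb5_kdc_port() {
    sed -n 's/^[[:space:]]*kdc[[:space:]]*=.*:\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$1" | head -n 1
}

# Print the value of the first KDCServerPort element written as a plain tag.
# The "<!--KDCServerPort>" form of commented-out line is skipped by this pattern.
carbon_kdc_port() {
    sed -n 's/.*<KDCServerPort>\([0-9][0-9]*\)<\/KDCServerPort>.*/\1/p' "$1" | head -n 1
}

# Succeed only when both files name the same KDC port.
check_kdc_port() {
    krb5_port=$(krb5_kdc_port "$1")
    carbon_port=$(carbon_kdc_port "$2")
    if [ -n "$krb5_port" ] && [ "$krb5_port" = "$carbon_port" ]; then
        echo "KDC port matches: $krb5_port"
    else
        echo "KDC port mismatch: krb5.conf='$krb5_port' carbon.xml='$carbon_port'" >&2
        return 1
    fi
}

# Example, assuming WSO2SS_HOME is set in your shell:
# check_kdc_port /etc/krb5.conf "$WSO2SS_HOME/repository/conf/carbon.xml"
```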

B. Enable the LDAP-based user store and enable KDC for user and service principals

$WSO2SS_HOME/repository/conf/user-mgt.xml
<!-- Following user manager is used by Identity Server (IS) as its default user manager.
IS will do token replacement when building the product. Therefore do not change the syntax. 
If "kdcEnabled" parameter is true, IS will allow service principle management. 
Thus "ServicePasswordJavaRegEx", "ServiceNameJavaRegEx" properties control the service name 
format and service password formats. -->

<UserStoreManager class="org.wso2.carbon.user.core.ldap.ReadWriteLDAPUserStoreManager">
 <Property name="TenantManager">org.wso2.carbon.user.core.tenant.CommonHybridLDAPTenantManager</Property>
 <Property name="defaultRealmName">WSO2.ORG</Property>
 <Property name="kdcEnabled">true</Property>
 <Property name="Disabled">false</Property>                                
 <Property name="ConnectionURL">ldap://localhost:${Ports.EmbeddedLDAP.LDAPServerPort}</Property>
 <Property name="ConnectionName">uid=admin,ou=system</Property>
 <Property name="ConnectionPassword">admin</Property>
 <Property name="passwordHashMethod">SHA</Property>
 <Property name="UserNameListFilter">(objectClass=person)</Property>
 <Property name="UserEntryObjectClass">identityPerson</Property>
 <Property name="UserSearchBase">ou=Users,dc=wso2,dc=org</Property>
 <Property name="UserNameSearchFilter">(&amp;(objectClass=person)(uid=?))</Property>
 <Property name="UserNameAttribute">uid</Property>
 <Property name="PasswordJavaScriptRegEx">^[\S]{5,30}$</Property>
 <Property name="ServicePasswordJavaRegEx">^[\\S]{5,30}$</Property>
 <Property name="ServiceNameJavaRegEx">^[\\S]{2,30}/[\\S]{2,30}$</Property>
 <Property name="UsernameJavaScriptRegEx">^[\S]{3,30}$</Property>
 <Property name="UsernameJavaRegEx">[a-zA-Z0-9._-|//]{3,30}$</Property>
 <Property name="RolenameJavaScriptRegEx">^[\S]{3,30}$</Property>
 <Property name="RolenameJavaRegEx">[a-zA-Z0-9._-|//]{3,30}$</Property>
 <Property name="ReadGroups">true</Property>
 <Property name="WriteGroups">true</Property>
 <Property name="EmptyRolesAllowed">true</Property>
 <Property name="GroupSearchBase">ou=Groups,dc=wso2,dc=org</Property>
 <Property name="GroupNameListFilter">(objectClass=groupOfNames)</Property>
 <Property name="GroupEntryObjectClass">groupOfNames</Property>
 <Property name="GroupNameSearchFilter">(&amp;(objectClass=groupOfNames)(cn=?))</Property>
 <Property name="GroupNameAttribute">cn</Property>
 <Property name="SharedGroupNameAttribute">cn</Property>
 <Property name="SharedGroupSearchBase">ou=SharedGroups,dc=wso2,dc=org</Property>
 <Property name="SharedGroupEntryObjectClass">groupOfNames</Property>
 <Property name="SharedGroupNameListFilter">(objectClass=groupOfNames)</Property>
 <Property name="SharedGroupNameSearchFilter">(&amp;(objectClass=groupOfNames)(cn=?))</Property>
 <Property name="SharedTenantNameListFilter">(objectClass=organizationalUnit)</Property>
 <Property name="SharedTenantNameAttribute">ou</Property>
 <Property name="SharedTenantObjectClass">organizationalUnit</Property>
 <Property name="MembershipAttribute">member</Property>
 <Property name="UserRolesCacheEnabled">true</Property>
 <Property name="UserDNPattern">uid={0},ou=Users,dc=wso2,dc=org</Property>
 <Property name="RoleDNPattern">cn={0},ou=Groups,dc=wso2,dc=org</Property>
 <Property name="SCIMEnabled">true</Property>
 <Property name="MaxRoleNameListLength">100</Property>
 <Property name="MaxUserNameListLength">100</Property>
</UserStoreManager>

C. Enable embedded LDAP in WSO2 Carbon

$WSO2SS_HOME/repository/conf/embedded-ldap.xml
<EmbeddedLDAP>
    <Property name="enable">true</Property>
    <Property name="port">${Ports.EmbeddedLDAP.LDAPServerPort}</Property>
    <Property name="instanceId">default</Property>
    <Property name="connectionPassword">admin</Property>
    <Property name="workingDirectory">.</Property>
    <Property name="AdminEntryObjectClass">wso2Person</Property>
    <Property name="allowAnonymousAccess">false</Property>
    <Property name="accessControlEnabled">true</Property>
    <Property name="denormalizeOpAttrsEnabled">false</Property>
    <Property name="maxPDUSize">2000000</Property>
    <Property name="saslHostName">localhost</Property>
    <Property name="saslPrincipalName">ldap/localhost@EXAMPLE.COM</Property>
</EmbeddedLDAP>

5. Add service principals


Service principals are required for the host (host/node0) and admin (admin/node0) services. These principals are used for mutual authentication between system components such as the name-node(s), job-tracker, data-nodes and task-trackers.

In the wso2server.sh file, change the -Ddisable.hdfs.startup property to false.

$WSO2SS_HOME/bin/wso2server.sh
------
------

#To monitor a Carbon server in remote JMX mode on linux host machines, set the below system property.
#   -Djava.rmi.server.hostname="your.IP.goes.here"

while [ "$status" = "$START_EXIT_STATUS" ]
do
    $JAVACMD \
    -Xbootclasspath/a:"$CARBON_XBOOTCLASSPATH" \
    -Xms256m -Xmx1024m -XX:MaxPermSize=256m \
    -XX:+HeapDumpOnOutOfMemoryError \
    -XX:HeapDumpPath="$CARBON_HOME/repository/logs/heap-dump.hprof" \
    $JAVA_OPTS \
    -Dcom.sun.management.jmxremote \
    -classpath "$CARBON_CLASSPATH" \
    -Djava.endorsed.dirs="$JAVA_ENDORSED_DIRS" \
    -Djava.io.tmpdir="$CARBON_HOME/tmp" \
    -Dcatalina.base="$CARBON_HOME/lib/tomcat" \
    -Dwso2.server.standalone=true \
    -Dcarbon.registry.root=/ \
    -Djava.command="$JAVACMD" \
    -Dcarbon.home="$CARBON_HOME" \
    -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager \
    -Dcarbon.config.dir.path="$CARBON_HOME/repository/conf" \
    -Djava.util.logging.config.file="$CARBON_HOME/repository/conf/etc/logging-bridge.properties" \
    -Dcomponents.repo="$CARBON_HOME/repository/components/plugins" \
    -Dconf.location="$CARBON_HOME/repository/conf"\
    -Dcom.atomikos.icatch.file="$CARBON_HOME/lib/transactions.properties" \
    -Dcom.atomikos.icatch.hide_init_file_path=true \
    -Dorg.apache.jasper.runtime.BodyContentImpl.LIMIT_BUFFER=true \
    -Dcom.sun.jndi.ldap.connect.pool.authentication=simple  \
    -Dcom.sun.jndi.ldap.connect.pool.timeout=3000  \
    -Dorg.terracotta.quartz.skipUpdateCheck=true \
    -Djava.security.egd=file:/dev/./urandom \
    -Ddisable.hdfs.startup="false" \
    -Dfile.encoding=UTF8 \
    org.wso2.carbon.bootstrap.Bootstrap $*
    status=$?
done
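
If you prefer not to edit the script by hand, the property can be flipped with sed. This is a sketch only: enable_hdfs_startup is a hypothetical helper, and it assumes the property appears in the script as -Ddisable.hdfs.startup="true" (or unquoted), as in the listing above. Back up the script before trying it.

```shell
#!/bin/sh
# Set -Ddisable.hdfs.startup to "false" in the given startup script.
# Hypothetical helper; it rewrites into a temporary file first and only then
# copies the result back, so the script keeps its original permissions.
enable_hdfs_startup() {
    script=$1
    grep -q 'disable\.hdfs\.startup' "$script" || {
        echo "property not found in $script" >&2
        return 1
    }
    sed 's/\(-Ddisable\.hdfs\.startup=\)"*[A-Za-z]*"*/\1"false"/' "$script" > "$script.tmp" &&
        cat "$script.tmp" > "$script" &&
        rm -f "$script.tmp"
}

# Example, assuming WSO2SS_HOME is set in your shell:
# enable_hdfs_startup "$WSO2SS_HOME/bin/wso2server.sh"
```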

Start the WSO2 Storage Server:
WSO2SS_HOME/bin$ sh wso2server.sh
Note: Ignore any errors at this point.

Access the Management Console configuration menu and create a new service principal for host/node0 with password node0.

6. Add users and configure krb5PrincipalName for the embedded LDAP users

 

Install Apache Directory Studio or any other tool that can connect to the embedded LDAP.

Change the krb5PrincipalName of the admin user to admin/node0@WSO2.ORG in LDAP and update the password with the same value.

Adding Tenant Admin and Tenant users

Until the caching implementation is available, follow the workaround below to configure tenant admins and users manually.

Whenever a tenant admin or tenant user is created, create a matching entry at the super tenant level as well.

For example, assume you created a tenant domain test.com with admin1 as its admin, then logged in as that tenant admin and created the user tenant1. You also need to log in as the super tenant and create two users, admin1 and tenant1.

Then, in LDAP, change the krb5PrincipalName of the newly added entries as follows.

<username>_<Tenant domain>@WSO2.ORG
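
As a small illustration of that naming pattern, the sketch below builds the principal name from a user name and tenant domain. The helper tenant_principal is hypothetical; the realm defaults to WSO2.ORG, the default_realm used in this guide.

```shell
#!/bin/sh
# Build the krb5PrincipalName for a tenant user: <username>_<tenant domain>@<realm>.
# Hypothetical helper; the realm defaults to WSO2.ORG.
tenant_principal() {
    user=$1
    domain=$2
    realm=${3:-WSO2.ORG}
    if [ -z "$user" ] || [ -z "$domain" ]; then
        echo "usage: tenant_principal <username> <tenant domain> [realm]" >&2
        return 1
    fi
    echo "${user}_${domain}@${realm}"
}

# The two users from the example above:
tenant_principal admin1 test.com
tenant_principal tenant1 test.com
```

For the example tenant this gives admin1_test.com@WSO2.ORG and tenant1_test.com@WSO2.ORG.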

7. Start the datanode of the Hadoop file system

Currently the namenode comes up with server start-up, but the datanode does not; it has to be started manually.

Restart the WSO2 SS:
WSO2SS_HOME/bin$ sh wso2server.sh

In a separate terminal:

WSO2SS_HOME$ HADOOP_SECURE_DN_USER=<OS Level User> sudo -E bin/hadoop datanode

8. Generate a KDC client ticket (TGT) for authentication using the kinit utility


Currently this has to be done manually. In a future release it will be handled within the product itself: logging in to the Management Console will automatically generate a ticket for that user's exclusive use. For now, regardless of permission level, any user who logs in to the Management Console and browses the HDFS file system while a TGT generated for the super-tenant user is in effect can manage the entire space.

In a separate terminal

To generate a TGT for super-admin

$ kinit admin/node0
Password for admin/node0@WSO2.ORG: <admin>

To generate a TGT for a tenant user

$ kinit tenant1_test.com
Password for tenant1_test.com@WSO2.ORG:

Other useful commands:
  • List tickets: $ klist
  • Renew a ticket: $ kinit -R
  • Delete/destroy a ticket: $ kdestroy
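
Since tickets expire, it can be handy to request a new one only when needed. The sketch below relies on klist -s, which prints nothing and reports through its exit status whether the credentials cache holds a valid ticket. ensure_ticket is a hypothetical helper name, and note that it does not check which principal the cached ticket belongs to.

```shell
#!/bin/sh
# Obtain a TGT for the given principal only when the cache holds no valid ticket.
# Hypothetical helper; it does not verify that an existing ticket is for the
# same principal.
ensure_ticket() {
    principal=$1
    if klist -s; then
        echo "valid ticket already present"
    else
        kinit "$principal"   # prompts for the password
    fi
}

# Example:
# ensure_ticket admin/node0
```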

9. Access the HDFS File System


Log in to the Management Console and navigate to Home > Manage > hadoop File System > HDFS Explorer.

Here you can perform file system (fs) operations such as creating directories, renaming, and uploading files. The tree structure is listed according to the user's permissions. A super-admin sees their own structure and all tenant domains, and can perform all fs operations. A tenant admin can manage only their own domain space. A tenant user can manage only what they own.

10. To test the admin services


HDFS Admin Services
    • HDFSAdmin
    • HdfsFileUploadDownloader

Edit the HideAdminServiceWSDLs parameter, set it to false, and restart the server.

WSO2SS_HOME/repository/conf/carbon.xml
<!-- If this parameter is set, the ?wsdl on an admin service will not give the admin service wsdl. -->
<HideAdminServiceWSDLs>false</HideAdminServiceWSDLs>

You can then access the admin service WSDLs.

You can now use soapUI or any other tool to validate the admin service functionality as well.

Note: With the current implementation of Kerberos TGT generation this cannot be used. It will become usable once the caching implementation is in place.

11. Hadoop file system operations using a terminal


A. To format the namenode

WSO2SS_HOME$ bin/hadoop namenode -format

B. To format the datanode

WSO2SS_HOME$ bin/hadoop datanode -format

Warning: This may prevent the datanode from starting properly because of a namespaceID mismatch. Whenever the namenode is formatted it generates a new namespaceID, which the datanodes must reference.

If starting the datanode fails with an error such as java.io.IOException: Incompatible namespaceIDs, refer to this blog.
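
A quick way to confirm such a mismatch is to compare the namespaceID recorded on each side. The sketch below assumes Hadoop 1.x-style storage directories, in which the current/VERSION file of both the namenode and the datanode contains a namespaceID= line; the function names and the example paths are placeholders, not values taken from the product.

```shell
#!/bin/sh
# Print the namespaceID recorded in a Hadoop storage VERSION file.
namespace_id() {
    sed -n 's/^namespaceID=//p' "$1"
}

# Compare the namenode and datanode VERSION files.
# Hypothetical helper: succeeds only when both carry the same namespaceID.
same_namespace() {
    nn_id=$(namespace_id "$1")
    dn_id=$(namespace_id "$2")
    if [ -n "$nn_id" ] && [ "$nn_id" = "$dn_id" ]; then
        echo "namespaceID matches: $nn_id"
    else
        echo "namespaceID differs: namenode='$nn_id' datanode='$dn_id'" >&2
        return 1
    fi
}

# Example (both paths are placeholders for your dfs.name.dir and dfs.data.dir):
# same_namespace /path/to/name/current/VERSION /path/to/data/current/VERSION
```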


C. Hadoop shell commands (Ref 2)

e.g.
WSO2SS_HOME$ bin/hadoop fs -ls /user

References:
  1. Basic concepts of Kerberos
  2. Hadoop

