Content

Sunday, December 18, 2016

How to work with the GitHub.com Fork Pattern?

Log in to your GitHub account. Make sure that you have access to the repository that you want to fork.

https://github.com/githubacountoffork/repositorytofork

Hit the Fork button. Once you click on the Fork button, the repository is forked into your GitHub account.
https://github.com/yourgithubaccount/repositoryoffork

This fork becomes the 'origin' that you push your changes to. Basically, you make changes in your fork and then open a pull request to the original source repository.

Clone the forked repository from GitHub to your local development machine.
git clone https://github.com/yourgithubaccount/repositoryoffork


Add a remote named 'upstream' that points to the original repository, so Git knows where to get upstream changes from.
git remote add upstream https://github.com/githubacountoffork/repositorytofork

Verify the new remote named 'upstream'. You should see both 'origin' (your fork) and 'upstream' (the original repository), each listed with fetch and push URLs.
git remote -v
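The origin/upstream setup above can be rehearsed entirely locally. In this sketch, throwaway bare repositories stand in for the two GitHub URLs; the names fork.git and upstream.git are made up for illustration.

```shell
# Local rehearsal of the two-remote setup: fork.git plays your fork,
# upstream.git plays the original repository.
set -e
tmp=$(mktemp -d); cd "$tmp"
git init --bare fork.git >/dev/null
git init --bare upstream.git >/dev/null
git clone "$tmp/fork.git" work >/dev/null 2>&1   # your fork becomes 'origin'
cd work
git remote add upstream "$tmp/upstream.git"       # the original repository
git remote -v   # lists origin and upstream, each with fetch and push URLs
```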


As a best practice, do not work on the local master; create a branch for every story or task
git branch branch-name


Check the branch that you have created.
The * on the branch name indicates the current active branch.
git branch

One of the greatest features of Git over other version control systems is that you don't need multiple folders for different releases of the repository (mainline, dev, release 1, etc.). You just switch between branches within the same physical repository.
The downside is that you can work on only one branch at a time, so you need to make sure you are on the right branch at all times. To switch to any given branch, use the following command
git checkout branch-name

Work on the files using the IDE of your choice. To check which files are modified, issue
git status

When you are done making changes, add those files to Git and issue a commit before moving on to new work.
git add .

To add files selectively, run the add command with an individual file name or folder name instead of '.'.
Do a git status and check the message about changes staged for commit. The add command has only 'staged' the changes; they are not committed yet.
git status

The next step is to commit the changes using the following command.
Commit will take all the 'staged' files and commit them to your local branch.
git commit -m 'commit message detailing the changes.'


Check status and make sure changes are committed.
git status

Push these changes to your fork. If the branch does not exist on the remote, it is created automatically
git push origin branch-name

Now log on to the github.com website.
From the drop down, select the branch name that you just 'pushed'.
Create a pull request targeted at the original (upstream) repository.
This will send a pull request to the reviewer, who can review and merge the changes.
After the pull request is merged to the upstream master, we have the option to delete the branch. Remember there is a local copy and a remote copy, so we have to delete both. Sometimes the remote branch on the fork is deleted by the person taking the pull request; in that case there is no need to delete the remote branch on the fork.

To delete the local branch, first move to the master branch
git checkout master

Force delete the local branch
git branch -D branch-name

If the remote branch on the fork was not deleted as part of the Pull Request (PR), delete it with
git push origin --delete branch-name
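The whole branch lifecycle above (branch, commit, push, delete locally and on the remote) can be rehearsed end to end against a local stand-in repository. The repository name and the 'my-story' branch below are made up for the sketch; note that deleting the remote copy uses the --delete form of push.

```shell
# Rehearse the branch-per-story lifecycle against a local bare repo.
set -e
tmp=$(mktemp -d); cd "$tmp"
git init --bare origin.git >/dev/null                  # stand-in for your fork
git -C origin.git symbolic-ref HEAD refs/heads/master  # force 'master' as default
git clone "$tmp/origin.git" work >/dev/null 2>&1
cd work
git config user.email "dev@example.com" && git config user.name "Dev"
git commit --allow-empty -m "initial" >/dev/null
git push origin master >/dev/null 2>&1

git branch my-story                                    # a branch per story/task
git checkout my-story 2>/dev/null
git commit --allow-empty -m "story work" >/dev/null
git push origin my-story >/dev/null 2>&1               # remote branch created automatically

git checkout master 2>/dev/null
git branch -D my-story                                 # delete the local branch
git push origin --delete my-story >/dev/null 2>&1      # delete the remote branch
git branch
```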

One manual step with this approach is that you need to synchronize the upstream repository into your fork. This is not done automatically, as sometimes people want to keep working on that version of the fork only; GitHub.com does not enforce it.

Get all changes from the upstream remote
IMPORTANT: make sure that you are on the correct working branch before you issue this command, or you could pull the changes into the wrong branch and create conflicts.
git branch

Pull changes from the remote upstream branch
git pull upstream master

Push these changes to your fork's master branch
git push origin master
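The whole synchronization flow can also be simulated locally. Here two throwaway bare repositories stand in for the GitHub upstream and fork; the names and commit messages are invented for the sketch.

```shell
# Simulate syncing a fork: pull from upstream, push on to the fork.
set -e
tmp=$(mktemp -d); cd "$tmp"
git init --bare upstream.git >/dev/null                  # stand-in for the original repo
git -C upstream.git symbolic-ref HEAD refs/heads/master

# Seed the upstream with an initial commit
git clone "$tmp/upstream.git" seed >/dev/null 2>&1
git -C seed config user.email "dev@example.com"
git -C seed config user.name "Dev"
git -C seed commit --allow-empty -m "initial" >/dev/null
git -C seed push origin master >/dev/null 2>&1

git clone --bare "$tmp/upstream.git" fork.git >/dev/null 2>&1  # "fork" it
git clone "$tmp/fork.git" work >/dev/null 2>&1                 # clone your fork
git -C work remote add upstream "$tmp/upstream.git"

# A change lands upstream after the fork was made...
git -C seed commit --allow-empty -m "upstream change" >/dev/null
git -C seed push origin master >/dev/null 2>&1

# ...so pull it from upstream and push it on to the fork
git -C work pull upstream master >/dev/null 2>&1
git -C work push origin master >/dev/null 2>&1
```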


How to switch default java version namely Java 7 to Java 8 in Mac OS X?

Once in a while, the Java version on your developer environment needs to be upgraded, and the new version needs to become the default across the different software that you use for the product or solution.


Before changing the default, find the current default in the system by typing

java -version

This should give something like 
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)

Now check the other versions installed on the system

/usr/libexec/java_home -V

This should give something like
Matching Java Virtual Machines (2):
    1.8.0_45, x86_64: "Java SE 8" /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home
    1.7.0_79, x86_64: "Java SE 7" /Library/Java/JavaVirtualMachines/jdk1.7.0_79.jdk/Contents/Home

Now change the version to Java 8

export JAVA_HOME=$(/usr/libexec/java_home -v 1.8.0_45)
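Note that the assignment relies on command substitution $( ... ), which runs /usr/libexec/java_home and captures its output; single quotes would store the literal command text instead of the resolved path. A portable illustration (the paths are made-up stand-ins, since /usr/libexec/java_home only exists on Mac OS X):

```shell
# Single quotes keep the literal string; $( ) runs the command and
# captures its output.
literal='/usr/libexec/java_home -v 1.8.0_45'
resolved=$(echo "/Library/Java/Example/Home")   # stand-in for the real java_home call
echo "$literal"    # prints the command text itself, not a path
echo "$resolved"   # prints the command's output
```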

Verify that the default has changed by giving the same command again

java -version

This should now show the changed default
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)



Saturday, December 17, 2016

Source control management for products or projects

As with any software development, one needs a source control system for keeping versions of code, for a few of the following reasons.

1. Keep code in a safe place apart from your laptop
2. Have different versions of the same code
3. Collaborate with other developers
4. Have different releases and each have its own release cycles of Service/Feature Packs, Hot fixes etc.
5. Release management to separate development code vs shippable code

The choice right now in this day and age is github.com

Go to the website and create a new GitHub account or handle.

You have two choices
1. Make all the repositories public, in which case you don't have to pay, but your code is open and anyone can see it.
2. Make the repositories private, but incur a cost starting from $7/month for personal accounts and more for other account types.


Source control pattern recommendation

"Fork Pattern is the way forward"
We all came from traditional software development, where we keep our branches till death, and old habits die hard. But there is no reason for all developers to be working off the same branch.

The new mantra is to have a release version defined. Each developer takes a branch for every story or task they want to do, makes the changes, opens a pull request to be merged with the master, and once it is merged, deletes the branch. In this way there are no conflicts between branches and nothing to synchronize across developers; each developer just throws away a branch and takes a new one to continue. Do make sure that the solution is modular so that developers don't step on each other's toes.

So go ahead and change your archaic philosophies to accommodate this, and leave behind the old-habit folks who want all branches of code kept around to work on legacy software because they don't want to move on.



Note: The cost was as of Dec 2016



Sunday, December 11, 2016

How to install JSON editor on Eclipse Neon IDE for Java Developers?

I am not sure why Eclipse thinks that a JSON editor is only for the Web Developer IDE; but until they realize that the new way of development uses JSON even in the Java big data world, we have to install the JSON editor inside the Eclipse Neon IDE for Java Developers to be able to edit JSON documents.

On the Eclipse Neon IDE for Java Developers, click on "Help" - > "Install New Software"

Next to "Work with:", click on "Add".

On the "Add Repository" dialog
for the name enter "eclipseneon"

for the location enter "http://download.eclipse.org/releases/neon"

Click OK.
This will fetch the list of all software available at this location.

Once the software list is loaded, filter by "Eclipse Web Developer Tools"

Find the software and place a check mark, follow instructions to install this software.


Once done, Eclipse Neon will ask you to restart it.

Now you can edit JSON documents in Eclipse IDE for Java Developers.


Note: As of this writing, the Eclipse JSON editor has a blunder of a bug: it doesn't know how to handle arrays when it formats. We are surprised by this quality of deliverable, not typical of Eclipse, and not sure why this was not found.

A simpler alternative is Json Tools 1.0.1, which is the best in terms of formatting and handles large files. It does tend to get sluggish, like any XML editor, as JSON files increase in size.


Update: Feb 2019
Do not use this JSON editor or any JSON editor for Eclipse, as they are buggy. Use Visual Studio Code as listed here: Changing JSON Editor in Eclipse

Saturday, October 22, 2016

How to enable SSH on your developer Mac OSX?

For most of the big data technologies, the ability to do passwordless ssh between machines is a must.
In order to make these technologies work, you need to enable ssh on your Mac (El Capitan).

1. Click on System Preference
2. Click on Sharing
3. On the left hand side under "Service" enable "Remote Login"
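Passwordless ssh additionally needs a key pair. The sketch below generates one into a throwaway directory rather than the usual ~/.ssh, so it doesn't touch any existing keys; the directory name is made up for the example.

```shell
# Generate an RSA key pair with an empty passphrase in a temp directory.
keydir=$(mktemp -d)
ssh-keygen -q -t rsa -N "" -f "$keydir/id_rsa"
# For passwordless login, the contents of id_rsa.pub would be appended to
# ~/.ssh/authorized_keys on each target machine.
ls "$keydir"
```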

How to Setup a 3 Node Apache Hbase 1.2.3 cluster in CentOS 7?

The following needs to be done before beginning the Apache HBase cluster setup.

1. Create 3 CentOS 7 Servers HBNODE1, HBNODE2 and HBNODE3 as discussed in How to install CentOS 7 on Virtual Machine using VMWare vSphere 6 client?

2. Make sure Java 7 is installed and configured as default as discussed in How to install Java 7 and Java 8 in CentOS 7?.

3. Create the bigdatauser, bigdataadmin and the bigdatagroup as discussed in How to create a user, group and enable him to do what a super user can in CentOS7?

4. Make sure the firewall is disabled and stopped as discussed in How to turn off firewall on CentOS 7? 

5. Change the /etc/hosts file so that all the IPs and the names of the servers are resolved as discussed in

6. Using the bigdatauser setup password less ssh across the 3 clusters namely HBNODE1, HBNODE2 and HBNODE3 as discussed in How to setup password less ssh between CentOS 7 cluster servers?


7. Install Apache Zookeeper clusters as discussed in How to setup a 3 Node Apache Zookeeper 3.4.6 cluster in CentOS 7? Make sure you do the same as in step 5 for these servers too.
8. Install Apache Hadoop clusters as discussed in How to Setup a 3 Node Apache Hadoop 2.7.3 cluster in CentOS 7? Make sure you do the same as in step 5 for these servers too.

For each of the Servers HBNODE1, HBNODE2 and HBNODE3 do the following.



Log in using the bigdataadmin user
 
#create a folder for hbase under the /usr/local directory
cd /usr/local
sudo mkdir hbase
 
#change ownership to bigdatauser
sudo chown -R bigdatauser:bigdatagroup hbase

#Switch to bigdatauser
su bigdatauser

#move to a download folder and download hbase
wget http://www-eu.apache.org/dist/hbase/1.2.3/hbase-1.2.3-bin.tar.gz

#unzip the files
tar xzf hbase-1.2.3-bin.tar.gz

#move this to the common directory
mv hbase-1.2.3 /usr/local/hbase

#go to the hbase directory
cd /usr/local/hbase/hbase-1.2.3

#move to config directory
cd conf

#edit hbase-env.sh
vi hbase-env.sh

#change Java Home Path
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75-2.5.4.2.el7_0.x86_64/jre

#disable internal zookeeper
export HBASE_MANAGES_ZK=false

#save
wq

#edit the hbase-site.xml

vi hbase-site.xml

<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
    </description>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hdnode1:9000/user/hadoop/hbase</value>
    <description>The directory shared by RegionServers.</description>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zknode1,zknode2,zknode3</value>
    <description>The Zookeeper ensemble</description>
  </property>
</configuration>

#save
wq

#edit the regionservers file only on the master node hbnode1

vi regionservers
hbnode2
hbnode3

#save
wq

#move to the root folder and start the HBase cluster from the master node hbnode1
cd /usr/local/hbase/hbase-1.2.3

bin/start-hbase.sh

#This would start the region servers on the other nodes too
#check for the following process 

ps aux | grep hbase

#HMaster on master hbnode1 and HRegionServer on other nodes.

#view the status of the cluster in the following URL
http://hbnode1:16010/master-status

This should display the nodes as well as other details like Zookeeper etc.


Wednesday, October 12, 2016

Linux systems folder structure - File System Hierarchy Standard (FHS)

The following link describes the Linux File System Hierarchy Standard (FHS) structure that all developers should be aware of when using Linux systems. It should also give an idea of where to place the software we develop for deployment. https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/3/html/Reference_Guide/s1-filesystem-fhs.html
Pay attention to the following folders

/usr/
/usr/libexec
/usr/local
/var/lib



3.2. Overview of File System Hierarchy Standard (FHS)

Red Hat Enterprise Linux uses the Filesystem Hierarchy Standard (FHS) file system structure, which defines the names, locations, and permissions for many file types and directories.
The FHS document is the authoritative reference to any FHS-compliant file system, but the standard leaves many areas undefined or extensible. This section is an overview of the standard and a description of the parts of the file system not covered by the standard.
Compliance with the standard means many things, but the two most important are compatibility with other compliant systems and the ability to mount a /usr/ partition as read-only. This second point is important because the directory contains common executables and should not be changed by users. Also, since the /usr/ directory is mounted as read-only, it can be mounted from the CD-ROM or from another machine via a read-only NFS mount.

3.2.1. FHS Organization

The directories and files noted here are a small subset of those specified by the FHS document. Refer to the latest FHS document for the most complete information.
The complete standard is available online at http://www.pathname.com/fhs/.

3.2.1.1. The /boot/ Directory

The /boot/ directory contains static files required to boot the system, such as the Linux kernel. These files are essential for the system to boot properly.
Warning
Do not remove the /boot/ directory. Doing so will render the system unbootable.

3.2.1.2. The /dev/ Directory

The /dev/ directory contains file system entries which represent devices that are attached to the system. These files are essential for the system to function properly.

3.2.1.3. The /etc/ Directory

The /etc/ directory is reserved for configuration files that are local to the machine. No binaries are to be put in /etc/. Any binaries that were once located in /etc/ should be placed into /sbin/ or /bin/.
The X11/ and skel/ directories are subdirectories of the /etc/ directory:
/etc
  |- X11/
  |- skel/
The /etc/X11/ directory is for X Window System configuration files such as XF86Config. The /etc/skel/ directory is for "skeleton" user files, which are used to populate a home directory when a user is first created.

3.2.1.4. The /lib/ Directory

The /lib/ directory should contain only those libraries needed to execute the binaries in /bin/ and /sbin/. These shared library images are particularly important for booting the system and executing commands within the root file system.

3.2.1.5. The /mnt/ Directory

The /mnt/ directory is for temporarily mounted file systems, such as CD-ROMs and 3.5 diskettes.

3.2.1.6. The /opt/ Directory

The /opt/ directory provides storage for large, static application software packages.
A package placing files in the /opt/ directory creates a directory bearing the same name as the package. This directory, in turn, holds files that otherwise would be scattered throughout the file system, giving the system administrator an easy way to determine the role of each file within a particular package.
For example, if sample is the name of a particular software package located within the /opt/ directory, then all of its files are placed in directories inside the /opt/sample/ directory, such as /opt/sample/bin/ for binaries and /opt/sample/man/ for manual pages.
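The /opt/ layout for the hypothetical sample package described above can be sketched by building it under a temporary root:

```shell
# Recreate the /opt/sample layout from the text under a temp root
# ('sample' is the hypothetical package name used in the FHS example).
root=$(mktemp -d)
mkdir -p "$root/opt/sample/bin" "$root/opt/sample/man"
( cd "$root" && find opt -type d | sort )
```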
Large packages that encompass many different sub-packages, each of which accomplish a particular task, are also located in the /opt/ directory, giving that large package a way to organize itself. In this way, our sample package may have different tools that each go in their own sub-directories, such as /opt/sample/tool1/ and /opt/sample/tool2/, each of which can have their own bin/, man/, and other similar directories.

3.2.1.7. The /proc/ Directory

The /proc/ directory contains special files that either extract information from or send information to the kernel.
Due to the great variety of data available within /proc/ and the many ways this directory can be used to communicate with the kernel, an entire chapter has been devoted to the subject. For more information, please refer to Chapter 5 The proc File System.

3.2.1.8. The /sbin/ Directory

The /sbin/ directory stores executables used by the root user. The executables in /sbin/ are only used at boot time and perform system recovery operations. Of this directory, the FHS says:
/sbin contains binaries essential for booting, restoring, recovering, and/or repairing the system in addition to the binaries in /bin. Programs executed after /usr/ is known to be mounted (when there are no problems) are generally placed into /usr/sbin. Locally-installed system administration programs should be placed into /usr/local/sbin.
At a minimum, the following programs should be in /sbin/:
arp, clock,
halt, init, 
fsck.*, grub
ifconfig, lilo, 
mingetty, mkfs.*, 
mkswap, reboot, 
route, shutdown, 
swapoff, swapon

3.2.1.9. The /usr/ Directory

The /usr/ directory is for files that can be shared across multiple machines. The /usr/ directory is often on its own partition and is mounted read-only. At minimum, the following directories should be subdirectories of /usr/:
/usr
  |- bin/
  |- dict/
  |- doc/
  |- etc/
  |- games/
  |- include/
  |- kerberos/
  |- lib/
  |- libexec/     
  |- local/
  |- sbin/
  |- share/
  |- src/
  |- tmp -> ../var/tmp/
  |- X11R6/
Under the /usr/ directory, the bin/ directory contains executables, dict/ contains non-FHS compliant documentation pages, etc/ contains system-wide configuration files, games is for games, include/ contains C header files, kerberos/ contains binaries and other Kerberos-related files, and lib/ contains object files and libraries that are not designed to be directly utilized by users or shell scripts. The libexec/ directory contains small helper programs called by other programs, sbin/ is for system administration binaries (those that do not belong in the /sbin/ directory), share/ contains files that are not architecture-specific, src/ is for source code, and X11R6/ is for the X Window System (XFree86 on Red Hat Enterprise Linux).

3.2.1.10. The /usr/local/ Directory

The FHS says:
The /usr/local hierarchy is for use by the system administrator when installing software locally. It needs to be safe from being overwritten when the system software is updated. It may be used for programs and data that are shareable among a group of hosts, but not found in /usr.
The /usr/local/ directory is similar in structure to the /usr/ directory. It has the following subdirectories, which are similar in purpose to those in the /usr/ directory:
/usr/local
       |- bin/
       |- doc/
       |- etc/
       |- games/
       |- include/
       |- lib/
       |- libexec/
       |- sbin/
       |- share/
       |- src/
In Red Hat Enterprise Linux, the intended use for the /usr/local/ directory is slightly different from that specified by the FHS. The FHS says that /usr/local/ should be where software that is to remain safe from system software upgrades is stored. Since software upgrades can be performed safely with Red Hat Package Manager (RPM), it is not necessary to protect files by putting them in /usr/local/. Instead, the /usr/local/ directory is used for software that is local to the machine.
For instance, if the /usr/ directory is mounted as a read-only NFS share from a remote host, it is still possible to install a package or program under the /usr/local/ directory.

3.2.1.11. The /var/ Directory

Since the FHS requires Linux to mount /usr/ as read-only, any programs that write log files or need spool/ or lock/ directories should write them to the /var/ directory. The FHS states /var/ is for:
...variable data files. This includes spool directories and files, administrative and logging data, and transient and temporary files.
Below are some of the directories found within the /var/ directory:
/var
  |- account/
  |- arpwatch/
  |- cache/
  |- crash/
  |- db/
  |- empty/
  |- ftp/
  |- gdm/
  |- kerberos/
  |- lib/
  |- local/
  |- lock/
  |- log/
  |- mail -> spool/mail/
  |- mailman/
  |- named/
  |- nis/
  |- opt/
  |- preserve/
  |- run/
  +- spool/
       |- at/
       |- clientmqueue/
       |- cron/
       |- cups/
       |- lpd/
       |- mail/
       |- mqueue/
       |- news/
       |- postfix/ 
       |- repackage/
       |- rwho/
       |- samba/ 
       |- squid/
       |- squirrelmail/
       |- up2date/ 
       |- uucppublic/
       |- vbox/
  |- tmp/
  |- tux/
  |- www/
  |- yp/
System log files such as messages/ and lastlog/ go in the /var/log/ directory. The /var/lib/rpm/ directory contains RPM system databases. Lock files go in the /var/lock/ directory, usually in directories for the program using the file. The /var/spool/ directory has subdirectories for programs in which data files are stored.