Archive for August 2012 | Monthly archive page

The future of Git is bright. It is displacing probably every version control system (VCS) out there, and it is open source (GPLv2).

So what is being displaced by Git? Mercurial, SVN and CVS.

So what is great about Git? Its flexibility, and its ability to manage code in any code-management workflow that you can think of. Some examples are:

- Local development (for individuals)
- Hub-Spoke (for teams)
- MegaHub-Hub-Spoke (for multiple teams)
- Spoke-Spoke (for peer-to-peer development)
- Spoke-Spoke-Hub-Spoke-Spoke (for peer-to-peer and teams)

and the combinations go on…

So what concepts drive the unique distributed development capability of Git? These are:

- Efficient key-value file storage (http://git-scm.com/book/en/Git-Internals-Git-Objects)
- Efficient and precise history / log tracking
- Strong performance even on large repositories
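To make the key-value idea concrete, here is a minimal sketch in Python of how Git derives the key for a file's contents (a "blob"). It assumes the classic SHA-1 object format described in the Git Internals chapter linked above. The names `git_blob_id`, `put` and `object_store` are made up for illustration, and the sketch skips the zlib compression and on-disk layout that real Git uses.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git computes for a blob with this content."""
    # Git hashes a small header ("blob <size>" plus a NUL byte) followed by the content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# Toy in-memory object store (hypothetical): the key is derived from the value.
object_store: dict[str, bytes] = {}


def put(content: bytes) -> str:
    """Store content under its own hash and return that key."""
    key = git_blob_id(content)
    object_store[key] = content
    return key


first = put(b"test content\n")
second = put(b"test content\n")  # identical content maps to the same key
print(first)
print(first == second, len(object_store))
```

Because the key is computed from the content, storing the same file twice costs nothing extra, which is a large part of why this storage model is efficient.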

But one of the key strengths of Git is its adoption by the Linux community. Git is the brainchild of Linus Torvalds (the creator of Linux).

SAP re-launches the <a href="http://www.sap.com/solutions/technology/in-memory-computing-platform/hana/overview/index.epx">HANA</a> (<strong>H</strong>igh-performance <strong>AN</strong>alytic <strong>A</strong>ppliance) platform in 2012 and looks to this as the “game-changing” technology for BI/DW/analytics. But is it?

Driven by the corporate demand for real time analytics, the HANA platform seeks to put data into memory and dramatically improve performance. This will help address the demand for big data, predictive capabilities, and text-mining capabilities.

But doesn’t this sound like the typical rhetoric from computing vendors that previously addressed technology issues by recommending the addition of more CPU, RAM, or disk space? SAP HANA is delivered as a software appliance focused on the underlying infrastructure for SAP Business Objects. This <a href="http://download.sap.com/download.epd?context=B576F8D167129B337CD171865DFF8973EBDC14E3C34A18AF1CF17ED596163658ABE46C2191175A1415B54F1837F5F0A13487B903339C6F98">white paper</a> suggests a lot of the scoping is centred on hardware and infrastructure design.

HANA makes incredible claims that traditional BI/DW folks would hesitate to whisper. The one that stands out is the “Combination of OLAP and OLTP” into the one database. Ouch! Feel the wrath of the stakeholders of business operations. Another claim is running analytics in “mixed operations”. Double ouch!

It’s already challenging enough to get DW/BI solutions deployed without affecting operations. BI folks have constantly advocated separate infrastructure for analytics, with the ETL window as the firewall between systems. The same ETL window has also created delays for real-time analytics. To advocate moving the BI/DW infrastructure back into operations is going to be a challenge. Yes, it facilitates “closer to real-time”, but it’s going to be a challenge to make it work politically.

For other BI/DW vendors, this solution would be infeasible, but because SAP also happens to be the largest ERP application platform on the planet, they definitely have a good shot at consolidating their ERP and HANA’s BI analytics. Google, Facebook and the large online behemoths already do it. So why not?!

This is indeed exciting, and it’s definitely time to take a closer look at SAP HANA.


If you thought “Big Data” was already quite unmanageable, IEEE predicts a 1500% (x15) growth in data by 2015. That is 3 years from now.

On a similar scale, IEEE also suggests that terabit networks should be implemented soon to cater for the demand in network traffic by 2015. This is a 40- to 1,000-fold increase over today’s gigabit networks.

This probably also suggests that demand for data processing and delivery will need to increase by a similar scale, some 10 to 40 times.
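To get a feel for what these multiples mean year on year, here is a rough back-of-the-envelope sketch in Python. It takes the x15 figure at face value and assumes smooth compound growth over the 3 years, which is a simplification of my own and not part of the IEEE prediction. The function name `annual_factor` is made up for illustration.

```python
def annual_factor(total_factor: float, years: int) -> float:
    """Yearly growth multiple that compounds to total_factor over the given years."""
    return total_factor ** (1.0 / years)


# x15 data growth spread evenly (compounding) over 3 years.
yearly_data = annual_factor(15.0, 3)
print(f"Data would need to grow about x{yearly_data:.2f} every year")

# The same view for the 10 to 40 times range in processing demand.
for total in (10.0, 40.0):
    print(f"x{total:.0f} over 3 years is about x{annual_factor(total, 3):.2f} per year")
```

In other words, under that assumption data volumes would be more than doubling every single year.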

What products and skills will power the delivery of services for “Humungous Data”?

- New data systems: like GFS, Bigtable, Hadoop, Hive, MapReduce
- New data patterns: NoSQL
- Cloud computing: a must for elastic computing vs BYO data centres
- Open data systems skills: unless you plan to pay for expensive database licenses
- Web services: to tie it all together
- Agile architecture: often under-rated, but increasingly important to focus corporate development
- Agile security: also under-rated, but increasingly important

With corporations already struggling to manage data growth and demand, will this mean x15 growth in data staffing, or will a data specialist have to be 15 times more productive? I believe it’s a combination of both. New tools will make the data professional more effective. At the same time, because of the lack of training and skills transfer, there will always be a need for the human bridge.
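The "combination of both" can be put into simple arithmetic. The sketch below is in Python and assumes, purely for illustration, that the workload scales x15 and that staffing and productivity multiply together to cover it. The productivity figures are hypothetical examples, not predictions.

```python
def staffing_factor(workload_factor: float, productivity_factor: float) -> float:
    """Headcount multiple needed when each person becomes productivity_factor times as effective."""
    return workload_factor / productivity_factor


WORKLOAD = 15.0  # the x15 data growth figure, taken at face value

# Hypothetical productivity gains from better tools.
for productivity in (1.0, 3.0, 5.0, 15.0):
    staff = staffing_factor(WORKLOAD, productivity)
    print(f"x{productivity:.0f} productivity -> x{staff:.0f} staff")
```

With no tooling gains the team grows x15, while a x3 productivity gain still leaves a x5 growth in staff, so both levers are needed.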

 

 

The future is indeed exciting.

Kudos to Brittany Wenger from Lakewood Ranch, USA for winning Google’s Science Fair Grand Prize. Using a 6-node Artificial Neural Network (see her slides), and a lot of cloud computing power, Brittany has managed to train the neural network to detect malignant breast tumors with an accuracy of 99.11%.
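For readers who have not seen one, here is a minimal sketch in Python of what a forward pass through a small neural network looks like. It is not Brittany's implementation. I am assuming that "6-node" means a single hidden layer of 6 nodes, and the number of input features (9), the random untrained weights and the sample values are all hypothetical, so the printed score is meaningless as a diagnosis.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """One pass: inputs -> hidden layer -> single output score between 0 and 1."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


N_FEATURES = 9  # hypothetical number of measurements per sample
N_HIDDEN = 6    # assumed reading of "6-node"

rng = random.Random(0)  # fixed seed so the run is repeatable
hidden_weights = [[rng.uniform(-1, 1) for _ in range(N_FEATURES)] for _ in range(N_HIDDEN)]
hidden_biases = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
output_weights = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
output_bias = rng.uniform(-1, 1)

sample = [0.5] * N_FEATURES  # made-up, already-normalised measurements
score = forward(sample, hidden_weights, hidden_biases, output_weights, output_bias)
label = "malignant" if score >= 0.5 else "benign"
print(f"score={score:.3f} -> {label} (untrained weights, illustration only)")
```

Training is the hard part: it means adjusting those weights against many labelled samples until the scores match the known outcomes, which is where the cloud computing power comes in.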

Now, what is notable is that this girl is 17 years old. I was talking to some parents recently about how the amount of new knowledge being generated today is growing exponentially. What this means is that the next generation of kids will have to learn more and in less time. Now, I am sure neural networks have been implemented by geniuses far younger than 17 years.

The comparison I would like to make here is that I learnt neural networks at age 20 (and with minimal successful commercial application), and as Brittany has a successful implementation of a neural network at age 17, I would now say that:

- My kids will probably be implementing neural networks at age 14-15
- Artificial intelligence is going to be more commonplace in the future.